## NumPy

Efficient processing of large sets of numbers

Lets us create arrays that hold raw numeric values instead of (Python) objects

Must be installed via Python Packages or pip
- pip install numpy

In [1]:
import numpy as np

## Arrays

np.array creates a NumPy array from a Python list

In [2]:
np.array([1, 2, 3, 4])

array([1, 2, 3, 4])

In [3]:
a = np.array([1, 2, 3, 4])

In [4]:
print(a)

[1 2 3 4]


In [5]:
type(a)  # ndarray: n-dimensional array (1D, 2D, 3D, ...)

numpy.ndarray

Indexing works the same way as in Python

In [6]:
a[0]

1

In [7]:
a[-1]

4

In [8]:
a[0:2]

array([1, 2])

In [9]:
a[1:]

array([2, 3, 4])

In [10]:
a[-3:-1]

array([2, 3])

Specifying types

In [11]:
b = np.array([1, 2, 3, 4], dtype=np.int8)

In [12]:
a.dtype  # 4 bytes per element (the default integer type depends on the platform)

dtype('int32')

In [13]:
b.dtype  # 1 byte per element

dtype('int8')

Floating-point numbers

In [14]:
c = np.array([1.4, 2.2, 8.4, 7.6])

In [15]:
c.dtype

dtype('float64')

In [16]:
d = np.array([1.4, 2.2, 8.4, 7.6], dtype=np.float16)

In [17]:
d.dtype

dtype('float16')
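The dtype decides how many bytes each element occupies. The following is a minimal sketch of that relationship, using the `itemsize` and `nbytes` attributes of an array; the variable names are illustrative and not part of the original notes.

```python
import numpy as np

values = [1, 2, 3, 4]

small = np.array(values, dtype=np.int8)   # 1 byte per element
large = np.array(values, dtype=np.int64)  # 8 bytes per element

print(small.itemsize, small.nbytes)  # 1 4
print(large.itemsize, large.nbytes)  # 8 32

# astype creates a new array with the requested type
as_float = small.astype(np.float32)
print(as_float.dtype)  # float32
```

Smaller types save memory but also cover a smaller value range, so the choice depends on the data.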

## Matrices

Matrices: two-dimensional arrays

e.g. a database table or an Excel sheet

A dataset can be represented as a matrix

In [18]:
e = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [19]:
e

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [20]:
e.ndim  # Number of dimensions

2

In [21]:
e.size  # Number of elements

9

In [22]:
e.shape  # Shape (here 3x3)

(3, 3)

Indexing is just as important here

In [23]:
e[1]

array([4, 5, 6])

In [24]:
e[1][0]  # Row 1, column 0

4

In [25]:
e[1, 0]  # Same as above; this tuple syntax only works with NumPy

4
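To make the difference between the two index styles concrete, here is a small sketch that compares them and shows what a plain nested Python list does with the tuple form. The name `nested` is made up for this illustration.

```python
import numpy as np

e = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

row_then_col = e[1][0]  # first selects row 1, then element 0 of that row
direct = e[1, 0]        # one indexing step with a (row, column) pair

print(row_then_col == direct)  # True

# A nested Python list does not accept a (row, column) pair
nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
try:
    nested[1, 0]
except TypeError as err:
    print("Python list:", err)
```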

### Slicing matrices

-> Extract parts of a matrix

e.g.:

e[1:]

e[:]

e[:, 2]

In [26]:
e[1:]

array([[4, 5, 6],
       [7, 8, 9]])

In [27]:
e[:]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [28]:
e[-1]

array([7, 8, 9])

In [29]:
e[:, 2]  # Take the last column as a 1D array

array([3, 6, 9])

In [30]:
e[:, 1:]

array([[2, 3],
       [5, 6],
       [8, 9]])

In [31]:
e[1:, :2]

array([[4, 5],
       [7, 8]])

In [32]:
e[:2, 1:]

array([[2, 3],
       [5, 6]])
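One detail worth knowing when slicing: basic slices are views onto the original data rather than copies. The sketch below illustrates this on a fresh 3x3 matrix; `view` and `independent` are names chosen for the example.

```python
import numpy as np

e = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

view = e[:2, 1:]      # basic slicing returns a view on e
view[0, 0] = 99       # writes through to e
print(e[0, 1])        # 99

independent = e[:2, 1:].copy()  # explicit copy with its own memory
independent[0, 0] = -1
print(e[0, 1])        # 99
```

Calling `.copy()` is the usual way to get a slice that can be changed without touching the source matrix.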

### Assigning new values

In [33]:
e[1, 1] = 10

In [34]:
e

array([[ 1,  2,  3],
       [ 4, 10,  6],
       [ 7,  8,  9]])

In [35]:
e[1] = 100

In [36]:
e

array([[  1,   2,   3],
       [100, 100, 100],
       [  7,   8,   9]])

In [37]:
e[1] = [6, 5, 4]

In [38]:
e

array([[1, 2, 3],
       [6, 5, 4],
       [7, 8, 9]])
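Assignment also works with column slices and with conditions, not only with single cells or rows. A short sketch on a fresh matrix, so that the `e` from the cells above stays unchanged:

```python
import numpy as np

e = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

e[:, 0] = 0      # set the whole first column to 0
e[e > 5] = -1    # replace every value greater than 5

print(e)
# [[ 0  2  3]
#  [ 0  5 -1]
#  [ 0 -1 -1]]
```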

### Simple analysis of arrays

In [39]:
a.sum()

10

In [40]:
a.mean()

2.5

In [41]:
a.std()

1.118033988749895

In [42]:
a.var()

1.25
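By default `std()` and `var()` divide by the number of elements (population statistics). If a sample estimate is wanted, the `ddof` parameter changes the divisor to N - 1. A minimal sketch:

```python
import numpy as np

a = np.array([1, 2, 3, 4])

population_var = a.var()        # divides by N (ddof=0 is the default)
sample_var = a.var(ddof=1)      # divides by N - 1
sample_std = a.std(ddof=1)

print(population_var)  # 1.25
print(sample_var)
print(sample_std)
```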

With matrices

In [43]:
e.sum()

45

In [44]:
e.mean()

5.0

These functions can also be combined with an index

In [45]:
e[0].sum()

6

In [46]:
e[:2, 1:].mean()

3.5

These functions also accept an axis

In [47]:
e.sum(axis=0)  # Vertical: one sum per column

array([14, 15, 16])

In [48]:
e.sum(axis=1)  # Horizontal: one sum per row

array([ 6, 15, 24])
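The axis argument is easier to follow on a non-square matrix, because the length of the result shows which dimension was collapsed. A small sketch with a 2x3 matrix `m` (a name introduced only for this example):

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])   # shape (2, 3)

print(m.sum(axis=0))   # [5 7 9]        one value per column
print(m.sum(axis=1))   # [ 6 15]        one value per row
print(m.mean(axis=0))  # [2.5 3.5 4.5]
print(m.max(axis=1))   # [3 6]
```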

## Vectorization of arrays

Process an entire array with a single operation

In [49]:
np.array(range(10, 20))

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

In [50]:
f = np.arange(10, 20)  # arange: array + range

In [51]:
f

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

Task: filter the numbers that are divisible by 2 (solved further down with a Boolean mask)

In [52]:
f + 10

array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29])

In [53]:
f * f

array([100, 121, 144, 169, 196, 225, 256, 289, 324, 361])
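To see what vectorization replaces, the sketch below computes the same squares once with an explicit Python loop and once with a single array expression. The variable names are illustrative.

```python
import numpy as np

f = np.arange(10, 20)

# Loop version: one Python-level operation per element
squares_loop = np.array([x * x for x in f])

# Vectorized version: one expression for the whole array
squares_vec = f * f

print(np.array_equal(squares_loop, squares_vec))  # True
```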

### Boolean masks
Compare an array against a condition

Usage: array[condition]

In [54]:
f % 2 == 0

array([ True, False,  True, False,  True, False,  True, False,  True,
       False])

In [55]:
f[f % 2 == 0]

array([10, 12, 14, 16, 18])

In [56]:
f[f >= 15]

array([15, 16, 17, 18, 19])
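Several conditions can be combined into one mask. This sketch uses the element-wise operators `&` (and), `|` (or) and `~` (not); each condition needs its own parentheses because these operators bind more tightly than the comparisons.

```python
import numpy as np

f = np.arange(10, 20)

even_and_large = f[(f % 2 == 0) & (f >= 15)]
print(even_and_large)   # [16 18]

small_or_odd = f[(f < 12) | (f % 2 == 1)]
print(small_or_odd)     # [10 11 13 15 17 19]

not_even = f[~(f % 2 == 0)]
print(not_even)         # [11 13 15 17 19]
```

The Python keywords `and` / `or` do not work here, since they try to reduce a whole array to a single truth value.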

In [57]:
g = np.arange(0, 20, 2)

In [58]:
g

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [59]:
f[f % g == 0]  # Vectorized operations also work element-wise between two arrays

array([10, 12])

In [60]:
# All positions where the upper number modulo the lower number is 0 (10 % 0 only counts because NumPy returns 0, with a RuntimeWarning, for integer modulo by zero)
print(f)
print(g)

[10 11 12 13 14 15 16 17 18 19]
[ 0  2  4  6  8 10 12 14 16 18]
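Because `g` starts with 0, the first comparison involves a modulo by zero. One possible way to exclude such positions is to compute the remainder only where the divisor is non-zero. This is a sketch, with `valid` and `divisible` as names invented for the example:

```python
import numpy as np

f = np.arange(10, 20)
g = np.arange(0, 20, 2)

valid = g != 0                              # positions with a non-zero divisor
divisible = np.zeros(f.shape, dtype=bool)   # start with "not divisible" everywhere
divisible[valid] = f[valid] % g[valid] == 0

print(f[divisible])   # [12]
```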


In [61]:
e

array([[1, 2, 3],
       [6, 5, 4],
       [7, 8, 9]])

In [62]:
e.mean()

5.0

Find all values below (and then above) the mean

In [63]:
e[e < e.mean()]

array([1, 2, 3, 4])

In [64]:
e[e > e.mean()]

array([6, 7, 8, 9])

In [65]:
h = np.random.randint(100)  # A single random integer in [0, 100)

In [66]:
h

10

In [67]:
h = np.random.randint(100, size=(10, 10))

In [68]:
h

array([[77,  2, 60, 74, 52,  9, 11,  0,  6, 46],
       [89, 94, 70, 51, 23, 46, 50, 86, 46, 25],
       [81, 19, 62, 29, 44, 75, 81, 11, 10, 32],
       [92, 91, 81, 64, 51, 29,  5, 13, 19, 10],
       [77, 95,  9, 25, 13, 21, 86, 78, 97, 27],
       [15, 70, 74, 62, 79, 79, 44, 37, 81, 67],
       [18, 92, 27, 50, 35, 87, 52, 58, 46, 15],
       [95, 78, 40, 33, 71, 12, 96,  0, 37, 51],
       [42, 82, 75, 55, 91, 26,  3, 40, 76, 82],
       [ 2, 32, 90, 23, 70, 70, 42, 62,  0, 55]])

In [69]:
g50 = h[h > 50]

In [70]:
g50

array([77, 60, 74, 52, 89, 94, 70, 51, 86, 81, 62, 75, 81, 92, 91, 81, 64,
       51, 77, 95, 86, 78, 97, 70, 74, 62, 79, 79, 81, 67, 92, 87, 52, 58,
       95, 78, 71, 96, 51, 82, 75, 55, 91, 76, 82, 90, 70, 70, 62, 55])

In [71]:
len(g50)

50
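Counting the matches does not require building the filtered array first, because `True` counts as 1 in a Boolean mask. A sketch, assuming a fixed seed (chosen arbitrarily) so the matrix is the same on every run:

```python
import numpy as np

np.random.seed(0)                          # arbitrary seed for repeatable values
h = np.random.randint(100, size=(10, 10))

mask = h > 50
count_via_len = len(h[mask])               # approach used in the cells above
count_via_sum = mask.sum()                 # True counts as 1
count_via_numpy = np.count_nonzero(mask)

print(count_via_len, count_via_sum, count_via_numpy)
print(mask.mean())                         # share of values above 50
```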

## Performance

In [72]:
import sys

In [73]:
x = 5

In [74]:
sys.getsizeof(x)  # 28 bytes for a Python int

28

In [75]:
i = np.array([1])

In [76]:
i

array([1])

In [77]:
np.dtype(int).itemsize

4

In [78]:
np.dtype(np.int8).itemsize

1

In [79]:
np.dtype(np.float16).itemsize

2

Comparing a Python list with a NumPy array

In [80]:
pList = list(range(1_000_000))

In [81]:
pArray = np.array(range(1_000_000))

In [82]:
%time sum([x ** 2 for x in pList])

CPU times: total: 141 ms
Wall time: 151 ms


333332833333500000

In [83]:
%time np.sum(pArray ** 2)

CPU times: total: 0 ns
Wall time: 3.01 ms


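584144992

Caution: 584144992 is not the true sum. pArray has dtype int32 here (see In [77]), so the squares overflow silently; the correct value is the 333332833333500000 returned by the list version.

150 ms -> 3 ms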
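The NumPy result above (584144992) differs from the list result because 32-bit integers cannot hold the squared values. A minimal sketch of the same computation with an explicit 64-bit dtype, which is wide enough for this range:

```python
import numpy as np

numbers = np.arange(1_000_000, dtype=np.int64)   # explicit 64-bit integers

numpy_total = int(np.sum(numbers ** 2))
python_total = sum(x ** 2 for x in range(1_000_000))

print(numpy_total)                   # 333332833333500000
print(numpy_total == python_total)   # True
```

Fixed-width integers wrap around without raising an error, so it is worth checking the dtype before summing large values.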

## Other functions

#### Random numbers

In [85]:
np.random.random(100)  # 100 floats in [0, 1)

array([0.81764974, 0.26944901, 0.02890722, 0.96061274, 0.53236791,
       0.94012166, 0.79826229, 0.27365162, 0.66845558, 0.91303774,
       0.90398939, 0.74820547, 0.23196984, 0.65941023, 0.89595708,
       0.86727629, 0.08640634, 0.88171943, 0.33741521, 0.3763924 ,
       0.2191705 , 0.01503626, 0.12174629, 0.70171915, 0.63270167,
       0.31206865, 0.51487361, 0.89303782, 0.53430418, 0.3404478 ,
       0.76404361, 0.6746336 , 0.41613949, 0.17781342, 0.41460633,
       0.32627217, 0.99743212, 0.97064847, 0.26871185, 0.04891679,
       0.19882804, 0.33705324, 0.87988677, 0.14968568, 0.66172828,
       0.30915744, 0.13395406, 0.38516222, 0.0509083 , 0.57830116,
       0.80847593, 0.32544063, 0.40117589, 0.47319811, 0.33042188,
       0.00259253, 0.46470741, 0.64282008, 0.30939323, 0.39747704,
       0.87135809, 0.76681499, 0.04150795, 0.42272597, 0.12353841,
       0.37756244, 0.16379295, 0.55422825, 0.93316588, 0.5655237 ,
       0.91384819, 0.51921845, 0.40807526, 0.86123509, 0.15295

In [88]:
np.random.randint(100, size=100)  # 100 integers in [0, 100)

array([41, 69, 85, 45, 39, 90, 54, 80, 15, 30, 58, 45, 12, 99, 95, 61, 68,
       12, 56, 78, 86, 89,  2, 33, 64, 22, 27, 48, 31, 84, 47, 42, 59, 39,
       63,  2, 23, 12, 79, 79, 72, 62, 46, 73, 47, 33, 43, 67, 72,  7, 81,
       51, 45, 87, 54, 28, 59, 55, 29,  1, 96, 60, 13, 64, 46, 79, 98, 78,
       63, 57, 96, 82, 94, 90, 85, 20,  0, 59, 16,  6, 33, 30, 16,  7, 16,
       22, 28, 78, 52, 92,  0, 88, 74, 38, 81, 80, 85, 41, 83, 89])
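For reproducible random numbers, NumPy also offers the `Generator` interface created with `np.random.default_rng`. The sketch below is an alternative to the `np.random.random` / `np.random.randint` calls above, not what the original cells use; the seed value is arbitrary.

```python
import numpy as np

rng_a = np.random.default_rng(seed=123)   # same seed ...
rng_b = np.random.default_rng(seed=123)   # ... same sequence

floats_a = rng_a.random(5)                # 5 floats in [0, 1)
floats_b = rng_b.random(5)
print(np.array_equal(floats_a, floats_b))   # True

ints = rng_a.integers(0, 100, size=(2, 3))  # integers in [0, 100)
print(ints.shape)                           # (2, 3)
```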

#### Changing the shape of an array/matrix

In [89]:
j = np.arange(0, 100)

In [92]:
j.reshape(10, 10)  # Turn the 100-element array into a 10x10 matrix

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [95]:
# j.reshape(3, 33)  # Only 99 slots for 100 elements -> raises ValueError

In [97]:
j.reshape(5, 5, 4)  # Cuboid: 5 layers, each a 5x4 matrix

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]],

       [[20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31],
        [32, 33, 34, 35],
        [36, 37, 38, 39]],

       [[40, 41, 42, 43],
        [44, 45, 46, 47],
        [48, 49, 50, 51],
        [52, 53, 54, 55],
        [56, 57, 58, 59]],

       [[60, 61, 62, 63],
        [64, 65, 66, 67],
        [68, 69, 70, 71],
        [72, 73, 74, 75],
        [76, 77, 78, 79]],

       [[80, 81, 82, 83],
        [84, 85, 86, 87],
        [88, 89, 90, 91],
        [92, 93, 94, 95],
        [96, 97, 98, 99]]])

In [98]:
k = np.arange(0, 99).reshape(11, 9)

In [99]:
k

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8],
       [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23, 24, 25, 26],
       [27, 28, 29, 30, 31, 32, 33, 34, 35],
       [36, 37, 38, 39, 40, 41, 42, 43, 44],
       [45, 46, 47, 48, 49, 50, 51, 52, 53],
       [54, 55, 56, 57, 58, 59, 60, 61, 62],
       [63, 64, 65, 66, 67, 68, 69, 70, 71],
       [72, 73, 74, 75, 76, 77, 78, 79, 80],
       [81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98]])

In [100]:
k.reshape(99)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98])

In [108]:
k.reshape(-1)  # reshape(-1): flatten to 1D (NumPy infers the length)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98])

In [109]:
l = j.reshape(5, 5, 4)

In [110]:
l

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]],

       [[20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31],
        [32, 33, 34, 35],
        [36, 37, 38, 39]],

       [[40, 41, 42, 43],
        [44, 45, 46, 47],
        [48, 49, 50, 51],
        [52, 53, 54, 55],
        [56, 57, 58, 59]],

       [[60, 61, 62, 63],
        [64, 65, 66, 67],
        [68, 69, 70, 71],
        [72, 73, 74, 75],
        [76, 77, 78, 79]],

       [[80, 81, 82, 83],
        [84, 85, 86, 87],
        [88, 89, 90, 91],
        [92, 93, 94, 95],
        [96, 97, 98, 99]]])

In [111]:
l.reshape(-1)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])

In [116]:
m = np.arange(10).reshape(10, 1)

In [117]:
m

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])

In [118]:
m.reshape(-1)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
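The `-1` can also stand for a single unknown dimension inside a larger shape; NumPy then derives it from the number of elements. A short sketch, including the error raised when the sizes do not fit:

```python
import numpy as np

j = np.arange(12)

print(j.reshape(3, -1).shape)   # (3, 4)
print(j.reshape(-1, 6).shape)   # (2, 6)
print(j.reshape(-1, 1).shape)   # (12, 1)

try:
    j.reshape(5, -1)            # 12 is not divisible by 5
except ValueError as err:
    print("reshape failed:", err)
```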

#### Zeros, Ones

In [121]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [122]:
np.ones(10)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
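Both functions also accept a shape tuple and a `dtype`, which is handy for preallocating matrices or masks. A minimal sketch with illustrative variable names:

```python
import numpy as np

grid = np.zeros((2, 3))               # shape as a tuple -> 2x3 matrix of floats
counters = np.zeros(5, dtype=int)     # integer zeros instead of the float default
flags = np.ones((2, 2), dtype=bool)   # all True

print(grid.shape, grid.dtype)   # (2, 3) float64
print(counters)                 # [0 0 0 0 0]
print(flags.all())              # True
```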

Truthiness: when is a value True/False in Python?

In [123]:
print(bool(1))

True


In [125]:
print(bool(0))

False


In [127]:
print(bool("Hallo"))

True


In [128]:
print(bool(""))

False
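NumPy arrays with more than one element are a special case: they have no single truth value, so `bool()` raises an error. The sketch below shows the error and the two methods usually used instead, `any()` and `all()`.

```python
import numpy as np

mask = np.array([True, False, True])

try:
    bool(mask)                 # ambiguous for more than one element
except ValueError as err:
    print("ValueError:", err)

print(mask.any())   # True   at least one element is True
print(mask.all())   # False  not every element is True
```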


#### Linspace

= linearly spaced values

In [135]:
n = np.linspace(0, 10, 10)  # From X to Y (both included), with Z elements

In [136]:
n

array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])

In [139]:
np.linspace(0, 10, 21)

array([ 0. ,  0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5,  5. ,
        5.5,  6. ,  6.5,  7. ,  7.5,  8. ,  8.5,  9. ,  9.5, 10. ])
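Since the end value is included, 21 points are needed for a step of 0.5 between 0 and 10. The sketch below shows how to read back the step with `retstep=True`, how `endpoint=False` changes the result, and how that relates to `np.arange` with a float step.

```python
import numpy as np

points, step = np.linspace(0, 10, 21, retstep=True)
print(step)          # 0.5
print(len(points))   # 21

open_end = np.linspace(0, 10, 20, endpoint=False)   # 10 is excluded
print(open_end[-1])  # 9.5

by_step = np.arange(0, 10, 0.5)                     # stop value excluded as well
print(np.allclose(open_end, by_step))               # True
```

`linspace` fixes the number of points, while `arange` fixes the step size.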

In [147]:
np.linspace(-10, 40, 501)

array([-10. ,  -9.9,  -9.8,  -9.7,  -9.6,  -9.5,  -9.4,  -9.3,  -9.2,
        -9.1,  -9. ,  -8.9,  -8.8,  -8.7,  -8.6,  -8.5,  -8.4,  -8.3,
        -8.2,  -8.1,  -8. ,  -7.9,  -7.8,  -7.7,  -7.6,  -7.5,  -7.4,
        -7.3,  -7.2,  -7.1,  -7. ,  -6.9,  -6.8,  -6.7,  -6.6,  -6.5,
        -6.4,  -6.3,  -6.2,  -6.1,  -6. ,  -5.9,  -5.8,  -5.7,  -5.6,
        -5.5,  -5.4,  -5.3,  -5.2,  -5.1,  -5. ,  -4.9,  -4.8,  -4.7,
        -4.6,  -4.5,  -4.4,  -4.3,  -4.2,  -4.1,  -4. ,  -3.9,  -3.8,
        -3.7,  -3.6,  -3.5,  -3.4,  -3.3,  -3.2,  -3.1,  -3. ,  -2.9,
        -2.8,  -2.7,  -2.6,  -2.5,  -2.4,  -2.3,  -2.2,  -2.1,  -2. ,
        -1.9,  -1.8,  -1.7,  -1.6,  -1.5,  -1.4,  -1.3,  -1.2,  -1.1,
        -1. ,  -0.9,  -0.8,  -0.7,  -0.6,  -0.5,  -0.4,  -0.3,  -0.2,
        -0.1,   0. ,   0.1,   0.2,   0.3,   0.4,   0.5,   0.6,   0.7,
         0.8,   0.9,   1. ,   1.1,   1.2,   1.3,   1.4,   1.5,   1.6,
         1.7,   1.8,   1.9,   2. ,   2.1,   2.2,   2.3,   2.4,   2.5,
         2.6,   2.7,

#### hstack

= horizontal stack

In [158]:
o = np.arange(100).reshape(10, 10)

In [159]:
o

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [160]:
p = np.arange(10).reshape(10, 1)

In [161]:
p

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])

In [162]:
np.hstack((o, p))  # IMPORTANT: pass the arrays to stack as a single tuple

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  0],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19,  1],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29,  2],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39,  3],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49,  4],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59,  5],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69,  6],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79,  7],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89,  8],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99,  9]])

In [166]:
np.hstack((o, o.mean(axis=1).reshape(10, 1)))

array([[ 0. ,  1. ,  2. ,  3. ,  4. ,  5. ,  6. ,  7. ,  8. ,  9. ,  4.5],
       [10. , 11. , 12. , 13. , 14. , 15. , 16. , 17. , 18. , 19. , 14.5],
       [20. , 21. , 22. , 23. , 24. , 25. , 26. , 27. , 28. , 29. , 24.5],
       [30. , 31. , 32. , 33. , 34. , 35. , 36. , 37. , 38. , 39. , 34.5],
       [40. , 41. , 42. , 43. , 44. , 45. , 46. , 47. , 48. , 49. , 44.5],
       [50. , 51. , 52. , 53. , 54. , 55. , 56. , 57. , 58. , 59. , 54.5],
       [60. , 61. , 62. , 63. , 64. , 65. , 66. , 67. , 68. , 69. , 64.5],
       [70. , 71. , 72. , 73. , 74. , 75. , 76. , 77. , 78. , 79. , 74.5],
       [80. , 81. , 82. , 83. , 84. , 85. , 86. , 87. , 88. , 89. , 84.5],
       [90. , 91. , 92. , 93. , 94. , 95. , 96. , 97. , 98. , 99. , 94.5]])
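Two related variations, sketched on a smaller 3x4 matrix: `keepdims=True` keeps the reduced axis so no `reshape` is needed before `hstack`, and `np.vstack` is the vertical counterpart that appends rows instead of columns.

```python
import numpy as np

o = np.arange(12).reshape(3, 4)

row_means = o.mean(axis=1, keepdims=True)   # shape (3, 1), ready for hstack
with_means = np.hstack((o, row_means))
print(with_means.shape)                     # (3, 5)

col_sums = o.sum(axis=0)                    # shape (4,)
with_sums = np.vstack((o, col_sums))        # the 1D array becomes one extra row
print(with_sums.shape)                      # (4, 4)
print(with_sums[-1])                        # [12 15 18 21]
```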