# Introduction to numpy

Numpy is a Python library for vectors, matrices and general multidimensional arrays. Its array operations run in optimized, compiled code, which usually makes them much faster and more memory-efficient than equivalent code built on plain Python lists.

First you have to import numpy. A common convention is to import numpy as np.

In [1]:
import numpy as np

To create a numpy-array, you can create a normal Python list and then convert it to a numpy-array via `np.array(list)`.

In [2]:
np.array([1,2,3,4])

array([1, 2, 3, 4])

Numpy-arrays can hold many data types, such as integers, floats and strings.

In [3]:
cvalues = [25.3, 24.8, 26.9, 23.9]
C = np.array(cvalues)
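
As a small illustration of the different element types, the following sketch inspects the `dtype` of a few arrays. The variable names are made up for this example, and `dtype.kind` is used instead of the full type name because the exact integer width can differ between platforms.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([25.3, 24.8, 26.9])
strings = np.array(["a", "bc", "def"])
mixed = np.array([1, 2.5, 3])  # the integers are upcast to floats

# dtype.kind is a one-letter code: 'i' = integer, 'f' = float, 'U' = unicode string
for arr in (ints, floats, strings, mixed):
    print(arr.dtype.kind, arr)
```

All elements of one array share a single data type, which is why the integers in `mixed` end up as floats.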

You can perform basic operations on numpy-arrays. In contrast to normal Python lists, the operations are performed elementwise. To apply an operation to every element, you therefore don't have to iterate over the list or write a list comprehension.

In [4]:
C * 9 / 5 + 32

array([77.54, 76.64, 80.42, 75.02])

In [5]:
[x * 9/5 + 32 for x in cvalues]

[77.54, 76.64, 80.42, 75.02]
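
The difference between lists and arrays shows up most clearly with `*` and `+`. This is a minimal sketch with made-up variable names, not part of the original notebook:

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

doubled_list = values * 2        # list repetition: [1, 2, 3, 1, 2, 3]
doubled_arr = arr * 2            # elementwise multiplication: [2, 4, 6]

concatenated = values + values   # list concatenation: [1, 2, 3, 1, 2, 3]
summed = arr + arr               # elementwise addition: [2, 4, 6]

print(doubled_list, doubled_arr)
print(concatenated, summed)
```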

## Creating values with a given step size

Like the standard Python `range()`, you can create numpy-arrays with evenly spaced values within a given range using `np.arange([start, ]stop, [step, ]dtype=None)`. `start` is the included start value, `stop` is the excluded stop value and `step` is the step size between the values. You can also specify the data type with `dtype`; if it is not specified, it is inferred from the other parameters.

In [6]:
np.arange(3.0)

array([0., 1., 2.])

In [7]:
np.arange(1,5,2)

array([1, 3])
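
A few more variations of `np.arange`, sketched with illustrative variable names: a negative step, a float step and an explicit `dtype`.

```python
import numpy as np

down = np.arange(10, 0, -3)            # negative step counts down: 10, 7, 4, 1
quarters = np.arange(0, 1, 0.25)       # float step: 0.0, 0.25, 0.5, 0.75
as_float = np.arange(3, dtype=float)   # integer arguments, float result

print(down)
print(quarters)
print(as_float)
```

In all three cases the stop value itself is not part of the result.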

## Comparing Python lists and numpy-arrays

Numpy-arrays and the numpy operations are highly optimized compared to lists. You should therefore use numpy whenever possible to get better performance and reduce runtime.

In [8]:
import time

v = [e for e in range(10000)]

start = time.time()
for i in range(10000):
    x = [e+e for e in v]
    v = [e/2 for e in x]
time_lists = time.time() - start

arr = np.array(v)
start = time.time()
for i in range(10000):
    x = arr + arr
    arr = x/2
time_arrays = time.time() - start

print('time_list:', time_lists)
print('time_arrays:', time_arrays)

time_list: 7.401436805725098
time_arrays: 0.09380006790161133
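
The same comparison can also be sketched with the standard library module `timeit`, which repeats a function call and reports the total time. The helper functions `double_list` and `double_array` are made up for this example, and the measured numbers depend on the machine, so no output is shown.

```python
import timeit

import numpy as np

values = list(range(10_000))
arr = np.array(values)


def double_list():
    # Pure Python: one addition per loop iteration
    return [e + e for e in values]


def double_array():
    # numpy: one vectorized addition for the whole array
    return arr + arr


# Both variants compute the same result
same = double_list() == double_array().tolist()

# Total time for 100 calls each
t_list = timeit.timeit(double_list, number=100)
t_array = timeit.timeit(double_array, number=100)
print(f"same result: {same}, list: {t_list:.4f}s, array: {t_array:.4f}s")
```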


## Arrays with multiple axes

Numpy-arrays can be multidimensional.

You can create a scalar, an array with zero dimensions that holds only a single value.

In [9]:
np.array(42)

array(42)

Or you can create a vector with one dimension.

In [10]:
np.array([3.4, 6.9, 99.8, 12.8])

array([ 3.4,  6.9, 99.8, 12.8])

But you can also create a matrix with two dimensions, like nested lists in standard Python.

In [11]:
np.array([[ 3.4,  8.7,  9.9 ],
          [ 1.1, -7.8, -0.7 ],
          [ 4.1, 12.3,  4.8 ]])

array([[ 3.4,  8.7,  9.9],
       [ 1.1, -7.8, -0.7],
       [ 4.1, 12.3,  4.8]])

Finally, numpy supports tensors with three or more dimensions.

In [12]:
np.array([[[ 111, 112 ], [ 121, 122 ]],
          [[ 211, 212 ], [ 221, 222 ]],
          [[ 311, 312 ], [ 321, 322 ]]])

array([[[111, 112],
        [121, 122]],

       [[211, 212],
        [221, 222]],

       [[311, 312],
        [321, 322]]])
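
To check how many axes an array has, you can read its `ndim` attribute. The following sketch uses illustrative variable names and smaller arrays than the cells above:

```python
import numpy as np

scalar = np.array(42)
vector = np.array([3.4, 6.9, 99.8, 12.8])
matrix = np.array([[3.4, 8.7], [1.1, -7.8]])
tensor = np.array([[[111, 112], [121, 122]],
                   [[211, 212], [221, 222]]])

# Number of axes of each array, from scalar to tensor
dims = [a.ndim for a in (scalar, vector, matrix, tensor)]
print(dims)
```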

## Shape of an array

The shape of an array indicates the dimensions of the array. To get the shape of an array numpy has the function `np.shape(array)` and every array has the property `shape`. If an array has the shape `(6,3)` it represents a 6x3 matrix with 6 rows and 3 columns.

In [13]:
x = np.array([[67, 63, 87],
              [77, 69, 59],
              [77, 69, 59],
              [67, 63, 87],
              [67, 63, 87],
              [67, 63, 87]])

np.shape(x)
x.shape  # equivalent attribute form

(6, 3)
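
The shape is always a tuple with one entry per axis. As a small sketch (variable names are illustrative), this also covers scalars and vectors and shows the related `size` attribute, which holds the total number of elements:

```python
import numpy as np

scalar = np.array(42)
vector = np.array([3.4, 6.9, 99.8, 12.8])
matrix = np.array([[67, 63, 87],
                   [77, 69, 59]])

print(scalar.shape)   # empty tuple: no axes
print(vector.shape)   # one entry, note the trailing comma
print(matrix.shape)   # (rows, columns)

rows, cols = matrix.shape
print(matrix.size == rows * cols)   # size is the product of the shape entries
```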

## Changing the shape

By calling `reshape` on an array you get an array with the same data but different dimensions. This is only possible if the new shape has the same total number of elements as the original array.

In [14]:
a = np.arange(12).reshape(3, 4)
print(a)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [15]:
a.shape = (2, 6)
print(a)

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]]


The `reshape` operation supports any number of dimensions. The product of the entries of the new shape has to equal the product of the entries of the old shape, so that the number of elements stays the same.

In [16]:
np.arange(24).reshape(2, 3, 4)

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
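
Two details of `reshape` that may help in practice are sketched here: passing `-1` for one dimension lets numpy compute it from the number of elements, and a shape that does not fit raises a `ValueError`. The variable names are made up for this example.

```python
import numpy as np

a = np.arange(12)

inferred = a.reshape(3, -1)   # -1 is replaced by 12 / 3 = 4
print(inferred.shape)

try:
    a.reshape(5, 3)           # 5 * 3 = 15 elements do not fit 12 elements
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```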

## Transposing

For matrix operations you sometimes need to transpose a matrix. In numpy you can do this with the attribute `array.T` or by calling `array.transpose(axes)` and specifying the new order of the axes.

In [17]:
b = np.arange(6).reshape(2, 3)

print(b)
print(b.T)
b.transpose(1, 0)

[[0 1 2]
 [3 4 5]]
[[0 3]
 [1 4]
 [2 5]]


array([[0, 3],
       [1, 4],
       [2, 5]])
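
The `axes` argument becomes more interesting with more than two dimensions. As a sketch with an illustrative 3-D array `t`: `T` reverses the order of all axes, while `transpose` can swap only selected ones.

```python
import numpy as np

t = np.arange(24).reshape(2, 3, 4)

reversed_axes = t.T              # axes (0, 1, 2) become (2, 1, 0)
swapped = t.transpose(1, 0, 2)   # only the first two axes are swapped

print(reversed_axes.shape)
print(swapped.shape)

# The element at t[i, j, k] is found at swapped[j, i, k]
print(swapped[2, 1, 3] == t[1, 2, 3])
```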

## Basic operators

Numpy supports many basic math operations like subtracting one array from the other or multiplying two arrays. 

In [18]:
n = np.array([20, 30, 40, 50])
p = np.array([0, 1, 2, 3])

In [19]:
n - p

array([20, 29, 38, 47])

In [20]:
n * p

array([  0,  30,  80, 150])

The dot product is supported as well, either by calling `array1.dot(array2)` or, in a more functional way, by calling `np.dot(array1, array2)`.

In [21]:
n.dot(p)  # np.dot(n, p)

260
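
For two vectors, the dot product is the sum of the elementwise products. The following sketch recomputes the result above by hand; `manual` is a name chosen for this example.

```python
import numpy as np

n = np.array([20, 30, 40, 50])
p = np.array([0, 1, 2, 3])

# 20*0 + 30*1 + 40*2 + 50*3
manual = (n * p).sum()

print(manual, n.dot(p), np.dot(n, p))
```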

## Unary operators

You can apply operations to a single numpy-array, which numpy then performs elementwise. Comparison and arithmetic operators are supported, as are functions like the exponential function `np.exp(array)`.

In [22]:
n < 35

array([ True,  True, False, False])

In [23]:
p ** 2

array([0, 1, 4, 9])

In [24]:
np.exp(p)

array([ 1.        ,  2.71828183,  7.3890561 , 20.08553692])

In [25]:
np.sqrt(p)

array([0.        , 1.        , 1.41421356, 1.73205081])

In [26]:
np.log(n)

array([2.99573227, 3.40119738, 3.68887945, 3.91202301])
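
The operators that take only one operand, such as the minus sign or the bitwise inversion `~`, work elementwise in the same way. This is a small sketch reusing the arrays `n` and `p` from above, with result names made up for this example:

```python
import numpy as np

n = np.array([20, 30, 40, 50])
p = np.array([0, 1, 2, 3])

negated = -p               # arithmetic negation of every element
inverted = ~(n < 35)       # logical inversion of a boolean array
round_trip = np.log(np.exp(p))   # functions can be chained

print(negated)
print(inverted)
print(np.allclose(round_trip, p))
```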

## Sum, maximum, minimum

`array.sum(axis)` for the sum, `array.max(axis)` for the maximum and `array.min(axis)` for the minimum are functions built into numpy. If these functions are used on multidimensional arrays, you can specify the axis along which the operation is performed. The specified axis has to exist in the array, otherwise an error is raised.

In [27]:
m = np.arange(12).reshape(3,4)

In [28]:
m.sum(axis=0)

array([12, 15, 18, 21])

In [29]:
m.min(axis=1)

array([0, 4, 8])
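
The sketch below contrasts a reduction without an axis, reductions along each axis, and the error for an axis that does not exist. It relies on numpy's `AxisError` being a subclass of `ValueError`; the variable names are illustrative.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

total = m.sum()             # no axis: all elements are reduced to one value
row_sums = m.sum(axis=1)    # one value per row
col_max = m.max(axis=0)     # one value per column

print(total, row_sums, col_max)

try:
    m.sum(axis=2)           # a 2-D array only has the axes 0 and 1
    axis_error = None
except ValueError as exc:   # AxisError derives from ValueError
    axis_error = exc

print(axis_error)
```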

## Matrix multiplication

If you want to perform a matrix multiplication, you have to call `np.dot`. Be careful not to use the `*` operator: in numpy it works elementwise and does not perform matrix multiplication.

In [30]:
X = np.array([[2, -1], [0, 3], [1, 0]])
Y = np.array([[2, 0], [1, -1]])

In [31]:
A = X.dot(Y) # np.dot(X, Y)
print(A)
print(A.shape)

[[ 3  1]
 [ 3 -3]
 [ 2  0]]
(3, 2)
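
To make the difference visible, this sketch applies both operations to the same square matrix. It also uses the `@` operator, which is not mentioned in the original text; for two-dimensional arrays it gives the same result as `np.dot`. The matrix `M` is made up for this example.

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])

elementwise = M * M       # every element is squared
matrix_product = M @ M    # rows times columns

print(elementwise)
print(matrix_product)
print(np.array_equal(matrix_product, M.dot(M)))
```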


## Indexing elements

With `array[row][column]` you can index a multidimensional array in the same way as nested Python lists.

In [32]:
B = np.array([[[ 111, 112 ], [ 121, 122 ]],
              [[ 211, 212 ], [ 221, 222 ]],
              [[ 311, 312 ], [ 321, 322 ]]])

In [33]:
print(B[2][1][0])

321


Additionally, numpy offers the index syntax `array[row, column]`.

In [34]:
print(B[2, 1, 0])

321


By specifying only the first index, you get the whole sub-array at that position.

In [35]:
print(B[1])

[[211 212]
 [221 222]]


As with lists, negative indices are supported; they count from the end of an axis.

In [36]:
print(B[-1, -1])

[321 322]
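
The indexing variants above can be compared side by side. This is a compact sketch on the same array `B`; the result names are chosen for this example.

```python
import numpy as np

B = np.array([[[111, 112], [121, 122]],
              [[211, 212], [221, 222]],
              [[311, 312], [321, 322]]])

# Chained and comma-separated indexing address the same element
same_element = B[2][1][0] == B[2, 1, 0]

# A single index removes the first axis and leaves a 2x2 matrix
block = B[1]

# Negative indices count from the end: -1 is the last position
last_row = B[-1, -1]

print(same_element, block.shape, last_row)
```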


## Indexing with index arrays/lists

Indexing with another array or list is supported as well. You can create an array of indices and use it with the index operator `array[]` to index another array. The result is an array holding the elements of the initial array at those indices.

In [37]:
s = np.arange(12) ** 2
i = np.array([1, 1, 3, 8, 5]) # i = [1, 1, 3, 8, 5]

In [38]:
s[i]

array([ 1,  1,  9, 64, 25])
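
Two properties of index arrays are sketched here with illustrative names: the result takes the shape of the index array, and it is a new array, so changing it leaves the indexed array untouched.

```python
import numpy as np

s = np.arange(12) ** 2
idx = np.array([[1, 2],
                [3, 4]])

picked = s[idx]      # same shape as idx, filled with s[1], s[2], s[3], s[4]
print(picked)

picked[0, 0] = -1    # modifies only the result
print(s[1])          # the source array still holds 1
```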

## Indexing with boolean values

If you create a numpy-array with boolean values and use it to index another numpy-array, you get an array of all the values whose corresponding entry in the boolean array is `True`. This is especially helpful when the goal is to filter the array.

In [39]:
g = np.arange(12).reshape(3, 4)

In [40]:
h = g > 4
print(h)

[[False False False False]
 [False  True  True  True]
 [ True  True  True  True]]


In [41]:
g[h]

array([ 5,  6,  7,  8,  9, 10, 11])

Besides filtering, the boolean index array can be used to assign a new value to all elements whose entry is `True`.

In [42]:
g[h] = 0
print(g)

[[0 1 2 3]
 [4 0 0 0]
 [0 0 0 0]]
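
Several conditions can be combined into one boolean index with the elementwise operators `&` (and), `|` (or) and `~` (not). The following is a sketch with made-up result names; the parentheses around each comparison are required because of operator precedence.

```python
import numpy as np

g = np.arange(12).reshape(3, 4)

even_and_large = g[(g > 2) & (g % 2 == 0)]   # greater than 2 and even
outer = g[(g < 2) | (g > 9)]                 # smaller than 2 or greater than 9
not_large = g[~(g > 4)]                      # everything that is not greater than 4

print(even_and_large)
print(outer)
print(not_large)
```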


## Slicing

Analogous to lists, numpy supports slicing of arrays. The slicing operator has the form `array[start:end:step]`, where the start value is included and the end value is excluded. All three parameters are optional and can be left out to create different results.

In [43]:
S = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [44]:
S[3:6:2]

array([3, 5])

In [45]:
S[:4]

array([0, 1, 2, 3])

In [46]:
S[4:]

array([4, 5, 6, 7, 8, 9])

Specifying no start, end or step returns all elements of the original array.

In [47]:
S[:]

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
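
A few more slices on the same data, sketched with illustrative names, show the excluded end value, negative indices and a negative step:

```python
import numpy as np

S = np.arange(10)

middle = S[3:6]         # index 6 is excluded: 3, 4, 5
tail = S[-3:]           # the last three elements: 7, 8, 9
every_third = S[1::3]   # start at index 1, then every third element: 1, 4, 7
backwards = S[::-1]     # a negative step reverses the order

print(middle, tail, every_third, backwards)
```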

## Slicing (2 axes)

Slicing is also possible on two or more axes. For multidimensional slicing, specify one slice per dimension, separated by commas (`array[slicing-axis0, slicing-axis1]`). The axes are then sliced independently of each other.

In [48]:
J = np.arange(25).reshape(5,5)

In [49]:
J[:3, 2:]

array([[ 2,  3,  4],
       [ 7,  8,  9],
       [12, 13, 14]])

In [50]:
J[3:, :]

array([[15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [51]:
K = np.arange(28).reshape(4, 7)

In [52]:
K[::2, ::3]

array([[ 0,  3,  6],
       [14, 17, 20]])

In [53]:
K[:, ::3]

array([[ 0,  3,  6],
       [ 7, 10, 13],
       [14, 17, 20],
       [21, 24, 27]])
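
Slices and single indices can be mixed. As a sketch with made-up result names: a single index removes its axis, while a slice of length one keeps it.

```python
import numpy as np

J = np.arange(25).reshape(5, 5)

column = J[:, 1]         # all rows, column 1 as a 1-D array
column_2d = J[:, 1:2]    # the same values, but still a 2-D array
block = J[1:3, ::2]      # rows 1 and 2, every second column

print(column, column.shape)
print(column_2d.shape)
print(block)
```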

## View

Slicing does not copy the data of the array; it only creates a view on the same data. If you edit the view, the initial array changes as well, which is sometimes unintended.

In [54]:
D = np.arange(10)

In [55]:
V = D[2:6]

In [56]:
V[0] = 22
V[1] = 23

In [58]:
print(D)

[ 0  1 22 23  4  5  6  7  8  9]


To create a deep copy of an array, numpy provides the function `np.copy(array)` and the method `array.copy()`. With a copy, the values of the initial array aren't affected by changes to the copy.

In [59]:
Q = D[2:6].copy()
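
The difference between a view and a copy can be checked directly. This sketch uses `np.shares_memory`, which is not part of the original cells, to test whether two arrays use the same underlying data:

```python
import numpy as np

D = np.arange(10)
V = D[2:6]           # view on D
Q = D[2:6].copy()    # independent copy

Q[0] = -1            # changes only the copy
V[1] = 99            # changes D as well

print(D)
print(np.shares_memory(D, V), np.shares_memory(D, Q))
```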

## Arrays of ones/zeros

Numpy offers functions to create arrays that contain only ones or only zeros. `np.ones(shape)` creates an array of the specified shape filled with ones. `np.zeros(shape)` does the same, but all values are zero.

In [60]:
np.ones((2,3))

array([[1., 1., 1.],
       [1., 1., 1.]])

The default data type of `np.zeros` and `np.ones` is float. This can be changed by specifying the `dtype` in the function call.

In [61]:
np.ones((3,4), dtype=int)

array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]])

In [62]:
np.zeros((2,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.]])
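
Some further uses, sketched with illustrative names: a boolean array of `False` values, a one-dimensional array created from a plain integer as shape, and an array filled with another constant by scaling `np.ones`.

```python
import numpy as np

flags = np.zeros((2, 2), dtype=bool)   # zeros interpreted as False
counts = np.zeros(4, dtype=int)        # an integer shape gives a 1-D array
sevens = np.ones((2, 3)) * 7           # scale the ones to get another constant

print(flags)
print(counts)
print(sevens)
```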

## Matrices with random numbers

For some applications you need random numbers. Numpy supports initializing arrays with random values. `np.random.rand(d0, d1, ...)` takes the dimensions as separate arguments and creates an array of that shape with random values from a uniform distribution between zero (included) and one (excluded).

In [63]:
np.random.rand(2,3)

array([[0.93531562, 0.48050018, 0.89799898],
       [0.19355958, 0.62478602, 0.76575978]])

Calling `np.random.randn(d0, d1, ...)` works like `np.random.rand`, except that it draws its values from a univariate “normal” (Gaussian) distribution with mean zero and variance one.

In [64]:
np.random.randn(2,3)

array([[-0.80939722,  0.48977304, -0.52205476],
       [ 2.39497595,  1.47818157, -0.31671905]])
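
Random values differ on every run. If you need reproducible numbers, one option is to fix the seed with `np.random.seed` first, which is not used in the original cells. The sketch below also checks the value range of `rand` and the rough mean and standard deviation of `randn`; the seed value and the sample size are arbitrary choices.

```python
import numpy as np

np.random.seed(0)                  # fix the seed
first = np.random.rand(2, 3)

np.random.seed(0)                  # same seed, same numbers
second = np.random.rand(2, 3)

print(np.array_equal(first, second))
print(first.min() >= 0.0, first.max() < 1.0)

normal = np.random.randn(10_000)
print(normal.mean(), normal.std())   # close to 0 and 1
```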

## Iterating

You can iterate over a numpy-array like over a standard Python list. If you iterate over a matrix, each iteration yields one row of the matrix.

In [65]:
for row in np.arange(12).reshape(3,4):
    print(row)

[0 1 2 3]
[4 5 6 7]
[ 8  9 10 11]
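
To iterate over the columns or over every single element instead, one possible approach is to loop over the transposed matrix or over the `flat` attribute. Neither appears in the original cells, so take this as a sketch with made-up names:

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

# The rows of the transposed matrix are the columns of m
columns = [col.tolist() for col in m.T]

# flat iterates over all elements, row by row
elements = [int(v) for v in m.flat]

print(columns)
print(elements)
```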


## Stacking arrays

Multiple numpy-arrays can be combined into one array by stacking them. The corresponding numpy functions are `np.vstack(arrays)` and `np.hstack(arrays)`.

In [66]:
e = np.array([[1,2], [3,4]])
r = np.array([[11, 22], [33, 44]])

With `np.vstack(arrays)` the arrays are stacked vertically. To stack two matrices vertically, their sizes along axis one (the number of columns) have to match.

In [67]:
np.vstack((e,r))

array([[ 1,  2],
       [ 3,  4],
       [11, 22],
       [33, 44]])

With `np.hstack(arrays)` the arrays are stacked horizontally. To stack two matrices horizontally, their sizes along axis zero (the number of rows) have to match.

In [68]:
np.hstack((e,r))

array([[ 1,  2, 11, 22],
       [ 3,  4, 33, 44]])
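
The shape requirements can be tried out with arrays of different sizes. In this sketch, `row` is a made-up array with one row and two columns: it can be stacked below `e`, but not next to it.

```python
import numpy as np

e = np.array([[1, 2], [3, 4]])
row = np.array([[5, 6]])         # shape (1, 2)

taller = np.vstack((e, row))     # column counts match: 2 and 2
print(taller.shape)

try:
    np.hstack((e, row))          # row counts differ: 2 and 1
    hstack_error = None
except ValueError as exc:
    hstack_error = exc

print(hstack_error)
```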