## 11 NumPy
NumPy is a widely used library for scientific computing in Python.
The package provides high-performance computation on multidimensional arrays, along with many built-in functions for working with them.
This notebook demonstrates basic and advanced NumPy functionality.

In [11]:
import numpy as np # importing numpy with an alias for easy use

Initializing arrays, accessing elements, and inspecting the properties of an array

In [12]:
a = np.array([1, 2, 3, 4, 5])

print(a)
print(type(a)) # type() is a Python built-in that works on any object
print(a.shape) # shape is an attribute of the array: a tuple with the size of each dimension

print(a[0]) # access elements like in a Python list

a[0] = 6 # arrays are mutable: overwrite the first element
print(a)


[1 2 3 4 5]
<class 'numpy.ndarray'>
(5,)
1
[6 2 3 4 5]


Creating multidimensional arrays or matrices

In [13]:
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(b.shape)
print(b)
print(f"b[0, 0]: {b[0, 0]}  ,  b[1, 2]: {b[1, 2]}")

(3, 3)
[[1 2 3]
 [4 5 6]
 [7 8 9]]
b[0, 0]: 1  ,  b[1, 2]: 6
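
Elements of a multidimensional array can also be selected in whole rows, columns, or blocks with slices. The following is a minimal sketch of that idea; the variable names `second_row`, `third_col` and `sub_block` are chosen here for illustration only.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# a single index tuple b[i, j] selects the same element as b[i][j]
assert b[1, 2] == b[1][2]

second_row = b[1, :]      # all columns of row 1
third_col = b[:, 2]       # all rows of column 2
sub_block = b[0:2, 1:3]   # rows 0-1, columns 1-2 (the stop index is exclusive)

print(second_row)
print(third_col)
print(sub_block)
```

As with Python lists, indices start at 0 and the stop index of a slice is not included.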


Convenience functions for easy array creation

In [14]:
a = np.zeros((3,3))
print(a)

b = np.ones((1, 2))
print(b)

c = np.eye(3) # creates a 3x3 identity matrix
print(c)

d = np.random.random((3,3)) # creates a 3x3 matrix with random values in the interval [0, 1)
print(d)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
[[1. 1.]]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
[[0.34545575 0.79850944 0.28970133]
 [0.41972691 0.62655038 0.89506978]
 [0.21492131 0.70660553 0.36802376]]


NumPy offers two built-in functions for creating ranges (similar to the Python built-in <i>range()</i>): np.arange() and np.linspace().<br>
As the name suggests, np.arange() behaves like the Python built-in: you set a range and a step size. The official NumPy docs recommend it for creating integer sequences. It includes the start value but not the stop value, just like range() (half-open interval [start, stop)).<br>
np.linspace(), on the other hand, takes a range and the number of samples you want and returns evenly spaced numbers from that interval, inferring the step size from the given information. By default the result includes both the start and the stop value (closed interval [start, stop]). It is the recommended function for floating point numbers.<br>
For the values used here both functions return the expected numbers. With a non-integer step, however, np.arange() can return an unexpected number of elements because of floating point rounding, so make sure you understand how your preferred function works:

In [15]:
linspace_int_range = np.linspace(start=0, stop=10, num=11)
linspace_float_range = np.linspace(start=0, stop=10, num=10)
arange_int_range = np.arange(start=0,stop=10,step=1)
arange_float_range = np.arange(start=0,stop=10,step=0.1)

print(f"linspace_int_range: {linspace_int_range}")
print(f"linspace_float_range: {linspace_float_range}")
print(f"arange_int_range: {arange_int_range}")
print(f"arange_float_range: {arange_float_range}")

linspace_int_range: [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
linspace_float_range: [ 0.          1.11111111  2.22222222  3.33333333  4.44444444  5.55555556
  6.66666667  7.77777778  8.88888889 10.        ]
arange_int_range: [0 1 2 3 4 5 6 7 8 9]
arange_float_range: [0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.  1.1 1.2 1.3 1.4 1.5 1.6 1.7
 1.8 1.9 2.  2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.  3.1 3.2 3.3 3.4 3.5
 3.6 3.7 3.8 3.9 4.  4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 5.  5.1 5.2 5.3
 5.4 5.5 5.6 5.7 5.8 5.9 6.  6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 7.  7.1
 7.2 7.3 7.4 7.5 7.6 7.7 7.8 7.9 8.  8.1 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9
 9.  9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9]
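
The relationship between the two functions can be checked directly. The sketch below is an illustration rather than a recommendation: it asks np.linspace() for the step size it inferred and uses `endpoint=False` to reproduce the half-open interval of np.arange(). The variable names are made up for this example.

```python
import numpy as np

# retstep=True additionally returns the step size that linspace inferred
values, step = np.linspace(start=0, stop=10, num=11, retstep=True)
print(step)

# endpoint=False excludes the stop value, mimicking the half-open interval of arange
half_open = np.linspace(start=0, stop=10, num=10, endpoint=False)
same_as_arange = np.allclose(half_open, np.arange(start=0, stop=10, step=1))
print(same_as_arange)

# with a float step the length of an arange result depends on floating point rounding,
# so it is worth checking instead of assuming
float_range = np.arange(start=0, stop=1, step=0.1)
print(len(float_range), float_range[-1])
```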


### Exercise 11.1
Create an array with shape (3,7) filled like
[[1, 2, 3 ,4 , 5, 6, 7]
 [8, 9, ...
 ]]

In [16]:
### your code here

Selecting and modifying elements of a matrix with index arrays:

In [17]:
# assumes a is a (3,7) array, e.g. the result of Exercise 11.1
c = np.array([ 2, 3, 6])
# selects one element from each row, using the column indices given by c
print(a[np.arange(3), c])
# add 10 to those elements
a[np.arange(3), c] += 10
print(a)
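
Since the cell above depends on the array from the exercise, here is a self-contained sketch of the same indexing technique on a small made-up matrix `m` (the names `m`, `rows`, `cols` and `picked` are assumptions for illustration):

```python
import numpy as np

m = np.array([[10, 11, 12],
              [20, 21, 22],
              [30, 31, 32]])
cols = np.array([2, 0, 1])        # one column index per row
rows = np.arange(m.shape[0])      # [0, 1, 2]

picked = m[rows, cols]            # elements m[0, 2], m[1, 0], m[2, 1]
print(picked)                     # [12 20 31]

m[rows, cols] += 100              # modifies exactly those three elements in place
print(m)
```

The two index arrays are paired up element by element, so the result has one entry per (row, column) pair.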

### Exercise 11.2
Create an array that contains the elements of the last two rows in columns three to five

In [18]:
### your code here

### Exercise 11.3

Select the third, sixth and second element of rows one, two and three, respectively, and multiply these elements by ten

In [19]:
### your code here

### Boolean indexing and masks

In [20]:
mask = (a>9) # boolean array that is True wherever the element of a is larger than 9
print(mask)

print(a[mask]) # prints all values that fulfill the condition

print(a[a>2]) # mask and selection in one line (very convenient for filtering data)

[[False False False]
 [False False False]
 [False False False]]
[]
[]
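
The output above is empty because no element of a fulfills the condition. A small sketch with made-up data (`data` is an assumption, not an array from this notebook) shows what a non-empty selection looks like and how conditions can be combined:

```python
import numpy as np

data = np.array([[1, 12, 3],
                 [10, 5, 20]])

mask = data > 9
selected = data[mask]                       # 1-D array of the matching values
print(selected)                             # values 12, 10, 20

# combine conditions with & (and), | (or), ~ (not); the parentheses are required
in_range = data[(data > 2) & (data < 15)]
print(in_range)                             # values 12, 3, 10, 5

# boolean masks also work on the left-hand side of an assignment
clipped = data.copy()
clipped[clipped > 9] = 9
print(clipped)
```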


## Datatypes in numpy

NumPy infers the datatype from the values passed at creation.
A particular datatype can be enforced during initialization to avoid issues in calculations and assertions.

In [21]:
x = np.array([1,2])
print(x.dtype)

x = np.array([1.0, 2.0])
print(x.dtype)

x = np.array([1, 2], dtype=np.int8) #forces the datatype to be int8
print(x.dtype)

int64
float64
int8
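
One of the calculation issues mentioned above is integer overflow: small integer types wrap around silently. The following sketch illustrates this under the assumption that `int8` was chosen deliberately; the variable names are made up for the example.

```python
import numpy as np

# mixing ints and floats upcasts the whole array to a float type
mixed = np.array([1, 2.5])
print(mixed.dtype)                      # float64

# small integer types wrap around silently when a result does not fit
small = np.array([100], dtype=np.int8)  # int8 holds values from -128 to 127
wrapped = small + small                 # 200 does not fit into int8
print(wrapped)                          # [-56]

# astype() returns a converted copy; the original array keeps its dtype
safe = small.astype(np.int64) + small.astype(np.int64)
print(safe)                             # [200]
```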


## The useful stuff: Math with numpy arrays

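NumPy has many built-in functions for basic math. The arithmetic operators and the corresponding functions (np.add, np.subtract, ...) give the same elementwise results.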

In [22]:
x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
y = 2*np.eye(3)

print(f"x:\n {x}")
print(f"y:\n {y}\n")

#elementwise addition
print("elementwise addition:")
print(x+y) 
print(f"{np.add(x,y)}\n")
#elementwise subtraction
print("elementwise subtraction:")
print(x-y) 
print(f"{np.subtract(x,y)}\n")
#elementwise multiplication
print("elementwise multiplication:")
print(x*y)
print(f"{np.multiply(x,y)}\n")
#elementwise division
print("elementwise division:")
print(x/x)
print(f"{np.divide(x,x)}\n")
#elementwise square root
print("elementwise square root:")
print(np.sqrt(x))

x:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
y:
 [[2. 0. 0.]
 [0. 2. 0.]
 [0. 0. 2.]]

elementwise addition:
[[ 3.  2.  3.]
 [ 4.  7.  6.]
 [ 7.  8. 11.]]
[[ 3.  2.  3.]
 [ 4.  7.  6.]
 [ 7.  8. 11.]]

elementwise subtraction:
[[-1.  2.  3.]
 [ 4.  3.  6.]
 [ 7.  8.  7.]]
[[-1.  2.  3.]
 [ 4.  3.  6.]
 [ 7.  8.  7.]]

elementwise multiplication:
[[ 2.  0.  0.]
 [ 0. 10.  0.]
 [ 0.  0. 18.]]
[[ 2.  0.  0.]
 [ 0. 10.  0.]
 [ 0.  0. 18.]]

elementwise division:
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]

elementwise square root:
[[1.         1.41421356 1.73205081]
 [2.         2.23606798 2.44948974]
 [2.64575131 2.82842712 3.        ]]


### Matrix multiplication

You can utilize numpy to easily calculate the dot or cross product for matrices...

In [23]:
print(f"x:\n {x}")
print(f"y:\n {y}\n")

print(f"dot product methods:")
print(x.dot(y))
print(np.dot(x,y))
print(x@y) # the @ operator (Python 3.5+) is the preferred notation for matrix multiplication

print("\ncross product (row wise)")
print(np.cross(x, y)) #cross product of vectors (row wise)

x:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
y:
 [[2. 0. 0.]
 [0. 2. 0.]
 [0. 0. 2.]]

dot product methods:
[[ 2.  4.  6.]
 [ 8. 10. 12.]
 [14. 16. 18.]]
[[ 2.  4.  6.]
 [ 8. 10. 12.]
 [14. 16. 18.]]
[[ 2.  4.  6.]
 [ 8. 10. 12.]
 [14. 16. 18.]]

cross product (row wise)
[[  0.   6.  -4.]
 [-12.   0.   8.]
 [ 16. -14.   0.]]
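
Because y is a multiple of the identity matrix, the difference between `*` and `@` is easy to miss in the output above. The sketch below uses two small made-up matrices `A` and `B` to make the distinction visible:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

elementwise = A * B      # multiplies matching entries
matrix_product = A @ B   # rows of A times columns of B
print(elementwise)
print(matrix_product)

# the inner dimensions must match: (2, 2) @ (3, 2) is not defined
try:
    A @ np.ones((3, 2))
    shape_error = False
except ValueError:
    shape_error = True
print(shape_error)
```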


... and vectors:

In [24]:
v = np.array([1, 2, 3]) #vector for matrix vector multiplication
w = np.array([4, 5, 6])

print(f"v: {v}")
print(f"w: {w}\n")

print("scalar product methods:")
print(v.dot(w)) #scalar product
print(np.dot(v, w))

print("\ndot product vector/matrix:")
print(x.dot(v))
print(np.dot(x,v))
print(x@v)

print("\ncross product (vector orthogonal to v, w):")
print(np.cross(v,w)) # vector orthogonal to v, w

v: [1 2 3]
w: [4 5 6]

scalar product methods:
32
32

dot product vector/matrix:
[14 32 50]
[14 32 50]
[14 32 50]

cross product (vector orthogonal to v, w):
[-3  6 -3]
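
The orthogonality of the cross product can be verified with the scalar product, which must be zero for orthogonal vectors. A short check, using the same v and w as above:

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
n = np.cross(v, w)

print(np.dot(n, v), np.dot(n, w))  # both scalar products are 0

# swapping the arguments flips the sign of the result
print(np.array_equal(np.cross(w, v), -n))
```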


### More built-in functions

np.sum() sums up all elements of an array/matrix. The axis parameter selects the axis to sum along: axis=0 yields one sum per column, axis=1 one sum per row (many other NumPy functions accept the same parameter):

In [25]:
print(f"np.sum(x): {np.sum(x)}") # sums up all elements
print(f"np.sum(x, axis=0): {np.sum(x, axis=0)}") # sums along axis 0: one sum per column
print(f"np.sum(x, axis=1): {np.sum(x, axis=1)}") # sums along axis 1: one sum per row

np.sum(x): 45
np.sum(x, axis=0): [12 15 18]
np.sum(x, axis=1): [ 6 15 24]
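
A reduction along an axis removes that axis from the result. The sketch below shows the `keepdims` parameter, which keeps the reduced axis with length 1 so that the result can be combined with the original array again; normalizing the rows is just one possible use chosen for illustration.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

row_sums = np.sum(x, axis=1)                    # shape (3,)
row_sums_2d = np.sum(x, axis=1, keepdims=True)  # shape (3, 1)
print(row_sums.shape, row_sums_2d.shape)

# thanks to keepdims, each row can be divided by its own sum
normalized = x / row_sums_2d
print(np.sum(normalized, axis=1))               # every row now sums to 1
```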


np.abs() returns the absolute values from a list/array/matrix:

In [26]:
print(np.abs(np.array([1,2,-4,-6,8])))

[1 2 4 6 8]


np.average() and np.mean() behave similarly: both return the average over a whole array/matrix or along one of its axes. np.average() additionally accepts a weights array to calculate a weighted average.

In [27]:
print("np.average:")
print(f"np.average(x): {np.average(x)}")
print(f"np.average(x, axis=0): {np.average(x, axis=0)}")
print(f"np.average(x, axis=1): {np.average(x, axis=1)}")
print(f"np.average(x, axis=1), weighted: {np.average(x, axis=1, weights=[5,7,25])}") # avg = sum(x * weights) / sum(weights)

print("\nnp.mean:")
print(f"np.mean(x): {np.mean(x)}")
print(f"np.mean(x, axis=1): {np.mean(x, axis=1)}")

np.average:
np.average(x): 5.0
np.average(x, axis=0): [4. 5. 6.]
np.average(x, axis=1): [2. 5. 8.]
np.average(x, axis=1), weighted: [2.54054054 5.54054054 8.54054054]

np.mean:
np.mean(x): 5.0
np.mean(x, axis=1): [2. 5. 8.]
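
The formula from the comment above, avg = sum(x * weights) / sum(weights), can be written out by hand and compared with np.average(). This is only a cross-check of the formula, not how NumPy implements it internally:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
weights = np.array([5, 7, 25])

# weighted average of each row, computed manually
manual = np.sum(x * weights, axis=1) / np.sum(weights)
builtin = np.average(x, axis=1, weights=weights)
print(manual)
print(np.allclose(manual, builtin))
```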


np.argmax() can be used to return the indices of the maximum values along an axis, while np.argmin() does the same for the minimum values:

In [28]:
z = np.array([[1,4,3],[7,2,5],[3,6,9]])
print(f"z:\n{z}\n")

print(f"np.argmax(z, axis=0): {np.argmax(z, axis=0)}")
print(f"np.argmax(z, axis=1): {np.argmax(z, axis=1)}")

print(f"np.argmin(z, axis=0): {np.argmin(z, axis=0)}")
print(f"np.argmin(z, axis=1): {np.argmin(z, axis=1)}")

z:
[[1 4 3]
 [7 2 5]
 [3 6 9]]

np.argmax(z, axis=0): [1 2 2]
np.argmax(z, axis=1): [1 0 2]
np.argmin(z, axis=0): [0 1 0]
np.argmin(z, axis=1): [0 1 0]
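
Without an axis argument, np.argmax() returns a single index into the flattened array. A minimal sketch of how that index can be converted back into a (row, column) pair with np.unravel_index():

```python
import numpy as np

z = np.array([[1, 4, 3], [7, 2, 5], [3, 6, 9]])

flat_index = np.argmax(z)                         # position in the flattened array
row, col = np.unravel_index(flat_index, z.shape)  # convert to a (row, column) pair
print(flat_index, row, col)
print(z[row, col] == z.max())
```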


### Randomisation

Numpy offers a wide range of functions that allow you to randomly sample from a given range or array. If you want your results to be reproducible you should opt to set a fixed seed:

In [29]:
rng = np.random.default_rng(seed=1337)

print(f"rng.integers(endpoint=True):\n{rng.integers(low=0,high=10,size=100,endpoint=True)}")
print(f"rng.integers(endpoint=False):\n{rng.integers(low=0,high=10,size=100,endpoint=False)}")

# Return random floats in the half-open interval [0.0, 1.0). Results are from the “continuous uniform” distribution over the stated interval.
print(f"\nrng.random(size=10):\n{rng.random(10)} <- sampled from 'continuous uniform' distribution")

# array with 5 samples from the standard normal distribution
print(f"\nrng.standard_normal(5): {rng.standard_normal(5)}")

# 3x4 array with samples from the normal distribution with mean=2 and standard deviation=4:
print(f"\n2 + 4*rng.standard_normal(size=(3,4)):\n{2 + 4*rng.standard_normal(size=(3,4))} <- sampled from a shifted normal distribution")

# Generates a random sample from a given array
# replace: Whether the sample is with or without replacement. Default is True, meaning that a value of 'a' can be selected multiple times.
# shuffle: Whether the sample is shuffled when sampling without replacement. Default is True, False provides a speedup.
choices = [1,2,5,19,15,12,20,21,72,45,12,5,2,3,7,40,21,42,25]

print(f"\nchoices array: {choices}")
print(f"rng.choice(choices,size=10): {rng.choice(choices,size=10)} <- 5 appears 3 times")
print(f"rng.choice(choices,size=10,replace=False): {rng.choice(choices,size=10,replace=False)} <- each entry of choices is drawn at most once")

# Return random bytes
print(f"\nrng.bytes(10): {rng.bytes(10)}")

rng.integers(endpoint=True):
[ 6  9  7  2  5 10  4 10  3  9  5  1  0  2  1  3  3  5  6  9  2  0  5  2
  9  8  1  4  3  0  8  4  5  3  2  0  0  9  5  0  4  9 10  5  9  0  0  8
  3  8  5  7  6  0  7  6 10 10  8  8  3  2  4  0  0 10  3  8  5 10  6  8
  8  1  7  1  3  2  4  6 10  7  0  8  4  4  3  3  6  9  5  1  9  2  7  3
  0  8  2  5]
rng.integers(endpoint=False):
[5 8 2 9 7 2 9 3 6 1 9 1 1 8 4 2 6 0 7 8 9 3 7 8 4 0 1 6 1 0 2 4 7 3 5 2 7
 2 2 2 3 7 1 1 8 6 6 1 1 9 3 5 1 5 6 6 3 4 7 7 4 4 5 6 1 1 6 7 6 5 7 1 6 3
 5 5 7 0 6 9 4 5 2 2 4 6 2 0 6 1 6 6 8 1 2 9 0 9 0 3]

rng.random(size=10):
[0.13866216 0.94755104 0.50825476 0.67010873 0.53283281 0.27643336
 0.27839879 0.41565925 0.25086189 0.50950071] <- sampled from 'continuous uniform' distribution

rng.standard_normal(5): [ 0.12085028 -0.33929705 -0.4682714  -0.91131567 -0.84963165]

2 + 4*rng.standard_normal(size=(3,4)):
[[ 6.68242195  5.33367065 -4.0828517   5.34001929]
 [ 2.96679265  0.57094653 -2.40165729 10.33879524]
 [-3.29821357 -1.
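
What "reproducible" means in practice can be sketched with two generators that share a seed. The names `rng_a` and `rng_b` are made up for this illustration:

```python
import numpy as np

rng_a = np.random.default_rng(seed=1337)
rng_b = np.random.default_rng(seed=1337)

first = rng_a.integers(low=0, high=10, size=5)
second = rng_b.integers(low=0, high=10, size=5)
print(np.array_equal(first, second))   # True: same seed, same call, same numbers

# consecutive draws from one generator differ, since its internal state advances
draw_1 = rng_a.random(3)
draw_2 = rng_a.random(3)
print(np.array_equal(draw_1, draw_2))

# sampling without replacement from distinct values never repeats a value
sample = rng_a.choice(np.arange(20), size=10, replace=False)
print(len(np.unique(sample)) == len(sample))
```

Note that the same seed only reproduces the same numbers if the same sequence of calls is made on the generator.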

### Transposition


In [30]:
print(x)
print(x.T)
print(v)
print(v.T)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[1 4 7]
 [2 5 8]
 [3 6 9]]
[1 2 3]
[1 2 3]
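
As the output shows, transposing the one-dimensional array v has no effect, because there is only one axis to reverse. If a column vector is needed, the array has to be given a second axis first. A short sketch:

```python
import numpy as np

v = np.array([1, 2, 3])
print(v.T.shape)                 # (3,): transposing a 1-D array changes nothing

column = v[:, np.newaxis]        # same as v.reshape(-1, 1)
print(column.shape)              # (3, 1)
print(column.T.shape)            # (1, 3)
```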


### Broadcasting and Reshaping
Broadcasting can be used to add a constant vector to every row or column of a matrix. NumPy expands the smaller array automatically so that the shapes match.

In [31]:
print(x)
print(v)
print(x + v) # this adds v elementwise to every row of x

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[1 2 3]
[[ 2  4  6]
 [ 5  7  9]
 [ 8 10 12]]
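
Whether a vector is added to the rows or to the columns depends on its shape. The sketch below contrasts both cases and shows that shapes which cannot be aligned raise an error; the names `per_row`, `per_col` and `broadcast_failed` are illustrative only.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
v = np.array([1, 2, 3])

per_row = x + v                   # v has shape (3,)   -> added to every row
per_col = x + v[:, np.newaxis]    # v has shape (3, 1) -> added to every column
print(per_row)
print(per_col)

# shapes that cannot be aligned raise an error instead of guessing
try:
    x + np.array([1, 2])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)
```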


In [32]:
z = np.array([1,2])
# outer product of vectors via broadcasting:
# reshape v into a column vector of shape (3,1), then multiply it with z of shape (2,)
print(np.reshape(v, (3,1)))
print(np.reshape(v, (3,1))* z)

[[1]
 [2]
 [3]]
[[1 2]
 [2 4]
 [3 6]]
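
NumPy also provides np.outer() for the outer product. The following sketch checks that it agrees with the reshape-and-broadcast approach above and shows the `-1` placeholder of reshape:

```python
import numpy as np

v = np.array([1, 2, 3])
z = np.array([1, 2])

via_broadcasting = np.reshape(v, (3, 1)) * z
via_outer = np.outer(v, z)
print(np.array_equal(via_broadcasting, via_outer))

# -1 lets NumPy infer one dimension from the total number of elements
print(np.arange(6).reshape(-1, 2).shape)   # (3, 2)
```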


## Using skimage for image processing
An RGB image can be interpreted as a three-dimensional array of shape (height, width, 3). skimage and other packages read images into NumPy arrays, so they can be processed like any other array.

In [33]:
from skimage import transform, io

img = io.imread('img/raccoon.jpg')
print(f"img.dtype: {img.dtype}\nimg.shape: {img.shape}\n")
print(f"img:\n{img}")
img_tinted = img * [1, 0.8, 0.9] # scales the green and blue channels; the result is a float64 array
img_bw = np.dot(img[...,:3], [0.299, 0.587, 0.114]) # weighted sum of the RGB channels for greyscale conversion
print(f"\nimg_bw:\n{img_bw}")
img_resized = transform.resize(img, (200, 200), mode='symmetric', preserve_range=True) #resizes the image
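print(f"img_resized.dtype: {img_resized.dtype}\nimg_resized.shape: {img_resized.shape}\n")
# the processed arrays are float64; cast them back to uint8 before saving as JPEG
io.imsave(arr=img_tinted.astype(np.uint8), fname='img/raccoon_tinted.jpg')
io.imsave(arr=img_bw.astype(np.uint8), fname='img/raccoon_bw.jpg')
io.imsave(arr=img_resized.astype(np.uint8), fname='img/raccoon_resized.jpg')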

img.dtype: uint8
img.shape: (1080, 1280, 3)

img:
[[[ 79  70  65]
  [ 89  85  76]
  [ 91  88  79]
  ...
  [ 38  30  27]
  [ 40  32  29]
  [ 36  28  25]]

 [[104  93  89]
  [107  98  91]
  [102  95  87]
  ...
  [ 37  29  26]
  [ 37  29  26]
  [ 38  30  27]]

 [[ 94  87  81]
  [ 94  87  81]
  [ 90  86  77]
  ...
  [ 38  30  27]
  [ 36  28  25]
  [ 35  30  26]]

 ...

 [[137 132 139]
  [129 126 133]
  [113 114 118]
  ...
  [211 204 194]
  [210 204 192]
  [210 204 192]]

 [[107 109 106]
  [ 99 104 100]
  [ 90  97  90]
  ...
  [211 204 196]
  [210 203 195]
  [210 203 195]]

 [[ 91  97  85]
  [ 76  84  71]
  [ 78  86  73]
  ...
  [211 204 198]
  [209 202 196]
  [208 201 195]]]

img_bw:
[[ 72.121  85.17   87.871 ...  32.05   34.05   30.05 ]
 [ 95.833  99.893  96.181 ...  31.05   31.05   32.05 ]
 [ 88.409  88.409  86.17  ...  32.05   30.05   31.039]
 ...
 [134.293 127.695 114.157 ... 204.953 204.426 204.426]
 [108.06  102.049  94.109 ... 205.181 204.181 204.181]
 [ 93.838  80.126  82.126 ... 2
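
The tinting and greyscale steps can be tried without an image file. The sketch below uses a hypothetical 2x2 RGB image built by hand (not the raccoon image) and only NumPy, and converts the float results back to uint8:

```python
import numpy as np

# hypothetical 2x2 RGB image: red, green, blue and white pixels
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

tinted = img * [1, 0.8, 0.9]                          # float64 result
tinted_u8 = np.clip(tinted, 0, 255).astype(np.uint8)  # back to valid 8-bit values

gray = np.dot(img[..., :3], [0.299, 0.587, 0.114])    # shape (2, 2), float64
gray_u8 = np.round(gray).astype(np.uint8)

print(tinted.dtype, tinted_u8.dtype)
print(gray_u8)
```

The greyscale array has lost its last axis: each pixel is now a single brightness value instead of three channel values.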

