<a href="https://colab.research.google.com/github/laislemke/DeepLearning/blob/main/NumPy_Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python basics with NumPy
by [Ruben Nuredini](mailto:Ruben.Nuredini@hs-heilbronn.de), [Nicolaj Stache](mailto:Nicolaj.Stache@hs-heilbronn.de), [Andreas Schneider](mailto:Andreas.Schneider@hs-heilbronn.de), Heilbronn University of Applied Sciences

NumPy is the fundamental package for scientific computing with Python. It includes a powerful N-dimensional array object as well as linear algebra and random number capabilities that are useful for data science. NumPy also provides sophisticated (broadcasting) functions out of the box. NumPy is maintained by a large community (http://www.numpy.org). Any time you need more information on a NumPy function, we encourage you to look at [the official documentation](https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.exp.html).

[Matplotlib](https://matplotlib.org/) is a 2D plotting library for Python and NumPy. Its `matplotlib.pyplot` module is a collection of command-style functions.

As NumPy is a library, it has to be included using the `import` keyword.

In [1]:
import numpy as np

Additionally, `matplotlib.pyplot` should be imported. `%matplotlib` is a [magic function](http://ipython.readthedocs.io/en/stable/interactive/tutorial.html#magics-explained) in IPython (the engine Jupyter notebooks run on). `%matplotlib inline` sets the backend of matplotlib to the `inline` backend. With this backend, the output of plotting commands is displayed inline within frontends like the Jupyter notebook, directly below the code cell that produced it. The resulting plots are then also stored in the notebook document.

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

## Basic data structures with NumPy

NumPy provides a very convenient way to create N-dimensional arrays. The data structures used in NumPy to represent these shapes (vectors, matrices...) are called numpy arrays.

### 0 - dimensional Array (Scalar)

In [None]:
x = np.array(42)
print("x: ", x)
print("The type of x: ", type(x))
print("The dimension of x:", np.ndim(x))

### 1 - dimensional Array (Vector)

In [None]:
F = np.array([[5, 1, 4, 2, 6, 0]], dtype=np.double) # the dtype parameter defines the type of the array elements; the double brackets make F a 2-D array with a single row.
V = np.array([1., 2., 3., 4., 5., 6.]) # if the dtype parameter is omitted, NumPy infers the element type.

print("F: ", F)
print("V: ", V)
print("Type of F: ", F.dtype)
print("Type of V: ", V.dtype)
print("Dimension of F: ", np.ndim(F))
print("Dimension of V: ", np.ndim(V))


Let us check the type of F.

In [None]:
type(F)

#### Shape
A very common NumPy attribute used in deep learning is [ndarray.shape](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.shape.html). It returns the shape of an array as a tuple with one entry per dimension; the function `np.shape` gives the same result.

In [None]:
print("Shape of F: ", F.shape)
print("Shape of V: ", V.shape)
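
As a small illustrative sketch, the following compares `shape`, `ndim` and `size` for a 2-D row vector and a 1-D array. The names `row` and `flat` are made up for this example and simply mirror how `F` and `V` were created.

```python
import numpy as np

row = np.array([[5, 1, 4, 2, 6, 0]], dtype=np.double)  # double brackets: 2-D array with one row
flat = np.array([1., 2., 3., 4., 5., 6.])               # single brackets: 1-D array

# shape is a tuple, ndim is its length, size is the total number of elements
print(row.shape, row.ndim, row.size)
print(flat.shape, flat.ndim, flat.size)
```

Both arrays hold six elements, but only `row` carries an explicit orientation.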

#### Transpose
Transposing arrays with [ndarray.T](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.T.html) is another common operation in deep learning. Pay attention to what happens when we try to transpose the vector V.

In [None]:
F_transposed = F.T
V_transposed = V.T

In [None]:
print('F_transposed: \n', F_transposed)
print('Shape of F_transposed: ', F_transposed.shape)
print()

print('V_transposed: \n', V_transposed)
print('Shape of V_transposed: ', V_transposed.shape)

Please note: NumPy does not store the orientation (column vector or row vector) of a one-dimensional array, as you can observe from `V` and `V_transposed`. Hence, transposition does not work as you might expect. This differs from the behavior of MATLAB, which internally treats vectors as matrices (with two dimensions) and can therefore represent row and column vectors.

One approach for Python is to define vectors also as a two dimensional matrix, by using double brackets `[[ ]]`, as done for `F`. This is the recommended solution.

As an alternative, you can turn a one-dimensional vector into a column vector by adding a second axis:

In [None]:
V_transposed2 = V[:,None]
print('V_transposed2: \n', V_transposed2)
print('Shape of V_transposed2: ', V_transposed2.shape)
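
The same column vector can be obtained in a few equivalent ways. The sketch below uses a short example vector `v` (an assumption for illustration, not one of the arrays defined above) and compares indexing with `None`, `np.newaxis` and `reshape`.

```python
import numpy as np

v = np.array([1., 2., 3.])

col_a = v[:, None]         # indexing with None adds a new axis
col_b = v[:, np.newaxis]   # np.newaxis is an alias for None
col_c = v.reshape(-1, 1)   # -1 lets NumPy infer the number of rows

print(col_a.shape)
print(np.array_equal(col_a, col_b) and np.array_equal(col_a, col_c))
print(v.T.shape)           # transposing a 1-D array leaves its shape unchanged
```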

#### Lists vs numpy.ndarray
The type of a regular Python array is `list`, whereas the type of a NumPy array is `numpy.ndarray`. You can easily create a NumPy array out of a list:

In [None]:
# list of room temperatures in centigrades
C = [20.1, 20.8, 21.9, 22.5, 22.7, 22.3, 21.8, 21.2, 20.9, 20.1]

# numpy array created from that list
C_np = np.array(C)

Now let's do computations with both, for example converting degrees centigrade to degrees Fahrenheit using this formula: $$ \vartheta(°F) = \vartheta(°C) \cdot 9/5 + 32$$

In case of a numpy array, we can make use of scalar-matrix computations...

In [None]:
# numpy supports matrix and vector algebra, hence the conversion is quite simple
print('Temperature in °F: ', C_np * 9/5 + 32)

> **Task:** As a refresher on lists, please implement the same conversion without NumPy by using lists only (hint: consider list comprehensions).

In [None]:
# TODO: Implement the same conversion by using the list only.



Other operations on NumPy arrays are described later in the tutorial.

<font color='blue'>
**Summary**:
- The type of a regular Python array is `list`, whereas the type of a NumPy array is `numpy.ndarray`.
- Make sure that you keep track of the dimensions of the data structures you create with NumPy.
- Many bugs in deep learning implementations are caused by improper shapes.
- A good practice is to define vectors explicitly as two-dimensional arrays, as was done for `F`.

### Two and higher dimensional arrays (Matrices, Tensors)

Let us create a two-dimensional matrix of shape 6x6.

In [None]:
Z = np.array([[3, 0, 1, 2, 7, 4],
              [1, 5, 8, 9, 3, 1],
              [2, 7, 2, 5, 1, 3],
              [0, 1, 3, 1, 7, 8],
              [4, 2, 1, 6, 2, 8],
              [2, 4, 5, 2, 3, 9]], np.double)
print (Z)

In [None]:
print (Z.shape) # the first dimension of Z is now 6.
print (Z.shape[0]) # each dimension of a multidimensional array can be obtained separately.

You can also create NumPy arrays in higher dimensions. Let us create a 3-D NumPy array.

In [None]:
highDimArray = np.array([
    [[0.00, 0.01, 0.02],
     [0.10, 0.11, 0.12],
     [0.20, 0.21, 0.22]],

    [[1.00, 1.01, 1.02],
     [1.10, 1.11, 1.12],
     [1.20, 1.21, 1.22]],

    [[2.00, 2.01, 2.02],
     [2.10, 2.11, 2.12],
     [2.20, 2.21, 2.22]]
])

In [None]:
highDimArray.shape

Elements of a multidimensional array are accessed by their indices.

In [None]:
highDimArray[0] #using one index returns a 2-D NumPy array.

In [None]:
highDimArray[1, 0] #using two indices returns a 1-D NumPy array.

In [None]:
highDimArray[2, 2, 1] #using three indices returns a scalar value.
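
Indices can also be combined with slices. The following sketch uses a hypothetical 3 x 3 x 3 array `cube` filled with the integers 0 to 26 (not the `highDimArray` from above) to show how each extra index removes one dimension, while a slice `:` keeps it.

```python
import numpy as np

cube = np.arange(27).reshape(3, 3, 3)  # values 0..26 arranged as 3 x 3 x 3

print(cube[0].shape)         # one index: a 2-D array remains
print(cube[1, 0].shape)      # two indices: a 1-D array remains
print(cube[2, 2, 1])         # three indices: a single element
print(cube[:, 0, :].shape)   # a slice keeps its axis, an index removes it
```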

---

## Initialization of NumPy vectors and matrices

Two very convenient functions for quick initialization of NumPy arrays are:
- [np.zeros](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html), which returns a new array of the given shape and type, filled with zeros.
- [np.random.rand](https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.rand.html), which creates an array of the given shape and populates it with random samples from a uniform distribution over $[0, 1)$.

These functions are very useful when it comes to the initialization of the parameters (weights and biases) in a neural network.

#### Zero initialization
A typical usage of `np.zeros` is as follows

In [None]:
np.zeros ((2, 5, 6))

Let us say that the $L$-th layer in a neural network consists of 10 neurons. Each of them is associated with a bias. In order to initialize the `b_L` vector to zeros we could use:

In [None]:
layer_L_dims = 10
b_L = np.zeros((layer_L_dims, 1))
print (b_L)
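
To see how such a bias vector fits together with a weight matrix, here is a minimal sketch under stated assumptions: the previous layer is assumed to have 4 neurons, the weights are drawn with `np.random.rand` and scaled by an arbitrary factor of 0.01, and the names `n_prev`, `W_L`, `a_prev` and `z_L` are hypothetical.

```python
import numpy as np

np.random.seed(0)
n_prev, n_L = 4, 10                        # assumed layer sizes, for illustration only

W_L = np.random.rand(n_L, n_prev) * 0.01   # small random weights, shape (10, 4)
b_L = np.zeros((n_L, 1))                   # zero biases, shape (10, 1)

a_prev = np.random.rand(n_prev, 1)         # made-up output of the previous layer
z_L = W_L @ a_prev + b_L                   # weighted input of layer L, shape (10, 1)

print(W_L.shape, b_L.shape, z_L.shape)
```

Defining `b_L` with shape `(10, 1)` rather than `(10,)` keeps the result a column vector.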

#### Random Initialization

A typical usage of `np.random.rand` is as follows

In [None]:
np.random.rand(7, 4)

Using the same seed makes sure your "random" numbers will be the same as ours. Running the code several times always gives you the same values.

In [None]:
np.random.seed(1)
np.random.rand(4, 3)
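
The effect of the seed can be checked directly. In this sketch (the variable names are chosen for illustration), resetting the seed restarts the random sequence, whereas drawing again without reseeding continues it.

```python
import numpy as np

np.random.seed(1)
first = np.random.rand(4, 3)

np.random.seed(1)               # resetting the seed restarts the sequence
second = np.random.rand(4, 3)

third = np.random.rand(4, 3)    # no reseed: the sequence simply continues

print(np.array_equal(first, second))
print(np.array_equal(first, third))
```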

## Linear Algebra fundamentals with NumPy

Linear algebra is the branch of mathematics concerning linear equations, linear functions, and their representations through matrices and vector spaces. Linear algebra is particularly useful in deep learning as it provides efficiency in terms of calculation speed and simplicity in coding. By employing linear algebra operations, the number of explicit loops (`for`, `while` constructs) is reduced to a minimum.

> **Task**: Examine the following code snippets and explain them to your neighbor!

In [None]:
arr1 = [1, 2, 3, 4]
arr2 = [5, 6, 7, 8]

In [None]:
# Element-wise multiplication of two Python lists with a loop
product = []
for i in range(len(arr1)):
    product.append(arr1[i] * arr2[i])
product

One remark: the code above might be the most obvious solution to the problem using lists. However, it is not the most Pythonic way of implementing it. A more idiomatic approach uses a list comprehension with the built-in `zip` function. Please check the explanation and other examples for this [here](https://www.programiz.com/python-programming/methods/built-in/zip).

In [None]:
product = [x*y for x, y in zip(arr1, arr2)]
print(product)

However, multiplying two vectors element by element is still cumbersome with lists. With NumPy, the same result can be achieved by:

In [None]:
# Linear algebra version by employing NumPy
np.array(arr1) * np.array(arr2)

### Matrix Multiplication

A matrix is multiplied by a scalar by multiplying each element of the matrix by the scalar.

In [None]:
A = np.random.randint(11, size=(3, 4)) # Generate a 3 x 4 matrix of ints between 0 (inclusive) and 11 (exclusive):
print ('The original values of the matrix: ')
print(A)
print ('The values of the matrix scaled by 2: ')
print (A * 2)

Multiplying a matrix by a matrix is tricky as not every two matrices can be multiplied. If $A$ is an $n × m$ matrix and $B$ is an $m × p$ matrix, their matrix product $AB$ is an $n × p$ matrix. The $m$ entries across a row of $A$ are multiplied with the $m$ entries down a column of $B$ and summed to produce an entry of $AB$.

Matrix multiplication can be done with the `@` operator.

In [None]:
print (A)
print ('The shape of the matrix A is', A.shape)

In [None]:
B = np.random.randint(11, size=(4, 5))
print (B)
print ('The shape of the matrix B is', B.shape)

In [None]:
AB = A @ B
print (AB)
print ('The shape of the matrix AB is', AB.shape)

The attempt to obtain the product BA results in an error. Why?

In [None]:
BA = B @ A
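
If you want the notebook to keep running past such a failing cell, the error can be caught. The sketch below recreates two matrices with the same shapes as above (their random values will differ) and records whether `B @ A` raised a `ValueError`; the flag name `matmul_failed` is made up for this example.

```python
import numpy as np

A = np.random.randint(11, size=(3, 4))
B = np.random.randint(11, size=(4, 5))

print((A @ B).shape)   # inner dimensions 4 and 4 match

matmul_failed = False
try:
    B @ A              # inner dimensions 5 and 3 do not match
except ValueError as err:
    matmul_failed = True
    print("B @ A is not defined:", err)
```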

### Dot Product

One very important vector operation is the dot product. The dot product multiplies two vectors of equal length element by element and sums the results, which yields a scalar.

In [None]:
w = np.random.randn(4, 1)
w

In [None]:
data_point = np.array([[1, 1, 2, 0]]).T
data_point

In [None]:
np.dot(w.T, data_point)

In deep learning, datasets are usually represented as groups of data points. The dot product is very helpful when each data point in the dataset should be multiplied by weights that are stored in a vector $w$.

In [None]:
# an imaginary dataset of 10 data points each represented as a 4 x 1 vector.
data_set = np.random.randint(3, size=(4, 10))
data_set

In [None]:
# the result is a 1 x 10 vector whose elements are the dot products of each data point with the weight vector.
Z = np.dot(w.T, data_set)
Z
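
To make explicit what the single `np.dot` call computes, this sketch compares it with a loop over the columns of the dataset. The arrays are regenerated with an arbitrary seed, and `Z_vec` and `Z_loop` are names chosen for this comparison only.

```python
import numpy as np

np.random.seed(0)
w = np.random.randn(4, 1)
data_set = np.random.randint(3, size=(4, 10))

# vectorized version: one call handles all 10 data points
Z_vec = np.dot(w.T, data_set)                 # shape (1, 10)

# loop version: one dot product per data point (column)
Z_loop = np.zeros((1, data_set.shape[1]))
for j in range(data_set.shape[1]):
    Z_loop[0, j] = np.sum(w[:, 0] * data_set[:, j])

print(np.allclose(Z_vec, Z_loop))
```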

### Elementwise operation

There is sometimes the need for element-wise algebraic operations. Element-wise operations are binary operations that take two matrices of the same dimensions and produce another matrix where each element $(i,j)$ is the result of applying the operation to elements $(i,j)$ of the original two matrices. Some of the NumPy functions for element-wise operations are:
* [np.multiply](https://docs.scipy.org/doc/numpy/reference/generated/numpy.multiply.html), alternatively use `*` for multiplying arguments element-wise. This operation is also known as the **Hadamard product**.
* [np.divide](https://docs.scipy.org/doc/numpy/reference/generated/numpy.divide.html), alternatively use `/`  for dividing arguments element-wise.
* [np.add](https://docs.scipy.org/doc/numpy/reference/generated/numpy.add.html), alternatively use `+` for adding arguments element-wise.
* [np.subtract](https://docs.scipy.org/doc/numpy/reference/generated/numpy.subtract.html), alternatively use `-` for subtracting arguments element-wise.
* [np.square](https://docs.scipy.org/doc/numpy/reference/generated/numpy.square.html), alternatively use `** 2`, which returns the element-wise square of the input.

Furthermore, there are logical operations available:
* [np.logical_or](https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_or.html#numpy.logical_or)
* [np.logical_and](https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_and.html#numpy.logical_and)
* [np.equal](https://docs.scipy.org/doc/numpy/reference/generated/numpy.equal.html#numpy.equal) (alternatively use `==`)
* many more functions can be explored [here](https://docs.scipy.org/doc/numpy/reference/index.html)

A simple example is applying a filter (mask) to a matrix. A filter is applied in order to zero out some elements of the matrix.

In [None]:
matrix = np.random.rand(4, 5) # creating a random 4 x 5 matrix of floating point numbers
mask = np.random.randint(2, size=(4, 5)) # creating a random 4 x 5 matrix of zeroes or ones

In [None]:
masked_matrix = np.multiply(matrix, mask) # the Hadamard product zeroes out the elements corresponding to the zero positions in the mask
print (masked_matrix)

In order to square all elements of the matrix, `np.square` can be used:

In [None]:
np.square(masked_matrix)
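
The functions listed above and their operator counterparts give the same results. A short sketch with two small example matrices `P` and `Q` (hypothetical names and values) illustrates this:

```python
import numpy as np

P = np.array([[1., 2.], [3., 4.]])
Q = np.array([[10., 20.], [30., 40.]])

print(np.array_equal(np.multiply(P, Q), P * Q))   # Hadamard product
print(np.array_equal(np.divide(P, Q), P / Q))
print(np.array_equal(np.add(P, Q), P + Q))
print(np.array_equal(np.subtract(P, Q), P - Q))
print(np.array_equal(np.square(P), P ** 2))

# logical operations also work element by element
both = np.logical_and(P > 1, Q < 40)
print(both)
```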

### Broadcasting
A very important concept to understand in numpy is "broadcasting". It is very useful for performing mathematical operations between arrays of different shapes. For the full details on broadcasting, you can read the official [broadcasting documentation](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).

Here are two examples to illustrate broadcasting:

In [None]:
A = np.array([ [11, 12, 13], [21, 22, 23], [31, 32, 33] ])
B = np.array([1, 2, 3])

print(A * B)

print(A + B)

In [None]:
A = np.array([10, 20, 30])
B = np.array([1, 2, 3])
A[:, None] * B
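
One way to inspect what happens to the smaller array is `np.broadcast_to`, which returns a read-only view of an array stretched to a target shape. The sketch below reuses the values from the first example; `B_stretched` and `col` are names introduced here for illustration.

```python
import numpy as np

A = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33]])  # shape (3, 3)
B = np.array([1, 2, 3])                                   # shape (3,)

# view of B repeated along the rows to match A's shape (no data is copied)
B_stretched = np.broadcast_to(B, A.shape)
print(B_stretched)
print(np.array_equal(A * B, A * B_stretched))

# a (3, 1) column combined with a (3,) vector yields a (3, 3) result
col = np.array([10, 20, 30])[:, None]
print((col * B).shape)
```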

> **Task:** Describe below what broadcasting does.

### Example: Normalizing a Dataset

Element-wise operations are useful for normalizing input data. Normalizing data is one of the techniques to speed up the learning process when training a neural network. It consists of two steps:
* Subtract out the mean $\mu$
* Normalize the variance $\sigma^2$

Let us have a look at an imaginary training set $X$ with two input features $\begin{bmatrix} x_1\\x_2\end{bmatrix}$.

In [None]:
m = 40 # number of training examples
np.random.seed(4)
x1 = np.random.uniform(low=1.0, high=5.0, size=m) # values of feature x1 are between 1 and 5
x2 = np.random.uniform(low=2.0, high=3.0, size=m) # values of feature x2 are between 2 and 3
X = np.array([x1, x2])

In [None]:
print (X)

In [None]:
# A helper function for plotting the dataset as a scatter plot
def plotting_helper(X, scale):
    plt.ylim((-scale, scale)) # setting the range of the plotted y-axis
    plt.xlim((-scale, scale)) # setting the range of the plotted x-axis
    plt.axhline(0, color='gray', linewidth=1, linestyle='dotted') # plot the x-axis
    plt.axvline(0, color='gray', linewidth=1, linestyle='dotted') # plot the y-axis
    return plt.scatter(X[0], X[1])

#### Graphical representation of the data in X

In [None]:
plotting_helper(X, 5.5)

The mean of a feature $x$, where $x^{(i)}$ denotes its $i$-th sample, can be calculated as: $$\mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}$$
and then subtracted from each element of the corresponding feature:
$$x^{(i)} := x^{(i)} - \mu$$

In [None]:
mu_x1 = np.sum(x1) / m # calculate the mean of the x1 feature
mu_x2 = np.sum(x2) / m # calculate the mean of the x2 feature


np.testing.assert_allclose(mu_x1, np.mean(x1), rtol=1e-5, atol=0)
np.testing.assert_allclose(mu_x2, np.mean(x2), rtol=1e-5, atol=0)

x1_new = x1 - mu_x1
x2_new = x2 - mu_x2
X_new = np.array([x1_new, x2_new])

#### Graphical representation of the data in X after subtracting the mean

In [None]:
plotting_helper(X_new, 5.5)

Our training set has now (almost) zero-mean over both features.

In [None]:
print(X_new.mean(axis=1))

The next step is to normalize the variances. It is obvious that the feature $x_1$ has a much larger variance than the feature $x_2$.

For our zero-mean distribution, the variance of a feature $x$, where $x^{(i)}$ denotes its $i$-th sample, can be calculated as: $$\sigma^{2} = \frac{1}{m} \sum_{i=1}^{m} \left(x^{(i)}\right)^{2}$$
and then each element of the feature is divided by the corresponding standard deviation $\sigma$:
$$x^{(i)} := \frac{x^{(i)}}{\sigma}$$

In [None]:
sigma_x1 = np.sum(np.square(x1_new)) / m # variance of the zero-mean x1 feature (sigma squared, despite the variable name)
sigma_x2 = np.sum(np.square(x2_new)) / m # variance of the zero-mean x2 feature (sigma squared, despite the variable name)

np.testing.assert_allclose(sigma_x1, np.var(x1_new), rtol=1e-5, atol=0)
np.testing.assert_allclose(sigma_x2, np.var(x2_new), rtol=1e-5, atol=0)

x1_norm = x1_new / np.sqrt(sigma_x1)
x2_norm = x2_new / np.sqrt(sigma_x2)
X_norm = np.array([x1_norm, x2_norm])

In [None]:
plotting_helper(X_norm, 5.5)

The variances of both $x_1$ and $x_2$ are now equal to 1.

In [None]:
print(X_norm.std(axis=1))

Another way to perform the normalization is to apply element-wise operations to the whole dataset at once:

In [None]:
X_squared = np.square(X_new) # element-wise squaring of all elements of the dataset
sum_X = np.sum(X_squared, axis=1) # summing all elements per row
sigmas_X = np.divide(sum_X , m) # dividing the sums by m
sigmas_X = np.expand_dims(sigmas_X, axis=1) # reshaping the result with an additional axis - from (2,) to (2, 1)

X_norm_easy = np.divide(X_new, np.sqrt(sigmas_X)) # perform the element-wise division

In [None]:
plotting_helper(X_norm_easy, 5.5)

The variances of both $x_1$ and $x_2$ are again equal to 1.

In [None]:
print(X_norm_easy.std(axis=1))
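
The same normalization can be written more compactly with the `keepdims=True` argument of `mean` and `std`, which keeps the reduced axis so that no `np.expand_dims` is needed. This is a sketch of an alternative formulation rather than the tutorial's own method; the dataset is regenerated in the same way as `X` above, and `mu`, `sigma` and `X_norm_compact` are names chosen for this example.

```python
import numpy as np

np.random.seed(4)
m = 40
X = np.array([np.random.uniform(low=1.0, high=5.0, size=m),
              np.random.uniform(low=2.0, high=3.0, size=m)])  # shape (2, m)

mu = X.mean(axis=1, keepdims=True)      # per-feature mean, shape (2, 1)
sigma = X.std(axis=1, keepdims=True)    # per-feature standard deviation, shape (2, 1)

X_norm_compact = (X - mu) / sigma       # broadcasting over the m columns

print(X_norm_compact.mean(axis=1))
print(X_norm_compact.std(axis=1))
```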

Machine learning tools such as `sklearn` provide normalizing functions that let you apply various normalizations.

In [None]:
from sklearn.preprocessing import scale

In [None]:
normed_matrix = scale(X, axis=1)
plotting_helper(normed_matrix, 5.5)
print(normed_matrix.std(axis=1)) # the standard deviation of each normalized feature should be 1.

<font color='blue'>
**Important notice**:
- Keep track of the dimensions of the data structures in NumPy when calculating matrix and dot products.
- In order to perform element-wise operations, the shapes of the NumPy arrays must be identical or compatible under broadcasting.

# More Information
* https://www.python-kurs.eu
* https://matplotlib.org/users/pyplot_tutorial.html

# Media
http://neuralnetworksanddeeplearning.com/images/tikz11.png
