# numpy for Deep Learning


`numpy` is a Python library for efficient processing of **multidimensional arrays**. It provides us with:
- typical operations from Linear Algebra like vector/matrix addition, multiplication and inversion
- advanced indexing
- array reshaping
- broadcasting

To use numpy, we first have to import it:

In [120]:
import numpy as np

The following constructs a numpy array `x` from a Python list `x_list`:

In [121]:
x_list = [1, 2, 1]
x = np.array(x_list)

Let us do the same with a second list:

In [122]:
y_list = [0, 4, 1]
y = np.array(y_list)

Now we can add the vectors `x` and `y` together:

In [123]:
x + y

array([1, 6, 2])

Compare this to the result of "adding" the original Python lists `x_list` and `y_list` together:

In [124]:
x_list + y_list

[1, 2, 1, 0, 4, 1]

The `+` operator concatenates lists, while for numpy arrays it performs elementwise (vector/matrix/...) addition.
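
The same contrast shows up with `*`. The short sketch below reuses `x_list` from above; the name `summed` is only illustrative. It also shows that elementwise addition of plain lists needs an explicit loop.

```python
import numpy as np

x_list = [1, 2, 1]
y_list = [0, 4, 1]
x = np.array(x_list)

# For lists, * repeats the list; for arrays, it scales every element.
print(x_list * 2)  # [1, 2, 1, 1, 2, 1]
print(x * 2)       # [2 4 2]

# Elementwise addition of plain lists requires an explicit loop.
summed = [a + b for a, b in zip(x_list, y_list)]
print(summed)      # [1, 6, 2]
```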

With numpy, elementwise multiplication and the dot product are similarly easy to perform:

In [125]:
z = x * y
z

array([0, 8, 1])

In [126]:
x.dot(y)

np.int64(9)
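
The two operations are closely related: the dot product is the sum of the elementwise products. A small check, using the same vectors as above (the names `elementwise` and `dot_from_sum` are illustrative):

```python
import numpy as np

x = np.array([1, 2, 1])
y = np.array([0, 4, 1])

elementwise = x * y               # array([0, 8, 1])
dot_from_sum = elementwise.sum()  # 0 + 8 + 1

# x.dot(y), np.dot(x, y) and x @ y all compute the same scalar.
assert dot_from_sum == x.dot(y) == np.dot(x, y) == (x @ y)
print(dot_from_sum)  # 9
```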

Let us now turn our attention to 2-dimensional arrays, i.e. matrices:

In [127]:
X = np.array([[3, 4, 2],
              [2, 1, 3]])
Y = np.array([[4, 3],
              [1, 2],
              [3, 2]])

We can retrieve the shapes of these arrays with:

In [128]:
X.shape

(2, 3)

In [129]:
Y.shape

(3, 2)

Now let us compute the matrix product of `X` and `Y`:

In [130]:
P = np.matmul(X, Y)
P

array([[22, 21],
       [18, 14]])

In [131]:
P.shape

(2, 2)

The output matrix has the expected shape: a 2x3 matrix times a 3x2 matrix yields a 2x2 matrix.
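
As a brief sketch of the shape rule, the example below uses the `@` operator, which is shorthand for `np.matmul`, and shows what happens when the inner dimensions do not agree. The name `Q` is illustrative.

```python
import numpy as np

X = np.array([[3, 4, 2],
              [2, 1, 3]])   # shape (2, 3)
Y = np.array([[4, 3],
              [1, 2],
              [3, 2]])      # shape (3, 2)

# (2, 3) @ (3, 2) -> (2, 2)
P = X @ Y
assert np.array_equal(P, np.matmul(X, Y))
print(P.shape)  # (2, 2)

# Swapping the operands gives (3, 2) @ (2, 3) -> (3, 3).
Q = Y @ X
print(Q.shape)  # (3, 3)

# The inner dimensions must agree: (2, 3) @ (2, 3) is not defined.
try:
    X @ X
except ValueError as err:
    print("shape mismatch:", err)
```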

## Advanced indexing

numpy offers indexing features beyond those provided by Python.

Let us first review the slice syntax `start:end` available in plain Python (no numpy):

In [132]:
my_list = [4, 7, 3, 2, 9, 4, 5]

Selecting the elements with indices `2, 3, 4` is possible with the following slice (the `end` index is exclusive):

In [133]:
my_list[2:5]

[3, 2, 9]

It is also possible to omit either `start` or `end`.

Omitting `end` selects the elements from the specified `start` to the end of the list:

In [134]:
my_list[2:]

[3, 2, 9, 4, 5]

Omitting `start` selects the elements from the start of the list up to (but not including) the specified `end`:

In [135]:
my_list[:5]

[4, 7, 3, 2, 9]

If we omit both `start` and `end`, all elements are selected:

In [136]:
my_list[:]

[4, 7, 3, 2, 9, 4, 5]

For 1-dimensional arrays (or lists) this does not appear very useful, but it comes in handy for multidimensional arrays.

Let us take a look at our previously defined matrix `X`:

In [137]:
X

array([[3, 4, 2],
       [2, 1, 3]])

`X[i, j]` selects the element in row `i` and column `j` (both counted from zero) of the numpy array `X`. With nested Python lists the same element can be selected with `X_list[i][j]`.

In [138]:
X[1, 0]

np.int64(2)

The "standard" Python way of indexing is also possible with numpy arrays:

In [139]:
X[1][0]

np.int64(2)

However, indexing along several axes at once (for example, selecting whole columns) is only possible with the `[row_indices, column_indices]` syntax.

In [140]:
X

array([[3, 4, 2],
       [2, 1, 3]])

We can select entire rows of our matrix `X` via:

In [141]:
X[0, :]

array([3, 4, 2])

In [142]:
X[1, :]

array([2, 1, 3])

In [143]:
X

array([[3, 4, 2],
       [2, 1, 3]])

We can select entire columns of our matrix `X` via: 

In [144]:
X[:, 0]

array([3, 2])

In [145]:
X[:, 1]

array([4, 1])

In [146]:
X[:, 2]

array([2, 3])
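
To see why the comma syntax matters for columns, compare `X[:, 0]` with the chained form `X[:][0]`. This is a small sketch using the matrix `X` from above:

```python
import numpy as np

X = np.array([[3, 4, 2],
              [2, 1, 3]])

# One indexing operation over both axes: the first column.
print(X[:, 0])   # [3 2]

# Two chained operations: X[:] is all of X, then [0] picks the first row.
print(X[:][0])   # [3 4 2]
```

Chained indexing always applies each bracket to the first remaining axis, so it cannot reach a column directly.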

We can select the first and the last column of `X` with:


In [147]:
X[:, [0, 2]]

array([[3, 2],
       [2, 3]])
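
The index list also controls the order and repetition of the selected columns. The sketch below uses the illustrative names `reordered` and `repeated`, and additionally shows that indexing with a list returns a copy rather than a view of `X`.

```python
import numpy as np

X = np.array([[3, 4, 2],
              [2, 1, 3]])

# The index list controls both order and repetition of the columns.
reordered = X[:, [2, 0]]   # last column first
repeated = X[:, [1, 1]]    # middle column twice
print(reordered)
# [[2 3]
#  [3 2]]
print(repeated)
# [[4 4]
#  [1 1]]

# Indexing with a list returns a copy, so writing to it leaves X unchanged.
reordered[0, 0] = 99
print(X[0, 2])  # 2
```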

Let us generate a 5x5 matrix with random entries for further examples:

In [148]:
R = np.random.randn(5, 5)
R

array([[ 0.83327704, -0.16005179, -0.36754527, -0.50691117, -0.08906481],
       [ 0.80727961,  0.74696881,  0.7152345 , -0.24888953,  0.25428815],
       [-0.15731702, -1.31404539, -0.66165558, -0.16742421, -0.65929808],
       [ 0.0214782 , -0.61365763, -0.16855952, -0.67666063,  0.79854266],
       [-1.26456369,  1.31160937, -1.97011486,  1.01072571, -1.51692587]])

The following selects the 3x3 submatrix of `R` formed by rows 1 to 3 and columns 2 to 4:

In [149]:
R[1:4, 2:5]

array([[ 0.7152345 , -0.24888953,  0.25428815],
       [-0.66165558, -0.16742421, -0.65929808],
       [-0.16855952, -0.67666063,  0.79854266]])
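
Because random entries are hard to check by eye, here is the same slice applied to a matrix with predictable entries. This is only an illustrative sketch: `R_demo` and `sub` are hypothetical names, and `np.arange(25).reshape(5, 5)` fills the matrix row by row with 0 to 24.

```python
import numpy as np

R_demo = np.arange(25).reshape(5, 5)  # rows: 0..4, 5..9, 10..14, ...

# Rows 1, 2, 3 and columns 2, 3, 4 (the end index is exclusive).
sub = R_demo[1:4, 2:5]
print(sub)
# [[ 7  8  9]
#  [12 13 14]
#  [17 18 19]]
print(sub.shape)  # (3, 3)
```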

## Reshaping

numpy makes it easy to reinterpret vectors as matrices or vice versa. For example, we can reinterpret a vector of size 6 as a 2x3 or a 3x2 matrix.

In [150]:
z = np.array([8, 5, 7, 2, 4, 1])

In [151]:
z.reshape(2, 3)

array([[8, 5, 7],
       [2, 4, 1]])

In [152]:
z.reshape(3, 2)

array([[8, 5],
       [7, 2],
       [4, 1]])

When we put `-1` for one of the desired sizes, numpy automatically infers it from the total number of elements and the other arguments.

In [153]:
z.reshape(2, -1)

array([[8, 5, 7],
       [2, 4, 1]])
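
A few more uses of `-1`, sketched with the same vector `z` (the name `M` is illustrative): it can stand in for any single axis, it flattens an array when used alone, and it fails when the sizes cannot match.

```python
import numpy as np

z = np.array([8, 5, 7, 2, 4, 1])

# -1 may stand in for any one axis.
print(z.reshape(-1, 2).shape)  # (3, 2)

# A lone -1 flattens a matrix back into a vector.
M = z.reshape(2, 3)
print(M.reshape(-1))           # [8 5 7 2 4 1]

# 6 elements cannot be split into 4 equal rows.
try:
    z.reshape(4, -1)
except ValueError as err:
    print("cannot reshape:", err)
```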

## Broadcasting

Let us say we want to add a vector `v` to each row or column of a matrix `A`.

In [154]:
A = np.array([[4, 5],
              [6, 7]])
v = np.array([10, 20])
A + v

array([[14, 25],
       [16, 27]])

We see that `v` gets interpreted as a row vector; this row vector is then added to each row of `A`.

However, we might also want `v` to be interpreted as a column vector and added to each column. We can use `reshape` to reinterpret `v` as a column vector.

In [155]:
v.reshape(-1, 1)

array([[10],
       [20]])

In [156]:
A + v.reshape(-1, 1)

array([[14, 15],
       [26, 27]])

The result of this operation depends on the shape of `v`:
- if `v` has shape `(2,)` or `(1, 2)` (a row vector), then `v` is added to each row
- if `v` has shape `(2, 1)` (a column vector), then `v` is added to each column

Many of numpy's operations support inputs of "compatible" shapes. Inputs are then "broadcast" based on their shapes. Knowing the shapes of input arrays is thus crucial for understanding which operation is performed.
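
One way to inspect broadcasting without doing any arithmetic is sketched below. It assumes a NumPy version that provides `np.broadcast_shapes` (1.20 or later); `np.broadcast_to` makes the implicit expansion visible.

```python
import numpy as np

A = np.array([[4, 5],
              [6, 7]])
v = np.array([10, 20])

# Resulting shape when broadcasting (2, 2) against (2,) or (2, 1).
print(np.broadcast_shapes(A.shape, v.shape))  # (2, 2)
print(np.broadcast_shapes((2, 2), (2, 1)))    # (2, 2)

# Shapes (2, 2) and (3,) are not compatible.
try:
    np.broadcast_shapes((2, 2), (3,))
except ValueError as err:
    print("not broadcastable:", err)

# How v is (conceptually) expanded as a row and as a column vector.
print(np.broadcast_to(v, A.shape))
# [[10 20]
#  [10 20]]
print(np.broadcast_to(v.reshape(-1, 1), A.shape))
# [[10 10]
#  [20 20]]
```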

Learn more about broadcasting: https://numpy.org/doc/stable/user/basics.broadcasting.html

## Batch matrix multiplication

Suppose we are given `n` matrices stacked in an array `A` and `n` matrices stacked in an array `B`. Our goal is to multiply matrix `A[0]` with matrix `B[0]`, matrix `A[1]` with matrix `B[1]`, matrix `A[2]` with matrix `B[2]`, and so forth:

In [157]:
n = 128
A = np.random.randn(n, 3, 4)
B = np.random.randn(n, 4, 7)

In [158]:
C = np.matmul(A, B)
C.shape

(128, 3, 7)

As expected, 128 matrices, each of size 3x7, are returned.

Now suppose we want to multiply all of the matrices in `A` with the same matrix `B_single`.

In [159]:
B_single = np.random.randn(4, 7)
# Alternatively: np.matmul(A, B_single)
C = np.matmul(A, B_single.reshape(1, 4, 7))
C.shape

(128, 3, 7)

What's happening here?

The input dimensions are
- `A`: 128x3x4
- `B_single.reshape(1, 4, 7)`: 1x4x7

Thanks to broadcasting, the first axis of `B_single.reshape(1, 4, 7)` (size 1) is extended to match the first axis of `A` (size 128). Conceptually, `np.matmul` treats it as a 128x4x7 array in which the same 4x7 matrix is repeated 128 times.
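
To confirm that batched `np.matmul` really multiplies the matrices pairwise, one can compare it against an explicit loop. The following is a minimal sketch on a small batch. The batch size `n = 5`, the seed, and the names `C_loop`, `C_single` and `C_single_loop` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily
n = 5
A = rng.standard_normal((n, 3, 4))
B = rng.standard_normal((n, 4, 7))

# Batched product versus multiplying each pair in a loop.
C = np.matmul(A, B)
C_loop = np.stack([A[i] @ B[i] for i in range(n)])
assert np.allclose(C, C_loop)

# A single (4, 7) matrix is broadcast across the whole batch.
B_single = rng.standard_normal((4, 7))
C_single = np.matmul(A, B_single)
C_single_loop = np.stack([A[i] @ B_single for i in range(n)])
assert np.allclose(C_single, C_single_loop)

print(C.shape, C_single.shape)  # (5, 3, 7) (5, 3, 7)
```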


In the previous example we used `reshape` to insert a new axis of size 1 into our array. `np.newaxis` offers an alternative way to do this:

In [160]:
B_single_reshaped = B_single[np.newaxis, :, :]
B_single_reshaped.shape

(1, 4, 7)

In [161]:
B_single_reshaped = B_single[:, np.newaxis, :]
B_single_reshaped.shape

(4, 1, 7)

In [162]:
B_single_reshaped = B_single[:, :, np.newaxis]
B_single_reshaped.shape

(4, 7, 1)

This is a very useful way to reshape inputs to control broadcasting behaviour.
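
As a small sketch of this idea, the row/column example from the broadcasting section can be written with `np.newaxis` instead of `reshape`. The names `col` and `row` are illustrative.

```python
import numpy as np

A = np.array([[4, 5],
              [6, 7]])
v = np.array([10, 20])

# v[:, np.newaxis] has shape (2, 1), the same as v.reshape(-1, 1).
col = v[:, np.newaxis]
assert col.shape == (2, 1)
assert np.array_equal(col, v.reshape(-1, 1))
print(A + col)   # v added to each column
# [[14 15]
#  [26 27]]

# v[np.newaxis, :] has shape (1, 2) and is added to each row.
row = v[np.newaxis, :]
print(A + row)
# [[14 25]
#  [16 27]]
```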