# Course 3: The NumPy package

## Motivation

NumPy (Numerical Python) is the core package for scientific computing in Python. It supports vectors and multidimensional arrays (tensors), and provides functions for random number generation, linear algebra, signal processing, basic statistics, etc. NumPy is a standard on which many other packages build. The data being manipulated must fit in RAM. NumPy relies on compiled C code.

In many situations data can be converted to numbers:
* a greyscale image can be seen as a 2D matrix; every number is the intensity of the corresponding pixel (0 - black, 255 - white)
![MNIST image scaled to 0-1](./images/mnist_image.png)
* a colour image can be seen as a 3D matrix if represented as RGB: each of the "parallel" matrices represents one colour channel:
![RGB image decomposed into 3 channels](./images/channelsrgb.gif)
* an audio file can be seen as one or more vectors, where the number of vectors is the number of recording channels. For a wav file, the values represent the membrane's displacement, sampled in time:
![sound](./images/sound.png)
* a text can be translated into numerical vectors by using techniques such as [Bag of words](https://en.wikipedia.org/wiki/Bag-of-words_model), [Word2vec](https://en.wikipedia.org/wiki/Word2vec), or any other type of embedding.

The vector and matrix representation used by NumPy is much more efficient than Python lists. NumPy code actually runs in native libraries, not in interpreted code. If you vectorize your code, the efficiency is even greater on modern CPUs.

The basic data type from NumPy is `ndarray` - n-dimensional array. 

## Creating arrays

In [None]:
# importing numpy, traditionally aliased as np
import numpy as np

# print numpy version, for further reproducibility
...

In [None]:
# create a vector starting from a Python list
x = np.array([1, 4, 2, 5, 3])

# printing x's type
print(...) 

# all the elements in the array are of the same type
print(...) 

# you can explicitly specify the underlying type in the array
y = np.array([1, 2, 3], dtype=np.float16)
print(y.dtype)

In [None]:
# useful functions
all_zeros = ...
print(all_zeros)
# printing number of elements on each dimension (axes)
print(all_zeros.shape)

In [None]:
# 2d matrix
mat = np.array(...)  # note we start from lists of lists
print(mat)
print(mat.shape)
print(mat[0, 1])

In [None]:
# matrix with same value:
mat_7 = ....
print(mat_7)

mat_7_2 = ...

print(mat_7 == mat_7_2)
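
To illustrate the kind of constructors hinted at in the cells above, here is a minimal sketch that builds constant-filled arrays with `np.zeros`, `np.ones` and `np.full`. The shapes and the names `zeros_2x3`, `sevens_a` and `sevens_b` are chosen only for this illustration and are not necessarily what the exercise cells expect.

```python
import numpy as np

# a 2x3 array filled with zeros (float64 by default)
zeros_2x3 = np.zeros((2, 3))

# two ways of building a 2x2 array in which every element is 7
sevens_a = np.full((2, 2), 7)
sevens_b = 7 * np.ones((2, 2), dtype=int)

print(zeros_2x3.shape)
print(sevens_a == sevens_b)  # element-wise comparison gives a boolean array
```

Note that `==` compares element by element; `np.array_equal` can be used when a single `True`/`False` answer is wanted.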

The number of dimensions of a matrix is given by:

In [None]:
print('Number of dimensions of all_zeros:', all_zeros.ndim)
print('Number of dimensions of mat:', mat.ndim)

The total number of elements and the size in bytes of one element are found as:

In [None]:
print('mat size: {0}\nmat element size: {1} bytes\nmat.dtype:{2}'.format(mat.size, mat.itemsize, mat.dtype))
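
As a small sanity check of how these attributes relate, the sketch below uses a hypothetical 2x3 `int32` array named `sample`: the total memory taken by the data (`nbytes`) should equal `size * itemsize`.

```python
import numpy as np

# hypothetical example array, 2 rows x 3 columns, 4 bytes per element
sample = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

print(sample.ndim)      # number of dimensions
print(sample.size)      # total number of elements
print(sample.itemsize)  # bytes per element
print(sample.nbytes)    # bytes used by all the elements
```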

In [None]:
# useful function
# identity matrix
print(...)

In [None]:
# equally spaced values in an interval; the interval bounds are part of the vector
print(...)

In [None]:
# arange works like Python's range function, but it returns an ndarray
some_values = np.arange(0, 10, 3)
print(some_values)
print(type(some_values))
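
The following sketch puts the three constructors side by side: `np.eye` for an identity matrix, `np.linspace` for equally spaced values that include both interval bounds, and `np.arange` for a start/stop/step sequence that excludes the stop value. The sizes and the names `identity`, `points` and `stepped` are assumptions made for the illustration.

```python
import numpy as np

identity = np.eye(3)            # 3x3 identity matrix
points = np.linspace(0, 1, 5)   # 5 values from 0 to 1, both ends included
stepped = np.arange(0, 10, 3)   # 0, 3, 6, 9 (10 itself is excluded)

print(identity)
print(points)
print(stepped)
```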

In [None]:
# random numbers
x = np.random.random(...)
print(x)

The following data types can be used for NumPy arrays:

| Type  | Explanation |
| ---- | -----------|
| bool_ | 	Boolean (True or False) stored as a byte | 
| int_ | 	Default integer type (same as C long; normally either int64 or int32) | 
| intc | 	Identical to C int (normally int32 or int64) | 
| intp | 	Integer used for indexing (same as C ssize_t; normally either int32 or int64) | 
| int8 | 	Byte (-128 to 127) | 
| int16 | 	Integer (-32768 to 32767) | 
| int32 | 	Integer (-2147483648 to 2147483647) | 
| int64 | 	Integer (-9223372036854775808 to 9223372036854775807) | 
| uint8 | 	Unsigned integer (0 to 255) | 
| uint16 | 	Unsigned integer (0 to 65535) | 
| uint32 | 	Unsigned integer (0 to 4294967295) | 
| uint64 | 	Unsigned integer (0 to 18446744073709551615) | 
| float_ | 	Shorthand for float64. | 
| float16 | 	Half precision float: sign bit, 5 bits exponent, 10 bits mantissa | 
| float32 | 	Single precision float: sign bit, 8 bits exponent, 23 bits mantissa | 
| float64 | 	Double precision float: sign bit, 11 bits exponent, 52 bits mantissa | 
| complex_ | 	Shorthand for complex128. | 
| complex64 | 	Complex number, represented by two 32-bit floats (real and imaginary components) | 
| complex128 | 	Complex number, represented by two 64-bit floats (real and imaginary components) |
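
The ranges and bit layouts in the table can be queried at run time with `np.iinfo` (integer types) and `np.finfo` (floating-point types). The sketch below also shows, on a hypothetical `uint8` array named `small`, that arithmetic on fixed-width integer arrays wraps around on overflow instead of raising an error.

```python
import numpy as np

print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)
print(np.iinfo(np.uint16).max)

half = np.finfo(np.float16)
print(half.bits, half.nexp, half.nmant)  # total bits, exponent bits, mantissa bits

# 250 + 10 does not fit in 8 bits, so the result wraps around modulo 256
small = np.array([250], dtype=np.uint8)
wrapped = small + np.array([10], dtype=np.uint8)
print(wrapped)
```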

A popular operation is taking an array and changing its shape:

In [None]:
# from a vector to a matrix
vec = np.arange(10)
mat = vec.reshape(...)
print(vec)
print(mat)

In [None]:
#...and vice versa:
vec2 = ...
print(vec2)
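
A minimal sketch of reshaping in both directions follows. The target shape `(2, 5)` and the names `mat_auto` and `flat` are assumptions made for the illustration; any shape whose dimensions multiply to the number of elements would work.

```python
import numpy as np

vec = np.arange(10)

# vector -> matrix: the product of the new dimensions must equal vec.size
mat = vec.reshape(2, 5)

# -1 asks NumPy to infer that dimension from the others
mat_auto = vec.reshape(5, -1)

# matrix -> vector
flat = mat.reshape(-1)  # same result as mat.ravel() here

print(mat.shape, mat_auto.shape, flat.shape)
```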

Matrices can be concatenated; the concatenation dimension (axis) should be specified:

In [None]:
a = np.array([[1, 2], [3, 4]], float)
b = np.array([[5, 6], [7,8]], float)

In [None]:
# vertical concatenation (stacking)
vertical = np.concatenate(...)
print(vertical)

The axis concept is defined for matrices with at least two dimensions. For a 2D matrix, the axis 0 sweeps the matrix vertically, while axis 1 sweeps it horizontally.

In [None]:
# horizontal concatenation
horizontal = np.concatenate(...)
print(horizontal)

In [None]:
# same as:
vertical = np...
horizontal = np...
print(vertical)
print(horizontal)
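
As an illustration of the two concatenation directions, the sketch below joins two 2x2 matrices with `np.concatenate` and checks that `np.vstack` and `np.hstack` give the same results. The names `vertical_stack` and `horizontal_stack` are chosen for this example only.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]], float)
b = np.array([[5, 6], [7, 8]], float)

# axis=0: rows of b are placed under the rows of a -> shape (4, 2)
vertical_stack = np.concatenate((a, b), axis=0)

# axis=1: columns of b are placed to the right of a -> shape (2, 4)
horizontal_stack = np.concatenate((a, b), axis=1)

print(vertical_stack)
print(horizontal_stack)
print(np.array_equal(vertical_stack, np.vstack((a, b))))
print(np.array_equal(horizontal_stack, np.hstack((a, b))))
```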

In [None]:
matrix = np.arange(15).reshape(3, 5)
print(matrix)

In [None]:
sum_by_columns= ...
print(sum_by_columns)

In [None]:
sum_by_rows= ...
print(sum_by_rows)
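
One way to see what the `axis` argument does is to sum the same 3x5 matrix along each axis. In this sketch the names `column_totals` and `row_totals` are illustrative: summing along axis 0 collapses the rows and leaves one total per column, while summing along axis 1 leaves one total per row.

```python
import numpy as np

matrix = np.arange(15).reshape(3, 5)

column_totals = matrix.sum(axis=0)  # one value per column, shape (5,)
row_totals = matrix.sum(axis=1)     # one value per row, shape (3,)

print(column_totals)
print(row_totals)
```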

## Operating with ndarrays

As expected, the basic operations from linear algebra are already implemented.

In [None]:
# multiplying by a scalar
a = np.array([[1, 2, 3], [4, 5, 6]])
print('a=\n', a)
b = ...
print('b=\n', b)

In [None]:
# add and subtract two matrices
sum_mat = a + b
print(sum_mat)
diff_mat = a - b
print(diff_mat)

The multiplication operator \* is implemented differently from linear algebra: for two matrices with the same shape, the elements at the same coordinates are multiplied: c[i, j] = a[i, j] * b[i, j]. This is called pointwise multiplication, or Hadamard product, and it is often used in signal processing and machine learning.

In [None]:
# multiplication by * leads to Hadamard product: c[i, j] = a[i, j] * b[i, j]
c = a * b
print(c)
for i in range(c.shape[0]):  # c.shape[0] = number of rows of matrix c
    for j in range(c.shape[1]): # c.shape[1] = number of columns of matrix c
        print(c[i, j] == a[i, j] * b[i, j])

The above operations run in compiled, optimized code that takes advantage of current microprocessors. It is recommended to use them instead of (nested) `for` loops:

In [None]:
# create matrices
matrix_shape = (100, 100)
a_big = np.random.random(matrix_shape)
b_big = np.random.random(matrix_shape)

In [None]:
%%timeit
c_big = np.empty_like(a_big)
for i in range(c_big.shape[0]):
    for j in range(c_big.shape[1]):
        c_big[i, j] = a_big[i, j] * b_big[i, j]

In [None]:
%%timeit
c_big = a_big * b_big

In [None]:
# 'raising to power' using ** : each element of the matrix is individually raised to that power.
print('initial matrix:\n', a)
a_to_the_power_of_2 = ...
print('after squaring each component:\n', a_to_the_power_of_2)
power_3 = ...
print('after raising to the power of 3:\n', power_3)

You can use the / operator to get pointwise division (element by element) of two matrices:

In [None]:
print('a=', a)
print('b=', b)
print('a/b=', a/b)

In [None]:
# raising a square matrix to a power, as defined in linear algebra
square_matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
pow_3 = np.linalg...
print(pow_3)
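
To make the difference between the two notions of "power" concrete, the sketch below compares `**` with `np.linalg.matrix_power` on a small hypothetical 2x2 matrix `m`.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

elementwise_square = m ** 2                   # squares each entry separately
matrix_square = np.linalg.matrix_power(m, 2)  # linear-algebra power, i.e. m @ m

print(elementwise_square)
print(matrix_square)
```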

If you call a NumPy numerical function on an ndarray, the result will be an ndarray of the same shape as the original one. Its elements are computed by applying the function to each of the components of the initial matrix, one by one:

In [None]:
x = np.arange(6).reshape(2, 3)
print(x)
y = np.exp(x)
print(f'e^x=\n{y}')
assert x.shape == y.shape
for i in range(0, x.shape[0]):
    for j in range(0, x.shape[1]):
        assert ...

In [None]:
# linear-algebra style matrix product
a = np.random.rand(3, 5)
b = np.random.rand(5, 10)
assert a.shape[1] == b.shape[0]
c = ...

# equivalent form
c = ...
assert a.shape[0] == c.shape[0] and b.shape[1] == c.shape[1]

# or even shorter:
c = ...
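
NumPy offers several spellings for the linear-algebra matrix product. The sketch below uses small deterministic matrices (an assumption made so the result is easy to verify by hand) and shows that `np.dot`, `np.matmul` and the `@` operator agree for 2D inputs.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)    # shape (2, 3)
b = np.arange(12).reshape(3, 4)   # shape (3, 4)

c_dot = np.dot(a, b)
c_matmul = np.matmul(a, b)
c_at = a @ b

# the result has as many rows as a and as many columns as b
print(c_at.shape)
print(c_at)
```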

NumPy defines a plethora of functions: `all, any, apply_along_axis, argmax, argmin, argsort, average, bincount, ceil, clip, conj, corrcoef, cov, cross, cumprod, cumsum, diff, dot, floor, inner, inv, lexsort, max, maximum, mean, median, min, minimum, nonzero, outer, prod, re, round, sort, std, sum, trace, transpose, var, vdot, vectorize, where` - [docs here](https://docs.scipy.org/doc/numpy-dev/reference/generated/).

## Indexing 

So far, we used individual indices to refer to particular elements in a NumPy array:
```python
vector[index]
# or
matrix[i, j]
```

In [None]:
vector = np.arange(10)
print(vector)
print('vector[4]={0}'.format(vector[4]))

In [None]:
matrix = np.arange(12).reshape(3, 4)
print(matrix)
print(matrix[2, 1])

For a matrix one can use indices as:
```python
m[i][j] 
```
but this is inefficient compared to `m[i, j]`: in the former case a temporary array for row `i` is created first, and only then is the element at index `j` retrieved from it.

By using `slicing`, one can refer to a whole subset of elements. For example, one gets for vectors:

In [None]:
vector = 10 * np.arange(10)
print(vector)
print(vector[...]) # note that the rightmost boundary is not used for selection; 
# i.e. the elements are retrieved up to index 6-1

In [None]:
indices = [1, 3, 2, 7]
print(vector)
print(...)

In [None]:
# the same index can be repeated:
indices = [1, 3, 2, 3, 3, 3, 3]
print(vector)
print(vector[indices])

In [None]:
# or we can use a sequence of indices, with initial/step/final value given
vector[...]
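
The three ways of selecting several elements of a vector can be sketched as follows. The slice bounds `2:6` and `1:8:2` are assumptions picked for the illustration; the index list is the one used above.

```python
import numpy as np

vector = 10 * np.arange(10)

# slice: start index 2 is included, stop index 6 is excluded
print(vector[2:6])

# list of indices: elements are returned in the order requested
print(vector[[1, 3, 2, 7]])

# slice with a step: start at 1, stop before 8, take every second element
print(vector[1:8:2])
```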

For a matrix you can use:

In [None]:
matrix = 10 * np.arange(20).reshape(4, 5)
print(matrix)

print(matrix[1,])
# which is the same as the more explicit form:
print(matrix[1, :])

In [None]:
# you can slice indices, on any axis
matrix[1:3, :]

In [None]:
# slicing on each axis
matrix[1:3, 2:4]

### Logical indexing

You can execute logical operations against the elements of a NumPy array. As a result of this, you will obtain an ndarray of the same shape as the initial array, filled in with `True` and `False`. The exact boolean value is the result of the logical operation on the values at the same positions:

In [None]:
a = np.array([[1,2], [3, 4], [5, 6]])
print(a)
print(...)

Furthermore, the resulting ndarray can be used for indexing. You will get only those elements for which the boolean operation yielded `True`:

In [None]:
larger_than_2 = a > 2
print(a[larger_than_2])

# direct selection
print(...)

# Note that the initial shape of a is lost. Can you explain why?
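
A compact sketch of the two steps, building the mask and then indexing with it, is given below. The names `mask` and `selected` are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

mask = a > 2        # boolean array with the same shape as a
selected = a[mask]  # 1D array holding only the elements where mask is True

print(mask)
print(selected, selected.shape)
```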

If you want to find the positions in the matrix where the values fulfill a given condition, use the function `np.where`:

In [None]:
np.where(a > 2)

For joint conditions one can use boolean operators. For example, to select elements which are larger than 2, but smaller than 6, one writes:

In [None]:
a[np.logical_and(...)]

Similar operators are: `np.logical_or`, `np.logical_not`, `np.logical_xor`.
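
As a sketch of combining conditions, the example below selects the values strictly between 2 and 6 with `np.logical_and`, and shows the equivalent `&` operator form. When using `&`, each comparison must be wrapped in parentheses because `&` binds more tightly than `>` and `<`. The names `between_func` and `between_op` are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

between_func = a[np.logical_and(a > 2, a < 6)]
between_op = a[(a > 2) & (a < 6)]              # same selection, operator form

outside = a[np.logical_or(a <= 2, a >= 6)]     # the complementary selection

print(between_func, between_op, outside)
```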

A common request for ndarrays is getting those floating-point elements which are defined, i.e. those which are not NaNs (NaN = not a number, a value resulting from invalid operations such as 0/0, np.sqrt(-1), np.log(-1), etc.):

In [None]:
tab = np.array([[1.0, 2.3, np.nan, 4], [10, np.nan, np.nan, 0]])
print(tab[...])

Basic slicing returns a view of the initial array, while indexing with a boolean mask or with a list of indices returns a copy. Assigning through any kind of index, however, updates the underlying initial array in place:

In [None]:
print('Before:\n', tab)
tab[np.isnan(tab)] = 0.0
print('After:\n', tab)
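
The difference between reading through an index and assigning through an index can be checked with `np.shares_memory`. This is a minimal sketch on a hypothetical array named `data`: a basic slice shares memory with the original, the result of a boolean-mask selection does not, and an assignment through a mask still changes the original array.

```python
import numpy as np

data = np.arange(6)

# basic slicing gives a view: writing through it changes `data`
view = data[1:4]
view[0] = 100

# selecting with a boolean mask gives a new array: `data` is not affected
picked = data[data > 3]
picked[0] = -1

# assigning through a mask updates `data` in place
data[data > 3] = 0

print(data)
print(np.shares_memory(view, data), np.shares_memory(picked, data))
```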

Logical indexing allows selection of the elements in an array whose contents are to be changed:

In [None]:
# even numbers are multiplied by 10, the other ones remain unchanged 
matrix = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print('Before update:\n', matrix)
matrix...
print('After update:\n', matrix)

You may perform update on a specific axis:

In [None]:
matrix = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=np.float32)
print('Before update:\n', matrix)
# columns 0, 2, 3 will be updated
bool_columns = [True, False, True, True]
matrix[:, bool_columns] = (matrix[:, bool_columns] + 3) / 10
print('After update\n', matrix)

### Supplementary bibliography 
1. https://docs.scipy.org/doc/numpy-1.13.0/glossary.html
1. https://engineering.ucsb.edu/~shell/che210d/numpy.pdf
1. http://www.scipy-lectures.org/intro/numpy/numpy.html#indexing-and-slicing

## Broadcasting

Broadcasting allows operations with matrices of incompatible dimensions, under specific circumstances. For example, following the mathematical definition of matrix addition, the matrices `a` and `b` below could not be added:

In [None]:
a = np.array([[0.0,0.0,0.0],[10.0,10.0,10.0],[20.0,20.0,20.0],[30.0,30.0,30.0]]) 
b = np.array([0.0,1.0,2.0])  

print('a=\n{0}\n'.format(a))
print('b=\n{0}\n'.format(b))

Through broadcasting, the matrix `b` is automatically extended by replicating its row:
![broadcast](./images/broadcast1.png)

In [None]:
# broadcasting
result = ...
print('result=\n{0}\n'.format(result))

Broadcasting is done if some conditions are fulfilled. When one operates on two matrices, NumPy compares the sizes on each dimension, starting with the last dimension. Two dimensions are compatible when: 

1. they are equal, or
1. one of them is 1

The rules above are not fulfilled, for example, by:

![Impossible broadcast](./images/broadcast2.png)
or by the case below:

In [None]:
x = np.arange(4)
y = np.ones(5)
print(x.shape, y.shape)

#print(x+y) # ValueError: operands could not be broadcast together with shapes (4,) (5,) 
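
The compatibility rules can be tested without building any data, using `np.broadcast_shapes` (available in NumPy 1.20 and later, which is assumed here). The sketch below checks two compatible pairs of shapes and one incompatible pair; the flag `compatible` is a name introduced for the illustration.

```python
import numpy as np

print(np.broadcast_shapes((4, 3), (3,)))    # trailing dimensions are equal
print(np.broadcast_shapes((4, 1), (1, 5)))  # dimensions of size 1 are stretched

try:
    np.broadcast_shapes((4,), (5,))         # 4 != 5 and neither is 1
    compatible = True
except ValueError:
    compatible = False

print(compatible)
```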

In [None]:
x = np.arange(4).reshape(4, 1)
print('x shape: ', x.shape)
print('x:\n', x)

In [None]:
y = np.arange(5).reshape(1, 5)
print('y shape: ', y.shape)
print('y:\n', y)

In [None]:
z = x + y
print('z shape:', z.shape)
print('z\n', z)

Conceptually, the broadcast behaves as if the following two matrices had been produced right before doing the addition (NumPy does not actually materialize the copies):

In [None]:
# clone x and create multiple identical columns:
x_broadcast = np.tile(x, (1, 5))
x_broadcast

In [None]:
# clone y and create multiple identical rows:
y_broadcast = np.tile(y, (4, 1))
y_broadcast

Now, the two matrices can be added:

In [None]:
z_broadcast = x_broadcast + y_broadcast
assert np.all(z_broadcast == z)  # np.alltrue was removed in NumPy 2.0
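
While `np.tile` really copies the data, `np.broadcast_to` gives the stretched shape without copying. The sketch below, with the illustrative names `x_tiled` and `x_stretched`, shows that both have the same contents, but the broadcast version reuses the memory of `x` and has a stride of 0 along the repeated axis.

```python
import numpy as np

x = np.arange(4).reshape(4, 1)

x_tiled = np.tile(x, (1, 5))               # real copy, shape (4, 5)
x_stretched = np.broadcast_to(x, (4, 5))   # read-only view, shape (4, 5)

print(np.array_equal(x_tiled, x_stretched))
print(x_stretched.strides)                 # stride 0 along the repeated axis
print(np.shares_memory(x, x_stretched), np.shares_memory(x, x_tiled))
```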

### Broadcast practical example

([Source](https://eli.thegreenplace.net/2015/broadcasting-arrays-in-numpy/)) For specific food types, we decompose them into fats, carbs and proteins, all of them measured in grams. We want to convert the components of each food type into calories, by using multiplicative constants:
1. calories for fats = 9 * fats in grams
1. calories for proteins = 4 * protein grams
1. calories for carbs = 4 * carbs in grams

![portions table](./images/broadcast3.png)

The multiplication is computed by broadcasting:

In [None]:
weights = np.array([
  [0.3, 2.5, 3.5],
  [2.9, 27.5, 0],
  [0.4, 1.3, 23.9],
  [14.4, 6, 2.3]]
)

cal_per_g = np.array([9, 4, 4])

# broadcasting
calories = ...

print('Calories:\n', calories)

### Supplementary bibliography 

[Basic broadcasting: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)

http://scipy.github.io/old-wiki/pages/EricsBroadcastingDoc

http://cs231n.github.io/python-numpy-tutorial/#numpy-broadcasting

## Vectorized computation

### Example

We have 2 collections of numbers: the first one contains distances, the second one contains the time required to cover them. We want to compute the speed. We will present two approaches: a classical `for` loop, and vectorized computation.

In [None]:
distances = [10, 20, 23, 14, 33, 45]
durations = [0.3, 0.44, 0.9, 1.2, 0.7, 1.1]

In [None]:
# Version 1: traditional loop
speeds = []
for i in range(len(distances)):
    ...
    
print('Speed computed by for loop: ', speeds)

In [None]:
# Version 2: vectorization
# Vectorization works on vector/matrices, the first step is to convert lists to NumPy arrays

distances_array = np.array(distances)
durations_array = np.array(durations)

# We use NumPy operations which work on whole arrays. The underlying C code
# can use the Single Instruction Multiple Data (SIMD) facilities of the CPU.

speeds_array = ...

print(f'speed computed by vectorization: {speeds_array}')

assert np.all(speeds_array == speeds)  # np.alltrue was removed in NumPy 2.0
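
One possible way of writing both versions is sketched below. The names `speeds_loop` and `speeds_vectorized` are introduced here for the illustration; the comparison uses `np.allclose`, which tolerates tiny floating-point differences.

```python
import numpy as np

distances = [10, 20, 23, 14, 33, 45]
durations = [0.3, 0.44, 0.9, 1.2, 0.7, 1.1]

# loop version: one division per pair of values
speeds_loop = []
for i in range(len(distances)):
    speeds_loop.append(distances[i] / durations[i])

# vectorized version: a single element-wise division of two arrays
speeds_vectorized = np.array(distances) / np.array(durations)

print(speeds_loop)
print(speeds_vectorized)
print(np.allclose(speeds_vectorized, speeds_loop))
```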

### Benefits of vectorization

1. Fast(er) execution
1. Short and often more readable code

Example: let us compute
$$
\sum\limits_{i=0}^{N-1} (i\%3-1) \cdot i
$$

In [None]:
# Python function with non-vectorized computation (`for` loop)

n = 100000

def func_python(n):
    d = 0.0
    for i in range(n):
        d += (i%3-1) * i
    return d

print(func_python(n))

In [None]:
%timeit func_python(n)

In [None]:
# function with vectorized computation

def func_numpy(n):
    i_array = np.arange(n)
    return ((i_array % 3 - 1 ) * i_array).sum()

print(func_numpy(n))

In [None]:
%timeit func_numpy(n)

Most operators and NumPy functions work on any number of values, element by element; they are also called universal functions (ufuncs). They are optimized to use the SIMD capabilities of the CPU. The following operators and functions work efficiently with arrays:

- arithmetic operators: + - * / // % **
- bitwise operators: & | ~ ^ >> <<
- comparisons: < <= > >= == !=
- math functions: np.sin, np.log, np.exp, ...
- special scipy functions: `scipy.special.*`

Although Python already provides some of these functions as built-ins (e.g. `sum`, `min`, `max`), the NumPy versions are much faster on arrays:

In [None]:
from random import random
c = [random() for i in range(n)]

In [None]:
# Python builtin function
%timeit sum(c)

In [None]:
# NumPy vectorized: prepare the NumPy array
c_array = np.array(c)

In [None]:
# NumPy vectorized: use NumPy's sum
%timeit ...

### Vectorization exercises

#### Sum of filtered numbers

For a given list of numbers, compute the sum of contained values which are in a specified interval $[a, b]$. 

In [None]:
n = 100000000
numbers = [int(1000 * random()) for _ in range(n)]

a, b = 100, 500

A direct Python implementation would be:

In [None]:
%%timeit
...

Using comprehension:

In [None]:
%%timeit
s = ...

Python built-in filtering:

In [None]:
%%timeit
s = ...

NumPy vectorization - build an ndarray vector, then find the indices of interest:

In [None]:
np_numbers = np.array(numbers)

In [None]:
%%timeit
ind_a = ...
ind_b = ...
s = np_numbers[np.logical_and(ind_a, ind_b)].sum()
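
The four approaches can be compared on a tiny input where the answer is easy to check by hand. In the sketch below, the list `numbers` is a hypothetical small sample (not the large random list used for timing), and the names `s_loop`, `s_comp`, `s_filter` and `s_numpy` are introduced for the illustration.

```python
import numpy as np

numbers = [50, 100, 250, 500, 700]  # hypothetical small input
a, b = 100, 500

# plain loop
s_loop = 0
for value in numbers:
    if a <= value <= b:
        s_loop += value

# comprehension (generator expression)
s_comp = sum(value for value in numbers if a <= value <= b)

# built-in filter
s_filter = sum(filter(lambda value: a <= value <= b, numbers))

# NumPy: combine two boolean masks, then sum the selected elements
np_numbers = np.array(numbers)
s_numpy = np_numbers[(np_numbers >= a) & (np_numbers <= b)].sum()

print(s_loop, s_comp, s_filter, s_numpy)
```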

#### Counting the number of sign changes

For a given vector with positive and negative values, count how many times the sign changes between consecutive positions. It is guaranteed that the vector does not contain 0 values.

Example: for input 
$$
v = [-1, 2, 3, -3, 1, -1, 1]
$$
the sign changes between positions (0, 1), (2, 3), (3, 4), (4, 5), (5, 6). The requested answer is 5.

In [None]:
n = 1000000
v = np.random.randint(-10, 10, n)
# replace 0s with random -1 or +1; one random value is drawn for each zero
zeros = v == 0
v[zeros] = np.random.randint(0, 2, size=zeros.sum()) * 2 - 1  # random values {0, 1} are mapped to {0*2-1, 1*2-1} = {-1, 1}
assert 0 == (v == 0).sum()

The Python-based approach is straightforward:

In [None]:
%%timeit
...

The vectorized solution is:

In [None]:
%%timeit
# get the signs of the values
signs = ...
# compute the difference between successive values. Two successive equal signs produce 0, different signs produce -2 or +2
first_order_difference = ...
# find non zero positions
positions_not_0_in_diff = ...
# the number of non-zero positions is the requested value
sign_changes = len(positions_not_0_in_diff)
# sign_changes = len(np.where(np.diff(np.sign(v)))[0])
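
Applied to the small example vector from the problem statement, the steps above can be sketched as follows. A plain Python count (`loop_changes`, a name introduced here) is added as a cross-check, and `np.count_nonzero` is used as one possible way of counting the non-zero differences.

```python
import numpy as np

v = np.array([-1, 2, 3, -3, 1, -1, 1])

# loop version: the product of two neighbours is negative when their signs differ
loop_changes = sum(1 for i in range(len(v) - 1) if v[i] * v[i + 1] < 0)

# vectorized version
signs = np.sign(v)       # -1 for negative values, +1 for positive ones
steps = np.diff(signs)   # 0 where the sign is kept, -2 or +2 where it flips
sign_changes = np.count_nonzero(steps)

print(loop_changes, sign_changes)
```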

#### Cumulative sum of values in a vector

Given a vector v[0..n-1], find the cumulative sum of its values: s=[v[0], v[0]+v[1], v[0]+v[1]+v[2], ..., v[0]+v[1]+...+v[n-1]]

In [None]:
n = 10000000
v = np.random.rand(n)

We will use the obvious relation:
$$
s[i] = 
\begin{cases}
v[0] & \textrm{if }i =0 \\
s[i-1] + v[i] & \textrm{if }i> 0
\end{cases}
$$

The unvectorized version is:

In [None]:
%%timeit
...

For the vectorized version, we mention that NumPy offers a function specifically for this, called `cumsum`. Its documentation is found [here](https://numpy.org/doc/stable/reference/generated/numpy.cumsum.html).

In [None]:
%%timeit
s = ...

Another approach is based on the `accumulate` method provided by NumPy for any universal function, as seen [here](https://numpy.org/doc/stable/reference/generated/numpy.ufunc.accumulate.html).

In [None]:
%%timeit
s = ...
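
On a small hypothetical vector, the three approaches can be sketched and compared as below. The names `s_loop`, `s_cumsum` and `s_accumulate` are introduced for the illustration; the loop follows the recurrence given above.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical small input

# unvectorized: s[0] = v[0], s[i] = s[i-1] + v[i]
s_loop = np.empty_like(v)
s_loop[0] = v[0]
for i in range(1, len(v)):
    s_loop[i] = s_loop[i - 1] + v[i]

# dedicated NumPy function
s_cumsum = np.cumsum(v)

# accumulate method of the `add` universal function
s_accumulate = np.add.accumulate(v)

print(s_loop, s_cumsum, s_accumulate)
```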

#### Converting vectors to monotonically increasing vectors

Given an input vector, add the minimum amount to its components, where needed, to ensure that the resulting vector is monotonically non-decreasing (each value is at least as large as the one before it).

For example, if we start from:
$$
v = [1, 2, \color{red}{1}, 3, 3, 5, \color{red}{4}, 6]
$$
we transform it as requested into:
$$
v = [1, 2, \color{red}{2}, 3, 3, 5, \color{red}{5}, 6]
$$

In [None]:
n = 10000000
v = np.random.randint(0, 1000, n)

v_clone_py = v.copy()
v_clone_np = v.copy()

The unvectorized Python approach would be:

In [None]:
%%timeit
max_so_far = v_clone_py[0]
for i in range(len(v_clone_py)):
    if v_clone_py[i] > max_so_far:
        max_so_far = v_clone_py[i]
    v_clone_py[i] = max_so_far

For the vectorized approach, we use the aforementioned `accumulate` method, now called on the `maximum` universal function:

In [None]:
%%timeit
np.maximum.accumulate(v_clone_np)
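
On the small example vector from the problem statement, the running maximum gives exactly the requested result. Note that `np.maximum.accumulate` returns a new array and leaves its input unchanged, so the result has to be stored; the name `fixed` is introduced here for the illustration.

```python
import numpy as np

v = np.array([1, 2, 1, 3, 3, 5, 4, 6])

# each output element is the largest value seen so far
fixed = np.maximum.accumulate(v)

print(v)
print(fixed)
```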

#### Finding the closest pair of points

We have $n$ points in 2D space. Their coordinates are given in vectors **x** and **y**, respectively. Compute the closest pair of points, using Euclidean distance:
$$
d^2((x_i, y_i), (x_j, y_j)) = (x_i-x_j)^2 + (y_i-y_j)^2
$$

In [None]:
n = 2000
x = np.random.random(size = n)
y = np.random.random(size = n)

In [None]:
# Version 1: compute the matrix `d` of pairwise distances. d[i, j] will store the square of the distance 
# between points of coordinates (xi, yi) and (xj, yj), respectively.

In [None]:
%%timeit

d = np.empty((n, n))
for i in range(n):
    for j in range(i, n):
        d[i, j] = d[j, i] = (x[i] - x[j])**2 + (y[i]-y[j])**2

In [None]:
# compute the indices i and j, i != j, for which the distance is minimized

def closest_pair(mat):
    n = mat.shape[0]
    # distance between a point and itself is always 0; we will exclude these cases by setting infinity on the main diagonal of distances
    i = np.arange(n)
    mat[i, i] = np.inf
    pos_flatten = np.argmin(mat)
    return pos_flatten // n, pos_flatten % n
    

In [None]:
# Version 2: vectorized computation and broadcasting

In [None]:
%%timeit

dx = (x[:, np.newaxis] - x[np.newaxis, :]) ** 2
dy = (y[:, np.newaxis] - y[np.newaxis, :]) ** 2

d = dx + dy
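
Putting the pieces together on three hypothetical points, the sketch below computes the pairwise squared distances by broadcasting and then locates the closest pair. It uses `np.fill_diagonal` and `np.unravel_index` as one possible alternative to the manual index arithmetic in `closest_pair`.

```python
import numpy as np

# three hypothetical points: (0, 0), (1, 1), (0.1, 0)
x = np.array([0.0, 1.0, 0.1])
y = np.array([0.0, 1.0, 0.0])

# (n, 1) minus (1, n) broadcasts to an (n, n) matrix of coordinate differences
dx = (x[:, np.newaxis] - x[np.newaxis, :]) ** 2
dy = (y[:, np.newaxis] - y[np.newaxis, :]) ** 2
d = dx + dy

# exclude the zero distance between a point and itself
np.fill_diagonal(d, np.inf)

# position of the minimum, converted from a flat index to (row, column)
i, j = np.unravel_index(np.argmin(d), d.shape)

print(i, j, d[i, j])
```

Since the matrix is symmetric, the same pair also appears as `(j, i)`; `np.argmin` reports the first occurrence in row-major order.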

### Supplementary bibliography 

[https://speakerdeck.com/jakevdp/losing-your-loops-fast-numerical-computing-with-numpy-pycon-2015](https://speakerdeck.com/jakevdp/losing-your-loops-fast-numerical-computing-with-numpy-pycon-2015)

[Losing your Loops Fast Numerical Computing with NumPy](https://www.youtube.com/watch?v=EEUXKG97YRw)

Ivan Idris, NumPy Cookbook, Packt Publishing; 2nd Revised ed., 2015