# <u>An introduction to Numpy vectorisation</u>
## Or: the science of avoiding for loops in Python
### 

<hr style="border:1px solid blue">

### 
### This notebook serves as supplementary material for the class `Advanced Scientific Programming in Python`.
### It is based on various guest lectures I have given on `Numpy` vectorisation at EPFL and other institutions.
### 

<hr style="border:1px solid blue">

### 
###
### `Numpy` is the de-facto standard for doing "`Matlab`" stuff in Python.
### It is widely regarded as one of the best open-source libraries for scientific computing / academic programming in existence
### and has contributed substantially to the establishment of Python as a standard language in the academic sector.

### 

<hr style="border:1px solid blue">

### 

### As we all know, the most important `object` that `numpy` provides is the `numpy.ndarray`.

### A `numpy.ndarray` can be regarded as a `list of lists` containing numeric data or (in more exotic cases), other python `objects`.
### The `list of lists` statement is evidenced by the fact that we can convert `numpy.ndarray`'s to lists and vice-versa.

In [None]:
import numpy as np

# array [0, 1, 2, ..., 9]
arr = np.arange(10)

print('`arr` as a list: ', arr.tolist(), '\n')

# list [0, 1, 2, ..., 9]
arr_list = list(range(10))

print('`arr_list` as an array: ', np.asarray(arr_list))

*** 
### 
### A key difference is the way we can manipulate `numpy.ndarray`'s using mathematical operations.

### 

### <u> Example </u>:

In [None]:
import numpy as np

arr = np.arange(10)

print('`arr + arr` as a numpy array: ', arr + arr, '\n')

print('`arr + arr` as a list: ', arr.tolist() + arr.tolist())
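
### The same difference shows up for multiplication by a number. The following is a minimal sketch
### (the variable names are chosen for illustration only): arrays operate elementwise, lists repeat themselves.

```python
import numpy as np

arr = np.arange(5)
lst = arr.tolist()

# for an array, `2 * arr` multiplies every element by 2
doubled_arr = 2 * arr

# for a list, `2 * lst` concatenates the list with itself
repeated_lst = 2 * lst

# to double the entries of a list we need an explicit loop / comprehension
doubled_lst = [2 * x for x in lst]

print('2 * arr:            ', doubled_arr)
print('2 * lst:            ', repeated_lst)
print('list comprehension: ', doubled_lst)
```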

*** 
### 

### `numpy.ndarray`'s can have **ANY** shape. 
### And when I say **any** I mean (virtually) any.
### <u> Example </u>:

In [None]:
import numpy as np

a0 = np.array(0)
a1 = np.array([0, 1])
a2 = np.array([[0, 1], [0, 1]])

# and so on ....

a10 = np.array([[[[[[[[[[0, 1], [0, 1]]]]]]]]]])

print(a0.shape)
print(a1.shape)
print(a2.shape)
print(a10.shape)

*** 
### 

## One last thing:
### What happens when I do this ?

In [None]:
import numpy as np

arr = np.array([[0, 1, 2], [3, 4]])

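### It didn't work because `[[0, 1, 2], [3, 4]]` is not a valid tensor:
### the two rows have different lengths (`3` and `2`), so there is no rectangular shape the data could fit into.
### Recent `Numpy` versions raise a `ValueError` here, older ones fall back to an array of Python `list` objects and emit a deprecation warning.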
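
### A hedged sketch of what goes wrong: the outcome of the plain `np.array` call depends on the installed
### `Numpy` version, so the snippet handles both cases. Passing `dtype=object` explicitly is the
### documented way to request an array of Python objects; the names `rows` and `ragged` are illustrative.

```python
import numpy as np

rows = [[0, 1, 2], [3, 4]]

# the rows have different lengths, so they do not form a rectangular block
row_lengths = [len(row) for row in rows]
print('row lengths: ', row_lengths)

try:
    arr = np.array(rows)
    # older Numpy versions end up here (with a deprecation warning)
    print('created an array with dtype', arr.dtype, 'and shape', arr.shape)
except ValueError as err:
    # newer Numpy versions end up here
    print('ValueError: ', err)

# explicitly asking for an object array works: one axis holding two `list` objects
ragged = np.array(rows, dtype=object)
print('ragged.shape: ', ragged.shape)
```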

<hr style="border:1px solid blue">

### 

# <u>Lesson 1</u>: Numpy broadcasting.

### 

In [None]:
%reset -f

*** 
### 
## <u> Task </u> (towards matrix multiplication):
### Given an array `mat` of shape `(n, m)` and an array `vec` of shape `(m,)`,
### create the 2D array (matrix) `A`, with `A[i, j] = mat[i, j] * vec[j]`

### 

In [None]:
import numpy as np

n = 4
m = 5

# create arrays of shape (n, m) and (m,) with n = 4, m = 5
mat = np.array([ [1, 2, 0, 0, 0],
                 [2, 1, 0, 0, 0],
                 [0, 1, 2, 3, 0],
                 [4, 1, 3, 2, 5] ], dtype=float)

vec = np.array([1, 3, 2, 5, 6], dtype=float)

*** 
### We start with the (non-pythonic) **C / C++ style solution**
#### (seeing it breaks my heart)
### 

In [None]:
# create an empty array of shape (n, m) and data type `float`
A = np.empty((n, m), dtype=float)

# populate the array using two nested for loops
for i in range(n):
    for j in range(m):
        A[i, j] = mat[i, j] * vec[j]

print("A equals: \n \n", A)

*** 
### It gets the job done but it **defeats the purpose of using python** !
### 

***
### On to the pythonic, vectorised solution.

### The most important ingredient of vectorisation is adding new `artificial` axes.
### For this, we utilise the `np.newaxis` constant (an alias for `None`).
### 
*** 
### For the sake of readability, it is very common to set `_ = np.newaxis`
### 

In [None]:
# create the variable `_` as a reference to `np.newaxis`
_ = np.newaxis

*** 
### 
### Using our previous definition of `vec`, let us see what this does

In [None]:
# read: `vec of newaxis comma everything`
_vec = vec[_, :]

*** 
### 
### We print `_vec`'s shape to see what prepending an artificial axis did.

In [None]:
print("The shape of _vec is: ", _vec.shape)
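
### There are several equivalent ways of adding an axis of length `1`. The sketch below lists a few of them
### side by side; which one to use is a matter of taste (the dictionary `candidates` exists only for printing).

```python
import numpy as np

vec = np.array([1, 3, 2, 5, 6], dtype=float)

# four ways of prepending an axis of length 1
candidates = {
    'vec[np.newaxis, :]': vec[np.newaxis, :],
    'vec[None, :]': vec[None, :],
    'vec.reshape(1, -1)': vec.reshape(1, -1),
    'np.expand_dims(vec, 0)': np.expand_dims(vec, 0),
}

for name, arr in candidates.items():
    print('{:<24} has shape {}'.format(name, arr.shape))

# appending instead of prepending gives a column of shape (5, 1)
col = vec[:, np.newaxis]
print('vec[:, np.newaxis] has shape', col.shape)
```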

*** 
### 
### Now both `mat` and `_vec` are matrices of shape `(4, 5)` and `(1, 5)`, respectively.
### Common sense suggests that it shouldn't be possible to multiply them (elementwise).
### 
*** 
### We'll try it anyway:

In [None]:
print("mat * _vec: \n \n", mat * _vec, '\n\n')
print("The C++ style A equals: \n\n", A)

### 
### They are the same !!
### 
*** 
### Can we understand why ?

### 

### The `numpy.broadcast_shapes` function gives us the output shape of performing
### `+, -, *, / , ...` between two (or more) arrays of given shapes.

In [None]:
# get the input shapes of `mat * _vec` and print the output shape as predicted by numpy
input_shape0 = mat.shape
input_shape1 = _vec.shape

output_shape = np.broadcast_shapes(input_shape0, input_shape1)

print("Multiplying arrays of shape {} and {} gives the output shape: {}.".format(input_shape0, input_shape1, output_shape))

*** 
### 
### Somehow `numpy` must have filled in the missing values in the `1` axis of `_vec`.
### We can utilise the `numpy.broadcast_to(arr, output_shape)` function to see what `_vec` was broadcast to under the hood:

In [None]:
_vec_broadcast = np.broadcast_to(_vec, output_shape)

print('_vec broadcast to shape {} equals: \n \n'.format(output_shape), _vec_broadcast)

*** 
### 
### $\implies$ Numpy has repeated the artificial `1` axis **as many times as necessary** to match the `output_shape`.
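
### "Repeated" should be read with some care: as far as the result is concerned the values are repeated,
### but `np.broadcast_to` returns a read-only view and does not copy any data. The sketch below inspects the strides
### to make this visible and compares with `np.repeat`, which really copies.

```python
import numpy as np

vec = np.array([1, 3, 2, 5, 6], dtype=float)

# broadcast the (1, 5) array to shape (4, 5)
view = np.broadcast_to(vec[np.newaxis, :], (4, 5))

# a stride of 0 along the zeroth axis: every "row" points to the same memory
print('strides:          ', view.strides)
print('writeable:        ', view.flags.writeable)
print('shares memory:    ', np.shares_memory(view, vec))

# np.repeat creates an actual copy with four independent rows
copied = np.repeat(vec[np.newaxis, :], 4, axis=0)
print('copy shares memory: ', np.shares_memory(copied, vec))
```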

*** 
### 

### In general, if two arrays have shapes 
### `(n0, n1, ..., nM)` and 
### `(m0, m1, ..., mM)`,
### the output shape is: 
### `(max(n0, m0), max(n1, m1), ..., max(nM, mM))`. 
### If `ni != mi` then either `ni == 1` or `mi == 1` must hold.
<hr style="border:1px solid blue">

### 
## <u>Exercise 1.1</u>:
### What are the output shapes of performing `+, -, *, /, ...` between arrays of the following shapes ?

### 1. `(5, 6)` and `(1, 6)`
### 2. `(7, 1)` and `(1, 6)`
### 3. `(1, 6)` and `(7, 2)`
### 4. `(4, 5)` and `(5,)`  (make a guess)
*** 

### 
## <u>Solution</u>:

### 1. `(5, 6)`
### 2. `(7, 6)`
### 3. Not allowed because `6 != 2` and neither of them is `1` !
### 4. `(4, 5)`

<hr style="border:1px solid blue">

### 
### In the last example, numpy has **prepended** as many `1` axes as necessary to match the shape of the longer array.
### `(4, 5)` and `(5,)` becomes `(4, 5)` and `(1, 5)` becomes `(4, 5)` and `(4, 5)` (repeating the artificial `1` axis).
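
### The two rules (prepend `1` axes, then compare axis by axis) fit into a few lines of pure Python.
### The helper `broadcast_shape` below is a hypothetical name used for illustration; it is a sketch of the rule
### for two shapes, not the way `Numpy` implements it, and it is compared against `np.broadcast_shapes`.

```python
import numpy as np


def broadcast_shape(shape0, shape1):
    """Output shape of broadcasting two shapes (illustrative re-implementation)."""
    ndim = max(len(shape0), len(shape1))

    # step 1: prepend `1` axes to the shorter shape
    padded0 = (1,) * (ndim - len(shape0)) + tuple(shape0)
    padded1 = (1,) * (ndim - len(shape1)) + tuple(shape1)

    # step 2: compare axis by axis
    out = []
    for n, m in zip(padded0, padded1):
        if n != m and 1 not in (n, m):
            raise ValueError('cannot broadcast {} and {}'.format(shape0, shape1))
        # take the length that is not 1 (equals max(n, m) for positive lengths)
        out.append(m if n == 1 else n)
    return tuple(out)


for shapes in [((3, 1, 4), (2, 4)), ((8, 1), (1, 9)), ((2, 3, 4), (4,))]:
    print(shapes, '->', broadcast_shape(*shapes), 'numpy says', np.broadcast_shapes(*shapes))
```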

### 
*** 

## <u>Example</u>:

In [None]:
mat = np.random.randn(4, 5)
vec = np.random.randn(5)

print('mat * vec == mat * vec[_, :] ? \n\n', mat * vec == mat * vec[_, :])
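
### Why bother ? Besides being shorter, the broadcast version is usually much faster than nested loops.
### The following sketch times both on a moderately sized example; the sizes `n = 200`, `m = 300` and the
### function names are arbitrary choices, and the measured times depend on the machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 300
mat = rng.standard_normal((n, m))
vec = rng.standard_normal(m)


def loop_version(mat, vec):
    # C-style: two nested for loops
    A = np.empty(mat.shape, dtype=float)
    for i in range(mat.shape[0]):
        for j in range(mat.shape[1]):
            A[i, j] = mat[i, j] * vec[j]
    return A


def broadcast_version(mat, vec):
    # vectorised: (n, m) times (m,) broadcasts to (n, m)
    return mat * vec


same = np.allclose(loop_version(mat, vec), broadcast_version(mat, vec))
t_loop = timeit.timeit(lambda: loop_version(mat, vec), number=3) / 3
t_vec = timeit.timeit(lambda: broadcast_version(mat, vec), number=3) / 3

print('same result ?     ', same)
print('loops:      {:.6f} s per call'.format(t_loop))
print('broadcast:  {:.6f} s per call'.format(t_vec))
```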

*** 
### 

## <u>Exercise 1.2</u>:
### What are the output shapes ?

In [None]:
import numpy as np


### 1.
shape0, shape1 = (1, 6, 5), (5,)
print('1: ', np.broadcast_shapes(shape0, shape1), '\n')

### 2.
shape0, shape1 = (1, 5, 1), (6,)
print('2: ', np.broadcast_shapes(shape0, shape1), '\n')

### 3. Multiply (5, 6) array by a number. Output shape ?
arr0 = np.random.randn(5, 6)
a = 5
print('3: ', (a * arr0).shape)

<hr style="border:1px solid blue">

### 
### Now that we have gained an intuition for numpy broadcasting, we conclude this lesson with a very important
### 
## <u>Exercise 1.3</u>: 
### Given arrays `arr0` of shape `(5,)` and `arr1` of shape `(7,)`, create the `(5, 7)` array `abs_outer`
### with `abs_outer[i, j] = abs(arr0[i] - arr1[j])`.

In [None]:
import numpy as np
_ = np.newaxis

arr0 = np.arange(5)
arr1 = np.arange(7)

*** 
### 
### The (wrong) *C*-style implementation first:

In [None]:
abs_outer_C = np.empty((5, 7), dtype=int)

for i in range(5):
    for j in range(7):
        abs_outer_C[i, j] = abs(arr0[i] - arr1[j])

*** 
### 
### Now the correct implementation. 
### **HINT**: `np.abs` is the vectorised version of `abs` that can be applied elementwise to any array.
### 
### <u>Solution</u>:

In [None]:
abs_outer = np.abs(arr0[:, _] - arr1[_, :])

print('abs_outer equals abs_outer_C ? \n\n', abs_outer == abs_outer_C)
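
### Printing a whole matrix of `True` is hard to read for larger arrays. A sketch of two alternatives:
### `np.array_equal` condenses the comparison into a single `bool`, and the ufunc method `np.subtract.outer`
### computes all pairwise differences without any explicit `np.newaxis`.

```python
import numpy as np

arr0 = np.arange(5)
arr1 = np.arange(7)

# broadcasting: (5, 1) minus (1, 7) gives (5, 7)
abs_outer = np.abs(arr0[:, np.newaxis] - arr1[np.newaxis, :])

# the same using the `outer` method of the `np.subtract` ufunc
abs_outer_ufunc = np.abs(np.subtract.outer(arr0, arr1))

print('shape: ', abs_outer.shape)
print('both implementations agree ? ', np.array_equal(abs_outer, abs_outer_ufunc))
```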

<hr style="border:1px solid blue">

### 

# <u>Lesson 2</u>: Array contractions.

### 

In [None]:
%reset -f

*** 
### 
### An often-encountered operation in scientific computing is summing an array over a specified axis.
### We start with the most basic example.
## <u>Task</u>: Given an array `arr` of shape `(n,)` sum all its elements.

In [None]:
import numpy as np

n = 5

# the numbers [0, 1, 2, 3, 4]
arr = np.arange(n)

*** 
### 
### We start with the (incorrect) **C**-style implementation
#### (doing this will result in capital punishment)

In [None]:
arr_sum_C = 0
for i in range(n):
    arr_sum_C += arr[i]
    
print('The C-style sum of {} equals: {}'.format(arr, arr_sum_C))

*** 
### 
### We may instead utilise the `numpy.sum` function to sum the elements of an array:

In [None]:
arr_sum = np.sum(arr)

print('The pythonic-style sum of {} equals: {}'.format(arr, arr_sum))

*** 
### 
### A more concise (but equivalent) syntax is immediately invoking the `.sum` method on `arr`:

In [None]:
arr_sum = arr.sum()

print('The pythonic-style sum of {} equals: {}'.format(arr, arr_sum))

<hr style="border:1px solid blue">

### 

### The `.sum()` method takes the optional keyword argument `axis=None`, where `None` sums over all of the array's axes.
### Alternatively, `axis` can be an `int`, specifying one axis to sum over, or a `tuple` of `int`'s for summing over several axes.
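
### A short sketch of the different `axis` options on an array of shape `(2, 3, 4)`. The `keepdims=True`
### argument is not discussed in this notebook; it is shown here because it keeps the summed axis
### as an axis of length `1`, which is convenient for broadcasting afterwards.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# axis=None (the default): sum over everything, the result is a single number
total = arr.sum()

# an int: sum over one axis, that axis disappears from the shape
sum_axis0 = arr.sum(axis=0)

# a tuple of ints: sum over several axes at once
sum_axes02 = arr.sum(axis=(0, 2))

# keepdims=True: the summed axis is kept with length 1
sum_last_kept = arr.sum(axis=-1, keepdims=True)

print('total:                       ', total)
print('arr.sum(axis=0).shape:       ', sum_axis0.shape)
print('arr.sum(axis=(0, 2)).shape:  ', sum_axes02.shape)
print('with keepdims=True:          ', sum_last_kept.shape)
```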
*** 

### 
## <u>Exercise 2.1</u>: 
### Given an array `arr` of shape `(n, m)` compute the array of shape `(n,)` containing
### the $l^1(\mathbb{R}^m)$-norm of each **row** of `arr`.

In [None]:
import numpy as np

# create the array

# [[0, 1, -2, 3, -4], [-5, 6, -7, 8, -9]]

# of shape (2, 5)

arr = np.array([ [ 0, 1, -2, 3, -4],
                 [-5, 6, -7, 8, -9] ])

*** 
### 
### <u>Solution</u>:

In [None]:
l1_norm_rows = np.abs(arr).sum(1)

print('\n'.join(["The l^1 norm of {} equals {}.".format(row, norm) for row, norm in zip(arr, l1_norm_rows)]))
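
### As a sanity check, the result can be compared with `np.linalg.norm`, which computes vector norms along
### a given axis. This is a sketch for cross-checking, not a replacement for the exercise.

```python
import numpy as np

arr = np.array([[0, 1, -2, 3, -4],
                [-5, 6, -7, 8, -9]])

# broadcasting-free solution from above: absolute values, then sum over axis 1
l1_norm_rows = np.abs(arr).sum(1)

# library routine: vector 1-norm of every row
l1_norm_rows_linalg = np.linalg.norm(arr, ord=1, axis=1)

print('np.abs(arr).sum(1):   ', l1_norm_rows)
print('np.linalg.norm(...):  ', l1_norm_rows_linalg)
```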

<hr style="border:1px solid blue">

### 
## <u>Exercise 2.2</u>:
### We are given an array `arr_of_matrices` of shape `(n, m, p)`.
### 
### The `i`-th element along the zeroth axis represents a matrix of shape `(m, p)`
### i.e, `arr_of_matrices[i, :, :]` is the `i`-th matrix.
*** 
### 
### Compute the array `arr_frob` of shape `(n,)` containing the Frobenius norm of each matrix. 
### **FYI**: $ ||A||_F^2 = \sum_i \sum_j \vert A_{ij} \vert^2 $

In [None]:
import numpy as np

n, m, p = 9, 4, 6

arr_of_matrices = np.arange(n * m * p).reshape((n, m, p))

*** 
### 
### <u>Solution</u>:

In [None]:
arr_frob = np.sqrt((arr_of_matrices ** 2).sum(axis=(1, 2)))

for matrix, norm in zip(arr_of_matrices, arr_frob):
    print('The matrix \n\n{}\n\n has Frobenius norm ||A|| = {}. \n\n'.format(matrix, norm))
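
### The result can again be cross-checked against `np.linalg.norm`: when `axis` is a tuple of two `int`'s,
### the default norm is the Frobenius norm of the matrices living on these two axes. A minimal sketch:

```python
import numpy as np

n, m, p = 9, 4, 6
arr_of_matrices = np.arange(n * m * p).reshape((n, m, p))

# square, sum over the two matrix axes, take the square root
arr_frob = np.sqrt((arr_of_matrices ** 2).sum(axis=(1, 2)))

# library routine applied to all n matrices at once
arr_frob_linalg = np.linalg.norm(arr_of_matrices, axis=(1, 2))

print('shape: ', arr_frob.shape)
print('agrees with np.linalg.norm ? ', np.allclose(arr_frob, arr_frob_linalg))
```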

<hr style="border:1px solid blue">

### 

### In this course, a particularly important application of vectorisation is
### integration via quadrature.
*** 
### 
### In one of the homework exercises, we have seen the following function
### returning the `weights` and `points` of a Gaussian quadrature scheme over $(a, b)$.

In [None]:
from typing import Tuple
from numbers import Number
import numpy as np


def gauss_quadrature(a: Number, b: Number, order: int = 3) -> Tuple[np.ndarray, np.ndarray]:
    """ Given the element boundaries `(a, b)`, return the weights and evaluation points
        corresponding to a gaussian quadrature scheme of order `order`.

        Parameters
        ----------

        a : `float`
          the left boundary of the element
        b : `float`
          the right boundary of the element
        order : `int`
          the order of the Gaussian quadrature scheme

        Returns
        -------

        weights : `np.ndarray`
          the weights of the quadrature scheme
        points : `np.ndarray`
          the points (abscissae) over (a, b)
    """
    assert b > a
    points, weights = np.polynomial.legendre.leggauss(order)
    points = (points + 1) / 2
    return (b - a) / 2 * weights, a + points * (b - a)



weights, points = gauss_quadrature(0, 1, 5)
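
### A quick sketch of how the `weights` and `points` are used: an integral over $(a, b)$ is approximated by
### the weighted sum `(weights * f(points)).sum()`. The function is repeated in shortened form so that the
### snippet runs on its own; the test integrand $x^2$ over $(2, 5)$ is an arbitrary choice with exact value $39$.

```python
import numpy as np


def gauss_quadrature(a, b, order=3):
    # same scheme as above, repeated without the docstring
    assert b > a
    points, weights = np.polynomial.legendre.leggauss(order)
    points = (points + 1) / 2
    return (b - a) / 2 * weights, a + points * (b - a)


a, b = 2, 5
weights, points = gauss_quadrature(a, b, 5)

# the weights sum to the length of the interval
length = weights.sum()

# all points lie inside (a, b)
inside = bool(np.all((points > a) & (points < b)))

# vectorised quadrature: evaluate the integrand at all points, weight, and sum
integral = (weights * points ** 2).sum()

print('sum of weights:       ', length)
print('points inside (a, b): ', inside)
print('integral of x^2:      ', integral)
```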

*** 
### 
## <u>Task</u>:
### Using the `weights` and `points` and vectorisation, approximate the circumference `C(a, b)` of an ellipse
### with semi-axes `a` and `b`.
### 

### **FYI**: $C = \int_{0}^1 2 \pi \sqrt{a^2 \sin^2(2 \pi x) + b^2 \cos^2(2 \pi x)} \, \mathrm{d} x$
### 
### The integral needs to be computed in at most **two lines of code** !

### We can utilise the `np.stack([arr0, arr1], axis=0)` function to stack two arrays along the
### zeroth (first) axis.
### 

In [None]:
def compute_ellipse_circumference(a: float, b: float, order: int = 5) -> float:
    """
    Approximately compute the circumference of an ellipse with semi-axes of length `a` and `b`.
    
    Parameters
    ----------
    
    a : `float`
      the length of the first semi-axis
    b : `float`
      the length of the second semi-axis
    order : `int`
      the order of the gaussian quadrature scheme
    """
    
    # get gauss quadrature scheme over (0, 1) of order `order`
    weights, points = gauss_quadrature(0, 1, order=order)
    
    # two lines
    circle_points = np.stack([a * np.sin(2 * np.pi * points),
                              b * np.cos(2 * np.pi * points)], axis=0)
    
    # We print the shape to see what `np.stack` did
    print(circle_points.shape)
    
    return 2 * np.pi * (weights * ((circle_points**2).sum(0)**.5)).sum()


# test:

# a = b = 1 should give 2 pi

for (a, b) in [(1, 1), (4, 5), (3, 3), (1, 9)]:
    print("\nThe circumference of the ellipse with (a, b) = {} equals approximately {}. \n".format((a, b), compute_ellipse_circumference(a, b)))
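
### How good is the approximation ? For `a == b` the integrand is constant and any order is exact, but for
### `a != b` a low order can be noticeably off. The sketch below (with the hypothetical helper name
### `ellipse_circumference` and arbitrarily chosen orders) shows how the value settles as the order grows.

```python
import numpy as np


def gauss_quadrature(a, b, order=3):
    # same scheme as above, repeated so that the snippet runs on its own
    points, weights = np.polynomial.legendre.leggauss(order)
    return (b - a) / 2 * weights, a + (points + 1) / 2 * (b - a)


def ellipse_circumference(a, b, order=5):
    weights, points = gauss_quadrature(0, 1, order=order)
    # integrand evaluated at all quadrature points at once
    integrand = 2 * np.pi * np.sqrt(a ** 2 * np.sin(2 * np.pi * points) ** 2
                                    + b ** 2 * np.cos(2 * np.pi * points) ** 2)
    return (weights * integrand).sum()


for order in (5, 10, 20, 50):
    print('order {:>2}: C(4, 5) is approximately {}'.format(order, ellipse_circumference(4, 5, order)))

circle = ellipse_circumference(1, 1, order=5)
converged = ellipse_circumference(4, 5, order=50)
```

### The converged value lies between the classical bounds $\pi (a + b)$ and $\pi \sqrt{2 (a^2 + b^2)}$.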

<hr style="border:1px solid blue">

### 

### The concepts of broadcasting and `numpy.sum` contractions can be combined to
### perform various important operations between arrays.
*** 
### 
## <u>Task</u> (matrix multiplication):
### Given `mat` of shape `(n, m)` and `vec` of shape `(m,)` use broadcasting + `numpy.sum`
### to write a one-liner for the matrix-vector product. Note that the output has shape `(n,)`.
### We come back to the example from the beginning.

In [None]:
import numpy as np
_ = np.newaxis

n = 4
m = 5

# create arrays of shape (n, m) and (m,) with n = 4, m = 5
mat = np.array([ [1, 2, 0, 0, 0],
                 [2, 1, 0, 0, 0],
                 [0, 1, 2, 3, 0],
                 [4, 1, 3, 2, 5] ], dtype=float)

vec = np.array([1, 3, 2, 5, 6], dtype=float)

***
### 
### Three equivalent implementations:

In [None]:
# option 0
matvec0 = (mat * vec[_, :]).sum(1)

# option 1 => vec of shape (5,) is automatically broadcast to (1, 5)
matvec1 = (mat * vec).sum(1)

# option 2, the matrix multiplication operator `@` (calls `np.matmul`), same result
matvec2 = mat @ vec

print('matvec0: \n\n', matvec0, '\n\n')
print('matvec1: \n\n', matvec1, '\n\n')
print('matvec2: \n\n', matvec2, '\n\n')
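
### Instead of comparing the printed vectors by eye, we can let `np.allclose` do the work. The sketch also
### prints the shape of the intermediate array of products that the broadcasting variants build before summing.

```python
import numpy as np

mat = np.array([[1, 2, 0, 0, 0],
                [2, 1, 0, 0, 0],
                [0, 1, 2, 3, 0],
                [4, 1, 3, 2, 5]], dtype=float)

vec = np.array([1, 3, 2, 5, 6], dtype=float)

# broadcasting first creates the full (n, m) array of products ...
products = mat * vec

# ... which is then contracted over axis 1
matvec_sum = products.sum(1)

# the matrix multiplication operator
matvec_at = mat @ vec

print('shape of the intermediate array: ', products.shape)
print('both results agree ?             ', np.allclose(matvec_sum, matvec_at))
```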

<hr style="border:1px solid blue">

### 

### We come to this lesson's last
### 
## <u>Exercise 2.3</u>:
### Given `arr0` of shape `(p, n)` and `arr1` of shape `(q, n)`,
### compute the `(p, q)`-shaped matrix `dist` of **all** Euclidean
### distances between the points represented by the rows of `arr0` and `arr1`.

### 
### `dist[i, j]` = `||arr0[i, :] - arr1[j, :]||`.
### 

### **HINT**:
### `(p, n)` and `(q, n)` become
### `(p, 1, n)` and `(1, q, n)` (insert new axes), which broadcast to
### `(p, q, n)` and `(p, q, n)`; summing over the last axis leaves
### `(p, q)`.

In [None]:
import numpy as np
_ = np.newaxis


p, q, n = 3, 4, 15

arr0 = np.arange(p * n).reshape((p, n))
arr1 = np.arange(q * n, 2 * q * n).reshape((q, n))

*** 
### 
### <u>Solution</u>:

In [None]:
dist = ((arr0[:, _, :] - arr1[_, :, :])**2).sum(2)**.5

print("The `dist` matrix of shape (p, q) is given by: \n\n{}.".format(dist))
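
### A sketch for checking the result: one entry is compared with a direct computation, and the whole matrix
### with `np.linalg.norm` applied along the last axis of the broadcast difference.

```python
import numpy as np

p, q, n = 3, 4, 15

arr0 = np.arange(p * n).reshape((p, n))
arr1 = np.arange(q * n, 2 * q * n).reshape((q, n))

# (p, 1, n) minus (1, q, n) broadcasts to (p, q, n)
diff = arr0[:, np.newaxis, :] - arr1[np.newaxis, :, :]

# square, sum over the last axis, take the square root
dist = (diff ** 2).sum(2) ** .5

# same contraction done by the library routine
dist_linalg = np.linalg.norm(diff, axis=-1)

# a single entry computed directly from the definition
i, j = 1, 2
dist_ij = np.sqrt(((arr0[i, :] - arr1[j, :]) ** 2).sum())

print('diff.shape: ', diff.shape)
print('dist.shape: ', dist.shape)
print('dist[1, 2] matches the definition ? ', np.isclose(dist[i, j], dist_ij))
```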

<hr style="border:1px solid blue">

### 

# <u>Lesson 3</u>: Advanced broadcasting (if time permits).
### 
### We have seen how broadcasting can be utilised to perform various array
### operations **without the use of for loops**.
### 
### So far, the arrays always had a **fixed number of dimensions**.
### In this lesson we learn advanced broadcasting techniques that are especially
### useful for generalising array operations to **arbitrarily-shaped arrays**.
*** 

### 
### We have seen how we can utilise `_ = np.newaxis` to prepend or append a new axis.

In [None]:
import numpy as np
_ = np.newaxis

A = np.random.randn(4, 5)

print('A.shape: ', A.shape, '\n')
print('A[_, :, :].shape: ', A[_, :, :].shape)

### 
### What if the number of dimensions of `A` is not known when we write the code ?
### Then we do not know how many `:` to add at the end.
*** 

### 
### There is good news. Let's try the following:

In [None]:
import numpy as np
_ = np.newaxis

A0 = np.random.randn(1)[0]
A1 = np.random.randn(5)
A2 = np.random.randn(5, 6)
A3 = np.random.randn(2, 5, 6)
A4 = np.random.randn(6, 2, 5, 4)

print('A0[_].shape: ', A0[_].shape, '\n')
print('A1[_].shape: ', A1[_].shape, '\n')
print('A2[_].shape: ', A2[_].shape, '\n')
print('A3[_].shape: ', A3[_].shape, '\n')
print('A4[_].shape: ', A4[_].shape, '\n')


### $\implies$ we don't need to provide all the `:`, `Numpy` will infer the correct number of `:` and add them.
*** 
### 
### However, if we want to `append` an axis at the end, we are still in trouble.
### For this, Python provides the built-in `Ellipsis` object, written `...`, which `Numpy` understands inside an index.

In [None]:
import numpy as np
_ = np.newaxis

A0 = np.random.randn(1)[0]
A1 = np.random.randn(5)
A2 = np.random.randn(5, 6)
A3 = np.random.randn(2, 5, 6)
A4 = np.random.randn(6, 2, 5, 4)

print('A0[..., _].shape: ', A0[..., _].shape, '\n')
print('A1[..., _].shape: ', A1[..., _].shape, '\n')
print('A2[..., _].shape: ', A2[..., _].shape, '\n')
print('A3[..., _].shape: ', A3[..., _].shape, '\n')
print('A4[..., _].shape: ', A4[..., _].shape, '\n')

*** 
### 
### The `Ellipsis` can be utilised in more advanced cases.
### 

## <u>Task</u>:
### Given an `n`-dimensional `numpy.ndarray` `A`, prepend and append a new axis.

In [None]:
import numpy as np
_ = np.newaxis


A = np.random.randn(2, 3, 4, 2, 7, 3)

print('A.shape: ', A.shape, '\n')
print('A[_, ..., _].shape: ', A[_, ..., _].shape, '\n')

### $\implies$ The `Ellipsis` `...` stands for as many `:` as needed; their number is **inferred** from `A`'s shape.
### In this case: `A[_, ..., _] = A[_, :, :, :, :, :, :, _]`.

<hr style="border:1px solid blue">

### 

## <u>Exercise 3.1</u>:
### Given the `numpy.ndarray` `A`, add a new axis at the second and second to last spot.
### The number of dimensions `A.ndim` has to be at least `2`.

In [None]:
import numpy as np
_ = np.newaxis

# add axes to these guys
A2 = np.random.randn(4, 5)
A3 = np.random.randn(4, 5, 6)
A4 = np.random.randn(3, 2, 5, 3)

## <u>Solution</u>:

In [None]:
for A in (A2, A3, A4):
    print('A.shape: ', A.shape, '\n')
    print('A[:, _, ..., _, :].shape: ', A[:, _, ..., _, :].shape, '\n')

<hr style="border:1px solid blue">

### 

### We come to this lesson's last
## <u>Task</u>:
### Given `numpy.ndarray` `A` of shape `(..., n)` and `np.ndarray` `vec` of shape `(m,)`
### compute `outer_abs` of shape `(..., m, n)` where
### `outer_abs[..., i, j] == abs(A[..., j] - vec[i])`.

In [None]:
import numpy as np
_ = np.newaxis

A = np.random.randn(3, 4, 2, 5, 6, 4, 3)
vec = np.random.randn(10)

print('A.shape: ', A.shape, '\n')
print('vec.shape: ', vec.shape, '\n')

# use broadcasting + numpy.abs to perform the required task
outer_abs = np.abs(A[..., _, :] - vec[..., _])

print('outer_abs.shape: ', outer_abs.shape, '\n')
print('abs(A[0, 0, 0, 0, 0, 0, 2] - vec[3]): ', abs(A[0, 0, 0, 0, 0, 0, 2] - vec[3]), '\n')
print('outer_abs[0, 0, 0, 0, 0, 0, 3, 2]: ', outer_abs[0, 0, 0, 0, 0, 0, 3, 2], '\n')

### 
### <u>Explanation</u>:
### 1. `A[..., _, :]`: shape `(..., n)` becomes `(..., 1, n)`,
### 2. `vec[..., _]`: shape `(m,)` becomes `(m, 1)`
### 3. `A[..., _, :] - vec[..., _]`: shape `(m, 1)` becomes `(1, ..., 1, m, 1)` (prepend `1` axes to match `A[..., _, :]`'s number of dimensions)
### 4. shape `(..., 1, n)` minus shape `(1, ..., 1, m, 1)`:  broadcasts to shape `(..., m, n)`

<hr style="border:1px solid blue">

### 

### Check out for yourself: **fancy indexing** and `numpy.einsum`.
### We will provide a Jupyter notebook for practice.
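
### As a small teaser, here is a hedged sketch of both topics using the arrays from Lesson 1:
### `np.einsum` writes the contraction as an index formula, and fancy indexing selects entries
### with a list of indices or a boolean mask. The variable names are illustrative.

```python
import numpy as np

mat = np.array([[1, 2, 0, 0, 0],
                [2, 1, 0, 0, 0],
                [0, 1, 2, 3, 0],
                [4, 1, 3, 2, 5]], dtype=float)

vec = np.array([1, 3, 2, 5, 6], dtype=float)

# einsum: multiply mat[i, j] by vec[j] and sum over the repeated index j
matvec_einsum = np.einsum('ij,j->i', mat, vec)

# fancy indexing with a list of indices: pick rows 0 and 3
rows = mat[[0, 3]]

# fancy indexing with a boolean mask: pick all entries larger than 2
large = vec[vec > 2]

print('einsum matvec: ', matvec_einsum)
print('rows 0 and 3: \n', rows)
print('entries > 2:   ', large)
```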