In [None]:
!pip install -q numpy pandas

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

This notebook uses many images from the excellent [A Visual Intro to NumPy and Data Representation](https://jalammar.github.io/visual-numpy/) from [Jay Alammar](https://jalammar.github.io/).

## Axes

Axes == dimension

i.e. three axes == three dimensions

## Scalars, vectors, matrices and tensors

See Chapter 2 of [Deep Learning](https://www.deeplearningbook.org/).

We are being specific about how we use these terms here. There is no solid consensus, and many people (including me) will loosely use array and/or tensor.

### Scalar

$\textit{x}$

- single number
- lowercase, italic $\textit{x}$
- point

### Vector

$\textbf{x} = \begin{bmatrix}x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix}$

- array of $n$ numbers
- lowercase, bold 
- $x_{1}$ = first element
- line

### Matrix

$\textbf{A}_{2, 2} = \begin{bmatrix}A_{1, 1} & A_{1, 2} \\ A_{2, 1} & A_{2, 2}\end{bmatrix}$

- two dimensional
- uppercase, bold $\textbf{A}_{m, n}$
- $A_{1, 1}$ = first element
- area

### Tensor

- n-dimensional
- three dimensions = volume
- uppercase, bold $\textbf{A}_{i,j,k}$
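The four terms above map onto `numpy` arrays with zero, one, two and three axes. A minimal sketch (the variable names are illustrative only) that uses the `.ndim` and `.shape` attributes to show this:

```python
import numpy as np

scalar = np.array(3.0)                        # zero axes
vector = np.array([1.0, 2.0, 3.0])            # one axis
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])   # two axes
tensor = np.zeros((2, 3, 4))                  # three axes

# print the number of axes and the length of each axis
for name, arr in [('scalar', scalar), ('vector', vector),
                  ('matrix', matrix), ('tensor', tensor)]:
    print(name, arr.ndim, arr.shape)
```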

## When `numpy`

Linear algebra, data processing

Pandas sits on top of `numpy`:

In [None]:
#  access the numpy array that holds the data
pd.DataFrame([1, 2]).values

## What `numpy`

Library for working with n-dimensional data
- **store and operate on data using C structures**

<img src="assets/c.png" alt="" width="350"/>

## Why `numpy`

Functionality
- vector, matrix & tensor operations

Uses less memory
- fixed data types

Speed
- fixed data types (benefit from static typing)
- C implementation

Below we implement a sum operation using a Python loop:

In [None]:
def loop(left, right):
    data = np.zeros(left.shape[0])
    for i in range(data.shape[0]):
        data[i] = left[i] + right[i]
    return data
    
left = np.arange(10000000)
right = np.arange(10000000)

#  excuse the horrible hack here
#  want to always print the time in seconds
res = %timeit -qo loop(left, right)

'{:.2f} seconds'.format(res.average)

Now let's try it using `numpy` addition:

In [None]:
res = %timeit -qo left + right

'{:.2f} seconds'.format(res.average)

Note that not only is `numpy` quicker, it is **more readable**!

The reason that `numpy` is faster is **vectorization**
- running multiple operations from a single instruction

Many CPUs have instructions that operate on several values in parallel (modern x86 chips have the SSE instructions)

Vectorization is
- the process of rewriting a loop 
- instead of processing a single element of an array N times
- it processes 4 elements of the array simultaneously N/4 times

## `list` versus `np.array`

Python list
- general-purpose container - can hold different data types
- support (fairly) efficient insertion, deletion, appending, and concatenation
- list comprehensions make them easy to construct and manipulate
- only a few list operations can be carried out in C (because of the need for type checking)
- the list holds pointers to items scattered across memory

Numpy array
- **only one data type**
- less flexible
- vectorized operations
- fixed size
- data in one place in memory

Only holding one data type means that numpy can efficiently store data in memory

A list doesn't know what the next object will be - this makes storing it in memory challenging

```python
[0, 1.0, '2.0']
```
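To make the storage difference concrete, here is a small sketch (the variable names are illustrative). It assumes an explicit `int64` datatype so that the element width is the same on every platform:

```python
import numpy as np

# every element of the array has the same fixed width
arr = np.arange(3, dtype=np.int64)
print(arr.itemsize)  # bytes per element
print(arr.nbytes)    # itemsize * number of elements

# a list can mix types, so each item is a separate Python object
mixed = [0, 1.0, '2.0']
kinds = [type(item).__name__ for item in mixed]
print(kinds)
```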

We can make an array from a list - `numpy` will make assumptions about what datatype the array should hold:

In [None]:
#  the integer 10 is converted to a float
np.array([10, 20.0])

We can see the data type by accessing the `.dtype` attribute:

In [None]:
#  64 bits (0 or 1) per float
np.array([10, 20.0]).dtype

We can change the datatype of an array:

In [None]:
np.array([10, 20.0]).astype('int')

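Note that changing the datatype will by default create a newly allocated array (new location in memory). You can control this using the `copy` argument of `astype`, although a copy is still made whenever the requested datatype differs from the current one:

In [None]:
np.array([10, 20.0]).astype('int', copy=False)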
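A short sketch of when `astype` actually allocates new memory, using `np.shares_memory` to check (the variable names are illustrative):

```python
import numpy as np

data = np.array([10, 20.0])  # float64

# asking for the datatype the array already has: no copy is needed
same = data.astype('float64', copy=False)
print(same is data)

# asking for a different datatype: a new array is allocated regardless
ints = data.astype('int', copy=False)
print(np.shares_memory(data, ints))
```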

We can see the number of elements in an array:

In [None]:
np.array([10, 20.0, 30]).size

For a vector the shape is a tuple with a single entry, which is equal to the size:

In [None]:
np.array([10, 20.0, 30]).shape

We can also get the number of elements in a vector using the Python builtin `len`:

In [None]:
len(np.array([10, 20.0, 30]))
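For arrays with more than one axis, `size`, `shape` and `len` give different answers. A small sketch on a 2 x 3 array of zeros (chosen only for illustration):

```python
import numpy as np

matrix = np.zeros((2, 3))

print(matrix.size)   # total number of elements
print(matrix.shape)  # length of each axis
print(len(matrix))   # length of the first axis only
```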

## Vectors

$\begin{bmatrix}x_{1} & x_{2} & \cdots & x_{n} \end{bmatrix}$

- array of $n$ numbers
- lowercase, bold $\textbf{x}$
- $x_{1}$ = first element
- line

We can visualize a vector as a line:

In [None]:
data = np.array([4, 6, 8, 8, 6, 4])

_ = plt.plot(data)

## Vector Arithmetic

In Python when we add sequences such as lists together they are joined:

In [None]:
[0, 1, 2] + [1]

`numpy` works differently - addition works **element wise**:

<img src="assets/add.png" alt="" width="300"/>

In [None]:
np.array([1, 2]) + np.array([1, 1]) 

All of the logic above holds for subtraction, multiplication etc:

In [None]:
np.array([0, 1, 2]) - np.array([1]) 

In [None]:
np.array([0, 1, 2]) * np.array([2]) 

## Broadcasting

The smaller array will be broadcast across the larger array

<img src="assets/broad.png" alt="" width="300"/>

In [None]:
np.array([1, 2]) + np.array([1.6]) 

Note how different adding lists together is:

In [None]:
[1, 2] + [1.6]

Broadcasting is important because the larger array **keeps its shape**
- matrix multiplication (i.e. dot products) often results in differently shaped arrays
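A sketch contrasting the two cases, using a small matrix and row vector made up for illustration: broadcasting keeps the shape of the larger array, while a matrix product returns a differently shaped result.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])               # shape (3,)

# the row is broadcast across both rows of the matrix
total = matrix + row
print(total.shape)

# a matrix product changes the shape: (2, 3) @ (3,) gives (2,)
product = matrix @ row
print(product.shape)
```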

## Working in a single dimension

Vectors - flat lists

### Indexing

<img src="assets/idx.png" alt="" width="500"/>

### Aggregation

<img src="assets/agg.png" alt="" width="800"/>

## Practical 

Implement the indexing and aggregations shown above

In [None]:
data = np.array([1, 2, 3])

You can also use `np.sum`, `np.min` etc:

In [None]:
np.sum(data)
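One possible way to work through the practical, assuming the figures show single-element indexing, slicing and the `max`, `min` and `sum` aggregations (the variable names are illustrative):

```python
import numpy as np

data = np.array([1, 2, 3])

# indexing and slicing
first = data[0]
first_two = data[0:2]
from_second = data[1:]

# aggregations, called as methods on the array
largest = data.max()
smallest = data.min()
total = data.sum()

print(first, first_two, from_second)
print(largest, smallest, total)
```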

## Vector norms

Size of a vector

Function that maps from a vector to a non-negative scalar

$||x||_{p} = \left( \sum_{i} |x_{i}|^{p} \right)^{\frac{1}{p}} $

A common operation in machine learning is **gradient clipping** - this can be done by clipping by value, norm or global norm
- clipping by global norm keeps the relative scale of the gradients
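A minimal sketch of the difference between clipping by value and clipping by norm. The helper `clip_by_norm` is hypothetical (it is not a `numpy` function) and uses `np.linalg.norm` to compute the $L^2$ norm:

```python
import numpy as np

def clip_by_norm(grad, max_norm):
    """Rescale grad so that its L2 norm is at most max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        return grad * (max_norm / norm)
    return grad

grad = np.array([3.0, 4.0])  # L2 norm of 5

# clipping by norm keeps the direction of the vector
by_norm = clip_by_norm(grad, 1.0)

# clipping by value treats each element on its own
by_value = np.clip(grad, -1.0, 1.0)

print(by_norm, by_value)
```

Clipping by norm keeps the 3:4 ratio between the two elements, whereas clipping by value flattens both to the same number.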

We can do a norm in `numpy` using:

In [None]:
data = np.array([4, 6, 8, 8, 6, 4, 0, 0, 1])

np.linalg.norm(data, ord=2)

There are various kinds of norms:
- $L^1$ = used when the difference between 0 and close to 0 elements is important
- $L^2$ = Euclidean norm
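The general formula can be checked against `np.linalg.norm` for both cases. This is a sketch using `numpy` operations, and `p_norm` is a hypothetical helper name:

```python
import numpy as np

data = np.array([4, 6, 8, 8, 6, 4, 0, 0, 1])

def p_norm(x, p):
    """Sum the absolute values raised to p, then take the p-th root."""
    return (np.abs(x) ** p).sum() ** (1 / p)

l1 = p_norm(data, 1)
l2 = p_norm(data, 2)

print(l1, np.linalg.norm(data, ord=1))
print(l2, np.linalg.norm(data, ord=2))
```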

## Practical

Implement $L^{2}$ norm in pure Python

## Making vectors

`np.arange` - similar to the Python builtin `range`

In [None]:
np.arange(start=0, stop=10, step=2)

`np.linspace` - evenly spaced between two points

In [None]:
np.linspace(0, 100, 9)
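One difference worth noting between the two functions is how they treat the end point. A short sketch (the variable names are illustrative):

```python
import numpy as np

# arange excludes the stop value
evens = np.arange(start=0, stop=10, step=2)

# linspace includes both end points by default
points = np.linspace(0, 100, 9)
spacing = points[1] - points[0]

print(evens)
print(points, spacing)
```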

## Sampling random uniform

This can be done two ways
- `np.random.random`
- `np.random.rand`

The only difference is that `np.random.rand` takes the shape as separate arguments rather than as a tuple
- saves writing the brackets

Sample uniformly across the interval [0, 1)

In [None]:
#  shape defined as a tuple
np.random.random((2, 4))

In [None]:
#  shape defined as *args
np.random.rand(2, 4)
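A sketch showing that the two functions draw the same numbers when the global seed is reset in between (the seed value of 42 is arbitrary):

```python
import numpy as np

np.random.seed(42)
from_tuple = np.random.random((2, 4))

np.random.seed(42)
from_args = np.random.rand(2, 4)

# same seed, same samples, only the call signature differs
print(np.array_equal(from_tuple, from_args))
```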

## Sample from a standard normal

`np.random.randn`

$\mathcal{N}(0,1)$

In [None]:
np.random.randn(2, 4)

## Sample from a Gaussian

`np.random.normal`

$\mathcal{N}(\mu,\sigma)$

We choose the statistics (mean & standard deviation)

In [None]:
np.random.normal(1, 2, size=(2, 4))
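Samples from a general Gaussian can also be built by shifting and scaling standard normal samples. A sketch, with an arbitrary seed and sample count, that compares the empirical statistics of both approaches:

```python
import numpy as np

np.random.seed(0)
mu, sigma = 1.0, 2.0

# sample directly with the chosen mean and standard deviation
direct = np.random.normal(mu, sigma, size=100_000)

# shift and scale samples from a standard normal
shifted = mu + sigma * np.random.randn(100_000)

print(direct.mean(), direct.std())
print(shifted.mean(), shifted.std())
```

Both pairs of statistics should land close to the chosen mean of 1 and standard deviation of 2.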

## Practical - Gaussian mixture

Create a program that calculates the weighted sum of three Gaussians

You will need to:
- create arrays to hold the weights, means & standard deviations
- make sure the weights are normalized & standard deviations are > 0

It doesn't matter what the statistics of the Gaussians or the weights are - assume these are given

The use case for this is a **Mixture Density Network**
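One possible sketch of the practical, under several assumptions: the "weighted sum" is read as the mixture probability density, the raw numbers are made up, the weights are normalized by dividing by their sum, and the standard deviations are made positive with `np.exp`. The name `mixture_density` is hypothetical.

```python
import numpy as np

# made-up raw values, standing in for the given statistics
raw_weights = np.array([1.0, 2.0, 1.0])
means = np.array([-2.0, 0.0, 3.0])
raw_stds = np.array([0.0, 0.5, -0.5])

# normalize the weights and force the standard deviations to be > 0
weights = raw_weights / raw_weights.sum()
stds = np.exp(raw_stds)

def mixture_density(x):
    """Weighted sum of the three Gaussian densities at each point in x."""
    x = np.asarray(x, dtype=float)[..., np.newaxis]
    pdfs = np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    return (weights * pdfs).sum(axis=-1)

# a valid density should integrate to roughly one
grid = np.linspace(-20, 20, 4001)
density = mixture_density(grid)
area = density.sum() * (grid[1] - grid[0])
print(area)
```

The new axis on `x` lets broadcasting evaluate all three Gaussians at every grid point in one expression.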