# Setup for NumPy slides

* Imports.

In [4]:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

from IPython.display import HTML
np.random.seed(1)

# Introduction to NumPy


## What is NumPy? (1/2)

* A library to manipulate **arrays**
* An array is an **indexable collection** of items of the **same type**
    * Indexable means we can fetch specific parts of the array by their location  
* Arrays are **mutable** by default -- we can modify values
    * New values need to be compatible with the **type**
    * Can make them **immutable** (use `.setflags(write=False)` on an array)
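
As a minimal sketch of the two points above (the variable names are illustrative, not from the slides): a float assigned into an integer array is cast to fit the array's type, and writing to an array after `.setflags(write=False)` raises an error.

```python
import numpy as np

a = np.array([1, 2, 3])      # integer array
a[0] = 10                    # arrays are mutable by default
a[1] = 2.9                   # value is cast to the array's integer type (stored as 2)

a.setflags(write=False)      # make the array read-only
try:
    a[2] = 99
    write_blocked = False
except ValueError:
    write_blocked = True     # NumPy refuses to modify a read-only array
```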


## What is NumPy? (2/2)
* NumPy makes manipulating arrays **fast** and **easy** 
* Apply mathematical operations over specific dimensions of **multi-dimensional arrays**
* Support for many data [types](https://numpy.org/doc/stable/user/basics.types.html) 
* Many scientific packages are built on top of it (e.g. Pandas)

## NumPy Arrays
* NumPy arrays can have as many dimensions as you like, for example, an array could be:
    * a **vector** with $n$ elements
    * a **matrix** with $m$ rows and $n$ columns $(m,n)$
    * a **tensor** of size $(m,n,p)$
* The most common way to create a NumPy array is from a standard Python list

## Practical Introduction

In [2]:
import numpy as np

a = np.array([1,2,3])
type(a)

numpy.ndarray

In [3]:
a.shape

(3,)

## Sizes, Shapes, and Dimensions

In [4]:
a.size

3

In [5]:
a.ndim # number of dimensions

1

## 2-Dimensional Arrays

We can make a two-dimensional array (a matrix) by inputting a list of lists:

In [6]:
# a common way to create a matrix is from a list of lists
b = np.array([[0,1,2],[3,4,5]])
b

array([[0, 1, 2],
       [3, 4, 5]])

In [7]:
b.shape

(2, 3)

In [8]:
b.size

6

In [9]:
b.ndim

2

## 2-Dimensional Arrays
![2D%20Array.svg](attachment:2D%20Array.svg)

## N-Dimensional Arrays

We can continue scaling this up to as many dimensions as we would like:

In [14]:
c = np.array([ [[1,2,3],[4,5,6]], [[1,2,3],[4,5,6]]])
c

array([[[1, 2, 3],
        [4, 5, 6]],

       [[1, 2, 3],
        [4, 5, 6]]])

In [15]:
c.shape

(2, 2, 3)

In [16]:
c.size

12

In [17]:
c.ndim

3

## Reshaping

In [18]:
b = np.array([[1,2,3],[4,5,6]])
b

array([[1, 2, 3],
       [4, 5, 6]])

In [19]:
b.shape

(2, 3)

In [20]:
b.reshape(6)

array([1, 2, 3, 4, 5, 6])

In [21]:
b.reshape(3,2)

array([[1, 2],
       [3, 4],
       [5, 6]])
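
A reshape only works when the new shape holds the same number of elements as the original. The sketch below illustrates this, and also uses `-1`, which asks NumPy to infer one dimension from the array's size; the variable names are illustrative.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])   # 6 elements in total

flat = b.reshape(-1)       # infer the length: shape (6,)
tall = b.reshape(-1, 2)    # infer the number of rows: shape (3, 2)

try:
    b.reshape(4, 2)        # 8 slots cannot hold 6 elements
    reshape_failed = False
except ValueError:
    reshape_failed = True
```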

## Indexing Arrays

To get the element at index $n$ of an array, we can index with `a[n]`. 
* Python/NumPy index from 0 
* We can index over multiple dimensions

In [22]:
a = np.array([1,2,3])
a[0]

1

In [23]:
b = np.array([[1,2,3],[4,5,6]])
b[0,0]

1

In [24]:
b[0,2]

3

## Indexing Arrays 

To index a whole dimension of the array, use `:`. For instance, array `b` has two dimensions, rows and columns. 

To get the whole of row 0, we can either index as `b[0,:]` or `b[0]`

In [25]:
b[0,:]

array([1, 2, 3])

In [26]:
b[0]

array([1, 2, 3])

## Indexing Arrays

The columns are the second dimension, so to get the whole of column 0 we do:

In [27]:
b[:,0]

array([1, 4])

So the syntax `b[M,N]` indexes `b` by row `M` and column `N`.

<img src="img/jupyter.png" width="200">

Now open the following workbook: `practical1.ipynb`

## Indexing continued

We can index with a **list** too. Say we want the first, second, and fourth elements of `a`:

In [28]:
a = np.array([0,1,2,3,4,5])
indices = [0,1,3]
a[indices]

array([0, 1, 3])
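
The same idea extends to more than one dimension. A minimal sketch, with an illustrative array `m` that is not part of the slides, selecting whole rows and whole columns by list:

```python
import numpy as np

m = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8]])

rows = m[[0, 2]]        # rows 0 and 2
cols = m[:, [0, 2]]     # columns 0 and 2 of every row
```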

## Boolean Indexing

We can also index with boolean (True/False) conditions:

In [29]:
a = np.array([0,1,2,3,4,5])
a > 3

array([False, False, False, False,  True,  True])

We can use this mask to index `a` itself:

In [30]:
a[a > 3] 

array([4, 5])
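
Masks can be combined element-wise with `&` (and), `|` (or), and `~` (not). A short sketch; note that each comparison needs its own parentheses because of Python operator precedence:

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4, 5])

middle = a[(a > 1) & (a < 5)]     # elements strictly between 1 and 5
ends = a[(a < 1) | (a > 4)]       # elements below 1 or above 4
not_big = a[~(a > 3)]             # elements that are NOT greater than 3
```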

## Slicing

We can slice up arrays into sub-arrays by providing a lower (inclusive) and upper (exclusive) index:

In [31]:
a = np.array([0,1,2,3,4,5])

a[1:4]

array([1, 2, 3])

Get the first 3 and the last 3 elements of the array:

In [32]:
a[:3]

array([0, 1, 2])

In [33]:
a[3:]

array([3, 4, 5])

## Slicing 
You can also count back from the end of the array using a minus sign:


In [34]:
a[-3:]

array([3, 4, 5])

So `-3` means count back 3 indices from the end of `a`, and `:` returns all elements from this position onwards.

How would you get all the elements before the item in position `-3`?  
How about just the last element?

## Slicing in multiple dimensions

Just as with regular indexing, we can slice over multiple dimensions:

In [35]:
b = np.array([[1,2,3],
              [4,5,6]])

# columns 0 and 1
b[:, 0:2]

array([[1, 2],
       [4, 5]])

In [36]:
# row 0, columns 0 and 1
b[0,0:2]

array([1, 2])
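
One detail worth knowing about slices: a basic slice is a *view* onto the original data rather than a copy, so writing to the slice also changes the original array. A minimal sketch (the names `view` and `copy` are illustrative):

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6]])

view = b[:, 0:2]          # a view of columns 0 and 1
view[0, 0] = 100          # also changes b[0, 0]

copy = b[:, 0:2].copy()   # an independent copy
copy[0, 0] = -1           # b is unaffected
```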

## NumPy Functions

NumPy makes it easy to apply **functions** to **arrays**:
- Maths functions (`np.sin, np.sum, np.round, np.exp`)
- Logic (`np.equal, np.allclose, np.any`)
- Sorting, searching, counting (`np.argwhere, np.sort`)
- and many more!

For example:

In [37]:
a = np.array([1,2,3])
np.sum(a)

6

## NumPy Functions

We can **apply** functions over specific dimensions/**axes**:

In [50]:
b = np.array([[1,2,3],
              [4,5,6]])

There are two dimensions/axes (rows = axis 0, columns = axis 1).

So to sum up each of the columns (`axis=0`), and then each of the rows (`axis=1`):

In [39]:
np.sum(b, axis=0)

array([5, 7, 9])

In [40]:
np.sum(b, axis=1)

array([ 6, 15])
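
One way to remember which axis to use: the axis you name is the one that disappears from the result's shape. A sketch using the same array `b`, plus `np.mean`, which accepts the same `axis` argument:

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)

col_sums = np.sum(b, axis=0)     # collapses the rows: shape (3,)
row_sums = np.sum(b, axis=1)     # collapses the columns: shape (2,)
row_means = np.mean(b, axis=1)   # mean of each row
total = np.sum(b)                # no axis: sum over every element
```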

## Broadcasting

NumPy also supports broadcasting when the shapes of the operands don't match up. For instance, to double every element in an array we could either do:

In [41]:
a = np.array([1,2,3])
twos = np.array([2,2,2])
a*twos

array([2, 4, 6])

Or we could do:

In [42]:
a*2

array([2, 4, 6])

NumPy automatically stretches out the scalar 2 to be the vector of 2s in the first example so that they can be multiplied together.

## Broadcasting 

- As with all things NumPy, this scales up to **multiple dimensions** very easily.  
  
- NumPy checks each **aligned** pair of dimensions; a pair is compatible if:
    - the two sizes are equal, or
    - one of them is of size one
    
- Which of these pairs can be broadcast?
    - 5,1?
    - 2,4?
    - 3,3?


## Broadcasting over multiple dimensions

For `c = a * b`, we can infer the shape of `c` from the shapes of `a` and `b`:

```python
a.shape  # (1, 2)
b.shape  # (5, 2)
c.shape  # (5, 2)
```

When they have a different number of dimensions, align the **trailing** dimensions:
```python
a.shape  # (5, 6)
b.shape  #   (1,)
c.shape  # (5, 6)
```
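
These shape rules can be checked directly by multiplying arrays of ones and inspecting the shape of the result. A minimal sketch; the final case uses shapes chosen for illustration that are not in the slides:

```python
import numpy as np

a = np.ones((1, 2))
b = np.ones((5, 2))
c_shape = (a * b).shape            # (5, 2): the size-1 dimension is stretched

a = np.ones((5, 6))
b = np.ones(1)
d_shape = (a * b).shape            # (5, 6): shapes are aligned from the right

try:
    np.ones((5, 6)) * np.ones(4)   # trailing sizes 6 and 4 differ, neither is 1
    incompatible = False
except ValueError:
    incompatible = True
```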

## Broadcasting over multiple dimensions


In [43]:
a = np.array([[1,2,3],
              [4,5,6]])

# multiply column 0 by 2, column 1 by 5, column 2 by 10
b = np.array([2,5,10])
a*b

array([[ 2, 10, 30],
       [ 8, 25, 60]])
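
To scale each *row* instead, the multiplier needs shape `(2, 1)` so that it lines up with the rows. A sketch, assuming we want to multiply row 0 by 2 and row 1 by 5 (values chosen for illustration):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)

r = np.array([2, 5])               # shape (2,): cannot broadcast against (2, 3)
scaled = a * r[:, np.newaxis]      # shape (2, 1) broadcasts across the columns
```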

## Modifying Arrays

We can use **indexing**, **slicing**, and **broadcasting** to modify the values in arrays:


In [44]:
# indexing
a = np.array([1,2,3])
a[0] = 5

In [45]:
# slicing
a[0:3] = [5, 5, 5]
a

array([5, 5, 5])

In [46]:
# broadcasting
a[0:3] = 2
a

array([2, 2, 2])
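
The same tools work on multi-dimensional arrays, and a boolean mask can also select the values to overwrite. A minimal sketch with an illustrative array `m`:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

m[:, 0] = 0            # broadcasting: set every value in column 0 to 0
m[1, 1:3] = [50, 60]   # slicing: overwrite part of row 1
m[m > 55] = -1         # boolean mask: replace every value above 55
```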

<img src="img/jupyter.png" width="200">

Now open the following workbook: `practical2.ipynb`

## Generating New Arrays Deterministically

- `np.linspace(lower, upper, N)`: N **evenly spaced** numbers from lower to upper (both included)
- `np.ones(shape)`: an array of **ones** of shape `shape` 
- `np.zeros(shape)`: an array of **zeros** of shape `shape`
- `np.arange(upper)`: integers from 0 up to, but not including, upper
- `np.eye(N)`: identity matrix of size $N \times N$
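
A quick sketch of what each of these returns (the arguments are chosen for illustration):

```python
import numpy as np

lin = np.linspace(0, 1, 5)    # 5 evenly spaced values from 0 to 1 inclusive
ones = np.ones((2, 3))        # 2x3 array of 1.0
zeros = np.zeros(4)           # length-4 vector of 0.0
ints = np.arange(5)           # 0, 1, 2, 3, 4 (upper bound excluded)
ident = np.eye(3)             # 3x3 identity matrix
```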

## Generating New Arrays Randomly
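- `np.random.randn()`: a sample from the standard normal distribution (mean 0, variance 1)
- `np.random.random()`: a sample from the uniform distribution over $[0, 1)$
- `np.random.choice(list)`: choose an element at random from the list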
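
A short sketch of these calls. The seed mirrors the setup cell; the sizes and the list passed to `np.random.choice` are illustrative. The exact values depend on the seed and the NumPy version, so none are shown here.

```python
import numpy as np

np.random.seed(1)                          # make the results reproducible

normal = np.random.randn(3)                # 3 samples from a standard normal
uniform = np.random.random(3)              # 3 samples from [0, 1)
pick = np.random.choice([10, 20, 30])      # one element chosen at random
```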


## Stacking and Reshaping
- Stack two arrays **vertically** `np.vstack([a,b])` or **horizontally** `np.hstack([a,b])`


In [47]:
a = np.ones((4))
b = np.ones((4))

np.vstack([a,b])

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [48]:
np.hstack([a,b])

array([1., 1., 1., 1., 1., 1., 1., 1.])
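
With two-dimensional inputs the difference is easier to see in the shapes: `np.vstack` adds rows and `np.hstack` adds columns. A sketch with illustrative arrays:

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))

v = np.vstack([a, b])    # shape (4, 3): b's rows are placed below a's
h = np.hstack([a, b])    # shape (2, 6): b's columns are placed to the right of a's
```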

## NumPy Array Functions 

Some operations can be applied **directly** to the array as methods or attributes (i.e. you can do `a.function()` instead of `np.function(a)`). 

- `a.mean()`: mean of the elements of `a`
- `a.T`: transpose (an attribute, so no parentheses)
- `a.argmax()`: index of the max element of `a`

See [documentation](https://numpy.org/doc/stable/reference/arrays.ndarray.html#array-methods) for more.
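
For instance, a sketch using a 2-D array like `b` from the earlier slides. Note that `argmax` without an `axis` argument returns an index into the flattened array.

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6]])

avg = b.mean()             # 3.5, same as np.mean(b)
col_avg = b.mean(axis=0)   # methods accept the axis argument too
bt = b.T                   # transpose: shape (3, 2)
idx = b.argmax()           # 5: position of the largest value in the flattened array
```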

<img src="img/jupyter.png" width="200">

Now open the following workbook: `practical3.ipynb`