# Worksheet 6: NumPy Part 2

## numpy broadcasting basics
- https://numpy.org/doc/stable/user/basics.broadcasting.html

## 2-D and 3-D arrays
- multidimensional arrays, also called tensors
- array programming with numpy
  https://www.nature.com/articles/s41586-020-2649-2

In [None]:
import numpy as np

### Transform a simple 1D array into a 2D array

In [None]:
array_1D = np.arange(6)

In [None]:
array_1D

In [None]:
array_2D = array_1D.reshape(3,2)

In [None]:
array_2D

#### Q: how can you check shape and dimensions of `array_2D`?

### simple assignment

In [None]:
array_2D = np.array([[0,1],[2,3], [4,5]])

In [None]:
array_2D

### other example n-D functions
- np.ones
- np.zeros
- np.random.rand
- https://numpy.org/doc/stable/user/basics.creation.html

In [None]:
np.ones((2,3))

In [None]:
np.ones((2,3)).shape

In [None]:
np.zeros((4, 5))

In [None]:
np.random.rand(10)

In [None]:
# expects separate args for each dim
np.random.rand(3,3)
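
As a side note, newer NumPy code often uses the `Generator` interface, whose `random` method takes the shape as a single tuple, the same way `np.ones` and `np.zeros` do. The sketch below assumes NumPy 1.17 or later; the seed `0` and the variable names are illustrative only.

```python
import numpy as np

# Legacy interface: one positional argument per dimension
legacy = np.random.rand(3, 3)

# Generator interface: shape passed as one tuple (seed 0 is arbitrary)
rng = np.random.default_rng(0)
modern = rng.random((3, 3))

print(legacy.shape, modern.shape)  # both are (3, 3)
```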

### Transform 2D array back to 1D array

In [None]:
array_2D.flatten()
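
A small sketch of how `flatten` differs from the closely related `ravel`: `flatten` always returns a copy, while `ravel` returns a view when the memory layout allows it. The array is rebuilt here so the block runs on its own; the names `flat_copy` and `flat_view` are illustrative.

```python
import numpy as np

array_2D = np.arange(6).reshape(3, 2)

flat_copy = array_2D.flatten()   # always a new, independent array
flat_view = array_2D.ravel()     # a view here, because the data is contiguous

print(np.shares_memory(array_2D, flat_copy))  # False
print(np.shares_memory(array_2D, flat_view))  # True

flat_copy[0] = 99                # does not touch array_2D
print(array_2D[0, 0])            # 0
```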

### Indexing
- Access elements of 2D array

In [None]:
array_2D

In [None]:
# can also do array_2D[0][0], but this isn't necessary
array_2D[0, 0]

In [None]:
array_2D[0, 1]

In [None]:
array_2D[2, 0]

### Q: what should the following print?
- `array_2D[1]`

### Slicing
- Slicing with basic indexing returns a view of the original array, not a copy
- basic syntax is `i:j:k`, i=starting index, j=stopping index, k=step size
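
A minimal sketch of the `i:j:k` syntax and of what "view" means in practice. The values `100` and `-1` and the names `every_other` and `independent` are chosen only for illustration.

```python
import numpy as np

array_1D = np.arange(6)

every_other = array_1D[0:6:2]   # start=0, stop=6 (exclusive), step=2
print(every_other)              # [0 2 4]

# A basic slice is a view: writing through it changes the original
every_other[0] = 100
print(array_1D[0])              # 100

# .copy() gives an independent array
independent = array_1D[0:6:2].copy()
independent[0] = -1
print(array_1D[0])              # still 100
```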

#### refresh 1D slicing

In [None]:
array_1D

### Q: what should the following print?
- array_1D[0]
- array_1D[4]
- array_1D[0:4]
- array_1D[0:4:2]
- array_1D[-2:5]
- array_1D[-2:6]
- array_1D[2:]

#### 2D slicing

In [None]:
array_2D

In [None]:
array_2D.shape

In [None]:
array_2D[0]

In [None]:
array_2D[1]

In [None]:
array_2D[2]

In [None]:
# raises IndexError: array_2D has only 3 rows (valid indices 0 to 2)
array_2D[3]

In [None]:
# slice along the rows
array_2D[0:3]

In [None]:
array_2D[0:3, 0]

In [None]:
array_2D[:]

In [None]:
array_2D[0:3:2]

In [None]:
array_2D[2:]

In [None]:
array_2D[-3:]

In [None]:
array_2D[-3:1]

In [None]:
array_2D[-2:]

In [None]:
# equivalent to [-2:]: a stop index past the end is clipped to the array length,
# so this returns every row from index -2 onward
array_2D[-2:10]

### A few ways to convert 1D to 2D array
- reshape
- np.newaxis (adds a dim)

In [None]:
array_1D

In [None]:
z = array_1D.reshape(6,1)

In [None]:
z

#### np.newaxis adds a new dimension
- 1D to 2D

In [None]:
array_1D[:, np.newaxis]

In [None]:
array_1D[:, np.newaxis].shape
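
The two approaches above give the same result, which the sketch below checks. It also shows that `np.newaxis` is an alias for `None` and that its position decides where the new axis goes. The names `col_a`, `col_b`, `col_c` and `row` are illustrative.

```python
import numpy as np

array_1D = np.arange(6)

col_a = array_1D.reshape(6, 1)
col_b = array_1D[:, np.newaxis]
col_c = array_1D[:, None]        # np.newaxis is an alias for None

row = array_1D[np.newaxis, :]    # new axis in front: shape (1, 6)

print(col_a.shape, col_b.shape, col_c.shape)  # (6, 1) (6, 1) (6, 1)
print(row.shape)                              # (1, 6)
print(np.array_equal(col_a, col_b))           # True
```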

### 3D array
- x, y, z dimensions
- block, row, column

In [None]:
x = np.array([
    [
        [1],
        [2],
        [3]
    ], 
    [
        [4],
        [5],
        [6]
    ]
])

In [None]:
x.shape

In [None]:
x = np.array([
    [[1],
     [2],
     [3]], 
    [[4],
     [5],
     [6]]
])

In [None]:
x[1]

#### how to access value 6?


In [None]:
x[1,2,0]

#### Q: access value 3

In [None]:
x[0,2,0]

#### all blocks, all rows, first column

In [None]:
x[:, :, 0]

#### equivalent to above
- ellipsis

In [None]:
x[...,0]
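
To make the equivalence easier to see, here is a sketch on a slightly larger array than the one above (shape `(2, 3, 4)`, chosen only for illustration). The ellipsis stands in for as many full slices (`:`) as are needed to cover the remaining axes.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)   # 2 blocks, 3 rows, 4 columns

first_col_explicit = x[:, :, 0]
first_col_ellipsis = x[..., 0]       # ... expands to as many ':' as needed

print(first_col_ellipsis.shape)                                # (2, 3)
print(np.array_equal(first_col_explicit, first_col_ellipsis))  # True

# The ellipsis can also sit at the end: x[0, ...] is the same as x[0, :, :]
print(x[0, ...].shape)                                         # (3, 4)
```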

In [None]:
x

In [None]:
x[:, 1: ]

### np.newaxis used twice adds two dimensions

In [None]:
array_1D[:, np.newaxis, np.newaxis]

In [None]:
array_1D[:, np.newaxis, np.newaxis].shape

### Integer indexing

In [None]:
array_2D

### Q: what should the following print?
 - array_2D[np.arange(3)]
 - array_2D[np.arange(2)]

### Boolean array indexing

In [None]:
x = np.array([[1., 2.], [np.nan, 3.], [np.nan, np.nan]])

In [None]:
x

In [None]:
~np.isnan(x)

In [None]:
x[~np.isnan(x)]
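
A short sketch of what happens in the cells above: the mask has the same shape as `x`, and indexing with it returns the selected elements as a 1D array. The last part, filling missing values with `0.0` on a copy, is an added illustration and not part of the worksheet.

```python
import numpy as np

x = np.array([[1., 2.], [np.nan, 3.], [np.nan, np.nan]])

mask = ~np.isnan(x)        # same shape as x, True where a value is present
valid = x[mask]            # selected elements come back as a 1D array

print(mask.shape)          # (3, 2)
print(valid)               # [1. 2. 3.]

# A mask can also be used for assignment, here on a copy
filled = x.copy()
filled[np.isnan(filled)] = 0.0
print(filled.sum())        # 6.0
```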

### Q: what should the following print?
- array_2D[np.arange(2), np.arange(2)]
- example of fancy indexing in 2D space

### Reductions

Numpy aggregations
- np.sum
- np.mean

In [None]:
z = np.array([
    [4, 0],
    [5, 2],
    [2, 1]
])

In [None]:
z.sum()

In [None]:
z.sum(axis=0) # collapse the rows: one total per column

In [None]:
z.sum(axis=1) # collapse the columns: one total per row

In [None]:
z.sum(axis=(0,1)) # sum all the axes

In [None]:
z.mean()

In [None]:
z.mean(axis=0)

In [None]:
z.mean(axis=1)
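
One way to read the `axis` argument is that the named axis is the one that disappears from the result. The sketch below repeats the 2D array from above and also shows `keepdims=True`, which keeps the reduced axis with length 1; the variable names are illustrative.

```python
import numpy as np

z = np.array([[4, 0],
              [5, 2],
              [2, 1]])            # shape (3, 2)

col_sums = z.sum(axis=0)          # axis 0 (rows) is collapsed -> shape (2,)
row_sums = z.sum(axis=1)          # axis 1 (columns) is collapsed -> shape (3,)
print(col_sums.shape, row_sums.shape)   # (2,) (3,)

# keepdims=True keeps the reduced axis with length 1
row_means = z.mean(axis=1, keepdims=True)
print(row_means.shape)            # (3, 1)
```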

### Q: Experiment with the following

In [None]:
z = np.array([[[1, 2], [2, 1], [3, 3]],
              [[4, 0], [5, 1], [6, 2]]])

- z.sum(axis=0)
- z.sum(axis=1)
- z.sum(axis=2)
- z.sum(axis=3)

### stacking
- np.hstack
- np.vstack

In [None]:
a = np.array([
    [2,4],
    [6,8]
])

In [None]:
b = np.array([
    [10,12],
    [14,16]
])

In [None]:
np.hstack((a,b))

In [None]:
np.vstack((a,b))
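
For 2D inputs like these, `np.hstack` and `np.vstack` behave like `np.concatenate` along axis 1 and axis 0 respectively. The sketch below checks that on the same `a` and `b`; the names `side_by_side` and `stacked` are illustrative.

```python
import numpy as np

a = np.array([[2, 4], [6, 8]])
b = np.array([[10, 12], [14, 16]])

side_by_side = np.concatenate((a, b), axis=1)   # same result as np.hstack((a, b))
stacked = np.concatenate((a, b), axis=0)        # same result as np.vstack((a, b))

print(side_by_side.shape)   # (2, 4)
print(stacked.shape)        # (4, 2)
```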

#### Q: for hstack to work, should arrays a and b have the same number of rows or columns? And for vstack?

#### Q: Experiment with the following:
- np.dot(a,b) where a and b are arrays of shape (2,2) and (2,), resp.
- np.matmul(a,b)
- a.T (transpose of a)
- np.linalg.inv(a) # inverse of a

### Matrix broadcasting
- Input arrays do not need to have the same number of dimensions. The resulting array will have the same number of dimensions as the input array with the greatest number of dimensions, where the size of each dimension is the largest size of the corresponding dimension among the input arrays.
- https://numpy.org/doc/stable/user/basics.broadcasting.html
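
The rule can be tried out on shapes alone before building any arrays. This sketch assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available; the shapes and the names `col` and `row` are chosen only for illustration.

```python
import numpy as np

# Shapes are compared from the trailing (rightmost) dimension backwards.
# Two dimensions are compatible when they are equal or one of them is 1.
print(np.broadcast_shapes((5, 1), (1, 4)))   # (5, 4)
print(np.broadcast_shapes((2, 3), (3,)))     # (2, 3)

# Incompatible trailing dimensions (3 vs 2) raise a ValueError
try:
    np.broadcast_shapes((2, 3), (2,))
except ValueError as err:
    print("cannot broadcast:", err)

col = np.arange(5).reshape(5, 1)
row = np.arange(4).reshape(1, 4)
print((col + row).shape)                     # (5, 4)
```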

In [None]:
a = np.random.randint(1,10,[4,3])

In [None]:
a

In [None]:
b = np.random.randint(1,10,[4,1])

In [None]:
b

In [None]:
a.shape

In [None]:
b.shape

In [None]:
a + b

In [None]:
(a + b).shape

In [None]:
b = np.random.randint(1,10,[4,2])

In [None]:
b

#### Q: will a + b work? why or why not?

In [None]:
a = np.array([
    [0], 
    [3], 
    [6], 
    [9]
])
b = np.array([1, 2])

In [None]:
a.shape

In [None]:
b.shape

#### Q: what will be the shape of a*b?

In [None]:
a*b

### Vectorization (add element by element)

In [None]:
a = np.arange(10).reshape(5,2)

In [None]:
a

In [None]:
b = np.ones((5,2))

In [None]:
b

In [None]:
a + b
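
To see what "element by element" means, the sketch below writes out the same addition with explicit Python loops and compares it with the vectorized expression. The name `looped` is illustrative; in practice the vectorized form is the one to use.

```python
import numpy as np

a = np.arange(10).reshape(5, 2)
b = np.ones((5, 2))

# Explicit Python loops over every element
looped = np.empty((5, 2))
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        looped[i, j] = a[i, j] + b[i, j]

# Vectorized: one expression, the loop runs inside NumPy
vectorized = a + b

print(np.array_equal(looped, vectorized))   # True
print(vectorized.dtype)                     # float64
```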

### Q: Experiment with the following:
Construct a 2D array `a`
- find the min using np.min(a), experiment with axis passed as a single integer and as a tuple
- find the max using np.max(a), experiment with axis passed as a single integer and as a tuple
- find the mean, median and qth percentile using np.mean, np.median and np.percentile

# Collaborative exercises

## Exercise 1
Construct a 4x3 array and select only the corner elements of the array using basic indexing techniques. That is, select all elements whose row is one of [0, 3] and whose column is one of [0, 2].


## Exercise 2
Construct a 2D array with shape (5,7) with values ranging from 0 to 34. Filter this
array to keep only the values > 20 using boolean indexing.

## Exercise 3
Construct a 2D array with shape (3,2), where each element takes a value from [0,1,2].
From this array, select all rows whose sum is less than or equal to two.

## Exercise 4
Construct a 2-D boolean array of shape (2, 3) with four True elements to 
select rows from a 3-D array of shape (2, 3, 5).
What is the shape of the resultant array?

## Exercise 5
Construct a 2D array `array_2D` of shape (3,2) from a 1D array (`array_1D`) of shape (6,).
Assign a different value to `array_2D[0, 1]`.
Compare values in `array_1D` and `array_2D`, are they the same or different?
If they are the same, how can you ensure you only change the value in `array_2D` and not `array_1D`?

## Exercise 6
Construct a 3D array `x` with shape (2,2,2). Assume each block is an MRI image, with values representing pixel intensities.
- Can you perform min-max normalization across this whole 3D array? 

`normalized = (x - x_min) / (x_max - x_min)`

- Can you now normalize each image separately?


## Exercise 7

The following clinical observation records the height and weight of an individual:

observation = np.array([111.0, 188.0])

The following 2D array called `measurements` records the heights and weights of several individuals at different time points, including the individual above. Can you figure out which individual in `measurements` the `observation` above is closest to?

```
measurements = np.array([[102.0, 203.0],
               [132.0, 193.0],
               [45.0, 155.0],
               [57.0, 173.0]])
```
Hint: you can use your own metric, e.g. sqrt(sum of squared differences)