# Lecture 5, Part 1 – More NumPy

## CSS Summer Bootcamp, Week 1 🥾

#### Suraj Rampure

In [None]:
import numpy as np

## 2D Arrays

### Motivation

So far, the lists and arrays we've looked at have been flat, or **one-dimensional**.

In [None]:
numbers_arr = np.array([4, 5, 1, -2.1, 9])
numbers_arr

You can think of a 1D array as being a table with **one row and $n$ columns**. What if we want more rows?

### 2D arrays

We can create arrays with multiple rows (i.e. 2D arrays) by passing a **list of lists** to the `np.array` function:

In [None]:
grid = np.array([[1, 2, 3, 4], 
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

grid

Arrays have a `shape` _attribute_, which allows us to check the number of rows and columns they have.

In [None]:
grid.shape
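
As a quick sanity check on how `shape` reads, the sketch below rebuilds `grid` and unpacks its shape into a row count and a column count. The names `n_rows` and `n_cols` are illustrative only and are not used elsewhere in the lecture.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# shape is a tuple: (number of rows, number of columns)
n_rows, n_cols = grid.shape
print(n_rows, n_cols)  # 3 4

# The total number of entries is rows * columns
print(grid.size)       # 12
```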

### Accessing entries in a 2D array

If `arr` is a 2D array, then `arr[i]` is the **row** at position `i`. Row 0 is at the "top".

In [None]:
grid

In [None]:
grid[0]

In [None]:
grid[-1]

To access a particular entry in the array, we can first index into the row that it's in, then index into the column that it's in.

In other words, `arr[i][j]` gives us the element in row `i` and column `j`.

In [None]:
grid

In [None]:
grid[0][2]

In [None]:
grid[-1][-2]

A shortcut to the above notation is the following:

```py
arr[i, j]
```

where `i` is the position of the **row** of interest and `j` is the position of the **column** of interest.

In [None]:
grid[0, 2]

In [None]:
grid[-1, -2]
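
To see that the two notations agree, here is a small sketch that indexes the same entry both ways. The variable names `row`, `two_step`, and `one_step` are made up for this illustration.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# Two-step indexing: grab row 1 first, then position 3 within that row
row = grid[1]
two_step = row[3]

# One-step indexing: give the row and column positions together
one_step = grid[1, 3]

print(two_step, one_step)  # 8 8
```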

Both pieces of syntax can be used for modifying elements in a (2D) array.

In [None]:
grid_copy = grid.copy()
grid_copy

In [None]:
grid_copy[0][0] = 18

In [None]:
grid_copy[1, 2] = 100

In [None]:
grid_copy

### Slicing

The "shortcut" syntax, `arr[i, j]`, supports slicing. This allows us to extract a subset of the columns (or even just a single column) from a 2D array.

**Remember: We select rows first, then columns!**

In [None]:
grid

In [None]:
# Rows 0 and 1, columns 2 and 3
grid[:2, 2:4]

In [None]:
# All rows, only column 1
grid[:, 1]

In [None]:
# All rows, only the last column
grid[:, -1]
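
One detail that is easy to miss: selecting a column with an integer position gives back a 1D array, while selecting it with a one-element slice keeps the result 2D. The sketch below checks this on `grid`; the names `col_1d`, `col_2d`, and `block` are illustrative.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# Integer column position: the result is flat (1D)
col_1d = grid[:, 1]
print(col_1d.shape)  # (3,)

# Slice covering only column 1: the result is still 2D, with one column
col_2d = grid[:, 1:2]
print(col_2d.shape)  # (3, 1)

# Slices on both rows and columns: rows 0 and 1, columns 2 and 3
block = grid[:2, 2:4]
print(block.shape)   # (2, 2)
```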

<h3><span style='color:purple'>Activity</span></h3>

Consider the following 2D array.

```py
vals = np.array([[1, 2, 3, -1],
                 [2, -1, 4, 2],
                 [0, 0, 1, 9],
                 [8, -6, -1, 3],
                 [0, 1, 8, 7]])
```

What do the following evaluate to? Try to determine the answers **WITHOUT** running any code. (Remember, in `arr[i, j]`, we select rows first, then columns!)

- `vals[1, 1]`
- `vals[1:3, 2:]`
- `vals[2, :]`
- `vals[:, -2]`

## Shapes

### Shapes

As we saw earlier, we can access the shape of an array using `.shape`:

In [None]:
grid.shape

In [None]:
# All arrays have shapes
first_six = np.arange(1, 7)
first_six.shape

### Reshaping

The array `.reshape` method changes the shape of an array. Specifically,

```py
arr.reshape((r, c))
```

returns a new array with `r` rows and `c` columns. This only works when `r * c` equals the number of elements in `arr`. Think of shapes as tuples!

In [None]:
grid

In [None]:
grid.reshape((2, 6))

In [None]:
grid.reshape((6, 2))

In [None]:
grid.reshape((12,))

In [None]:
grid.reshape((-1, 1))

In [None]:
# This raises a ValueError: grid has 12 elements, but a 5-by-3 array needs 15
grid.reshape((5, 3))
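
The cells above can be summarized in a short sketch: a reshape succeeds only when the requested shape holds exactly as many elements as the original array, and a dimension of `-1` asks NumPy to work out that dimension from the others. The flag `failed` is a made-up name used only to record what happened.

```python
import numpy as np

grid = np.arange(1, 13).reshape((3, 4))  # same values as grid in the lecture

# -1 means "infer this dimension": 12 elements / 1 column = 12 rows
column = grid.reshape((-1, 1))
print(column.shape)  # (12, 1)

# 5 * 3 = 15 is not 12, so this reshape is impossible
try:
    grid.reshape((5, 3))
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)
```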

### Using shapes as arguments

Many NumPy functions accept a "shape" argument, which allows you to specify the shape of the output.

`np.ones(s)` returns an array containing all 1s with shape `s`.

In [None]:
np.ones(5)

In [None]:
np.ones((3, 6))

`np.zeros(s)` returns an array containing all 0s with shape `s`.

In [None]:
np.zeros(5)

In [None]:
np.zeros((3, 6))

Even the `np.random` functions we've looked at accept shapes, through their `size` argument.

In [None]:
np.random.choice(['Heads', 'Tails'], size=(4, 3))
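
As a rough check of the shape arguments above, the sketch below inspects the shapes that come back. The variable names are illustrative; the calls are the same ones used in the cells above.

```python
import numpy as np

ones_1d = np.ones(5)        # an integer shape gives a 1D array
ones_2d = np.ones((3, 6))   # a tuple shape gives a 2D array
print(ones_1d.shape, ones_2d.shape)  # (5,) (3, 6)

# np.ones and np.zeros produce floats by default
print(ones_2d.dtype)  # float64

# For np.random.choice, the shape is passed through the size argument
flips = np.random.choice(['Heads', 'Tails'], size=(4, 3))
print(flips.shape)  # (4, 3)
```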

<h3><span style='color:purple'>Activity</span></h3>

Consider the following 2D array.

```py
vals = np.array([[1, 2, 3, -1],
                 [2, -1, 4, 2],
                 [0, 0, 1, 9],
                 [8, -6, -1, 3],
                 [0, 1, 8, 7]])
```

Which of the following would result in an error?

- `vals.reshape((-1, 1))`
- `vals.reshape((10, 2))`
- `vals.reshape((5, 4))`
- `vals.reshape((6, 3))`

### Aside: `.T`

If `arr` is an array, then `arr.T` is the **transpose** of that array. That is, it is the result of turning the rows of `arr` into columns, and the columns of `arr` into rows (or "flipping" the array across its main diagonal).

In [None]:
grid

In [None]:
grid.T
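
A small sketch of what transposing does to the shape and to individual entries, using the same `grid` as above. The name `flipped` is illustrative.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

flipped = grid.T

# A 3-by-4 array becomes a 4-by-3 array
print(grid.shape, flipped.shape)  # (3, 4) (4, 3)

# Row 0 of the transpose is column 0 of the original
print(flipped[0])  # [1 5 9]

# The entry at row i, column j moves to row j, column i
print(grid[1, 3], flipped[3, 1])  # 8 8
```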

## Axes

### Numerical operations on arrays

Yesterday, we looked at several functions/methods that take in an array and output a single number, like `sum` and `mean`.

In [None]:
first_six

In [None]:
np.sum(first_six)

In [None]:
np.mean(first_six)

How do these work on 2D arrays?

In [None]:
grid

In [None]:
np.sum(grid)

In [None]:
np.mean(grid)

They work as expected – they take the sum and mean (respectively) of all elements in the array.

But what if we want to take the sum or mean of each column? Or each row?

### Axes

NumPy arrays have **axes**. Instead of an $x$-axis and a $y$-axis, they have **axis 0 (rows)** and **axis 1 (columns)**.

<center><img src='images/axis-table.png' width=20%><i>Pretend this is an array, not a table.</i></center>

We can specify which axis we want to perform numerical operations on!

<center><img src='images/axis.png' width=40%><i>Pretend this is an array, not a table.</i></center>

In [None]:
grid

In [None]:
# Squeeze axis 0 (the rows) together, to end up with a single row
np.sum(grid, axis=0)

In [None]:
# Squeeze axis 1 (the columns) together, to end up with a single column
np.sum(grid, axis=1)
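
To make the two `axis` values concrete, here is a sketch that computes both sums on `grid` and checks the result shapes. The names `col_sums` and `row_sums` are made up for this illustration.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# axis=0 collapses the rows, leaving one sum per column
col_sums = np.sum(grid, axis=0)
print(col_sums)  # [15 18 21 24]

# axis=1 collapses the columns, leaving one sum per row
row_sums = np.sum(grid, axis=1)
print(row_sums)  # [10 26 42]

# Either way, the sums add up to the sum of the whole array
print(col_sums.sum(), row_sums.sum(), np.sum(grid))  # 78 78 78
```

A useful rule of thumb: the axis you pass is the one that disappears from the shape, so a `(3, 4)` array summed over `axis=0` has shape `(4,)`.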

<h3><span style='color:purple'>Activity</span></h3>

Consider the following 2D array.

```py
other_vals = np.array([[1, 2, 3],
                       [2, -1, 4],
                       [0, 0, 2]])
```

Determine each of the following **WITHOUT** running any code.

```py
other_vals.mean(axis=0)

other_vals.mean(axis=1)
```

## Example: Images

### Even more axes!

NumPy arrays can have arbitrarily many axes, not just 1 or 2.

In [None]:
grid

In [None]:
grid_3d = grid.reshape((3, 2, 2))
grid_3d

The shape of `grid_3d` is `(3, 2, 2)`. One way to think of this is as three 2D arrays, each with 2 rows and 2 columns, stacked on top of each other.

<center><img src='https://i.stack.imgur.com/Tbe9W.png' width=50%></center>
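
Indexing extends naturally to three axes. The sketch below rebuilds `grid_3d` and pulls out one of the stacked 2D arrays and one single entry; `layer` and `entry` are illustrative names.

```python
import numpy as np

grid = np.arange(1, 13).reshape((3, 4))  # same values as grid in the lecture
grid_3d = grid.reshape((3, 2, 2))

print(grid_3d.ndim)  # 3

# The first position picks one of the three 2-by-2 arrays
layer = grid_3d[1]
print(layer.shape)   # (2, 2)

# Three positions pick a single entry: array 1, row 0, column 1
entry = grid_3d[1, 0, 1]
print(entry)         # 6
```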

### Images are arrays!

As you saw in the last lab, we use three values to describe color: a red amount, a green amount, and a blue amount. Each can range between 0 and 255.

We can store this information in an array – which is how images are stored!

In [None]:
import skimage.io as skio
import matplotlib.pyplot as plt

In [None]:
geisel = skio.imread('images/geisel.jpeg')
geisel.shape

In [None]:
plt.imshow(geisel)
plt.axis('off');

### Manipulating images

In the lab, we saw that we can convert an image to **greyscale** by averaging its red, green, and blue values.

We can do that for the entire image at once, using `mean` with the correct `axis` argument!

In [None]:
geisel

In [None]:
average_geisel = geisel.mean(axis=2)
average_geisel.shape

In [None]:
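# plt.imshow expects three color channels, so copy the average into each one.
# np.zeros_like keeps geisel's integer dtype, so the averages are truncated
# to whole numbers when they are assigned.
average_geisel_3d = np.zeros_like(geisel)
for i in range(3):
    average_geisel_3d[:, :, i] = average_geisel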

In [None]:
plt.imshow(average_geisel_3d)
plt.axis('off');
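
Since the Geisel image file may not be at hand, here is a minimal sketch of the same greyscale idea on a made-up 2-by-2 image. The array `tiny` and its pixel values are invented for illustration, and `np.stack` is used as an alternative to the loop above, not as the lecture's method.

```python
import numpy as np

# A hypothetical 2x2 RGB image; each pixel is [red, green, blue]
tiny = np.array([[[255, 0, 0], [0, 255, 0]],
                 [[0, 0, 255], [30, 60, 90]]], dtype=np.uint8)
print(tiny.shape)  # (2, 2, 3)

# Averaging over axis 2 collapses the three color values of each pixel
grey = tiny.mean(axis=2)
print(grey.shape)  # (2, 2)

# Repeat the average in all three channels to get a displayable image
grey_3d = np.stack([grey, grey, grey], axis=2).astype(np.uint8)
print(grey_3d.shape)  # (2, 2, 3)
```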

### Removing channels

If we set the "green" and "blue" values for each pixel to 0, the result will be an image with only "red"!

In [None]:
red_geisel = geisel.copy()
red_geisel[:, :, 1] = 0
red_geisel[:, :, 2] = 0

In [None]:
plt.imshow(red_geisel)
plt.axis('off');

In [None]:
green_geisel = geisel.copy()
green_geisel[:, :, 0] = 0
green_geisel[:, :, 2] = 0

In [None]:
plt.imshow(green_geisel)
plt.axis('off');

In [None]:
blue_geisel = geisel.copy()
blue_geisel[:, :, 0] = 0
blue_geisel[:, :, 1] = 0

In [None]:
plt.imshow(blue_geisel)
plt.axis('off');
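
The three cells above follow the same pattern, so one way to avoid the repetition is a small helper. This is only a sketch: `keep_channel` is a hypothetical function name, and `tiny` is a made-up 2-by-2 image standing in for `geisel`.

```python
import numpy as np

def keep_channel(image, channel):
    """Return a copy of image with every color channel except `channel` set to 0."""
    result = image.copy()  # copy so the original image is left untouched
    for c in range(3):
        if c != channel:
            result[:, :, c] = 0
    return result

# A hypothetical 2x2 RGB image; each pixel is [red, green, blue]
tiny = np.array([[[255, 0, 0], [0, 255, 0]],
                 [[0, 0, 255], [30, 60, 90]]], dtype=np.uint8)

red_only = keep_channel(tiny, 0)
print(red_only[1, 1])  # [30  0  0]
print(tiny[1, 1])      # [30 60 90]
```

With the real image, `keep_channel(geisel, 0)` would play the role of `red_geisel`.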

## Future preview

### NumPy 🤝 linear algebra

You'll be covering linear algebra in a few weeks. When you do, know that matrices can be stored as 2D arrays, and that NumPy has many built-in functions for linear algebra, several of which live in `np.linalg`.

In [None]:
square = np.array([[3, 1, -2],
                   [4, 0, 1/2],
                   [-1, 2, 0]])

square

In [None]:
other_square = np.array([[0, 1, -2],
                         [-2, 3, 9],
                         [1, 4, 2]])

other_square

In [None]:
# Matrix-multiply the two matrices
square @ other_square

In [None]:
# Element-wise multiply the two matrices
square * other_square

In [None]:
# Multiply a matrix with a vector
square @ other_square[:, 0]
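
The difference between `@` and `*` is easier to verify by hand on small matrices. The sketch below uses made-up 2-by-2 matrices `A` and `B` and a vector `v`, none of which appear elsewhere in the lecture.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])
v = np.array([1, 1])

# Matrix multiplication: each entry is a row of A dotted with a column of B
print(A @ B)
# [[2 1]
#  [4 3]]

# Element-wise multiplication: entries in matching positions are multiplied
print(A * B)
# [[0 2]
#  [3 0]]

# Matrix times vector: one dot product per row of A
print(A @ v)  # [3 7]
```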

In [None]:
# Find the rank of a matrix
np.linalg.matrix_rank(square)

In [None]:
# Compute the "eigen"-decomposition of a matrix,
# whatever that is
np.linalg.eig(square)