## NumPy arrays

- Quick look at numpy arrays
  - Basics
  - Slicing
  - Array creation
- Multi-dimensional arrays
  - Basics
  - Matrices
  - Image processing
  - Solution of linear systems

## The `numpy` module

* Efficient, powerful array type
* Abstracts out standard operations on arrays
* Convenience functions
* Fundamental and important
  * The basis for most scientific/numerical computing in Python
  * Similar API in PyTorch, TensorFlow, CuPy, etc.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

### `numpy` arrays

* Fixed size (`arr.size`)
* Contiguous block of memory
* Same type (`arr.dtype`)
* Arbitrary dimensionality: `arr.shape`
* `shape` : extent (size) along each dimension
* `arr.itemsize` : number of bytes per element
* Note: `shape` can change so long as the `size` is constant
* Indices start from 0
* Negative indices work like lists
* Similar indexing/slicing as lists.
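
The attributes listed above can be inspected directly. The following is a minimal sketch on a small made-up array (`arr` is a hypothetical name, not part of the original notebook); it also shows that the shape may change as long as the total size stays the same.

```python
import numpy as np

# A small array of 64-bit floats (hypothetical example data)
arr = np.arange(6, dtype=np.float64)

print(arr.size)      # total number of elements
print(arr.shape)     # extent along each dimension
print(arr.dtype)     # element type shared by all entries
print(arr.itemsize)  # bytes per element
print(arr.nbytes)    # size * itemsize

# Changing the shape is allowed as long as the total size is unchanged
arr.shape = (2, 3)
print(arr.shape, arr.size)
```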

In [None]:
a = np.array([1, 2, 3, 4])
b = np.array([2, 3, 4, 5])
print(a[0], a[-1])
print("b", b[0], b[-1])

In [None]:
a[0] = -1
a[0] = 1

### Simple operations

Basic NumPy operations are typically elementwise

In [None]:
a + b

In [None]:
a*b

In [None]:
a/b

In [None]:
a//b

### Examples
The constants $\pi$ and $e$ are available as `np.pi` and `np.e`.


In [None]:
x = np.linspace(0.0, 10.0, 200)
x *= 2*np.pi/10
# Apply functions to the whole array at once.
y = np.sin(x)
z = np.cos(x)

In [None]:
# Setting the values is just like with lists.
x[0] = -1
print(x[0], x[-1])

### `size`, `shape`, etc.


In [None]:
x = np.array([1., 2, 3, 4])
x.size

In [None]:
x.dtype

In [None]:
x.shape

In [None]:
x.itemsize

In [None]:
x.nbytes

### Multi-dimensional arrays


In [None]:
a = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13]])
a.shape  # (rows, columns)

In [None]:
a[1, 3]

In [None]:
a[1, 3] = -1
a[1]  # The second row

In [None]:
a[1] = 0  # Entire row to zero.
a

### Slicing arrays


In [None]:
a = np.array([[1, 2, 3], [4, 5, 6],
              [7, 8, 9]])
a[0, 1:3]

In [None]:
a[1:, 1:]

In [None]:
a[:, 2]

### Slicing np.arrays ...


In [None]:
a = np.array([[1, 2, 3], [4, 5, 6],
              [7, 8, 9]])

a[0::2, 0::2]  # Striding...

Slices refer to the same memory!
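
One way to check this is `np.shares_memory`. The sketch below is only an illustration on a made-up 3x3 array; the names `view` and `independent` are hypothetical. It contrasts a slice with an explicit `.copy()`.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

view = a[::2, ::2]                # basic slicing gives a view onto a's memory
independent = a[::2, ::2].copy()  # an explicit copy owns its own memory

print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False

view[:] = -1         # writes through to a
independent[:] = 99  # leaves a untouched
print(a)
```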

In [None]:
a[::2, ::2] = 0
a

In [None]:
# What if?
t = a[::2, ::2]
t = 1

Beware of the above: `t = 1` only rebinds the name `t` to the integer 1 and leaves `a` unchanged. If you want to set all the values in the slice, use this:

In [None]:
t = a[::2, ::2]
t[:] = 1
a

### Array creation functions

* `array(object)`
* `linspace(start, stop, num=50)`
* `ones(shape)`
* `zeros((d1,...,dn))`
* `empty((d1,...,dn))`
* `identity(n)`
* `ones_like(x)`, `zeros_like(x)`, `empty_like(x)`
* `arange`
* `fromfunction`
* May pass an optional `dtype=`  keyword argument
* For more dtypes see: `numpy.typecodes`
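
A few of the creation functions listed above, in a short sketch. The variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

r = np.arange(0, 10, 2)           # start, stop (exclusive), step
s = np.linspace(0.0, 1.0, num=5)  # 5 points, both end points included
z = np.zeros((2, 3), dtype=int)   # the shape is passed as a tuple
e = np.empty((2, 2))              # uninitialised: contents are arbitrary
f = np.ones_like(z, dtype=float)  # same shape as z, different dtype

print(r)
print(s)
print(z.dtype, f.dtype)
```

Note that `empty` only allocates memory, so its contents should be overwritten before use.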

### Creation examples


In [None]:
a = np.array([1, 2, 3], dtype=float)

In [None]:
np.ones_like(a)

In [None]:
np.ones((2, 3))

In [None]:
np.identity(3)

In [None]:
np.fromfunction(lambda i, j: i+j+1, (3, 3))

### Array math

* Basic elementwise math (given two arrays `a, b`):

* `a + b` $\rightarrow$ `add(a, b)`
* `a - b` $\rightarrow$ `subtract(a, b)`
* `a * b` $\rightarrow$ `multiply(a, b)`
* `a / b` $\rightarrow$ `divide(a, b)`
* `a % b` $\rightarrow$ `remainder(a, b)`
* `a ** b` $\rightarrow$ `power(a, b)`
* In-place operators: `a += b`, or `add(a, b, a)`
* Note: What happens if `a` has dtype `int`  and `b` `float`?
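
A sketch of what to expect for the question above, assuming NumPy's default casting rules: the out-of-place sum is upcast to float, while the in-place version fails because a float result cannot be stored safely in an integer array. The `failed` flag is a hypothetical helper used only to record the outcome.

```python
import numpy as np

a = np.array([1, 2, 3])        # integer dtype
b = np.array([0.5, 0.5, 0.5])  # float64

c = a + b                      # a new array; the result is upcast to float
print(c.dtype)

failed = False
try:
    a += b                     # in place: float result, integer storage
except TypeError as exc:
    failed = True
    print("in-place add failed:", exc)

print(a)                       # a is left unchanged
```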

### Array math

* Logical (comparison) operations: `==, !=, <, >`, etc.

* `sin(x), arcsin(x), sinh(x)`, `exp(x), sqrt(x)`, etc.

* `sum(x, axis=0), prod(x, axis=0)`
* `dot(a, b)`: Or use `@`
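
The operations above can be tried on small made-up arrays. This is a minimal sketch, and all names in it are hypothetical.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])

mask = x > 2                    # elementwise comparison gives a boolean array
roots = np.sqrt(x)              # math functions also act elementwise
col_prods = np.prod(x, axis=0)  # product down each column

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
inner = u @ v                   # same as np.dot(u, v) for 1-D arrays

print(mask)
print(col_prods, inner)
```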

### Convenience functions: `loadtxt`

* `loadtxt(file_name)`: already seen this!

* `savetxt(file_name, data)`: saves data to a text file.

* `load(fname, **kw)`: Load a binary numpy file. Lot more powerful.
* `save(fname, data)`: Save array to binary numpy file.
* `savez(fname, *args, **kw)`: Save several arrays into a single file in ``.npz`` format.
* `savez_compressed(fname, *args, **kw)`: Same as `savez` but also compresses the file.
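
A round trip through these functions might look like the sketch below. It writes to a temporary directory so that no existing file is touched; the file names and the keyword names `first` and `second` are made up for the example.

```python
import os
import tempfile

import numpy as np

data = np.arange(6, dtype=float).reshape(3, 2)

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "data.txt")
    npy_path = os.path.join(tmp, "data.npy")
    npz_path = os.path.join(tmp, "arrays.npz")

    np.savetxt(txt_path, data)       # plain text, one row per line
    from_txt = np.loadtxt(txt_path)

    np.save(npy_path, data)          # binary, keeps dtype and shape
    from_npy = np.load(npy_path)

    # Several named arrays in one file
    np.savez(npz_path, first=data[:, 0], second=data[:, 1])
    with np.load(npz_path) as archive:
        names = sorted(archive.files)
        first = archive["first"]

print(names)
```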

In [None]:
data = np.loadtxt('../data/pendulum.txt')
data.shape

### Exercise

Take this data of shape (90, 2) and extract the two columns as separate 90-element arrays.

In [None]:
# Solution

### `savetxt, savez` and `load`


In [None]:
np.savetxt('test.txt', data)  # Overwrites the text file!

In [None]:
# savez is very handy.
x, y = data.T
np.savez('test.npz', x=x, y=y, z=np.random.random(10))

In [None]:
# Loading back the data.
data = np.load('test.npz')
data['x']

In [None]:
data['y']

In [None]:
data['z']

## Matrices

All matrix operations are performed using arrays

### Initializing


In [None]:
c = np.array([[11, 12, 13],
              [21, 22, 23],
              [31, 32, 33]])
c

### Initializing some special matrices


In [None]:
np.ones((3, 5))

In [None]:
np.ones_like(c)

In [None]:
np.identity(2)

### Accessing elements


In [None]:
c[1][2]

In [None]:
c[1,2]

In [None]:
c[1]

### Changing elements


In [None]:
c[1,1] = -22
c

In [None]:
c[1] = 0
c

### Exercise

How do you access one column?

### Slicing


In [None]:
c[:, 1]

In [None]:
c[1, :]

In [None]:
c[0:2, :]

In [None]:
c[1:3, :]

In [None]:
c[:2, :]

In [None]:
c[1:, :]

In [None]:
c[1:, :2]

### Striding


In [None]:
c[::2, :]

In [None]:
c[:, ::2]

In [None]:
c[::2, ::2]

## Elementary image processing

Let us see how to load a PNG image into a numpy array and plot it.

In [None]:
a = plt.imread('../data/penguins.png')

In [None]:
type(a)

In [None]:
a.shape

In [None]:
plt.imshow(a);


- `imread`  returns an array of shape (370, 370, 4) which represents an
  image of 370x370 pixels and 4 channels.
- The 4 channels represent R, G, B, Alpha
- `imshow`  renders the array as an image.


Instead of using matplotlib's imread which is limited, one can use
[imageio](https://imageio.github.io)

In [None]:
import imageio.v3 as iio  # recent imageio releases warn about the top-level imageio.imread
a = iio.imread("../data/penguins.png")

In [None]:
plt.imshow(a);

### Showing the different channels of the image.

### Exercise
- Extract the red channel of the image and show it.

In [None]:
# Solution

### How can we see things in different colors?

- Matplotlib provides a ton of different colormaps
- First let us understand what a colormap is.
- Read more in the matplotlib docs here: https://matplotlib.org/stable/users/explain/colors/colormaps.html

In [None]:
list(plt.colormaps)

In [None]:
plt.imshow(a[:, :, 0], cmap='Reds')

### Showing all the different channels?

- Use subplots

In [None]:
plt.figure(figsize=(10, 5))
plt.subplot(2, 2, 1)
plt.imshow(a)
plt.subplot(2, 2, 2)
plt.imshow(a[:, :, 0], cmap='Reds')
plt.colorbar();
plt.subplot(2, 2, 3)
plt.imshow(a[..., 1], cmap='Greens')  # Notice the use of ellipsis
plt.colorbar();
plt.subplot(2, 2, 4)
plt.imshow(a[..., 2], cmap='Blues');
plt.colorbar();

### Showing a histogram of the red channel

- Cannot blindly use `plt.hist` as the array is not 1D
- We have to "ravel" or "flatten" it.
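
The options for flattening differ in whether they copy. The sketch below uses a small made-up array, `img`, as a stand-in for one image channel.

```python
import numpy as np

img = np.arange(12).reshape(3, 4)  # stand-in for one image channel

r = img.ravel()    # 1-D; a view when the data is contiguous, as it is here
f = img.flatten()  # 1-D; always a copy
it = img.flat      # a 1-D iterator over the elements

print(r.shape, f.shape)
print(np.shares_memory(img, r), np.shares_memory(img, f))
print(list(it)[:3])
```

A single channel sliced out of an RGBA image is generally not contiguous, so `ravel` has to copy in that case.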

In [None]:
plt.hist(a[..., 0].flat, bins='auto');

In [None]:
plt.hist(a[..., 1].ravel(), bins='auto');

### Exercise

Make a histogram for each of the 4 channels and plot these in a single
figure with subplots.

### Slicing and Striding Exercises

* Crop the penguin image to get the top-left quarter

* Crop the image to get only the face of the baby penguin

* Resize image to half by dropping alternate pixels


In [None]:
# Solution

### Transpose of a Matrix


In [None]:
a = np.array([[1,  1,  2, -1],
              [2,  5, -1, -9],
              [2,  1, -1,  3],
              [1, -3,  2,  7]])
a.T

### Elementwise operations


In [None]:
b = np.array([[3, 2, -1, 5],
              [2, -2, 4, 9],
              [-1, 0.5, -1, -7],
              [9, -5, 7, 3]])
a + b

In [None]:
a*b

### The `axis` argument

Consider this:

In [None]:
np.sum(a)

In [None]:
# Recall that a is
a

In [None]:
# What if you want to sum across the rows or columns?
np.sum(a, axis=0)  # Collapses axis 0 (the rows): one sum per column

In [None]:
np.sum(a, axis=1)  # Collapses axis 1 (the columns): one sum per row


- Many functions take the axis argument, as this is very useful.
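
A smaller, non-square array makes the effect of `axis` easier to see. This is a minimal sketch with a hypothetical array `m`; `keepdims` is an optional argument of `np.sum` that keeps the collapsed axis with length 1.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])   # shape (2, 3)

total = np.sum(m)           # every element
down = np.sum(m, axis=0)    # collapses axis 0: one sum per column
across = np.sum(m, axis=1)  # collapses axis 1: one sum per row
kept = np.sum(m, axis=1, keepdims=True)  # shape (2, 1) instead of (2,)

print(total, down, across, kept.shape)
```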

## Reshaping, concatenation, stack and split

- Reshape an array to get a different view, if compatible and total size
  does not change
- Concatenate two arrays
- Stack and split arrays

In [None]:
v = np.arange(9)
c = v.reshape((3, 3))  # Note that this does not make a copy!
print(c)

In [None]:
c[0] = 1
print(v)  # v changes too, since c is a view of v

In [None]:
# New names p, q so that the matrices a and b defined earlier are kept.
c = np.arange(1, 7)
p, q = np.split(c, 2)
p, q

In [None]:
np.concatenate((p, q))

In [None]:
np.hsplit(np.arange(6), 3)

In [None]:
np.hstack((p, q))

In [None]:
np.stack((p, q))

In [None]:
np.stack((p, q), axis=1)

### Matrix Multiplication


In [None]:
np.dot(a, b)

In [None]:
# Easier to do this!
a @ b  # @ is the matrix multiplication operator

## Linear algebra


In [None]:
np.linalg.inv(a)

In [None]:
a @ np.linalg.inv(a)

In [None]:
# The determinant
np.linalg.det(a)
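
The outline lists the solution of linear systems. A minimal sketch of that, using `np.linalg.solve` on a made-up 2x2 system (the matrix `A` and right-hand side `rhs` are assumptions for illustration only):

```python
import numpy as np

# Hypothetical system:  3x + y = 9,  x + 2y = 8
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
rhs = np.array([9.0, 8.0])

x = np.linalg.solve(A, rhs)     # solves A @ x = rhs
print(x)
print(np.allclose(A @ x, rhs))  # check the solution
```

Calling `solve` is usually preferable to forming `inv(A) @ rhs` explicitly, since it avoids computing the inverse.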

### Eigenvalues and eigenvectors


In [None]:
e = np.array([[3, 2, 4], [2, 0, 2], [4, 2, 3]])
np.linalg.eig(e)

In [None]:
np.linalg.eigvals(e)
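
The result of `eig` can be unpacked into eigenvalues and a matrix whose columns are the eigenvectors. The sketch below checks the defining relation $e\,v = \lambda v$ for each pair; the names `values`, `vectors` and `ok` are hypothetical.

```python
import numpy as np

e = np.array([[3, 2, 4], [2, 0, 2], [4, 2, 3]])
values, vectors = np.linalg.eig(e)

# Column k of `vectors` is the eigenvector belonging to values[k]
ok = all(
    np.allclose(e @ vectors[:, k], values[k] * vectors[:, k])
    for k in range(len(values))
)
print(values)
print(ok)
```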

### Computing Norms


In [None]:
np.linalg.norm(a)

In [None]:
np.linalg.norm(e)

### Singular Value Decomposition


In [None]:
np.linalg.svd(e)
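
`svd` returns three pieces: the left singular vectors, the singular values as a 1-D array, and the transposed right singular vectors. As a sketch, the original matrix can be rebuilt from them, and the default (Frobenius) norm can be recovered from the singular values. The names `U`, `s`, `Vt` and `rebuilt` are chosen here for illustration.

```python
import numpy as np

e = np.array([[3, 2, 4], [2, 0, 2], [4, 2, 3]])
U, s, Vt = np.linalg.svd(e)

# s holds the singular values, largest first; rebuild e from the factors
rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(rebuilt, e))

# The default matrix norm is the Frobenius norm: sqrt of the sum of squares
print(np.isclose(np.linalg.norm(e), np.sqrt(np.sum(s**2))))
```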

## Least Squares Fit (linear regression)



- A linear trend is visible between $L$ and $T^2$


In [None]:
L, t = np.loadtxt('../data/pendulum.txt', unpack=True)
tsq = t*t

In [None]:
plt.scatter(L, tsq);

- The points do not lie exactly on a straight line; can we do better?
- Can we find a best-fit line?

### Matrix Formulation

- We need to fit a line through points for the equation $T^2 = m \cdot L+c$
- In matrix form, the equation can be represented as $T_{sq} = A \cdot p$
- We need to find $p$ to plot the line


$$
\begin{bmatrix}
  T^2_1 \\
  T^2_2 \\
  \vdots\\
  T^2_N \\
\end{bmatrix}
=
\begin{bmatrix}
  L_1 & 1 \\
  L_2 & 1 \\
  \vdots & \vdots\\
  L_N & 1 \\
\end{bmatrix}
\cdot\begin{bmatrix}
  m\\
  c\\
  \end{bmatrix}
$$
Or

$$T_{sq} = A\cdot p$$


### Generating $A$


In [None]:
A = np.array([L, np.ones_like(L)])
A = A.T

### Computing the least-square fit

* Now use the `np.linalg.lstsq`  function

* Along with other things, it returns the least squares solution
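
To see what the "other things" are, here is a sketch on synthetic data rather than the pendulum file. The line $y = 4x + 0.5$, the noise level and all variable names are assumptions made for the example. `lstsq` returns the solution, the sum of squared residuals, the rank of `A` and its singular values.

```python
import numpy as np

# Synthetic data from a known line y = 4*x + 0.5 plus small noise
rng = np.random.default_rng(0)
x = np.linspace(0.1, 1.0, 50)
y = 4.0 * x + 0.5 + rng.normal(scale=0.01, size=x.size)

A = np.column_stack([x, np.ones_like(x)])  # columns: x and 1

# rcond=None selects the machine-precision default cutoff
p, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
m, c = p
print(m, c)  # close to 4 and 0.5
print(rank)  # 2: the two columns are independent
```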


In [None]:
result = np.linalg.lstsq(A, tsq, rcond=None)  # rcond=None avoids a FutureWarning on older NumPy
coef = result[0]

### Least Square Fit Line ...
We get the points of the line from `coef`

In [None]:
t_line = coef[0]*L + coef[1]
t_line.shape

* Now plot `t_line` vs. `L` to get the least squares fit line.


In [None]:
plt.plot(L, t_line, 'r')
plt.plot(L, tsq, 'o');

## Advanced concepts

- Broadcasting and broadcasting rules
- Special indexing
- Masked indexing


### Simple broadcasting examples


In [None]:
a = np.array([1.0, 2, 3.0])
b = 2.0
a * b

In [None]:
a = np.ones((3, 3))
b = np.array([1, 2, 3])
a * b

In [None]:
b*a

In [None]:
a = np.array([1.0, 2, 3.0])
a.T  # Transposing a 1-D array changes nothing: the shape is still (3,)

In [None]:
a = np.array([1.0, 2, 3.0])
b = np.ones(5)
a * b  # Raises ValueError: shapes (3,) and (5,) are not compatible

### Broadcasting rules

- Consider `A <operator> B`

- Compare their shapes element-wise starting with the rightmost dimension
  going left.

- Two dimensions are compatible if:

  - they are equal
  - or one of them is 1, or is missing (a missing dimension is treated as 1)

- Do not need the same number of dimensions
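
The rules above can be checked without allocating any arrays. The sketch below assumes a NumPy version that provides `np.broadcast_shapes` (1.20 or later); the shapes are made up for illustration and `incompatible` is a hypothetical flag.

```python
import numpy as np

print(np.broadcast_shapes((256, 256, 3), (3,)))      # (256, 256, 3)
print(np.broadcast_shapes((8, 1, 6, 1), (7, 1, 5)))  # (8, 7, 6, 5)

incompatible = False
try:
    # Rightmost pair: 1 vs 3 is fine. Next pair: 2 vs 4 is not.
    np.broadcast_shapes((2, 1), (8, 4, 3))
except ValueError as exc:
    incompatible = True
    print("incompatible:", exc)
```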


### Illustration

<br/>

<img height="80%" src="images/broadcasting_1.png" align="center"/>

<br/>
<span style="font-size:50%" >
Image source: https://numpy.org/doc/stable/user/basics.broadcasting.html
</span>


### Examples

- From the documentation!

```
Image  (3d array): 256 x 256 x 3
Scale  (1d array):             3
Result (3d array): 256 x 256 x 3
```

```
A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (4d array):  8 x 7 x 6 x 5
```


```
A      (2d array):      2 x 1
B      (3d array):  8 x 4 x 3  # second from last dimensions mismatched
```


### Examples


In [None]:
a = np.fromfunction(lambda i, j: 10*i, (4, 3))
a

In [None]:
b = np.array([1.0, 2.0, 3.0])
print(a.shape, b.shape)
a + b

### Illustration

<br/>

<img height="80%" src="images/broadcasting_2.png" align="center"/>

<br/>
<span style="font-size:50%" >
Image source: https://numpy.org/doc/stable/user/basics.broadcasting.html
</span>


### Examples

- Introducing a new axis


In [None]:
# No magic! a has shape (4, 3) and b has shape (4,), so this raises a ValueError.
b = np.array([1.0, 2.0, 3.0, 4.0])
a + b

In [None]:
b = np.array([1.0, 2.0, 3.0, 4.0]).reshape((4, 1))
# This is now (4, 3) + (4, 1) which are compatible.
a + b

- Use `np.newaxis` or `None`

In [None]:
a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])
a[:, np.newaxis] + b  # or a[:, None] + b

### Illustration

<br/>

<img height="80%" src="images/broadcasting_4.png" align="center"/>

<br/>
<span style="font-size:50%" >
Image source: https://numpy.org/doc/stable/user/basics.broadcasting.html
</span>





## Special indexing

- Can be indexed using arrays of indices

In [None]:
a = np.arange(12)**2
i = np.array([10, 1, 3, 8, 5])
a[i]

In [None]:
j = np.array([[3, 4], [9, 7]])
a[j]

In [None]:
a[i] = 4
a

### More examples

- Note: indices for each dimension must have same shape
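
A minimal sketch of indexing with one index array per dimension, using made-up index arrays `rows` and `cols`. More generally, index arrays of different shapes are broadcast against each other, and the result is a copy rather than a view.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

rows = np.array([0, 1, 2])
cols = np.array([3, 2, 1])
picked = a[rows, cols]  # elements (0, 3), (1, 2), (2, 1)

# A (3, 1) index array broadcast against a (2,) index array gives (3, 2)
block = a[rows[:, np.newaxis], np.array([0, 3])]

picked[0] = -1          # indexing with arrays returns a copy, not a view
print(picked, a[0, 3])
print(block)
```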


In [None]:
a = np.arange(12).reshape(3, 4)
a

In [None]:
i = np.array([[0, 1],
              [1, 2]])
j = np.array([[2, 1],
              [3, 3]])
a[i, j]

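Warning: a single array of indices indexes only the first axis, so the following do not pick out individual elements: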

In [None]:
a[i]

In [None]:
a[np.array([i, j])]  # Still indexes axis 0 only; raises IndexError here since j contains 3

## Indexing with a boolean array

- Very powerful
- Allows one to vectorize conditionals
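
As a sketch of what "vectorizing a conditional" can look like, the example below replaces an `if`/`else` inside a loop with `np.where` and with a masked assignment. The array `x` and the thresholds are made up for illustration.

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])

# Loop-free version of: y[k] = x[k] if x[k] > 0 else 0
y = np.where(x > 0, x, 0.0)

# Masked assignment: cap values above 2, working on a copy
z = x.copy()
z[z > 2] = 2.0

n_positive = np.count_nonzero(x > 0)
print(y, z, n_positive)
```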

In [None]:
a = np.arange(12).reshape(3, 4)
b = a > 4
b

In [None]:
a[b]

In [None]:
# Or
a[a > 4]

In [None]:
# Can also assign.
a[b] = 21
a

In [None]:
b = (a > 4) | (a % 3 == 0)  # ~ is not, & is and, | is or, ^ is xor
b

In [None]:
a[b]

## Creating a grid of points

- A box of 25x25 points from $[-2, -2]$ to $[2, 2]$


In [None]:
xo, yo = -2, -2
n = 25
dx = dy = 4.0/(n - 1)
x = np.fromfunction(lambda i, j: xo + i*dx, (n, n))
y = np.fromfunction(lambda i, j: yo + j*dy, (n, n))

In [None]:
plt.scatter(x, y);
plt.axis('equal')

In [None]:
# Easier way
t = np.linspace(-2, 2, 25)
x, y = np.meshgrid(t, t)
x.shape, y.shape

In [None]:
plt.contourf(x, y, np.sin(x*x + y*y))  # Can try with cmap='hot'
plt.colorbar();

In [None]:
# Can also do
t = np.linspace(-2, 2, 25)
u = np.linspace(-2, 2, 20)
x, y = np.meshgrid(t, u)  # Note the axis convention.
x.shape, y.shape
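
The axis convention mentioned above can be made explicit with the `indexing` argument of `np.meshgrid`. In this sketch, `xi` and `yi` are hypothetical names for the matrix-style result.

```python
import numpy as np

t = np.linspace(-2, 2, 25)  # x values
u = np.linspace(-2, 2, 20)  # y values

x, y = np.meshgrid(t, u)                   # default indexing='xy'
xi, yi = np.meshgrid(t, u, indexing='ij')  # matrix-style indexing

print(x.shape)   # (20, 25): rows follow u, columns follow t
print(xi.shape)  # (25, 20): the first axis follows t
```

The default `'xy'` layout is the one plotting functions such as `contourf` expect, which is why the rows correspond to `y`.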

### Exercise

- Plot interior of the unit circle by generating a collection of points
  inside the circle.
- Use what you have learned so far to do this without a loop.


In [None]:
# Solution

## Learn more

* https://numpy.org/doc/stable/
* https://numpy.org