In [2]:
import pickle 
import numpy as np

# Matrices

This notebook uses many images from the excellent [A Visual Intro to NumPy and Data Representation](https://jalammar.github.io/visual-numpy/) from [Jay Alammar](https://jalammar.github.io/).

In the first notebook ([1. vector.ipynb](https://github.com/utstikkar/numpy/blob/master/1.%20vector.ipynb)) we dealt with vectors (one dimensional). 

Now we deal with **Matrices** - arrays with two dimensions.

$\textbf{A}_{2, 2} = \begin{bmatrix}A_{1, 1} & A_{1, 2} \\ A_{2, 1} & A_{2, 2}\end{bmatrix}$

- two dimensional
- uppercase, bold $\textbf{A}_{m, n}$
- $A_{1, 1}$ = first element
- area
- tabular data

## Reshaping

Now that we have multiple dimensions, we need to start considering shape.

We can see the shape using `.shape`

In [None]:
data = np.arange(16)
data

In [None]:
data.shape

And the number of elements using `.size`

In [None]:
data.size

For a matrix, the **shape** is more than just an indication of the length: it gives the size of each dimension. We can change the shape using `.reshape`:

In [None]:
data.reshape(4, 4)

A very useful tool when reshaping is `-1`: it marks a free dimension whose size numpy infers from the size of the data and the other dimensions
- this is often used for the batch / number of samples dimension

In [None]:
data.reshape(2, -1)

In [None]:
data.reshape(-1, 4)

We can use `.reshape` to flatten

In [None]:
data.reshape(-1)

We can also use `.flatten`

In [None]:
data.flatten()

And finally `.ravel`

In [None]:
data.ravel()

The difference between `.ravel` and `.flatten`: `.ravel` returns a view of the original array whenever possible (not a copy), so changing the raveled array changes the original too. Because a view does not allocate new memory for the data, `.ravel` is usually faster than `.flatten`.

In [None]:
i = data.ravel()
i

In [16]:
i[0] = 100

In [None]:
data

`.flatten` always returns a copy - `.ravel` doesn't (if it can)
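
One way to check the copy-versus-view behaviour is `np.shares_memory`. The sketch below uses a small illustrative array `a` (a name introduced here, not the notebook's `data`) and also shows a case where `.ravel` cannot return a view:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # illustrative array, C-contiguous

flat_copy = a.flatten()          # always a new array
flat_view = a.ravel()            # a view here, because `a` is contiguous
ravel_of_t = a.T.ravel()         # the transpose is not C-contiguous, so ravel has to copy

print(np.shares_memory(a, flat_copy))    # False
print(np.shares_memory(a, flat_view))    # True
print(np.shares_memory(a, ravel_of_t))   # False
```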

Closely related to a reshape is the **transpose**, which flips the array along the diagonal:

<img src="../assets/trans.png" alt="" width="300"/>

In [None]:
np.arange(0, 6)

In [None]:
np.arange(0, 6).reshape(3, -1)

In [None]:
np.arange(0, 6).reshape(3, -1).T

Reshape is (usually) computationally **cheap** - to understand why we need to know a little about how a `np.array` is laid out in memory

## The `np.array` in memory

- the data is stored in a single block
- the shape is stored as a tuple

Why is storing in a single block (known as a contiguous layout) a good thing?
- to access the next value in the array, we just move to the next memory address
- the size of that step is defined by the data type

> ... storing data in a contiguous block of memory ensures that the architecture of modern CPUs is used optimally, in terms of memory access patterns, CPU cache, and vectorized instructions - [iPython cookbook](https://ipython-books.github.io/45-understanding-the-internals-of-numpy-to-avoid-unnecessary-array-copying/)

Changing the shape only means changing the tuple 
- the layout of the data in memory is not changed
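
A minimal sketch of this idea, assuming an illustrative array `a` with the dtype fixed to `int64` so the byte counts are the same on every platform. Alongside the shape, numpy also keeps `.strides`: the number of bytes to step to reach the next element along each dimension.

```python
import numpy as np

a = np.arange(12, dtype=np.int64)   # each element takes 8 bytes
b = a.reshape(3, 4)                 # same data block, new shape

print(a.itemsize)              # 8
print(a.strides)               # (8,)
print(b.strides)               # (32, 8): next row is 4 * 8 bytes away
print(np.shares_memory(a, b))  # True: the reshape did not copy the data
```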

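Operations that change the order of the data are different. A transpose on its own is still cheap (numpy only swaps the shape and strides), but reshaping a transposed array forces numpy to copy the data into a new layout: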

In [None]:
data = np.arange(10000000).reshape(5, -1)
%timeit data.reshape((1, -1))

In [None]:
data = np.arange(10000000).reshape(5, -1)
%timeit data.T.reshape((1, -1))
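
The timing gap can be traced with a much smaller array. This is a sketch with illustrative names (`m`, `t`, `r`) and an explicit `int64` dtype: the transpose shares memory with the original, but it is no longer C-contiguous, so the following reshape has to copy.

```python
import numpy as np

m = np.arange(12, dtype=np.int64).reshape(3, 4)
t = m.T                        # only the shape and strides are swapped

print(t.strides)                # (8, 32)
print(np.shares_memory(m, t))   # True: still a view
print(t.flags['C_CONTIGUOUS'])  # False

r = t.reshape(1, -1)            # needs the data in a new order
print(np.shares_memory(m, r))   # False: the reshape made a copy
```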

## Two dimensional indexing

<img src="../assets/idx2.png" alt="" width="500"/>

In [None]:
data = np.random.rand(2, 3)
data

We specify both dimensions using a familiar `[]` syntax

`:` = entire dimension

In [None]:
# first row
data[0, :]

`-1` = last element

In [None]:
# last column
data[:, -1]
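
A few more indexing patterns, sketched on a small deterministic array (`grid` is an illustrative name) so the results are easy to follow. Note that indexing with an integer drops that dimension, while slicing keeps it:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

print(grid[1, 0])           # 3: single element at row 1, column 0
print(grid[:, :2])          # first two columns: [[0, 1], [3, 4]]
print(grid[0, :].shape)     # (3,): integer index removes the row dimension
print(grid[0:1, :].shape)   # (1, 3): a slice keeps it
```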

### Two dimension aggregation

<img src="../assets/agg-2d.png" alt="" width="900"/>

Now that we are working in two dimensions, we have more flexibility in how we aggregate
- we can specify the axis (i.e. the dimension) along which we aggregate

In [None]:
data

In [None]:
# average over all the data
np.mean(data)

In [None]:
# average over the rows - so we end up with an array with one element per column (3)
np.mean(data, axis=0)

In [None]:
# average over the columns - so we end up with an array with one element per row (2)
np.mean(data, axis=1)

By default `numpy` will remove the dimension you are aggregating over:

In [None]:
data

In [None]:
np.mean(data, axis=1).shape

You can choose to keep this dimension using a `keepdims` argument:

In [None]:
np.mean(data, axis=1, keepdims=True).shape
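
One place where `keepdims=True` helps is subtracting a per-row statistic from the original array. A minimal sketch, using an illustrative array `scores`:

```python
import numpy as np

scores = np.arange(6.0).reshape(2, 3)   # [[0., 1., 2.], [3., 4., 5.]]

row_means = scores.mean(axis=1, keepdims=True)   # shape (2, 1)
centered = scores - row_means                    # (2, 3) minus (2, 1) broadcasts

print(row_means.shape)   # (2, 1)
print(centered)          # each row becomes [-1., 0., 1.]
```

Without `keepdims`, the means would have shape `(2,)`, which does not broadcast against `(2, 3)`.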

## Practical

Aggregate by variance `np.var` 
- over the rows
- over the columns
- over all data

## Two dimensional broadcasting

The general rule with broadcasting: shapes are compared dimension by dimension, starting from the last, and two dimensions are compatible when
- they are equal
- or when one of them is 1

<img src="../assets/broad-2d.png" alt="" width="500"/>

In [None]:
data = np.arange(1, 7).reshape(3, 2)
data

In [None]:
data + np.array([0, 1, 1]).reshape(3, 1)

In [None]:
data + 1
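
The compatibility rule can be tried out directly. In this sketch `m` is an illustrative `(3, 2)` array, and the `failed` flag exists only to record that the last addition raises:

```python
import numpy as np

m = np.arange(1, 7).reshape(3, 2)   # shape (3, 2)

# (3, 2) + (2,): trailing dimensions are equal, so this works
print((m + np.array([10, 20])).shape)          # (3, 2)

# (3, 2) + (3, 1): the trailing 1 stretches to 2
print((m + np.array([[0], [1], [1]])).shape)   # (3, 2)

# (3, 2) + (3,): trailing dimensions 2 and 3 differ and neither is 1
failed = False
try:
    m + np.array([0, 1, 1])
except ValueError as err:
    failed = True
    print("not broadcastable:", err)
```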

## Matrix arithmetic

We can add matrices of the same shape as expected:

<img src="../assets/add-matrix.png" alt="" width="300"/>

In [None]:
data + np.ones_like(data)

In [None]:
# result is the same as above
data + 1

## Matrix multiplication

Matrix multiplication (the dot product) will often **change the shape** of the array
- this is what happens in neural networks

<img src="../assets/dot1.png" alt="" width="900"/>

This operation can be visualized:

<img src="../assets/dot2.png" alt="" width="900"/>

In [None]:
data = np.array([1, 2, 3])
powers_of_ten = np.array([10**n for n in range(6)]).reshape(3, 2)
powers_of_ten

This is done in numpy using either `np.dot()`:

In [None]:
np.dot(data, powers_of_ten)

Or calling the `.dot()` method on the array itself:

In [None]:
data.dot(powers_of_ten)
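
Python's `@` operator is another way to write the same product. The sketch below uses illustrative names (`vec`, `mat`, `left`) and shows the shape rule: the inner dimensions must match and disappear from the result.

```python
import numpy as np

vec = np.array([1, 2, 3])                                 # shape (3,)
mat = np.array([10**n for n in range(6)]).reshape(3, 2)   # shape (3, 2)

result = vec @ mat          # (3,) @ (3, 2) -> (2,)
print(result)               # values 30201 and 302010
print(result.shape)         # (2,)

left = np.ones((2, 3))
print((left @ mat).shape)   # (2, 3) @ (3, 2) -> (2, 2)
```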

## Making arrays from nested lists

In [None]:
data = np.array([[1, 2], [3, 4]])

data

## Making arrays from shape tuples

The argument to these functions is a tuple

### `zeros`, `ones`, `full`

In [None]:
np.zeros((2, 4))

In [None]:
np.ones((3, 5))

In [None]:
np.full((2, 4), 3)

### `zeros_like`, `ones_like`, `full_like`

Similar to counterparts above, except their shape is defined by another array:

In [None]:
parent = np.arange(10).reshape(2, 5)
parent

In [None]:
np.zeros_like(parent)

In [None]:
np.ones_like(parent)

In [None]:
np.full_like(parent, 3)

### `empty`

Similar to `zeros`, except the array is left uninitialized, so it holds whatever values were already in that memory
- this is a bit quicker than `zeros`

In [None]:
d = np.empty(4)
for e in range(4):
    d[e] = e

d

### `eye`

Identity matrix:

In [None]:
np.eye(2)

In [None]:
np.eye(6)

The linear algebra version of a 1

In [None]:
d = np.arange(4).reshape(2, 2)
d

Taking the dot product with the identity matrix is like multiplying by 1:

In [None]:
np.dot(np.eye(2), d)
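
A small check of this property, sketched with `np.array_equal`. It holds from either side, although the result is a float array because `np.eye` returns floats by default:

```python
import numpy as np

d = np.arange(4).reshape(2, 2)
identity = np.eye(2)

print(np.array_equal(identity @ d, d))   # True
print(np.array_equal(d @ identity, d))   # True
print((identity @ d).dtype)              # float64
```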

## Matrix Practice

1. Write a function (using numpy) to sort a given two-dimensional array along the first axis (rows), along the second axis (columns), and as a flattened array.

example: 

In [None]:
# Expected Output:
# Original array:
np.array([[10, 40],
          [30, 20]])
# Sort the array along the first axis:
np.array([[10, 20],
          [30, 40]])
# Sort the array along the last axis:
np.array([[10, 40],
          [20, 30]])
# Sort the flattened array:
np.array([10, 20, 30, 40])

#### Answer:
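
One possible answer, sketched with `np.sort`. The function name `sort_all_ways` is made up for this example; passing `axis=None` tells `np.sort` to flatten the array before sorting.

```python
import numpy as np

def sort_all_ways(arr):
    """Return arr sorted along axis 0, along axis 1, and flattened."""
    by_first_axis = np.sort(arr, axis=0)    # sorts each column
    by_last_axis = np.sort(arr, axis=1)     # sorts each row
    flattened = np.sort(arr, axis=None)     # flattens, then sorts
    return by_first_axis, by_last_axis, flattened

original = np.array([[10, 40],
                     [30, 20]])
first, last, flat = sort_all_ways(original)
print(first)
print(last)
print(flat)
```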

2. Write a function to get the indices of the sorted elements of a given array

Expected Output:

Original array:


`[1023 5202 6230 1671 1682 5241 4532]`


Indices of the sorted elements of a given array:


`[0 3 4 6 1 5 2]`

#### Answer
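
One possible answer, sketched with `np.argsort`; the function name `sorted_indices` is made up for this example.

```python
import numpy as np

def sorted_indices(arr):
    """Return the indices that would sort arr in ascending order."""
    return np.argsort(arr)

values = np.array([1023, 5202, 6230, 1671, 1682, 5241, 4532])
order = sorted_indices(values)
print(order)           # indices of the elements in ascending order
print(values[order])   # the sorted values
```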

3. Write a function to create a 5x5 array with random values and find the minimum and maximum values.

#### Answer
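
One possible answer, sketched with numpy's `default_rng` generator. The function name `random_min_max` and its `seed` parameter are made up for this example; the seed is optional and only makes the output repeatable.

```python
import numpy as np

def random_min_max(seed=None):
    """Create a 5x5 array of uniform random values in [0, 1) and return it with its min and max."""
    rng = np.random.default_rng(seed)
    values = rng.random((5, 5))
    return values, values.min(), values.max()

arr, smallest, largest = random_min_max(seed=0)
print(arr.shape)
print(smallest, largest)
```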