In [1]:
import pickle 
import numpy as np

# Matrices

This notebook uses many images from the excellent [A Visual Intro to NumPy and Data Representation](https://jalammar.github.io/visual-numpy/) by [Jay Alammar](https://jalammar.github.io/).

In the first notebook ([vector.ipynb](https://github.com/ADGEfficiency/teaching-monolith/blob/master/numpy/1.vector.ipynb)) we dealt with vectors (one dimensional). 

Now we deal with **matrices** - arrays with two dimensions.

$\textbf{A}_{2, 2} = \begin{bmatrix}A_{1, 1} & A_{1, 2} \\ A_{2, 1} & A_{2, 2}\end{bmatrix}$

- two dimensional
- uppercase, bold $\textbf{A}_{m, n}$
- $A_{1, 1}$ = first element
- area
- tabular data

## Reshaping

Now that we have multiple dimensions, we need to start considering shape.

We can see the shape using `.shape`

In [2]:
data = np.arange(16)
data

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [3]:
data.shape

(16,)

And the number of elements using `.size`

In [4]:
data.size

16

The **shape** of a matrix becomes more than just an indication of the length.  We can change the shape using reshape:

In [5]:
data.reshape(4, 4)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

A very useful tool when reshaping is `-1` - this is a free dimension whose size is inferred from the size of the data and the other dimensions
- this is often used for the batch / number of samples dimension

In [6]:
data.reshape(2, -1)

array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15]])

In [7]:
data.reshape(-1, 4)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
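As a small sketch of how the `-1` is resolved: the free dimension is the total number of elements divided by the product of the other dimensions, and the reshape fails when that division is not exact. The variable names here are illustrative only.

```python
import numpy as np

data = np.arange(16)

# -1 is inferred as 16 / 2 = 8
wide = data.reshape(2, -1)
print(wide.shape)

# -1 is inferred as 16 / 8 = 2
tall = data.reshape(8, -1)
print(tall.shape)

# 16 elements cannot be split into rows of 5, so the reshape raises
try:
    data.reshape(-1, 5)
    failed = False
except ValueError:
    failed = True
print(failed)
```

Only one dimension can be left free in this way, since with two unknowns the shape would be ambiguous.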

We can use `.reshape` to flatten

In [8]:
data.reshape(-1)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

We can also use `.flatten`

In [9]:
data.flatten()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

And finally `.ravel`

In [10]:
data.ravel()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

`.ravel` differs from `.flatten` in that it returns a view (not a copy, just a view of the original object), so changing the result also changes the original:

In [11]:
i = data.ravel()
i

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [12]:
i[0] = 100

In [13]:
data

array([100,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15])

`.flatten` always returns a copy - `.ravel` returns a view whenever it can, and only copies when it has to
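One way to check the copy versus view behaviour without modifying any data is `np.shares_memory`. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

data = np.arange(16).reshape(4, 4)

flat_copy = data.flatten()  # always a new array
flat_view = data.ravel()    # a view here, because data is contiguous

print(np.shares_memory(data, flat_copy))  # False
print(np.shares_memory(data, flat_view))  # True

# ravel has to copy when the elements are not stored in the order requested,
# for example after a transpose
transposed_flat = data.T.ravel()
print(np.shares_memory(data, transposed_flat))  # False
```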

Closely related to a reshape is the **transpose**, which flips the array along the diagonal:

<img src="../assets/trans.png" alt="" width="300"/>

In [14]:
np.arange(0, 6)

array([0, 1, 2, 3, 4, 5])

In [15]:
np.arange(0, 6).reshape(3, -1)

array([[0, 1],
       [2, 3],
       [4, 5]])

In [16]:
np.arange(0, 6).reshape(3, -1).T

array([[0, 2, 4],
       [1, 3, 5]])

Reshape is (usually) computationally **cheap** - to understand why we need to know a little about how a `np.array` is laid out in memory

## The `np.array` in memory

- the data is stored in a single block
- the shape is stored as a tuple

Why is storing in a single block (known as a contiguous layout) a good thing?
- to access the next value in the array
- we just move to the next memory address
- the size of that step is defined by the data type

> ... storing data in a contiguous block of memory ensures that the architecture of modern CPUs is used optimally, in terms of memory access patterns, CPU cache, and vectorized instructions - [IPython Cookbook](https://ipython-books.github.io/45-understanding-the-internals-of-numpy-to-avoid-unnecessary-array-copying/)

Changing the shape only means changing the tuple 
- the layout of the data in memory is not changed
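The points above can be inspected directly. The sketch below assumes an `int64` array (8 bytes per element) and uses the `.itemsize` and `.strides` attributes, where strides are the number of bytes to step to reach the next element along each dimension.

```python
import numpy as np

data = np.arange(16, dtype=np.int64)

print(data.itemsize)  # 8 bytes per element
print(data.strides)   # (8,)

# reshaping changes the shape and strides, not the underlying block of memory
square = data.reshape(4, 4)
print(square.shape)    # (4, 4)
print(square.strides)  # (32, 8): 32 bytes to the next row, 8 to the next column
print(np.shares_memory(data, square))  # True
```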

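The operations that will change the memory layout are ones that change the order of the data. A transpose on its own is only a view, but reshaping a transposed array has to copy the data into its new order, which is far slower: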

In [17]:
data = np.arange(10000000).reshape(5, -1)
%timeit data.reshape((1, -1))

123 ns ± 0.17 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [18]:
data = np.arange(10000000).reshape(5, -1)
%timeit data.T.reshape((1, -1))

12.7 ms ± 111 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
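The gap between the two timings is consistent with the second reshape copying the data. A hedged way to confirm this on a small array (assuming `int64` elements) is to compare strides and check for shared memory:

```python
import numpy as np

data = np.arange(10, dtype=np.int64).reshape(5, -1)  # shape (5, 2)

# the transpose is a view: same memory, strides swapped
transposed = data.T
print(data.strides)        # (16, 8)
print(transposed.strides)  # (8, 16)
print(np.shares_memory(data, transposed))  # True

# reshaping the original is still a view
print(np.shares_memory(data, data.reshape(1, -1)))  # True

# reshaping the transpose needs the elements in a new order, so numpy copies
print(np.shares_memory(data, transposed.reshape(1, -1)))  # False
```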


## Two dimensional indexing

<img src="../assets/idx2.png" alt="" width="500"/>

In [19]:
data = np.random.rand(2, 3)
data

array([[0.35182986, 0.50479884, 0.0419396 ],
       [0.526252  , 0.93281568, 0.24237533]])

We specify both dimensions using a familiar `[]` syntax

`:` = entire dimension

In [20]:
# first row
data[0, :]

array([0.35182986, 0.50479884, 0.0419396 ])

`-1` = last element

In [21]:
# last column
data[:, -1]

array([0.0419396 , 0.24237533])
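A few more indexing patterns, sketched on a small integer array so the results are easy to read. The array `grid` is made up for illustration.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# a single element: row 1, column 2
print(grid[1, 2])  # 6

# a block: first two rows, columns 1 and 2
block = grid[:2, 1:3]
print(block)  # [[1 2]
              #  [5 6]]

# an integer index drops the dimension, a slice keeps it
print(grid[:, 0].shape)   # (3,)
print(grid[:, :1].shape)  # (3, 1)
```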

### Two dimensional aggregation

<img src="../assets/agg-2d.png" alt="" width="900"/>

Now that we are working in two dimensions, we have more flexibility in how we aggregate
- we can specify the axis (i.e. the dimension) along which we aggregate

In [22]:
data

array([[0.35182986, 0.50479884, 0.0419396 ],
       [0.526252  , 0.93281568, 0.24237533]])

In [23]:
# average over all the data
np.mean(data)

0.4333352193903852

In [24]:
# average over the rows - so we end up with an array with one element per column (3)
np.mean(data, axis=0)

array([0.43904093, 0.71880726, 0.14215746])

In [25]:
# average over the columns - so we end up with an array with one element per row (2)
np.mean(data, axis=1)

array([0.29952277, 0.56714767])

By default `numpy` will remove the dimension you are aggregating over:

In [26]:
data

array([[0.35182986, 0.50479884, 0.0419396 ],
       [0.526252  , 0.93281568, 0.24237533]])

In [27]:
np.mean(data, axis=1).shape

(2,)

You can choose to keep this dimension using a `keepdims` argument:

In [28]:
np.mean(data, axis=1, keepdims=True).shape

(2, 1)
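One situation where `keepdims` helps is subtracting each row's mean from that row. The sketch below uses made-up values and relies on broadcasting: a `(2, 1)` result lines up against a `(2, 3)` array, while a `(2,)` result does not.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [10.0, 20.0, 30.0]])

# shape (2, 1) lines up with the rows of data
row_means = np.mean(data, axis=1, keepdims=True)
centred = data - row_means
print(centred)  # [[ -1.   0.   1.]
                #  [-10.   0.  10.]]

# without keepdims the means have shape (2,), which does not match (2, 3)
try:
    data - np.mean(data, axis=1)
    worked = True
except ValueError:
    worked = False
print(worked)  # False
```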

## Practical

Aggregate by variance `np.var` 
- over the rows
- over the columns
- over all data

In [29]:
np.var(data, axis=0)

array([0.00760577, 0.0457996 , 0.01004362])

In [30]:
np.var(data, axis=1)

array([0.03707446, 0.08028754])

In [31]:
np.var(data)

0.0765867744040815

## Two dimensional broadcasting

The general rule with broadcasting - dimensions are compatible when
- they are equal
- or when one of them is 1

<img src="../assets/broad-2d.png" alt="" width="500"/>

In [32]:
data = np.arange(1, 7).reshape(3, 2)
data

array([[1, 2],
       [3, 4],
       [5, 6]])

In [33]:
data + np.array([0, 1, 1]).reshape(3, 1)

array([[1, 2],
       [4, 5],
       [6, 7]])

In [34]:
data + 1

array([[2, 3],
       [4, 5],
       [6, 7]])
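The compatibility rule can be written out as a short function. `broadcast_shape` below is a hypothetical helper written for illustration, not part of NumPy; it compares dimensions from the right, treating missing leading dimensions as 1, and is checked against the shape NumPy itself reports via `np.broadcast`.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: shape that results from broadcasting two shapes."""
    # pad the shorter shape with leading 1s so both have the same length
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for dim_a, dim_b in zip(a, b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible dimensions {dim_a} and {dim_b}")
    return tuple(result)

print(broadcast_shape((3, 2), (3, 1)))  # (3, 2)
print(broadcast_shape((3, 2), (2,)))    # (3, 2)

# compare with what numpy reports for real arrays
print(np.broadcast(np.empty((3, 2)), np.empty((3, 1))).shape)  # (3, 2)

# (3, 2) and (3,) are not compatible: the trailing dimensions are 2 and 3
try:
    broadcast_shape((3, 2), (3,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```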

## Matrix arithmetic

We can make arrays from nested lists:

In [35]:
np.array([[1, 2], [3, 4]])

array([[1, 2],
       [3, 4]])

We can add matrices of the same shape as expected:

<img src="../assets/add-matrix.png" alt="" width="300"/>

In [36]:
data + np.ones_like(data)

array([[2, 3],
       [4, 5],
       [6, 7]])

In [37]:
# result is the same as above
data + 1

array([[2, 3],
       [4, 5],
       [6, 7]])

## Matrix multiplication

Matrix multiplication (the dot product) is different from the elementwise arithmetic above - it will often **change the shape** of the array
- this is what happens in neural networks

<img src="../assets/dot1.png" alt="" width="900"/>

This operation can be visualized:

<img src="../assets/dot2.png" alt="" width="900"/>

In [38]:
data = np.array([1, 2, 3])
powers_of_ten = np.array([10**n for n in range(6)]).reshape(3, 2)
powers_of_ten

array([[     1,     10],
       [   100,   1000],
       [ 10000, 100000]])

This is done in numpy using either `np.dot()`:

In [39]:
np.dot(data, powers_of_ten)

array([ 30201, 302010])

Or calling the `.dot()` method on the array itself:

In [40]:
data.dot(powers_of_ten)

array([ 30201, 302010])
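To make the shape change concrete, the sketch below repeats the same calculation with the `@` operator (equivalent to `np.dot` for these inputs) and then rebuilds it by hand: each output element is the sum of the elementwise products of `data` with one column of `powers_of_ten`.

```python
import numpy as np

data = np.array([1, 2, 3])                   # shape (3,)
powers_of_ten = np.array([[1, 10],
                          [100, 1000],
                          [10000, 100000]])  # shape (3, 2)

# (3,) @ (3, 2) -> (2,): the shared dimension of length 3 is summed away
result = data @ powers_of_ten
print(result.shape)  # (2,)

# the same result computed one column at a time
manual = np.array([
    np.sum(data * powers_of_ten[:, col])
    for col in range(powers_of_ten.shape[1])
])
print(manual)  # [ 30201 302010]
```

The inner dimensions have to match, so `powers_of_ten @ data` would fail here: it pairs a dimension of length 2 with one of length 3.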

## Making arrays from nested lists

In [41]:
data = np.array([[1, 2], [3, 4]])

## Making arrays from shape tuples

The argument to these functions is a tuple

### `zeros`, `ones`, `full`

In [42]:
np.zeros((2, 4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [43]:
np.ones((3, 5))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [44]:
np.full((2, 4), 3)

array([[3, 3, 3, 3],
       [3, 3, 3, 3]])
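The outputs above mix floats (`0.`, `1.`) and integers (`3`). As a brief sketch of why: `zeros` and `ones` default to `float64`, `full` takes its data type from the fill value, and all three accept a `dtype` argument.

```python
import numpy as np

print(np.zeros((2, 4)).dtype)  # float64
print(np.ones((3, 5)).dtype)   # float64

# full infers the dtype from the fill value
print(np.issubdtype(np.full((2, 4), 3).dtype, np.integer))  # True
print(np.full((2, 4), 3.0).dtype)                           # float64

# the dtype can be set explicitly
print(np.zeros((2, 4), dtype=np.int32).dtype)  # int32
```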

### `zeros_like`, `ones_like`, `full_like`

Similar to counterparts above, except their shape is defined by another array:

In [45]:
parent = np.arange(10).reshape(2, 5)
parent

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [46]:
np.zeros_like(parent)

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

In [47]:
np.ones_like(parent)

array([[1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]])

In [48]:
np.full_like(parent, 3)

array([[3, 3, 3, 3, 3],
       [3, 3, 3, 3, 3]])

### `empty`

Similar to `zeros`, except the array is filled with garbage from RAM 
- this is a bit quicker than `zeros`

In [49]:
d = np.empty(4)
for e in range(4):
    d[e] = e

d

array([0., 1., 2., 3.])

### `eye`

Identity matrix:

In [50]:
np.eye(2)

array([[1., 0.],
       [0., 1.]])

In [51]:
np.eye(6)

array([[1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 1.]])

The linear algebra version of a 1

In [52]:
d = np.arange(4).reshape(2, 2)
d

array([[0, 1],
       [2, 3]])

Taking the dot product with the identity matrix is like multiplying by 1:

In [53]:
np.dot(np.eye(2), d)

array([[0., 1.],
       [2., 3.]])
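The same check can be sketched for a non-square matrix, where the size of the identity depends on which side it multiplies from. The values are unchanged in both cases, although the result becomes float because `np.eye` returns floats by default.

```python
import numpy as np

d = np.arange(6).reshape(2, 3)

left = np.eye(2) @ d   # identity sized to the number of rows
right = d @ np.eye(3)  # identity sized to the number of columns

print(np.array_equal(left, d))   # True
print(np.array_equal(right, d))  # True
print(left.dtype)                # float64
```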

## Matrix Practice

1. Write a function (using numpy) to sort a given two dimensional array along the first axis (rows), the second axis (columns), and as a flattened array.

Example:

In [54]:
# Expected Output:
# Original array:
np.array([[10, 40],
          [30, 20]])
# Sort the array along the first axis:
np.array([[10, 20],
          [30, 40]])
# Sort the array along the last axis:
np.array([[10, 40],
          [20, 30]])
# Sort the flattened array:
np.array([10, 20, 30, 40])

array([10, 20, 30, 40])

#### Answer:

In [55]:
# original array
a = np.array([[10, 40],
          [30, 20]])
print("Original Array:", a)

print("Sort the array along the first axis (rows):")
print(np.sort(a, axis=0))

print("Sort the array along the last axis (columns):")
print(np.sort(a, axis=1))

print("Sort the flattened array:")
print(np.sort(a, axis=None))

Original Array: [[10 40]
 [30 20]]
Sort the array along the first axis (rows):
[[10 20]
 [30 40]]
Sort the array along the last axis (columns):
[[10 40]
 [20 30]]
Sort the flattened array:
[10 20 30 40]


2. Write a function to get the indices of the sorted elements of a given array

Expected Output:

Original array:


`[1023 5202 6230 1671 1682 5241 4532]`


Indices of the sorted elements of a given array:


`[0 3 4 6 1 5 2]`

#### Answer

In [56]:
student_id = np.array([1023, 5202, 6230, 1671, 1682, 5241, 4532])
def id_sorter(id_array):
    print("Original Array:")
    print(id_array)
    # sort the array passed in, not the global student_id
    i = np.argsort(id_array)
    print("Indices of the sorted elements of student ID array:")
    print(i)
    return i

id_sorter(student_id)


Original Array:
[1023 5202 6230 1671 1682 5241 4532]
Indices of the sorted elements of student ID array:
[0 3 4 6 1 5 2]


array([0, 3, 4, 6, 1, 5, 2])
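A common reason to want the indices rather than the sorted values is to reorder a second array in the same way. In the sketch below the `scores` array is made up purely for illustration.

```python
import numpy as np

student_id = np.array([1023, 5202, 6230, 1671, 1682, 5241, 4532])
scores = np.array([71, 64, 88, 93, 55, 79, 60])  # hypothetical, one per student

order = np.argsort(student_id)

# indexing with the sort order gives the same result as np.sort
print(student_id[order])  # [1023 1671 1682 4532 5202 5241 6230]

# the same order keeps each score lined up with its student
print(scores[order])      # [71 93 55 60 64 79 88]
```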

3. Write a function to create a 5x5 array with random values and find the minimum and maximum values.

#### Answer

In [57]:
def rand5_min_max():
    x = np.random.random((5,5))
    print("Original Array:")
    print(x) 
    xmin, xmax = x.min(), x.max()
    print("Minimum and Maximum Values:")
    print(xmin, xmax)
    return xmin, xmax

rand5_min_max()

Original Array:
[[0.10692117 0.77249894 0.26497063 0.20244497 0.3627213 ]
 [0.68652303 0.15103275 0.05835538 0.74999069 0.98601996]
 [0.89658771 0.87459671 0.92502281 0.93525835 0.53746398]
 [0.83605902 0.69589308 0.92224564 0.81938786 0.96665166]
 [0.0116458  0.18240519 0.87470089 0.92294342 0.78609639]]
Minimum and Maximum Values:
0.011645796360233773 0.986019958230802


(0.011645796360233773, 0.986019958230802)