In [1]:
import numpy as np

# Matrix 

A matrix is a grid of numbers arranged into m rows and n columns, a bit like an Excel spreadsheet or a pandas DataFrame.

$$ A = 
\begin{bmatrix}
 1 & 2 \\ 
 3 & 4 \\ 
 5 & 6 
\end{bmatrix} $$


$ A $ is a matrix with 3 rows and 2 columns, or a 3 by 2 matrix. The items inside a matrix are called elements, so each element in $ A $ is a number. In numpy there are numerous ways to create matrices; some are demonstrated below.


In [2]:
np.random.random((3,3))

array([[ 0.8296749 ,  0.27116018,  0.27894468],
       [ 0.611523  ,  0.69195817,  0.88242514],
       [ 0.02233499,  0.77758855,  0.08718171]])

In [3]:
np.zeros((3,4))

array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

In [4]:
np.ones((2,10))

array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.]])

In [5]:
np.eye(5) # identity matrix: ones on the diagonal, zeros elsewhere

array([[ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  1.]])

In [6]:
np.array([[1402,191],[1371,821],[949,1437]])

array([[1402,  191],
       [1371,  821],
       [ 949, 1437]])

## Matrix indexing

We can access individual elements using the syntax

```
Matrix[row,col]
```

In [7]:
A = np.array([[1402,191],[1371,821],[949,1437]])
row = 0
col = 0
A[row,col]

1402

The normal Python slicing rules apply, so we can index a matrix using

```python
Matrix[start:stop:step,start:stop:step]
```

In [8]:
#A[0::2] # alternative syntax
A[0::2,:] # rows at index 0 and 2, i.e. the 1st and 3rd rows

array([[1402,  191],
       [ 949, 1437]])

In [9]:
A[1::2,:] # rows at index 1, 3, ... (here only index 1, the 2nd row)

array([[1371,  821]])

In [10]:
A[0::2,0::2] # rows at index 0 and 2, column at index 0

array([[1402],
       [ 949]])
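
As a small additional sketch, the same slicing syntax can also pull out a whole row or a whole column. The names `first_row` and `last_col` are made up for this illustration.

```python
import numpy as np

A = np.array([[1402, 191], [1371, 821], [949, 1437]])

first_row = A[0, :]   # every column of the row at index 0
last_col = A[:, -1]   # every row of the last column (negative indices count from the end)

print(first_row.tolist())  # [1402, 191]
print(last_col.tolist())   # [191, 821, 1437]
```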

## Matrix addition

When you add two matrices, the elements are added element-wise. The same rule applies to subtraction.

$$ \begin{bmatrix}
 1 & 2 \\ 
 3 & 4 \\ 
 5 & 6 
\end{bmatrix} +
 \begin{bmatrix}
 1 & 1 \\ 
 1 & 1 \\ 
 1 & 1 
\end{bmatrix} =
 \begin{bmatrix}
 1 + 1 & 2 + 1 \\ 
 3 + 1 & 4 + 1 \\ 
 5 + 1 & 6 + 1 
\end{bmatrix} $$
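
The sum above can be reproduced in numpy. This is only a sketch; the names `ones`, `total` and `difference` are chosen for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
ones = np.ones((3, 2), dtype=int)  # same shape as A

total = A + ones       # every element of A is increased by 1
difference = A - ones  # subtraction is element-wise as well

print(total.tolist())       # [[2, 3], [4, 5], [6, 7]]
print(difference.tolist())  # [[0, 1], [2, 3], [4, 5]]
```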

In traditional maths only matrices of the same size can be added or subtracted. However, in numpy these rules are relaxed by something called broadcasting, which we'll cover later.

In [11]:
np.ones((5,5)) + np.ones((5,1)) 

array([[ 2.,  2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.,  2.]])

## Multiplication 

Matrix multiplication is slightly more complicated than addition and subtraction. I'll start by explaining when it's valid to multiply two matrices together, and then show how it is done.

### Inside outside rule

Matrix multiplication has different rules to addition and subtraction. The matrices must have compatible dimensions if we wish to multiply them together. Say we have a matrix $ A $ which is **m by n** and a matrix $ B $ which is **n by p**. The two matrices can only be multiplied if the number of columns of $ A $ equals the number of rows of $ B $ (the two inside dimensions, both n here). The resulting product will be **m by p** (the two outside dimensions). This is illustrated by the picture below.



![inny outey rule](https://www.freemathhelp.com/images/lessons/mat11.png)

We can express the above matrix multiplication in numpy like so.

In [12]:
X = np.array([[1,2,3],[4,5,6]])
print (X)
X.shape

[[1 2 3]
 [4 5 6]]


(2, 3)

In [13]:
y = np.array([9,8,7])
y = np.expand_dims(y,1)
print(y)
y.shape

[[9]
 [8]
 [7]]


(3, 1)

In [14]:
X@y

array([[ 46],
       [118]])
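
To see the rule fail as well as succeed, here is a minimal sketch that reuses the shapes from the cells above. Swapping the operands makes the inside dimensions disagree, and numpy raises a `ValueError`. The flag `mismatch_raised` is a name invented for this example.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
y = np.array([[9], [8], [7]])         # shape (3, 1)

product = X @ y  # inside dimensions 3 and 3 match, result is (2, 1)

try:
    y @ X        # (3, 1) @ (2, 3): inside dimensions 1 and 2 differ
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(product.shape)    # (2, 1)
print(mismatch_raised)  # True
```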

### The calculation

But how is the answer calculated? We can think of the calculation in terms of dot products. We already know the shape of the resulting matrix from the inside outside rule. To fill position $ (1,1) $ in the new matrix, we take the dot product of row 1 of the first matrix and column 1 of the second matrix.

![matrix mul](https://www.mathsisfun.com/algebra/images/matrix-multiply-a.svg)

The next position we want to calculate is $ (1,2) $, so we take the dot product of row 1 of the first matrix and column 2 of the second.

![](https://www.mathsisfun.com/algebra/images/matrix-multiply-b.svg)
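
The row-by-column procedure can be written out as a pair of loops. This is a sketch for illustration only: the matrices `A` and `B` are example values chosen here, and in practice you would just use `A @ B`. Note that numpy indices start at 0, so position $ (1,1) $ in the text is `C[0, 0]` in code.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])       # 2 by 3
B = np.array([[7, 8], [9, 10], [11, 12]])  # 3 by 2

m, n = A.shape
p = B.shape[1]
C = np.zeros((m, p), dtype=int)  # the result is m by p

for i in range(m):
    for j in range(p):
        # entry (i, j) is the dot product of row i of A and column j of B
        C[i, j] = np.dot(A[i, :], B[:, j])

print(C.tolist())                # [[58, 64], [139, 154]]
print(np.array_equal(C, A @ B))  # True
```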

## Reshaping

As we have seen above, if matrices are not of the correct shape we can't perform operations on them. Often in deep learning we need to reshape matrices in order to build our networks.

In [15]:
A = np.arange(1,10)
A

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]:
B = A.reshape((3,3))
B

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

We can get the shape of a matrix like so.

In [24]:
A.shape

(9,)

In [25]:
B.shape

(3, 3)
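
A reshape only works when the new shape holds exactly the same number of elements. The sketch below shows this, along with the `-1` placeholder that asks numpy to infer one dimension. The flag `reshape_failed` is a name made up for the example.

```python
import numpy as np

A = np.arange(1, 10)    # 9 elements

B = A.reshape((3, -1))  # -1 lets numpy work out the missing dimension (3 here)

try:
    A.reshape((2, 5))   # 2 * 5 = 10 slots, but A only has 9 elements
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(B.shape)         # (3, 3)
print(reshape_failed)  # True
```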

## Order matters

The order of matrix multiplication matters. That means that, in general,

$$ AB \neq BA $$ 



In [26]:
A = np.arange(1,10).reshape((3,3))
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [27]:
B = np.arange(10,19).reshape((3,3))
B

array([[10, 11, 12],
       [13, 14, 15],
       [16, 17, 18]])

In [28]:
# np.matmul(A,B)
A @ B

array([[ 84,  90,  96],
       [201, 216, 231],
       [318, 342, 366]])

In [29]:
B @ A

array([[138, 171, 204],
       [174, 216, 258],
       [210, 261, 312]])

In [30]:
A @ B  == B @ A # check equality element-wise

array([[False, False, False],
       [False,  True, False],
       [False, False, False]], dtype=bool)

In [31]:
(A @ B  == B @ A).all() #Check if all of them are true

False

## Transpose

The transpose switches the rows and columns of a matrix. A transpose can be thought of as flipping a matrix over its diagonal. The diagonal starts in the top left-hand corner and runs towards the bottom right-hand corner. In maths the transpose is indicated by a T in the top-right corner of the matrix.

$$ \begin{bmatrix}
 1 & 2 & 3 \\ 
 4 & 5 & 6
\end{bmatrix}^T =
\begin{bmatrix}
1 & 4\\ 
2 & 5\\ 
3 & 6
\end{bmatrix} $$

In [None]:
A = np.array([[1,2,3],[4,5,6],[7,8,9],[1,5,6]])
A

In [None]:
A.T
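
The two cells above have no recorded output, so here is a small sketch of what the transpose does to the shape and to individual elements. The variable `At` is a name introduced for this example.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 5, 6]])
At = A.T

print(A.shape)   # (4, 3)
print(At.shape)  # (3, 4)

# the element at (row i, column j) of A sits at (row j, column i) of A.T
print(A[3, 1] == At[1, 3])       # True
# transposing twice gives back the original matrix
print(np.array_equal(At.T, A))   # True
```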

## Broadcasting

In numpy the rules for addition, subtraction and multiplication are relaxed. Broadcasting allows us to take the Hadamard product (element-wise multiplication) of a matrix and a vector, even though their shapes are not the same. Numpy stretches the smaller array across the larger one so that the shapes match.



$$ \begin{bmatrix}
a_1  \\ 
a_2   
\end{bmatrix} \cdot
\begin{bmatrix}
b_1 & b_2  \\ 
b_3 & b_4   
\end{bmatrix} =
\begin{bmatrix}
a_1 \cdot b_1 & a_1 \cdot b_2  \\ 
a_2 \cdot b_3 &  a_2 \cdot b_4   
\end{bmatrix} $$

In [None]:
A = np.array([[1],
              [2]])   # column vector of shape (2, 1), as in the equation above
B = np.array([[2,3],
              [4,5]])
np.multiply(A,B)      # row i of B is multiplied by A[i]

The same rules apply to addition.

In [None]:
A = np.array([0,1])
B = np.array([[1,2],
              [3,4]])
A + B
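
The direction in which a vector is stretched depends on its shape. As a sketch (the names `row`, `col`, `by_row` and `by_col` are invented here), a 1-D array of shape `(2,)` is treated as a row and repeated down the rows, while a `(2, 1)` column vector is repeated across the columns:

```python
import numpy as np

B = np.array([[2, 3],
              [4, 5]])

row = np.array([1, 2])      # shape (2,): lined up with the columns of B
col = np.array([[1], [2]])  # shape (2, 1): lined up with the rows of B

by_row = row * B  # column j of B is multiplied by row[j]
by_col = col * B  # row i of B is multiplied by col[i]

print(by_row.tolist())  # [[2, 6], [4, 10]]
print(by_col.tolist())  # [[2, 3], [8, 10]]
```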

## Questions


### 1.

Define a function that returns True or False depending on whether two matrices can be multiplied together.


In [33]:
def can_be_multiply(A,B):
    # valid when the columns of A match the rows of B
    return A.shape[1] == B.shape[0]

In [36]:
A = np.array([[1,2],[3,4],[5,6]])
B = np.array([[2,3,4],[7,8,9]])
print (A.shape,B.shape)

(3, 2) (2, 3)


In [37]:
can_be_multiply(A,B)

True

### 2.

Define a function that, given two matrices, returns the shape of their product.

In [38]:
def return_shape(A,B):
    return A.shape[0],B.shape[1]

In [39]:
return_shape(A,B)

(3, 3)
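
One possible refinement, sketched here under the assumption that both inputs are 2-D arrays, is to combine the two answers so that the shape is only returned when the multiplication is valid. `product_shape` is a hypothetical helper name, not part of numpy.

```python
import numpy as np

def product_shape(A, B):
    """Return the shape of A @ B for 2-D arrays, or raise if they cannot be multiplied."""
    if A.shape[1] != B.shape[0]:
        raise ValueError(f"inner dimensions differ: {A.shape[1]} vs {B.shape[0]}")
    return A.shape[0], B.shape[1]

A = np.array([[1, 2], [3, 4], [5, 6]])  # shape (3, 2)
B = np.array([[2, 3, 4], [7, 8, 9]])    # shape (2, 3)

shape = product_shape(A, B)
print(shape)                     # (3, 3)
print(shape == (A @ B).shape)    # True
```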

### 3.

Create a 3 by 3 matrix with values ranging from 1 to 9.

In [60]:
A = np.arange(1,10).reshape(3,3)
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

### 4.

Create a 3 by 3 by 3 matrix with random values

In [71]:
np.arange(1,10).reshape(3,3)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [70]:
np.random.randint(1,100,9).reshape(3,3)

array([[54,  1, 31],
       [74, 54, 12],
       [65,  5, 66]])
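
The two cells above produce 3 by 3 matrices. For the 3 by 3 by 3 case that the question asks for, a minimal sketch is to pass a three-element shape to the random functions. The names `T` and `T_int` are chosen for illustration, and the values will differ on every run.

```python
import numpy as np

T = np.random.random((3, 3, 3))               # floats in the interval [0, 1)
T_int = np.random.randint(1, 100, (3, 3, 3))  # integers from 1 to 99

print(T.shape)      # (3, 3, 3)
print(T_int.shape)  # (3, 3, 3)
```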

### 5.

Reshape the following matrices $ A $ and $ B $ so that they can be multiplied together.

In [3]:
A = np.ones((3,4))
B = np.random.random((6,2))
print (A,A.shape)
print (B,B.shape)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]] (3, 4)
[[0.53195344 0.54501929]
 [0.73991632 0.86253498]
 [0.57100699 0.46409718]
 [0.10707019 0.81734006]
 [0.71987676 0.53955744]
 [0.36088275 0.03921437]] (6, 2)


In [5]:
A @ B.reshape(4,3)

array([[2.04111605, 2.29424909, 1.96310464],
       [2.04111605, 2.29424909, 1.96310464],
       [2.04111605, 2.29424909, 1.96310464]])

### 6.

Using C and D (defined below), perform the following operation: $ C \cdot D^T $

In [12]:
C = np.random.randint(1,10,3)
D = np.random.randint(1,10,3)
print (C,D)

[9 2 9] [6 9 7]


In [14]:
np.dot(C,D.T) # C and D are 1-D, so .T has no effect and this is the inner product

135
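
Because `.T` does nothing to a 1-D array, the result of $ C \cdot D^T $ depends on whether the vectors are treated as rows or columns. The sketch below makes the shapes explicit with `reshape`, using the values printed above. All variable names other than `C` and `D` are introduced for this example.

```python
import numpy as np

C = np.array([9, 2, 9])
D = np.array([6, 9, 7])

unchanged = np.array_equal(D, D.T)  # True: transposing a 1-D array is a no-op

# as row vectors: (1, 3) @ (3, 1) -> (1, 1), the inner product
C_row, D_row = C.reshape(1, 3), D.reshape(1, 3)
inner = C_row @ D_row.T

# as column vectors: (3, 1) @ (1, 3) -> (3, 3), the outer product
C_col, D_col = C.reshape(3, 1), D.reshape(3, 1)
outer = C_col @ D_col.T

print(inner.tolist())  # [[135]]
print(outer.shape)     # (3, 3)
```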

### 7. 
Given the 3 matrices below, put them in an order in which they can be multiplied together.

In [16]:
X1 = np.ones((4,3))
X2 = np.ones((5,4)) + 2
X3 = np.ones((3,5)) + 1

In [21]:
X2 @ (X1 @ X3)   # (5,4) @ ((4,3) @ (3,5) -> (4,5)) -> (5,5)

array([[72., 72., 72., 72., 72.],
       [72., 72., 72., 72., 72.],
       [72., 72., 72., 72., 72.],
       [72., 72., 72., 72., 72.],
       [72., 72., 72., 72., 72.]])
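
The order above is not the only one that works. As a sketch, we can try every ordering and keep those where each pair of neighbours satisfies the inside outside rule. The dictionary `matrices` and the helper `chain_is_valid` are names made up for this example.

```python
import numpy as np
from itertools import permutations

matrices = {
    "X1": np.ones((4, 3)),
    "X2": np.ones((5, 4)) + 2,
    "X3": np.ones((3, 5)) + 1,
}

def chain_is_valid(shapes):
    """True if every neighbouring pair has matching inside dimensions."""
    return all(left[1] == right[0] for left, right in zip(shapes, shapes[1:]))

valid_orders = [
    order
    for order in permutations(matrices)
    if chain_is_valid([matrices[name].shape for name in order])
]

print(valid_orders)
# [('X1', 'X3', 'X2'), ('X2', 'X1', 'X3'), ('X3', 'X2', 'X1')]
```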

### 8.

How would you create an 8 by 8 matrix with a checkerboard pattern? *Hint:* `[::2]` indexing will help.

In [49]:
A = np.zeros((8,8))
A

array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

In [52]:
A[::2,::2] = 1
A[1::2,1::2] = 1

In [53]:
A

array([[ 1.,  0.,  1.,  0.,  1.,  0.,  1.,  0.],
       [ 0.,  1.,  0.,  1.,  0.,  1.,  0.,  1.],
       [ 1.,  0.,  1.,  0.,  1.,  0.,  1.,  0.],
       [ 0.,  1.,  0.,  1.,  0.,  1.,  0.,  1.],
       [ 1.,  0.,  1.,  0.,  1.,  0.,  1.,  0.],
       [ 0.,  1.,  0.,  1.,  0.,  1.,  0.,  1.],
       [ 1.,  0.,  1.,  0.,  1.,  0.,  1.,  0.],
       [ 0.,  1.,  0.,  1.,  0.,  1.,  0.,  1.]])
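
An alternative sketch, not using the slicing hint, builds the same pattern from the row and column indices: a cell holds 1 when its row index plus its column index is even. The names `rows`, `cols` and `board` are chosen for illustration.

```python
import numpy as np

rows, cols = np.indices((8, 8))  # rows[i, j] == i and cols[i, j] == j
board = (rows + cols + 1) % 2    # 1 where i + j is even, 0 where it is odd

print(board[0].tolist())  # [1, 0, 1, 0, 1, 0, 1, 0]
print(board[1].tolist())  # [0, 1, 0, 1, 0, 1, 0, 1]
```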

### 9.

Can you multiply the two matrices, $ A $ and $ B $, without using numpy?

In [None]:
A=[[1,2],
   [3,4]]

B=[[3,4],
   [5,6]]
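
One possible answer, sketched in plain Python with matrices stored as lists of rows. The function name `matmul` is our own choice; it applies the row-by-column dot product described in the multiplication section.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    cols_of_B = list(zip(*B))  # zip(*B) yields the columns of B
    # each entry is the dot product of a row of A and a column of B
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_of_B]
            for row in A]

A = [[1, 2],
     [3, 4]]
B = [[3, 4],
     [5, 6]]

C = matmul(A, B)
print(C)  # [[13, 16], [29, 36]]
```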

### 10. 

How would you take the transpose of a matrix without using numpy?
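
A short sketch of one way to do it in plain Python: `zip(*M)` groups the elements of each column together, so turning each group into a list gives the rows of the transpose. The function name `transpose` and the matrix `M` are chosen for this example.

```python
def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

M = [[1, 2, 3],
     [4, 5, 6]]

Mt = transpose(M)
print(Mt)  # [[1, 4], [2, 5], [3, 6]]
```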

## Additional resources

Below is a list of useful resources:

* [SciPy numpy talk](https://www.youtube.com/watch?v=lKcwuPnSHIQ)
* [Matrix multiplication](https://www.freemathhelp.com/matrix-multiplication.html)
* [Maths needed for deep learning](https://www.quora.com/What-mathematical-background-does-one-need-for-learning-Deep-Learning)
* [Linear algebra for deep learning](http://www.deeplearningbook.org/contents/linear_algebra.html)
* [Great youtube playlist on linear algebra](https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw)
