# Matrices

A matrix is a collection of numbers ordered in rows and columns.

Each value in these rows and columns is an element of the matrix. 

You'd normally denote a matrix with a capital letter, such as A. If A had 2 rows and 3 columns, it would be known as a 2-by-3 matrix.

e.g. $\begin{bmatrix}
1 & 2 & 3\\
a & b & c
\end{bmatrix}$

Matrices look similar to tables and spreadsheets, but they support richer mathematical operations.

Two matrices can be added, subtracted and multiplied.

A matrix can contain only numbers, symbols or expressions.
Numbers: 1, 2. Symbols: a, b, x. Expressions: x+2, y*4.

$A_{ij}$ is the element of matrix A at row i and column j.
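In code, the same lookup can be sketched with NumPy indexing. The array `A` and its values here are made up for illustration. Note that NumPy counts rows and columns from 0, whereas the mathematical notation usually counts from 1.

```python
import numpy as np

# Illustrative 2-by-3 matrix (values chosen for this example only)
A = np.array([[1, 2, 3],
              [4, 5, 6]])

# Mathematical A_{12} (row 1, column 2) is A[0, 1] with 0-based indexing
a_12 = A[0, 1]
# Mathematical A_{23} (row 2, column 3) is A[1, 2]
a_23 = A[1, 2]
print(a_12, a_23)
```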




## Scalars and vectors

Matrices with 1 row and 1 column are common. Such a matrix is called a scalar. All numbers we know from algebra are referred to as scalars in linear algebra, e.g. [15], [1], [-2]. Scalars have 0 dimensions but have important properties.

Vectors are common objects in linear algebra. They sit in between scalars and matrices, as they have a single dimension, 1 column or 1 row. A vector is practically the simplest linear algebraic object.
e.g. $\begin{bmatrix}
1\\2\\3
\end{bmatrix}$. It is more common to view a matrix as a collection of vectors.


Generally there are two types of vectors, row vectors and column vectors. 

Length = the number of elements in a vector.

In general, a vector's length is denoted by m, e.g. $\begin{bmatrix}
x_1 & x_2 & \dots & x_m
\end{bmatrix}$

To summarise, matrices have 2 dimensions, vectors have 1 dimension and scalars have none.

## Linear algebra and geometry

A scalar has no dimensions, much like a geometric point. It has no direction and no size.

A vector has one dimension, so it is like a line. It has a direction and can be oriented in various ways. To plot it on a 2D plane, we need a second line that crosses it, because 2D space is defined by two lines. Two lines means two vectors, and two vectors together form a matrix. This means that any 2D space can be defined by a matrix.

This idea is simple, but very powerful.

$\begin{bmatrix}
1 & 0\\
0 & 1
\end{bmatrix}$ is essentially a matrix of an x and y axis. 
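As a rough sketch of that idea, the columns of this matrix can be read as the x and y axis directions, and any point on the plane is a combination of the two. The names `x_axis`, `y_axis` and `point` are just for this example.

```python
import numpy as np

I = np.array([[1, 0],
              [0, 1]])

x_axis = I[:, 0]  # first column: unit vector along x
y_axis = I[:, 1]  # second column: unit vector along y

# The point (3, 2) is 3 steps along x plus 2 steps along y
point = 3 * x_axis + 2 * y_axis
print(point)
```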



## Arrays in python

### Import the relevant libraries

In [1]:
import numpy as np

### Declaring scalars, vectors and matrices

#### Scalars

In [2]:
s = 5

In [3]:
s

5

#### Vectors

In [4]:
v = np.array([5,-2,4]) 

In [5]:
v

array([ 5, -2,  4])

#### Matrices

In [7]:
m = np.array([[5,12,6],[-3,0,14]])

In [8]:
m

array([[ 5, 12,  6],
       [-3,  0, 14]])

### Data Types

In [9]:
type(s)

int

In [12]:
type(v) # n-dimensional array (1d)

numpy.ndarray

In [13]:
type(m) # n-dimensional array (2d)

numpy.ndarray

In [17]:
s_array = np.array(5) # convert integer to an array
s_array

array(5)

In [18]:
type(s_array)

numpy.ndarray

### Data Shapes

In [19]:
m.shape

(2, 3)

In [20]:
v.shape

(3,)

In [22]:
s_array.shape

()
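The dimension counts from earlier (0 for scalars, 1 for vectors, 2 for matrices) can also be read from the `ndim` attribute, which is simply the length of the shape tuple. A small sketch that redefines the arrays so it runs on its own:

```python
import numpy as np

s_array = np.array(5)
v = np.array([5, -2, 4])
m = np.array([[5, 12, 6], [-3, 0, 14]])

# ndim is the number of axes of each array
dims = (s_array.ndim, v.ndim, m.ndim)
print(dims)
```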

### Reshape


In [23]:
v.reshape(1,3)

array([[ 5, -2,  4]])

In [24]:
v.reshape(3,1)

array([[ 5],
       [-2],
       [ 4]])

## Tensors

What is a tensor?

A scalar is a vector of length 1 or a 1x1 matrix.

A vector is an mx1 or 1xm matrix of scalars.

A matrix is a collection of vectors. mxn. (Also a collection of scalars.)

A tensor is the most general concept of what we have seen so far. Scalars, vectors and matrices are all tensors of rank 0, 1 and 2 respectively. 

We haven't seen a tensor of rank 3 yet. Its dimensions are k x m x n. It can be thought of as a collection of matrices.




### Creating a tensor

In [25]:
m1 = np.array([[5,12,6],[-3,0,14]]) 
m1

array([[ 5, 12,  6],
       [-3,  0, 14]])

In [45]:
m2 = np.array([[9,8,7],[1,3,-5]])
m2

array([[ 9,  8,  7],
       [ 1,  3, -5]])

Tensors can be stored in ndarrays, and that's how we often deal with them.

In [46]:
t = np.array([m1,m2])
t

array([[[ 5, 12,  6],
        [-3,  0, 14]],

       [[ 9,  8,  7],
        [ 1,  3, -5]]])

### Checking its shape

In [47]:
t.shape

(2, 2, 3)

### Manually creating a tensor

In [48]:
t_manual = np.array([[[5,12,6],[-3,0,14]],[[9,8,7],[1,3,-5]]])
t_manual

array([[[ 5, 12,  6],
        [-3,  0, 14]],

       [[ 9,  8,  7],
        [ 1,  3, -5]]])

Usually you would load, pre-process and transform the data to get tensors, but this manual method is useful for background.
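One more way to build the same tensor, sketched here as an alternative rather than the required approach, is `np.stack`, which joins equally shaped arrays along a new leading axis. Indexing that first axis then gives back one of the matrices, which matches the "collection of matrices" picture.

```python
import numpy as np

m1 = np.array([[5, 12, 6], [-3, 0, 14]])
m2 = np.array([[9, 8, 7], [1, 3, -5]])

# Stack the two 2x3 matrices into one 2x2x3 tensor
t = np.stack([m1, m2])
print(t.shape, t.ndim)

# The first index selects a matrix, the remaining two select row and column
first = t[0]
element = t[1, 0, 2]  # matrix 2, row 1, column 3 (0-based indices 1, 0, 2)
print(element)
```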

## Adding and subtracting matrices

#### Adding

Addition is very easy. It only has one condition: The two matrices must have the same dimensions.

$ M_1 \; (2\times3) = \begin{bmatrix}5 & 12 & 6\\-3 & 0 & 14\end{bmatrix}  \hspace{1cm}   M_2 \; (2\times3) = \begin{bmatrix}9 & 8 & 7\\1 & 3 & -5\end{bmatrix}$

To add the two:

$ 
\begin{bmatrix}5 & 12 & 6\\-3 & 0 & 14\end{bmatrix}  
\hspace{1cm} + \hspace{1cm}
\begin{bmatrix}9 & 8 & 7\\1 & 3 & -5\end{bmatrix} 
\hspace{1cm} = \hspace{1cm}
\begin{bmatrix}5+9 & 12+8 & 6+7\\-3+1 & 0+3 & 14-5\end{bmatrix}$

Resulting in:


$\begin{bmatrix}14 & 20 & 13\\-2 & 3 & 9\end{bmatrix}$

To do it in Python, declare both matrices and use the plus operator:

In [49]:
m1 + m2

array([[14, 20, 13],
       [-2,  3,  9]])

#### Subtracting

$ 
\begin{bmatrix}5 & 12 & 6\\-3 & 0 & 14\end{bmatrix}  
\hspace{1cm} - \hspace{1cm}
\begin{bmatrix}9 & 8 & 7\\1 & 3 & -5\end{bmatrix} 
\hspace{1cm} = \hspace{1cm}
\begin{bmatrix}5-9 & 12-8 & 6-7\\-3-1 & 0-3 & 14+5\end{bmatrix}$

Resulting in:

$\begin{bmatrix}-4 & 4 & -1\\-4 & -3 & 19\end{bmatrix}$

In [50]:
m1 - m2

array([[-4,  4, -1],
       [-4, -3, 19]])

This works with floats as well as integers.
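A quick sketch with floats, using made-up matrices `f1` and `f2`. When an integer array is combined with a float array, NumPy returns a float result.

```python
import numpy as np

f1 = np.array([[0.5, 1.5], [2.0, -1.0]])
f2 = np.array([[1.5, 0.5], [1.0, 1.0]])

total = f1 + f2
print(total)

# Mixing an integer matrix with a float matrix gives a float matrix
mixed = np.array([[1, 2], [3, 4]]) + f1
print(mixed.dtype)
```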

### Adding vectors together

In [51]:
v1 = np.array([1,2,3,4,5])
v2 = np.array([5,4,3,2,1])

In [52]:
v1 + v2

array([6, 6, 6, 6, 6])

In [53]:
v1 - v2

array([-4, -2,  0,  2,  4])

## Errors when adding scalars, vectors and matrices

### Import the relevant libraries

In [54]:
import numpy as np

### Addition

#### Addition of scalars

In [56]:
5 + 5

10

In [57]:
10 - 4

6

#### Addition of matrices

Shapes must match when adding matrices.

In [58]:
m1 = np.array([[5,12,6],[-3,0,14]])
m1

array([[ 5, 12,  6],
       [-3,  0, 14]])

In [60]:
m3 = np.array([[5,3],[-2,4]])
m3

array([[ 5,  3],
       [-2,  4]])

In [61]:
m1 + m3

ValueError: operands could not be broadcast together with shapes (2,3) (2,2) 

#### Addition of vectors

In [62]:
v1 = np.array([1,2,3,4,5])
v1

array([1, 2, 3, 4, 5])

In [63]:
v3 = np.array([1,2,3])
v3

array([1, 2, 3])

In [64]:
v1 + v3

ValueError: operands could not be broadcast together with shapes (5,) (3,) 

#### Exceptions (addition with a scalar)

In [65]:
m1

array([[ 5, 12,  6],
       [-3,  0, 14]])

In [66]:
m1 + 1

array([[ 6, 13,  7],
       [-2,  1, 15]])

Each element was increased by 1

In [67]:
v1 

array([1, 2, 3, 4, 5])

In [68]:
v1 + 1

array([2, 3, 4, 5, 6])

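Mathematically, adding a scalar to a matrix or a vector is not defined, but in programming (Python with NumPy) it is allowed: the scalar is added to every element. NumPy calls this broadcasting. This peculiarity is important.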

This result has a meaning in terms of arrays, but not in terms of linear algebra.
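One way to picture what happens, as a sketch: NumPy behaves as if the scalar were stretched into an array of the same shape before adding. The names `broadcast_sum` and `explicit_sum` are only for this example.

```python
import numpy as np

m1 = np.array([[5, 12, 6], [-3, 0, 14]])

# Adding the scalar directly
broadcast_sum = m1 + 1

# Adding a matrix of ones with the same shape gives the same result
explicit_sum = m1 + np.ones_like(m1)

print(np.array_equal(broadcast_sum, explicit_sum))
```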

## Transpose of a matrix

Transposing a column vector $x = \begin{bmatrix}1\\2\\3\end{bmatrix}$ would result in a row vector $y =\begin{bmatrix}1&2&3\end{bmatrix}$.

The notation for this is a superscript T.

$x^T =  \begin{bmatrix}1&2&3\end{bmatrix}$

$\begin{bmatrix}1\\2\\3\end{bmatrix}^T =  \begin{bmatrix}1&2&3\end{bmatrix}$

When we transpose a vector, we are not losing any information.

Transposing the same vector twice yields the same initial vector.

A 3x1 matrix transposed is a 1x3 matrix.
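These properties can be checked with a small sketch. The column vector is written here as an explicit 3x1 array, which is an assumption of this example, so that the transpose visibly changes its shape.

```python
import numpy as np

x = np.array([[1], [2], [3]])   # 3x1 column vector
x_t = x.T                       # 1x3 row vector
print(x.shape, x_t.shape)

# Transposing twice recovers the original, so no information is lost
print(np.array_equal(x_t.T, x))
```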

### Import the relevant libraries

In [69]:
import numpy as np

### Transposing matrices

In [70]:
A = np.array([[5,12,6],[-3,0,14]])
A

array([[ 5, 12,  6],
       [-3,  0, 14]])

In [71]:
A.T

array([[ 5, -3],
       [12,  0],
       [ 6, 14]])

In [72]:
B = np.array([[5,3],[-2,4]])
B

array([[ 5,  3],
       [-2,  4]])

In [73]:
B.T

array([[ 5, -2],
       [ 3,  4]])

In [74]:
C = np.array([[4,-5],[8,12],[-2,-3],[19,0]])
C

array([[ 4, -5],
       [ 8, 12],
       [-2, -3],
       [19,  0]])

In [75]:
C.T

array([[ 4,  8, -2, 19],
       [-5, 12, -3,  0]])

### Transposing scalars

In [76]:
s = np.array([5])

In [77]:
s.T

array([5])

### Transposing vectors

In [78]:
x = np.array([1,2,3])
x

array([1, 2, 3])

In [79]:
x.T

array([1, 2, 3])

In [80]:
x.shape

(3,)

In [82]:
x_reshaped = x.reshape(1,3)
x_reshaped

array([[1, 2, 3]])

In [83]:
x_reshaped.T

array([[1],
       [2],
       [3]])

## Dot product

### Scalar multiplication

$\begin{bmatrix}6\end{bmatrix} . \begin{bmatrix}5\end{bmatrix} = \begin{bmatrix}30\end{bmatrix}$ 

$\begin{bmatrix}10\end{bmatrix} . \begin{bmatrix}-2\end{bmatrix} = \begin{bmatrix}-20\end{bmatrix}$ 

### Vector multiplication

$\begin{bmatrix}2\\8\\-4\end{bmatrix} . 
\begin{bmatrix}1\\-7\\3\end{bmatrix} = 
[2\times1 + 8\times(-7) + (-4)\times3] = 
[-66]$ 


You can get two types of products.

Dot products are inner products. (heavily used in multiplying vectors or matrices)

Tensor products are outer products. (We aren't looking at these here.)

We multiplied two vectors and got a scalar, which is why we call it a scalar product. It is nothing more than the sum of the products of the corresponding elements.
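The "sum of the products of the corresponding elements" can be spelled out by hand and compared with NumPy's result. This is only a sketch to show the arithmetic; `manual` is a name made up for this example.

```python
import numpy as np

x = np.array([2, 8, -4])
y = np.array([1, -7, 3])

# Multiply corresponding elements, then add the results:
# 2*1 + 8*(-7) + (-4)*3
manual = sum(a * b for a, b in zip(x, y))
print(manual)

# The hand-written sum agrees with np.dot
print(manual == np.dot(x, y))
```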

### Dot product

In [90]:
x = np.array([2,8,-4])
y = np.array([1,-7,3])

In [91]:
np.dot(x,y)

-66

In [95]:
u = np.array([0,2,5,8])
v = np.array([20,3,4,-1])

In [96]:
np.dot(u,v)

18

### Scalar * Scalar

In [97]:
np.dot(5,6)

30

In [98]:
np.dot(10,-2)

-20

### Scalar * Vector

In [99]:
x

array([ 2,  8, -4])

In [100]:
5*x

array([ 10,  40, -20])

The initial vector has been scaled by a factor of 5 (hence the name scalar).

## Matrix multiplication

In [101]:
A = np.array([[5,12,6],[-3,0,14]])
A

array([[ 5, 12,  6],
       [-3,  0, 14]])

In [103]:
3*A

array([[15, 36, 18],
       [-9,  0, 42]])

To multiply matrices together, you can only multiply an MxN matrix with an NxK matrix, i.e. a 2x3 by a 3x1, 3x4, 3xk...

Output of MxN . NxK = MxK

e.g. 2x3 . 3x6 = 2x6 matrix
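The shape rule can be sketched with placeholder matrices of ones (`P` and `Q` are hypothetical names). When the inner dimensions do not match, `np.dot` raises a `ValueError`.

```python
import numpy as np

P = np.ones((2, 3))
Q = np.ones((3, 6))

# 2x3 . 3x6: the inner dimensions (3 and 3) match, giving a 2x6 result
product_shape = np.dot(P, Q).shape
print(product_shape)

# 3x6 . 2x3: the inner dimensions (6 and 2) differ, so this fails
try:
    np.dot(Q, P)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```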

$\begin{bmatrix}2\\8\\-4\end{bmatrix} \hspace{1cm}. \hspace{1cm}
\begin{bmatrix}1\\-7\\3\end{bmatrix} $

These two vectors are both 3x1, so their forms do not match for multiplication. One of them needs to be transposed.

$\begin{bmatrix}2&8&-4\end{bmatrix} \hspace{1cm}. \hspace{1cm}
\begin{bmatrix}1\\-7\\3\end{bmatrix}\hspace{1cm} =\hspace{1cm}
[-66]$

Now for a more complex example...

$\begin{bmatrix}5&12&6\\-3&0&14\end{bmatrix} \hspace{1cm}.\hspace{1cm}
\begin{bmatrix}2&-1\\8&0\\3&0\end{bmatrix}
\hspace{1cm} =\hspace{1cm}
\begin{bmatrix}(5\times2+12\times8+6\times3 = 124)&(5\times(-1)+12\times0+6\times0 = -5)\\(-3\times2+0\times8+14\times3 = 36)&(-3\times(-1)+0\times0+14\times0 = 3)\end{bmatrix}
\hspace{1cm} =\hspace{1cm}
\begin{bmatrix}124&-5\\36&3\end{bmatrix}$

Matrices are nothing more than a collection of vectors. When we have a dot product, we always multiply a row vector times a column vector.

In [107]:
B = np.array([[2,-1],[8,0],[3,0]])
B

array([[ 2, -1],
       [ 8,  0],
       [ 3,  0]])

In [108]:
np.dot(A,B)

array([[124,  -5],
       [ 36,   3]])
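To make the row-times-column idea concrete, here is a sketch that builds the product one entry at a time and compares it with NumPy's `@` operator, which performs matrix multiplication for 2D arrays. The explicit loops are for illustration only; in practice you would call `np.dot` or use `@` directly.

```python
import numpy as np

A = np.array([[5, 12, 6], [-3, 0, 14]])
B = np.array([[2, -1], [8, 0], [3, 0]])

# Entry (i, j) of the product is row i of A dotted with column j of B
C = np.zeros((A.shape[0], B.shape[1]), dtype=int)
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        C[i, j] = np.dot(A[i, :], B[:, j])

print(C)
print(np.array_equal(C, A @ B))
```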

## Why is linear algebra useful?

There are many applications in data science, several of high importance:

- Vectorising code (array programming)
- Image recognition
- Dimensionality reduction



### Vectorised code

In machine learning, you don't want to write for loops to run through multiple calculations, such as testing all possible x values in a regression.

You are better off using matrix multiplication.

For example, consider a list of inputs for x and the linear regression to get y

$x = [693,656,1060,487,1275] \\
y = 10190 + (223 \times x)$

You could run a for loop to calculate this:

In [114]:
x = [693,656,1060,487,1275]
for size in x:
    print(10190 + 223 * size)

164729
156478
246570
118791
294515


It is better to do this using matrix multiplication, as such:

$\begin{bmatrix}1&693\\1&656\\1&1060\\1&487\\1&1275\end{bmatrix}
.
\begin{bmatrix}10190\\223\end{bmatrix}
=
\begin{bmatrix}164729\\156478\\246570\\118791\\294515\end{bmatrix}
$

This is how many machine learning algorithms work.

We have an input matrix, a weights (coefficients) matrix and an outputs matrix.

Whenever we are using linear algebra to compute many values simultaneously, we call it array programming, or vectorising code. It is much faster.

NumPy is optimised for performing these kinds of operations.
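The matrix form of the regression can be sketched in NumPy as follows. The names `X` (inputs) and `w` (weights) are chosen for this example; the column of ones is what lets the intercept 10190 be part of the multiplication.

```python
import numpy as np

x = np.array([693, 656, 1060, 487, 1275])

# Input matrix: a column of ones (for the intercept) next to the x values
X = np.column_stack([np.ones_like(x), x])

# Weights: intercept first, then slope
w = np.array([10190, 223])

# One matrix-vector product replaces the for loop
y = np.dot(X, w)
print(y)
```

The result matches the values printed by the for loop, but all five predictions are computed in a single call.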

### Image recognition

Deep learning and deep neural networks have pushed forward great work in image recognition.

Convolutional neural networks (CNNs) are at the core of this. The idea is that you can take a photo, feed it to the algorithm and classify it.

Given a 400x400 pixel photo, each pixel is a single colour. For a black and white photo, there are 256 shades of gray. By the usual convention, 0 is totally black and 255 is totally white.

You can express this photo as a 400 x 400 matrix, where each element is a value from 0 to 255 giving the gray intensity of that pixel. That is how a computer sees a pixel.

For colour photos, we have the RGB scale, where each colour is decomposed into a combination of Red, Green and Blue, each with a value from 0 to 255. Instead of a single 400x400 matrix, we use one matrix per colour. We end up with a 3 x 400 x 400 tensor, containing 3 matrices, one for each colour.
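As a sketch of these shapes, here are placeholder images filled with zeros; a real photo would be loaded from a file instead. The channel-first layout (3 x 400 x 400) follows the description in these notes, though many image libraries store the colour axis last (400 x 400 x 3).

```python
import numpy as np

# uint8 holds exactly the values 0 to 255
gray = np.zeros((400, 400), dtype=np.uint8)
rgb = np.zeros((3, 400, 400), dtype=np.uint8)

# Set one pixel of the first (red) matrix to full intensity
rgb[0, 10, 20] = 255

print(gray.shape, rgb.shape)
print(rgb[0].shape)  # each colour is its own 400x400 matrix
```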



### Dimensionality reduction

This is where eigenvalues and eigenvectors come in, although they have not been covered yet.

Imagine we had a dataset with 3 variables. You could visualise it by doing an x,y,z scatter in 3 dimensions. If you were to look at and analyse those data points, you might be able to find a 2D plane in that space that was approximately a good fit for the data points (a bit like a regression line). That plane would be described by two variables, u and v. What you now have is a 2D approximation (u,v) of a 3D space (x,y,z).

Linear algebra gives us efficient ways to transform a problem set into fewer variables. This makes sense in the real world when some variables are broadly similar and can be combined.
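A rough sketch of the idea, under some loud assumptions: the data is synthetic (200 made-up points lying close to a plane), and the plane is found with the singular value decomposition from `np.linalg.svd` rather than with the eigenvalue method these notes refer to. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two hidden variables (u, v) plus a little noise
u = rng.normal(size=200)
v = rng.normal(size=200)
noise = 0.01 * rng.normal(size=200)
points = np.column_stack([u + v, u - v, 2 * u + noise])  # 200 points in 3D

# Centre the data, then find the directions of greatest spread
centred = points - points.mean(axis=0)
_, singular_values, directions = np.linalg.svd(centred, full_matrices=False)

# Keep the two strongest directions: a 2D description of the 3D data
reduced = centred @ directions[:2].T
print(reduced.shape)
print(singular_values)
```

The third singular value comes out far smaller than the first two, which is the numerical sign that two variables describe this data almost as well as three.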

