# Matrices

A matrix is a grid of numbers that can represent both data and a linear transformation. In machine learning and data science, you will use matrices constantly. When you ingest data, you will often have to represent it as a numeric matrix. This even applies to non-numeric data like text, which is turned into numeric vectors and matrices so machine learning applications (like GPT) can mathematically operate on it. 

Let's first look at a matrix in its numeric form and then we will talk about how it represents a linear transformation. 

## What is a Matrix? 

A **matrix** is a grid of numbers that can be manipulated and operated on as a single unit. It is a way of organizing multiple numeric values efficiently and describing a set of operations on them. It is a fundamental building block of machine learning and data science. 

Here is a matrix $ A $ below: 

$$
A = \begin{bmatrix} -1 & 1  \\ 0.5 & -2 \end{bmatrix}
$$

We can think of each row or column as an individual vector, and we will explore this shortly. A matrix can be used to represent data, such as housing prices and their respective square footage. It can also be used to hold the parameters of a machine learning model, such as the weights and biases in a neural network. It can even be used to capture image data!
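As a small illustration of a matrix holding data, the sketch below stores a few houses as rows, with square footage and price as columns. The numbers and the variable name `houses` are made up for this example and do not come from a real dataset.

```python
import numpy as np

# Hypothetical data: each row is a house, columns are [square footage, price]
houses = np.array([
    [1400, 250000],
    [2100, 340000],
    [1850, 310000],
])

print(houses.shape)   # (number of rows, number of columns)
print(houses[0])      # first row as a vector: one house
print(houses[:, 1])   # second column as a vector: all prices
```

Slicing out a row or a column gives back a one-dimensional array, which is the sense in which each row or column can be treated as a vector.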

However, what makes a matrix more interesting than a numeric grid is that it represents a linear transformation. For instance, the matrix $ A $ above describes the linear transformation shown below. 

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/01_MatrixTransformationScence.mp4" controls="controls" style="max-width: 730px;">
</video>


The matrix is represented by the red and green arrows above as two vectors (each a column of the matrix). The black vector is not part of the matrix, but it is there to show how a vector moves with the transformation. This is effectively matrix-vector multiplication, and we will learn about it in this section too. 

## $ \hat{i} $ (i-hat) and $ \hat{j} $ (j-hat) 

Let's formalize this idea about matrices a little more. Let's declare a matrix with the vectors $ \hat{i} $ ("i-hat") and $ \hat{j} $ ("j-hat") as its columns. Note how $ \hat{i} $ extends 1 unit along the x-axis, and $ \hat{j} $ extends 1 unit along the y-axis. These are what we call **basis vectors**, which are used to describe linear transformations. When the columns of our matrix are the two perpendicular vectors $ \begin{bmatrix} 1 \\ 0 \end{bmatrix} $ and $ \begin{bmatrix} 0 \\ 1 \end{bmatrix} $ (so that the diagonal from top-left to bottom-right is all 1's and all other values are 0), we call it the **identity matrix**.  



Note the position of the basis vectors in the identity matrix. They are lined up with the axes, point in the positive directions, and have a length of 1. 
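To check this numerically, here is a minimal sketch using NumPy's `eye()` function, which builds an identity matrix of a given size. The vector `v` is an arbitrary value chosen for illustration. Multiplying by the identity matrix with the `@` operator (covered later in this section) leaves the vector unchanged.

```python
import numpy as np

I = np.eye(2)              # 2x2 identity matrix
i_hat = I[:, 0]            # first column of the identity matrix
j_hat = I[:, 1]            # second column of the identity matrix
v = np.array([0.5, 1.5])   # arbitrary vector for illustration

print(I)
print(i_hat, j_hat)
print(I @ v)               # same values as v
```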



Okay, now what happens if we transform $ \hat{i} $ and $ \hat{j} $ so $ \hat{i} $  lands at $ \begin{bmatrix} 1 \\ -1 \end{bmatrix} $ and $ \hat{j} $ lands at $ \begin{bmatrix} 2 \\ 1 \end{bmatrix} $ ?  



Here is an animation of the transformation below. Look at what happens and note how the whole space is transforming. 

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/05_BasisVectorsEndScene.mp4" controls="controls" style="max-width: 730px;">
</video>

So really, a matrix describes a transformation. Let's make it more interesting by using a matrix to transform a vector next. 

## Matrix Vector Multiplication

Let's start with the identity matrix formed by $ \hat{i} $ and $ \hat{j} $, and throw in another vector $ \vec{v} $, which is $ \begin{bmatrix} 0.5 \\ 1.5 \end{bmatrix} $. What happens to $ \vec{v} $ (the black vector) below when we move $ \hat{i} $ and $ \hat{j} $ to $ \begin{bmatrix} 1 \\ -1 \end{bmatrix} $ and $ \begin{bmatrix} 2 \\ 1 \end{bmatrix} $ respectively?

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/06_MatrixTransformationScene2.mp4" controls="controls" style="max-width: 730px;">
</video>

As you can see visualized above, $ \vec{v} $ moves with the transformation described by the basis vectors, and it lands at $ \begin{bmatrix} 3.5 \\ 1 \end{bmatrix} $. This operation can be computed with **matrix vector multiplication**, which applies a transformation to a given vector.

We can achieve this in NumPy using the `dot()` function or `@` operator between a matrix and a vector. 

In [None]:
import numpy as np 

i_hat = np.array([1,-1])
j_hat = np.array([2,1])

A = np.array([i_hat,j_hat]).transpose()
v = np.array([0.5, 1.5])

w = A @ v 
w

Note that the `transpose()` function swaps the rows and columns of a matrix. Since NumPy populates a matrix row by row, not column by column, we need to swap the rows and columns so that $ \hat{i} $ and $ \hat{j} $ each populate a column, not a row. 

In [None]:
i_hat = np.array([1,-1])
j_hat = np.array([2,1])

A = np.array([i_hat,j_hat])
A

In [None]:
A.transpose()
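If the extra `transpose()` step feels indirect, one possible alternative is NumPy's `column_stack()` function, which places one-dimensional arrays side by side as columns. This is a sketch of an equivalent construction; the names `A_cols` and `A_transposed` are just for illustration.

```python
import numpy as np

i_hat = np.array([1, -1])
j_hat = np.array([2, 1])

# Stack the basis vectors directly as columns
A_cols = np.column_stack([i_hat, j_hat])

# Same result as building row by row and then transposing
A_transposed = np.array([i_hat, j_hat]).transpose()

print(A_cols)
print(np.array_equal(A_cols, A_transposed))
```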

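You can also use `dot()` to achieve matrix-vector multiplication, but the general best practice is to use `@`. The two agree for one- and two-dimensional arrays, but for arrays with more than two dimensions `@` treats the input as a stack of matrices and broadcasts across it, whereas `dot()` follows a different rule. 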

In [None]:
A.dot(v) 
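To see where `@` and `dot()` part ways, the sketch below stacks two 2x2 matrices into a single three-dimensional array (the name `stack` is just for illustration). With `@`, each matrix in the stack is multiplied by its counterpart. With `dot()`, every matrix in the first stack is combined with every matrix in the second, producing a four-dimensional result.

```python
import numpy as np

A = np.array([[1, 2], [-1, 1]])
stack = np.stack([A, 2 * A])        # shape (2, 2, 2): two 2x2 matrices

matmul_result = stack @ stack       # multiplies matching matrices pairwise
dot_result = np.dot(stack, stack)   # combines every matrix with every other

print(matmul_result.shape)
print(dot_result.shape)
```

For plain matrix-vector work like the examples in this section, both give the same answer; the difference only shows up once batches of matrices are involved.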

Numerically, matrix-vector multiplication multiplies each row of the matrix, element by element, with the vector and sums the products, producing one element of the result per row. 

$ A\vec{v} $

$ = \begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 0.5 \\ 1.5 \end{bmatrix} $

$ = \begin{bmatrix} (1)(0.5) + (2)(1.5) \\ (-1)(0.5) + (1)(1.5) \end{bmatrix} $

$ = \begin{bmatrix} 3.5 \\ 1 \end{bmatrix} $
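The same arithmetic can be spelled out in NumPy. The sketch below computes the result two ways: row by row, as in the calculation above, and as a weighted sum of the columns, where each element of $ \vec{v} $ scales the corresponding basis vector. The variable names `by_rows` and `by_columns` are just for illustration.

```python
import numpy as np

A = np.array([[1, 2],
              [-1, 1]])
v = np.array([0.5, 1.5])

# Row view: multiply each row by v element-wise, then sum the products
by_rows = np.array([np.sum(row * v) for row in A])

# Column view: scale i-hat and j-hat by the elements of v, then add them
i_hat, j_hat = A[:, 0], A[:, 1]
by_columns = v[0] * i_hat + v[1] * j_hat

# All three results hold the values 3.5 and 1
print(by_rows)
print(by_columns)
print(A @ v)
```

The column view ties the numbers back to the picture: $ \vec{v} $ lands wherever 0.5 of the new $ \hat{i} $ plus 1.5 of the new $ \hat{j} $ takes it.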

Here is an animation of matrix-vector multiplication. Note how each row of matrix $ A $ is multiplied with the respective elements of vector $ \vec{v} $ and the products are added together. 

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/07_MatrixVectorMultiplicationScene.mp4" controls="controls" style="max-width: 730px;">
</video>


## Types of Linear Transformations 

In linear algebra, we can only perform *linear* transformations. This means we are limited to four types of movements. 

### Rotate 

A rotation is a linear transformation that rotates the space around the origin. 

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/08_RotateScene.mp4" controls="controls" style="max-width: 730px;">
</video>

In [None]:
import numpy as np 

i_hat = np.array([0,-1])
j_hat = np.array([1,0])

A = np.array([i_hat,j_hat]).transpose()
v = np.array([1, 1])

A @ v 
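The matrix above rotates the space 90 degrees clockwise. More generally, a counterclockwise rotation by an angle $ \theta $ sends $ \hat{i} $ to $ \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} $ and $ \hat{j} $ to $ \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix} $. The helper `rotation_matrix()` below is a hypothetical function written for this sketch, not part of NumPy.

```python
import numpy as np

def rotation_matrix(theta):
    """Hypothetical helper: counterclockwise rotation by theta radians."""
    i_hat = np.array([np.cos(theta), np.sin(theta)])
    j_hat = np.array([-np.sin(theta), np.cos(theta)])
    return np.array([i_hat, j_hat]).transpose()

# A negative angle rotates clockwise; -90 degrees matches the example above
A = rotation_matrix(-np.pi / 2)
v = np.array([1, 1])

print(np.round(A))       # rounding hides tiny floating-point error
print(np.round(A @ v))
```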

### Scale 

Scaling will stretch or squeeze the space. 

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/09_ScaleScene.mp4" controls="controls" style="max-width: 730px;">
</video>

In [None]:
import numpy as np 

i_hat = np.array([1,0])
j_hat = np.array([0,2])

A = np.array([i_hat,j_hat]).transpose()
v = np.array([1, 1])

A @ v 

### Shear 

A shear is much like scaling, except it skews the space in how it stretches it. 

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/10_ShearScene.mp4" controls="controls" style="max-width: 730px;">
</video>

In [None]:
import numpy as np 

i_hat = np.array([1,0])
j_hat = np.array([1,1])

A = np.array([i_hat,j_hat]).transpose()
v = np.array([1, 1])

A @ v 

### Inversion 

An inversion will flip a vector space so the basis vectors $ \hat{i} $ and $ \hat{j} $ will swap places. 

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/11_InversionScene.mp4" controls="controls" style="max-width: 730px;">
</video>

In [None]:
import numpy as np 

i_hat = np.array([0,1])
j_hat = np.array([1,0])

A = np.array([i_hat,j_hat]).transpose()
v = np.array([3,2])

A @ v 

These four linear transformations can be combined to create complex operations on vector spaces and solve several types of problems, ranging from machine learning to linear programming. Below is an operation that combines several of these transformation types into one, and we will learn how to combine matrix transformations later. 

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/01_MatrixTransformationScence.mp4" controls="controls" style="max-width: 730px;">
</video>


In [None]:
import numpy as np 

i_hat = np.array([-1, 0.5])
j_hat = np.array([1, -2])

A = np.array([i_hat,j_hat]).transpose()
v = np.array([3, 2])
w = A @ v 
w

## Inverse Matrices

Now that you have an intuition for how matrices are in fact linear transformations, an inverse matrix will make just as much sense. An **inverse matrix** (not to be confused with an inversion) undoes the transformation of a matrix. Let's take a look at the animation below. 

<video src="https://github.com/thomasnield/anaconda_linear_algebra/raw/main/media/01_MatrixTransformationScence_reversed.mp4" controls="controls" style="max-width: 730px;">
</video>


Let's take our previous code where we applied matrix $ A $ to vector $ \vec{v} $, resulting in vector $ \vec{w} $. 

In [None]:
import numpy as np 

i_hat = np.array([-1, 0.5])
j_hat = np.array([1, -2])

A = np.array([i_hat,j_hat]).transpose()
v = np.array([3, 2])
w = A @ v 
w

We can use NumPy's `inv()` function, found in the `numpy.linalg` module, to calculate the inverse of $ A $, which is a transformation that undoes the transformation of $ A $. 

In [None]:
from numpy.linalg import inv

A_inv = inv(A)
A_inv

Now if we apply this inverse matrix to the transformed vector $ \vec{w} $, we can undo the transformation and make it $ \vec{v} $ again. 

In [None]:
v = A_inv @ w
v
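One way to convince yourself that the inverse really undoes the transformation is to multiply the inverse by $ A $ and check that the result is the identity matrix. The sketch below does this with `np.allclose()`, which compares values within a small tolerance because floating-point arithmetic is rarely exact. The name `v_recovered` is just for illustration.

```python
import numpy as np
from numpy.linalg import inv

i_hat = np.array([-1, 0.5])
j_hat = np.array([1, -2])

A = np.array([i_hat, j_hat]).transpose()
A_inv = inv(A)

v = np.array([3, 2])
w = A @ v                 # transform v
v_recovered = A_inv @ w   # undo the transformation

print(np.allclose(A_inv @ A, np.eye(2)))  # inverse times A is the identity
print(np.allclose(v_recovered, v))        # we get the original vector back
```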

Matrix inversion is a common operation we will apply later, from solving systems of equations to fitting a linear regression via matrix decomposition. 

## Exercise

Vector $ \vec{a} $ starts at location $ \begin{bmatrix} 2 \\ 2 \end{bmatrix} $. However, a transformation occurs with matrix $ \begin{bmatrix} 3 & 2 \\ -2 & -5.5 \end{bmatrix} $. 

Where does $ \vec{a} $ land? 

Use Python and NumPy to calculate the answer, or calculate it by hand with pencil and paper. 

### SCROLL DOWN FOR ANSWER
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
|<br>
v 

When you perform the matrix-vector multiplication, the answer is $ \begin{bmatrix} 10 \\ -15 \end{bmatrix} $. You can calculate this by hand as shown below or use NumPy. 

$ A\vec{a} $

$ = \begin{bmatrix} 3 & 2 \\ -2 & -5.5 \end{bmatrix} \begin{bmatrix} 2 \\ 2 \end{bmatrix} $

$ = \begin{bmatrix} (3)(2) + (2)(2) \\ (-2)(2) + (-5.5)(2) \end{bmatrix} $

$ = \begin{bmatrix} 10 \\ -15 \end{bmatrix} $

In [None]:
import numpy as np 

i_hat = np.array([3, -2])
j_hat = np.array([2, -5.5])

A = np.array([i_hat,j_hat]).transpose()
v = np.array([2, 2])
w = A @ v 
w