# Linear Algebra - Matrices (Pt 1)


## 1. Matrices

A matrix $\textbf{A} \in \mathbb{R}^{m\times n}$ is a rectangular array of real numbers with $m$ rows and $n$ columns.

### 1.1. <mark>Geometric intuition (from 3B1B)</mark>:
- <mark>Think of matrices as **encoding** linear transformations of vector spaces</mark>
    - I.e. A matrix $\textbf{A}$ moves **every input vector** (more precisely, the **point where every vector's tip is**) **linearly** to a new location.
- The columns of the matrix can be thought of as **landing points** for the basis (unit) vectors $\hat{i} = \left[\begin{smallmatrix} 1 \\ 0 \end{smallmatrix}\right]$ and $\hat{j} = \left[\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right]$ after the transformation is applied
- A linear transformation means after applying the matrix $\textbf{A}$:
    - **the origin remains fixed**, and 
    - **all grid lines in the space remain straight, parallel, and evenly spaced**
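
The points above can be checked numerically. This is a minimal sketch; the matrix `A` and the vectors `u` and `w` are arbitrary illustrative choices, not taken from the notes:

```python
import numpy as np

# Illustrative 2x2 matrix (an assumed example, not from the notes above)
A = np.array([[2, 1],
              [0, 3]])

i_hat = np.array([1, 0])
j_hat = np.array([0, 1])

# Applying A to each basis vector returns the corresponding column of A
landed_i = A @ i_hat  # equals the first column of A
landed_j = A @ j_hat  # equals the second column of A

# Linearity: the origin stays fixed, and A(a*u + b*w) = a*A(u) + b*A(w)
origin = A @ np.zeros(2)
u, w = np.array([1, 2]), np.array([-3, 4])
lhs = A @ (2 * u + 5 * w)
rhs = 2 * (A @ u) + 5 * (A @ w)

print("i_hat lands on:", landed_i)
print("j_hat lands on:", landed_j)
print("origin maps to:", origin)
print("linearity holds:", np.array_equal(lhs, rhs))
```
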

### 1.2. Length: 

If you treat the $m \times n$ elements of $\textbf{M}$ as an $mn$-dimensional (flattened 2D to 1D) **"vector"**, the $p$-norm of that "vector" is:

$$\Vert \textbf{M} \Vert_{p} = \sqrt[p]{\sum_{i=1}^m \sum_{j=1}^n |m_{ij}|^p}$$

For a matrix we also commonly use the L2 version of this norm (referred to as the ***Frobenius norm***) to calculate its magnitude. Substituting $p=2$ above gives $\Vert \textbf{M} \Vert_{F} = \sqrt{\sum_{i=1}^m \sum_{j=1}^n |m_{ij}|^2}$.
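
As a rough sketch, the norm can be computed directly from the definition and compared with `np.linalg.norm`, which defaults to the Frobenius norm for 2-D arrays. The matrix `M` and the choice `p = 3` are illustrative assumptions:

```python
import numpy as np

# Illustrative matrix (an assumed example)
M = np.array([[1, 7], [2, 3], [5, 0]])

# Frobenius norm from the definition with p = 2
frob_manual = np.sqrt(np.sum(np.abs(M) ** 2))

# np.linalg.norm uses the Frobenius norm by default for 2-D arrays
frob_numpy = np.linalg.norm(M)

# General entrywise p-norm (here p = 3), treating M as a flattened "vector"
p = 3
p_norm = np.sum(np.abs(M) ** p) ** (1 / p)

print("Frobenius (manual):", frob_manual)
print("Frobenius (NumPy): ", frob_numpy)
print("Entrywise 3-norm:  ", p_norm)
```
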


## 2. Matrix addition, $\textbf{A} + \textbf{B}$

Same mechanics as for vectors (see above): both matrices must have the same dimensions, and corresponding elements are added.
* **Matrix addition**: $\textbf{M} = \textbf{A} + \textbf{B}$ is the matrix with elements $M_{ij} = A_{ij} + B_{ij}$


In [1]:
import numpy as np

# Sum matrices A and B:
A = np.array([[1, 7], [2, 3], [5, 0]])
B = np.array([[3, 1], [4, 7], [9, 5]])

M = A + B

print("A =\n", A, "\nB =\n", B)
print("\nMatrix addition: \nA + B =\n", M)


A =
 [[1 7]
 [2 3]
 [5 0]] 
B =
 [[3 1]
 [4 7]
 [9 5]]

Matrix addition: 
A + B =
 [[ 4  8]
 [ 6 10]
 [14  5]]


## 3. Scalar-matrix multiplication, $\alpha\textbf{A}$
* **Scalar multiplication of a matrix**: $\textbf{M} = \alpha \textbf{A}$ is the matrix with elements $M_{ij} = \alpha A_{ij}$


In [2]:
# Multiply matrix A by scalar value alpha = 2
A = np.array([[1, 7], [2, 3], [5, 0]])
alpha = 2
M = alpha * A

print("alpha =\n", alpha, "\nA =\n", A)
print("\nScalar matrix multiplication: \nalpha * A =\n", M)


alpha =
 2 
A =
 [[1 7]
 [2 3]
 [5 0]]

Scalar matrix multiplication: 
alpha * A =
 [[ 2 14]
 [ 4  6]
 [10  0]]


#TODO The below section probably needs updating for square vs non-square matrices

## 4. Matrix multiplication (4 approaches)
### 4.1. Matrix times vector, $\textbf{Av}$

<mark>Geometric intuition for square matrices (3B1B)</mark>:
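- A matrix $\textbf{A} \in \mathbb{R}^{m \times n}$ represents a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$. When $\textbf{A}$ is square ($m = n$), it represents a <mark>linear transformation of a vector space</mark> into itself.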
- Matrix-vector multiplication $\textbf{v}_\text{new}=\textbf{Av}$ <mark>applies this transformation</mark> (encoded in $\textbf{A}$) <mark>to a column vector</mark> $\textbf{v}$ (where $\textbf{v} \in \mathbb{R}^{n \times 1}$).
    - Note how inner dimensions match: $\textbf{A} \in \mathbb{R}^{m \times n}$ and $\textbf{v} \in \mathbb{R}^{n \times 1}$, therefore the result $\textbf{v}_\text{new}$ will also be a column vector, albeit in $\mathbb{R}^{m \times 1}$.
- Elements of the resultant vector (let's call it $\textbf{y}=\textbf{v}_\text{new}$) are given by: $$ y_i = \sum _{j=1}^{n}a_{ij}v_{j} $$
- <mark>**3B1B intuition for square matrices**</mark>: 
    - The new vector $\textbf{v}_\text{new}$ will be a linear combination of the columns of $\textbf{A}$ (i.e. where the basis vectors $\hat{i} = \left[\begin{smallmatrix} 1 \\ 0 \end{smallmatrix}\right]$ and $\hat{j} = \left[\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right]$ **end up** in the new space)
    - with coefficients given by the elements of $\textbf{v}$.
    - For non-square matrices see [this link](https://math.stackexchange.com/questions/1988948/geometric-interpretation-of-non-square-matrices)


In [3]:
# Multiply matrix A by (column) vector v = [2, 3]
A = np.array([[1, 7], [2, 3], [5, 0]])
v = np.array([[2], [3]])  # column vector
M = A @ v

print("A =\n", A, "\nv =\n", v)
print("\nMatrix times vector: \nAv =\n", M)


A =
 [[1 7]
 [2 3]
 [5 0]] 
v =
 [[2]
 [3]]

Matrix times vector: 
Av =
 [[23]
 [13]
 [10]]
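
The "linear combination of columns" view can be verified on the same `A` and `v`. This is a small sketch that reuses the values from the cell above; the names `combo` and `y` are introduced here only for illustration:

```python
import numpy as np

A = np.array([[1, 7], [2, 3], [5, 0]])
v = np.array([[2], [3]])  # column vector

# Weighted sum of the columns of A, with weights taken from v
combo = v[0, 0] * A[:, [0]] + v[1, 0] * A[:, [1]]

# Element-wise formula: y_i = sum_j a_ij * v_j
y = np.array([[sum(A[i, j] * v[j, 0] for j in range(A.shape[1]))]
              for i in range(A.shape[0])])

print("Column combination matches A @ v:", np.array_equal(combo, A @ v))
print("Element-wise sum matches A @ v:  ", np.array_equal(y, A @ v))
```
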


### 4.2. Vector times matrix (pre-multiply), $\textbf{v}^\mathrm{T}\textbf{A}$
- To multiply a (column) vector by a matrix, first transpose the vector $\textbf{v}$ (i.e. make it a **row** vector $\textbf{v}^\textbf{T}$) to make the inner dimensions match.
    - Note how in $\textbf{v}_\text{new}=\textbf{v}^\mathrm{T}\textbf{A}$, inner dimensions match: $\textbf{v}^\mathrm{T} \in \mathbb{R}^{1 \times m}$ and $\textbf{A} \in \mathbb{R}^{m \times n}$, therefore the result $\textbf{v}_\text{new}$ will also be a row vector, in $\mathbb{R}^{1 \times n}$.
- Elements of the resultant vector (let's call it $\textbf{y}=\textbf{v}_\text{new}$) are given by: $$y_{k}=\sum _{j=1}^{m}v_{j}a_{jk}$$
  

In [4]:
# Transpose the column vector v = [2, 3, 1], and multiply the resulting row vector v^T by matrix A
v = np.array([[2], [3], [1]])  # column vector
A = np.array([[1, 7], [2, 3], [5, 0]])
M = v.T @ A  # v.T is a row vector

print("v.T =\n", v.T, "\nA =\n", A)
print("\nVector times matrix: \nv^T A =", M[0])


v.T =
 [[2 3 1]] 
A =
 [[1 7]
 [2 3]
 [5 0]]

Vector times matrix: 
v^T A = [13 23]


### 4.3. Matrix Hadamard (element-wise) product, $A\odot B$:

- Definition (same as for vectors): Element-wise product on two matrices of same-dimension (i.e. $A, B\in \mathbb{R}^{m \times n}$)
- Elements of the resultant matrix are given by: $ (A\odot B)_{ij} = (A)_{ij}(B)_{ij} $. Example:

$$ \begin{bmatrix}2&3&1\\0&8&-2\end{bmatrix} \odot {\begin{bmatrix}3&1&4\\7&9&5\end{bmatrix}}={\begin{bmatrix}2\times 3&3\times 1&1\times 4\\0\times 7&8\times 9&-2\times 5\end{bmatrix}}={\begin{bmatrix}6&3&4\\0&72&-10\end{bmatrix}} $$


In [5]:
# Compute the Hadamard product of matrices A and B:
A = np.array([[2, 3, 1], [0, 8, -2]])
B = np.array([[3, 1, 4], [7, 9, 5]])
M = np.multiply(A, B)

print("A =\n", A, "\nB =\n", B)
print("\nMatrix Hadamard product: \nA ⊙ B =\n", M)


A =
 [[ 2  3  1]
 [ 0  8 -2]] 
B =
 [[3 1 4]
 [7 9 5]]

Matrix Hadamard product: 
A ⊙ B =
 [[  6   3   4]
 [  0  72 -10]]


### 4.4. Matrix times matrix, $\textbf{AB}$ (use `np.dot(A,B)` to multiply)

The inner matrix dimensions of the two matrices (e.g. $\textbf{A}$ and $\textbf{B}$) **must match**. 
- $\textbf{A}$ is of dimension $\mathbb{R}^{m \times p}$
- $\textbf{B}$ is of dimension $\mathbb{R}^{p \times n}$ 
- Here, the dimension of size $p$ is the **inner matrix dimension**. 
    - If they match, it means # columns in $\textbf{A}$ equals # rows in $\textbf{B}$.
- Dimensions $m$ and $n$ are the **outer matrix dimensions**. Thus each element of $\textbf{M=AB}$ can be computed as:

$$M_{ij} = \sum_{k=1}^p A_{ik}B_{kj}$$

- <mark>I.e. (important)</mark> the $i,j$'th element of $\textbf{M}$ is the dot product of the $i$'th row of $\textbf{A}$ with the $j$'th column of $\textbf{B}$

In [6]:
# Multiply A=[[1,7],[2,3],[5,0]] and B=[[2,6,3,1],[1,2,3,4]] -> [3x2] * [2x4] = output [3x4]

A = np.array([[1, 7], [2, 3], [5, 0]])
B = np.array([[2, 6, 3, 1], [1, 2, 3, 4]])

print("A =\n", A, "\nB =\n", B)
print("\nMatrix times matrix: \nAB =\n", np.dot(A, B))  # <-- inner dimensions match (p=2). output is a [3x4] matrix


A =
 [[1 7]
 [2 3]
 [5 0]] 
B =
 [[2 6 3 1]
 [1 2 3 4]]

Matrix times matrix: 
AB =
 [[ 9 20 24 29]
 [ 7 18 15 14]
 [10 30 15  5]]
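
The row-times-column rule can be spelled out with explicit loops. This is an illustrative sketch only (in practice `np.dot(A, B)` or the equivalent `A @ B` is preferable for 2-D arrays); the loop variables and the name `M` are chosen here for the example:

```python
import numpy as np

A = np.array([[1, 7], [2, 3], [5, 0]])      # shape (3, 2)
B = np.array([[2, 6, 3, 1], [1, 2, 3, 4]])  # shape (2, 4)

m, p = A.shape
p_b, n = B.shape
assert p == p_b, "inner dimensions must match"

# M[i, j] is the dot product of row i of A with column j of B
M = np.zeros((m, n), dtype=A.dtype)
for i in range(m):
    for j in range(n):
        M[i, j] = np.dot(A[i, :], B[:, j])

print(M)
print("Matches A @ B:", np.array_equal(M, A @ B))
```
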


In [7]:
# Multiplying B and A will raise a ValueError
np.dot(B, A)  # <-- inner dimensions don't match ...4] * [3...; Error


ValueError: shapes (2,4) and (3,2) not aligned: 4 (dim 1) != 3 (dim 0)

## 5. Matrix multiplication as composition of linear transformations

To perform multiple linear transformations (in order) on a vector $\textbf{v}$, you can pre-multiply $\textbf{v}$ by the matrices representing each transformation.

### 5.1. Example:
For example, say you wish to apply **two** transformations to $\textbf{v}$. **First**, a rotation (here, 90° counterclockwise), **then** a shear (in that order).

- Let's call $\textbf{M}_1$ the $\text{Rotation}$ matrix, and
- $\textbf{M}_2$ the $\text{Shear}$ matrix

We can apply both transformations as follows:

$$

\begin{equation*}
\underbrace{
\begin{bmatrix}
    1 & 1 \\
    0 & 1
\end{bmatrix}
}_{\text{\small 2. Shear}}
\left(
\smash{\underbrace{
\begin{bmatrix}
    0 & -1 \\
    1 & 0
\end{bmatrix}
}_{\text{\small 1. Rotation}}}
\begin{bmatrix}
    x \\
    y
\end{bmatrix}
\right)
=
\underbrace{
\begin{bmatrix}
    1 & -1 \\
    1 & 0
\end{bmatrix}
}_{\text{\small Composition}}
\begin{bmatrix}
    x \\
    y
\end{bmatrix}
\end{equation*}

$$

Note this reads from **right to left**: $\textbf{M}_2(\textbf{M}_1\textbf{v})$ means first apply $\textbf{M}_1$ to $\textbf{v}$, then apply $\textbf{M}_2$ to the result of that.

Note that the **composition** matrix $\textbf{M}_2\textbf{M}_1$ is the matrix that represents the **combined** transformation of the two individual transformations:

$$

\begin{equation*}
\underbrace{
\begin{bmatrix}
    1 & 1 \\
    0 & 1
\end{bmatrix}
}_{\text{\small 2. Shear}}
\;\;
\underbrace{
\begin{bmatrix}
    0 & -1 \\
    1 & 0
\end{bmatrix}
}_{\text{\small 1. Rotation}}
=
\underbrace{
\begin{bmatrix}
    1 & -1 \\
    1 & 0
\end{bmatrix}
}_{\text{\small Composition}}
\end{equation*}

$$
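
A short NumPy sketch of this composition follows. The test vector `v` is an arbitrary assumption; the last lines also show that swapping the order of the two matrices gives a different transformation:

```python
import numpy as np

rotation = np.array([[0, -1], [1, 0]])  # M1: applied first
shear = np.array([[1, 1], [0, 1]])      # M2: applied second

# The composition matrix is M2 M1 (read right to left)
composition = shear @ rotation

v = np.array([[3], [2]])  # arbitrary test vector (assumed example)

step_by_step = shear @ (rotation @ v)  # rotate, then shear
all_at_once = composition @ v          # apply the combined matrix once

# Reversing the order (shear first, then rotate) gives a different matrix
reversed_order = rotation @ shear

print("Composition M2 M1 =\n", composition)
print("Same result either way:", np.array_equal(step_by_step, all_at_once))
print("Order matters:", not np.array_equal(composition, reversed_order))
```
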


### 5.2. Geometric intuition

#### 5.2.1. Rotation matrix $\textbf{M}_1$:
$\textbf{M}_1$ ($\text{Rotation}$ matrix) tells you where the basis vectors $\hat{i} = \left[\begin{smallmatrix} 1 \\ 0 \end{smallmatrix}\right]$ and $\hat{j} = \left[\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right]$ end up after the **first** transformation (Rotation).

$$
\begin{equation*}
\textbf{M}_1
=
\begin{bmatrix}
    0 & -1 \\
    1 & 0
\end{bmatrix}
\end{equation*}
$$

Specifically:

- Column 1 of $\textbf{M}_1$ states that the first basis vector $\hat{i}$ is rotated as follows: $\left[\begin{smallmatrix} 1 \\ 0 \end{smallmatrix}\right] \rightarrow \left[\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right]$
- Column 2 of $\textbf{M}_1$ states that the second basis vector $\hat{j}$ is rotated as follows: $\left[\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right] \rightarrow \left[\begin{smallmatrix} -1 \\ 0 \end{smallmatrix}\right]$


#### 5.2.2. Shear matrix $\textbf{M}_2$:

Next, $\textbf{M}_2$ ($\text{Shear}$ matrix) states where these "new" (rotated) basis vectors end up after the **second** (shear) transformation.

The rotated $\hat{i}$ is "sheared":

$$

\begin{equation*}
\begin{aligned}
&\underbrace{
\begin{bmatrix}
    1 & 1 \\
    0 & 1
\end{bmatrix}
}_{\textbf{M}_2}
\underbrace{
\begin{bmatrix}
    0 \\
    1
\end{bmatrix}
}_{\text{Rotated }\hat{i}}
= 
0 
\begin{bmatrix}
    1 \\ 0
\end{bmatrix}
+ 1
\begin{bmatrix}
    1 \\ 1
\end{bmatrix}
=
\underbrace{
\begin{bmatrix}
    1 \\
    1
\end{bmatrix}
}_{\substack{\text{Rotated, then} \\ \text{Sheared $\hat{i}$}}}
\end{aligned}
\end{equation*}
$$

Similarly, the rotated $\hat{j}$ is "sheared" as follows:

$$

\begin{equation*}
\begin{aligned}
&\underbrace{
\begin{bmatrix}
    1 & 1 \\
    0 & 1
\end{bmatrix}
}_{\textbf{M}_2}
\underbrace{
\begin{bmatrix}
    -1 \\
    0
\end{bmatrix}
}_{\text{Rotated }\hat{j}}
= 
-1
\begin{bmatrix}
    1 \\ 0
\end{bmatrix}
+ 0
\begin{bmatrix}
    1 \\ 1
\end{bmatrix}
=
\underbrace{
\begin{bmatrix}
    -1 \\
    0
\end{bmatrix}
}_{\substack{\text{Rotated, then} \\ \text{Sheared $\hat{j}$}}}
\end{aligned}
\end{equation*}
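$$

These two results are exactly the columns of the composition matrix $\textbf{M}_2\textbf{M}_1 = \left[\begin{smallmatrix} 1 & -1 \\ 1 & 0 \end{smallmatrix}\right]$. In general, the product of two $2 \times 2$ matrices is: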

$$

\begin{equation*}
\begin{bmatrix}
    a & b \\
    c & d
\end{bmatrix}
\begin{bmatrix}
    e & f \\
    g & h
\end{bmatrix}
=
\begin{bmatrix}
    ae+bg & af+bh \\
    ce+dg & cf+dh
\end{bmatrix}
\end{equation*}

$$
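
As a sanity check, the $2 \times 2$ formula can be written out by hand and compared with NumPy. The helper `matmul_2x2` is a hypothetical function introduced only for this sketch:

```python
import numpy as np

def matmul_2x2(left, right):
    """Multiply two 2x2 matrices using the explicit formula (illustrative helper)."""
    (a, b), (c, d) = left
    (e, f), (g, h) = right
    return np.array([[a * e + b * g, a * f + b * h],
                     [c * e + d * g, c * f + d * h]])

shear = np.array([[1, 1], [0, 1]])      # M2
rotation = np.array([[0, -1], [1, 0]])  # M1

result = matmul_2x2(shear, rotation)

# Each column of the product is M2 applied to the matching column of M1,
# i.e. where i_hat and j_hat land after rotating, then shearing
columns = np.column_stack([shear @ rotation[:, 0], shear @ rotation[:, 1]])

print(result)
print("Formula matches @:", np.array_equal(result, shear @ rotation))
print("Column view matches:", np.array_equal(columns, result))
```
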