# Linear Algebra - Matrices (Pt 1)


#TODO: This should be rewritten for square vs non-square matrices

## 1. Matrices

A matrix $\textbf{M} \in \mathbb{R}^{m\times n}$ is a rectangular array of real numbers with $m$ rows and $n$ columns.

### 1.1. <mark>Geometric intuition of square matrices (from 3B1B)</mark>:
- <mark>Think of square matrices **encoding** linear transformations of vector spaces</mark>
    - A matrix $\textbf{A} \in \mathbb{R}^{n \times n}$ moves **every input vector** (more precisely, the **point at the tip of each vector**) **linearly** to a new location.
- For a 2x2 matrix, the columns of the matrix can be thought of as **landing points** for the basis (unit) vectors $\hat{i} = \left[\begin{smallmatrix} 1 \\ 0 \end{smallmatrix}\right]$ and $\hat{j} = \left[\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right]$ after the transformation is applied
- A linear transformation means after applying the matrix $\textbf{A}$:
    - **the origin remains fixed**, and 
    - **all grid lines in the space remain straight, parallel, and evenly spaced**
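The two claims above (columns are the landing points of the basis vectors, and the transformation is linear with a fixed origin) can be checked numerically. This is a minimal sketch; the matrix `A` and the vectors `u`, `w` are hypothetical values chosen for illustration, not taken from the text.

```python
import numpy as np

# Hypothetical 2x2 example matrix (illustrative values only)
A = np.array([[2, 1],
              [0, 3]])

i_hat = np.array([1, 0])
j_hat = np.array([0, 1])

# Each basis vector lands on the corresponding column of A
landed_i = A @ i_hat
landed_j = A @ j_hat

# The origin stays fixed under a linear transformation
origin_image = A @ np.zeros(2)

# Linearity: A(a*u + b*w) equals a*(A u) + b*(A w)
u = np.array([1, 2])
w = np.array([-3, 4])
a, b = 2, -1
lhs = A @ (a * u + b * w)
rhs = a * (A @ u) + b * (A @ w)

print("i-hat lands at", landed_i, "| j-hat lands at", landed_j)
print("origin maps to", origin_image)
print("linearity holds:", np.array_equal(lhs, rhs))
```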

### 1.2. Length: 

If you treat the $m \times n$ elements of $\textbf{M}$ as an $mn$-dimensional (flattened 2D to 1D) **"vector"**, the $p$-norm of that "vector" is:

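$$\Vert \textbf{M} \Vert_{p} = \sqrt[p]{\sum_{i=1}^m \sum_{j=1}^n |m_{ij}|^p}$$

For a matrix, the most common choice is $p=2$ applied to the entries, known as the ***Frobenius norm***, which measures the overall magnitude of the matrix. Substituting $p=2$ above gives:

$$\Vert \textbf{M} \Vert_{F} = \sqrt{\sum_{i=1}^m \sum_{j=1}^n |m_{ij}|^2}$$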
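As a rough check of the flattened-vector definition, the sketch below computes the entrywise $p$-norm by hand and compares the $p=2$ case with NumPy's Frobenius norm. The helper name `entrywise_p_norm` and the example values are assumptions made for illustration. Note that `np.linalg.norm(M, ord=2)` on a 2-D array returns the spectral norm (largest singular value), which is a different quantity, so `ord="fro"` is used here.

```python
import numpy as np

# Example matrix (illustrative values)
M = np.array([[1, 7], [2, 3], [5, 0]])

def entrywise_p_norm(M, p):
    """p-norm of the matrix entries, treating M as one flattened vector."""
    return (np.abs(M) ** p).sum() ** (1.0 / p)

frob_manual = entrywise_p_norm(M, 2)          # sqrt of the sum of squared entries
frob_numpy = np.linalg.norm(M, ord="fro")     # NumPy's Frobenius norm
frob_flat = np.linalg.norm(M.ravel(), ord=2)  # vector 2-norm of the flattened matrix

print(frob_manual, frob_numpy, frob_flat)
```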


## 2. Matrix addition, $\textbf{A} + \textbf{B}$

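Same mechanics as for vectors (see above): $\textbf{A}$ and $\textbf{B}$ must have the same dimensions, and the sum is taken element by element.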
* **Matrix addition**: $\textbf{M} = \textbf{A} + \textbf{B}$ is the matrix with elements $M_{ij} = A_{ij} + B_{ij}$


In [1]:
import numpy as np

# Sum matrices A and B:
A = np.array([[1, 7], [2, 3], [5, 0]])
B = np.array([[3, 1], [4, 7], [9, 5]])

print("A =\n", A, "\nB =\n", B)
print("\nMatrix addition: A + B \npy: A + B =\n", A + B)


A =
 [[1 7]
 [2 3]
 [5 0]] 
B =
 [[3 1]
 [4 7]
 [9 5]]

Matrix addition: A + B 
py: A + B =
 [[ 4  8]
 [ 6 10]
 [14  5]]


## 3. Scalar-matrix multiplication, $\alpha\textbf{A}$
* **Scalar multiplication of a matrix**: $\textbf{M} = \alpha \textbf{A}$ is the matrix with elements $M_{ij} = \alpha A_{ij}$


In [2]:
# Multiply matrix A by scalar value alpha = 2
A = np.array([[1, 7], [2, 3], [5, 0]])
alpha = 2

print("alpha =", alpha, "\nA =\n", A)
print("\nScalar matrix multiplication: alpha * A\npy: alpha * A =\n", alpha * A)


alpha = 2 
A =
 [[1 7]
 [2 3]
 [5 0]]

Scalar matrix multiplication: alpha * A
py: alpha * A =
 [[ 2 14]
 [ 4  6]
 [10  0]]


#TODO The below section probably needs updating for square vs non-square matrices

## 4. Matrix multiplication (4 approaches)
### 4.1. Matrix times vector, $\textbf{Av}$

#### <mark> 4.1.1. Geometric intuition for square matrices (3B1B)</mark>:
- A square matrix $\textbf{A}$ represents a <mark>linear transformation of a vector space</mark>. More generally, $\textbf{A} \in \mathbb{R}^{m \times n}$ maps vectors in $\mathbb{R}^n$ to vectors in $\mathbb{R}^m$ (the square case is $m = n$).
- Matrix-vector multiplication $\textbf{v}_\text{new}=\textbf{Av}$ <mark>applies this transformation</mark> (encoded in $\textbf{A}$) <mark>to a column vector</mark> $\textbf{v}$ (where $\textbf{v} \in \mathbb{R}^{n \times 1}$).
    - Note how inner dimensions match: $\textbf{A} \in \mathbb{R}^{m \times n}$ and $\textbf{v} \in \mathbb{R}^{n \times 1}$, therefore the result $\textbf{v}_\text{new}$ will also be a column vector, albeit in $\mathbb{R}^{m \times 1}$.
- Elements of the resultant vector (let's call it $\textbf{y}=\textbf{v}_\text{new}$) are given by: $$ y_i = \sum _{j=1}^{n}a_{ij}v_{j} $$
- <mark>**3B1B intuition for square matrices**</mark>: 
    - The new vector $\textbf{v}_\text{new}$ will be a linear combination of the columns of $\textbf{A}$ (i.e. where the basis vectors $\hat{i} = \left[\begin{smallmatrix} 1 \\ 0 \end{smallmatrix}\right]$ and $\hat{j} = \left[\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right]$ **end up** in the new space)
    - with coefficients given by the elements of $\textbf{v}$.
    - For non-square matrices see [this link](https://math.stackexchange.com/questions/1988948/geometric-interpretation-of-non-square-matrices)
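The column-combination view above can be verified directly. This sketch reuses the example values $\textbf{A}$ and $\textbf{v} = [2, 3]$ from this notebook, with `v` stored as a 1-D array for brevity (an assumption of this sketch), and compares three ways of computing $\textbf{Av}$: the `@` operator, the weighted sum of columns, and the element formula $y_i = \sum_j a_{ij} v_j$.

```python
import numpy as np

A = np.array([[1, 7], [2, 3], [5, 0]])
v = np.array([2, 3])  # 1-D vector; the notebook cells use a 2-D column instead

# 1) Built-in matrix-vector product
product = A @ v

# 2) Linear combination of the columns of A, weighted by the entries of v
column_combo = v[0] * A[:, 0] + v[1] * A[:, 1]

# 3) Element formula: y_i = sum_j a_ij * v_j
m, n = A.shape
by_formula = np.array([sum(A[i, j] * v[j] for j in range(n)) for i in range(m)])

print(product, column_combo, by_formula)
```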

#### 4.1.2. Geometric intuition for non-square matrices (3B1B):

In [3]:
# Multiply matrix A by (column) vector v = [2, 3]
A = np.array([[1, 7], [2, 3], [5, 0]])
v = np.array([[2], [3]])  # column vector

print("A =\n", A, "\nv =\n", v)
print("\nMatrix times vector: Av\npy: A @ v =\n", A @ v)

# These methods of multiplying a matrix by a vector produce the same result as A @ v
print("\n----------------- Below methods produce same result -----------------")
print("\npy: np.matmul(A, v) =\n", np.matmul(A, v))
print("\npy: np.dot(A, v) =\n", np.dot(A, v))
print("\npy: np.inner(A, v.T) =\n", np.inner(A, v.T))  # inner() sums over the last axis of both arguments, so v is passed as the row vector v.T


A =
 [[1 7]
 [2 3]
 [5 0]] 
v =
 [[2]
 [3]]

Matrix times vector: Av
py: A @ v =
 [[23]
 [13]
 [10]]

----------------- Below methods produce same result -----------------

py: np.matmul(A, v) =
 [[23]
 [13]
 [10]]

py: np.dot(A, v) =
 [[23]
 [13]
 [10]]

py: np.inner(A, v.T) =
 [[23]
 [13]
 [10]]


### 4.2. Vector times matrix (pre-multiply), $\textbf{v}^\mathrm{T}\textbf{A}$
- To multiply a (column) vector by a matrix, first transpose the vector $\textbf{v}$ (i.e. make it a **row** vector $\textbf{v}^\mathrm{T}$) to make the inner dimensions match.
    - Note how in $\textbf{v}_\text{new}=\textbf{v}^\mathrm{T}\textbf{A}$, inner dimensions match: $\textbf{v}^\mathrm{T} \in \mathbb{R}^{1 \times m}$ and $\textbf{A} \in \mathbb{R}^{m \times n}$, therefore the result $\textbf{v}_\text{new}$ will also be a row vector, in $\mathbb{R}^{1 \times n}$.
- Elements of the resultant vector (let's call it $\textbf{y}=\textbf{v}_\text{new}$) are given by: $$y_{k}=\sum _{j=1}^{m}v_{j}a_{jk}$$
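One way to read the formula above is that $\textbf{v}^\mathrm{T}\textbf{A}$ is a linear combination of the **rows** of $\textbf{A}$, weighted by the entries of $\textbf{v}$. The sketch below checks this, and also checks the identity $(\textbf{v}^\mathrm{T}\textbf{A})^\mathrm{T} = \textbf{A}^\mathrm{T}\textbf{v}$. It uses the notebook's example values, with `v` stored as a 1-D array (an assumption of this sketch).

```python
import numpy as np

v = np.array([2, 3, 1])                 # 1-D vector of length m = 3
A = np.array([[1, 7], [2, 3], [5, 0]])  # shape (3, 2)

# Row vector times matrix: the result has length n = 2
row_product = v @ A

# Same result as a weighted sum of the rows of A
row_combo = v[0] * A[0] + v[1] * A[1] + v[2] * A[2]

# Same result via the transpose identity (v^T A)^T = A^T v
via_transpose = A.T @ v

print(row_product, row_combo, via_transpose)
```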
  

In [4]:
# Transpose the column vector v = [2, 3, 1], and multiply the resulting row vector v^T by matrix A
v = np.array([[2], [3], [1]])  # column vector
A = np.array([[1, 7], [2, 3], [5, 0]])

print("v.T =\n", v.T, "\nA =\n", A)

print("\nVector times matrix: v^T A\npy: v.T @ A =", (v.T @ A)[0])

# These methods of multiplying a vector by a matrix produce the same result as v.T @ A
print("\n----------------- Below methods produce same result -----------------")
print("py: np.matmul(v.T, A) =", np.matmul(v.T, A)[0])
print("py: np.dot(v.T, A) =", np.dot(v.T, A)[0])
print("py: np.inner(v.T, A.T) =", np.inner(v.T, A.T)[0])  # inner() sums over the last axis of both arguments, so A is passed as A.T


v.T =
 [[2 3 1]] 
A =
 [[1 7]
 [2 3]
 [5 0]]

Vector times matrix: v^T A
py: v.T @ A = [13 23]

----------------- Below methods produce same result -----------------
py: np.matmul(v.T, A) = [13 23]
py: np.dot(v.T, A) = [13 23]
py: np.inner(v.T, A.T) = [13 23]


### 4.3. Matrix Hadamard (element-wise) product, $\textbf{A}\odot \textbf{B}$:

- Definition (same as for vectors): Element-wise product of two matrices with the same dimensions (i.e. $\textbf{A}, \textbf{B}\in \mathbb{R}^{m \times n}$)
- Elements of the resultant matrix are given by: $(\textbf{A}\odot \textbf{B})_{ij} = A_{ij}B_{ij}$. Example:

$$ \begin{bmatrix}2&3&1\\0&8&-2\end{bmatrix} \odot {\begin{bmatrix}3&1&4\\7&9&5\end{bmatrix}}={\begin{bmatrix}2\times 3&3\times 1&1\times 4\\0\times 7&8\times 9&-2\times 5\end{bmatrix}}={\begin{bmatrix}6&3&4\\0&72&-10\end{bmatrix}} $$
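For NumPy arrays, the `*` operator is also element-wise, so it gives the Hadamard product rather than the matrix product. A short sketch using the example matrices above, with `@` shown for contrast (here applied to `B.T`, since the inner dimensions of `A` and `B` themselves do not match):

```python
import numpy as np

A = np.array([[2, 3, 1], [0, 8, -2]])
B = np.array([[3, 1, 4], [7, 9, 5]])

# Element-wise (Hadamard) product: both operands have shape (2, 3)
hadamard = A * B

# Matrix product for contrast: (2, 3) @ (3, 2) gives shape (2, 2)
matrix_product = A @ B.T

print("A * B =\n", hadamard)
print("A @ B.T =\n", matrix_product)
```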


In [5]:
# Compute the Hadamard product of matrices A and B:
A = np.array([[2, 3, 1], [0, 8, -2]])
B = np.array([[3, 1, 4], [7, 9, 5]])

print("A =\n", A, "\nB =\n", B)
print("\nMatrix Hadamard product: A ⊙ B\npy: np.multiply(A, B) =\n", np.multiply(A, B))


A =
 [[ 2  3  1]
 [ 0  8 -2]] 
B =
 [[3 1 4]
 [7 9 5]]

Matrix Hadamard product: A ⊙ B
py: np.multiply(A, B) =
 [[  6   3   4]
 [  0  72 -10]]


### 4.4. Matrix times matrix, $\textbf{AB}$:

The inner matrix dimensions of the two matrices (e.g. $\textbf{A}$ and $\textbf{B}$) **must match**. 
- $\textbf{A}$ is of dimension $\mathbb{R}^{m \times p}$
- $\textbf{B}$ is of dimension $\mathbb{R}^{p \times n}$ 
- Here, the dimension of size $p$ is the **inner matrix dimension**. 
    - If they match, it means # columns in $\textbf{A}$ equals # rows in $\textbf{B}$.
- Dimensions $m$ and $n$ are the **outer matrix dimensions**. Thus each element of $\textbf{M=AB}$ can be computed as:

$$M_{ij} = \sum_{k=1}^p A_{ik}B_{kj}$$

- <mark>I.e. (important)</mark> the $i,j$'th element of $\textbf{M}$ is the dot product of the $i$'th row of $\textbf{A}$ with the $j$'th column of $\textbf{B}$
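The row-times-column rule can be written out as explicit loops. The function name `matmul_by_definition` is hypothetical and the code is only a sketch of the definition for illustration. In practice `A @ B` is the idiomatic (and much faster) choice.

```python
import numpy as np

def matmul_by_definition(A, B):
    """Compute M = AB entry by entry: M[i, j] = (row i of A) . (column j of B)."""
    m, p = A.shape
    p_b, n = B.shape
    if p != p_b:
        raise ValueError(f"inner dimensions do not match: {p} != {p_b}")
    M = np.zeros((m, n), dtype=np.result_type(A, B))
    for i in range(m):
        for j in range(n):
            M[i, j] = np.dot(A[i, :], B[:, j])
    return M

A = np.array([[1, 7], [2, 3], [5, 0]])      # shape (3, 2)
B = np.array([[2, 6, 3, 1], [1, 2, 3, 4]])  # shape (2, 4)

M = matmul_by_definition(A, B)              # shape (3, 4)
print(M)
print("matches A @ B:", np.array_equal(M, A @ B))
```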

For example, for two 2x2 matrices, the multiplication is as follows. See the [Composition - Geometric Intuition](#52-geometric-intuition) section:

$$
\begin{bmatrix}
    a & b \\
    c & d
\end{bmatrix}
\begin{bmatrix}
    e & f \\
    g & h
\end{bmatrix}
=
\begin{bmatrix}
    ae+bg & af+bh \\
    ce+dg & cf+dh
\end{bmatrix}
$$

In [6]:
# Multiply A=[[1,7],[2,3],[5,0]] and B=[[2,6,3,1],[1,2,3,4]] -> [3x2] * [2x4] = output [3x4]

A = np.array([[1, 7], [2, 3], [5, 0]])
B = np.array([[2, 6, 3, 1], [1, 2, 3, 4]])

print("A =\n", A, "\nB =\n", B)
print("\nMatrix times matrix: AB \npy: A @ B =\n", A @ B)  # <-- inner dims match (p=2), so output is a [3x4] matrix

# These methods of multiplying matrices produce the same result as A @ B
print("\n----------------------------------------")
print("\npy: np.matmul(A, B) =\n", np.matmul(A, B))
print("\npy: np.dot(A, B) =\n", np.dot(A, B))
print("\npy: np.inner(A, B.T): \n", np.inner(A, B.T))  # confusing. inner() needs second argument to be transposed


A =
 [[1 7]
 [2 3]
 [5 0]] 
B =
 [[2 6 3 1]
 [1 2 3 4]]

Matrix times matrix: AB 
py: A @ B =
 [[ 9 20 24 29]
 [ 7 18 15 14]
 [10 30 15  5]]

----------------------------------------

py: np.matmul(A, B) =
 [[ 9 20 24 29]
 [ 7 18 15 14]
 [10 30 15  5]]

py: np.dot(A, B) =
 [[ 9 20 24 29]
 [ 7 18 15 14]
 [10 30 15  5]]

py: np.inner(A, B.T): 
 [[ 9 20 24 29]
 [ 7 18 15 14]
 [10 30 15  5]]


In [7]:
# Multiplying in the other order (BA) raises an error, because the inner dimensions (4 and 3) do not match:

# Throws ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 4)
# B @ A

# Throws ValueError: shapes (2,4) and (3,2) not aligned: 4 (dim 1) != 3 (dim 0)
np.dot(B, A)


ValueError: shapes (2,4) and (3,2) not aligned: 4 (dim 1) != 3 (dim 0)
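To avoid hitting this error at runtime, the shapes can be compared before multiplying, or the `ValueError` can be caught. The helper `can_multiply` below is a hypothetical name used only for this sketch.

```python
import numpy as np

A = np.array([[1, 7], [2, 3], [5, 0]])      # shape (3, 2)
B = np.array([[2, 6, 3, 1], [1, 2, 3, 4]])  # shape (2, 4)

def can_multiply(X, Y):
    """True if the product XY is defined, i.e. # columns of X equals # rows of Y."""
    return X.shape[1] == Y.shape[0]

print("AB defined:", can_multiply(A, B))  # inner dimensions 2 and 2
print("BA defined:", can_multiply(B, A))  # inner dimensions 4 and 3

# Alternatively, attempt the product and handle the failure
try:
    B @ A
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print("B @ A failed with:", error_message)
```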