# Overview of Gilbert Strang's 2018 Matrix Methods course

- Linear algebra => Optimisation => Deep learning
- Linear algebra => Statistics => Deep learning
- ["Learning from data" book](https://math.mit.edu/learningfromdata)

## Lecture 01: The column space of $A$ contains all vectors $A x$

- Think of the product $A x$ as a linear combination of the columns of $A$: $A x = A_{:,1} x_1 + A_{:,2} x_2 + \ldots + A_{:,n} x_n$
- $A = C R$, where each column of $C$ is a column of $A$ and together they form a basis for $C(A)$. If $C$ takes the first $r = rank(A)$ independent columns of $A$, then $R$ is the $r$ nonzero rows of $rref(A)$.
- Given $C$, a matrix formed from $r = rank(A)$ linearly independent columns of $A$, and $R$, a matrix formed from $r$ linearly independent rows of $A$, there is an invertible $r \times r$ matrix $U$ such that $A = C U R$
  - Question: Are there any more properties of $U$?
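
A minimal NumPy sketch of both factorisations on a small rank-2 matrix. The example matrix and the variable names (`R_rows`, `W`) are illustrative assumptions, not taken from the lecture. It also suggests one answer to the question above: for this choice of rows and columns, $U$ is the inverse of the $r \times r$ submatrix of $A$ where the chosen rows and columns intersect.

```python
import numpy as np

A = np.array([[1., 3., 8.],
              [1., 2., 6.],
              [0., 1., 2.]])   # rank 2: column 3 = 2*column 1 + 2*column 2

# A = C R, with C = the first two (independent) columns of A.
C = A[:, :2]
R = np.linalg.lstsq(C, A, rcond=None)[0]   # solves C R = A column by column

# A = C U R_rows, with R_rows = two independent rows of A.
R_rows = A[:2, :]
W = A[:2, :2]            # entries where the chosen rows and columns intersect
U = np.linalg.inv(W)     # U is the inverse of that intersection block

print(np.round(R, 10))
print(np.allclose(C @ R, A), np.allclose(C @ U @ R_rows, A))
```

Here `R` comes out as the nonzero rows of $rref(A)$ because `C` uses the first independent columns.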

## Lecture 02: Multiplying and factoring matrices

### 5 key factorisations

- $A = L U$ -- Elimination
- $A = Q R$ -- Gram-Schmidt orthogonalisation ($Q$ has orthonormal columns, $R$ is upper triangular)
- $S = Q \Lambda Q^T$ -- Spectral theorem (for symmetric matrices $S$)
- $A = X \Lambda X^{-1}$ -- Eigendecomposition; exists only when $A$ is square with $n$ linearly independent eigenvectors (the columns of $X$)
- $A = U \Sigma V^T$ -- Singular Value Decomposition; works for all matrices; orthogonal * diagonal * orthogonal
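
As a rough check, four of these factorisations can be reproduced with NumPy's `numpy.linalg` routines. The random test matrix and the variable names are assumptions for illustration. $A = L U$ is left out because NumPy has no direct LU routine (SciPy's `scipy.linalg.lu` provides one, with row exchanges).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # a generic square matrix
S = A + A.T                       # a symmetric matrix

# A = Q R
Q, R = np.linalg.qr(A)
qr_ok = np.allclose(Q @ R, A)

# S = Q Lambda Q^T (eigh is the routine for symmetric matrices)
lam, Qs = np.linalg.eigh(S)
spectral_ok = np.allclose(Qs @ np.diag(lam) @ Qs.T, S)

# A = X Lambda X^{-1} (eigenvalues and eigenvectors may be complex)
w, X = np.linalg.eig(A)
eig_ok = np.allclose(X @ np.diag(w) @ np.linalg.inv(X), A)

# A = U Sigma V^T
U, sigma, Vt = np.linalg.svd(A)
svd_ok = np.allclose(U @ np.diag(sigma) @ Vt, A)

print(qr_ok, spectral_ok, eig_ok, svd_ok)
```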

### LU decomposition in rank-1 picture

- $A = l_1 u_1^T + \begin{pmatrix}0 & 0 \\ 0 & l_2 u_2^T \end{pmatrix} + \begin{pmatrix}0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & l_3 u_3^T \end{pmatrix} + \ldots $
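
The rank-1 picture can be sketched in NumPy as follows. This is a minimal illustration that assumes no row exchanges are needed (every pivot is nonzero); the function name `lu_rank_one_pieces` and the example matrix are hypothetical.

```python
import numpy as np

def lu_rank_one_pieces(A):
    """Peel off one rank-1 matrix l_k u_k^T per elimination step (no row exchanges)."""
    remainder = np.array(A, dtype=float)
    pieces = []
    for k in range(remainder.shape[0]):
        pivot = remainder[k, k]
        if np.isclose(pivot, 0.0):
            raise ValueError("zero pivot: this sketch does not do row exchanges")
        l_k = remainder[:, k] / pivot    # column k of L: zeros above row k, 1 on the diagonal
        u_k = remainder[k, :].copy()     # row k of U: zeros left of column k
        piece = np.outer(l_k, u_k)
        pieces.append(piece)
        remainder = remainder - piece    # row k and column k of the remainder become zero
    return pieces

A = np.array([[2., 1., 1.],
              [4., 3., 3.],
              [8., 7., 9.]])
pieces = lu_rank_one_pieces(A)
print(np.allclose(sum(pieces), A))
```

Each later piece is nonzero only in a smaller bottom-right block, which is what the padded zeros in the formula express.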

### Orthogonality of fundamental spaces of a matrix $A$

- $C(A^T)$ is orthogonal to $N(A)$ 
  - because each row of $A$ is orthogonal to every vector in $N(A)$ (that is what $A x = 0$ says, row by row)
- $C(A)$ is orthogonal to $N(A^T)$
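
A small numerical check of both orthogonality statements. As an assumption of convenience, bases for the four subspaces are read off from the SVD (listed among the factorisations above); the example matrix and the tolerance `1e-10` are illustrative choices.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])        # rank 2: row 2 = 2 * row 1

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))          # numerical rank

row_space  = Vt[:r].T               # columns span C(A^T)
null_space = Vt[r:].T               # columns span N(A)
col_space  = U[:, :r]               # columns span C(A)
left_null  = U[:, r:]               # columns span N(A^T)

print(np.allclose(A @ null_space, 0))             # A x = 0 on N(A)
print(np.allclose(row_space.T @ null_space, 0))   # C(A^T) is orthogonal to N(A)
print(np.allclose(col_space.T @ left_null, 0))    # C(A) is orthogonal to N(A^T)
```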



## Lecture 03: Orthonormal columns in $Q$ give $Q^T Q = I$

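$Q$ denotes a matrix with orthonormal columns, that is, $q_{:,i}^T q_{:,j} = \delta_{i,j}$. Take $Q$ to be $m \times n$ with $m \ge n$.

Thus:
- $Q^T Q = I_n$, and
- $Q Q^T$ is the $m \times m$ orthogonal projection onto $C(Q)$; it has rank $n$ and equals $I_m$ only when $Q$ is square ($m = n$)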

If $Q^T Q = Q Q^T = I$, then $Q$ is 'orthogonal'.
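
A quick NumPy check of the two products for a tall $Q$. The $5 \times 3$ shape and the use of `np.linalg.qr` to manufacture orthonormal columns are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 3)))   # 5x3 with orthonormal columns

gram = Q.T @ Q    # 3x3
P = Q @ Q.T       # 5x5

print(np.allclose(gram, np.eye(3)))   # Q^T Q is the identity
print(np.allclose(P, np.eye(5)))      # Q Q^T is not the identity for a tall Q
print(np.allclose(P @ P, P))          # Q Q^T is a projection: P^2 = P
```

For a tall $Q$ the product $Q Q^T$ is symmetric with $P^2 = P$ and rank equal to the number of columns, so it projects onto $C(Q)$ rather than acting as the identity.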

### Orthogonal matrices preserve length under $l_2$

i.e. $|Q x| = |x|$

**Proof**: $|Q x|^2 = (Q x)^T (Q x) = x^T (Q^T Q) x = x^T x = |x|^2$
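
The proof only uses $Q^T Q = I$, so the same check should pass for a tall matrix with orthonormal columns as well as for a square orthogonal one. A small sketch, with random test data as an assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
Q_square, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # orthogonal 4x4
Q_tall, _ = np.linalg.qr(rng.standard_normal((6, 4)))     # 6x4, orthonormal columns
x = rng.standard_normal(4)

norm_x = np.linalg.norm(x)
norm_square = np.linalg.norm(Q_square @ x)
norm_tall = np.linalg.norm(Q_tall @ x)

print(np.isclose(norm_square, norm_x), np.isclose(norm_tall, norm_x))
```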

### Examples of orthogonal matrices

#### rotation matrices
$\begin{pmatrix} \cos{\theta} & -\sin{\theta} \\ \sin{\theta} & \cos{\theta}\end{pmatrix}$

Rotates anti-clockwise by $\theta$ around the origin in 2-d
  
#### reflection matrices
$\begin{pmatrix} \cos{\theta} & \sin{\theta} \\ \sin{\theta} & -\cos{\theta}\end{pmatrix}$

Reflects the plane in the line through the origin at angle $\theta/2$
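
To see why the mirror line sits at $\theta/2$, the sketch below checks that a vector along that line is left unchanged and a vector perpendicular to it is negated. The value of `theta` and the names `along` and `across` are arbitrary illustrative choices.

```python
import numpy as np

theta = 0.7
c, s = np.cos(theta), np.sin(theta)
F = np.array([[c,  s],
              [s, -c]])                                        # reflection matrix

along  = np.array([np.cos(theta / 2), np.sin(theta / 2)])     # direction of the mirror line
across = np.array([-np.sin(theta / 2), np.cos(theta / 2)])    # perpendicular to the mirror line

print(np.allclose(F @ along, along))      # vectors on the line are fixed
print(np.allclose(F @ across, -across))   # perpendicular vectors flip sign
print(np.allclose(F @ F, np.eye(2)))      # reflecting twice gives the identity
```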

#### "Householder reflections"
Given a unit vector $u$, $H = I - 2 u u^T$ reflects across the hyperplane orthogonal to $u$; $H$ is both symmetric and orthogonal.
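
A minimal sketch of a Householder reflection in 3-d. The vectors `v` and `w` are arbitrary examples chosen for illustration.

```python
import numpy as np

v = np.array([1., 2., 2.])
u = v / np.linalg.norm(v)              # unit vector
H = np.eye(3) - 2 * np.outer(u, u)     # H = I - 2 u u^T

w = np.array([2., -1., 0.])            # orthogonal to u, since 1*2 + 2*(-1) + 2*0 = 0

print(np.allclose(H, H.T))             # symmetric
print(np.allclose(H @ H, np.eye(3)))   # H^T H = H^2 = I, so H is orthogonal
print(np.allclose(H @ u, -u))          # u is flipped
print(np.allclose(H @ w, w))           # vectors orthogonal to u are unchanged
```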

#### "Hadamard" matrices

$H_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$

$H_{2^n} = \frac{1}{\sqrt{2}} \begin{pmatrix} H_{2^{n-1}} & H_{2^{n-1}} \\ H_{2^{n-1}} & -H_{2^{n-1}} \end{pmatrix}$
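
The recursion is a Kronecker product with $H_2$, so it can be sketched with `np.kron`. The function name `hadamard` and its argument `k` (the exponent in $2^k$) are hypothetical choices for this illustration.

```python
import numpy as np

def hadamard(k):
    """Normalised 2^k x 2^k Hadamard matrix built from the recursion (k >= 1)."""
    H2 = np.array([[1., 1.],
                   [1., -1.]]) / np.sqrt(2)
    H = H2
    for _ in range(k - 1):
        H = np.kron(H2, H)   # equals (1/sqrt(2)) * [[H, H], [H, -H]]
    return H

H8 = hadamard(3)
print(H8.shape)
print(np.allclose(H8.T @ H8, np.eye(8)))
```

Every entry of `H8` is $\pm 1/\sqrt{8}$, so multiplying by $\sqrt{8}$ recovers a matrix of $1$s and $-1$s.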

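**Conjecture**: For every $n$ that is a multiple of $4$ there is an $n \times n$ matrix with entries $1$ and $-1$ whose columns are orthogonal (so it becomes an orthogonal matrix after scaling by $1/\sqrt{n}$). Constructions are known for every multiple of $4$ below $668$; $n = 668$ is the smallest open case.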

#### Wavelets

$W_4 = \begin{pmatrix}
1 & 1 & 1 & 0 \\
1 & 1 &-1 & 0 \\
1 &-1 & 0 & 1 \\
1 &-1 & 0 &-1
\end{pmatrix}$

(the columns are orthogonal; dividing them by their lengths $2, 2, \sqrt{2}, \sqrt{2}$ makes them orthonormal)
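
The scaling can be made concrete with a short NumPy sketch; the variable names are illustrative.

```python
import numpy as np

W4 = np.array([[1.,  1.,  1.,  0.],
               [1.,  1., -1.,  0.],
               [1., -1.,  0.,  1.],
               [1., -1.,  0., -1.]])

gram = W4.T @ W4                          # diagonal because the columns are orthogonal
col_norms = np.linalg.norm(W4, axis=0)    # length of each column
Q = W4 / col_norms                        # divide each column by its length

print(np.allclose(gram, np.diag([4., 4., 2., 2.])))
print(np.allclose(Q.T @ Q, np.eye(4)))
```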

Haar introduced these wavelets in 1910; in 1988 Ingrid Daubechies found families of wavelets whose entries are not just $1$, $-1$ and $0$.

#### Eigenvectors of a symmetric matrix

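Example: the columns of the discrete Fourier transform matrix are the eigenvectors of $Q = P_{2,3,\ldots,n-1,n,1}$ (i.e. $Q$ is the cyclic permutation matrix that puts row 2 in row 1, row 3 in row 2, etc., and row 1 in row $n$). $Q$ is orthogonal rather than symmetric, but its eigenvectors are still orthogonal. The Fourier vectors are also eigenvectors of the symmetric matrix $Q + Q^T$.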
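
A sketch that checks the Fourier vectors against the cyclic shift. The size `n = 8`, the name `P` for the permutation matrix, and the sign convention $\omega = e^{2 \pi i / n}$ are assumptions; `np.fft` uses the conjugate convention.

```python
import numpy as np

n = 8
P = np.roll(np.eye(n), -1, axis=0)    # row 2 of I moves to row 1, ..., row 1 moves to row n

omega = np.exp(2j * np.pi / n)
j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
F = omega ** (j * k) / np.sqrt(n)     # Fourier matrix, F[j, k] = omega^(j k) / sqrt(n)
eigenvalues = omega ** np.arange(n)   # column k has eigenvalue omega^k

print(np.allclose(P @ F, F * eigenvalues))       # P f_k = omega^k f_k for every column
print(np.allclose(F.conj().T @ F, np.eye(n)))    # the columns are orthonormal
```

The eigenvectors are complex, so orthonormality is checked with the conjugate transpose rather than the plain transpose.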