## 1. Recap: A few ways to multiply vectors and matrices

### 1.1. Vector multiplication operations (4 approaches)

Given two vectors, $\textbf{a}$ and $\textbf{b}$, of the same length (i.e. $\textbf{a}, \textbf{b}\in\mathbb{R}^{n}$), we can "multiply" them in the following ways:

1. Vector dot (inner) product: $\textbf{a} \cdot \textbf{b} = \textbf{a}^T\textbf{b} = \sum_{i=1}^n a_i b_i = \Vert\textbf{a}\Vert\:\Vert\textbf{b}\Vert \cos \theta$, where $\theta$ is the angle between $\textbf{a}$ and $\textbf{b}$. The result is a scalar
2. Vector outer product: $\textbf{a} \otimes \textbf{b} = \textbf{a} \textbf{b}^T$. The resultant matrix is of size $n \times n$ and its elements are given by: $ (\textbf{a} \otimes \textbf{b})_{ij} = a_i b_j$
3. Vector Hadamard (aka element-wise) product: $\mathbf{a}\odot \mathbf{b}$. Elements of the resultant vector are given by: $ (\mathbf{a}\odot \mathbf{b})_{i} = (\mathbf{a})_{i}(\mathbf{b})_{i} $
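4. Vector cross product (defined only for $n = 3$): $\mathbf{a} \times \mathbf{b} = \Vert \mathbf{a} \Vert \: \Vert \mathbf{b} \Vert\sin{(\theta)} \, \mathbf{n}$, where $\mathbf{n}$ is the unit vector perpendicular to both $\mathbf{a}$ and $\mathbf{b}$ (direction given by the right-hand rule)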

### 1.2. Matrix multiplication operations (4 approaches)

Given a matrix $\textbf{A}\in\mathbb{R}^{m \times n}$, the following are a few multiplication operations involving $\textbf{A}$. <mark>**NB: inner dimensions must match!**</mark>
1. Matrix $\textbf{A}$ and a vector:
   1. Matrix times vector: $\textbf{Av}$, where vector $\textbf{v}\in\mathbb{R}^{n \times 1}$. Hence, resultant column vector $\textbf{Av}\in\mathbb{R}^{m \times 1}$
   2. Vector times matrix: $\textbf{w}^\textrm{T}\textbf{A}$, where vector $\textbf{w}\in\mathbb{R}^{m \times 1}$, so $\textbf{w}^\textrm{T}\in\mathbb{R}^{1 \times m}$. Hence, resultant row vector $\textbf{w}^\textrm{T}\textbf{A}\in\mathbb{R}^{1 \times n}$
2. Matrix $\textbf{A}$ and another matrix 
   1. Matrix Hadamard (aka element-wise) product: $\mathbf{A}\odot \mathbf{B}$, where $\textbf{A},\textbf{B}\in\mathbb{R}^{m \times n}$ (both matrices must have the same shape). Elements of the resultant matrix are given by: $ (\mathbf{A}\odot \mathbf{B})_{ij} = (\mathbf{A})_{ij}(\mathbf{B})_{ij} $
   2. Matrix multiplication: $\textbf{AB}$, where $\textbf{A}\in\mathbb{R}^{m \times p}$ and $\textbf{B}\in\mathbb{R}^{p \times n}$. Hence, inner dimensions match, and resultant matrix $\textbf{AB}\in\mathbb{R}^{m \times n}$
      


In [11]:
import numpy as np  # more basic functionality
import scipy.linalg  # advanced linear algebra, built on numpy (explicit submodule import so scipy.linalg is always available)

# Find the inner and outer products of two 1D arrays (shape (3,), i.e. not explicit row or column vectors)
a = np.array([4, 5, 6])
b = np.array([7, 8, 9])

print("Given vectors a:", a, "and b:", b)

print("\n4 types of vector multiplication")
print(
    "- Inner (aka dot) product: a•b = (a^T)b =", np.inner(a, b)
)  # dot prod; dims are: [1x3][3x1]=[1x1] <-- output dim, scalar
print("- Hadamard (elementwise) product: a⊙b", a * b)  # elementwise (or hadamard) product
print("- Cross product, a⨉b:", np.cross(a, b))
print("- Outer product, a[3⨉1] ⨂ b[1⨉3]:\n", np.outer(a, b))  # dims are [3x1][1x3]=[3x3] <-- output dim


Given vectors a: [4 5 6] and b: [7 8 9]

4 types of vector multiplication
- Inner (aka dot) product: a•b = (a^T)b = 122
- Hadamard (elementwise) product: a⊙b [28 40 54]
- Cross product, a⨉b: [-3  6 -3]
- Outer product, a[3⨉1] ⨂ b[1⨉3]:
 [[28 32 36]
 [35 40 45]
 [42 48 54]]
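
The matrix operations from section 1.2 can be sketched in the same way. This is a minimal illustration: the arrays `A`, `B`, `C`, `v` and `w` are made-up values chosen only so the shapes line up.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3): m=2, n=3
v = np.array([1, 0, -1])         # length n=3
w = np.array([2, 1])             # length m=2

Av = A @ v                       # matrix times vector, length m
wA = w @ A                       # vector times matrix, length n
print("Av  =", Av)               # → [-2 -2]
print("w^T A =", wA)             # → [ 6  9 12]

B = np.array([[10, 20, 30],
              [40, 50, 60]])     # same shape as A, as Hadamard requires
hadamard = A * B                 # element-wise product, shape (2, 3)
print("A ⊙ B =\n", hadamard)

C = np.array([[1, 0],
              [0, 1],
              [1, 1]])           # shape (3, 2): inner dimension 3 matches A
AC = A @ C                       # matrix product, shape (2, 2)
print("AC =\n", AC)              # → [[ 4  5] [10 11]]
```

Note that `*` is always element-wise for NumPy arrays, while `@` performs the matrix product and raises an error if the inner dimensions do not match.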


## 2. More terminology:

### 2.1. Spans and spaces

- **Span**: The span of a set of vectors is **all linear combinations of those vectors**
- **Vector space**: A set of vectors that is closed under addition and scalar multiplication. The standard example is $\mathbb{R}^n$.
    - Every element in a vector space can be written as a **linear combination** of its **basis** vectors
        - **Basis (unit) vectors** (example): For the 2D vector space $\mathbb{R}^2$, the standard basis vectors are $\hat{i} = \left[\begin{smallmatrix} 1 \\ 0 \end{smallmatrix}\right]$ and $\hat{j} = \left[\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right]$
        - A matrix $\textbf{A}$ applies a linear transformation to a vector space (i.e. all vectors in the space)
        - <mark>The columns of</mark> $\textbf{A}$ <mark>represent the **landing points** for the basis (unit) vectors **after the transformation**</mark>
            - By extension, $\textbf{A}$ moves **every input vector** (more precisely, the **point where every vector's tip is**) **linearly** to a new location.
        - We only need to know how $\textbf{A}$ transforms the basis vectors $\hat{i}$ and $\hat{j}$, since
            - any other vector $\textbf{v}$ is <mark>just a **linear combination** of $\hat{i}$ and $\hat{j}$ **both before and after being transformed by $\textbf{A}$**</mark>
- **Subspace**: a subset of a larger vector space that is itself a vector space (it is closed under addition and scalar multiplication, so it always contains the zero vector)
    - **Column space** (aka *range*, or *image*): Span of all column vectors of matrix $\textbf{A}$
    - **Row space**: Span of all row vectors of matrix $\textbf{A}$
    - **Null space**: The set of all solutions $\textbf{x}$ of $\textbf{A} \textbf{x} = \textbf{0}$ constitutes the **null space** of $\textbf{A}$
    - **Left-Null space**: The set of all solutions $\textbf{x}$ of $\textbf{A}^\textrm{T} \textbf{x} = \textbf{0}$ constitutes the **left-null space** of $\textbf{A}$
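
The "landing points" idea above can be checked numerically. The following is a small sketch, assuming an arbitrary example matrix `A` and vector `v` (both made up for illustration):

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [1.0,  3.0]])
i_hat = np.array([1.0, 0.0])
j_hat = np.array([0.0, 1.0])

# The columns of A are where the basis vectors land after the transformation
print("A @ i_hat =", A @ i_hat, " (first column of A)")
print("A @ j_hat =", A @ j_hat, " (second column of A)")

# Any vector is a linear combination of the basis vectors: v = 3*i_hat - 2*j_hat
v = np.array([3.0, -2.0])
transformed = A @ v

# The transformed vector is the SAME combination of the transformed basis vectors
same_combination = v[0] * (A @ i_hat) + v[1] * (A @ j_hat)
print("A @ v            =", transformed)
print("same combination =", same_combination)
```

Both printed vectors agree, which is exactly the statement that knowing where $\hat{i}$ and $\hat{j}$ land is enough to know where every vector lands.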
    
### 2.2. More properties

- **Rank**: The number of **linearly independent** columns (or rows) in $\textbf{A}$ is its rank
- **Orthonormal vectors**: Unit-length vectors whose pairwise inner (i.e. dot) products are 0 (e.g. $\hat{i}$ and $\hat{j}$)
- **Real value matrices**:
    - **Orthogonal matrices**: If $\textbf{A}$ is square and its rows and columns are orthonormal vectors, $\textbf{A}$ is an orthogonal matrix. It satisfies:
        - $\textbf{A}^\textrm{T}\textbf{A} = \textbf{AA}^\textrm{T} = \textbf{I}$
    - **Symmetric matrix**: where $\textbf{A}^\textrm{T} = \textbf{A}$ (square matrices only) 
- **Complex value matrices**
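    - Hermitian matrix: Complex matrices' analog to symmetric matrix ($\textbf{A}^\textrm{H} = \textbf{A}$, where $\textbf{A}^\textrm{H}$ is the conjugate transpose)
    - Unitary matrix: Complex matrices' analog to orthogonal matrix ($\textbf{A}^\textrm{H}\textbf{A} = \textbf{AA}^\textrm{H} = \textbf{I}$)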
- **Determinant**: This can be computed for any square matrix $\textbf{A}$
    - A matrix is invertible if and only if $\det(\textbf{A}) \ne 0$. Such matrices are **non-singular** and satisfy $\textbf{AA}^{-1}=\textbf{A}^{-1}\textbf{A}=\textbf{I}$
    - Matrices where $\det(\textbf{A}) = 0$ are not invertible. They are **singular** matrices
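
The properties above can be verified with a few NumPy checks. This is only a sketch: the matrices `Q`, `S`, `H`, `U` and `singular` are hand-picked examples, not taken from the text.

```python
import numpy as np

theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a 2D rotation matrix
is_orthogonal = np.allclose(Q.T @ Q, np.eye(2)) and np.allclose(Q @ Q.T, np.eye(2))

S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
is_symmetric = np.allclose(S, S.T)                # S equals its transpose

H = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
is_hermitian = np.allclose(H, H.conj().T)         # H equals its conjugate transpose

U = np.array([[1, 1j],
              [1j, 1]]) / np.sqrt(2)
is_unitary = np.allclose(U.conj().T @ U, np.eye(2))  # U^H U = I

print(is_orthogonal, is_symmetric, is_hermitian, is_unitary)

# Determinant and invertibility
det_S = np.linalg.det(S)                          # non-zero, so S is invertible
S_inv = np.linalg.inv(S)
print("det(S) =", det_S, "| S @ S_inv is I:", np.allclose(S @ S_inv, np.eye(2)))

singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])                 # second row = 2 * first row
print("det(singular) is ~0:", np.isclose(np.linalg.det(singular), 0.0))
```

Comparisons use `np.allclose` / `np.isclose` rather than `==` because floating-point results are rarely exactly equal to their theoretical values.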
  
### 2.3. Examples of the above (where relevant):

- The **span** of vectors $\left[\begin{smallmatrix} 1 \\ 0 \end{smallmatrix}\right]$ and $\left[\begin{smallmatrix} 0 \\ 1 \end{smallmatrix}\right]$ is the whole $x$-$y$ plane.
- A vector, $\textbf{v}$, with 3 elements is said to exist in **vector space** $\mathbb{R}^3$
- The $\mathbb{R}^3$ vector, $\textbf{v}$ is composed of (i.e. a linear combination of) the **basis vectors** $\hat{i} = \left[\begin{smallmatrix} 1 \\ 0 \\ 0 \end{smallmatrix}\right]$, $\hat{j} = \left[\begin{smallmatrix} 0 \\ 1 \\ 0 \end{smallmatrix}\right]$ and $\hat{k} = \left[\begin{smallmatrix} 0 \\ 0 \\ 1 \end{smallmatrix}\right]$.
- One **subspace** of the 3D vector space $\mathbb{R}^3$ is the span of the vectors $\left[\begin{smallmatrix} 1 \\ 0 \\ 0 \end{smallmatrix}\right]$ and $\left[\begin{smallmatrix} 0 \\ 1 \\ 0 \end{smallmatrix}\right]$.
    - in this case the 2D $x$-$y$ plane is a subspace (subset) of the 3D $x, y, z$ vector space
- Vectors $\left[\begin{smallmatrix} 1 \\ 2 \\ 3 \end{smallmatrix}\right]$ and $\left[\begin{smallmatrix} 10 \\ 20 \\ 30 \end{smallmatrix}\right]$ are linearly dependent since one is a multiple of the other.
    - A matrix with those two vectors as its columns (or rows) would be rank 1
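
The rank example above can be checked directly. The sketch below uses `np.linalg.matrix_rank` and `scipy.linalg.null_space`; the matrix `M` simply stacks the two dependent vectors as columns.

```python
import numpy as np
from scipy.linalg import null_space

# Columns are [1, 2, 3] and [10, 20, 30]: the second is 10x the first
M = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

rank = np.linalg.matrix_rank(M)
print("rank(M) =", rank)          # → 1

# Orthonormal basis for the null space of M (all x with M @ x = 0)
N = null_space(M)
print("null space basis shape:", N.shape)
print("M @ N is ~0:", np.allclose(M @ N, 0.0))
```

Since `M` has 2 columns and rank 1, its null space is 1-dimensional, which is why the basis returned has a single column.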

# TODO: Kronecker product, Tensors, Tensor products

https://en.wikipedia.org/wiki/Tensor
https://en.wikipedia.org/wiki/Tensor_product
https://en.wikipedia.org/wiki/Kronecker_product
https://en.wikipedia.org/wiki/Block_matrix
https://math.stackexchange.com/questions/973559/outer-product-of-two-matrices
https://stackoverflow.com/questions/24839481/python-matrix-outer-product


## 3. Gram-Schmidt Process

Use this to orthonormalise a set of linearly independent vectors (e.g. the columns of a matrix)

- Orthonormalise a set of vectors $\{\textbf{v}_1, \textbf{v}_2, \textbf{v}_3, ..., \textbf{v}_n\}$ 
    - to $\{\textbf{u}_1, \textbf{u}_2, \textbf{u}_3, ..., \textbf{u}_n\}$, where the $\textbf{u}_i$ vectors span the same subspace as the $\textbf{v}_i$ vectors, 
        - but each $\textbf{u}_i$ vector is unit length, and 
        - is mutually orthogonal with the other $\textbf{u}_j$ vectors
    - Each step subtracts from $\textbf{v}_k$ its projections onto the previously computed $\textbf{u}_j$, then normalises the remainder to unit length

I.e. Transform a set of vectors into a set of orthonormal vectors that spans the same subspace
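
A minimal sketch of the process follows. The function name `gram_schmidt` and the example matrix `V` are illustrative assumptions; the loop uses the "modified" ordering of the subtractions, which is mathematically equivalent to the classical form but tends to behave better in floating point.

```python
import numpy as np


def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise the rows of `vectors` (assumed linearly independent)."""
    basis = []
    for v in np.asarray(vectors, dtype=float):
        u = v.copy()
        # Remove the component of u along every direction found so far
        for q in basis:
            u -= (q @ u) * q
        norm = np.linalg.norm(u)
        if norm < tol:
            raise ValueError("input vectors are linearly dependent")
        basis.append(u / norm)        # normalise to unit length
    return np.array(basis)


V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])       # each row is one input vector
U = gram_schmidt(V)

# Rows of U are orthonormal, so U @ U.T should be the identity
print(np.round(U @ U.T, 10))
```

The first output vector is just the first input vector scaled to unit length; every later vector has the earlier directions removed before it is normalised.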

## 4. Matrix decompositions

### 4.1. Gaussian Elimination (or Decomposition?)

- **Purpose**: We use Gaussian Elimination to reduce a system of linear equations, $Ax=b$, to *row echelon form* (or to *reduced row echelon form*, which allows solving $Ax=b$ by simple inspection)
- **Applications**: 
    - Solving the linear system $Ax=b$
    - Computing inverse matrices
    - Computing rank
    - Computing determinant
- **Elementary row operations**: The operations by which the above are done
    - Swapping two rows
    - Scaling a row by a non-zero constant
    - Adding a multiple of one row to another (i.e. creating linear combinations)

- **Row echelon form**: The first *non-zero* element from the left in each row (aka leading coefficient, pivot) is **always to the right of** the first *non-zero* element in the row above, and any all-zero rows are at the bottom
- **Reduced row echelon form**: Row echelon form whose pivots are $1$ and whose pivot columns are $0$ elsewhere
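
The elimination can be sketched with row swaps and row additions only. The helper `row_echelon` below is a hypothetical illustration (not a library function), and the 3x3 system is a made-up example; the back substitution assumes a square, non-singular $A$.

```python
import numpy as np


def row_echelon(M, tol=1e-12):
    """Reduce M to row echelon form using row swaps and row additions."""
    A = np.array(M, dtype=float)
    rows, cols = A.shape
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        # Partial pivoting: pick the largest entry in this column as the pivot
        best = pivot_row + np.argmax(np.abs(A[pivot_row:, col]))
        if abs(A[best, col]) < tol:
            continue                                  # no pivot in this column
        A[[pivot_row, best]] = A[[best, pivot_row]]   # swap rows
        # Subtract multiples of the pivot row to zero out entries below it
        for r in range(pivot_row + 1, rows):
            A[r] -= (A[r, col] / A[pivot_row, col]) * A[pivot_row]
        pivot_row += 1
    return A


A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])
b = np.array([8.0, -11.0, -3.0])

# Eliminate on the augmented matrix [A | b]
R = row_echelon(np.column_stack([A, b]))

# Back substitution, valid here because A is square and non-singular
n = A.shape[0]
x = np.zeros(n)
for i in reversed(range(n)):
    x[i] = (R[i, -1] - R[i, i + 1:n] @ x[i + 1:]) / R[i, i]

print("x =", x)
```

In practice `np.linalg.solve(A, b)` does this work (via LAPACK's LU routines); the loop above is only meant to make the row operations visible.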

### 4.2. LU Decomposition

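LU decomposition records the steps of Gaussian elimination as matrix factors. Once $A$ has been factorised, solving $Ax=b$ for many different $b$ is cheaper than repeating the elimination each time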

Decompose a matrix $A$ (square or not) as $A = PLU$, where:
- $L$ is a lower triangular matrix (with ones on its diagonal)
- $U$ is an upper triangular matrix
- $P$ is a permutation matrix, needed when the rows of $A$ must be reordered (pivoting)

In [None]:
a = np.random.randn(3, 4)
print("A:\n", a)

p, l, u = scipy.linalg.lu(a)
print("\nP:\n", p)
print("\nL:\n", l)
print("\nU:\n", u)
print("\n----\n\nRecomposition: PLU = A:\n", p @ l @ u)
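
To illustrate why the factorisation is worth keeping, the sketch below factorises once with `scipy.linalg.lu_factor` and reuses the result for two right-hand sides with `scipy.linalg.lu_solve`. The matrix `A` and vectors `b1`, `b2` are made-up example values.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])

# Factorise once: `lu` packs L and U together, `piv` holds the row swaps
lu, piv = lu_factor(A)

# Reuse the same factorisation for several right-hand sides
b1 = np.array([8.0, -11.0, -3.0])
b2 = np.array([1.0, 0.0, 0.0])
x1 = lu_solve((lu, piv), b1)
x2 = lu_solve((lu, piv), b2)

print("x1 =", x1, "| residual ok:", np.allclose(A @ x1, b1))
print("x2 =", x2, "| residual ok:", np.allclose(A @ x2, b2))
```

Unlike `scipy.linalg.lu`, `lu_factor` requires a square matrix and returns a compact form intended for solving rather than for inspecting $P$, $L$ and $U$ separately.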


### 4.3. QR Decomposition

Decompose a matrix $A$ into:
- an orthogonal matrix $Q$
- an upper triangular matrix $R$

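It's used to solve the linear least squares problem, and it is the building block of the QR algorithm for computing eigenvalues. 

Also, the $Q$ matrix is what the **Gram-Schmidt process** produces when applied to the columns of $A$ (up to the signs of the columns)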

In [None]:
a = np.random.randn(3, 4)
print("A:\n", a)

q, r = np.linalg.qr(a)
print("\nQ:\n", q)
print("\nR:\n", r)
print("\n----\n\nRecomposition: QR = A:\n", q @ r)
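
As a sketch of the least squares use case, the example below fits a straight line $y \approx c_0 + c_1 t$ by solving $Rx = Q^\textrm{T}y$. The data points are invented for illustration, and the result is compared against `np.linalg.lstsq` as a sanity check.

```python
import numpy as np
from scipy.linalg import solve_triangular

# Overdetermined system: 4 observations, 2 unknowns (intercept and slope)
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
A = np.column_stack([np.ones_like(t), t])      # shape (4, 2)

# Reduced QR: Q has orthonormal columns (4, 2), R is upper triangular (2, 2)
Q, R = np.linalg.qr(A)

# Minimising ||Ax - y|| reduces to the triangular system R x = Q^T y
coeffs = solve_triangular(R, Q.T @ y)          # upper triangular is the default
print("intercept, slope (QR):   ", coeffs)

# Cross-check against NumPy's built-in least squares solver
reference = np.linalg.lstsq(A, y, rcond=None)[0]
print("intercept, slope (lstsq):", reference)
```

Because $Q$ has orthonormal columns, multiplying by $Q^\textrm{T}$ does not amplify rounding errors, which is the usual reason for preferring QR over forming $A^\textrm{T}A$ explicitly.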


### 4.4. Cholesky Decomposition

Decompose a symmetric (or Hermitian) positive-definite matrix into:

- a lower triangular matrix $L$
- and its transpose $L^\textrm{T}$ (or conjugate transpose $L^\textrm{H}$), so that $A = LL^\textrm{T}$ (or $A = LL^\textrm{H}$)

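Used for numerical efficiency, e.g. solving $Ax=b$ when $A$ is symmetric positive-definite (roughly half the cost of LU), or generating correlated random samples in Monte Carlo simulations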

In [None]:
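# diagflat flattens its input, giving the 4x4 diagonal matrix diag(1, 2, 3, 4), which is positive-definite
x = np.diagflat([[1, 2], [3, 4]])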
print("x:\n", x)

L = np.linalg.cholesky(x)
print("\nL:\n", L)

print("\n----\n\nRecomposition: LL^T:\n", L @ L.T)
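
A diagonal matrix makes for a rather trivial factor, so here is a sketch with a non-diagonal symmetric positive-definite matrix, followed by a linear solve that reuses the factorisation via `scipy.linalg.cho_factor` and `cho_solve`. The matrix `S` and vector `b` are made-up example values.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Symmetric positive-definite example (all leading principal minors are positive)
S = np.array([[4.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])

L = np.linalg.cholesky(S)                  # lower triangular factor
print("L:\n", L)
print("L @ L.T recovers S:", np.allclose(L @ L.T, S))

# Solve S x = b using the Cholesky factorisation
b = np.array([2.0, 1.0, 3.0])
x = cho_solve(cho_factor(S), b)
print("x =", x, "| residual ok:", np.allclose(S @ x, b))
```

If the input is not positive-definite, `np.linalg.cholesky` raises `LinAlgError`, which also makes it a common quick test for positive-definiteness.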


# Questions

- When exactly do we use decompositions?