# PCA

## Dot product

- $\cos(\theta) = \dfrac{x^{T}y}{\|x\|\|y\|} = \dfrac{<x,y>}{\|x\|\|y\|}$
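A minimal sketch of the angle formula above, assuming NumPy and the standard dot product as the inner product. The function name `angle_between` and the sample vectors are illustrative only.

```python
import numpy as np

def angle_between(x, y):
    """Angle in radians between two non-zero vectors, via the dot product."""
    cos_theta = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
    # Clip guards against values slightly outside [-1, 1] caused by rounding.
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])
theta = angle_between(x, y)  # the two vectors are 45 degrees apart
```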

## Inner product

- $x,y \in V$
- $<\cdot,\cdot>: V \times V \to \mathbb{R}$

### Symmetric

- $<x,y> = <y,x>$

### Positive definite

- $<x,x> \ge 0$ and $<x,x> = 0$ iff $x = 0$

### Bilinear

- $x,y,z \in V, \lambda \in \mathbb{R}$
- $<\lambda x + z, y> = \lambda<x,y> + <z,y>$
- $<x, \lambda y + z> = \lambda<x,y> + <x,z>$
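The three properties can be checked numerically for a concrete inner product. The sketch below assumes the inner product $<x,y> = x^{T}Ay$ with a symmetric positive definite matrix $A$. Both `A` and the test vectors are chosen here for illustration and are not taken from the notes.

```python
import numpy as np

# Assumed example: A is symmetric with eigenvalues 1 and 3, so it is
# positive definite and x^T A y defines a valid inner product.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

def inner(x, y):
    return x @ A @ y

x = np.array([1.0, 2.0])
y = np.array([-1.0, 0.5])
z = np.array([0.0, 3.0])
lam = 1.5

symmetric = np.isclose(inner(x, y), inner(y, x))
positive = inner(x, x) > 0
linear_first = np.isclose(inner(lam * x + z, y), lam * inner(x, y) + inner(z, y))
linear_second = np.isclose(inner(x, lam * y + z), lam * inner(x, y) + inner(x, z))
```

A check on a few vectors is only a sanity test, not a proof that the properties hold for all of $V$.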

## Length / Norm of a vector

- $\|x\| = \sqrt{<x,x>}$

## Triangle inequality

- $\|x+y\| \le \|x\| + \|y\|$

## Distance

- $d(x,y) = \|x-y\| = \sqrt{<x-y,x-y>}$
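As a small illustration of the norm, the triangle inequality, and the distance, here is a sketch that assumes the dot product as the inner product. The helper names `norm` and `distance` are illustrative.

```python
import numpy as np

def norm(x):
    # ||x|| = sqrt(<x, x>) with the dot product as inner product
    return np.sqrt(x @ x)

def distance(x, y):
    # d(x, y) = ||x - y||
    return norm(x - y)

x = np.array([3.0, 4.0])
y = np.array([0.0, 1.0])

length_x = norm(x)        # sqrt(9 + 16) = 5
dist_xy = distance(x, y)  # ||(3, 3)|| = sqrt(18)
triangle_holds = norm(x + y) <= norm(x) + norm(y)
```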

## Projection onto 1D subspace

Let $b$ be a basis vector of the 1D subspace $U$. Using two conditions
- $\Pi_{u}(x) \in U \Rightarrow \exists \lambda \in \mathbb{R}: \Pi_{u}(x) = \lambda b$
- $<b, \Pi_{u}(x)-x> = 0$

Then
- $\Pi_{u}(x) = \dfrac{bb^{T}}{\|b\|^{2}}x$
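The projection formula can be sketched in NumPy as follows, assuming the dot product as the inner product. The function name `project_onto_line` and the vectors are illustrative.

```python
import numpy as np

def project_onto_line(x, b):
    """Project x onto the line spanned by b using P = b b^T / ||b||^2."""
    P = np.outer(b, b) / (b @ b)
    return P @ x

b = np.array([1.0, 2.0])
x = np.array([3.0, 1.0])

p = project_onto_line(x, b)  # lambda = b^T x / ||b||^2 = 5 / 5 = 1, so p = b
residual_dot = b @ (p - x)   # orthogonality condition <b, p - x> = 0
```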

## Projection onto higher dimensional subspace

- $\lambda = \begin{pmatrix} \lambda_{1} \\ \vdots \\ \lambda_{M}  \end{pmatrix} (M \times 1)$ 
- $B = \begin{pmatrix} B_{1} | \dots | B_{M} \end{pmatrix} (D \times M)$

Then
- $\Pi_{u}(x) = B\lambda = B(B^{T}B)^{-1}B^{T}x$
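A hedged sketch of the general projection, assuming the columns of $B$ are linearly independent so that $B^{T}B$ is invertible. The function name and the example matrix are illustrative. The code solves the linear system for $\lambda$ instead of forming the inverse explicitly, which is a common numerical choice and not something the notes prescribe.

```python
import numpy as np

def project_onto_subspace(x, B):
    """Project x onto the column space of B (D x M, independent columns)."""
    # Solve (B^T B) lam = B^T x for the coordinates lam (length M).
    lam = np.linalg.solve(B.T @ B, B.T @ x)
    return B @ lam, lam

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x = np.array([1.0, 2.0, 0.0])

p, lam = project_onto_subspace(x, B)
residual = x - p  # should be orthogonal to every column of B
```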

## PCA

- Given $X = \{x_{1}, \dots, x_{N}\}$ where $x_{i} \in \mathbb{R}^{D}$

We have
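- $x_{n} = \displaystyle\sum_{i=1}^{D}B_{in}b_{i}, B_{in} = x_{n}^{T}b_{i}$, where $b_{1} \dots b_{D}$ is an orthonormal basis of $\mathbb{R}^{D}$ and $B_{in}$ is a scalar coordinate
- $\tilde{x_{n}} = BB^{T}x_{n}$, with the matrix $B = (b_{1} \dots b_{M})$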

From above
- $x_{n} = \displaystyle\sum_{i=1}^{M}B_{in}b_{i} + \displaystyle\sum_{i=M+1}^{D}B_{in}b_{i}$ (PCA ignores the second term; $b_{1} \dots b_{M}$ span the principal subspace)

Loss function is given by
- $J = \dfrac{1}{N}\displaystyle\sum_{n=1}^{N}\|x_{n}-\tilde{x_{n}}\|^{2}$

Then, the derivatives are
- $\dfrac{\partial J}{\partial \{B_{in}b_{i}\}} = \dfrac{\partial J}{\partial \tilde{x_{n}}}\dfrac{\partial \tilde{x_{n}}}{\partial \{B_{in}b_{i}\}} = -\dfrac{2}{N}(x_{n}-\tilde{x_{n}})^{T}\dfrac{\partial \tilde{x_{n}}}{\partial \{B_{in}b_{i}\}}$
- $\dfrac{\partial \tilde{x_{n}}}{\partial B_{in}} = b_{i}$
- $\dfrac{\partial J}{\partial B_{in}} = \dfrac{\partial J}{\partial \tilde{x_{n}}}\dfrac{\partial \tilde{x_{n}}}{\partial B_{in}} = -\dfrac{2}{N}(x_{n}-\tilde{x_{n}})^{T}b_{i} = -\dfrac{2}{N}(x_{n}-\displaystyle\sum_{j=1}^{M}B_{jn}b_{j})^{T}b_{i} = -\dfrac{2}{N}(x_{n}^{T}b_{i}-B_{in}b_{i}^{T}b_{i}) = -\dfrac{2}{N}(x_{n}^{T}b_{i}-B_{in}) = 0$

Thus
- $B_{in} = x_{n}^{T}b_{i}$

Now
- $\tilde{x_{n}} = \displaystyle\sum_{j=1}^{M}B_{jn}b_{j} = \displaystyle\sum_{j=1}^{M}(x_{n}^{T}b_{j})b_{j} = \displaystyle\sum_{j=1}^{M}b_{j}(b_{j}^{T}x_{n}) = \displaystyle\sum_{j=1}^{M}(b_{j}b_{j}^{T})x_{n}$

Note
- $x_{n} = \left(\displaystyle\sum_{j=1}^{M}b_{j}b_{j}^{T}\right)x_{n} + \left(\displaystyle\sum_{j=M+1}^{D}b_{j}b_{j}^{T}\right)x_{n}$

Then
- $x_{n} - \tilde{x_{n}} = \left(\displaystyle\sum_{j=M+1}^{D}b_{j}b_{j}^{T}\right)x_{n} = \displaystyle\sum_{j=M+1}^{D}(b_{j}^{T}x_{n})b_{j}$
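The split of $x_{n}$ into a reconstructed part and a discarded part can be checked numerically. The sketch below assumes an arbitrary orthonormal basis of $\mathbb{R}^{D}$, obtained here from a QR decomposition of a random matrix. The dimensions, the seed, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 4, 2

# Assumed setup: any orthonormal basis works; QR of a random matrix gives one.
Q, _ = np.linalg.qr(rng.normal(size=(D, D)))
B_principal = Q[:, :M]  # columns b_1 ... b_M
B_discarded = Q[:, M:]  # columns b_{M+1} ... b_D

x_n = rng.normal(size=D)

# x_tilde = sum_{j<=M} (b_j b_j^T) x_n
x_tilde = B_principal @ (B_principal.T @ x_n)
# residual = sum_{j>M} (b_j^T x_n) b_j
residual = B_discarded @ (B_discarded.T @ x_n)
```

Under these assumptions `x_n - x_tilde` equals `residual` up to rounding, and the two parts are orthogonal.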

Loss function becomes
- $J = \dfrac{1}{N}\displaystyle\sum_{n=1}^{N}\|x_{n}-\tilde{x_{n}}\|^{2} = \dfrac{1}{N}\displaystyle\sum_{n=1}^{N}\left\|\displaystyle\sum_{j=M+1}^{D}(b_{j}^{T}x_{n})b_{j}\right\|^{2} = \dfrac{1}{N}\displaystyle\sum_{n=1}^{N}\displaystyle\sum_{j=M+1}^{D}(b_{j}^{T}x_{n})^{2}$ (the $b_{j}$ are orthonormal, so the cross terms vanish)
- $J = \dfrac{1}{N}\displaystyle\sum_{n=1}^{N}\displaystyle\sum_{j=M+1}^{D}b_{j}^{T}x_{n}x_{n}^{T}b_{j} = \displaystyle\sum_{j=M+1}^{D}b_{j}^{T}\left(\dfrac{1}{N}\displaystyle\sum_{n=1}^{N}x_{n}x_{n}^{T}\right)b_{j} = \displaystyle\sum_{j=M+1}^{D}b_{j}^{T}Sb_{j} = \mathrm{trace}\left[\left(\displaystyle\sum_{j=M+1}^{D}b_{j}b_{j}^{T}\right)S\right]$, where $S = \dfrac{1}{N}\displaystyle\sum_{n=1}^{N}x_{n}x_{n}^{T}$

Example
- With $D = 2$ and $M = 1$: $J = b_{2}^{T}Sb_{2}$, subject to $b_{2}^{T}b_{2} = 1$
- Lagrange $L = b_{2}^{T}Sb_{2} + \lambda(1 - b_{2}^{T}b_{2})$
- $\dfrac{\partial L}{\partial \lambda} = 1 - b_{2}^{T}b_{2} = 0$
- $\dfrac{\partial L}{\partial b_{2}} = 2b_{2}^{T}S - 2\lambda b_{2}^{T} = 0$
- $Sb_{2} = \lambda b_{2}$
- Now, $J = b_{2}^{T}Sb_{2} = b_{2}^{T}b_{2}\lambda = \lambda$
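- Thus, in general, $J = \displaystyle\sum_{j=M+1}^{D}\lambda_{j}$, which is smallest when $b_{M+1} \dots b_{D}$ are the eigenvectors of $S$ with the smallest eigenvalues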
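To tie the derivation together, here is a minimal sketch that computes the principal subspace from the eigenvectors of $S$ and compares the reconstruction loss with the sum of the discarded eigenvalues. The synthetic data, the seed, and the dimensions are assumptions made for the demonstration. The data are centred so that $S$ is the data covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, M = 200, 3, 2

# Synthetic data (an assumption for this demo) with different spread per axis.
X = rng.normal(size=(N, D)) * np.array([3.0, 1.0, 0.2])
X = X - X.mean(axis=0)  # centre the data

S = X.T @ X / N  # S = (1/N) sum_n x_n x_n^T

# eigh is for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]  # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

B = eigvecs[:, :M]    # b_1 ... b_M span the principal subspace
X_tilde = X @ B @ B.T  # row n holds (B B^T x_n)^T

J = np.mean(np.sum((X - X_tilde) ** 2, axis=1))  # average squared error
discarded = eigvals[M:].sum()                    # sum of lambda_j, j > M
```

In this sketch `J` and `discarded` agree up to floating-point error, which matches the result $J = \sum_{j=M+1}^{D}\lambda_{j}$.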