# Computing SVD

In [1]:
%matplotlib inline
%config InlineBackend.figure_format='retina'
# import libraries
import numpy as np
import matplotlib as mp
import pandas as pd
import matplotlib.pyplot as plt
import slideUtilities as sl
import laUtilities as ut
import seaborn as sns
from importlib import reload
from datetime import datetime
from IPython.display import Image
from IPython.display import display_html
from IPython.display import display
from IPython.display import Math
from IPython.display import Latex
from IPython.display import HTML
print('')




In [2]:
%%html
<style>
 .container.slides .celltoolbar, .container.slides .hide-in-slideshow {
    display: None ! important;
}
</style>

%Set up useful MathJax (Latex) macros.
%See http://docs.mathjax.org/en/latest/tex.html#defining-tex-macros
%These are for use in the slideshow
$\newcommand{\mat}[1]{\left[\begin{array}#1\end{array}\right]}$
$\newcommand{\vx}{{\mathbf x}}$
$\newcommand{\hx}{\hat{\mathbf x}}$
$\newcommand{\vbt}{{\mathbf\beta}}$
$\newcommand{\vy}{{\mathbf y}}$
$\newcommand{\vz}{{\mathbf z}}$
$\newcommand{\R}{{\mathbb{R}}}$
$\newcommand{\vu}{{\mathbf u}}$
$\newcommand{\vv}{{\mathbf v}}$
$\newcommand{\vw}{{\mathbf w}}$
$\newcommand{\col}{{\operatorname{Col}}}$
$\newcommand{\nul}{{\operatorname{Nul}}}$
$\newcommand{\vb}{{\mathbf b}}$
$\newcommand{\va}{{\mathbf a}}$
$\newcommand{\ve}{{\mathbf e}}$
$\newcommand{\setb}{{\mathcal{B}}}$
$\newcommand{\rank}{{\operatorname{rank}}}$
$\newcommand{\vp}{{\mathbf p}}$

As a reminder, here is what the SVD looks like:

$$ \mbox{objects}\left\{\begin{array}{c}\;\\\;\\\;\\\;\\\;\end{array}\right.\;\;\overbrace{\left[\begin{array}{cccc}\begin{array}{c}\vdots\\\vdots\\{\bf a_1}\\\vdots\\\vdots\end{array}&\begin{array}{c}\vdots\\\vdots\\{\bf a_2}\\\vdots\\\vdots\end{array}&\dots&\begin{array}{c}\vdots\\\vdots\\{\bf a_n}\\\vdots\\\vdots\end{array}\\\end{array}\right]}^{\mbox{features}} =
\overbrace{\left[\begin{array}{cc}\vdots&\vdots\\\vdots&\vdots\\\sigma_1\vu_1&\sigma_k\vu_k\\\vdots&\vdots\\\vdots&\vdots\end{array}\right]}^{\large k}
\times
\left[\begin{array}{ccccc}\dots&\dots&\vv_1&\dots&\dots\\\dots&\dots&\vv_k&\dots&\dots\end{array}\right]$$


$$ A = U\Sigma V^T$$

More formally: let $A$ be an $m\times n$ matrix with $m \leq n$.
Recall that, by the singular value decomposition, there exist matrices $U_{m \times m}$,
$\Sigma_{m \times n}$, and $V_{n \times n}$ with $A = U \Sigma V^T$,
where $U$ and $V$ have orthonormal columns and $\Sigma$ is diagonal.
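
The shapes in this statement can be checked numerically. The following is a minimal sketch using NumPy's `np.linalg.svd`; the matrix size, the seed, and the variable names are illustrative choices, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5                              # m <= n, as in the text
A = rng.standard_normal((m, n))

# full_matrices=True returns U (m x m) and V^T (n x n);
# s holds the min(m, n) singular values in decreasing order
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Build the m x n "diagonal" Sigma from the vector of singular values
Sigma = np.zeros((m, n))
Sigma[:m, :m] = np.diag(s)

A_rebuilt = U @ Sigma @ Vt
print(U.shape, Sigma.shape, Vt.shape)
print(np.allclose(A, A_rebuilt))
```

Note that NumPy returns $V^T$ rather than $V$, and the singular values as a vector rather than as the matrix $\Sigma$.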

**Claim 1:** For a square symmetric matrix $M$, any two eigenvectors $\mathbf{v}_1$, $\mathbf{v}_2$
 with distinct eigenvalues $\lambda_1$, $\lambda_2$ are orthogonal, i.e., the inner product of $\mathbf{v}_1$ and $\mathbf{v}_2$ is zero.

**Proof:** Since $\mathbf{v}_1$ is an eigenvector of $M$, we have that:

$$M\mathbf{v}_1 = \lambda_1 \mathbf{v}_1$$

Taking the inner product of both sides with $\mathbf{v}_2$, we get that

$$M\mathbf{v}_1\cdot \mathbf{v}_2 = \lambda_1\mathbf{v}_1\cdot \mathbf{v}_2.$$

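Because $M$ is symmetric ($M^T = M$) and $\mathbf{v}_2$ is an eigenvector of $M$ with eigenvalue $\lambda_2$, we also know that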

$$M\mathbf{v}_1\cdot \mathbf{v}_2 = \mathbf{v}_1\cdot M^T\mathbf{v}_2 = \mathbf{v}_1\cdot M\mathbf{v}_2 = \mathbf{v}_1\cdot \lambda_2\mathbf{v}_2.$$

That is,

$$M\mathbf{v}_1\cdot \mathbf{v}_2 = \lambda_2\mathbf{v}_1\cdot \mathbf{v}_2.$$

Thus,

$$\lambda_1\mathbf{v}_1\cdot \mathbf{v}_2 = \lambda_2\mathbf{v}_1\cdot \mathbf{v}_2,$$

that is, $(\lambda_1-\lambda_2)\,\mathbf{v}_1\cdot \mathbf{v}_2 = 0$. Since $\lambda_1\neq \lambda_2$, it follows that $\mathbf{v}_1\cdot \mathbf{v}_2=0$, and therefore the two vectors are orthogonal.
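
As a quick sanity check of Claim 1, here is a small sketch on a hand-picked $2\times 2$ symmetric matrix (the matrix is an illustrative choice). It uses `np.linalg.eig`, which makes no symmetry assumption, so the orthogonality is not built into the routine.

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])               # symmetric, eigenvalues 1 and 3

# Columns of eigvecs are the eigenvectors, in the same order as eigvals
eigvals, eigvecs = np.linalg.eig(M)
v1, v2 = eigvecs[:, 0], eigvecs[:, 1]

print(eigvals)                           # two distinct eigenvalues
print(np.dot(v1, v2))                    # numerically zero
```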

**Claim 2:** The matrices $AA^T$ and $A^TA$ are symmetric.

**Proof:** It is enough to show that $C=AA^T$ is equal to $C^T = (AA^T)^T$:

$$C^T = (AA^T)^T = (A^T)^TA^T = AA^T = C.$$

The proof is the same for $A^TA$.

**Claim 3:** If $A = U\Sigma V^T$, then $AA^T=U\Sigma ^2 U^T$ and $U^{-1}=U^T$.

**Proof:** First we need to show that $U^{-1}=U^T$. For this, we have that:

$$U^TU = \left[\begin{array}{ccc}
\mathbf{u}_1\cdot \mathbf{u}_1 & \mathbf{u}_1\cdot \mathbf{u}_2 & \ldots\\
\mathbf{u}_2\cdot \mathbf{u}_1 & \mathbf{u}_2\cdot \mathbf{u}_2 & \ldots\\
\vdots & \vdots & \ddots
\end{array}\right]$$

Since the columns of $U$ are orthonormal, $\mathbf{u}_i\cdot \mathbf{u}_i=1$ and
$\mathbf{u}_i\cdot \mathbf{u}_j=0$ for $i\neq j$. Therefore $U^TU=I$ and, because $U$ is square, $U^{-1}=U^T$.


Similarly for $V$: $V^{-1}=V^T$.

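As a result, since $V^TV=I$, we have that:

$$ AA^T = U\Sigma V^T\left(U\Sigma V^T\right)^T =U\Sigma V^T V \Sigma^T U^T = U\Sigma\Sigma^T U^T = U\Sigma^2 U^T= U\Sigma^2U^{-1},$$

where $\Sigma^2$ is shorthand for the diagonal matrix $\Sigma\Sigma^T$, whose diagonal entries are $\sigma_i^2$.

Similarly, with $\Sigma^2$ now standing for $\Sigma^T\Sigma$:

$$A^TA = V\Sigma^2 V^T=V\Sigma^2V^{-1}$$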
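
Claim 3 can be verified numerically. This is a sketch under illustrative assumptions (a random $3\times 5$ matrix, a fixed seed); because $\Sigma$ is $m\times n$, the squared singular values are padded with zeros to build the $n\times n$ diagonal matrix for $A^TA$.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 5
A = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T

# U and V are orthogonal: their transposes are their inverses
print(np.allclose(U.T @ U, np.eye(m)), np.allclose(V.T @ V, np.eye(n)))

# A A^T = U diag(sigma_i^2) U^T, an m x m identity
print(np.allclose(A @ A.T, U @ np.diag(s**2) @ U.T))

# A^T A = V diag(sigma_i^2, 0, ..., 0) V^T, an n x n identity;
# the extra n - m diagonal entries are zero
s2_padded = np.concatenate([s**2, np.zeros(n - m)])
print(np.allclose(A.T @ A, V @ np.diag(s2_padded) @ V.T))
```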

Recall the eigendecomposition of a symmetric matrix $M$:

$$M = X\Lambda X^{-1}$$

Here $X$ has as columns the eigenvectors of $M$, and
$\Lambda$ is a diagonal matrix whose diagonal entries $\lambda_1,\lambda_2,\ldots$ are the eigenvalues of $M$.

Does the above discussion imply an algorithm for SVD?
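
One possible reading of the discussion above: the eigenvectors of $A^TA$ give $V$, the square roots of its eigenvalues give the singular values, and $U$ follows from $AV = U\Sigma$. The sketch below illustrates that idea only. The function name `svd_via_eigh` is hypothetical, the code assumes $A$ has full rank, and forming $A^TA$ explicitly squares the condition number, so this is not how one would compute an SVD in practice.

```python
import numpy as np

def svd_via_eigh(A):
    """Thin SVD of A from the eigendecomposition of A^T A (illustrative sketch)."""
    # eigh is meant for symmetric matrices; eigenvalues come back in ascending order
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)
    order = np.argsort(eigvals)[::-1]            # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    r = min(A.shape)                             # assumes A has full rank r
    sigma = np.sqrt(np.clip(eigvals[:r], 0.0, None))
    V = eigvecs[:, :r]
    U = (A @ V) / sigma                          # column j is A v_j / sigma_j
    return U, sigma, V

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 5))
U, sigma, V = svd_via_eigh(A)

print(np.allclose(A, U @ np.diag(sigma) @ V.T))
print(np.allclose(sigma, np.linalg.svd(A, compute_uv=False)))
```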

### The power method

1. Generate $\mathbf{x}_0$ with entries $\mathbf{x}_0(j)\sim\mathcal{N}(0,1)$ and set $\mathbf{x}_0 = \mathbf{x}_0/||\mathbf{x}_0||$
2. For $i=1\ldots s$ (Repeat)

    $\mathbf{x}_i = A^TA\mathbf{x}_{i-1}$
    
    $\mathbf{x}_i = \mathbf{x}_i/||\mathbf{x}_i||$
    
    
    $\delta_i= ||A^TA - \mathbf{x}_i\sigma^2\mathbf{x}_i^T||$, where $\sigma = ||A\mathbf{x}_i||$ is the current estimate of $\sigma_1$
    
    (until $\delta_{i-1}-\delta_i\approx 0$)
    
3. $\mathbf{v}_1 = \mathbf{x}_i$
4. $\sigma_1 = ||A\mathbf{v}_1||$
5. $\mathbf{u}_1 = A\mathbf{v}_1/\sigma_1$

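To find the next singular vectors and singular value, deflate $A$ and repeat the steps above: $A \leftarrow A - \mathbf{u}_1\sigma_1\mathbf{v}_1^T$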
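
The steps above can be sketched in NumPy as follows. The function names, the default tolerance, and the iteration cap are illustrative assumptions. The code takes $\sigma^2 = ||A\mathbf{x}_i||^2$ in the residual $\delta_i$, which is one reading of the notes, and it assumes the requested number of components `k` does not exceed the rank of $A$.

```python
import numpy as np

def top_singular_triple(A, max_iters=10_000, tol=1e-14, rng=None):
    """Steps 1-5: power iteration on A^T A for the leading singular triple."""
    rng = np.random.default_rng() if rng is None else rng
    AtA = A.T @ A

    # Step 1: random Gaussian start, normalized to unit length
    x = rng.standard_normal(A.shape[1])
    x /= np.linalg.norm(x)

    prev_delta = np.inf
    for _ in range(max_iters):
        # Step 2: multiply by A^T A and renormalize
        x = AtA @ x
        x /= np.linalg.norm(x)

        # Residual of the rank-one approximation (Frobenius norm);
        # sigma^2 = ||A x||^2 is an assumed reading of the sigma in the notes
        sigma_sq = np.linalg.norm(A @ x) ** 2
        delta = np.linalg.norm(AtA - sigma_sq * np.outer(x, x))
        if abs(prev_delta - delta) < tol:        # delta has stopped changing
            break
        prev_delta = delta

    v = x                                        # Step 3
    sigma = np.linalg.norm(A @ v)                # Step 4
    u = A @ v / sigma                            # Step 5
    return u, sigma, v

def power_method_svd(A, k, **kwargs):
    """Repeat steps 1-5 k times, deflating A after each round (assumes k <= rank of A)."""
    A = np.array(A, dtype=float)                 # work on a copy
    us, sigmas, vs = [], [], []
    for _ in range(k):
        u, sigma, v = top_singular_triple(A, **kwargs)
        us.append(u)
        sigmas.append(sigma)
        vs.append(v)
        A = A - sigma * np.outer(u, v)           # deflation
    return np.column_stack(us), np.array(sigmas), np.column_stack(vs)

# Test matrix with known singular values 5, 3, 1 (orthonormal factors from QR)
rng = np.random.default_rng(0)
Q1, _ = np.linalg.qr(rng.standard_normal((3, 3)))
Q2, _ = np.linalg.qr(rng.standard_normal((5, 3)))
A = Q1 @ np.diag([5.0, 3.0, 1.0]) @ Q2.T

U, sigmas, V = power_method_svd(A, k=3, rng=rng)
print(sigmas)
print(np.allclose(A, U @ np.diag(sigmas) @ V.T, atol=1e-5))
```

The signs of the recovered $\mathbf{u}_j$ and $\mathbf{v}_j$ may differ from those returned by a library routine; only the products $\mathbf{u}_j\mathbf{v}_j^T$ are determined.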

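The number of iterations depends on the gap between the singular values. With the singular values sorted in decreasing order, it is governed by $\min_{i<j}\frac{\sigma_i}{\sigma_j}$: the closer this ratio is to 1, the more iterations the power method needs.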
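
A small experiment makes this dependence visible. The sketch below is an illustration under assumed settings: diagonal test matrices, a fixed starting vector so runs are repeatable, and a stopping rule based on how far the iterate moves, which is a simpler criterion than the $\delta_i$ test above. The function name `iterations_to_converge` is hypothetical.

```python
import numpy as np

def iterations_to_converge(sigmas, tol=1e-10, max_iters=100_000):
    """Count power iterations on A^T A, for A = diag(sigmas), until the iterate stops moving."""
    AtA = np.diag(np.asarray(sigmas, dtype=float) ** 2)
    x = np.ones(len(sigmas)) / np.sqrt(len(sigmas))   # fixed, normalized start
    for i in range(1, max_iters + 1):
        x_new = AtA @ x
        x_new /= np.linalg.norm(x_new)
        if np.linalg.norm(x_new - x) < tol:
            return i
        x = x_new
    return max_iters

# Ratios sigma_1 / sigma_2 moving toward 1
counts = {ratio: iterations_to_converge([ratio, 1.0]) for ratio in (2.0, 1.25, 1.05)}
for ratio, count in counts.items():
    print(f"sigma_1/sigma_2 = {ratio}: {count} iterations")
```

The iteration count grows as the ratio approaches 1.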




$A = U\Sigma V^T = U\Sigma V^{-1}\rightarrow AV = U\Sigma$

Thus,

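$A\mathbf{v}_1 = \mathbf{u}_1\sigma_1 \rightarrow \sigma_1 = ||A\mathbf{v}_1||$, because $||\mathbf{u}_1||=1$ (explanation of line 4)

$A = \sum_j \sigma_j \mathbf{u}_j\mathbf{v}_j^T$

$A\mathbf{v}_1 = \sum_j \sigma_j \mathbf{u}_j\mathbf{v}_j^T\mathbf{v}_1 = \sigma_1\mathbf{u}_1 \rightarrow \mathbf{u}_1 = A\mathbf{v}_1 /\sigma_1$, because $\mathbf{v}_j^T\mathbf{v}_1=0$ for $j\neq 1$ and $\mathbf{v}_1^T\mathbf{v}_1=1$ (explanation of line 5)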
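
The identities behind lines 4 and 5 can be checked against a library SVD. This is a minimal sketch on a random matrix (size and seed are illustrative); it writes $A$ as a sum of rank-one terms $\sigma_j\mathbf{u}_j\mathbf{v}_j^T$ and then confirms both formulas for the leading triple.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 5))

# Thin SVD: U is 3 x 3, Vt is 3 x 5, and row j of Vt is v_j
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# A as a sum of rank-one terms sigma_j * u_j v_j^T
A_sum = sum(s[j] * np.outer(U[:, j], Vt[j]) for j in range(len(s)))
print(np.allclose(A, A_sum))

u1, sigma1, v1 = U[:, 0], s[0], Vt[0]
print(np.isclose(sigma1, np.linalg.norm(A @ v1)))   # line 4
print(np.allclose(u1, A @ v1 / sigma1))             # line 5
```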


