## Singular Value Decomposition (SVD)

Notation

| Symbol        | Meaning                                                                             |
| ------------- | ----------------------------------------------------------------------------------- |
| $\delta_{ij}$ | Kronecker delta, that is $\delta_{ij}=1$ if $i=j$, and $\delta_{ij}=0$ if $i\neq j$ |


**Summary** (Case I: $m\ge n$) 

- For any $m$-by-$n$ matrix $A$ ($m\ge n$), we can choose 
  - $\left\{u_1, \ldots, u_m\right\}$, orthonormal vectors of length $m$ (left singular vectors),
  - $\left\{v_1, \ldots, v_n\right\}$, orthonormal vectors of length $n$ (right singular vectors), and
  - $s_1 \geq \cdots \geq s_{n} \geq 0$ (singular values), satisfying
$$
\begin{gathered}
A v_1=s_1 u_1 \\
A v_2=s_2 u_2 \\
\vdots \\
A v_n=s_n u_n .
\end{gathered}
$$


**Geometric Intuition** (Sauer (2017) p. 579)

The $v_i$ form the basis of a rectangular coordinate system on which $A$ acts in a simple way: it maps them to the basis vectors of a new coordinate system, the $u_i$, stretched by the scalars $s_i$. In the $2$-by-$2$ case, the stretched basis vectors $s_i u_i$ are the semiaxes of the ellipse that is the image of the unit circle under $A$.

![SVD geometry](https://blogs.sas.com/content/iml/files/2017/08/svd1.png)

Figure: Rick Wicklin, SAS blog (Geometry of 2-by-2 SVD)
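The ellipse picture can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A` is an illustrative choice (the same one used in the visualization example later in these notes), and the sample count is arbitrary. It maps points of the unit circle through `A` and compares the longest and shortest radii of the image with the singular values returned by `np.linalg.svd`.

```python
import numpy as np

# Illustrative 2-by-2 matrix (also used in the visualization example later on).
A = np.array([[1.0, 0.3],
              [0.45, 1.2]])

# Sample the unit circle and map every point through A.
theta = np.linspace(0.0, 2.0 * np.pi, 20001)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # shape (2, N)
image = A @ circle                                   # points on the ellipse

# The longest and shortest radii of the ellipse are its semiaxis lengths.
radii = np.linalg.norm(image, axis=0)
longest, shortest = radii.max(), radii.min()

# Singular values only (compute_uv=False skips U and V).
s = np.linalg.svd(A, compute_uv=False)
print(longest, shortest)
print(s)
```

Up to the sampling error of the circle, the two printed pairs should agree.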

**Example** (Building SVD easy case; Sauer (2017) p. 580)

(Step 1)

Find the singular values and singular vectors for $A=\begin{bmatrix}3 & 0 \\ 0 & 1/2 \end{bmatrix}$. 


###### hide/show


$$
\begin{aligned} & A\left[\begin{array}{l}1 \\ 0\end{array}\right]=3\left[\begin{array}{l}1 \\ 0\end{array}\right] \\ & A\left[\begin{array}{l}0 \\ 1\end{array}\right]=\frac{1}{2}\left[\begin{array}{l}0 \\ 1\end{array}\right]\end{aligned}
$$


(Step 2)

Find the singular values and singular vectors for $A=\begin{bmatrix}0 & -1/2 \\ 3 & 0 \\ 0 & 0 \end{bmatrix}$.


###### hide/show



$$
\begin{aligned}
& A\left[\begin{array}{l}
1 \\
0
\end{array}\right]=3\left[\begin{array}{l}
0 \\
1 \\
0
\end{array}\right] \\
& A\left[\begin{array}{l}
0 \\
1
\end{array}\right]=\frac{1}{2}\left[\begin{array}{r}
-1 \\
0 \\
0
\end{array}\right]
\end{aligned}
$$

Even for this simple matrix, guessing singular vectors is not that easy, especially when requiring the left singular vectors be orthogonal.

(Step 3) 

Can you do the same for $A=\begin{bmatrix}2 & -1/2 \\ 3 & 1 \\ -2 & 5 \end{bmatrix}$?

We need a more systematic approach.
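Before building that approach by hand, it may help to see that a numerical library already produces such vectors for the matrix in Step 3. This is a sketch assuming NumPy; `np.linalg.svd` is used as a black box here, and the signs of the vectors it returns are one valid choice among several.

```python
import numpy as np

# Matrix from Step 3.
A = np.array([[2.0, -0.5],
              [3.0,  1.0],
              [-2.0, 5.0]])

# full_matrices=True returns U as m-by-m and Vt (= V^T) as n-by-n;
# s holds the singular values in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T

for i in range(A.shape[1]):
    # Each right singular vector is mapped to s_i times the matching left singular vector.
    assert np.allclose(A @ V[:, i], s[i] * U[:, i])

print("singular values:", s)
```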


### Fundamentals of SVD

**Theorem** (Spectral theorem for real symmetric matrix; Rephrase of Horn and Johnson (2013) Matrix analysis 2ed. Theorem 4.1.5. p. 229)

If $A$ is a real symmetric $n$-by-$n$ matrix, then there exists an orthonormal basis of $R^n$ consisting of eigenvectors of $A$. Each eigenvalue of $A$ is real.
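As a small numerical illustration of the theorem (a sketch assuming NumPy, not part of the cited statement), `np.linalg.eigh` can be applied to a real symmetric matrix. The matrix `B` below is the $A^T A$ of the $4 \times 2$ example worked out later in these notes.

```python
import numpy as np

# A real symmetric matrix.
B = np.array([[20.0, 16.0],
              [16.0, 20.0]])

# eigh is NumPy's routine for symmetric/Hermitian input; it returns real
# eigenvalues in ascending order and eigenvectors as the columns of P.
eigenvalues, P = np.linalg.eigh(B)

# The columns of P are orthonormal: P^T P = I.
assert np.allclose(P.T @ P, np.eye(2))
# Each column is an eigenvector: B P = P diag(eigenvalues).
assert np.allclose(B @ P, P @ np.diag(eigenvalues))

print(eigenvalues)
```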


**Lemma** 

Let $A$ be an $m \times n$ matrix. The eigenvalues of $A^T A$ are nonnegative.



Proof

Let $v$ be a unit eigenvector of $A^T A$ with eigenvalue $\lambda$, that is, $A^T A v=\lambda v$. Then
$$
0 \leq\|A v\|^2=v^T A^T A v=\lambda v^T v=\lambda .
$$
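The identity in the proof can be observed numerically. The sketch below assumes NumPy; the random $5 \times 3$ matrix and the seed are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed so the run is repeatable
A = rng.standard_normal((5, 3))   # arbitrary illustrative 5-by-3 matrix

# Unit eigenvectors of the symmetric matrix A^T A are the columns of V.
eigenvalues, V = np.linalg.eigh(A.T @ A)

for lam, v in zip(eigenvalues, V.T):
    # ||A v||^2 equals lambda for a unit eigenvector v, as in the proof.
    assert np.isclose(np.linalg.norm(A @ v) ** 2, lam)

# Nonnegative up to floating-point rounding.
min_eigenvalue = eigenvalues.min()
print(min_eigenvalue >= -1e-12)
```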

**Theorem** (Sauer (2017) p. 581)

Let $A$ be an $m \times n$ matrix where $m \geq n$. Then there exist two orthonormal bases $\left\{v_1, \ldots, v_n\right\}$ of $R^n$, and $\left\{u_1, \ldots, u_m\right\}$ of $R^m$, and real numbers $s_1 \geq \cdots \geq s_n \geq 0$ such that $A v_i=s_i u_i$ for $1 \leq i \leq n$. The columns of $V=\left[v_1|\ldots| v_n\right]$, the right singular vectors, form a set of orthonormal eigenvectors of $A^T A$; and the columns of $U=\left[u_1|\ldots| u_m\right]$, the left singular vectors, form a set of orthonormal eigenvectors of $A A^T$. Equivalently, $A=USV^T$, where $S$ is the $m \times n$ matrix with diagonal entries $s_1, \ldots, s_n$ and zeros elsewhere.
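The following is a short sketch of why the vectors $u_i = A v_i / s_i$ behave as claimed, not a full proof: it only treats indices with $s_i \neq 0$ and sets aside the remaining $u_i$, which are chosen to complete the basis. It uses that the $v_i$ are orthonormal eigenvectors of $A^T A$ with eigenvalues $s_i^2$. For $s_i, s_j \neq 0$,

$$
u_i^T u_j=\frac{\left(A v_i\right)^T\left(A v_j\right)}{s_i s_j}=\frac{v_i^T\left(A^T A v_j\right)}{s_i s_j}=\frac{s_j^2}{s_i s_j}\, v_i^T v_j=\delta_{ij},
\qquad
A A^T u_i=\frac{A\left(A^T A v_i\right)}{s_i}=s_i^2 u_i .
$$

Writing $S$ for the $m \times n$ matrix with $s_1, \ldots, s_n$ on its diagonal and zeros elsewhere, the equations $A v_i = s_i u_i$ stacked column by column read $AV = US$. Multiplying on the right by $V^T$ and using $V V^T = I$ gives $A = U S V^T$.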

Constructive version (Human-friendly; Sauer (2017) p. 581)

1. $s_i$'s (singular values): Find the eigenvalues of $A^T A$ ($n$-by-$n$), which are nonnegative, and list them in decreasing order $s_1^2 \ge s_2^2 \ge \cdots \ge s_n^2 \ge 0$. The singular values $s_i$ are their nonnegative square roots.
1. $v_i$'s (right singular vectors): Take corresponding orthonormal eigenvectors $v_i$ of $A^T A$ ($i=1,2,\cdots, n$).
1. $u_i$'s (left singular vectors): If $s_i \neq 0$, define $u_i$ by the equation $s_i u_i=A v_i$. Choose each remaining $u_i$ (those with $s_i=0$ and those with $n<i\le m$) as an arbitrary unit vector orthogonal to $u_1, \ldots, u_{i-1}$.
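The three steps above can be sketched in code as follows. This is a teaching sketch under stated assumptions, not how numerical libraries compute the SVD: it assumes NumPy, the function name `constructive_svd` and the cutoff `tol` are hypothetical choices made here, and the remaining $u_i$ are obtained from a QR factorization (a swapped-in shortcut for "choose arbitrary orthonormal vectors").

```python
import numpy as np

def constructive_svd(A, tol=1e-6):
    """Build U, S, V with A = U S V^T via the eigenvectors of A^T A (m >= n)."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    assert m >= n, "this sketch covers Case I (m >= n) only"

    # Steps 1-2: eigenpairs of A^T A, reordered so the eigenvalues decrease.
    eigenvalues, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, V = eigenvalues[order], V[:, order]
    s = np.sqrt(np.clip(eigenvalues, 0.0, None))   # clip tiny negative rounding errors

    # Treat singular values below a relative cutoff as zero (assumption of this sketch).
    r = int(np.sum(s > tol * s[0]))
    s[r:] = 0.0

    # Step 3a: u_i = A v_i / s_i for the nonzero singular values.
    U_r = (A @ V[:, :r]) / s[:r]

    # Step 3b: complete to an orthonormal basis of R^m. The last m - r columns
    # of Q are orthonormal and orthogonal to the span of U_r.
    Q, _ = np.linalg.qr(np.hstack([U_r, np.eye(m)]))
    U = np.hstack([U_r, Q[:, r:]])

    S = np.zeros((m, n))
    S[:n, :n] = np.diag(s)
    return U, S, V

# Usage on a 4-by-2 matrix.
A = np.array([[3, 3], [-3, -3], [-1, 1], [1, -1]], dtype=float)
U, S, V = constructive_svd(A)
print(np.diag(S[:2, :2]))
```

Forming $A^T A$ squares the condition number of the problem, so small singular values come out less accurately than with a dedicated SVD routine; that is why the cutoff is relatively loose.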

**Remark** 

- The SVD is not unique. 
  - Replacing $v_1$ by $-v_1$ and $u_1$ by $-u_1$ does not change the equality, but changes the matrices $U$ and $V$.

**Example** (Sauer (2017) p. 581)

Find the singular value decomposition of the $4 \times 2$ matrix
$$
A=\left[\begin{array}{rr}
3 & 3 \\
-3 & -3 \\
-1 & 1 \\
1 & -1
\end{array}\right] .
$$

Preliminary

$$
A^T A=\left[\begin{array}{ll}
20 & 16 \\
16 & 20
\end{array}\right]
$$

Eigenvectors and eigenvalues 

$$
v_1=\begin{bmatrix}1 / \sqrt{2} \\ 1 / \sqrt{2}\end{bmatrix}, 
\quad 
v_2=\begin{bmatrix}1 / \sqrt{2} \\ -1 / \sqrt{2}\end{bmatrix},
\quad
\begin{array}{l}
s_1^2=36 \\ 
s_2^2=4
\end{array}
$$



Singular values

$$
\begin{array}{l}
s_1=6 \\ 
s_2=2
\end{array}
$$

Right singular vectors

$v_1, v_2$ (same as eigenvectors of $A^T A$)

Left singular vectors

From 

$$
6 u_1=A v_1=\left[\begin{array}{r}
3 \sqrt{2} \\
-3 \sqrt{2} \\
0 \\
0
\end{array}\right] \quad 2 u_2=A v_2=\left[\begin{array}{r}
0 \\
0 \\
-\sqrt{2} \\
\sqrt{2}
\end{array}\right]
$$

we have

$$
u_1=\left[\begin{array}{r}
\frac{1}{\sqrt{2}} \\
-\frac{1}{\sqrt{2}} \\
0 \\
0
\end{array}\right] \quad u_2=\left[\begin{array}{r}
0 \\
0 \\
-\frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}}
\end{array}\right] .
$$

For $i = 3, 4$, choose
$$
u_3=\left[\begin{array}{c}
\frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} \\
0 \\
0
\end{array}\right] \quad u_4=\left[\begin{array}{c}
0 \\
0 \\
\frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}}
\end{array}\right]
$$

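If such vectors are not easy to guess, we can apply Gram-Schmidt to a linearly independent set that starts with $u_1, u_2$, for example $\{u_1, u_2, e_1, e_3\}$, where $e_i = [\delta_{ij}]_{1\le j \le 4}^T$ and $\delta_{ij}$ is the Kronecker delta. The appended standard basis vectors must keep the set linearly independent; for instance, $\{u_1, u_2, e_3, e_4\}$ fails because $u_2$ lies in the span of $e_3$ and $e_4$.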
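A minimal Gram-Schmidt sketch for this completion step, assuming NumPy; the helper name `complete_to_orthonormal_basis` and the tolerance are hypothetical. It tries every standard basis vector $e_1, \ldots, e_4$ in turn and skips any candidate that already lies in the span of the vectors found so far, so no independent starting set has to be picked in advance.

```python
import numpy as np

def complete_to_orthonormal_basis(vectors, tol=1e-10):
    """Extend orthonormal `vectors` in R^m to an orthonormal basis of R^m."""
    basis = [np.asarray(v, dtype=float) for v in vectors]
    m = basis[0].size
    for e in np.eye(m):                      # candidates e_1, ..., e_m
        # Subtract the projections onto the vectors found so far.
        r = e - sum((e @ b) * b for b in basis)
        if np.linalg.norm(r) > tol:          # skip candidates already in the span
            basis.append(r / np.linalg.norm(r))
    return basis

u1 = np.array([1.0, -1.0, 0.0, 0.0]) / np.sqrt(2)
u2 = np.array([0.0, 0.0, -1.0, 1.0]) / np.sqrt(2)

basis = complete_to_orthonormal_basis([u1, u2])
u3, u4 = basis[2], basis[3]
print(u3)
print(u4)
```

For this input the candidates $e_1$ and $e_3$ contribute new directions, while $e_2$ and $e_4$ are skipped, and the result matches the $u_3, u_4$ chosen above.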



SVD

$$
A=\left[\begin{array}{rr}
3 & 3 \\
-3 & -3 \\
-1 & 1 \\
1 & -1
\end{array}\right]=U S V^T=\left[\begin{array}{rrrr}
\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} & 0 \\
-\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} & 0 \\
0 & -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\
0 & \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}}
\end{array}\right]\left[\begin{array}{ll}
6 & 0 \\
0 & 2 \\
0 & 0 \\
0 & 0
\end{array}\right]\left[\begin{array}{cc}
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}
\end{array}\right] .
$$
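The factorization can be verified by multiplying it out. The sketch below assumes NumPy and types in the three factors exactly as displayed. It also illustrates the earlier remark on non-uniqueness: flipping the signs of $u_1$ and $v_1$ together gives different $U$ and $V$ but the same product.

```python
import numpy as np

r = 1 / np.sqrt(2)
A = np.array([[3, 3], [-3, -3], [-1, 1], [1, -1]], dtype=float)

# Factors copied from the worked example.
U = np.array([[ r,  0, r, 0],
              [-r,  0, r, 0],
              [ 0, -r, 0, r],
              [ 0,  r, 0, r]])
S = np.array([[6, 0], [0, 2], [0, 0], [0, 0]], dtype=float)
V = np.array([[r,  r],
              [r, -r]])

# U and V have orthonormal columns, and the product reproduces A.
assert np.allclose(U.T @ U, np.eye(4))
assert np.allclose(V.T @ V, np.eye(2))
assert np.allclose(U @ S @ V.T, A)

# Non-uniqueness: flip the signs of u_1 and v_1 together.
U2, V2 = U.copy(), V.copy()
U2[:, 0] *= -1
V2[:, 0] *= -1
assert np.allclose(U2 @ S @ V2.T, A)   # still a valid SVD
assert not np.allclose(U2, U)          # but with different factors
```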



**Summary** (General case)

- For any $m$-by-$n$ matrix $A$, whether $m\ge n$ or $m \le n$, we can choose 
  - $\left\{u_1, \ldots, u_m\right\}$ orthonormal vectors of length $m$,
  - $\left\{v_1, \ldots, v_n\right\}$ orthonormal vectors of length $n$, 
  - $s_1 \geq \cdots \geq s_{\min(m,n)} \geq 0$, satisfying
$$
\begin{gathered}
A v_1=s_1 u_1 \\
A v_2=s_2 u_2 \\
\vdots \\
A v_{\min(m,n)}=s_{\min(m,n)} u_{\min(m,n)} .
\end{gathered}
$$
When $n > m$, the remaining right singular vectors satisfy $A v_i = 0$ for $m < i \le n$.

The vectors are visualized in Figure 12.3. The $v_i$ are called the right singular vectors of the matrix $A$, the $u_i$ are the left singular vectors of $A$, and the $s_i$ are the singular values of $A$. (The reason for this terminology will become clear shortly.)
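For a wide matrix ($m < n$) there are only $m$ singular values, and the right singular vectors beyond the first $m$ are mapped to zero. The sketch below, assuming NumPy, checks this on the transpose of the $4 \times 2$ example above, so that $m = 2$ and $n = 4$.

```python
import numpy as np

# Wide case (m = 2 < n = 4): the transpose of the 4-by-2 example.
A = np.array([[3, -3, -1,  1],
              [3, -3,  1, -1]], dtype=float)
m, n = A.shape

# U is 2-by-2, s has length min(m, n) = 2, Vt (= V^T) is 4-by-4.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T

for i in range(min(m, n)):
    # Paired singular vectors: A v_i = s_i u_i.
    assert np.allclose(A @ V[:, i], s[i] * U[:, i])

for i in range(min(m, n), n):
    # The extra right singular vectors are sent to zero.
    assert np.allclose(A @ V[:, i], 0.0)

print(s)
```

The singular values agree with those of the original $4 \times 2$ matrix, since $A$ and $A^T$ share their nonzero singular values.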

**Example** (Visualization of SVD)

- $x=\left[\begin{array}{cccc}-10 & -10 & 20 & 20 \\ -10 & 20 & 20 & -10\end{array}\right]$ (each column is one corner of a square)
- $A=\left[\begin{array}{cc}1 & 0.3 \\ 0.45 & 1.2\end{array}\right]$.
- $A=USV^T = \begin{bmatrix} -0.5819 & -0.8133 \\ -0.8133 & 0.5819 \end{bmatrix} \begin{bmatrix} 1.4907 & 0 \\ 0 & 0.7144 \end{bmatrix} \begin{bmatrix} -0.6359 & -0.7718 \\ -0.7718 & 0.6359 \end{bmatrix}$.

Example and figures: Alyssa Quek ([SVD visualization](https://alyssaq.github.io/2015/singular-value-decomposition-visualisation/))

| | |
|---|---|
| $$Ax$$ <br> ![Figure 1](https://alyssaq.github.io/blog/images/eigens-transformation_matrix.png) | | 
| $$V^Tx$$ <br> ![Figure 2](https://alyssaq.github.io/blog/images/svd_Vx.png) | $$SV^Tx$$ <br> ![Figure 3](https://alyssaq.github.io/blog/images/svd_SVx.png) | 
| $$USV^Tx$$ <br> ![Figure 4](https://alyssaq.github.io/blog/images/svd_USVx.png) | |
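The three stages shown in the figures can be reproduced step by step. This is a sketch assuming NumPy; the variable names `step1`, `step2`, `step3` are chosen here for illustration. The factors returned by `np.linalg.svd` may differ in sign from the ones printed above, which changes the intermediate pictures by a reflection but not the final result.

```python
import numpy as np

# Points (one per column) and the transformation from the example.
x = np.array([[-10, -10, 20,  20],
              [-10,  20, 20, -10]], dtype=float)
A = np.array([[1.0, 0.3],
              [0.45, 1.2]])

U, s, Vt = np.linalg.svd(A)
S = np.diag(s)

step1 = Vt @ x       # V^T x: rotation/reflection, lengths unchanged
step2 = S @ step1    # S V^T x: stretch along the coordinate axes
step3 = U @ step2    # U S V^T x: second rotation/reflection

# The composition of the three stages is the same as applying A directly.
assert np.allclose(step3, A @ x)

print(np.round(s, 4))
print(np.round(step3, 2))
```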


### Appendix

#### Raw citations

**Theorem** (Horn and Johnson (2013) Matrix analysis 2ed. Theorem 4.1.5. p. 229) 

A matrix $A \in M_n$ is Hermitian if and only if there is a unitary $U \in M_n$ and a real diagonal $\Lambda \in M_n$ such that $A=U \Lambda U^*$, where $M_n$ is the set of $n$-by-$n$ complex matrices. Moreover, $A$ is real and Hermitian (that is, real symmetric) if and only if there is a real orthogonal $P \in M_n$ and a real diagonal $\Lambda \in M_n$ such that $A=P \Lambda P^T$.

**Remark**

- Observe the subtlety of the statement: If $A$ is symmetric as a complex matrix, then the conclusion is different. (See e.g., [Wikipedia - Complex symmetric matrices](https://en.wikipedia.org/wiki/Symmetric_matrix#Complex_symmetric_matrices))