In [1]:
import numpy as np
from scipy import linalg

# 1. Quick Recap of Singular Value Decomposition

SVD in plain English:

For any given $A \in \mathbb{R}^{M \times N}$,

* Find an orthonormal basis $\{\mathbf{v}_1, \: \mathbf{v}_2, \: \cdots, \: \mathbf{v}_N \}$ for $\mathbb{R}^{N}$ such that:
* when transformed by $A$, the vectors remain mutually orthogonal, this time in $\mathbb{R}^{M}$. The transformed vectors are $A\mathbf{v}_i = \sigma_i \mathbf{u}_i$, i.e. $\{\sigma_1 \mathbf{u}_1, \: \sigma_2 \mathbf{u}_2, \: \cdots, \: \sigma_M \mathbf{u}_M \}$, which becomes the orthonormal basis $\{\mathbf{u}_1, \: \mathbf{u}_2, \: \cdots, \: \mathbf{u}_M \}$ for $\mathbb{R}^{M}$ when normalized.

(If $M > N$, the transformed vectors are $\{\sigma_1 \mathbf{u}_1, \: \sigma_2 \mathbf{u}_2, \: \cdots, \: \sigma_N \mathbf{u}_N \}$, which becomes $\{\mathbf{u}_1, \: \mathbf{u}_2, \: \cdots, \: \mathbf{u}_N \}$ when normalized; the remaining $\mathbf{u}_{N+1}, \: \cdots, \: \mathbf{u}_M$ are chosen to complete an orthonormal basis for $\mathbb{R}^{M}$. Likewise, any $\mathbf{u}_i$ with $\sigma_i = 0$ is chosen so that the set stays orthonormal.)

Rewriting the above as a formula, we get

$$
AV = U \Sigma
$$
$$
\text{where } \quad V =
\begin{bmatrix}
| & | &        & | \\
\mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_N \\
| & | &        & |
\end{bmatrix}_{N \times N}, \quad
U =
\begin{bmatrix}
| & | &        & | \\
\mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_M \\
| & | &        & |
\end{bmatrix}_{M\times M}, \quad
\Sigma =
\begin{bmatrix}
\sigma_{1} & 0 & \cdots & 0 \\
0 & \sigma_{2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_{N} \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix}_{M\times N}\; (M > N) \quad \text{or } \; \\
\begin{bmatrix}
\sigma_{1} & 0 & \cdots & 0 & 0 & 0 \\
0 & \sigma_{2} & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & \sigma_{M} & 0 & 0
\end{bmatrix}_{M\times N} \; (M < N)
$$

From fundamental linear algebra theorems and some arithmetic, it is known that:

* $\{\mathbf{v}_1, \: \mathbf{v}_2, \: \cdots, \: \mathbf{v}_N \}$ is a set of orthonormal eigenvectors of $A^\top A$.
* For all $i$, $\sigma_i$ is the nonnegative square root of $\lambda_i$, the $i$-th largest eigenvalue of $A^\top A$.
* $\lambda_1 \geq \lambda_2 \geq  \; \cdots \; \geq \lambda_r > \lambda_{r+1} = \; \cdots \; = 0 \iff \sigma_1 \geq \sigma_2 \geq  \; \cdots \; \geq \sigma_r > \sigma_{r+1} = \; \cdots \; = 0 \iff \mathrm{rank}(A) = r$
* The number of singular values $\sigma_1 \geq \sigma_2 \geq \; \cdots$ (zeros included) is $\mathrm{min}(M, N)$.
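The properties listed above can be checked numerically. The following is a minimal sketch, assuming a small example matrix `A_demo` chosen only for illustration; it compares the squared singular values with the eigenvalues of $A^\top A$ computed by `linalg.eigh`, and checks that the columns of $V$ are eigenvectors of $A^\top A$.

```python
import numpy as np
from scipy import linalg

# Hypothetical example matrix, chosen only for illustration
A_demo = np.array([[4., 11., 14.],
                   [8., 7., -2.]])

U_demo, s_demo, VT_demo = linalg.svd(A_demo)

# Eigen-decomposition of the symmetric matrix A^T A.
# eigh returns eigenvalues in ascending order, so reverse them.
lam, _ = linalg.eigh(A_demo.T @ A_demo)
lam = lam[::-1]

# sigma_i^2 should match the min(M, N) largest eigenvalues of A^T A
sigma_sq_matches = np.allclose(s_demo**2, lam[:s_demo.shape[0]])

# Columns of V (rows of VT) should satisfy (A^T A) v_i = lambda_i v_i
V_demo = VT_demo.T
eigvec_ok = np.allclose(A_demo.T @ A_demo @ V_demo, V_demo * lam)

print(sigma_sq_matches, eigvec_ok)
```

Comparing through `np.allclose` rather than `==` matters here, because the eigenvalue that is zero in exact arithmetic typically comes out as a tiny nonzero number.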

Since $V$ is an orthogonal matrix, it is invertible with $V^{-1} = V^\top$, so:

$$
A = U \Sigma V^\top
$$

$$
U = \begin{bmatrix} U_r & U_{M-r} \end{bmatrix}, 
\quad
\Sigma = \left[
\begin{array}{c|c}
D & 0 \\ \hline
0 & 0
\end{array}
\right],
\quad
V^\top = \left[
\begin{array}{c}
V_r^\top \\ \hline
V_{N-r}^\top
\end{array}
\right]
$$

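This is the SVD (Singular Value Decomposition). In the block form above, $D = \mathrm{diag}(\sigma_1, \cdots, \sigma_r)$ with $r = \mathrm{rank}(A)$, and $U_r$, $V_r$ hold the first $r$ columns of $U$ and $V$.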

References:

1. Lay, D. C., McDonald, J., & Lay, S. R. (2016). Linear algebra and its applications (5th ed., Chapter 7.4). Pearson.
2. Angelo's Math Notes. (2019, August 1). Singular Value Decomposition (SVD). 공돌이의 수학정리노트. Retrieved from https://angeloyeo.github.io/2019/08/01/SVD_en.html

# 2. SVD in Python Scipy (```linalg.svd```, ```linalg.diagsvd```)

$$
A = U \Sigma V^\top \quad A = U \Sigma V^H \; \text{(complex case)}
$$

```python
""" SVD """
U, s, VT = linalg.svd(A, compute_uv=True) # default

s = linalg.svd(A, compute_uv=False) # singular values only
```

* ```U```: 2D array, m x m, corresponding to $U$
* ```VT```: 2D array, n x n, corresponding to $V^\top$
* ```s```: 1D array in descending order, min(m, n), corresponding to diagonal entries of $\Sigma$
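The shapes listed above can be confirmed directly. The sketch below uses a hypothetical $4 \times 3$ matrix `A_demo` and also shows the `full_matrices=False` option of `linalg.svd`, which returns the thinner factors (`U` of shape m x min(m, n), `VT` of shape min(m, n) x n) that are often enough for reconstruction.

```python
import numpy as np
from scipy import linalg

# Hypothetical 4 x 3 example matrix
A_demo = np.arange(12, dtype=np.float64).reshape(4, 3)

# Full SVD (default): U is m x m, VT is n x n
U_full, s_full, VT_full = linalg.svd(A_demo)

# Thin SVD: U is m x min(m, n), VT is min(m, n) x n
U_thin, s_thin, VT_thin = linalg.svd(A_demo, full_matrices=False)

print(U_full.shape, s_full.shape, VT_full.shape)
print(U_thin.shape, s_thin.shape, VT_thin.shape)

# The thin factors multiply back to A without building an m x n Sigma
A_thin = (U_thin * s_thin) @ VT_thin
print(np.allclose(A_demo, A_thin))
```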

Reconstruct $A$ (```linalg.diagsvd``` builds the $M \times N$ matrix $\Sigma$ from ```s```):

```python
""" Sigma """
U @ linalg.diagsvd(s, A.shape[0], A.shape[1]) @ VT
```

* Corresponding LAPACK routine: ```gesdd``` (the default value of the ```lapack_driver``` argument)
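`linalg.svd` also accepts `lapack_driver='gesvd'`. As a small sketch, assuming a seeded random matrix `A_demo` for illustration, the two drivers should return the same singular values up to floating-point error:

```python
import numpy as np
from scipy import linalg

# Hypothetical test matrix with a fixed seed for reproducibility
rng = np.random.default_rng(0)
A_demo = rng.random((6, 4))

# Divide-and-conquer driver (default) versus the general rectangular driver
s_gesdd = linalg.svd(A_demo, compute_uv=False, lapack_driver='gesdd')
s_gesvd = linalg.svd(A_demo, compute_uv=False, lapack_driver='gesvd')

print(np.allclose(s_gesdd, s_gesvd))
```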

## 2.1 Reduced SVD

$$
A = U \Sigma V^\top = \begin{bmatrix} U_r & U_{M-r} \end{bmatrix}
\left[
\begin{array}{c|c}
D & 0 \\ \hline
0 & 0
\end{array}
\right]\left[
\begin{array}{c}
V_r^\top \\ \hline
V_{N-r}^\top
\end{array}
\right] = U_r D V_r^\top
$$

Following the reduced SVD, $A$ can be reconstructed as follows:

```python
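# with an explicit diagonal matrix
U[:, :r] @ np.diag(s[:r]) @ VT[:r, :]

# or, by broadcasting: scale each column of U_r by its singular value
(U[:, :r] * s[:r]) @ VT[:r, :]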
```

Ways to get ```r```:

```python
# count the singular values that are numerically zero
r = s.shape[0] - sum(np.allclose(lx, 0) for lx in s) # np.allclose treats |x| <= 1e-8 (default atol) as zero

# alternative with an explicit threshold
r = s.shape[0] - sum(lx < 1.e-10 for lx in s)
```
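A fixed absolute threshold can misjudge the rank when the entries of $A$ are very large or very small. The sketch below is one possible alternative, not the method used in this notebook: it scales the tolerance by the largest singular value and the matrix size, which is the default rule of `np.linalg.matrix_rank`, and counts the singular values with a vectorized comparison. The names `A_demo`, `tol` and `r_demo` are illustrative.

```python
import numpy as np
from scipy import linalg

# Rank-1 example: every row is a multiple of [1, -1]
A_demo = np.array([[1., -1.],
                   [-2., 2.],
                   [2., -2.]])

s_demo = linalg.svd(A_demo, compute_uv=False)

# Relative tolerance: largest singular value * max(M, N) * machine epsilon
tol = s_demo.max() * max(A_demo.shape) * np.finfo(A_demo.dtype).eps

# Vectorized count of singular values above the tolerance
r_demo = int(np.sum(s_demo > tol))

print(r_demo, np.linalg.matrix_rank(A_demo))
```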

## 2.2 Truncated SVD

Approximate $A$ as closely as possible using only the $t$ largest singular values ($t \leq r$).

$$
A = U_r D V_r^\top
$$

$$
A^\star = U_t D_t V_t^\top
$$

$$
D =
\begin{bmatrix}
\sigma_1 & & & & \\
& \ddots & & & \\
& & \sigma_t & & \\
& & & \ddots & \\
& & & & \sigma_r
\end{bmatrix}, \quad
D_t =
\begin{bmatrix}
\sigma_1 & & \\
& \ddots & \\
& & \sigma_t
\end{bmatrix}
$$

Following the truncated SVD, $A^\star$ can be reconstructed as follows:

```python
# keep the first t columns of U (column slice U[:, :t], not the row slice U[:t])
U[:, :t] @ np.diag(s[:t]) @ VT[:t, :]

# or
(U[:, :t] * s[:t]) @ VT[:t, :]
```
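How much is lost by truncating can be read off the discarded singular values: by the Eckart-Young theorem, the spectral-norm error of the rank-$t$ approximation equals $\sigma_{t+1}$, and the Frobenius-norm error equals $\sqrt{\sigma_{t+1}^2 + \cdots + \sigma_r^2}$. A minimal sketch on a seeded random matrix (`A_demo` and `t_demo` are illustrative choices):

```python
import numpy as np
from scipy import linalg

# Hypothetical test matrix with a fixed seed for reproducibility
rng = np.random.default_rng(0)
A_demo = rng.random((8, 6))
U_demo, s_demo, VT_demo = linalg.svd(A_demo)

# Rank-t approximation built from the t largest singular values
t_demo = 3
A_star = (U_demo[:, :t_demo] * s_demo[:t_demo]) @ VT_demo[:t_demo, :]

# Approximation errors in the spectral and Frobenius norms
err_2 = linalg.norm(A_demo - A_star, 2)
err_fro = linalg.norm(A_demo - A_star, 'fro')

# Spectral error equals the first discarded singular value (index t, 0-based)
print(np.isclose(err_2, s_demo[t_demo]))
# Frobenius error equals the root-sum-of-squares of all discarded values
print(np.isclose(err_fro, np.sqrt(np.sum(s_demo[t_demo:]**2))))
```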

Ways to choose a good ```t``` (keep the large singular values, discard the small ones):

```python
# s[0]: largest singular value
# threshold: s[0]/1000
# t: number of singular values above the threshold
t = sum(lx > 1.e-3 * s[0] for lx in s)
```
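Another common heuristic, shown here only as a sketch and not as the rule used in this notebook, is to keep the smallest ```t``` whose singular values capture a chosen fraction of the total "energy" $\sum_i \sigma_i^2$. The singular values in `s_demo` and the 99% target are hypothetical.

```python
import numpy as np

# Hypothetical singular values in descending order
s_demo = np.array([10.0, 5.0, 1.0, 0.1, 0.001])

# Fraction of the total squared singular values captured by the first k values
energy = np.cumsum(s_demo**2) / np.sum(s_demo**2)

# Smallest t whose cumulative energy reaches the target fraction
target = 0.99
t_energy = int(np.searchsorted(energy, target) + 1)

# Threshold rule from the text, for comparison
t_threshold = int(np.sum(s_demo > 1.e-3 * s_demo[0]))

print(t_energy, t_threshold)
```

The two rules answer different questions: the threshold rule looks at each singular value relative to the largest one, while the energy rule looks at the cumulative contribution, so they can give different values of ```t``` for the same spectrum.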

### Exercises

1. 

In [2]:
A = np.array([
    [1, -1],
    [-2, 2],
    [2, -2],
], dtype=np.float64)

# SVD
U, s, VT = linalg.svd(A, compute_uv=True)

# A = U Sigma VT
A_recon = U @ linalg.diagsvd(s, A.shape[0], A.shape[1]) @ VT

# A = U_r Sigma_r VT_r
r = s.shape[0] - sum(np.allclose(lx, 0) for lx in s) 
A_recon_2 = U[:, :r] * s[:r] @ VT[:r, :]

print(U)
print(s)
print(VT)
print(A_recon)
print(A_recon_2)

[[-0.33333333  0.66666667 -0.66666667]
 [ 0.66666667  0.66666667  0.33333333]
 [-0.66666667  0.33333333  0.66666667]]
[4.24264069 0.        ]
[[-0.70710678  0.70710678]
 [ 0.70710678  0.70710678]]
[[ 1. -1.]
 [-2.  2.]
 [ 2. -2.]]
[[ 1. -1.]
 [-2.  2.]
 [ 2. -2.]]


2. 

In [3]:
A = np.array([
    [4, 11, 14],
    [8, 7, -2],
], dtype=np.float64)

# SVD
U, s, VT = linalg.svd(A, compute_uv=True)

# A = U Sigma VT
A_recon = U @ linalg.diagsvd(s, A.shape[0], A.shape[1]) @ VT

# A = U_r Sigma_r VT_r
r = s.shape[0] - sum(np.allclose(lx, 0) for lx in s) 
A_recon_2 = U[:, :r] * s[:r] @ VT[:r, :]

print(U)
print(s)
print(VT)
print(A_recon)
print(A_recon_2)

[[-0.9486833  -0.31622777]
 [-0.31622777  0.9486833 ]]
[18.97366596  9.48683298]
[[-0.33333333 -0.66666667 -0.66666667]
 [ 0.66666667  0.33333333 -0.66666667]
 [-0.66666667  0.66666667 -0.33333333]]
[[ 4. 11. 14.]
 [ 8.  7. -2.]]
[[ 4. 11. 14.]
 [ 8.  7. -2.]]


3.

In [4]:
A = np.random.rand(10, 10)

# SVD
U, s, VT = linalg.svd(A, compute_uv=True)

# A = U Sigma VT
A_recon = U @ linalg.diagsvd(s, A.shape[0], A.shape[1]) @ VT

# A = U_r Sigma_r VT_r
r = s.shape[0] - sum(np.allclose(lx, 0) for lx in s) 
A_recon_2 = U[:, :r] * s[:r] @ VT[:r, :]


# A = U_t Sigma_t VT_t
t = sum(lx > 1.e-1 * s[0] for lx in s)
A_appx = U[:, :t] * s[:t] @ VT[:t, :]

#print(U)
#print(s)
#print(VT)
#print(A_recon)
#print(A_recon_2)
print(np.allclose(A, A_recon))
print(np.allclose(A, A_recon_2))
print(f'We took t as t = {t} for approximation, while true size of A is {A.shape}')
# note: this is the mean of the signed difference, so positive and negative errors can cancel
print(f'true A minus appxed A, mean deviation per entry: {np.mean(A-A_appx)}')

True
True
We took t as t = 6 for approximation, while true size of A is (10, 10)
true A minus appxed A, mean deviation per entry: 0.0004038711104458421
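Because the mean of the signed difference lets positive and negative entries cancel, it can understate the approximation error. The sketch below shows two alternatives on a seeded random matrix: the mean absolute error per entry and the relative Frobenius-norm error. All names ending in `_demo` are illustrative, and the numbers will differ from the unseeded run above.

```python
import numpy as np
from scipy import linalg

# Seeded 10 x 10 random matrix so that the run is reproducible
rng = np.random.default_rng(0)
A_demo = rng.random((10, 10))
U_demo, s_demo, VT_demo = linalg.svd(A_demo)

# Same truncation rule as in the exercise: keep values above s[0] / 10
t_demo = int(np.sum(s_demo > 1.e-1 * s_demo[0]))
A_appx_demo = (U_demo[:, :t_demo] * s_demo[:t_demo]) @ VT_demo[:t_demo, :]

# Signed mean (errors can cancel) versus mean absolute error
mean_signed = np.mean(A_demo - A_appx_demo)
mean_abs_err = np.mean(np.abs(A_demo - A_appx_demo))

# Relative error in the Frobenius norm
rel_fro_err = (linalg.norm(A_demo - A_appx_demo, 'fro')
               / linalg.norm(A_demo, 'fro'))

print(f't = {t_demo}, signed mean = {mean_signed:.3e}, '
      f'mean abs = {mean_abs_err:.3e}, relative Frobenius = {rel_fro_err:.3e}')
```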


# 3. SVD and Fundamental Subspaces (```linalg.orth```, ```linalg.null_space```)

![Fundamental Subspaces](./SVD_subspaces.jpg)

(Source: Lay, D. C., McDonald, J., & Lay, S. R. (2016). Linear algebra and its applications (5th ed., p. 423, Fig. 4). Pearson.)

With the fundamentals of SVD from Section 1 in mind, this figure should be easy to follow.

You can use the results of SVD to compute the column and null spaces of $A$.

Thankfully, however, ```scipy``` provides separate functions to compute the column and null spaces of $A$:

```python
# orthonormal basis for the column space of A, returned as the columns of a 2D array
ColA = linalg.orth(A, rcond=None) # default

# orthonormal basis for the null space of A, returned as the columns of a 2D array
NulA = linalg.null_space(A, rcond=None) # default
```
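For comparison, the four fundamental subspaces can be read off the SVD factors directly, using the block partition $U = [U_r \; U_{M-r}]$, $V = [V_r \; V_{N-r}]$ from Section 1. This is a minimal sketch; the rank tolerance `tol` is an assumed choice (the default rule of `np.linalg.matrix_rank`), and the variable names are illustrative.

```python
import numpy as np
from scipy import linalg

# Rank-1 example matrix
A_demo = np.array([[1., -1.],
                   [-2., 2.],
                   [2., -2.]])
U_demo, s_demo, VT_demo = linalg.svd(A_demo)

# Numerical rank with a relative tolerance (assumed choice)
tol = s_demo.max() * max(A_demo.shape) * np.finfo(A_demo.dtype).eps
r_demo = int(np.sum(s_demo > tol))

col_basis = U_demo[:, :r_demo]          # Col(A): first r columns of U
left_null_basis = U_demo[:, r_demo:]    # Nul(A^T): remaining columns of U
row_basis = VT_demo[:r_demo, :].T       # Row(A): first r rows of VT
null_basis = VT_demo[r_demo:, :].T      # Nul(A): remaining rows of VT

# Null-space vectors are mapped to zero by A, left-null vectors by A^T
print(np.allclose(A_demo @ null_basis, 0))
print(np.allclose(A_demo.T @ left_null_basis, 0))
```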

Example.

$$
A =
\begin{bmatrix}
1 & -1 \\
-2 & 2 \\
2 & -2
\end{bmatrix}, \quad
U = 
\begin{bmatrix}
-0.3333 & 0.6667 & -0.6667 \\
 0.6667 & 0.6667 & 0.3333 \\
 -0.6667 & 0.3333 & 0.6667
\end{bmatrix}, \quad
\Sigma = 
\begin{bmatrix}
4.2426 & 0 \\
 0 & 0  \\
 0 & 0
\end{bmatrix}, \quad
V^\top = 
\begin{bmatrix}
-0.7071 & 0.7071 \\
 0.7071 & 0.7071  \\
\end{bmatrix}
$$

Before writing any code, let's think. Among the $\mathbf{v}_i$'s, $\mathbf{v}_2$ should be in $\mathrm{Nul}(A)$ (because $\sigma_2 = 0$), whereas $\mathbf{v}_1 \in \mathrm{Row}(A)$.

On the other hand, among the $\mathbf{u}_i$'s, only $\mathbf{u}_1$ should be in $\mathrm{Col}(A)$.

In [5]:
A = np.array([
    [1, -1],
    [-2, 2],
    [2, -2],
], dtype=np.float64)

# SVD
U, s, VT = linalg.svd(A, compute_uv=True)

# ColA
ColA = linalg.orth(A, rcond=None)

# NulA
NulA = linalg.null_space(A, rcond=None)

In [6]:
ColA # returned as a 2D array (one column per basis vector), not a 1D vector

array([[-0.33333333],
       [ 0.66666667],
       [-0.66666667]])

In [7]:
NulA # returned as a 2D array (one column per basis vector), not a 1D vector

array([[0.70710678],
       [0.70710678]])

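The result matches our analysis: ```ColA``` is $\mathbf{u}_1$ (the first column of ```U```) and ```NulA``` is $\mathbf{v}_2$ (the second row of ```VT```).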
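Basis vectors are only determined up to sign, so comparing entries directly can fail on another platform or matrix. A more robust check, sketched here with illustrative names, compares the orthogonal projectors onto each subspace, which do not depend on the sign of the basis vectors.

```python
import numpy as np
from scipy import linalg

A_demo = np.array([[1., -1.],
                   [-2., 2.],
                   [2., -2.]])
U_demo, s_demo, VT_demo = linalg.svd(A_demo)
ColA_demo = linalg.orth(A_demo)
NulA_demo = linalg.null_space(A_demo)

# u_1 and v_2 as column vectors (2D arrays)
u1 = U_demo[:, [0]]
v2 = VT_demo[[1], :].T

# Projectors onto the subspaces are invariant to the sign of the basis vector
same_col_space = np.allclose(ColA_demo @ ColA_demo.T, u1 @ u1.T)
same_null_space = np.allclose(NulA_demo @ NulA_demo.T, v2 @ v2.T)

# Every null-space vector must be mapped to zero by A
maps_to_zero = np.allclose(A_demo @ NulA_demo, 0)

print(same_col_space, same_null_space, maps_to_zero)
```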