<h1 align = "center">Randomized Singular Value Decomposition</h1>

<h6 align = "center">Author: Xinyu Chen</h6>

The accurate and efficient decomposition of large data matrices is one of the cornerstones of modern computational mathematics and data science.

To reproduce this notebook, first clone or download the **tensor-learning** repository ([https://github.com/xinychen/tensor-learning](https://github.com/xinychen/tensor-learning)) to your computer.

**Lemma 1.** Let $\boldsymbol{X}\in\mathbb{R}^{m\times n}$ with $m\ll n$. Then the Singular Value Decomposition $\boldsymbol{X}=\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^\top$ can be computed efficiently from the SVD of the small $m\times m$ matrix $\boldsymbol{X}\boldsymbol{X}^\top$:
\begin{equation}
\begin{aligned}
    \boldsymbol{X}\boldsymbol{X}^\top=\boldsymbol{U}\tilde{\boldsymbol{\Sigma}}\tilde{\boldsymbol{V}}^\top,\quad\boldsymbol{\Sigma}\boldsymbol{\Sigma}^\top=\tilde{\boldsymbol{\Sigma}}\tilde{\boldsymbol{V}}^\top\boldsymbol{U},\quad\boldsymbol{V}=\boldsymbol{X}^\top\boldsymbol{U}\boldsymbol{\Sigma}^{-1},
\end{aligned}
\end{equation}
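where the columns of $\boldsymbol{U}$ and $\boldsymbol{V}$ are the left and right singular vectors, respectively, and $\boldsymbol{\Sigma}$ is the diagonal matrix of singular values $\sigma_1\geq\cdots\geq\sigma_{\min\{m,n\}}\geq 0$. The last identity requires $\boldsymbol{\Sigma}$ to be square and invertible, i.e., it applies to the thin SVD of a matrix $\boldsymbol{X}$ with full row rank.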


*Proof.* Since $\boldsymbol{X}=\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^\top$ is the SVD of $\boldsymbol{X}$ and $\boldsymbol{V}^\top\boldsymbol{V}=\boldsymbol{I}$, we have
\begin{equation}
    \boldsymbol{X}\boldsymbol{X}^\top=\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^\top\boldsymbol{V}\boldsymbol{\Sigma}^\top\boldsymbol{U}^\top=\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{\Sigma}^\top\boldsymbol{U}^\top=\boldsymbol{U}\tilde{\boldsymbol{\Sigma}}\tilde{\boldsymbol{V}}^\top\Rightarrow\boldsymbol{\Sigma}\boldsymbol{\Sigma}^\top=\tilde{\boldsymbol{\Sigma}}\tilde{\boldsymbol{V}}^\top\boldsymbol{U}.
\end{equation}

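Next, expanding $\boldsymbol{X}^\top\boldsymbol{X}$ in two ways and multiplying from the right by $\boldsymbol{V}$ gives
\begin{equation}
    \boldsymbol{X}^\top\boldsymbol{X}=\boldsymbol{X}^\top\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^\top=\boldsymbol{V}\boldsymbol{\Sigma}^\top\boldsymbol{\Sigma}\boldsymbol{V}^\top\Rightarrow\boldsymbol{X}^\top\boldsymbol{U}\boldsymbol{\Sigma}=\boldsymbol{V}\boldsymbol{\Sigma}^\top\boldsymbol{\Sigma}\Rightarrow\boldsymbol{V}=\boldsymbol{X}^\top\boldsymbol{U}\boldsymbol{\Sigma}^{-1},
\end{equation}
where the last step uses that $\boldsymbol{\Sigma}$ is square, diagonal, and invertible.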
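
As a quick sanity check of Lemma 1, the following minimal sketch applies the three identities to a small random matrix and compares the result with `numpy.linalg.svd`. The matrix size, the seed, and the variable names are assumptions chosen for illustration; a Gaussian matrix is used because it has full row rank with probability one, so $\boldsymbol{\Sigma}$ is invertible.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 200))           # m = 5 << n = 200

# SVD of the small m-by-m matrix X X^T.
U, s_tilde, Vt_tilde = np.linalg.svd(X @ X.T)

# Sigma Sigma^T = Sigma_tilde V_tilde^T U; keep its diagonal and take square roots.
SS = np.diag(s_tilde) @ Vt_tilde @ U
sigma = np.sqrt(np.diag(SS))

# V = X^T U Sigma^{-1}; dividing column j by sigma[j] applies the inverse.
V = (X.T @ U) / sigma

# Compare against the reference SVD and check the factorization.
sigma_ref = np.linalg.svd(X, compute_uv=False)
print(np.allclose(sigma, sigma_ref))        # singular values agree
print(np.allclose((U * sigma) @ V.T, X))    # X = U Sigma V^T
print(np.allclose(V.T @ V, np.eye(5)))      # columns of V are orthonormal
```

Only the $m\times m$ matrix $\boldsymbol{X}\boldsymbol{X}^\top$ is decomposed here, which is where the savings for $m\ll n$ come from.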



In [33]:
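import numpy as np
# Silence warnings from dividing by (near-)zero singular values of rank-deficient inputs.
np.seterr(divide='ignore', invalid='ignore')

def fast_svd(mat):
    """SVD of `mat` via the SVD of its smaller Gram matrix (Lemma 1).

    Returns U, S, V with mat ≈ U @ S @ V.T, where S is a diagonal matrix.
    """
    dim1, dim2 = mat.shape
    if dim1 <= dim2:
        # SVD of the small (dim1 x dim1) matrix X X^T.
        U, s_tilde, V_tilde = np.linalg.svd(mat @ mat.T, full_matrices = False)
        # X X^T is symmetric positive semidefinite, so
        # diag(s_tilde) @ V_tilde @ U reduces to diag(s_tilde). Taking the
        # square root of s_tilde directly avoids NaNs from tiny negative
        # off-diagonal rounding errors.
        sigma = np.sqrt(s_tilde)
        # V = X^T U Sigma^{-1}; dividing columns avoids an explicit inverse.
        V = (mat.T @ U) / sigma
        return U, np.diag(sigma), V
    else:
        # mat.T = U0 S V0^T implies mat = V0 S U0^T.
        U0, S, V0 = fast_svd(mat.T)
        return V0, S, U0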

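def rsvd(mat, rank, power):
    """Randomized rank-`rank` SVD of `mat` with `power` power iterations.

    Returns U, S, V with mat ≈ U @ S @ V.T, where S is a diagonal matrix.
    """
    dim1, dim2 = mat.shape
    # Sketch the column space of mat with a Gaussian test matrix.
    Phi = np.random.randn(dim2, rank)
    A = mat @ Phi
    # Power iterations sharpen the sketch when singular values decay slowly.
    for k in range(power):
        A = mat @ (mat.T @ A)
    # Orthonormal basis Q for the sketched column space.
    Q, R = np.linalg.qr(A)
    # SVD of the small (rank x dim2) matrix Q^T mat, lifted back by Q.
    U_tilde, S, V = fast_svd(Q.T @ mat)
    return Q @ U_tilde, S, V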
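
The first half of `rsvd` (Gaussian sketch followed by QR) builds an orthonormal basis `Q` that should capture the column space of the input. The following self-contained sketch illustrates that step in isolation. The synthetic matrix of exact rank 20, its size, and the seed are assumptions made for illustration and are not part of the original experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical test matrix of exact rank 20.
X = rng.standard_normal((300, 20)) @ rng.standard_normal((20, 400))

rank = 30                                    # target rank, slightly above the true rank
Phi = rng.standard_normal((X.shape[1], rank))
Q, _ = np.linalg.qr(X @ Phi)                 # orthonormal basis of the sketch

# If Q spans the column space of X, projecting onto Q loses nothing,
# so the relative residual should be at the level of rounding error.
rel_err = np.linalg.norm(X - Q @ (Q.T @ X)) / np.linalg.norm(X)
print(rel_err)
```

When the residual is negligible, the SVD of the small matrix `Q.T @ X` carries the same information as the SVD of `X`, which is what the second half of `rsvd` relies on.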

### Testing `fast_svd` against `numpy.linalg.svd`

In [2]:
import time

mat = np.random.rand(10000, 18000)
start = time.time()
U, S, V = fast_svd(mat)
end = time.time()
print(np.diag(S))
print(end - start)

[6708.2749416    67.51813228   67.46474809 ...    9.96231502    9.92972879
    9.86411576]
617.9077150821686


In [3]:
start = time.time()
U, S, V = np.linalg.svd(mat, full_matrices = 0)
end = time.time()
print(S)
print(end - start)

[6708.2749416    67.51813228   67.46474809 ...    9.96231502    9.92972879
    9.86411576]
736.6943187713623


### Testing `rsvd` against `fast_svd`

In [34]:
import time

mat = np.random.rand(10000, 9000)
start = time.time()
U, S, V = fast_svd(mat)
end = time.time()
print(np.diag(S)[:10])
print(end - start)

[4743.47345882   56.30120971   56.14880174   56.13332907   56.03742922
   56.00312683   55.98918016   55.8925917    55.86384718   55.80551466]
387.3423180580139


In [43]:
import time

start = time.time()
U, S, V = rsvd(mat, 100, 2)
end = time.time()
print(np.diag(S)[:10])
print(end - start)

[4743.47345882   52.08820533   52.03989705   51.98184477   51.84658782
   51.78052861   51.68910186   51.59906955   51.51632298   51.43246594]
1.8465938568115234
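
In the run above, `rsvd` with rank 100 and two power iterations matches the largest singular value (4743.47) in about 1.8 seconds instead of about 387 seconds, but it underestimates the following ones (about 52.1 against 56.3). Singular values computed from the projected matrix `Q.T @ mat` can never exceed the exact ones, and the gap tends to be larger when the spectrum decays slowly. The sketch below illustrates the effect of power iterations under assumptions of its own: a synthetic matrix with prescribed singular values $1/\sqrt{i}$, a hypothetical helper `rsvd_values`, and `numpy.linalg.svd` in place of `fast_svd` for the small matrix.

```python
import numpy as np

def rsvd_values(X, rank, power, rng):
    """Leading singular values from a randomized sketch (hypothetical helper)."""
    A = X @ rng.standard_normal((X.shape[1], rank))
    for _ in range(power):
        A = X @ (X.T @ A)                    # one power iteration
    Q, _ = np.linalg.qr(A)
    return np.linalg.svd(Q.T @ X, compute_uv=False)

rng = np.random.default_rng(0)
m, n = 200, 300
# Synthetic matrix with known, slowly decaying singular values 1/sqrt(i).
s_true = 1.0 / np.sqrt(np.arange(1, m + 1))
U, _ = np.linalg.qr(rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((n, m)))
X = (U * s_true) @ V.T

estimates, errors = {}, {}
for power in (0, 2):
    estimates[power] = rsvd_values(X, rank=10, power=power, rng=rng)
    # Largest absolute error over the five leading singular values.
    errors[power] = np.abs(estimates[power][:5] - s_true[:5]).max()
    print(f"power={power}: max error on top 5 singular values = {errors[power]:.4f}")
```

Raising `power` or `rank` is the usual way to trade some of the speedup for accuracy on the trailing singular values.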
