<h1 align = "center">Randomized Singular Value Decomposition</h1>

<h6 align = "center">Author: Xinyu Chen</h6>

In both machine learning and signal processing, matrix decomposition is a foundational tool for critical applications such as data compression, dimensionality reduction, and sparsity learning. When the goal is to approximate a data matrix by a low-rank structure, the Singular Value Decomposition (SVD) is the best choice, because its truncation gives the optimal low-rank approximation in both the Frobenius and spectral norms. However, computing an accurate SVD of a large-scale data matrix efficiently is computationally challenging. Many methods address this problem by applying randomized linear algebra, and one of the most important of them is the randomized SVD. This post introduces the essential idea of the randomized SVD. To help readers gain a better understanding, we also provide the corresponding Python implementation.

> To reproduce this notebook, please first clone or download the **tensor-learning** repository ([https://github.com/xinychen/tensor-learning](https://github.com/xinychen/tensor-learning)) to your computer.

### Power Iterations



In [1]:
import numpy as np
np.seterr(divide='ignore', invalid='ignore')

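def power_iteration(mat, Phi, power_iter = 3):
    # Sketch the range of `mat` with the random test matrix `Phi`.
    B = mat @ Phi
    # Each pass multiplies by mat @ mat.T, which amplifies the leading
    # singular directions relative to the trailing ones.
    for _ in range(power_iter):
        B = mat @ (mat.T @ B)
    # Orthonormalize the sketch to obtain a basis Q for the approximate range.
    Q, _ = np.linalg.qr(B)
    return Q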
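
To see why the extra multiplications help, the following minimal sketch compares the range approximation error $\|\boldsymbol{A}-\boldsymbol{Q}\boldsymbol{Q}^\top\boldsymbol{A}\|_2$ with and without power iterations. The test matrix (its size and its slowly decaying spectrum $\sigma_i=1/\sqrt{i}$), the seed, and the helper name `range_error` are assumptions made for illustration only; `power_iteration` is repeated from the cell above so that the snippet runs on its own.

```python
import numpy as np

def power_iteration(mat, Phi, power_iter = 3):
    # Same routine as above, repeated so this snippet is self-contained.
    B = mat @ Phi
    for _ in range(power_iter):
        B = mat @ (mat.T @ B)
    Q, _ = np.linalg.qr(B)
    return Q

def range_error(mat, Q):
    # Hypothetical helper: spectral norm of the part of `mat`
    # that lies outside the range of Q.
    return np.linalg.norm(mat - Q @ (Q.T @ mat), 2)

rng = np.random.default_rng(0)
m, n, k = 200, 100, 10

# Synthetic matrix with prescribed, slowly decaying singular values.
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = 1.0 / np.sqrt(np.arange(1, n + 1))
A = (U * sigma) @ V.T

# Use the same random test matrix for both runs.
Phi = rng.standard_normal((n, k))
err_q0 = range_error(A, power_iteration(A, Phi, power_iter = 0))
err_q3 = range_error(A, power_iteration(A, Phi, power_iter = 3))

print('Best possible error (sigma_{k+1}):', sigma[k])
print('Error without power iterations:   ', err_q0)
print('Error with 3 power iterations:    ', err_q3)
```

No rank-`k` basis can do better than $\sigma_{k+1}$, so the first printed value is a lower bound for the other two. When the spectrum decays slowly, as here, the run with power iterations is expected to land much closer to that bound.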

### Randomized Singular Value Decomposition
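
In matrix notation, the implementation in this section can be read as the following four steps. This is a summary inferred from the code, with $\boldsymbol{X}\in\mathbb{R}^{m\times n}$ the data matrix, $r$ the target rank, $q$ the number of power iterations, and $\boldsymbol{\Phi}\in\mathbb{R}^{n\times r}$ a Gaussian random matrix; the symbols $\boldsymbol{Y}$, $\tilde{\boldsymbol{U}}$, and $\boldsymbol{\Sigma}$ are notation introduced here for illustration.

$$\begin{aligned}
\boldsymbol{Y} &= \left(\boldsymbol{X}\boldsymbol{X}^\top\right)^{q}\boldsymbol{X}\boldsymbol{\Phi} && \text{(random sketch with power iterations)} \\
\boldsymbol{Y} &= \boldsymbol{Q}\boldsymbol{R} && \text{(QR factorization, } \boldsymbol{Q}\in\mathbb{R}^{m\times r}\text{)} \\
\boldsymbol{Q}^\top\boldsymbol{X} &= \tilde{\boldsymbol{U}}\boldsymbol{\Sigma}\boldsymbol{V}^\top && \text{(SVD of a small } r\times n \text{ matrix)} \\
\boldsymbol{X} &\approx \boldsymbol{Q}\boldsymbol{Q}^\top\boldsymbol{X} = \left(\boldsymbol{Q}\tilde{\boldsymbol{U}}\right)\boldsymbol{\Sigma}\boldsymbol{V}^\top && \text{(approximate SVD of } \boldsymbol{X}\text{)}
\end{aligned}$$

The expensive SVD is therefore applied only to the $r\times n$ matrix $\boldsymbol{Q}^\top\boldsymbol{X}$, and the products with $\boldsymbol{X}$ are the only operations that touch the full data.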

In [7]:
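def rsvd(mat, rank, power_iter):
    dim1, dim2 = mat.shape
    # Gaussian random test matrix with `rank` columns.
    Phi = np.random.randn(dim2, rank)
    # Sketch the range of `mat`, then sharpen it with power iterations.
    A = mat @ Phi
    for k in range(power_iter):
        A = mat @ (mat.T @ A)
    # Orthonormal basis for the approximate range of `mat`.
    Q, R = np.linalg.qr(A)
    # SVD of the small (rank x dim2) projected matrix; `v` holds the
    # right singular vectors as rows.
    u_tilde, s, v = np.linalg.svd(Q.T @ mat, full_matrices = False)
    # Map the left singular vectors back to the original space.
    return Q @ u_tilde, s, v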
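
One way to sanity-check the function above is to apply it to a matrix whose rank is known exactly. The sketch below is an illustration under assumed settings (a 100-by-80 matrix of rank 5, one power iteration, a fixed seed); `rsvd` is repeated so that the snippet runs on its own.

```python
import numpy as np

def rsvd(mat, rank, power_iter):
    # Same steps as the function above, repeated for self-containment.
    Phi = np.random.randn(mat.shape[1], rank)
    A = mat @ Phi
    for _ in range(power_iter):
        A = mat @ (mat.T @ A)
    Q, _ = np.linalg.qr(A)
    u_tilde, s, v = np.linalg.svd(Q.T @ mat, full_matrices = False)
    return Q @ u_tilde, s, v

np.random.seed(0)
true_rank = 5
# Product of two thin Gaussian factors: rank 5 with probability one.
low_rank = np.random.randn(100, true_rank) @ np.random.randn(true_rank, 80)

u, s, v = rsvd(low_rank, rank = true_rank, power_iter = 1)

# Relative reconstruction error of u @ diag(s) @ v.
rel_err = np.linalg.norm(low_rank - (u * s) @ v) / np.linalg.norm(low_rank)
# Deviation of the columns of u from orthonormality.
orth_err = np.linalg.norm(u.T @ u - np.eye(true_rank))

print(u.shape, s.shape, v.shape)
print('Relative reconstruction error:', rel_err)
print('Orthonormality error:', orth_err)
```

The returned factors have shapes `(100, 5)`, `(5,)`, and `(5, 80)`. Because the target rank equals the true rank, both errors should be at the level of floating-point round-off.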

### Small Worked Example

To make the procedure concrete, we work through a small example: the SVD of a 5-by-4 matrix, i.e.,
$$\boldsymbol{X}=\left(\begin{array}{cccc}
1 & 3 & 2 & 4 \\
5 & 3 & 1 & 2 \\
3 & 4 & 5 & 2 \\
4 & 4 & 2 & 1 \\
4 & 2 & 3 & 3 \\
\end{array}\right)\in\mathbb{R}^{5\times 4}.$$

In [9]:
mat = np.array([[1, 3, 2, 4],
                [5, 3, 1, 2],
                [3, 4, 5, 2],
                [4, 4, 2, 1],
                [4, 2, 3, 3]])
u, s, v = np.linalg.svd(mat, full_matrices = 0)
print('Left singular vectors:')
print(u)
print()
print('Singular values:')
print(s)
print()
print('Right singular vectors:')
print(v)
print()

Left singular vectors:
[[-0.35579275  0.61467653  0.58103306 -0.39692223]
 [-0.43905533 -0.5883267   0.34849729 -0.03803043]
 [-0.53040969  0.37421029 -0.65683124  0.0736076 ]
 [-0.44100936 -0.36870485 -0.24920217 -0.50936999]
 [-0.45256849  0.00823753  0.21776416  0.75903265]]

Singular values:
[13.1975984   3.6191375   2.70009861  1.85329644]

Right singular vectors:
[[-0.58469804 -0.5436866  -0.4578419  -0.39104205]
 [-0.7311674   0.0324791   0.49718495  0.46598977]
 [ 0.0841724  -0.14814801 -0.59949834  0.78202872]
 [ 0.34122931 -0.82547087  0.42870697  0.1355387 ]]



In [10]:
rank = 3
power_iter = 3
u, s, v = rsvd(mat, rank, power_iter)
print('Left singular vectors:')
print(u)
print()
print('Singular values:')
print(s)
print()
print('Right singular vectors:')
print(v)
print()

Left singular vectors:
[[-0.35579222 -0.61522273  0.58260656]
 [-0.43905528  0.58827643  0.34864337]
 [-0.53040979 -0.37411357 -0.65711692]
 [-0.44100867  0.36799365 -0.24717929]
 [-0.45256952 -0.00717968  0.21474902]]

Singular values:
[13.1975984   3.61913492  2.70008736]

Right singular vectors:
[[-0.5846981  -0.54368644 -0.45784198 -0.39104207]
 [ 0.73141086 -0.03306811 -0.49688356 -0.46588774]
 [ 0.08323863 -0.14589787 -0.60066191  0.78165876]]
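
Comparing the two outputs, the second left and right singular vectors appear with opposite signs. This is expected: singular vectors are determined only up to a simultaneous sign flip of each pair $(\boldsymbol{u}_i,\boldsymbol{v}_i)$. The sketch below shows one possible way to align the signs before comparing the two decompositions; the seed and the alignment rule are assumptions chosen for illustration, and `rsvd` is repeated so that the snippet runs on its own.

```python
import numpy as np

def rsvd(mat, rank, power_iter):
    # Same steps as the function above, repeated for self-containment.
    Phi = np.random.randn(mat.shape[1], rank)
    A = mat @ Phi
    for _ in range(power_iter):
        A = mat @ (mat.T @ A)
    Q, _ = np.linalg.qr(A)
    u_tilde, s, v = np.linalg.svd(Q.T @ mat, full_matrices = False)
    return Q @ u_tilde, s, v

mat = np.array([[1, 3, 2, 4],
                [5, 3, 1, 2],
                [3, 4, 5, 2],
                [4, 4, 2, 1],
                [4, 2, 3, 3]])
rank = 3

np.random.seed(1)
u, s, v = np.linalg.svd(mat, full_matrices = False)
u_r, s_r, v_r = rsvd(mat, rank, 3)

# Flip each randomized pair (u_i, v_i) so that u_i points the same way
# as the corresponding exact left singular vector.
signs = np.sign(np.sum(u[:, :rank] * u_r, axis = 0))
u_r, v_r = u_r * signs, v_r * signs[:, None]

print('Max abs difference, left singular vectors: ', np.abs(u[:, :rank] - u_r).max())
print('Max abs difference, right singular vectors:', np.abs(v[:rank] - v_r).max())
print('Exact minus randomized singular values:    ', s[:rank] - s_r)
```

Flipping both members of a pair leaves the product $\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^\top$ unchanged. The last line is non-negative up to round-off, since the singular values of $\boldsymbol{Q}^\top\boldsymbol{X}$ cannot exceed those of $\boldsymbol{X}$.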



In [34]:
import time

mat = np.random.rand(10000, 9000)
start = time.time()
# NOTE: `fast_svd` is not defined in this notebook and must be available beforehand.
U, S, V = fast_svd(mat)
end = time.time()
print(np.diag(S[:10]))
print(end - start)

[4743.47345882   56.30120971   56.14880174   56.13332907   56.03742922
   56.00312683   55.98918016   55.8925917    55.86384718   55.80551466]
387.3423180580139


In [43]:
import time

start = time.time()
U, S, V = rsvd(mat, 100, 2)
end = time.time()
print(S[:10])  # rsvd returns the singular values as a 1-D array
print(end - start)

[4743.47345882   52.08820533   52.03989705   51.98184477   51.84658782
   51.78052861   51.68910186   51.59906955   51.51632298   51.43246594]
1.8465938568115234
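
In the timing runs above, the largest singular value is matched closely, while the next ones are underestimated by the randomized method (about 52 versus 56). The sketch below illustrates, on a much smaller matrix, how that gap tends to shrink as the number of power iterations grows. The matrix size, target rank, seed, and the dictionary name `mean_gap` are assumptions chosen so that the snippet runs in a few seconds; `rsvd` is repeated so that the snippet runs on its own.

```python
import numpy as np

def rsvd(mat, rank, power_iter):
    # Same steps as the function above, repeated for self-containment.
    Phi = np.random.randn(mat.shape[1], rank)
    A = mat @ Phi
    for _ in range(power_iter):
        A = mat @ (mat.T @ A)
    Q, _ = np.linalg.qr(A)
    u_tilde, s, v = np.linalg.svd(Q.T @ mat, full_matrices = False)
    return Q @ u_tilde, s, v

np.random.seed(0)
mat = np.random.rand(600, 500)

# Exact singular values as the reference.
s_exact = np.linalg.svd(mat, compute_uv = False)

mean_gap = {}
for power_iter in (0, 1, 2):
    _, s_approx, _ = rsvd(mat, 50, power_iter)
    # Average shortfall over the 2nd to 10th singular values.
    mean_gap[power_iter] = np.mean(s_exact[1:10] - s_approx[1:10])
    print('power_iter =', power_iter, ' mean gap =', mean_gap[power_iter])
```

A matrix with uniform random entries has one dominant singular value followed by a nearly flat spectrum, which is a hard case for randomized methods. On such data, more power iterations trade extra matrix products for better estimates of the trailing singular values.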


### Summary

In this post, you discovered the randomized linear algebra method for SVD.

Specifically, you learned:

- The essential idea of randomized SVD.
- How to implement randomized SVD step-by-step.

Do you have any questions? Ask them by creating an issue at the **tensor-learning** repository ([https://github.com/xinychen/tensor-learning](https://github.com/xinychen/tensor-learning)) and I will do my best to answer. If you find this code useful, please star (★) this repository.
