In [55]:
import numpy as np

import metrics
import svd as ttsvd

np.random.seed(12)

# SVD (Singular Value Decomposition)

Let $M \in \mathbb{R}^{m \times n}$.  
The singular value decomposition of $M$ is:

$$M = U \Sigma V^T$$

$U$: $m \times m$ orthogonal matrix.  
$\Sigma$: $m \times n$ rectangular diagonal matrix with non-negative numbers on the diagonal.  
$V$: $n \times n$ orthogonal matrix.

The columns of $U$ are the left-singular vectors of $M$; they are eigenvectors of $MM^T$.  
The columns of $V$ are the right-singular vectors of $M$; they are eigenvectors of $M^TM$.  
The diagonal entries of $\Sigma$ are the singular values of $M$; they are the square roots of the $\min(m, n)$ largest eigenvalues of $MM^T$ (or $M^TM$).
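
The properties above can be checked numerically. The following is a minimal sketch that relies on `np.linalg.svd` rather than the implementation developed in this notebook; the matrix size and the seed are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 3))

# Full SVD: U is 5x5, Vt is 3x3, s holds the 3 singular values (descending).
U, s, Vt = np.linalg.svd(M)

# Rebuild the rectangular diagonal Sigma (5x3) and check M = U Sigma V^T.
Sigma = np.zeros(M.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, M))

# Squared singular values are the eigenvalues of M^T M
# (eigvalsh returns them in ascending order, hence the reversal).
eigvals = np.linalg.eigvalsh(M.T @ M)[::-1]
print(np.allclose(s**2, eigvals))

# Each column of V is an eigenvector of M^T M: (M^T M) v_i = sigma_i^2 v_i.
V = Vt.T
print(np.allclose(M.T @ M @ V, V * s**2))
```

All three checks should print `True` up to floating-point tolerance.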

## Implementation (Golub-Reinsch SVD)

### Resources

- http://people.duke.edu/~hpgavin/SystemID/References/Golub+Reinsch-NM-1970.pdf
- http://www.cs.utexas.edu/users/inderjit/public_papers/HLA_SVD.pdf

1. Use Householder transformations to reduce $A$ to bidiagonal form.
2. Use QR iterations to find the singular values of the bidiagonal matrix.
3. Combine the results to get the SVD of $A$.

### Householder transformation

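Let $x \in \mathbb{R}^n$ and let $v \in \mathbb{R}^n$ be a unit vector ($v^Tv = 1$). The Householder transformation reflects $x$ across the hyperplane orthogonal to $v$:
$$x \to x - 2v(v^Tx)$$

The Householder matrix is the reflection matrix for that transformation. It is symmetric and orthogonal.
$$P = I - 2vv^T$$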
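
As a small worked example, the sketch below builds $P$ for one specific vector and checks that it sends $x$ onto the first axis. The vector `x` is an arbitrary choice, and picking $v \propto x + \mathrm{sign}(x_0)\lVert x \rVert e_1$ is the usual way to avoid cancellation; both are assumptions of this illustration.

```python
import numpy as np

x = np.array([3.0, 4.0, 0.0])

# v = x + sign(x_0) * ||x|| * e_1, then normalized to a unit vector.
v = x.copy()
v[0] += np.copysign(np.linalg.norm(x), x[0])
v /= np.linalg.norm(v)

P = np.eye(len(x)) - 2.0 * np.outer(v, v)

print(P @ x)                                # only the first entry is non-zero
print(np.allclose(P, P.T))                  # P is symmetric
print(np.allclose(P @ P, np.eye(len(x))))   # P is its own inverse, hence orthogonal
```

Here $\lVert x \rVert = 5$, so $Px = -5\,e_1$: the reflection keeps the norm and zeroes every entry but the first.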

### Bidiagonalization

Transform the matrix $A \in \mathbb{R}^{m \times n}$, $m \geq n$, to bidiagonal form.

$$P^TAQ = J^0$$

$J^0$: upper bidiagonal matrix of size $m \times n$.  
$P$: orthogonal matrix of size $m \times m$.  
$Q$: orthogonal matrix of size $n \times n$.  
$P$ and $Q$ are products of Householder matrices.

In [333]:
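def house_vect(x):
    # Householder vector v = x + sign(x_0) * ||x|| * e_1 (not normalized).
    # np.copysign treats x[0] == 0 as positive, whereas np.sign would return 0
    # and leave v = x, which reflects x to -x instead of onto the first axis.
    v = x.copy()
    v[0] = x[0] + np.copysign(np.linalg.norm(x), x[0])
    return v
    
def house_mat(v):
    # Reflection I - 2 v v^T / (v^T v); the division normalizes v.
    # Assumes v is non-zero (x was not the zero vector).
    return np.eye(len(v)) - 2 * np.outer(v, v) / (v@v)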

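def bidiagonalize(A):
    # Returns P (m x m), B (m x n, upper bidiagonal), Q (n x n)
    # with P and Q orthogonal and A = P @ B @ Q.T. Assumes m >= n.
    m, n = A.shape
    P = np.eye(m)
    Q = np.eye(n)
    B = A.copy()
    
    for j in range(n):
        
        # Left reflection: zero out column j below the diagonal.
        B_sub = B[j:, j:]
        v = house_vect(B_sub[:, 0])
        H = np.eye(m)
        H[j:, j:] = house_mat(v)
        B = H @ B
        P = P @ H.T
        
        if j < n - 2:
        
            # Right reflection: zero out row j right of the superdiagonal.
            B_sub = B[j:, j+1:]
            v = house_vect(B_sub[0])
            H = np.eye(n)
            H[j+1:, j+1:] = house_mat(v)
            B = B @ H
            Q = Q @ H.T
        
    return P, B, Q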

In [334]:
A = np.random.randn(5, 4)
print(A)
U, B, V = bidiagonalize(A)

B[np.abs(B) < 1e-10] = 0
print(B)

print(metrics.tdist(U.T @ U, np.eye(A.shape[0])))
print(metrics.tdist(U @ U.T, np.eye(A.shape[0])))
print(metrics.tdist(V.T @ V, np.eye(A.shape[1])))
print(metrics.tdist(V @ V.T, np.eye(A.shape[1])))
print(metrics.tdist(U @ B @ V.T, A))
print(metrics.tdist(U.T @ A @ V, B))

[[-0.84025229  0.25448849 -0.19640052 -1.06657939]
 [-1.01254346 -2.11589758 -1.02469004 -1.03484659]
 [ 0.71226925  2.01759503 -1.12228014 -0.32515742]
 [ 0.30572838  2.00714828 -1.2403303   1.7976405 ]
 [-0.22123684  1.45673859 -0.73226059  0.0805405 ]]
[[ 1.54305257 -2.78328288  0.          0.        ]
 [ 0.         -2.93562704  1.49696294  0.        ]
 [ 0.          0.         -1.84518027  0.30308817]
 [ 0.          0.          0.         -1.62361101]
 [ 0.          0.          0.          0.        ]]
8.134087991091807e-16
7.256540328333924e-16
3.1819185660814247e-16
3.3740035919205897e-16
1.89247439205462e-15
9.29607591659322e-16


In [335]:
A = np.random.randn(5, 5)
print(A)
U, B, V = bidiagonalize(A)

B[np.abs(B) < 1e-10] = 0
print(B)

print(metrics.tdist(U.T @ U, np.eye(A.shape[0])))
print(metrics.tdist(U @ U.T, np.eye(A.shape[0])))
print(metrics.tdist(V.T @ V, np.eye(A.shape[1])))
print(metrics.tdist(V @ V.T, np.eye(A.shape[1])))
print(metrics.tdist(U @ B @ V.T, A))
print(metrics.tdist(U.T @ A @ V, B))

[[ 1.94288192  0.39300256  1.29424839  1.44871386 -0.23592944]
 [ 1.09422847 -0.45159208 -0.01750505  0.99546323 -0.79615515]
 [ 1.09839163 -1.2093778   0.64620034 -0.45895617  0.15703196]
 [-0.73949431  0.93267878  2.10169939 -1.16836193 -0.40287577]
 [-1.36951958 -0.42852993 -1.11420516  1.04906738 -0.62795135]]
[[-2.93275059 -1.50320825  0.          0.          0.        ]
 [ 0.         -1.37696001  1.38393211  0.          0.        ]
 [ 0.          0.         -1.58352148  2.06499593  0.        ]
 [ 0.          0.          0.         -1.73348536  0.92707749]
 [ 0.          0.          0.          0.         -0.97761746]]
7.576491868655485e-16
7.537385281391632e-16
7.66003098431927e-16
7.784611273070291e-16
2.5601062928726914e-15
1.0012112114945322e-15


In [336]:
A = np.random.randn(7, 3)
print(A)
U, B, V = bidiagonalize(A)

B[np.abs(B) < 1e-10] = 0
print(B)

print(metrics.tdist(U.T @ U, np.eye(A.shape[0])))
print(metrics.tdist(U @ U.T, np.eye(A.shape[0])))
print(metrics.tdist(V.T @ V, np.eye(A.shape[1])))
print(metrics.tdist(V @ V.T, np.eye(A.shape[1])))
print(metrics.tdist(U @ B @ V.T, A))
print(metrics.tdist(U.T @ A @ V, B))

[[ 0.06596372 -0.89782288 -0.27252311]
 [ 0.29133411  0.64422717  0.34720509]
 [-1.27497946  1.00709587  1.35958683]
 [ 1.11703865  0.172686    0.40592015]
 [ 0.03063637  1.33592507 -0.5857411 ]
 [-1.17130874 -0.15749017  0.06591547]
 [ 1.29171742  1.1647744   1.77292261]]
[[-2.45030843  0.51379642  0.        ]
 [ 0.          2.8997257   0.23739381]
 [ 0.          0.         -1.54963981]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]]
5.143597635264457e-16
5.715984939371122e-16
3.5875141997657774e-16
3.5875141997657774e-16
1.1406885234323035e-15
5.27272358443338e-16


Since $P$ and $Q$ are orthogonal, $J^0$ has the same singular values as $A$. If the SVD of $J^0$ is $G\Sigma H^T$, then:  
$$J^0 = G\Sigma H^T$$
$$A = PG \Sigma H^T Q^T$$
$$U = PG$$
$$V = QH$$
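
The relations above hold for any orthogonal $P$ and $Q$, which the following sketch checks. As a simplifying assumption it uses random orthogonal matrices (taken from QR factorizations) in place of the Householder products, so `J` is not bidiagonal here; `np.linalg.svd` stands in for the SVD of $J^0$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))

# Random orthogonal P (5x5) and Q (3x3), standing in for the Householder products.
P, _ = np.linalg.qr(rng.standard_normal((5, 5)))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

J = P.T @ A @ Q

# Orthogonal transformations leave the singular values unchanged.
s_A = np.linalg.svd(A, compute_uv=False)
s_J = np.linalg.svd(J, compute_uv=False)
print(np.allclose(s_A, s_J))

# SVD of J, then map the singular vectors back: U = P G, V = Q H.
G, s, Ht = np.linalg.svd(J)
U = P @ G
V = Q @ Ht.T

Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
print(np.allclose(U @ Sigma @ V.T, A))
```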

### SVD of a bidiagonal matrix

### Eigenvalues and eigenvectors of a symmetric matrix

Let $B = P^{-1}AP$ for some invertible matrix $P$.  
The matrices $A$ and $B$ are said to be similar. Similar matrices share several properties; in particular, they have the same eigenvalues.
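
A quick numerical check of this property, as a sketch: the matrices below are arbitrary, and `P` is built as a small perturbation of the identity only so that it is safely invertible.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((4, 4))
A = X + X.T                                       # symmetric, so eigenvalues are real
P = np.eye(4) + 0.1 * rng.standard_normal((4, 4)) # well-conditioned, invertible

B = np.linalg.inv(P) @ A @ P

# B is not symmetric in general, so use the general eigenvalue routine
# and compare the sorted real parts with the eigenvalues of A.
eig_A = np.sort(np.linalg.eigvalsh(A))
eig_B = np.sort(np.linalg.eigvals(B).real)
print(np.allclose(eig_A, eig_B))
```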

The QR algorithm:  
- $[Q_k, R_k] \leftarrow qr(A_k)$
- $A_{k+1} \leftarrow R_kQ_k$

Under suitable conditions (for example, eigenvalues with distinct absolute values), $A_k$ converges as $k \to \infty$ to a triangular matrix with the eigenvalues on its diagonal.

$$A = QR$$
$$Q^T = Q^{-1}$$
$$R = Q^TA$$
$$A_{k+1} = R_kQ_k = Q_k^TA_kQ_k$$
$A_1$, $A_2$, ..., $A_k$ are similar, so they share the same eigenvalues, which appear on the diagonal of $A_k$ in the limit.

If $A$ is symmetric, the product $Q_1Q_2 \cdots Q_k$ converges to a matrix whose columns are eigenvectors of $A$.
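
To see why the accumulated product gives the eigenvectors, unroll the recurrence with $A_1 = A$ (the notation $\hat{Q}_k$ is introduced here only for this derivation):

$$A_{k+1} = Q_k^T A_k Q_k = Q_k^T Q_{k-1}^T A_{k-1} Q_{k-1} Q_k = \dots = (Q_1 Q_2 \cdots Q_k)^T A \, (Q_1 Q_2 \cdots Q_k)$$

$$\hat{Q}_k = Q_1 Q_2 \cdots Q_k \quad \Rightarrow \quad A \hat{Q}_k = \hat{Q}_k A_{k+1}$$

When $A$ is symmetric, every $A_k$ is symmetric too, so a triangular limit must be a diagonal matrix $D$. In the limit $A \hat{Q} = \hat{Q} D$, which says that column $i$ of $\hat{Q}$ is an eigenvector of $A$ for the eigenvalue $D_{ii}$.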

In [160]:
def qr_algorithm(A, max_iters=100, prec=1e-6):
    
    Ak = A.copy()
    Qk = np.eye(A.shape[0])
 
    for k in range(max_iters):
        
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
        Qk = Qk @ Q
        
        if np.linalg.norm(Ak - np.triu(Ak)) < prec:
            break
    
    vals = np.diag(Ak)
    vects = Qk
    return vals, vects

def test_eig(A, fn): 

    vals, vects = fn(A)
    vals_sol, vects_sol = np.linalg.eigh(A)
    vals_sol = vals_sol[::-1]
    vects_sol = vects_sol[:, ::-1]

    for i in range(vects.shape[1]):
        if vects[0, i] < 0: vects[:, i] = -vects[:, i]
        if vects_sol[0, i] < 0: vects_sol[:, i] = -vects_sol[:, i]

    #print(A @ vects_sol[:, 0] - vects_sol[:, 0] * vals_sol[0])
    print(vals)
    print(vals_sol)
    print(metrics.tdist(vals, vals_sol))
    print(metrics.tdist(vects, vects_sol))

In [145]:
A = np.random.randn(4, 3)
AAT = A @ A.T
ATA = A.T @ A

test_eig(AAT, qr_algorithm)
test_eig(ATA, qr_algorithm)

[ 1.58963726e+01  4.45972443e+00  1.71017814e+00 -1.46480942e-16]
[1.58963726e+01 4.45972443e+00 1.71017814e+00 8.43901255e-16]
1.6664235230451168e-13
3.007674017122803e-07
[15.89637263  4.45972443  1.71017814]
[15.89637263  4.45972443  1.71017814]
1.0388650559302042e-13
2.3590020677700243e-07


In [159]:
def qr_algorithm_shift(A, max_iters=100, prec=1e-6):
    
    Ak = A.copy()
    Qk = np.eye(A.shape[0])
    I = np.eye(A.shape[0])
        
    for k in range(max_iters):
        
        # Fixed shift for simplicity; practical variants choose a new shift each iteration.
        lbda = 0.01
        
        Q, R = np.linalg.qr(Ak - lbda * I)
        Ak = R @ Q + lbda * I
        Qk = Qk @ Q
        
        if np.linalg.norm(Ak - np.triu(Ak)) < prec:
            break
    
    vals = np.diag(Ak)
    vects = Qk
    return vals, vects

In [162]:
A = np.random.randn(4, 3)
AAT = A @ A.T
ATA = A.T @ A

test_eig(AAT, qr_algorithm_shift)
test_eig(ATA, qr_algorithm_shift)

[8.16983204e+00 4.58328010e+00 6.77661128e-01 5.34294831e-16]
[8.16983204e+00 4.58328010e+00 6.77661128e-01 9.87830910e-16]
3.963764521953199e-13
3.9265728241119094e-07
[8.16983204 4.5832801  0.67766113]
[8.16983204 4.5832801  0.67766113]
2.700588239800599e-13
3.25426392462262e-07


## SVD by computing $AA^T$

$$A v_i = \sigma_i u_i$$
$$A^T u_i = \sigma_i v_i$$

We can compute the SVD of $A$ naively:
- Find the left singular vectors and the singular values by applying the QR algorithm on $AA^T$
- Find the right singular vectors by computing $v_i = \frac{A^T u_i}{\sigma_i}$ (valid only for $\sigma_i > 0$)
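
The two relations, and the recovery formula for $v_i$, can be checked against a reference SVD. This sketch uses `np.linalg.svd` on an arbitrary random $4 \times 3$ matrix, which has full column rank with probability 1.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))

U, s, Vt = np.linalg.svd(A)
V = Vt.T

# Only the first n = 3 columns of U pair up with a singular value.
for i in range(len(s)):
    u_i, v_i = U[:, i], V[:, i]
    assert np.allclose(A @ v_i, s[i] * u_i)     # A v_i = sigma_i u_i
    assert np.allclose(A.T @ u_i, s[i] * v_i)   # A^T u_i = sigma_i v_i
    assert np.allclose(v_i, A.T @ u_i / s[i])   # recovery formula for v_i

# The remaining column of U is orthogonal to the column space of A.
print(np.allclose(A.T @ U[:, 3], 0))
```

The last check illustrates why the formula cannot produce a right singular vector from the extra columns of $U$: for those columns $A^T u_i = 0$.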

In [337]:
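def svd_naive(A):
    # Assumes m >= n and full column rank (all singular values > 0).
    # Eigenvectors of A A^T are the left singular vectors; its eigenvalues
    # are the squared singular values, expected in descending order.
    lvals, lvects = qr_algorithm(A @ A.T)
    U = lvects 
    S = np.sqrt(lvals[:A.shape[1]])
    
    # Row i of V^T is v_i = A^T u_i / sigma_i.
    VT = np.empty((A.shape[1], A.shape[1]))
    for i in range(A.shape[1]):
        VT[i] = (A.T @ U[:, i]) / S[i]
    
    return U, S, VT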

In [338]:
A = np.random.randn(4, 3)
U_sol, S_sol, VT_sol = np.linalg.svd(A)

U, S, VT = svd_naive(A)

print(metrics.tdist(U @ U.T, np.eye(A.shape[0])))
print(metrics.tdist(U.T @ U, np.eye(A.shape[0])))
print(metrics.tdist(VT @ VT.T, np.eye(A.shape[1])))
print(metrics.tdist(VT.T @ VT, np.eye(A.shape[1])))
print(metrics.tdist(S, S_sol))

S_mat = np.zeros(A.shape)
S_mat[:len(S),:len(S)] = np.diag(S)
print(metrics.tdist(U @ S_mat @ VT, A))

1.7860136728276433e-15
1.804409319573525e-15
4.189218656364947e-07
4.1892186562477863e-07
1.2614478883277597e-13
3.117166796758288e-15


## Naive SVD with bidiagonalization

In [390]:
def svd_naive_bidiagonal(A):
    # A = P J Q^T and J = G S H^T, hence A = (P G) S (Q H)^T.
    P, J, Q = bidiagonalize(A)
    G, S, HT = svd_naive(J)
    U = P @ G
    VT = HT @ Q.T
    return U, S, VT
    
A = np.random.randn(4, 3)
U_sol, S_sol, VT_sol = np.linalg.svd(A)

U, S, VT = svd_naive_bidiagonal(A)

print(metrics.tdist(U @ U.T, np.eye(A.shape[0])))
print(metrics.tdist(U.T @ U, np.eye(A.shape[0])))
print(metrics.tdist(VT @ VT.T, np.eye(A.shape[1])))
print(metrics.tdist(VT.T @ VT, np.eye(A.shape[1])))
print(metrics.tdist(S, S_sol))

S_mat = np.zeros(A.shape)
S_mat[:len(S),:len(S)] = np.diag(S)
print(metrics.tdist(U @ S_mat @ VT, A))

1.8449683377577863e-15
1.8533422017944336e-15
3.2146628135659837e-07
3.2146628142711365e-07
2.362663111950196e-13
2.9260748727207743e-15


## SVD of bidiagonal matrix

Let $J_0$ (the matrix $J^0$ obtained by bidiagonalization) be an upper bidiagonal matrix of size $m \times n$.  
$J_0$ is iteratively diagonalized to $\Sigma$:
$$J_{i+1} = S_i^T J_i T_i$$
$S_i$ and $T_i$ are orthogonal matrices built from Givens rotations.
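
A Givens rotation acts on two rows (or two columns) and can zero out one chosen entry. The sketch below is only an illustration of that building block: the helper `givens`, the matrix `J` and the name `rot` are hypothetical and not taken from the Golub-Reinsch procedure itself. Applied from the left, `rot` plays the role of one factor of $S_i^T$.

```python
import numpy as np

def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] @ [a, b] = [r, 0]."""
    r = np.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r

J = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, 1.0],
              [0.0, 0.0, 4.0]])

# Rotate rows 0 and 1 to zero out the subdiagonal entry J[1, 0].
c, s = givens(J[0, 0], J[1, 0])
rot = np.eye(3)
rot[0, 0], rot[0, 1] = c, s
rot[1, 0], rot[1, 1] = -s, c

print(rot @ J)                                # entry [1, 0] is now zero
print(np.allclose(rot @ rot.T, np.eye(3)))    # the rotation is orthogonal
```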

Let $M_i = J_i^TJ_i$, a symmetric tridiagonal matrix. Then:
$$M_{i+1} = T_i^T M_i T_i$$

The transformation $M_i \to M_{i+1}$ is actually a $QR$ step with shift $s$, with $T_s$ orthogonal and $R_s$ upper triangular:
$$M_i - sI = T_sR_s$$
$$M_{i+1} = R_sT_s + sI$$
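
The two formulas can be checked on a small example. This is a sketch under assumptions: the bidiagonal `J` is an arbitrary square matrix, the shift is simply taken as the last diagonal entry of $M$, and the step is done explicitly on $M$ with `np.linalg.qr`.

```python
import numpy as np

# An upper bidiagonal J (square here for brevity) and the tridiagonal M = J^T J.
J = np.diag([4.0, 3.0, 2.0, 1.0]) + np.diag([1.0, 0.5, 0.25], k=1)
M = J.T @ J

s = M[-1, -1]                      # a simple shift choice (assumption)
I = np.eye(M.shape[0])

T, R = np.linalg.qr(M - s * I)     # M - sI = T R
M_next = R @ T + s * I             # M_next = R T + sI

# The step is an orthogonal similarity transformation,
print(np.allclose(M_next, T.T @ M @ T))
# so the eigenvalues (squared singular values of J) are preserved,
print(np.allclose(np.linalg.eigvalsh(M_next), np.linalg.eigvalsh(M)))
# and the tridiagonal structure is kept.
print(np.allclose(np.triu(M_next, k=2), 0))
```

Substituting $R_s = T_s^T (M_i - sI)$ into the second formula gives $M_{i+1} = T_s^T M_i T_s$, which is what the first check verifies.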

Tentatively, the overall procedure is:
- $J_k \to \Sigma$, a diagonal matrix holding the singular values of $J_0$
- The SVD of a diagonal matrix $D$ with non-negative entries is $IDI$
- $S_1$, $S_2$, ..., $T_1$, $T_2$, ... are orthogonal matrices
- $U = S_1 S_2 \cdots$
- $V = T_1 T_2 \cdots$