<a href="https://colab.research.google.com/github/stephenbeckr/randomized-algorithm-class/blob/master/Demos/demo01_exactRankR.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Demo #1

APPM 5650 Randomized Algorithms, Fall 2021

Stephen Becker (original MATLAB '19, jupyter version '21) & Jake Knigge (Python '19)

In [None]:
import numpy as np
from numpy.linalg import norm
from scipy.sparse.linalg import LinearOperator, svds

np.set_printoptions(precision = 4)      # display only four digits
rng = np.random.default_rng(12345)
n = int(4e3); m = n                     # dimension of problem (np.int was removed from NumPy)
r = 100                                 # rank of matrix

Left = rng.standard_normal( size=(m,r))
Right= rng.standard_normal( size=(r,n))
A = Left@Right
# If we *know* A has this low-rank structure, we can exploit it with a matrix-free operator:
A_operator = LinearOperator( (m,n), matvec = lambda x : Left@(Right@x), 
                            rmatvec = lambda y : Right.T@(Left.T@y) )

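def printError(U,s,Vh):
  # Relative Frobenius-norm error of the factorization U @ diag(s) @ Vh,
  #   measured against the global dense matrix A defined above
  S = np.reshape( s, (len(s),1) )   # column vector, so S*Vh scales row i of Vh by s[i]
  A_estimate = U@(S*Vh)
  err = norm( A - A_estimate ) / norm( A )
  print(f'The error ||A-A_estimate||_F/||A||_F is {err:0.2e}')
  # s[0] and s[-1] are the largest and smallest only if s is sorted in descending order
  print(f'The largest and smallest (non-zero) singular values are {s[0]:0.4f} and {s[-1]:0.4f}')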

## Find SVD of $A$ with conventional methods

Dense SVD

In [None]:
%time U, S, Vh = np.linalg.svd(A, full_matrices=False)

printError(U,S,Vh)

CPU times: user 1min 15s, sys: 2.29 s, total: 1min 17s
Wall time: 39.6 s
The error ||A-A_estimate||_F/||A||_F is 2.60e-15


Krylov subspace method (usually best when the matrix is sparse or has some other structure that makes matrix-vector products cheap)

In [None]:
%time U, S, Vh = svds( A, k=r)

printError(U,S,Vh)

CPU times: user 6.89 s, sys: 4.26 s, total: 11.1 s
Wall time: 5.7 s
The error ||A-A_estimate||_F/||A||_F is 9.11e-16
The largest and smallest (non-zero) singular values are 3.16e+03 and 4.85e+03


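... and **if we knew the structure of $A$**, we can pass `A_operator` to `svds` instead of the dense matrix.
(Careful: the `svds` documentation says "The order of the singular values is not guaranteed," so `s[0]` need not be the largest singular value; compare the order reported in the two `svds` outputs.)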

In [None]:
%time U, S, Vh = svds( A_operator, k=r)

printError(U,S,Vh)

CPU times: user 619 ms, sys: 433 ms, total: 1.05 s
Wall time: 560 ms
The error ||A-A_estimate||_F/||A||_F is 1.30e-15
The largest and smallest (non-zero) singular values are 4.85e+03 and 3.16e+03
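
Since `svds` does not promise any particular ordering, one way to get a predictable result is to sort the output yourself. The following is a minimal sketch on a small test problem; the sizes and the names `A_small`, `order`, and `*_sorted` are assumptions made for illustration and are not part of the demo above.

```python
import numpy as np
from scipy.sparse.linalg import svds

rng = np.random.default_rng(0)

# Small matrix of exact rank 5 (hypothetical sizes, chosen only for illustration)
m, n, r = 60, 50, 5
A_small = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

U, s, Vh = svds(A_small, k=r)

# Indices that put the singular values in descending order
order = np.argsort(s)[::-1]

# Apply the same permutation to the columns of U and the rows of Vh
U_sorted, s_sorted, Vh_sorted = U[:, order], s[order], Vh[order, :]

# The permuted factors still reproduce the matrix
rel_err = np.linalg.norm(A_small - (U_sorted * s_sorted) @ Vh_sorted) / np.linalg.norm(A_small)
print(f'largest = {s_sorted[0]:0.4f}, smallest = {s_sorted[-1]:0.4f}, error = {rel_err:0.2e}')
```

After this reordering, `s_sorted[0]` and `s_sorted[-1]` are the largest and smallest computed singular values regardless of how `svds` returned them.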


## Find SVD of $A$ with randomized method

(no knowledge of the structure of $A$ required, other than knowing a good value for $r$)

In [None]:
%%time
Omega = rng.standard_normal( size=(n,r) )   # Gaussian test matrix (mu and sigma were never defined)
Y     = A@Omega       # matrix multiply
Q, R  = np.linalg.qr(Y, mode='reduced')
QtA   = Q.T@A
# A = Q@QtA, which is a low-rank factorization. If we also want
#   the SVD of A, then continue a little bit more:
U_temp, S, Vh = np.linalg.svd(QtA, full_matrices=False)
U     = Q@U_temp

CPU times: user 612 ms, sys: 42.7 ms, total: 654 ms
Wall time: 364 ms


In [None]:
printError( U, S, Vh )

The error ||A-A_estimate||_F/||A||_F is 2.03e-14
The largest and smallest (non-zero) singular values are 4854.7887 and 3159.3836
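
The steps in the cell above can be collected into a small function and checked on a problem small enough to compare against a dense SVD. This is only a sketch under the same assumption as the demo (the matrix has exact rank `r`, and `r` is known); the function name `randomized_svd_sketch` and the test sizes are hypothetical.

```python
import numpy as np

def randomized_svd_sketch(A, r, rng):
    """Rank-r SVD of A via a Gaussian sketch, following the steps of the demo cell."""
    n = A.shape[1]
    Omega = rng.standard_normal(size=(n, r))    # random test matrix
    Y = A @ Omega                               # sample the column space of A
    Q, _ = np.linalg.qr(Y, mode='reduced')      # orthonormal basis for the samples
    QtA = Q.T @ A                               # small r-by-n matrix
    U_temp, s, Vh = np.linalg.svd(QtA, full_matrices=False)
    return Q @ U_temp, s, Vh

rng = np.random.default_rng(0)
m, n, r = 200, 150, 10                          # hypothetical small sizes
A_small = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

U, s, Vh = randomized_svd_sketch(A_small, r, rng)

# Compare with the leading r singular values from a dense SVD
s_ref = np.linalg.svd(A_small, compute_uv=False)[:r]
rel_err = np.linalg.norm(A_small - (U * s) @ Vh) / np.linalg.norm(A_small)
print(f'relative error = {rel_err:0.2e}')
print(f'max singular value difference = {np.max(np.abs(s - s_ref)):0.2e}')
```

Because the final step is a dense SVD of the small matrix `QtA`, the singular values come back in descending order, unlike with `svds`.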


By the way, if we do know the structure of $A$, we can also exploit that in the randomized method and get something a bit faster:

In [None]:
%%time
Omega = rng.standard_normal( size=(n,r) )   # Gaussian test matrix (mu and sigma were never defined)
Y     = A_operator@Omega       # matrix multiply
Q, R  = np.linalg.qr(Y, mode='reduced')
QtA   = (A_operator.T@Q).T
# A = Q@QtA, which is a low-rank factorization. If we also want
#   the SVD of A, then continue a little bit more:
U_temp, S, Vh = np.linalg.svd(QtA, full_matrices=False)
U     = Q@U_temp

CPU times: user 308 ms, sys: 54.7 ms, total: 363 ms
Wall time: 202 ms


In [None]:
printError( U, S, Vh )

The error ||A-A_estimate||_F/||A||_F is 4.69e-15
The largest and smallest (non-zero) singular values are 4854.7887 and 3159.3836
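
A `LinearOperator` that defines only `matvec` and `rmatvec` applies itself to a matrix one column at a time, so it may help to also supply `matmat` and `rmatmat`. The sketch below shows one way to do that and checks the result against the dense product; the sizes and the name `A_op` are assumptions for illustration, and whether this is actually faster will depend on the problem and the SciPy version.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

rng = np.random.default_rng(0)
m, n, r = 200, 150, 10                        # hypothetical small sizes
Left = rng.standard_normal((m, r))
Right = rng.standard_normal((r, n))

# Same operator as in the demo, plus block versions of the two products
A_op = LinearOperator(
    (m, n),
    matvec=lambda x: Left @ (Right @ x),
    rmatvec=lambda y: Right.T @ (Left.T @ y),
    matmat=lambda X: Left @ (Right @ X),
    rmatmat=lambda Y: Right.T @ (Left.T @ Y),
    dtype=Left.dtype,
)

Omega = rng.standard_normal((n, r))
Y = A_op.matmat(Omega)                        # A @ Omega without forming A
Q, _ = np.linalg.qr(Y, mode='reduced')
QtA = A_op.rmatmat(Q).T                       # (A^T Q)^T = Q^T A for real A

U_temp, s, Vh = np.linalg.svd(QtA, full_matrices=False)
U = Q @ U_temp

# Form the dense matrix only to verify the result on this small example
A_dense = Left @ Right
rel_err = np.linalg.norm(A_dense - (U * s) @ Vh) / np.linalg.norm(A_dense)
print(f'relative error = {rel_err:0.2e}')
```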
