# SVD to PCA

### SVD properties
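* $X = UDV^T$, where $X$ is the centered $N \times p$ data matrix
* $U$ and $V$ have orthonormal columns, meaning that $U^T U = I$ and $V^T V = I$ (in the thin SVD used below, $U$ is $N \times p$, so $U U^T \ne I$ in general)
* $D$ is diagonal and holds the singular values (`s` in the code below); $d_j^2 / \sum_k d_k^2$ is the fraction of variance explained by component $j$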
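
These properties can be checked numerically. The sketch below is only an illustration: it uses a small random matrix (`X_demo`, a name made up for this example) and `np.linalg.svd`, assuming it behaves like the `scipy.linalg.svd` call used later in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.standard_normal((9, 6))
X_demo = X_demo - X_demo.mean(axis=0)  # center each column

# thin SVD: U is 9 x 6, d has 6 entries, Vt is V transposed (6 x 6)
U, d, Vt = np.linalg.svd(X_demo, full_matrices=False)

utu_is_identity = np.allclose(U.T @ U, np.eye(6))   # U^T U = I
vtv_is_identity = np.allclose(Vt @ Vt.T, np.eye(6)) # V^T V = I, with V = Vt.T
uut_is_identity = np.allclose(U @ U.T, np.eye(9))   # not true for the thin U
singular_values_sorted = bool(np.all(np.diff(d) <= 0))

print(utu_is_identity, vtv_is_identity, uut_is_identity, singular_values_sorted)
```

The third check fails on purpose: a 9 x 6 matrix with orthonormal columns cannot satisfy $U U^T = I$.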

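### Application to PCA (ESL, p. 535)
* PCA fits the rank-$q$ linear model $f(\lambda) = \mu + V_q \lambda$, where $V_q$ is a $p \times q$ matrix with orthonormal columns and $\lambda$ is the $q$-dimensional transformed coordinate; with centered data, $\mu = 0$ and $f(\lambda) = V_q \lambda$
* $\min_{V_q, \{\lambda_i\}} \sum_{i=1}^N \| x_i - V_q \lambda_i \|^2$
* the optimal $\lambda_i$ is $V_q^T x_i$, which leaves $\min_{V_q} \sum_{i=1}^N \| x_i - V_q V_q^T x_i \|^2$
* the solution takes $V_q$ to be the first $q$ columns of $V$; the columns of $UD$ are the principal components (the transformed data)
* $V_q$ is the linear transformation, and $X V_q = U_q D_q$ gives the first $q$ principal components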
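
A small sketch of these identities, on made-up random data with $q = 2$ (the variable names are chosen for this example only). It checks that $X V_q$ equals the first $q$ columns of $UD$, and that projecting onto $V_q$ gives a residual no larger than projecting onto an arbitrary orthonormal basis `Q` of the same size.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((9, 6))
X = X - X.mean(axis=0)  # center each column

U, d, Vt = np.linalg.svd(X, full_matrices=False)
q = 2
V_q = Vt[:q].T                      # first q right singular vectors, shape (6, q)

scores = X @ V_q                    # rows are lambda_i = V_q^T x_i
scores_from_svd = U[:, :q] * d[:q]  # first q columns of U D

X_hat = scores @ V_q.T              # rows are V_q V_q^T x_i

# an arbitrary orthonormal p x q basis for comparison
Q, _ = np.linalg.qr(rng.standard_normal((6, q)))
X_hat_random = X @ Q @ Q.T

rss_pca = np.sum((X - X_hat) ** 2)
rss_random = np.sum((X - X_hat_random) ** 2)
print(np.allclose(scores, scores_from_svd), rss_pca <= rss_random)
```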

### How PCA is implemented
* [pca](https://github.com/scikit-learn/scikit-learn/blob/14031f6/sklearn/decomposition/pca.py#L372)
* [base pca](https://github.com/scikit-learn/scikit-learn/blob/14031f65d144e3966113d3daec836e443c6d7a5b/sklearn/decomposition/base.py#L130)
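
The linked scikit-learn code is SVD-based. As a rough stand-in for that route (not scikit-learn's code), the sketch below centers the data, takes a thin SVD, fixes the signs of the singular vectors, and keeps the leading rows of $V^T$. The class name `TinyPCA` is hypothetical, and its sign rule is an assumption made for this example; it may differ from what scikit-learn's `svd_flip` helper does.

```python
import numpy as np

class TinyPCA:
    """Hypothetical minimal PCA via thin SVD (not scikit-learn's implementation)."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        U, d, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        # assumed sign convention: largest-magnitude entry of each row of Vt is positive
        idx = np.argmax(np.abs(Vt), axis=1)
        signs = np.sign(Vt[np.arange(Vt.shape[0]), idx])
        Vt = Vt * signs[:, None]
        k = self.n_components
        self.components_ = Vt[:k]  # rows are principal directions
        self.explained_variance_ = d[:k] ** 2 / (X.shape[0] - 1)
        self.explained_variance_ratio_ = d[:k] ** 2 / np.sum(d ** 2)
        return self

    def transform(self, X):
        # center with the training mean, then project: (X - mean) V_q
        return (np.asarray(X, dtype=float) - self.mean_) @ self.components_.T

    def inverse_transform(self, Z):
        # map scores back to the original space: Z V_q^T + mean
        return Z @ self.components_ + self.mean_

rng = np.random.default_rng(2)
X_demo = rng.standard_normal((9, 6)) + 5.0  # deliberately not centered
pca = TinyPCA(n_components=6).fit(X_demo)
Z = pca.transform(X_demo)
X_back = pca.inverse_transform(Z)
print(np.allclose(X_back, X_demo))
```

With all six components kept, `inverse_transform` recovers the input up to floating-point error; with fewer components it returns the rank-$q$ approximation.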

In [36]:
%matplotlib inline
import numpy as np
from scipy import linalg
from sklearn.utils.extmath import svd_flip

In [90]:
a = np.random.randn(9, 6)
a = a - np.mean(a,axis=0)
print(a.shape, np.mean(a, axis=0))  # column means should be ~0 after centering

(9, 6) [  0.00000000e+00  -2.46716228e-17  -2.46716228e-17  -4.62592927e-17
   4.93432455e-17  -3.70074342e-17]


In [145]:
U, s, Vh = linalg.svd(a, full_matrices=False)  # full_matrices=False gives the thin SVD, so U is N x p (9 x 6) rather than N x N
# note that Vh is V transposed; its rows are the principal directions

In [144]:
U.shape, Vh.shape, s.shape

((9, 6), (6, 6), (6,))

In [93]:
S = linalg.diagsvd(s, 6, 6)

In [94]:
a[0]

array([ 0.60647006, -0.24558807,  0.84208774,  0.01716088,  1.62191761,
        1.11562508])

In [95]:
U.dot(S).dot(Vh)[0]

array([ 0.60647006, -0.24558807,  0.84208774,  0.01716088,  1.62191761,
        1.11562508])
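
Comparing one row by eye works, but the whole matrix can be checked at once. A minimal sketch, assuming fresh random data (`a_demo` is a name made up here, so its values differ from the ones printed above); it also shows that broadcasting `U * s` can stand in for building the diagonal matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
a_demo = rng.standard_normal((9, 6))
a_demo -= a_demo.mean(axis=0)  # center each column

U, s, Vh = np.linalg.svd(a_demo, full_matrices=False)

S = np.diag(s)                   # explicit 6 x 6 diagonal matrix
recon_diag = U @ S @ Vh          # U D V^T
recon_broadcast = (U * s) @ Vh   # same product, scaling the columns of U by s

max_abs_err = np.max(np.abs(a_demo - recon_diag))
print(np.allclose(recon_diag, a_demo), np.allclose(recon_broadcast, recon_diag))
```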

## PCA example

* `Vh` is $V^T$: its rows are the principal directions, so `Vh.T[:, :q]` is $V_q$

In [113]:
def rss(m0, m1):
    return np.sum((m0 - m1)**2)

In [121]:
# fraction of variance explained by each component
s ** 2 / np.sum(s ** 2)

array([ 0.31984305,  0.28146664,  0.1737282 ,  0.16005849,  0.05110582,
        0.01379781])
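
The squared singular values are tied to variance through the sample covariance matrix: for centered data with $N$ rows, $s_j^2 / (N - 1)$ are its eigenvalues. A sketch of that check on made-up random data (`a_demo` and the other names are chosen for this example):

```python
import numpy as np

rng = np.random.default_rng(4)
a_demo = rng.standard_normal((9, 6))
a_demo -= a_demo.mean(axis=0)  # center each column
n_rows = a_demo.shape[0]

s = np.linalg.svd(a_demo, compute_uv=False)  # singular values only, descending

cov = np.cov(a_demo, rowvar=False)           # 6 x 6 sample covariance (divides by N - 1)
eigvals = np.linalg.eigvalsh(cov)[::-1]      # eigvalsh returns ascending order; reverse it

var_from_svd = s ** 2 / (n_rows - 1)
ratio = s ** 2 / np.sum(s ** 2)              # fraction of variance per component

print(np.allclose(eigvals, var_from_svd), np.isclose(ratio.sum(), 1.0))
```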

In [128]:
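# total sum of squares of the centered data, i.e. (N - 1) times the total variance
print("total var %.4f" % np.sum(a ** 2))

for i in range(1, a.shape[1] + 1):
    Vq = Vh.T[:, :i]                     # first i principal directions, shape (p, i)
    reconstructed = a.dot(Vq).dot(Vq.T)  # project each row onto span(Vq)
    error = rss(a, reconstructed)
    print("%d principal components, %.2f RSS" % (i, error))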

total var 61.9086
1 principal components, 42.11 RSS
2 principal components, 24.68 RSS
3 principal components, 13.93 RSS
4 principal components, 4.02 RSS
5 principal components, 0.85 RSS
6 principal components, 0.00 RSS
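
The RSS values follow directly from the singular values: keeping $q$ components leaves $\sum_{j > q} s_j^2$ unexplained. With the numbers above, 61.9086 × (1 − 0.31984) ≈ 42.11, which matches the first row. A sketch of the general check, on fresh random data (`a_demo` is a made-up name, so the values differ from the printed ones):

```python
import numpy as np

rng = np.random.default_rng(5)
a_demo = rng.standard_normal((9, 6))
a_demo -= a_demo.mean(axis=0)  # center each column

U, s, Vh = np.linalg.svd(a_demo, full_matrices=False)

rss_by_projection = []
rss_by_singular_values = []
for q in range(1, a_demo.shape[1] + 1):
    Vq = Vh[:q].T                               # first q principal directions
    recon = a_demo @ Vq @ Vq.T                  # rank-q reconstruction
    rss_by_projection.append(np.sum((a_demo - recon) ** 2))
    rss_by_singular_values.append(np.sum(s[q:] ** 2))  # discarded s_j^2

total_ss = np.sum(a_demo ** 2)
print(np.allclose(rss_by_projection, rss_by_singular_values))
```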


In [139]:
# demonstrating how the transformation works
transformed = a.dot(Vh.T)    # scores X V, with all p components kept
fit_a = transformed.dot(Vh)  # map back: X V V^T, eq. 14.49 (ESL p. 535) with q = p
rss(a, fit_a)

1.3627629915591311e-28

In [143]:
# demonstrating that UD equals the transformed data X V
rss(U.dot(S), transformed)

4.0015640223178823e-29