The goal of this script is to better understand the scikit-learn tools for PCA and SVD. The matrix I use here is taken from Wikipedia:
https://en.wikipedia.org/wiki/Singular-value_decomposition

In [1]:
from sklearn.decomposition import PCA
import numpy as np
from scipy.linalg import svd

In [2]:
X = np.array([[1, 0, 0, 0, 2],
              [0, 0, 3, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 2, 0, 0, 0]])

First I use scipy.linalg.svd, which gives the result shown on the Wikipedia page.

In [3]:
# Full SVD: X = U @ Sigma @ Vh, with the singular values s on the diagonal of Sigma
U, s, Vh = svd(X)
print('U = %s'% U)
print('Vh = %s'% Vh)
print('s = %s'% s)

U = [[ 0.  1.  0.  0.]
 [ 1.  0.  0.  0.]
 [ 0.  0.  0. -1.]
 [ 0.  0.  1.  0.]]
Vh = [[-0.          0.          1.          0.          0.        ]
 [ 0.4472136   0.          0.          0.          0.89442719]
 [-0.          1.          0.          0.          0.        ]
 [ 0.          0.          0.          1.          0.        ]
 [-0.89442719  0.          0.          0.          0.4472136 ]]
s = [ 3.          2.23606798  2.          0.        ]
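As a sanity check on the factors above, they can be multiplied back together to recover X. This is a minimal sketch rather than part of the original session: it assumes scipy.linalg.diagsvd to pad the four singular values into a 4 x 5 matrix, and the names Sigma and X_rebuilt are introduced here only for illustration.

```python
import numpy as np
from scipy.linalg import svd, diagsvd

X = np.array([[1, 0, 0, 0, 2],
              [0, 0, 3, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 2, 0, 0, 0]])

U, s, Vh = svd(X)

# Place the singular values on the diagonal of a matrix shaped like X (4 x 5)
Sigma = diagsvd(s, *X.shape)

# Multiply the three factors back together and compare with X
X_rebuilt = U @ Sigma @ Vh
print(np.allclose(X_rebuilt, X))
```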


Then I try to identify what the attributes exposed by sklearn.decomposition.PCA represent.

In [15]:
pca = PCA(svd_solver='auto', whiten=True)
pca.fit(X)
print(pca.components_)
print(pca.singular_values_)

[[ -1.47295237e-01  -2.15005028e-01   9.19398392e-01  -0.00000000e+00
   -2.94590475e-01]
 [  3.31294578e-01  -6.62589156e-01   1.10431526e-01   0.00000000e+00
    6.62589156e-01]
 [ -2.61816759e-01  -7.17459719e-01  -3.77506920e-01   0.00000000e+00
   -5.23633519e-01]
 [  8.94427191e-01  -2.92048264e-16  -7.93318415e-17   0.00000000e+00
   -4.47213595e-01]]
[  2.77516885e+00   2.12132034e+00   1.13949018e+00   1.69395499e-16]


What is pca.components_? Does it hold the principal components as explained here:
    https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca#134283
    i.e. the columns of $$XV$$?
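For orientation, the sketch below compares the shape of components_ with the shape of the projected data. It is a hedged illustration, not a definitive answer: it assumes the default PCA settings (whiten=False) instead of the whitened fit above, and the names pca_plain, scores and projected are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[1, 0, 0, 0, 2],
              [0, 0, 3, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 2, 0, 0, 0]], dtype=float)

# Default settings (no whitening), so the scores are not rescaled
pca_plain = PCA()
scores = pca_plain.fit_transform(X)

print(pca_plain.components_.shape)  # one row per component, one column per feature
print(scores.shape)                 # one row per sample, one column per component

# Projecting the mean-centered data onto the rows of components_
# reproduces the output of fit_transform
projected = (X - pca_plain.mean_) @ pca_plain.components_.T
print(np.allclose(projected, scores))
```

If this holds, components_ plays the role of the rows of $$V^T$$ (directions in feature space), while the projected data are what fit_transform returns.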

In [4]:
# Embed the singular values in a rectangular diagonal matrix shaped like X (4 x 5)
S = np.zeros(X.shape)
S[:len(s), :len(s)] = np.diag(s)

In [17]:
print('The theoretical principal components are the columns of XV:')
print('XV = %s' %X.dot(Vh.T))

The theoretical principal components are the columns of XV:
XV = [[  0.00000000e+00   2.23606798e+00   0.00000000e+00   0.00000000e+00
    4.44089210e-16]
 [  3.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   2.00000000e+00   0.00000000e+00
    0.00000000e+00]]


It doesn't look like it. Is it the loadings? According to this page:
http://www.nxn.se/valent/loadings-with-scikit-learn-pca
it is supposed to be. And according to this page:
https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca#134283
the loadings should equal the rows of
$$ \frac{1}{\sqrt{n-1}} SV^T $$,
where $$n$$ is the number of rows of $$X$$ (here $$n = 4$$).

In [14]:
# Divide (not multiply) by sqrt(n - 1), as in the formula above
S.dot(Vh) / np.sqrt(X.shape[0] - 1)

array([[ 0.        ,  0.        ,  1.73205081,  0.        ,  0.        ],
       [ 0.57735027,  0.        ,  0.        ,  0.        ,  1.15470054],
       [ 0.        ,  1.15470054,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ]])

But this doesn't match pca.components_ either.
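One difference worth checking is centering: scikit-learn's PCA subtracts the mean of each column before applying the SVD, whereas the comparisons above use the raw X. The following is a minimal sketch under that assumption, using the default PCA settings instead of whiten=True. The names Xc, pca_check and loadings are introduced here for illustration, and the loadings follow the $$VS/\sqrt{n-1}$$ convention from the linked answer.

```python
import numpy as np
from scipy.linalg import svd
from sklearn.decomposition import PCA

X = np.array([[1, 0, 0, 0, 2],
              [0, 0, 3, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 2, 0, 0, 0]], dtype=float)
n = X.shape[0]

# Center each column, then take the SVD of the centered matrix
Xc = X - X.mean(axis=0)
Uc, sc, Vhc = svd(Xc, full_matrices=False)

pca_check = PCA().fit(X)

# The singular values should agree with pca_check.singular_values_
print(np.allclose(sc, pca_check.singular_values_))

# Rows of Vhc should agree with components_ up to sign. Only the first three
# rows are compared: the fourth singular value is ~0, so its direction is
# not uniquely determined.
print(np.allclose(np.abs(Vhc[:3]), np.abs(pca_check.components_[:3])))

# Loadings computed from the fitted model, one column per component
loadings = pca_check.components_.T * np.sqrt(pca_check.explained_variance_)
expected = Vhc[:3].T * sc[:3] / np.sqrt(n - 1)
print(np.allclose(np.abs(loadings[:, :3]), np.abs(expected)))
```

If all three checks print True, components_ corresponds to the rows of $$V^T$$ from the SVD of the centered data (up to sign), not to $$XV$$ or to the loadings.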