2 PCA implementations that give same results but different from Python scikit-learn implementation ... #81

Closed
SergeStinckwich opened this issue Aug 3, 2018 · 17 comments

@SergeStinckwich commented Aug 3, 2018

We have 2 implementations of PCA:

  • one based on Jacobi Transformation of the covariance matrix
  • another one based on SVD.

They give the same results as each other, but those results differ from what you get with scikit-learn in Python:

import numpy as np
from sklearn.decomposition import PCA
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
pca = PCA(n_components=2)
pca.fit(X)
pca.components_
pca.transform(X)

pca.components_ returns:

array([[-0.83849224, -0.54491354],
       [ 0.54491354, -0.83849224]])

pca.transform(X) returns:

array([[ 1.38340578,  0.2935787 ],
       [ 2.22189802, -0.25133484],
       [ 3.6053038 ,  0.04224385],
       [-1.38340578, -0.2935787 ],
       [-2.22189802,  0.25133484],
       [-3.6053038 , -0.04224385]])

I tried to implement a sign-flipping method like this one: https://github.com/scikit-learn/scikit-learn/blob/4c65d8e615c9331d37cbb6225c5b67c445a5c959/sklearn/utils/extmath.py#L609
but have not succeeded so far.

Please have a look at the tests in PMPrincipalComponentAnalyserTest.
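
For reference, here is a minimal NumPy sketch (not PolyMath code) of the sign convention that the linked scikit-learn routine applies after the SVD, assuming the usual rule that, for each component, the entry of U with the largest absolute value is made positive:

import numpy as np

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]], dtype=float)
Xc = X - X.mean(axis=0)                      # centre the data, as PCA.fit does
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

max_abs_rows = np.argmax(np.abs(U), axis=0)  # row of the largest |entry| in each column of U
signs = np.sign(U[max_abs_rows, range(U.shape[1])])
U *= signs                                   # flip the columns of U ...
Vt *= signs[:, np.newaxis]                   # ... and the corresponding rows of V^T

print(Vt)     # should correspond to pca.components_ above (up to rounding)
print(U * S)  # should correspond to pca.transform(X) above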

@hemalvarambhia

Maybe this is an odd suggestion, but to debug this, could we write a simpler test, say with the identity matrix as X? I wrote this test in Pharo and the output is:

( -1/sqrt(2), 1/sqrt(2) )
( 1/sqrt(2), 1/sqrt(2) )

which, interestingly, agrees with the value of pca.components_ from numpy:

[[-0.70710678 0.70710678]
[ 0.70710678 0.70710678]]
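
For reference, a quick way to reproduce the numpy side of that check (assuming scikit-learn is installed):

import numpy as np
from sklearn.decomposition import PCA

X = np.eye(2)                   # the 2x2 identity matrix as a toy dataset
pca = PCA(n_components=2).fit(X)
print(pca.components_)          # principal axes reported by scikit-learn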

@hemalvarambhia commented Jan 16, 2019

Hi @SergeStinckwich, I looked into the failing test for the Jacobi Transformation form of PCA a little deeper and noted that transposing and negating the matrix of eigenvectors results in the test passing:

[screenshot: transposeandnegate]

The Scikit-Learn library, from what I can see, uses the SVD implementation, which involves flipping signs. For this test, then, does it make sense to directly compare the Jacobi matrix of eigenvectors with the SVD-computed one? Might we instead assert that the negated transpose of the Jacobi matrix equals the SVD one?

Please forgive my ignorance- I'm not too familiar with this domain, so I'm sure I'm missing something!
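
To make the proposal concrete, the suggested assertion could look something like this in NumPy terms (a hypothetical helper, not an existing PolyMath or scikit-learn API):

import numpy as np

def assert_negated_transpose_matches(jacobi_eigenvectors, svd_components, tol=1e-8):
    # Assert that -jacobi_eigenvectors^T equals the SVD-computed components.
    assert np.allclose(-np.asarray(jacobi_eigenvectors).T,
                       np.asarray(svd_components), atol=tol)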

@hemalvarambhia commented Jan 16, 2019

@SergeStinckwich, I noted a similar thing with PolyMath's PCA implementation using SVD, where transposing and negating the matrix returned by transformMatrix also gets the test passing:

[screenshot: pca_svd_test]

This phenomenon is interesting, I think, perhaps worth investigating.

Also I note that flipEigenvectorsSign has been commented out in the fit: message because it doesn't work. Could you elaborate on this a little more, please?

@SergeStinckwich

Thank you for spending time investigating this further. I have forgotten the details of the implementation a little; I will have to spend some time to remember what I did here.

@hemalvarambhia

Thank you for your help on this, @SergeStinckwich. The code is really well written compared to the Python library. I read this paper on the subject and your work was very easy to follow from it. So far, the difference between your work and the Python one is just the transpose-and-negate.

@hemalvarambhia commented Mar 11, 2019

Hi @SergeStinckwich,
I made some more progress on this: I extracted the algorithm from SciKit-Learn and tried to determine the outputs of each step. Here is something you can run on the command line to see this at a more granular level: scikit-learn flip algorithm

Given that in SciKit-Learn, U is

[[-0.21956688  0.53396977]
 [-0.35264795 -0.45713538]
 [-0.57221483  0.07683439]
 [ 0.21956688 -0.53396977]
 [ 0.35264795  0.45713538]
 [ 0.57221483 -0.07683439]]

(the first two columns from PolyMath match SciKit-Learn's)

The first thing to note is the max_abs_column variable, which in SciKit-Learn is a 2-element array, [2 0], whereas in PolyMath it is #(3 4 5...). For the former, this leads to a sub-matrix [-0.57221483 0.53396977], which then leads to the signs being [-1 1]. This is the key difference, because in PolyMath we get #(-1 -1 -1 1 1 -1). Our argMaxOnColumns message yields -0.53396... from row 4, column 2, rather than 0.53396977 at row 1.
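
A hedged reproduction of that step in NumPy, using the U matrix quoted above (not PolyMath code):

import numpy as np

U = np.array([[-0.21956688,  0.53396977],
              [-0.35264795, -0.45713538],
              [-0.57221483,  0.07683439],
              [ 0.21956688, -0.53396977],
              [ 0.35264795,  0.45713538],
              [ 0.57221483, -0.07683439]])

max_abs_cols = np.argmax(np.abs(U), axis=0)          # -> [2 0] (0-indexed rows)
signs = np.sign(U[max_abs_cols, range(U.shape[1])])  # -> [-1.  1.]
print(max_abs_cols, signs)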

@hemalvarambhia commented Mar 12, 2019

On further investigation: when we compute the maxAbsColumn in PolyMath, we get the element at row 4 as the highest value because it is 0.5339697737377911, whereas at row 1 it is 0.5339697737377903. With SciKit-Learn the columns aren't as precise, as you know, so perhaps we need to do the same and round off.
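
A small sketch of the rounding idea (illustrative values based on the observation above; 0-indexed, so rows 1 and 4 become indices 0 and 3):

import numpy as np

col = np.array([0.5339697737377903, -0.45713538, 0.07683439,
                -0.5339697737377911, 0.45713538, -0.07683439])
print(np.argmax(np.abs(col)))                # 3: the ~1e-15 difference decides the winner
print(np.argmax(np.round(np.abs(col), 12)))  # 0: after rounding, the first of the tied rows wins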

hemalvarambhia self-assigned this Mar 13, 2019
@SergeStinckwich

In Mathematica if we use SVD:

m = {{-1, -1}, {2, -1}, {-3, -2}, {1, 1}, {2, 1}, {3, 2}}
{{-1, -1}, {2, -1}, {-3, -2}, {1, 1}, {2, 1}, {3, 2}}
{u, w, v} = SingularValueDecomposition[N[m]]
{{{0.227413, -0.184384, -0.602751, 0.246542, 0.356209, 
   0.602751}, {-0.204296, -0.949274, 0.0746245, 
   0.109693, -0.184317, -0.0746245}, {0.598729, -0.113805, 0.705625, 
   0.128753, 0.165622, 0.294375}, {-0.227413, 0.184384, 0.106997, 
   0.942819, -0.0498161, -0.106997}, {-0.371316, -0.070579, 
   0.187377, -0.0715718, 0.884194, -0.187377}, {-0.598729, 0.113805, 
   0.294375, -0.128753, -0.165622, 0.705625}}, {{6.01037, 0.}, {0., 
   1.96863}, {0., 0.}, {0., 0.}, {0., 0.}, {0., 
   0.}}, {{-0.86491, -0.501927}, {-0.501927, 0.86491}}}
v
{{-0.86491, -0.501927}, {-0.501927, 0.86491}}

@SergeStinckwich commented Mar 27, 2019

Action points now:

  • try to find another example as an acceptance test,
  • run this example against PolyMath, Mathematica and scikit-learn in order to find some patterns,
  • how is SVD done in NumPy? What decisions does it take about signs? (see the sketch below)
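
On the last point, a starting sketch (assuming NumPy; as far as I can tell, numpy.linalg.svd applies no sign convention of its own, so the signs are whatever LAPACK returns):

import numpy as np

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]], dtype=float)
U, S, Vt = np.linalg.svd(X, full_matrices=False)

print(U)    # left singular vectors; each column only determined up to sign
print(S)    # singular values
print(Vt)   # rows are the principal axes, again only determined up to sign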

@hemalvarambhia commented Mar 27, 2019

I tried Wolfram Alpha and our V matrix matches theirs.

@hemalvarambhia

Wolfram Alpha:
[screenshot: Wolfram Alpha result, 2019-03-27]

@hemalvarambhia

PolyMath calculation: here we see it matches Wolfram Alpha exactly:
[screenshot: polymath_calculation]

@hemalvarambhia commented Mar 27, 2019

The next example we could try is in section 3.1 of this article

Here is the answer according to Wolfram Alpha. You can see that the V matrix agrees, in that the elements have the same signs and magnitudes.

@hemalvarambhia commented Apr 14, 2019

@SergeStinckwich In relation to the first of our action points, to find another example as an acceptance test, I tried the example in section 3.1 of the PCA tutorial. The PCA of scikit-learn matches the output published in the above paper almost exactly:

[[-0.82797019 -0.17511531]
 [ 1.77758033  0.14285723]
 [-0.99219749  0.38437499]
 [-0.27421042  0.13041721]
 [-1.67580142 -0.20949846]
 [-0.9129491   0.17528244]
 [ 0.09910944 -0.3498247 ]
 [ 1.14457216  0.04641726]
 [ 0.43804614  0.01776463]
 [ 1.22382056 -0.16267529]]

This is not the case for PolyMath, sadly, for both implementations. However, as we've discovered, the problem is a bit further up, in how we're computing the eigenvectors.

@hemalvarambhia commented Apr 19, 2019

@SergeStinckwich Out of curiosity, I used the mean centred data from the PCA tutorial:

    x      y
  0.69   0.49
 -1.31  -1.21
  0.39   0.99
  0.09   0.29
  1.29   1.09
  0.49   0.79
  0.19  -0.31
 -0.81  -0.81
 -0.31  -0.31
 -0.71  -1.01

and for the SVD-based implementation of PCA, PolyMath's output is correct up to negation:

[screenshot: polymath_pca_tutorial_example]

Similarly for the Jacobi implementation, again our output is correct up to negation:

[screenshot: PolyMathPCATutorialExample_Jacobi]
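
To cross-check, here is a hedged scikit-learn run on the mean-centred data above (not PolyMath code); since PCA re-centres its input, the scores should match the scikit-learn output quoted earlier, and PolyMath's columns can then be compared against them up to a per-column sign flip:

import numpy as np
from sklearn.decomposition import PCA

X = np.array([[ 0.69,  0.49], [-1.31, -1.21], [ 0.39,  0.99], [ 0.09,  0.29],
              [ 1.29,  1.09], [ 0.49,  0.79], [ 0.19, -0.31], [-0.81, -0.81],
              [-0.31, -0.31], [-0.71, -1.01]])

scores = PCA(n_components=2).fit_transform(X)
print(scores)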

@hemalvarambhia commented Apr 20, 2019

On point 3, Scikit-Learn uses scipy for the SVD part of the PCA. In turn, scipy delegates to LAPACK driver routines and, according to the documentation, uses gesdd by default (the available options are gesdd and gesvd). I could not see anything in the scipy source code (decomp_svd.py) indicating any decision about flipping signs.
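
For completeness, a small sketch showing how the driver can be selected explicitly in scipy (neither driver appears to apply a deterministic sign convention, as far as I can tell):

import numpy as np
from scipy.linalg import svd

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]], dtype=float)

for driver in ("gesdd", "gesvd"):
    U, s, Vt = svd(X, full_matrices=False, lapack_driver=driver)
    print(driver, Vt)   # compare the raw signs returned by each driver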
