Bug report: IPCA.m #2
Dear @goodshawn12, thank you for pointing that out. We are going to check it! Cheers
Dear @goodshawn12, thanks a lot again for pointing out these issues. We fixed Point 2, but we wonder why you think Point 1 should be modified that way. Thank you! Andrea
Dear Andrea,

From my understanding of the IPCA algorithm (specifically, Eq. 11 in Cardot & Degras, 2018), the Q matrix (the variable M in your code) should have a factor of (1 - f) on the lambda matrix and f * (1 - f) on the outer product of Uhatx, where in the stationary case f = 1/(n+1). Hence, when multiplying a factor into x, it appears to me that the factor sqrt( f * (1 - f) ) would produce the consistent result. Please let me know if this makes sense to you.

Best,
Hi Shawn,

I think the discrepancy is between the specific choices made in Cardot & Degras and other variants of the general method by Arora et al. (page 3). In particular, Cardot and Degras integrate eq (4) of their paper into this method for their covariance estimates, whereas Arora et al. do not seem to have such a recurrence integrated into eq (4) of their paper. In the implementation here, we were switching between the two papers and I believe settled on something in between. In particular, I believe that if you carry the factors through in this implementation, the ultimate matrix you compute eigenvectors of is a convex combination of the old lambda matrix and the new outer-product matrix, which I believe should also make sense. Does this seem to explain the discrepancy?

It does not answer the question "which is the best way of doing things?", and I do not know the answer to that, but I do not anticipate any large differences except perhaps in the early iterations. Would you be able to show a few of the example trajectories you computed with the different step rule? If there is a good empirical argument for using Cardot and Degras's steps in particular, we can certainly look at it.

Feel free to correct me if any of this seems incorrect or does not seem like the source of the discrepancy!

Thanks,
Victor
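The single-step relationship between the two step rules can be checked numerically. Under the assumption (not verified against the actual IPCA.m internals) that scaling x by sqrt(f) makes the updated matrix Lambda + f*x*x', while Cardot & Degras's Eq. 11 gives (1-f)*Lambda + f*(1-f)*x*x', the second matrix is exactly (1-f) times the first. Within one step they therefore share eigenvectors; only the eigenvalue state carried into later iterations is rescaled, which is consistent with expecting differences mainly in the early iterations. A minimal pure-Python sketch with toy numbers:

```python
f = 0.1                           # toy forgetting factor, f = 1/(n+1) with n = 9
Lam = [[2.0, 0.0], [0.0, 1.0]]    # current eigenvalue matrix (toy 2-D example)
x = [1.0, 3.0]                    # incoming (centered) observation

def outer(u, v):
    # outer product u * v'
    return [[a * b for b in v] for a in u]

def combine(M, N, alpha, beta):
    # alpha*M + beta*N, elementwise
    return [[alpha * m + beta * n for m, n in zip(rm, rn)]
            for rm, rn in zip(M, N)]

A = combine(Lam, outer(x, x), 1.0, f)              # Lambda + f*x*x' (sqrt(f) scaling)
B = combine(Lam, outer(x, x), 1 - f, f * (1 - f))  # Eq. 11: (1-f)*Lambda + f*(1-f)*x*x'

# B equals (1-f)*A entry by entry, so the two matrices have the same
# eigenvectors within this step; only the eigenvalues are rescaled.
for rA, rB in zip(A, B):
    for a, b in zip(rA, rB):
        assert abs(b - (1 - f) * a) < 1e-12
```

Because the rescaled eigenvalues feed back into the next iteration's lambda matrix, the two rules are not equivalent across iterations, which may be where the empirical differences come from.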
Closing for lack of activity.
Thanks for sharing the code of online PCA. The implementation is very elegant!
I write to report two bugs in the implementation of IPCA.m. It would be great if you could verify them.
Line 61:
x = sqrt(obj.f) * x;
I think the implementation is missing a factor of sqrt(1 - obj.f).
New Line 61:
x = sqrt( obj.f * (1-obj.f) ) * x;
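As a quick sanity check (a minimal pure-Python sketch with toy values, not code from IPCA.m): scaling x by sqrt(f * (1 - f)) makes its outer product carry exactly the f * (1 - f) factor that Eq. 11 of Cardot & Degras puts on the outer-product term.

```python
import math

f = 0.25               # toy forgetting factor for illustration
x = [1.0, -2.0, 0.5]   # toy observation vector

scaled = [math.sqrt(f * (1 - f)) * xi for xi in x]

# (sqrt(f*(1-f)) * x)(sqrt(f*(1-f)) * x)' == f*(1-f) * x*x', entry by entry
for i in range(len(x)):
    for j in range(len(x)):
        assert abs(scaled[i] * scaled[j] - f * (1 - f) * x[i] * x[j]) < 1e-12
```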
Line 76:
obj.lambda_ = s(end:-1:2);
Line 77:
V = V(:, end:-1:2);
When the condition on Line 68 (normx >= obj.tol) is not satisfied (e.g. at least for the first sample), the matrix M has dimension k by k instead of (k+1) by (k+1). Hence, when normx < obj.tol, Lines 76 and 77 should keep all of the dimensions. Example fix: (replacing Lines 75 - 77)
I have done simple testing using your demo_IPCA.m script, and with these fixes the errors decreased faster (although the final error was about the same).
The IPCA algorithm I referenced is Section 4, "Reduced Rank Incremental Principal Component Analysis," of Cardot, H., & Degras, D. (2018). Online Principal Component Analysis in High Dimension: Which Algorithm to Choose? International Statistical Review, 86(1), 29-50.