Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugs report: IPCA.m #2

Closed
goodshawn12 opened this issue Sep 14, 2018 · 5 comments
Closed

Bugs report: IPCA.m #2

goodshawn12 opened this issue Sep 14, 2018 · 5 comments

Comments

@goodshawn12
Copy link

Thanks for sharing the code of online PCA. The implementation is very elegant!
I write to report two bugs in the implementation of IPCA.m. It would be great if you can verify them.

  1. Line 61: x = sqrt(obj.f) * x;
    I think the implementation is missing a factor of sqrt(1-obj.f)
    New Line 61: x = sqrt( obj.f * (1-obj.f) ) * x;

  2. Line 76: obj.lambda_ = s(end:-1:2);
    Line 77: V = V(:, end:-1:2);

When Line 68 ( normx >= obj.tol ) condition is not satisfied (e.g. at least for the first sample), matrix M would have dimention k by k instead of k+1 by k+1. Hence when normx < obj.tol, Line 76 and 77 should keep all of the dimension.

Example fix: (replacing Line 75 - 77)

          obj.lambda_ = diag(s);
          if (normx >= obj.tol)
              obj.lambda_ = obj.lambda_(end:-1:2);
              V = V(:, end:-1:2);
          end

I have done simple testing using your demo_IPCA.m script, and in this way the errors decreased faster (although the final error was about the same).

The IPCA algorithm I referenced is from section 4 "Reduced Rank Incremental Principal Component Analysis" from Cardot, H., & Degras, D. (2018). Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?. International Statistical Review, 86(1), 29-50.

@agiovann
Copy link
Contributor

Dear @goodshawn12,

Thank you for pointing that out. We are gonna check it!

Cheers

@agiovann
Copy link
Contributor

Dear @goodshawn12

thanks a lot again for pointing out these issues. We fixed Point 2 but we wonder why do you think Point 1 should be modified that way.

Thank you!

Andrea

@goodshawn12
Copy link
Author

Dear Andrea,

From my understanding of the IPCA algorithm (specifically according to Eq. 11 in the Cardot & Degras, 2018), the Q matrix (the variable M in your code) should have a factor of (1 - f) for lambda matrix and f * (1-f) for outer-product of Uhatx, where in stationary case f = 1/ (n+1). Hence, when multiplying a factor to x, it appears to me that the factor of sqrt( f * (1-f) ) would produce the consistent result.

Please let me know if this makes sense to you.

Best,
Shawn

@victorminden
Copy link
Contributor

victorminden commented Oct 6, 2018

Hi Shawn,

I think that the discrepancy is between the specific choices made in Cardot & Degras and other variants to the general method by Arora et al. , (page 3).

In particular, Cardot and Degras integrate eq (4) in their paper into this method for their covariance estimates, wheras Arora et al. do not seem to have such a recurrence integrated into eq (4) in their paper. In the implementation here, we were switching between the two papers and I believe settled on something in between.

In particular, I believe if you carry the factors through in this implementation, the ultimate matrix you compute eigenvectors of is a convex combination of the old lambda matrix and the new outer-product matrix, which I believe should also make sense.

Does this seem to explain the discrepancy? It does not answer the question "which is the best way of doing things?", and I do not know the answer to that but I do not anticipate any large differences except perhaps in the early iterations. Would you possibly be able to show a few of the example trajectories you computed with the different step rule? If there's a good empirical argument for using Cardot and Degras's steps in particular, we can certainly look at it.

Feel free to correct me if any of this seems incorrect or this does not seem like the source of the discrepancy!

Thanks,

Victor

@agiovann
Copy link
Contributor

closing for lack of activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants