Stochastic Optimization for PCA. #1391
Only two quick comments so far; then I realized that to give anything more in-depth I'll have to read the paper fully. I'll try to do that in the next week. The technique looks really interesting, and I think some stochastic methods could give fast solutions to PCA that are pretty good. By any chance, have you done any timing comparisons?
```cpp
 * For more information, see the following.
 *
 * @code
 * @inproceedings{Musco2015,
```
Maybe this should be `arora2012`? I met one of the Musco brothers at NIPS last year, but I can't remember which one (probably Cameron?). A fun person to talk to. :)
You are absolutely right.
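For reference, a corrected entry would look something like this (author list and venue written from memory, so worth double-checking before committing):

```
@inproceedings{arora2012,
  author    = {Arora, Raman and Cotter, Andrew and Livescu, Karen and
               Srebro, Nathan},
  title     = {Stochastic Optimization for {PCA} and {PLS}},
  booktitle = {50th Annual Allerton Conference on Communication, Control,
               and Computing},
  year      = {2012}
}
```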
```cpp
void Shuffle()
{
  visitationOrder = arma::shuffle(
      arma::linspace<arma::Row<size_t> >(0, data.n_cols - 1, data.n_cols));
```
It might give a nice speedup to shuffle the entire data matrix instead of keeping a `visitationOrder`.
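Something like this rough sketch (untested; it assumes `data` is the function class's column-major data matrix):

```cpp
void Shuffle()
{
  // Shuffle the columns of the data matrix directly, so later column
  // accesses are contiguous instead of going through an index indirection.
  data = data.cols(arma::shuffle(
      arma::regspace<arma::uvec>(0, data.n_cols - 1)));
}
```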
Agreed, stochastic methods can provide faster solutions. In this case the runtime depends heavily on the number of optimization steps (`MaxIterations`), since there is no condition to stop early, so this has to be tuned for each task. Besides the paper, what I really like is the combination of the existing optimization framework with the PCA class. Timings:
Looks good to me; I enjoyed reading the paper. It had the shortest abstract of any paper I have ever read.

I only have a couple of minor comments, mostly about not calculating the objective function. The timings look good to me, and I think it's a nice improvement. We could add it to `pca_main.cpp` if you want, and we should probably modify `HISTORY.md`. :)
```cpp
{
  Log::Warn << "StepSize(): invalid value (> 0), automatically take the "
      << "negative direction." << std::endl;
  optimizer.StepSize() *= -1;
```
Hmm, would it be easier to simply negate the objective function? The formulation of stochastic PCA here is a maximization, but the mlpack optimizers minimize. So if you just use the negative objective (and derive the gradient accordingly) I think that removes the need for this.
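Roughly what I have in mind, as an untested sketch (the member name `data` and the exact signatures are placeholders, not the PR's actual code):

```cpp
// Negated objective -tr(U^T x x^T U): minimizing this is equivalent to
// maximizing the original PCA objective, so no step-size sign trick is needed.
double Evaluate(const arma::mat& U, const size_t begin, const size_t batchSize)
{
  const arma::mat x = data.cols(begin, begin + batchSize - 1);
  // tr(U^T x x^T U) == ||x^T U||_F^2.
  return -arma::accu(arma::square(x.t() * U));
}

void Gradient(const arma::mat& U, const size_t begin, arma::mat& gradient,
              const size_t batchSize)
{
  const arma::mat x = data.cols(begin, begin + batchSize - 1);
  gradient = -2.0 * x * (x.t() * U);  // Gradient of the negated objective.
}
```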
```cpp
arma::mat newData = data.cols(ordering);

// If we are an alias, make sure we don't write to the original data.
math::ClearAlias(data);
```
Hm, I think that even if `data` is an alias, when we call `data = std::move(newData)` then `data` will no longer be an alias, since it just takes the memory pointer from `newData` (which is not an alias). I am not 100% sure, but about 95% sure on that, based on looking at the Armadillo matrix `operator=` for rvalue references.
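A quick standalone way to check that claim (this assumes the alias is made with Armadillo's advanced constructor, which is what `MakeAlias()` uses internally, if I remember right; whether the pointer actually changes depends on Armadillo's `steal_mem()` internals):

```cpp
#include <armadillo>
#include <iostream>

int main()
{
  arma::mat original(3, 5, arma::fill::randu);
  // 'data' is a non-owning alias of 'original' (no copy, no ownership).
  arma::mat data(original.memptr(), original.n_rows, original.n_cols,
                 false, false);

  arma::mat newData(3, 5, arma::fill::randu);
  data = std::move(newData);

  // If the move steals newData's buffer, 'data' no longer aliases
  // 'original' and this prints 1.
  std::cout << (data.memptr() != original.memptr()) << std::endl;
}
```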
Right, at this point I copied:
mlpack/src/mlpack/methods/logistic_regression/logistic_regression_function_impl.hpp
Lines 68 to 80 in e3fe135
```cpp
{
  MatType newPredictors;
  arma::Row<size_t> newResponses;
  math::ShuffleData(predictors, responses, newPredictors, newResponses);

  // If we are an alias, make sure we don't write to the original data.
  math::ClearAlias(predictors);
  math::ClearAlias(responses);

  // Take ownership of the new data.
  predictors = std::move(newPredictors);
  responses = std::move(newResponses);
```
(Sorry for the slow response here.) I see, I guess I originally wrote this code. I think that maybe we could omit the `ClearAlias()` call in both places, but I suppose it's not a huge deal either way; up to you if you want to make the change.
```cpp
    const size_t /* begin */,
    const size_t /* batchSize */)
{
  return pl;
```
I'm a bit confused about the pseudoloss here... I think you are trying to avoid the computation of the objective entirely (so maybe my comment about negating the objective makes no sense, since I wrote it before I got here!), but I'd be concerned that the optimizer might terminate early because it detects convergence.

From equation 1, I guess that the objective is just `tr(U^T x x^T U)`. If we have some `EvaluateWithGradient()`, then we only have to do the extra computation of pre-multiplying `U^T` to the term we take in the gradient, and then taking the trace of that. So maybe it is not that much extra, but I could see that it might still make a difference.
What do you think, is there any problem computing the actual loss? I am not opposed to avoiding the objective calculation if it's not any problem.
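Concretely, something like this sketch (hypothetical names, same data layout as above); the objective falls out of an intermediate of the gradient computation almost for free:

```cpp
double EvaluateWithGradient(const arma::mat& U, const size_t begin,
                            arma::mat& gradient, const size_t batchSize)
{
  const arma::mat x = data.cols(begin, begin + batchSize - 1);

  // Shared intermediate: x^T U.
  const arma::mat xtU = x.t() * U;

  gradient = -2.0 * x * xtU;  // Gradient of the negated objective.

  // tr(U^T x x^T U) == ||x^T U||_F^2, so no extra matrix product is needed.
  return -arma::accu(arma::square(xtU));
}
```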
```cpp
updatePolicy.Update(iterate, Fargs...);

arma::mat R;
arma::qr_econ(iterate, R, iterate);
```
Just to check that I understand correctly: I believe this is the renormalization step that isn't necessary every iteration. If that's the case, no need to respond; I got it. :)
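If it helps, the relaxation I have in mind would look roughly like this (untested sketch; `iteration` and `renormFreq` are hypothetical names for the loop counter and a new parameter):

```cpp
// Re-orthonormalize the columns of the iterate only every renormFreq
// iterations, instead of paying for a QR decomposition at each step.
if ((iteration % renormFreq) == 0)
{
  arma::mat R;
  arma::qr_econ(iterate, R, iterate);
}
```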
```cpp
 * @param iterate Parameters that minimize the function.
 */
template<typename... Targs>
void Update(arma::mat& iterate, Targs... Fargs)
```
I guess if you want to be really picky it should be `fargs` (or `fArgs`?), not `Fargs`, but I dunno if you want to handle template parameter packs differently. :)
Everything looks good to me, just one confused comment. If you can clarify it would be great.
```cpp
template<typename... Targs>
void Update(arma::mat& iterate, Targs... fArgs)
{
  iterate *= -1;
```
(Oops, I put this in the wrong place originally.)

I guess I am still just confused about this bit: shouldn't this be unnecessary if we simply negate the results of `Evaluate()`, `EvaluateWithGradient()`, and `Gradient()`? Sorry if I am overlooking something; it's possible I've misunderstood. It just seems like extra computation here.
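To make sure we're talking about the same thing, here is a tiny self-contained sketch of the equivalence I have in mind (hypothetical setup, single data point): with a negated gradient, the optimizer's built-in descent step already is the positive step of eq. 3, so no extra sign flip of the iterate is needed.

```cpp
#include <armadillo>

int main()
{
  arma::mat x(10, 1, arma::fill::randu);        // A single data point.
  arma::mat iterate(10, 3, arma::fill::randu);  // Current estimate U.
  const double stepSize = 0.01;

  // Gradient of the negated objective -tr(U^T x x^T U).
  const arma::mat gradient = -2.0 * x * (x.t() * iterate);

  // The optimizer's standard descent update...
  iterate -= stepSize * gradient;
  // ...equals the ascent step of eq. 3:
  //   iterate += stepSize * 2 * x * x^T * iterate_old.
}
```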
For PCA/PLS we take a (positive) step in the stochastic gradient direction (eq. 3); maybe I missed something that reverts it:

```cpp
iterate -= stepSize * gradient;
```
Ah, sorry that I never responded to this. I follow now, sorry for the confusion.
Second approval provided automatically after 24 hours. 👍
I think this needs to transition to ensmallen before merge, but it should be otherwise good to go. 👍

Agreed, let's see if I can do this in the next few days.

@mlpack-jenkins test this please

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

@zoq what are you thinking for this one? I do think it would be nice to get merged, but I'm not sure how much time you have to work on it.

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

Let's see if I can make the necessary changes in the next few days.

…sis to the HISTORY.

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

This is ready, so let's keep this open.

Bumping this; is this still ready?

Hey @zoq, this one got approved a long time ago but was never merged; is everything still ready? If so, if you want to merge master in, we can go ahead and merge it in. 👍

Agreed, this is a cool feature to merge before releasing mlpack 4. @zoq, great work 💯

I have marked this for mlpack 4. If you get a chance to merge it before then, that would be great 👍
Implementation of Stochastic PCA as described in "Stochastic Optimization for PCA and PLS", R. Arora et al.