
SparsePCA implementation based on sparse dictionary learning #13127

Open
iancovert opened this issue Feb 10, 2019 · 6 comments

@iancovert

Description

Hi developers,

There's a scikit-learn model called SparsePCA, but it doesn't seem to be SPCA at all. According to the documentation, it's based on a sparse dictionary learning paper (the Mairal et al. paper in the references below). This seems like a naming mistake, and it's especially odd because there's also a model called DictionaryLearning that seems to optimize a very similar sparse coding objective.
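For reference, if I'm reading the user guide correctly, what SparsePCA actually optimizes is the dictionary learning objective (U is the code/embedding, V holds the components):

$$(U^*, V^*) = \underset{U,\,V}{\arg\min}\ \tfrac{1}{2}\lVert X - UV\rVert_F^2 + \alpha \lVert V\rVert_1 \quad \text{subject to } \lVert U_k\rVert_2 \le 1 \text{ for all } k$$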

I was looking for an open-source SPCA implementation, ideally based on the original paper, and was confused to find that the sklearn one is something else altogether. I think I'm not the only one; for example, open issue #11512 has a user requesting a feature based on the SPCA paper that doesn't make as much sense in the context of dictionary learning.

Are there plans to fix this inconsistency? Would there be interest from other users in an SPCA implementation?

I ended up implementing the model myself, so I'd be happy to help if this seems important.

Thanks,
Ian

References

H. Zou, T. Hastie, R. Tibshirani: Sparse Principal Component Analysis, Journal of Computational and Graphical Statistics, 2006.

J. Mairal, F. Bach, J. Ponce, G. Sapiro: Online Dictionary Learning for Sparse Coding, ICML 2009.

@agramfort
Member

agramfort commented Feb 10, 2019 via email

@vincentbillaut

I also found that it doesn't do SPCA at all.
I think the very least would be to rename the class to something less ambiguous.

In the meantime I'm using the R package from the paper.

@ogrisel
Member

ogrisel commented Aug 5, 2020

Indeed, it's not the same algorithmic approach that is currently used. Still, it maximizes explained variance at a reduced rank under sparsity constraints on the components (not on the loadings, as in dictionary learning).

The definition in the Wikipedia article uses a constraint on the l0 norm of the components:

https://en.wikipedia.org/wiki/Sparse_PCA

The Sparse PCA paper by Zou et al. uses a constraint on the l1 norm of the components, which is more in line with the l1 penalty used in the objective function of the SparsePCA class in scikit-learn.

However, both the Wikipedia article and the Zou et al. paper also have an orthogonality constraint, which is not enforced in scikit-learn (and does not hold in practice either).
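For reference, the "self-contained regression" criterion from Zou et al. (if I transcribe it correctly) penalizes the loadings β_j, the columns of B, and puts the orthogonality constraint on A:

$$\min_{A,\,B}\ \sum_{i=1}^{n} \lVert x_i - A B^\top x_i\rVert_2^2 + \lambda \sum_{j=1}^{k} \lVert \beta_j\rVert_2^2 + \sum_{j=1}^{k} \lambda_{1,j} \lVert \beta_j\rVert_1 \quad \text{subject to } A^\top A = I_k$$

The normalized columns of B then give the sparse principal components.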

So indeed, the naming in scikit-learn is highly misleading. We should at least implement a solver that lets the user enforce the orthogonality constraint, and maybe keep the current solver as an alternative that cannot enforce it.
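A quick way to see the missing orthogonality (a sketch only; the dataset and hyperparameters here are arbitrary):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA, SparsePCA

X = load_digits().data
X = X - X.mean(axis=0)

pca = PCA(n_components=5).fit(X)
spca = SparsePCA(n_components=5, alpha=1.0, random_state=0).fit(X)

# PCA components form an orthonormal set: V V^T is the identity.
print(np.allclose(pca.components_ @ pca.components_.T, np.eye(5)))  # True
# SparsePCA components generally do not: the Gram matrix has nonzero
# off-diagonal entries and non-unit diagonal.
print(np.round(spca.components_ @ spca.components_.T, 2))
```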

@ogrisel
Member

ogrisel commented Aug 11, 2020

Speaking offline with @GaelVaroquaux, it's debatable whether SparsePCA without orthogonality constraints can be considered natural/intuitive:

When there is no penalty on the components, PCA with n_components < n_features will automatically converge to orthogonal components when optimizing for maximal explained variance (equivalent to minimizing the squared reconstruction error). This is no longer the case once we add an l1 penalty on the components to the objective function. One could therefore argue that adding an orthogonality constraint on top of the "max explained variance with l1 penalty" objective is arbitrary, and according to Gael the literature was not clear about what Sparse PCA means.

Recently, the definition of Zou et al. (which is quite similar to the one used in Wikipedia, but not exactly the same) seems to have won out, so we might still want to rename the model we implemented in scikit-learn to avoid confusion, but I am not sure what to rename it to.

@ogrisel
Member

ogrisel commented Aug 11, 2020

We should start by clarifying the doc, and possibly add the original solver from Zou et al.

Adding the Zou et al. implementation (with orthogonality constraints) in the same class is another option if people find it practically useful. @iancovert what is your use case? In particular why do you need to enforce the orthogonality of the components for your application?

@iancovert
Author

iancovert commented Aug 19, 2020

Hi Olivier. At the time I was looking for an SPCA solver that would produce a low-dimensional embedding using sparse loading vectors, and it was mainly the sparse loading vectors that I wanted. The solver from Zou et al. calculates the loadings via an alternating optimization algorithm, and that's what I personally expected to be implemented here. But you seem to be right that there are many SPCA variants out there.
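For concreteness, here's a minimal sketch of that alternating procedure as I understand it. The function name and hyperparameter defaults are mine, and I lean on scikit-learn's ElasticNet, whose (alpha, l1_ratio) parameterization only loosely maps to the paper's (λ, λ_1,j):

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def spca_zou(X, n_components, alpha=1e-3, l1_ratio=0.9, n_iter=50):
    X = X - X.mean(axis=0)  # center the data
    # Initialize A with the ordinary PCA loadings (right singular vectors).
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    A = Vt[:n_components].T               # shape (n_features, k)
    B = A.copy()
    enet = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                      fit_intercept=False, max_iter=5000)
    for _ in range(n_iter):
        # B-step: elastic-net regression of each score X @ a_j onto X.
        for j in range(n_components):
            enet.fit(X, X @ A[:, j])
            B[:, j] = enet.coef_
        # A-step: Procrustes update, A = U V^T where X^T X B = U D V^T.
        U, _, Vt = np.linalg.svd(X.T @ (X @ B), full_matrices=False)
        A = U @ Vt
    # Normalize the nonzero loading vectors to unit length.
    norms = np.linalg.norm(B, axis=0)
    B[:, norms > 0] = B[:, norms > 0] / norms[norms > 0]
    return B  # columns are the sparse loadings
```

The sparse loadings come out as the columns of B, which is exactly what I couldn't get from the current implementation.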

I had no need for the loadings to be orthogonal, which, as you've mentioned, is not guaranteed by this solver (p.272 of the paper).

Since I required the loading vectors, my issue with the algorithm currently implemented here (according to this) is that it solves directly for the embedding and does not calculate loading vectors. Other users may have different requirements; I'm not sure, since I'm not a frequent SPCA user.
