[MRG+1] Allow sparse input to incremental PCA #13960
Looks good to me.
Please add an entry to the change log at doc/whats_new/v*.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:.
Yes, but if the way you treat sparse matrices is to densify them, you might as well get the user to do that...
A clarification: if the user wants to densify batch-wise and fit multiple sparse matrices, this is not currently possible. Example scenario: you want to fit a single estimator to multiple large
Why can't the user currently pass them, dense, one by one to partial_fit?
I would presume that turning the entire sparse matrix to dense at once would be undesirable for memory reasons.
That's why we require the user to make the data dense in situations where doing it automatically may be deleterious.
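For reference, the user-side approach discussed above (densify one batch at a time and pass each batch to partial_fit) might look roughly like the sketch below. The helper name `partial_fit_sparse`, the demo matrix, and the batch sizes are assumptions made for illustration; they are not part of this pull request.

```python
from scipy import sparse
from sklearn.decomposition import IncrementalPCA
from sklearn.utils import gen_batches


def partial_fit_sparse(X_sparse, n_components, batch_size):
    """Hypothetical helper: densify and fit one batch of rows at a time."""
    X_csr = sparse.csr_matrix(X_sparse)  # CSR makes row slicing cheap
    ipca = IncrementalPCA(n_components=n_components)
    # min_batch_size avoids a short trailing batch, since partial_fit can
    # reject batches with fewer rows than n_components.
    for rows in gen_batches(X_csr.shape[0], batch_size,
                            min_batch_size=n_components):
        # Only this slice is ever held in memory as a dense array.
        ipca.partial_fit(X_csr[rows].toarray())
    return ipca


# Illustrative data only: 100 samples, 20 features, 30% non-zero entries.
X_demo = sparse.random(100, 20, density=0.3, format="csr", random_state=0)
ipca_manual = partial_fit_sparse(X_demo, n_components=3, batch_size=25)
print(ipca_manual.components_.shape)
```

The same loop extends to several sparse matrices by calling the inner partial_fit step on batches from each matrix in turn, at the cost of the user managing the batching.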
Note: the tests are failing, I think because of some other unrelated change; the failures are in grid search CV.
@NicolasHug @jnothman any further comments?
Sorry my review time has been limited. Please add a check that partial_fit still raises an appropriate error when passed sparse X. Otherwise lgtm, thanks!
As Joel mentioned, please test the error in partial_fit for dense input. Also, all the methods that accept/return a sparse X should be changed to "array-like or sparse matrix". Otherwise LGTM too!
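A check along the lines requested above could be sketched as follows. It assumes partial_fit rejects sparse X with a TypeError; the exact exception type and message are not stated in this conversation, and the helper name `partial_fit_rejects_sparse` is hypothetical.

```python
from scipy import sparse
from sklearn.decomposition import IncrementalPCA


def partial_fit_rejects_sparse(X_sparse):
    """Return True if partial_fit raises TypeError on sparse input."""
    ipca = IncrementalPCA(n_components=2)
    try:
        ipca.partial_fit(X_sparse)
    except TypeError:  # assumed exception type, see the note above
        return True
    return False


# Illustrative data only: 50 samples, 10 features.
X_check = sparse.random(50, 10, density=0.2, format="csr", random_state=0)
assert partial_fit_rejects_sparse(X_check)

# The same data passed as a dense array is still accepted.
IncrementalPCA(n_components=2).partial_fit(X_check.toarray())
```

In the project's test suite this would more naturally be written with pytest.raises; plain try/except is used here only to keep the sketch free of extra dependencies.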
@jnothman @NicolasHug should be good to go now.
Thanks Scott!
Thanks @jnothman and @NicolasHug for the detailed reviews!
Reference Issues/PRs
Fixes #13957.
What does this implement/fix? Explain your changes.
Implements sparse input for IncrementalPCA. IncrementalPCA is by design well suited to accepting sparse input, since it already processes the data in batches. With this change the input may be sparse, and if it is, the data is converted to dense one batch at a time.
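As a rough usage sketch of the behaviour described above, assuming a scikit-learn release that includes this change: a sparse matrix is passed straight to fit and transform, and batch_size bounds how many rows are dense at any one time. The matrix shape, density, and batch size are illustrative values only.

```python
from scipy import sparse
from sklearn.decomposition import IncrementalPCA

# Illustrative data only: 100 samples, 20 features, 30% non-zero entries.
X_big = sparse.random(100, 20, density=0.3, format="csr", random_state=0)

# At most batch_size rows should be held as a dense array at once.
ipca_sparse = IncrementalPCA(n_components=3, batch_size=25)
ipca_sparse.fit(X_big)

# The transformed output is a dense array of shape (n_samples, n_components).
X_reduced = ipca_sparse.transform(X_big)
print(X_reduced.shape)
```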