Make it possible to use ARPACK or randomized_svd solvers in PLSSVD #18931

ogrisel · 2020-11-27T17:31:43Z

As initially discussed in #18302 (comment), it might be interesting to add an extra constructor parameter to PLSSVD to select the ARPACK solver or sklearn's randomized_svd solver instead of the default LAPACK solver (scipy.linalg.svd).

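However, the ARPACK and randomized_svd solvers are non-deterministic (ARPACK starts from a random initial vector and randomized_svd relies on random projections), so we would also need to add a random_state parameter.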
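To make the idea concrete, here is a minimal sketch of how the three solvers could be dispatched on the cross-covariance matrix X.T @ Y. The helper name `cross_covariance_svd` and the `svd_solver` values are hypothetical and not part of PLSSVD today. The sketch also skips the centering and scaling that PLSSVD performs, and it assumes `n_components` is strictly smaller than both dimensions of the cross-covariance matrix, as ARPACK requires.

```python
import numpy as np
from scipy.linalg import svd
from scipy.sparse.linalg import svds
from sklearn.utils import check_random_state
from sklearn.utils.extmath import randomized_svd, svd_flip


def cross_covariance_svd(X, Y, n_components, svd_solver="full", random_state=None):
    """Hypothetical helper: truncated SVD of C = X.T @ Y with a selectable solver."""
    C = X.T @ Y
    if svd_solver == "full":
        # LAPACK: deterministic, computes all singular triplets, then truncates.
        U, s, Vt = svd(C, full_matrices=False)
        U, s, Vt = U[:, :n_components], s[:n_components], Vt[:n_components]
    elif svd_solver == "arpack":
        # ARPACK needs a starting vector; seeding it makes runs reproducible.
        rng = check_random_state(random_state)
        v0 = rng.uniform(-1, 1, size=min(C.shape))
        U, s, Vt = svds(C, k=n_components, v0=v0)
        # Sort explicitly so singular values are in descending order.
        order = np.argsort(s)[::-1]
        U, s, Vt = U[:, order], s[order], Vt[order]
    elif svd_solver == "randomized":
        U, s, Vt = randomized_svd(
            C, n_components=n_components, random_state=random_state
        )
    else:
        raise ValueError(f"Unknown svd_solver: {svd_solver!r}")
    # Fix the sign ambiguity so the solvers return comparable vectors.
    U, Vt = svd_flip(U, Vt)
    return U, s, Vt


rng = np.random.RandomState(0)
X, Y = rng.randn(50, 8), rng.randn(50, 6)
for solver in ("full", "arpack", "randomized"):
    _, s, _ = cross_covariance_svd(X, Y, 3, svd_solver=solver, random_state=0)
    print(solver, s)
```

On a toy problem like this one, the three solvers should agree on the leading singular values up to numerical precision. The interesting differences would only show up in benchmarks on larger data.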

Careful benchmarking to evaluate the trade-off between speed and numerical or statistical accuracy should be conducted to:

- help the user choose the value of this parameter (both in the docstring and the user guide);
- suggest an "auto" strategy that automatically selects a good solver based on the shape of the data and the n_components parameter, similar to what is done in the PCA and TruncatedSVD estimators. This "auto" option should become the default after the usual deprecation period.
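As a starting point for discussion, an "auto" rule could look something like the sketch below. The function name `choose_svd_solver` is hypothetical, and the thresholds (500 and 0.8) are placeholders loosely modeled on the kind of shape-based heuristic PCA uses. They have not been benchmarked for PLSSVD and would need to be tuned from the benchmark results.

```python
def choose_svd_solver(matrix_shape, n_components, small_threshold=500, rank_ratio=0.8):
    """Hypothetical "auto" rule for the SVD of the cross-covariance matrix.

    matrix_shape is the shape of X.T @ Y, i.e. (n_features, n_targets).
    The default thresholds are unbenchmarked placeholders.
    """
    # Small problems: the exact LAPACK solver is cheap enough.
    if max(matrix_shape) <= small_threshold:
        return "full"
    # Large matrix but few components requested: a truncated solver should pay off.
    if 1 <= n_components < rank_ratio * min(matrix_shape):
        return "randomized"
    # Nearly full rank requested: fall back to the exact solver.
    return "full"


print(choose_svd_solver((100, 50), n_components=2))
print(choose_svd_solver((5000, 2000), n_components=10))
```

Whether ARPACK should also be a candidate in the "auto" branch is exactly the kind of question the benchmarks would need to answer.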

ogrisel · 2020-11-27T17:36:48Z

This is related to #12069 that does a similar thing for KernelPCA.

ogrisel added the New Feature label Nov 27, 2020

ogrisel mentioned this issue Nov 27, 2020

MNT initialize weights when using ARPACK solver with a utility #18302

Merged

cmarmo added the module:decomposition label Jan 19, 2021
