Investigate the impact of LU-variants on randomized SVD with scipy's lu_no_fortran branch #16

Draft · wants to merge 1 commit into main
Conversation

@ogrisel (Owner) commented on May 23, 2023

Experiment with the lu_no_fortran branch of scipy (scipy/scipy#18358) to investigate the impact of LU-based normalizer variants on randomized SVD.
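
For context, below is a minimal sketch (not the actual benchmark code) of the kind of power-iteration loop being compared, using an LU-based normalizer built on the long-standing `scipy.linalg.lu(..., permute_l=True)` API; the function name and defaults are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import lu, qr, svd


def randomized_svd_lu(A, n_components, n_iter=4, random_state=0):
    """Randomized SVD with LU-normalized power iterations (sketch)."""
    rng = np.random.default_rng(random_state)
    Q = rng.standard_normal((A.shape[1], n_components))
    for _ in range(n_iter):
        # LU-based normalization: keep the permuted unit-lower-triangular
        # factor P @ L as the new basis after each matrix multiplication.
        Q, _ = lu(A @ Q, permute_l=True)
        Q, _ = lu(A.T @ Q, permute_l=True)
    # Final orthonormalization and SVD of the small projected matrix.
    Q, _ = qr(A @ Q, mode="economic")
    B = Q.T @ A
    U_hat, s, Vt = svd(B, full_matrices=False)
    return Q @ U_hat, s, Vt
```

The benchmarks compare this LU-based normalizer (and its variants) against QR-based normalization and against skipping normalization entirely.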

Here are the resulting plots of the randomized SVD benchmarks with various normalizers:

[Benchmark plots: uncorrelated_matrix, a3a, mnist_784, 20newsgroups, olivetti_faces, lfw_people, low_rank_matrix]

So in conclusion:

  • Using LU-based normalization while completely discarding the permutation info is sometimes better and sometimes worse (!!!) than no normalization at all, but always worse than an LU-based normalizer that takes the permutation into account one way or another.
  • Using index-based permutation (using matrix_p=False followed by row-wise fancy indexing) seems to be approximately as fast as letting scipy precompute the permutation with permute_l=True (see the sketch after this list).
  • LU-based normalization with the new Cython-based scipy branch is still slightly faster than QR-based normalization (on average).
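
To make the three LU variants above concrete, here is a sketch of the corresponding normalizer steps. It only uses the released `scipy.linalg.lu` API; the branch under test can return permutation indices directly (the `matrix_p=False` option mentioned above), which the `argmax` call below merely emulates.

```python
import numpy as np
from scipy.linalg import lu


def lu_no_permute(M):
    # Discard the permutation entirely: sometimes better, sometimes worse
    # than no normalization at all, and always worse than the variants below.
    _, L, _ = lu(M)
    return L


def lu_permute_l(M):
    # Let scipy pre-apply the permutation and return P @ L directly.
    PL, _ = lu(M, permute_l=True)
    return PL


def lu_fancy_indexing(M):
    # Index-based variant: apply the permutation with row-wise fancy
    # indexing.  Released scipy only exposes the permutation as a matrix,
    # so the row indices are recovered with argmax here; the lu_no_fortran
    # branch can return them directly, avoiding the matrix product.
    P, L, _ = lu(M)
    p = P.argmax(axis=1)  # row i of P @ L is row p[i] of L
    return L[p]           # same result as P @ L, without forming the product
```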

@lezcano commented on May 23, 2023

Fair enough. Thank you for the benchmarks!

There's just one bit I don't understand. How come in the last graph "LU_no_permute" and "QR" seem to "come back in time" (iter 0 happens AFTER iter 1)?

@ogrisel (Owner, Author) commented on May 23, 2023

> There's just one bit I don't understand. How come in the last graph "LU_no_permute" and "QR" seem to "come back in time" (iter 0 happens AFTER iter 1)?

Those are random fluctuations in the smallest timings. Ideally, I should re-run the benchmarks many times and plot the average with horizontal error bars, but life is too short ;)
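
A minimal sketch of that idea, assuming a hypothetical `run_one_benchmark` helper for a single (normalizer, n_iter) configuration:

```python
import time
import numpy as np


def timed(fn, n_repeats=10):
    """Run fn several times and return the mean and std of the wall time."""
    durations = []
    for _ in range(n_repeats):
        tic = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - tic)
    return np.mean(durations), np.std(durations)


# Example (run_one_benchmark is hypothetical):
#   mean_t, std_t = timed(lambda: run_one_benchmark(normalizer="LU", n_iter=2))
#   plt.errorbar(mean_t, error, xerr=std_t, fmt="o")  # horizontal error bars
```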
