Fixes safe handling of small singular values in svds. #11829
Conversation
The currently adopted way of safeguarding the computation of singular vectors in the presence of small singular values carries a substantial risk of discarding valid (and valuable) singular values and vectors. This commit implements an alternative approach, adopted by scikit-learn's and raleigh's PCA and truncated SVD solvers, whereby all singular values (no matter how small) and the corresponding singular vectors are computed in a stable and accurate manner.
@mdhaber Here is a good use-case for
Selecting a tolerance is a nontrivial task for the users, so it should have a sensible default value. That often depends on the infinity norm of the matrix, but in this case it is a bit more involved. I will try to check the issue a bit later.
@ilayn Do you have time to review this sufficiently before we branch in ~6 days?
Hello, I have no opinion on the use of svd(AV), however I would be very happy with the removal of the cutoff (see #12629). A simple optimization can be made when
The update of
To write it completely while I'm on it: a solution could be from Line 1874
Had to remove 'const' in the definitions of xx, vxx and vrr in spatial/ckdtree.pyx (it would not build otherwise). Added myself to THANKS.txt.
@Gauthe: thanks a lot, implemented your suggestions!
I am a bit puzzled why you kept the increasing order convention. Note that you may completely remove the functions
@Gauthe:
@tylerjereddy I think we will need to bump this again
@evgueni-ovtchinnikov the codes here and at #16240 are basically identical now. If you could please update the summary here similarly to that in #16240, my new PR will become redundant and could be closed.
@lobpcg will do later today
@evgueni-ovtchinnikov the tentative deadlines are May 29/2022: rc1. Apparently delayed by a few days. If we miss this release, I can no longer help with this PR - life is too short.
@lobpcg I was waiting for your comments on the updated summary. If you are ok with it, then we are done, are we not? Should I then ask ilayn et al. for approval?
I had not noticed the update, sorry. Let me have a look. Yes, please ask @ilayn. With the new summary it should be clearer what this PR does.
@evgueni-ovtchinnikov the summary should follow the format that I use in #16240, since it is automatically processed. Since the unit tests here and in #16240 are the same, you could repeat that there are "new unit tests specifically for very small singular values that fail on the main; see https://github.com/scipy/scipy/runs/6538373727?check_suite_focus=true", which specifically demonstrates that this is a bug-fix PR rather than an enhancement. You can put most of the current text in the Additional information section, but maybe you want to cut it and refer to https://en.wikipedia.org/wiki/Rayleigh%E2%80%93Ritz_method#Using_the_normal_matrix instead. Please add the following info in the section "What does this implement/fix?"
@lobpcg thanks for reminding me about the format - the last time I read about it was two years ago, when this PR was created. The summary is updated.
I had a look at it. The PR rationale seems clearer to me now. I am somewhat puzzled why the QR is needed on line 329. I had a look at
@lobpcg could you add a comment explaining why you added the QR, please? My guess is that in lobpcg the Rayleigh-Ritz procedure uses non-orthogonal basis vectors in the trial subspace, which may cause the eigenvectors' deviation from orthogonality - could that be the reason?
```python
# solver lobpcg does not guarantee exactly orthonormal eigenvectors
if solver == 'lobpcg':
    eigvec, _ = np.linalg.qr(eigvec)
```
@ogauthe great point, thanks! It was my feeling too that only lobpcg, not arpack, may output not fully orthonormal eigenvectors and thus require this extra QR for all unit tests to pass. I have made the corresponding change in the code and keep my fingers crossed that all tests will still be green. Could you perhaps perform a formal review of this PR? I hope that it might help to get it accepted.
@evgueni-ovtchinnikov The possible inexact orthogonality in lobpcg comes from the fact that the RR matrices are computed implicitly to avoid extra matmuls during iterations, and the postprocessing that would run one final exact RR and compute exact residuals, as @ogauthe notices above, is still not implemented. I have edited the code to reflect this. I have made #16319 as a reminder to have this finally done in lobpcg internally one day.
If the QR only applies to the lobpcg case then I think it would be more clear to put this after line 295.
On balance, I'm going to squash merge this in before branching, since it looks like domain experts think it merits merging, the CI is passing, and there are regression tests added that claim to fix a bug. The patch diff coverage is also pretty solid (though this is, of course, not a guarantee of correctness). That said, the discussion was extremely hard to follow and there seem to be some concerns from @ilayn, which worries me a little. It looks like a unit test has also been temporarily disabled, which makes me a bit nervous, but I understand that several additional/unrelated issues are confounding the purpose of this PR. I guess we'll see how this fares during the release candidate stage of testing. I'm prepared to deal with the burden of, e.g., an additional release candidate if this ultimately causes too many issues, but hopefully it will work out. I did notice some frustration in the discussion above, and I do think we want this kind of deep expertise engaged in our specialized linalg codes, so hopefully future adjustments will be smoother for all involved.
I don't think this PR does the right thing at all, but I don't have any capacity to review for a while because of my work, so I'll leave it to the sparse residents. Just to put it on record: it is concerning that the sparse library is becoming dense linalg with a mild sparse flavor, and none of the tools are effectively tested with sparse problems.
@ilayn It is difficult, if at all possible, to effectively test with sparse problems in unit tests. It would surely benefit the community if someone wrote solid benchmarking for the sparse library, including the eigensolvers and svds, and integrated it into the scipy documentation. Creating such an issue could be a good first step in this direction.
@tylerjereddy thanks for the merge! There are a few small issues with svds, e.g., #15152, that this merge now opens an opportunity to quickly fix.
No, they are not difficult. As we discussed above, they can be conveniently added to the test suite. We just don't do it, and I still don't see a valid reason why we don't. A very large sparse matrix with small nnz is not a problem to incorporate. Most sparse features are tested with 3x3 examples. That is not something we can continue if we want this library to be relevant to sparse users. Otherwise it will share the same fate that parts of scipy.signal are having.
@ilayn it is difficult for me to write it from scratch. Since it is easy for you, maybe you could add one example for everybody to follow? E.g., I would be glad to reuse such a unit test as a sample, if a working unit test were already merged.
There is already one I've put above as a test case showing why this PR is not working as intended. We should probably first get you to build SciPy locally, so that you can test things instead of relying on CI/CD.
@ilayn to my recollection, your example was not a unit test, so unfortunately I could not just add it to the set of existing unit tests. It would be great if you opened a PR specifically to add at least one unit test that you like, for any of the sparse solvers (svds, eigenvalues or linear systems), that could later be used as a sample for other solvers after it gets merged.
Reference issue
Allows closing #12629
Partially closes #11406
What does this implement/fix?
The way `svds` computes a few singular values and vectors of a matrix A is to compute eigenvectors of either A' A or A A', whichever is smaller in size. Assuming some eigenvalues λ_i and corresponding eigenvectors v_i of A' A are computed, the original implementation of `svds` computed singular values and vectors as follows. Observing that the computed eigenvectors v_i are the right singular vectors of A, and the square roots of the corresponding eigenvalues λ_i are the corresponding singular values σ_i, the corresponding left singular vectors u_i were computed by dividing A v_i by σ_i, so that u_i and v_i satisfy A v_i = σ_i u_i and A' u_i = σ_i⁻¹ A' A v_i = σ_i v_i.

This way was clearly unsafe: computed λ_i can be negative because of round-off errors, σ_i can be zero if A is not of full rank, and if computed σ_i are very small then u_i may lose orthogonality. In an attempt to fix this, at some point the current approach was implemented, whereby eigenvalues λ_i below a certain cut-off value were replaced by zeros, and the corresponding left singular vectors by random vectors orthogonal to the computed singular vectors.

Such an approach does make sense if one employs `svds` for computing a low-rank approximation of A. Assuming that the singular values are in descending order, the norm of the difference between A and the sum of σ_i u_i v'_i, i = 1, ..., k, is equal to σ_{k+1}. Therefore, if σ_{k+1} falls below the required accuracy of approximation, the values σ_i, i = k + 1, ..., can be zeroed, and the respective singular vectors no longer play any role. However, in this kind of `svds` application, it is the user who is to decide how small σ_{k+1} should be for this and further singular values and vectors to become of no importance, and this is what issue #12629 advocates.

But the real disaster may happen if the user is after a few smallest singular values or needs to find the nullspace of A (the subspace of vectors v with A v = 0), as has been discovered by #11406. The algorithm employed by `svds` for computing singular vectors corresponding to singular values below the cut-off requires the availability of all singular vectors corresponding to singular values above the cut-off. The reason for the failure reported by #11406 is that, in the case of small singular values computation, only a few such singular vectors are available, and computing the rest of them would essentially require a full SVD.
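To illustrate the original, unsafe postprocessing described above, here is a minimal NumPy sketch (not the actual scipy code; all variable names are illustrative). It forms left singular vectors as u_i = A v_i / σ_i, which works for a well-conditioned matrix but breaks down when σ_i is zero or tiny:

```python
import numpy as np

# Sketch of the original svds postprocessing: eigenvectors v_i of A'A are
# the right singular vectors; left singular vectors are obtained by the
# unsafe division u_i = A v_i / sigma_i.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))
k = 3

# eigen-decomposition of the (small) normal matrix A'A
lam, V = np.linalg.eigh(A.T @ A)          # ascending eigenvalues
lam, V = lam[::-1][:k], V[:, ::-1][:, :k] # take the k largest eigenpairs

sigma = np.sqrt(np.maximum(lam, 0))  # computed lam may be slightly negative
U = (A @ V) / sigma                  # unsafe: fails or loses accuracy if sigma ~ 0

# for this well-conditioned A the result matches the top singular values
print(np.allclose(sigma, np.linalg.svd(A, compute_uv=False)[:k]))
```

For a rank-deficient or nearly rank-deficient A, the division above produces infinities or badly non-orthogonal columns of U, which is precisely the failure mode this PR addresses.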
This PR suggests an alternative fix for the small singular values problem, adopted by some other software implementing truncated SVD and PCA (e.g. `scikit-learn` and `raleigh`). Assuming A is 'tall' (the number of rows not less than the number of columns) and having computed V = [v_1, ..., v_k], `svds` should compute the SVD of A V:

A V = U Σ Q',

where U is the matrix whose columns are the left singular vectors of A V, the square matrix Σ is diagonal with the corresponding singular values of A V on the diagonal, and Q is the matrix whose columns are the corresponding right singular vectors of A V. After this, V is updated as V := V Q, and we end up with

A V = U Σ,

and hence V' A' A V = Σ U' U Σ = Σ² = Λ, i.e. the columns of V are eigenvectors of A' A.
Note that the singular values are guaranteed to be non-negative, singular vectors are orthonormal to machine accuracy, and zero (or very small) singular values are no longer a problem at all.
The suggested fix eliminates the need for the eigenvalue thresholding requested by #12629, allowing that issue to be closed.
It also fixes the problem discovered by #11406 for matrices with the number of rows not less than the number of columns.
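The postprocessing step described above can be sketched in a few lines of NumPy (an illustration of the idea, not the PR's actual code; names are hypothetical). Note that it remains well defined even when some singular values are exactly zero:

```python
import numpy as np

# Sketch of the PR's postprocessing for a tall A: given approximate
# eigenvectors V of A'A, take the SVD of the tall-skinny matrix A V
# and rotate V accordingly.
rng = np.random.default_rng(1)
n, m, k = 50, 20, 4
A = rng.standard_normal((n, m))
A[:, -1] = 0.0                       # make A rank deficient on purpose

_, V = np.linalg.eigh(A.T @ A)       # ascending eigenvalues
V = V[:, :k]                         # k smallest eigenpairs (incl. the nullspace)

U, sigma, Qh = np.linalg.svd(A @ V, full_matrices=False)  # A V = U diag(sigma) Q'
V = V @ Qh.T                         # update V := V Q

# sigma is non-negative by construction, U and V have orthonormal columns,
# and A V = U diag(sigma) holds even for (near) zero singular values
print(np.allclose(A @ V, U * sigma))
```

Here deliberately selecting the smallest eigenpairs, including the nullspace direction, shows that zero singular values cause no breakdown: no division by σ_i ever occurs.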
Additional information
Regarding the extra cost of the suggested post-processing step, one needs to bear in mind that both the arpack and lobpcg solvers employed by svds for computing eigenvectors of A' A or A A' involve the Rayleigh-Ritz procedure in subspaces of dimension not less than 2k, which takes O(k³) flops. Therefore, using these solvers only makes sense for k << min(n, m), where (n, m) = A.shape.
Assuming n > m >> k, the SVD of the n-by-k matrix A V involves two steps: a QR factorization of this matrix, and a full SVD of the k-by-k factor R. The first step takes fewer than (n + 1)k(k + 1) multiplications, which is much less than the 2nmk needed for applying A' A to k vector iterates, and negligible extra memory. The second step takes O(k³) flops, which is still less, and 5k² extra memory at worst (if the divide-and-conquer algorithm is used).
Thus, the cost of the suggested post-processing step is going to be a small fraction of the overall computational costs.
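The two-step SVD of a tall-skinny matrix mentioned above can be sketched as follows (an illustration with hypothetical names, not code from this PR):

```python
import numpy as np

# Two-step SVD of a tall n-by-k matrix B (standing in for A V):
# QR factorization first, then a dense SVD of the small k-by-k factor R.
rng = np.random.default_rng(2)
n, k = 1000, 5
B = rng.standard_normal((n, k))

Q1, R = np.linalg.qr(B)           # O(n k^2) work, reduced QR: Q1 is n-by-k
Ur, sigma, Qh = np.linalg.svd(R)  # O(k^3) work on the k-by-k factor
U = Q1 @ Ur                       # left singular vectors of B

# B = U diag(sigma) Q' is recovered from the two factorizations
print(np.allclose(B, (U * sigma) @ Qh))
```

The dominant cost is the QR step, which is linear in n, consistent with the flop counts quoted above.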