Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT, TST: test_svds_parameter_tol failures #20157

Closed
tylerjereddy opened this issue Feb 28, 2024 · 7 comments · Fixed by #20221
Closed

MAINT, TST: test_svds_parameter_tol failures #20157

tylerjereddy opened this issue Feb 28, 2024 · 7 comments · Fixed by #20221
Labels
maintenance Items related to regular maintenance tasks scipy.sparse.linalg
Milestone

Comments

@tylerjereddy
Copy link
Contributor

Building and testing against NumPy 1.26.4 or main I sporadically see two different types of test failures in Test_SVDS_LOBPCG::test_svds_parameter_tol (see below the fold) on x86_64 Linux. Looks like this was just touched in gh-19855, so maybe some simple explanation related to some of the tol changes already made there. I've got the PROPACK submodule update pulled in, and it looks like if I go back to a commit that precedes that update I can't reproduce even with pytest-repeat and a bunch of replicates on Test_SVDS_LOBPCG.

============================================================================================ FAILURES =============================================================================================
____________________________________________________________________________ Test_SVDS_LOBPCG.test_svds_parameter_tol _____________________________________________________________________________
scipy/sparse/linalg/_eigen/tests/test_svds.py:311: in test_svds_parameter_tol
    assert error < accuracy
E   assert np.float64(2.5068101941265085e-09) < 1e-10
        A          = <100x100 sparse matrix of type '<class 'numpy.float64'>'
	with 6624 stored elements in Compressed Sparse Column format>
        _          = array([[-0.06798121, -0.06969187, -0.1634784 , ..., -0.08870255,
        -0.08289897, -0.07725935],
       [-0.0130691...722,  0.12174589],
       [ 0.03737577,  0.07600953, -0.04798564, ..., -0.08283926,
        -0.04644228,  0.04356348]])
        accuracies = {'arpack': [2.5e-15, 1e-10, 1e-10], 'lobpcg': [1e-10, 0.001, 10], 'propack': [1e-12, 1e-06, 0.0001]}
        accuracy   = 1e-10
        err        = <function SVDSCommonTests.test_svds_parameter_tol.<locals>.err at 0x7f71b2e25d00>
        error      = np.float64(2.5068101941265085e-09)
        k          = 3
        n          = 100
        rng        = Generator(PCG64) at 0x7F71B2DEB300
        s          = array([3.57532733e-01, 1.27553875e-01, 1.19870964e-01, 1.15829350e-01,
       1.12545658e-01, 1.09114116e-01, 1.028058...2.69065672e-04, 1.72149810e-04, 1.65442589e-04,
       7.44147686e-05, 3.32524683e-05, 2.12349114e-06, 1.50715281e-07])
        self       = <scipy.sparse.linalg._eigen.tests.test_svds.Test_SVDS_LOBPCG object at 0x7f71b335f890>
        tol        = 0.0001
        tols       = [0.0001, 0.01, 1.0]
===================================================================================== short test summary info =====================================================================================
FAILED scipy/sparse/linalg/_eigen/tests/test_svds.py::Test_SVDS_LOBPCG::test_svds_parameter_tol - assert np.float64(2.5068101941265085e-09) < 1e-10
____________________________________________________________________________ Test_SVDS_LOBPCG.test_svds_parameter_tol _____________________________________________________________________________
scipy/sparse/linalg/_eigen/tests/test_svds.py:310: in test_svds_parameter_tol
    error = err(tol)
        A          = <100x100 sparse matrix of type '<class 'numpy.float64'>'
	with 6624 stored elements in Compressed Sparse Column format>
        _          = array([[-0.06798121, -0.06969187, -0.1634784 , ..., -0.08870255,
        -0.08289897, -0.07725935],
       [-0.0130691...722,  0.12174589],
       [ 0.03737577,  0.07600953, -0.04798564, ..., -0.08283926,
        -0.04644228,  0.04356348]])
        accuracies = {'arpack': [2.5e-15, 1e-10, 1e-10], 'lobpcg': [1e-10, 0.001, 10], 'propack': [1e-12, 1e-06, 0.0001]}
        accuracy   = 1e-10
        err        = <function SVDSCommonTests.test_svds_parameter_tol.<locals>.err at 0x7f127f03dd00>
        k          = 3
        n          = 100
        rng        = Generator(PCG64) at 0x7F127F002F80
        s          = array([3.57532733e-01, 1.27553875e-01, 1.19870964e-01, 1.15829350e-01,
       1.12545658e-01, 1.09114116e-01, 1.028058...2.69065672e-04, 1.72149810e-04, 1.65442589e-04,
       7.44147686e-05, 3.32524683e-05, 2.12349114e-06, 1.50715281e-07])
        self       = <scipy.sparse.linalg._eigen.tests.test_svds.Test_SVDS_LOBPCG object at 0x7f127f4efbd0>
        tol        = 0.0001
        tols       = [0.0001, 0.01, 1.0]
scipy/sparse/linalg/_eigen/tests/test_svds.py:295: in err
    with pytest.warns(UserWarning, match=msg):
E   Failed: DID NOT WARN. No warnings of type (<class 'UserWarning'>,) were emitted.
E   The list of emitted warnings is: [].
        A          = <100x100 sparse matrix of type '<class 'numpy.float64'>'
	with 6624 stored elements in Compressed Sparse Column format>
        _          = array([[-6.13645026e-02,  5.01581761e-02, -1.19162447e-01,
        -6.05824505e-02, -5.06680242e-02,  1.67840546e-02,
...-7.40271118e-02, -4.17605366e-02,
        -8.17292926e-02, -8.87025530e-02, -8.28989662e-02,
        -7.72593582e-02]])
        k          = 3
        msg        = 'Exited at iteration.*|Exited postprocessing with accuracies.*'
        n          = 100
        s          = array([3.57532733e-01, 1.27553875e-01, 1.19870964e-01, 1.15829350e-01,
       1.12545658e-01, 1.09114116e-01, 1.028058...2.69065672e-04, 1.72149810e-04, 1.65442589e-04,
       7.44147686e-05, 3.32524683e-05, 2.12349114e-06, 1.50715281e-07])
        s2         = array([0.11987096, 0.12755388, 0.35753273])
        self       = <scipy.sparse.linalg._eigen.tests.test_svds.Test_SVDS_LOBPCG object at 0x7f127f4efbd0>
        tol        = 0.0001
@tylerjereddy tylerjereddy added the maintenance Items related to regular maintenance tasks label Feb 28, 2024
@lucascolley lucascolley added this to the 1.13.0 milestone Feb 29, 2024
@tylerjereddy
Copy link
Contributor Author

Not sure why I'd be different from the CI though. Best guess would be OpenBLAS version maybe, I often pull that from spack. That has been stable locally for many months though.

Run-time dependency scipy-openblas found: NO (tried pkgconfig)
Run-time dependency openblas found: YES 0.3.23
Dependency openblas found: YES 0.3.23 (cached)

@rgommers rgommers changed the title MATIN, TST: test_svds_parameter_tol failures MAINT, TST: test_svds_parameter_tol failures Mar 2, 2024
@thalassemia
Copy link
Contributor

Thanks for pointing this out. In the linked PR, I enabled this test by deleting the following:

def test_svds_parameter_tol(self):
-    if self.solver == 'propack':
-        if not has_propack:
-            pytest.skip("PROPACK not available")
-    return  # TODO: needs work, disabling for now

This test was essentially skipped before my PR was merged, which explains why no failures were previously observed. Seems the comment I deleted still applies even though the test passes in CI.

@tylerjereddy
Copy link
Contributor Author

tylerjereddy commented Mar 6, 2024

In case something on my end could be fixed/avoided, here are more detailed OpenBLAS specs (both version reproduce):

-- linux-ubuntu22.04-skylake / gcc@11.3.0 -----------------------
openblas@0.3.23~bignuma~consistent_fpcsr+fortran~ilp64+locking+pic+shared build_system=makefile symbol_suffix=none threads=none  /home/treddy/github_projects/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/openblas-0.3.23-vtzm3if4lq22ni6ftwf6xkqrr2zolk3l
openblas@0.3.26~bignuma~consistent_fpcsr+fortran~ilp64+locking+pic+shared build_system=makefile symbol_suffix=none threads=none  /home/treddy/github_projects/spack/opt/spack/linux-ubuntu22.04-skylake/gcc-11.3.0/openblas-0.3.26-gt4gcloeyzvdir73ryejeuo34h6l4jfr

~ = disabled; + = feature enabled

@tylerjereddy
Copy link
Contributor Author

I reproduced locally with scipy-openblas32 0.3.23.293.2 as well:

  • python dev.py build -j 32 --with-scipy-openblas
  • python dev.py -n test -j 32 -t scipy/sparse/linalg/_eigen/tests/test_svds.py::Test_SVDS_LOBPCG -- --count=100

So, can't be my specific OpenBLAS config I don't think. How strange.

@rgommers
Copy link
Member

I'll note that the test says:

# loop instead of parametrize for simplicity

this is quite unhealthy, since svds draws random numbers internally. So enabling the propack tests changed the numpy.random state used for the LOBPCG tests probably, and that made the tests unstable.

@thalassemia
Copy link
Contributor

So enabling the propack tests changed the numpy.random state used for the LOBPCG tests probably, and that made the tests unstable.

I thought about this as well and tested it by hard-coding the random_state kwarg for svds to a few different values. With this change, the test either always passes or always fails. For example, 0 always passes on my Linux machine but always fails on my Mac, showing that the test is indeed very sensitive to the RNG state. I like the current random behavior because it allowed us to discover the huge variability in LOBPCG accuracy. In gh-20221, I loosened the tolerance checks enough for the test to consistently pass on Linux and macOS. However, as @lobpcg noted, the new LOBPCG tolerance checks are practically worthless because of how permissive they are. Maybe we could use them as a temporary stopgap while we investigate the source of this variability.

@rgommers
Copy link
Member

I like the current random behavior because it allowed us to discover the huge variability in LOBPCG accuracy

I think we kinda already knew that - but either way, that's for putting in an issue rather than keeping in the test suite. We have a hard rule for that, tests should use seed random number state so the tests are deterministic. Otherwise we get these random failures in unrelated PRs or elsewhere. With tens of thousands of tests in the test suite, that isn't maintainable.

I'd suggest removing the randomness, and then choosing a seed at which the more strict tolerances don't fail. And opening an issue with another seed where the accuracy is very bad.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Items related to regular maintenance tasks scipy.sparse.linalg
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants