PPC builds fail silently during test collection #186

Closed
h-vetinari opened this issue Aug 8, 2021 · 11 comments

@h-vetinari
Member

h-vetinari commented Aug 8, 2021

I had been quite happy that the PPC builds stopped timing out all the time with the switch from Travis to Azure, and didn't inspect further since the CI passed.

However, I have now seen that the test suite on PPC already fails during test collection, and the build then passes silently. There's at least a scipy issue involved (probably scipy/scipy#14560), but also a pytest (or conda-build) issue: why does this end with exit 0?!?

+ python -c 'import numpy; numpy._pytesttester._show_numpy_info()'
NumPy version 1.21.1
NumPy relaxed strides checking option: True
NumPy CPU features:  VSX VSX2 VSX3*
+ python -c 'import scipy; scipy.test(verbose=2, label='\''full'\'', extra_argv=['\''--durations=50'\''])'
============================= test session starts ==============================
platform linux -- Python 3.9.6, pytest-6.2.4, py-1.10.0, pluggy-0.13.1 -- $PREFIX/bin/python
cachedir: .pytest_cache
rootdir: $SRC_DIR
plugins: xdist-2.3.0, forked-1.3.0
collecting ... collected 46106 items / 1 error / 46105 selected

==================================== ERRORS ====================================
________________ ERROR collecting linalg/tests/test_sketches.py ________________
[...]/lib/python3.9/site-packages/scipy/linalg/tests/test_sketches.py:11: in <module>
    class TestClarksonWoodruffTransform:
        __builtins__ = <builtins>
        __cached__ = '/home/conda/feedstock_root/build_artifacts/[...]/lib/python3.9/site-packages/scipy/linalg/tests/__pycache__/test_sketches.cpython-39.pyc'
        __doc__    = 'Tests for _sketches.py.'
        __file__   = '/home/conda/feedstock_root/build_artifacts/[...]/lib/python3.9/site-packages/scipy/linalg/tests/test_sketches.py'
        __loader__ = <_pytest.assertion.rewrite.AssertionRewritingHook object at 0x4017326cd0>
        __name__   = 'scipy.linalg.tests.test_sketches'
        __package__ = 'scipy.linalg.tests'
        __spec__   = ModuleSpec(name='scipy.linalg.tests.test_sketches', loader=<_pytest.assertion.rewrite.AssertionRewritingHook object at [...]/lib/python3.9/site-packages/scipy/linalg/tests/test_sketches.py')
        assert_    = <function assert_ at 0x40178dcca0>
        assert_equal = <function assert_equal at 0x40178ef0d0>
        clarkson_woodruff_transform = <function clarkson_woodruff_transform at 0x4018c145e0>
        cwt_matrix = <function cwt_matrix at 0x4018c14550>
        issparse   = <function isspmatrix at 0x40179944c0>
        norm       = <function norm at 0x401b05e790>
        np         = <module 'numpy' from '/home/conda/feedstock_root/build_artifacts/[...]/lib/python3.9/site-packages/numpy/__init__.py'>
        rand       = <function rand at 0x4017a68ca0>
[...]/lib/python3.9/site-packages/scipy/linalg/tests/test_sketches.py:31: in TestClarksonWoodruffTransform
    A_csc = rand(
        A_dense    = array([[-7.62573313e-01, -4.94043289e-01,  1.99938367e+00, ...,
        -1.29091867e+00,  1.60923980e+00,  1.31264882e...  [-2.96092866e-02, -5.24315831e-02,  1.92306515e-01, ...,
        -1.05605338e+00, -1.00328328e+00,  8.67825559e-01]])
        __doc__    = '\n    Testing the Clarkson Woodruff Transform\n    '
        __module__ = 'scipy.linalg.tests.test_sketches'
        __qualname__ = 'TestClarksonWoodruffTransform'
        density    = 0.1
        n_cols     = 100
        n_rows     = 2000
        n_sketch_rows = 200
        rng        = RandomState(MT19937) at 0x401EF2B440
        seeds      = [1755490010, 934377150, 1391612830, 1752708722, 2008891431, 1302443994, ...]
[...]/lib/python3.9/site-packages/scipy/sparse/construct.py:868: in rand
    return random(m, n, density, format, dtype, random_state)
        density    = 0.1
        dtype      = None
        format     = 'csc'
        m          = 2000
        n          = 100
        random_state = RandomState(MT19937) at 0x401EF2B440
[...]/lib/python3.9/site-packages/scipy/sparse/construct.py:813: in random
    return coo_matrix((vals, (i, j)), shape=(m, n)).asformat(format,
        data_rvs   = functools.partial(<built-in method uniform of numpy.random.mtrand.RandomState object at 0x401ef2b440>, 0.0, 1.0)
        density    = 0.1
        dtype      = dtype('float64')
        format     = 'csc'
        i          = array([ 27873, 196467,   8474, ...,  37840,  24415,  88470], dtype=int32)
        ind        = array([ 27873, 196467,   8474, ...,  37840,  24415,  88470])
        j          = array([0, 0, 0, ..., 0, 0, 0], dtype=int32)
        k          = 20000
        m          = 2000
        mn         = 200000
        n          = 100
        random_state = RandomState(MT19937) at 0x401EF2B440
        tp         = <class 'numpy.int32'>
        vals       = array([0.23487199, 0.96013766, 0.70092071, ..., 0.48012122, 0.52902797,
       0.86194868])
[...]/lib/python3.9/site-packages/scipy/sparse/coo.py:196: in __init__
    self._check()
        M          = 2000
        N          = 100
        arg1       = (array([0.23487199, 0.96013766, 0.70092071, ..., 0.48012122, 0.52902797,
       0.86194868]), (array([ 27873, 196467,   8474, ...,  37840,  24415,  88470], dtype=int32), array([0, 0, 0, ..., 0, 0, 0], dtype=int32)))
        col        = array([0, 0, 0, ..., 0, 0, 0], dtype=int32)
        copy       = False
        dtype      = None
        idx_dtype  = <class 'numpy.int32'>
        obj        = array([0.23487199, 0.96013766, 0.70092071, ..., 0.48012122, 0.52902797,
       0.86194868])
        row        = array([ 27873, 196467,   8474, ...,  37840,  24415,  88470], dtype=int32)
        self       = <2000x100 sparse matrix of type '<class 'numpy.float64'>'
	with 20000 stored elements in COOrdinate format>
        shape      = (2000, 100)
[...]/lib/python3.9/site-packages/scipy/sparse/coo.py:283: in _check
    raise ValueError('row index exceeds matrix dimensions')
E   ValueError: row index exceeds matrix dimensions
        idx_dtype  = <class 'numpy.int32'>
        self       = <2000x100 sparse matrix of type '<class 'numpy.float64'>'
	with 20000 stored elements in COOrdinate format>
=========================== short test summary info ============================
ERROR linalg/tests/test_sketches.py - ValueError: row index exceeds matrix di...
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
========================= 1 error in 713.78s (0:11:53) =========================
+ exit 0
@h-vetinari
Member Author

I was sufficiently weirded out by this that I raised: pytest-dev/pytest#8986

@h-vetinari
Member Author

Duh, it's the scipy.test-wrapper. 🤦
Still, it's surprising IMO that the exit code is 0 even though collection raised an exception. Will probably patch later.
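For readers following along, here is a minimal sketch of why a wrapper call inside `python -c` can end in `exit 0`. It assumes the wrapper reports failure only through a boolean return value (as NumPy's `PytestTester`, which `scipy.test` mirrors, is understood to do). `run_tests` is a hypothetical stand-in, not the real scipy function.

```python
import subprocess
import sys

# Hypothetical stand-in for a test wrapper that signals failure by returning
# False instead of raising or exiting. scipy itself is not imported here.
FAKE_WRAPPER = "def run_tests():\n    return False\n"


def exit_code(snippet):
    """Run the stand-in wrapper plus `snippet` via `python -c`; return the exit code."""
    proc = subprocess.run([sys.executable, "-c", FAKE_WRAPPER + snippet])
    return proc.returncode


# The return value is discarded, so the interpreter exits 0 despite the failure.
ignored = exit_code("run_tests()")

# Mapping the boolean onto sys.exit makes the failure visible to the caller.
propagated = exit_code("import sys; sys.exit(0 if run_tests() else 1)")

print("ignored:", ignored, "propagated:", propagated)
```

Applied to the command in the log above, the same idea would mean wrapping the `scipy.test(...)` call in `sys.exit(0 if ... else 1)`, so that conda-build sees a non-zero status when collection or tests fail.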

@isuruf
Member

isuruf commented Oct 26, 2021

This is a QEMU bug and doesn't occur when testing on a native machine.

@h-vetinari
Member Author

> This is a QEMU bug and doesn't occur when testing on a native machine.

That's good to hear! Though we'd still effectively be flying blind in CI, i.e. we wouldn't catch actual regressions that are masked by this bug, no?

@isuruf
Member

isuruf commented Oct 26, 2021

> Though we'd still effectively be flying blind in CI, i.e. we wouldn't catch actual regressions that are masked by this bug, no?

Same is true for osx-arm64.

@h-vetinari
Member Author

> Same is true for osx-arm64.

That's a known limitation, and - correct me if I'm wrong - isn't there a daily job somewhere that tests newly built osx-arm packages?

I mean, don't get me wrong, if people are happy with untested PPC packages, I'm not going to stand in the way. It just makes me queasy to have no testing of these artefacts at all.

@isuruf
Member

isuruf commented Oct 26, 2021

> isn't there a daily job somewhere that tests newly built osx-arm packages?

Not anymore

@rgommers
Contributor

Re testing - there are no longer CI jobs for PPC on the main SciPy repo nor on https://github.com/MacPython/scipy-wheels/. I don't think it's necessarily conda-forge's job to run the full test suite because of that. Running tests on QEMU is super slow, so running a subset (e.g., just import tests of all submodules plus a couple of the fast submodule test suites like interpolate.test()) seems fine to me.
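A rough sketch of what "import tests of all submodules" could look like, using only the standard library. The helper name `import_public_submodules` is made up for illustration, and skipping underscore-prefixed names is an assumption about what counts as a public submodule. A real recipe might prefer an explicit list of submodules instead.

```python
import importlib
import pkgutil


def import_public_submodules(package_name):
    """Import each direct public submodule of `package_name`.

    Returns a dict mapping the names that failed to import to the repr of
    the exception raised. An empty dict means every import succeeded.
    """
    package = importlib.import_module(package_name)
    failures = {}
    prefix = package_name + "."
    for info in pkgutil.iter_modules(package.__path__, prefix=prefix):
        short_name = info.name.rsplit(".", 1)[-1]
        if short_name.startswith("_"):
            # Assumption: underscore-prefixed modules are private; skip them.
            continue
        try:
            importlib.import_module(info.name)
        except Exception as exc:
            failures[info.name] = repr(exc)
    return failures


# Demonstrated on a standard-library package so the sketch runs anywhere;
# in the feedstock the argument would presumably be "scipy".
print(import_public_submodules("json"))
```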

@h-vetinari
Member Author

It's easy to reduce to just import tests, with an obvious drop in coverage.

> (e.g., just import tests of all submodules plus a couple of the fast submodule test suites like interpolate.test())

Could you suggest some modules? In addition, their respective parts of the test suite must not use sparse matrices anywhere, lest they run into this QEMU bug (@isuruf, is there a bug report for this somewhere?)

@rgommers
Contributor

rgommers commented Oct 29, 2021

> Could you make a suggestion of modules

odr, misc, cluster, fft
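One way the suggested subset might be wired into a test command is sketched below. It assumes each listed submodule exposes a `test()` callable that returns True on success, like the `interpolate.test()` mentioned above. `failed_suites` and `main` are illustrative names, not part of scipy or the feedstock.

```python
import importlib
import sys


def failed_suites(suites):
    """Run each zero-argument callable; return the names whose result is falsy."""
    return [name for name, run in suites.items() if not run()]


def main():
    # Submodule names taken from the suggestion above. Assumes scipy is
    # installed and that each submodule has a `test()` returning a bool.
    names = ["odr", "misc", "cluster", "fft"]
    suites = {
        name: importlib.import_module(f"scipy.{name}").test for name in names
    }
    failed = failed_suites(suites)
    if failed:
        print("failed test suites:", ", ".join(failed), file=sys.stderr)
    return 1 if failed else 0
```

Ending the test command with `sys.exit(main())` would then turn any failing suite into a non-zero exit status, rather than relying on the return value being noticed.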

@h-vetinari
Member Author

This should now be fixed with the sys.exit stuff.
