BUG: fix sparse random int handling #9486

tylerjereddy · 2018-11-16T02:51:50Z

SciPy wheel builds have been failing for the last ten weeks, which has caused a number of issues to fly under the radar until my attempt to start the wheel build process for 1.2.0rc1 as described in MacPython/scipy-wheels#37.

This PR targets perhaps the most worrisome of the failures, in scipy.sparse.tests.test_construct::TestConstructUtils::test_random_sampling.

I was able to reproduce one of the several failing wheel build matrix entries locally by using numpy==1.8.2, Cython==0.29.0, Python==2.7, and SciPy master branch.

The changes made in this PR were sufficient to patch that specific test failure locally, but that doesn't mean we actually want to fix it this way, and I'm not sure how imposing this fix is after the branch has happened--maybe not too cumbersome yet without a tag having been pushed.

The original compatibility layer addition was quite recent -- #9048 by @jor-.

rgommers · 2018-11-16T04:39:18Z

Adding the test failure for clarity:

___________________ TestConstructUtils.test_random_sampling ____________________
[gw1] linux -- Python 3.4.3 /venv/bin/python

self = <scipy.sparse.tests.test_construct.TestConstructUtils object at 0x7f21ec1dd5f8>

    def test_random_sampling(self):
        # Simple sanity checks for sparse random sampling.
        for f in sprand, _sprandn:
            for t in [np.float32, np.float64, np.longdouble,
                      np.int32, np.int64, np.complex64, np.complex128]:
>               x = f(5, 10, density=0.1, dtype=t)

f          = <function rand at 0x7f22047bf9d8>
self       = <scipy.sparse.tests.test_construct.TestConstructUtils object at 0x7f21ec1dd5f8>
t          = <class 'numpy.int32'>
x          = <5x10 sparse matrix of type '<class 'numpy.float128'>'
	with 5 stored elements in COOrdinate format>

/venv/lib/python3.4/site-packages/scipy/sparse/tests/test_construct.py:428: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/venv/lib/python3.4/site-packages/scipy/sparse/construct.py:842: in rand
    return random(m, n, density, format, dtype, random_state)
/venv/lib/python3.4/site-packages/scipy/sparse/construct.py:792: in random
    vals = data_rvs(k).astype(dtype, copy=False)
/venv/lib/python3.4/site-packages/scipy/sparse/construct.py:781: in data_rvs
    n, dtype=dtype)
/venv/lib/python3.4/site-packages/scipy/_lib/_numpy_compat.py:120: in randint_patched
    integers = random_state.randint(low, high=high, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   ValueError: low >= high

Lock       = <built-in function allocate_lock>
RandomState = <class 'mtrand.RandomState'>
__builtins__ = <builtins>
__doc__    = None
__file__   = '/venv/lib/python3.4/site-packages/numpy/random/mtrand.cpython-34m.so'
__loader__ = <_frozen_importlib.ExtensionFileLoader object at 0x7f22052cbd30>
__name__   = 'numpy.random.mtrand'
__package__ = 'numpy.random'
__spec__   = ModuleSpec(name='numpy.random.mtrand', loader=<_frozen_importlib.ExtensionFileLoader object at 0x7f22052cbd30>, origin='/venv/lib/python3.4/site-packages/numpy/random/mtrand.cpython-34m.so')
__test__   = {'RandomState.binomial (line 3546)': '\n        binomial(n, p, size=None)\n\n        Draw samples from a binomial dist...0.3])\n        array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'],\n              dtype='|S11')\n\n        ", ...}
_rand      = <mtrand.RandomState object at 0x7f22052d8c18>
_shape_from_size = <built-in function _shape_from_size>
beta       = <built-in method beta of mtrand.RandomState object at 0x7f22052d8c18>
binomial   = <built-in method binomial of mtrand.RandomState object at 0x7f22052d8c18>
... <etc>
zipf       = <built-in method zipf of mtrand.RandomState object at 0x7f22052d8c18>

mtrand.pyx:941: ValueError
=========================== short test summary info ============================
FAIL sparse/tests/test_construct.py::TestConstructUtils::test_random_sampling

rgommers · 2018-11-16T04:49:05Z

I can't say I get the logic of either the original code or the fix, it's unclear to me what the options are for *args, **kwargs. Deserves some more comments.

Note that if this takes a while to resolve, you could also consider reverting that PR in 1.2.x. The problem seems to be for numpy <1.11.0 only, which we're dropping in master very soon.

tylerjereddy · 2018-11-17T01:17:21Z

randint didn't have the dtype argument for the older NumPy versions with a compat layer here.

--- a/scipy/_lib/_numpy_compat.py
+++ b/scipy/_lib/_numpy_compat.py
@@ -95,6 +95,11 @@ else:
     # method of RandomState does only work with int32 values.
     def get_randint(random_state):
         def randint_patched(*args, **kwargs):
+            # args should be: low, high, size
+            # kwargs should only really contain
+            # dtype argument not available in older
+            # NumPy versions, which couldn't handle
+            # int64
             try:
                 low = args[0]
                 size = args[2]

The patched function is probably a bit awkward though, and may be important to check which of its codepaths are actually tested locally.

The top of _numpy_compat.py has a docstring with Functions copypasted from newer versions of numpy, but I think the random stuff is in Cython so we'd likely need to add in a compiled compatibility layer if we don't already have one (and only if it doesn't branch out too much to other parts of NumPy, which it might). I suspect @rkern would agree this is asking for trouble / maintenance issues, which is probably why we can't just copy/paste here.

perimosocordiae · 2018-11-18T14:23:58Z

This shim is only used in one place right now, in scipy/sparse/construct.py:

        if np.issubdtype(dtype, np.integer):
            randint = get_randint(random_state)

            def data_rvs(n):
                return randint(np.iinfo(dtype).min, np.iinfo(dtype).max,
                               n, dtype=dtype)

Since the shim will be obsolete once the minimum numpy version is increased, I think it's sufficient to patch it here to get tests running, without worrying too much about having all the corner cases checked. In particular, we know that both low and high are not None, and that high doesn't show up in the **kwargs, so a lot of the shim's logic isn't exercised.

scipy/_lib/_numpy_compat.py

tylerjereddy · 2018-11-19T20:35:47Z

Revised based on revisions requested from CJ, though I don't have immediate access to the local test environment originally used for reproduction since I'm at BIDS at the moment.

* TestConstructUtils::test_random_sampling fails in some SciPy wheel build scenarios because of incorrect handling of np.int32/64 limits in the compatibility layer; this commit attempts to fix that behavior

tylerjereddy · 2018-11-20T17:52:58Z

I reproduced the original test failure locally on the latest master branch (04c0e5f) using above-described dependencies & then confirmed that the test passes on the revised version of this PR branch (f9ad4bd) in the same environment.

CI is all-green too. If there's still too much concern about the shim after the simplifications above then I should probably consider the PR reversion.

rgommers · 2018-11-21T07:08:41Z

LGTM now, merged. Thanks @tylerjereddy and @perimosocordiae

tylerjereddy · 2018-11-21T18:14:01Z

Unfortunately we're still getting a related wheel build failure with OverflowError: Python int too large to convert to C long.

It looks like we really do have to ensure that all arguments to randint are within np.int32 bounds for the old NumPy version. Do we want to try attacking that with another PR, reverting the functionality, or xfail for old NumPy?

rgommers · 2018-11-21T23:12:30Z

Out of all distributions available, it's only randint that fails right, all the other distributions do work? If so, I'd suggest to either skip testing for randint or xfail the test on 1.2.x.

* backports for scipygh-9486, scipygh-9494, scipygh-9512, scipygh-9507, scipygh-9526

jor- · 2018-11-28T10:51:28Z

#9550 might fix the problem with randint and old NumPy versions.

tylerjereddy added defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.sparse backport-candidate This fix should be ported by a maintainer to previous SciPy versions. labels Nov 16, 2018

perimosocordiae reviewed Nov 18, 2018

View reviewed changes

scipy/_lib/_numpy_compat.py Outdated Show resolved Hide resolved

tylerjereddy force-pushed the fix_sparse_int_compat branch from 028a06a to 6a9b110 Compare November 19, 2018 20:32

BUG: fix sparse random int handling

f9ad4bd

* TestConstructUtils::test_random_sampling fails in some SciPy wheel build scenarios because of incorrect handling of np.int32/64 limits in the compatibility layer; this commit attempts to fix that behavior

tylerjereddy force-pushed the fix_sparse_int_compat branch from 6a9b110 to f9ad4bd Compare November 19, 2018 21:48

tylerjereddy changed the title ~~WIP, BUG: fix sparse random int handling~~ BUG: fix sparse random int handling Nov 20, 2018

rgommers added this to the 1.2.0 milestone Nov 21, 2018

rgommers merged commit b635563 into scipy:master Nov 21, 2018

tylerjereddy deleted the fix_sparse_int_compat branch November 21, 2018 18:09

tylerjereddy added a commit to tylerjereddy/scipy that referenced this pull request Nov 25, 2018

MAINT: multiple backports for 1.2.x

f3e7a41

* backports for scipygh-9486, scipygh-9494, scipygh-9512, scipygh-9507, scipygh-9526

tylerjereddy mentioned this pull request Nov 25, 2018

MAINT: multiple backports for 1.2.x #9534

Merged

jor- mentioned this pull request Nov 28, 2018

BUG: scipy/_lib/_numpy_compat: get_randint #9550

Merged

tylerjereddy removed the backport-candidate This fix should be ported by a maintainer to previous SciPy versions. label Feb 1, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: fix sparse random int handling #9486

BUG: fix sparse random int handling #9486

tylerjereddy commented Nov 16, 2018

rgommers commented Nov 16, 2018

rgommers commented Nov 16, 2018

tylerjereddy commented Nov 17, 2018 •

edited

perimosocordiae commented Nov 18, 2018

tylerjereddy commented Nov 19, 2018

tylerjereddy commented Nov 20, 2018

rgommers commented Nov 21, 2018

tylerjereddy commented Nov 21, 2018

rgommers commented Nov 21, 2018

jor- commented Nov 28, 2018

BUG: fix sparse random int handling #9486

BUG: fix sparse random int handling #9486

Conversation

tylerjereddy commented Nov 16, 2018

rgommers commented Nov 16, 2018

rgommers commented Nov 16, 2018

tylerjereddy commented Nov 17, 2018 • edited

perimosocordiae commented Nov 18, 2018

tylerjereddy commented Nov 19, 2018

tylerjereddy commented Nov 20, 2018

rgommers commented Nov 21, 2018

tylerjereddy commented Nov 21, 2018

rgommers commented Nov 21, 2018

jor- commented Nov 28, 2018

tylerjereddy commented Nov 17, 2018 •

edited