
ENH: _lib._util: make _lazywhere compatible with Array API #18930

Merged
merged 9 commits on Sep 6, 2023

Conversation

mdhaber
Contributor

@mdhaber mdhaber commented Jul 21, 2023

Reference issue

Along the lines of gh-18668

What does this implement/fix?

This makes scipy._lib._util compatible with the array API. Also:

  • The output dtype is now determined based on the types required by the outputs (rather than the types of the inputs).
  • The input validation is simplified.
  • ~cond is now computed only once.
  • np.empty is now used to initialize the output array when f2 is specified (instead of np.full).
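(Sketch for illustration — not the PR's actual code. The bullets above can be condensed into a NumPy-only sketch; in the real helper, `np` would be the `xp` namespace returned by `array_namespace`, and `.astype` would be `xp.astype`.)

```python
import numpy as np

def lazywhere_sketch(cond, arrays, f, fillvalue=None, f2=None):
    # simplified input validation: exactly one of fillvalue / f2
    if (fillvalue is None) == (f2 is None):
        raise ValueError("Exactly one of `fillvalue` or `f2` must be given.")

    args = np.broadcast_arrays(cond, *arrays)
    cond, arrays = args[0].astype(bool), args[1:]
    ncond = ~cond  # ~cond computed only once

    # output dtype is determined by what the branches return,
    # not by the input dtypes
    temp1 = np.asarray(f(*(arr[cond] for arr in arrays)))
    if f2 is None:
        dtype = np.result_type(temp1, fillvalue)
        out = np.full(cond.shape, fill_value=fillvalue, dtype=dtype)
    else:
        temp2 = np.asarray(f2(*(arr[ncond] for arr in arrays)))
        dtype = np.result_type(temp1, temp2)
        out = np.empty(cond.shape, dtype=dtype)  # empty, not full, when f2 is given
        out[ncond] = temp2
    out[cond] = temp1
    return out
```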

For small (NumPy) arrays (<10k elements), this version tends to be faster. For moderate arrays (say 1 million elements), the execution time depends a lot on cond. Its distribution is spread out over more than an order of magnitude, but it is slightly (<6%) slower on average (gmean) with this PR. I'd be happy to find out that this can be improved, but I've studied it, and the difference seems to be accounted for by the use of array API-compatible features. Fortunately, it seems to asymptote to <7% slower as the array size increases.

Temporary memory is increased, but that seems necessary if we want to determine the output type based on what is going to be output (rather than the type of the input).

Additional information

Todo:

  • unit tests (especially with array API array)
  • When inputs are scalars, _lazywhere should return a scalar rather than 0D array. Is there a canonical way to determine the difference between the two (since the size, shape, and ndim are the same)? np.isscalar, of course
  • The output is an ndarray even when the input is an array_api.Array. That's because xp.full and xp.empty create ndarrays. I'm assuming that's supposed to happen for now?
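(On the scalar-vs-0-d question in the todo list: np.isscalar does distinguish the two for NumPy, even though size, shape, and ndim do not.)

```python
import numpy as np

# Python floats and NumPy scalars are scalars; 0-d arrays are not,
# even though their size, shape, and ndim match a scalar's.
assert np.isscalar(1.5)
assert np.isscalar(np.float64(1.5))
assert not np.isscalar(np.asarray(1.5))      # 0-d array
assert np.asarray(1.5).shape == () and np.asarray(1.5).ndim == 0
```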

@mdhaber mdhaber added enhancement A new feature or improvement scipy._lib labels Jul 21, 2023
@mdhaber mdhaber requested a review from tupui July 21, 2023 21:46
@mdhaber
Contributor Author

mdhaber commented Jul 23, 2023

Once gh-18927 merges, I'll add this test:

class TestLazywhere:
    arrays = npst.arrays(dtype=npst.floating_dtypes(), shape=npst.array_shapes())
    fillvalue = npst.arrays(dtype=npst.floating_dtypes(), shape=tuple())
    p = strategies.floats(min_value=0, max_value=1)

    @array_api_compatible
    @given(arrays=arrays, fillvalue=fillvalue, p=p)
    def test_basic(self, arrays, fillvalue, p, xp):
        arrays = xp.asarray(arrays)
        fillvalue = xp.asarray(fillvalue)

        def f(*args):
            return args[0]

        def f2(*args):
            return args[0]*0.5

        if arrays.ndim <= 1:
            arrays = (arrays,)

        rng = np.random.default_rng(84268954369357456)
        cond = xp.asarray(rng.random(size=arrays[0].shape) > p)

        res = _lazywhere(cond, arrays, f, fillvalue)
        ref = xp.where(cond, f(*arrays), fillvalue)
        assert_equal(res, ref)

        res = _lazywhere(cond, arrays, f, f2=f2)
        ref = xp.where(cond, f(*arrays), f2(*arrays))
        assert_equal(res, ref)

@tupui is this how you'd use @array_api_compatible here?

Note that https://scipy.github.io/devdocs/dev/api-dev/array_api.html doesn't say where to import that decorator from.
This PR will add

  from scipy.conftest import array_api_compatible
  ...

to the top of the example on that page.

I'm not going to add a test of the input validation. It was there, so I preserved it, but I would rather remove it than test it.

@tupui
Member

tupui commented Jul 24, 2023

@tupui is this how you'd use @array_api_compatible here?

Yes 👍

Maybe np.where should also be converted to xp.where.

doesn't say where to import that decorator from.
This PR will add

from scipy.conftest import array_api_compatible
...
to the top of the example on that page.

Good catch 👍 Yes this is the correct import.

Contributor

@tylerjereddy tylerjereddy left a comment


I just played around with this branch + copy/paste of the test mentioned in case the feedback is helpful.

Testing with torch: python dev.py test -b pytorch -t scipy/stats/tests/test_distributions.py::TestLazywhere produces error below the fold:

_________________________________________________________________________________________________________ TestLazywhere.test_basic[torch] _________________________________________________________________________________________________________
scipy/stats/tests/test_distributions.py:9455: in test_basic
    @given(arrays=arrays, fillvalue=fillvalue, p=p)
        f          = <function given.<locals>.run_test_as_given.<locals>.wrapped_test at 0x7fe19fb620c0>
        self       = <scipy.stats.tests.test_distributions.TestLazywhere object at 0x7fe19fa37610>
        xp         = <module 'torch' from '/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/torch/__init__.py'>
scipy/stats/tests/test_distributions.py:9472: in test_basic
    res = _lazywhere(cond, arrays, f, fillvalue)
        arrays     = (tensor([0.], dtype=torch.float16),)
        cond       = array([ True])
        f          = <function TestLazywhere.test_basic.<locals>.f at 0x7fe19fbcb4c0>
        f2         = <function TestLazywhere.test_basic.<locals>.f2 at 0x7fe19fbc9d00>
        fillvalue  = tensor(0., dtype=torch.float16)
        p          = 0.0
        rng        = Generator(PCG64) at 0x7FE19FC06EA0
        self       = <scipy.stats.tests.test_distributions.TestLazywhere object at 0x7fe19fa37610>
        xp         = <module 'torch' from '/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/torch/__init__.py'>
scipy/_lib/_util.py:87: in _lazywhere
    xp = array_namespace(cond, *arrays)
        arrays     = (tensor([0.], dtype=torch.float16),)
        cond       = array([ True])
        f          = <function TestLazywhere.test_basic.<locals>.f at 0x7fe19fbcb4c0>
        f2         = None
        fillvalue  = tensor(0., dtype=torch.float16)
scipy/_lib/_array_api.py:95: in array_namespace
    return array_api_compat.array_namespace(*arrays)
        arrays     = [array([ True]), tensor([0.], dtype=torch.float16)]
scipy/_lib/array_api_compat/array_api_compat/common/_helpers.py:110: in array_namespace
    raise TypeError(f"Multiple namespaces for array inputs: {namespaces}")
E   TypeError: Multiple namespaces for array inputs: {<module 'scipy._lib.array_api_compat.array_api_compat.numpy' from '/home/treddy/github_projects/scipy/build-install/lib/python3.11/site-packages/scipy/_lib/array_api_compat/array_api_compat/numpy/__init__.py'>, <module 'scipy._lib.array_api_compat.array_api_compat.torch' from '/home/treddy/github_projects/scipy/build-install/lib/python3.11/site-packages/scipy/_lib/array_api_compat/array_api_compat/torch/__init__.py'>}
E   Falsifying example: test_basic(
E       self=<scipy.stats.tests.test_distributions.TestLazywhere object at 0x7fe19fa37610>,
E       xp=<module 'torch' from '/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/torch/__init__.py'>,
E       arrays=array([0.], dtype=float16),
E       fillvalue=array(0., dtype=float16),
E       p=0.0,
E   )

I get an almost identical error from testing with CuPy (python dev.py test -b cupy -t scipy/stats/tests/test_distributions.py::TestLazywhere).

Looks like, in your draft test, cond from np.random... produces a NumPy array, so adding cond = xp.asarray(cond) seems to help a bit in my hands. I do note that mixing Hypothesis in here does make things a bit tricky on top of the array API. Testing still fails on the device (with CuPy) even with that patch, because we can't just use assert_equal(res, ref) with host-device transfers. I haven't checked whether we've merged an official way to do that just yet; I know I did draft some ideas.

@mdhaber
Contributor Author

mdhaber commented Jul 26, 2023

Thanks. Yes, I planned to test this once the Hypothesis PR merges, but it looks like adding xp.asarray to the cond variable will save me some trouble.

I do note that mixing hypothesis in here does make things a bit tricky on top of array API. Testing still fails on the device (with CuPy) with that patch because we can't just use assert_equal(res, ref) with host-device transfers.

Is the problem inherent to Hypothesis, or is it that np.testing.assert_equal doesn't work for arbitrary array types?

@tupui
Member

tupui commented Jul 26, 2023

Testing still fails on the device (with CuPy) with that patch because we can't just use assert_equal(res, ref) with host-device transfers. I haven't checked if we've merged an official way to do that just yet, I know I did draft some ideas.

My PR incorporated your code Tyler:

if "cupy" in xp.__name__:
    import cupy as cp
    cp.testing.assert_allclose(...)
else:
    np.testing.assert_allclose(...)

For the fft PR, I actually asked @lucascolley to make a standalone helper function that we can use elsewhere more easily.

It's a bit unfortunate that there are no testing bits in the standard itself. I feel like they would be helpful. Maybe we can push for this at least in the compat library.

@tylerjereddy
Contributor

tylerjereddy commented Jul 26, 2023

Edit: Pamphile clarified we already have some device testing shims

For torch I also noticed this, which you'll see when you start iterating I suppose:

scipy/_lib/_util.py:93: in _lazywhere
    cond, arrays = args[0].astype(bool, copy=False), args[1:]
E   AttributeError: 'Tensor' object has no attribute 'astype'

I was able to patch around that with:

--- a/scipy/_lib/_util.py
+++ b/scipy/_lib/_util.py
@@ -90,7 +90,7 @@ def _lazywhere(cond, arrays, f, fillvalue=None, f2=None):
         raise ValueError("Exactly one of `fillvalue` or `f2` must be given.")
 
     args = xp.broadcast_arrays(cond, *arrays)
-    cond, arrays = args[0].astype(bool, copy=False), args[1:]
+    cond, arrays = xp.astype(args[0], xp.bool, copy=False), args[1:]
 
     temp1 = f(*(arr[cond] for arr in arrays))

But then hypothesis started doing stuff that seemed hard for the torch control flow to handle:

scipy/stats/tests/test_distributions.py:9455: in test_basic
    @given(arrays=arrays, fillvalue=fillvalue, p=p)
        f          = <function given.<locals>.run_test_as_given.<locals>.wrapped_test at 0x7f1018bee0c0>
        self       = <scipy.stats.tests.test_distributions.TestLazywhere object at 0x7f1018ac7410>
        xp         = <module 'torch' from '/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/torch/__init__.py'>
scipy/stats/tests/test_distributions.py:9458: in test_basic
    fillvalue = xp.asarray(fillvalue)
E   ValueError: given numpy array has byte order different from the native byte order. Conversion between byte orders is currently not supported.
E   Falsifying example: test_basic(
E       self=<scipy.stats.tests.test_distributions.TestLazywhere object at 0x7f1018ac7410>,
E       xp=<module 'torch' from '/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/torch/__init__.py'>,
E       arrays=array([0.], dtype=float16),
E       fillvalue=array(0., dtype='>f2'),
E       p=0.0,
E   )

@mdhaber
Contributor Author

mdhaber commented Jul 26, 2023

I see astype is a function in the Array API, not a method. Got it.

@tylerjereddy
Contributor

astype is in the array API. Shouldn't array_api_compat be taking care of that @tupui?

I think it is with the small patch I added

@tupui
Member

tupui commented Jul 26, 2023

I see astype is a function in the Array API, not a method. Got it.

Yes that's the main design change, everything is a function instead of a method in the Array API standard.
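(For illustration — NumPy stands in for xp here. In the PR itself, array_api_compat supplies xp.astype for namespaces that lack a function form; the wrapper below is a hypothetical stand-in, since older NumPy has no np.astype.)

```python
import numpy as np

x = np.asarray([1.5, 2.5])

# NumPy's historical method form:
a = x.astype(np.int64)

# the array API standard's function form, sketched as a plain wrapper:
def astype(arr, dtype, copy=True):
    # function-style astype, delegating to the ndarray method
    return arr.astype(dtype, copy=copy)

b = astype(x, np.int64)
```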

@lucascolley
Member

lucascolley commented Jul 26, 2023

For the fft PR, I actually asked @lucascolley to make a standalone helper function that we can use elsewhere more easily.

For my PR (which will appear on the main repo soon) this has been added to conftest.py:

def set_assert_allclose(xp=None):
    if xp is None:
        return npt.assert_allclose
    if 'cupy' in xp.__name__:
        import cupy as cp
        return cp.testing.assert_allclose
    return npt.assert_allclose
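Usage would look something like this (restating the helper so the snippet is self-contained; the CuPy branch only triggers when xp is a CuPy namespace):

```python
import numpy as np
import numpy.testing as npt

def set_assert_allclose(xp=None):
    # pick the assertion appropriate for the array namespace
    if xp is None:
        return npt.assert_allclose
    if 'cupy' in xp.__name__:
        import cupy as cp
        return cp.testing.assert_allclose
    return npt.assert_allclose

assert_allclose = set_assert_allclose(np)  # NumPy -> npt.assert_allclose
assert_allclose([1.0, 2.0], [1.0, 2.0])
```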

@lucascolley
Member

That's because xp.full and xp.empty create ndarrays.

Is this true? With xp == numpy.array_api they seem to create numpy.array_api arrays for me:

In [1]: import numpy as np

In [2]: import numpy.array_api as npa
<ipython-input-2-47d1a2ee1bbf>:1: UserWarning: The numpy.array_api submodule is still experimental. See NEP 47.
  import numpy.array_api as npa

In [3]: a = npa.asarray([1, 2, 3, 4])

In [4]: cond = a > 2

In [5]: isinstance(cond, npa._array_object.Array)
Out[5]: True

In [6]: isinstance(cond, np.ndarray)
Out[6]: False

In [7]: x = npa.full(cond.shape, fill_value=npa.nan)

In [8]: isinstance(x, npa._array_object.Array)
Out[8]: True

In [9]: isinstance(x, np.ndarray)
Out[9]: False

Perhaps xp is not being set correctly?

@mdhaber
Contributor Author

mdhaber commented Jul 27, 2023

I don't think xp is set to numpy.array_api. IIUC, for now we are using array_api_compat.numpy on purpose.

@lucascolley
Member

lucascolley commented Jul 27, 2023

Very sorry if I am misunderstanding, but I think that xp should be set to numpy.array_api as long as the global switch SCIPY_ARRAY_API is not False.

The first line of _lazywhere is xp = array_namespace(cond, *arrays), which calls array_namespace in _array_api.py:

if not _GLOBAL_CONFIG["SCIPY_ARRAY_API"]:
    # here we could wrap the namespace if needed
    return array_api_compat_numpy

arrays = [array for array in arrays if array is not None]

compliance_scipy(*arrays)

return array_api_compat.array_namespace(*arrays)

Which in turn calls array_namespace from array_api_compat:

namespaces = set()
for x in xs:
    if isinstance(x, (tuple, list)):
        namespaces.add(array_namespace(*x, _use_compat=_use_compat))
    elif hasattr(x, '__array_namespace__'):
        namespaces.add(x.__array_namespace__(api_version=api_version))
    elif _is_numpy_array(x):
        _check_api_version(api_version)
        if _use_compat:
            from .. import numpy as numpy_namespace
            namespaces.add(numpy_namespace)
        else:
            import numpy as np
            namespaces.add(np)
    # ...

if not namespaces:
    raise TypeError("Unrecognized array input")

if len(namespaces) != 1:
    raise TypeError(f"Multiple namespaces for array inputs: {namespaces}")

xp, = namespaces

return xp

Since hasattr(x, '__array_namespace__') is True when x is a numpy.array_api._array_object.Array, numpy.array_api is returned by this function.

Am I missing something? It seems to me that xp would only be array_api_compat.numpy for a numpy.array_api._array_object.Array input if SCIPY_ARRAY_API was set to False, which shouldn't be the case whenever -b is used with python dev.py test.
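(The dispatch logic quoted above can be boiled down to a standalone sketch; NumPy-only here, and the real compat code returns array_api_compat.numpy rather than bare numpy for ndarray inputs.)

```python
import numpy as np

def namespace_sketch(*xs):
    # condensed version of array_api_compat's dispatch
    namespaces = set()
    for x in xs:
        if hasattr(x, '__array_namespace__'):
            # array-API-native objects report their own namespace
            namespaces.add(x.__array_namespace__())
        elif isinstance(x, np.ndarray):
            namespaces.add(np)
        else:
            raise TypeError("Unrecognized array input")
    if len(namespaces) != 1:
        raise TypeError(f"Multiple namespaces for array inputs: {namespaces}")
    xp, = namespaces
    return xp
```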

@tupui
Member

tupui commented Jul 27, 2023

Very sorry if I am misunderstanding, but I think that xp should be set to numpy.array_api as long as the global switch SCIPY_ARRAY_API is not False.

array_api_compat.numpy is returned when the flag is False. This is so that normal NumPy arrays are made compatible with the standard. Hence we can start adding compatibility.

When the flag is on, the namespace we get back comes from the function array_api_compat.array_namespace. Here it's making the decision for NumPy to give us back the portable version, i.e. array_api_compat.numpy as well.

See here:

if not _GLOBAL_CONFIG["SCIPY_ARRAY_API"]:
    # here we could wrap the namespace if needed
    return array_api_compat_numpy

arrays = [array for array in arrays if array is not None]
compliance_scipy(*arrays)
return array_api_compat.array_namespace(*arrays)

@lucascolley
Member

Here it's making the decision for NumPy to give us back the portable version, i.e. array_api_compat.numpy as well.

I don't agree! As stated in my last comment:

Since hasattr(x, '__array_namespace__') is True when x is a numpy.array_api._array_object.Array, numpy.array_api is returned by this function.

This matches the behaviour I see in the python interpreter:

In [1]: import numpy.array_api as npa
<ipython-input-1-47d1a2ee1bbf>:1: UserWarning: The numpy.array_api submodule is still experimental. See NEP 47.
  import numpy.array_api as npa

In [2]: from scipy._lib.array_api_compat.array_api_compat import array_namespace

In [3]: x = npa.asarray([1, 2, 3, 4])

In [4]: xp = array_namespace(x)

In [5]: isinstance(x, npa._array_object.Array)
Out[5]: True

In [6]: xp
Out[6]: <module 'numpy.array_api' from '/home/lucas/mambaforge/envs/scipy-dev/lib/python3.10/site-packages/numpy/array_api/__init__.py'>

In [7]: xp.__name__
Out[7]: 'numpy.array_api'

@tupui
Member

tupui commented Jul 27, 2023

from scipy._lib.array_api_compat.array_api_compat import array_namespace

Here you are actually calling array_api_compat.array_namespace. But we have our own:

from scipy._lib._array_api import array_namespace


EDIT: oh I think I see what you mean. Let me think about that.

@tupui
Member

tupui commented Jul 27, 2023

Ok yes, so array_api_compat.array_namespace is indeed going to return numpy.array_api if you pass in numpy.array_api arrays. This is a pass-through that allows checking against the standard, so there is no wrapping. Internally, the compat library looks for __array_namespace__ and, since it finds it on numpy.array_api arrays, it returns that namespace directly.

@lucascolley
Member

Yep that's it. Matt, was this with the SCIPY_ARRAY_API flag set or not?

The output is an ndarray even when the input is an array_api.Array. That's because xp.full and xp.empty create ndarrays.

@mdhaber
Contributor Author

mdhaber commented Jul 28, 2023

Yeah, not set.

scipy/_lib/_util.py (outdated; resolved)
scipy/_lib/tests/test__util.py (outdated; resolved)
Comment on lines 396 to 397
assert_equal(np.asarray(res1), ref1)
assert_equal(np.asarray(res2), ref2)
Contributor Author


Suggested change:

-    assert_equal(np.asarray(res1), ref1)
-    assert_equal(np.asarray(res2), ref2)
+    assert_equal(np.asarray(res1), ref1)
+    assert_equal(res1.shape, ref1.shape)
+    assert_equal(np.asarray(res2), ref2)
+    assert_equal(res2.shape, ref2.shape)

Member


A set_assert_equal utility (like set_assert_allclose discussed in the special PR) could be useful here, to allow us to use the equivalent from the other libraries we want to test with.

Contributor Author

@mdhaber mdhaber Aug 21, 2023


I meant to add a comment in the code about that to preempt this comment. But let's avoid scope creep - when set_assert_equal exists, we can use it, but that's not part of this PR. Same with the isinstance check below.

Contributor Author

@mdhaber mdhaber Aug 21, 2023


That said, if you'd like to submit a PR for set_assert_equal (and perhaps set_assert_allclose separate from gh-19005) I would be willing to review it. Rather than all these separate functions, though, why not something like an array_testing_namespace?

Member


Okay, I may be able to submit that PR in the next few days 👍 but please don't wait for me if this is ready to go in. In its current state, I think that you will need to add @skip_if_array_api_gpu, since np.asarray fails for CuPy arrays.

Contributor Author

@mdhaber mdhaber Aug 24, 2023


Sure.

Member


@mdhaber would you mind elaborating a bit on how an array_testing_namespace would work? Do we want testing functions where we pass in xp as a keyword, or do we keep the set_...(xp) structure? Or something else?

Contributor Author


As array_namespace accepts arrays and returns a namespace with the appropriate Array API functions, array_testing_namespace would accept arrays and return a namespace with the appropriate testing functions. As array_namespace uses array_api_compat to give a consistent signature to existing functions, so array_testing_namespace would give a consistent signature to the various existing testing functions.

There are really not many array comparison testing functions to be handled. I really only need one to assess whether arrays are close (assert_allclose), equal (assert_equal or assert_array_equal, which don't seem different enough to exist separately aside from backward compatibility reasons), or ordered (assert_array_less).

So on second thought, rather than accepting arrays and returning a namespace, it might be better to just write three functions (e.g. assert_equal, assert_close, and assert_less) that accept two arrays, each of any type, and do the comparison as appropriate to the devices the arrays live on, whatever that means. I assume that for two arrays on the GPU we would use CuPy's functions, two arrays accessible to the CPU would use NumPy or PyTorch test functions, and arrays split between the devices would move the data to one or the other before comparison.
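(A host-side sketch of one such function — hypothetical, not the helper that was merged; real GPU arrays would need an explicit device-to-host transfer such as cp.asnumpy before comparing.)

```python
import numpy as np

def assert_equal_any(actual, desired):
    # coerce both results to host NumPy arrays, then compare values
    # exactly; a shape mismatch also fails here
    actual = np.asarray(actual)
    desired = np.asarray(desired)
    np.testing.assert_array_equal(actual, desired)
```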

Member


That sounds good to me - agreed that we only need those couple of functions.

Contributor Author


I think this suggestion is resolved, but I'll leave it open for others to refer to.

scipy/_lib/tests/test__util.py (outdated; resolved)
@mdhaber mdhaber marked this pull request as ready for review August 21, 2023 21:16
data = strategies.data()

@pytest.mark.filterwarnings('ignore::RuntimeWarning') # overflows, etc.
@array_api_compatible
Contributor Author

@mdhaber mdhaber Aug 21, 2023


Suggested change:

-    @array_api_compatible
+    @skip_if_array_api_gpu
+    @array_api_compatible

Consider adding per #18930 (comment) (with required import at the top).

Alternatively, since array API behavior is opt-in and CI doesn't use GPUs right now, leave as-is. A failure in the future will remind us to replace np.assert_equal below with a GPU-compatible version (which will presumably have been written by the time anyone sees a failure).

Contributor Author

@mdhaber mdhaber Sep 4, 2023


@lucascolley IIUC this test should now pass with CuPy arrays, so we can resolve this? Can you test locally?

Member


Looks fine with CuPy. python dev.py test -t scipy/_lib/tests/test__util.py -b all gives me this error with torch:

scipy/_lib/tests/test__util.py:364: in test_basic
    @array_api_compatible
        f          = <function given.<locals>.run_test_as_given.<locals>.wrapped_test at 0x7f22e371e980>
        self       = <scipy._lib.tests.test__util.TestLazywhere object at 0x7f22e3d9bb50>
        xp         = <module 'torch' from '/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/torch/__init__.py'>
scipy/_lib/tests/test__util.py:399: in test_basic
    xp_assert_equal(res1, ref1)
        arrays     = [tensor(0.), tensor(nan)]
        cond       = tensor(True)
        cond_shape = ()
        data       = data(...)
        dtype      = <class 'numpy.float32'>
        f          = <function TestLazywhere.test_basic.<locals>.f at 0x7f2342606e80>
        f2         = <function TestLazywhere.test_basic.<locals>.f2 at 0x7f234263a3e0>
        fillvalue  = tensor(0.)
        input_shapes = ((), (), ())
        mbs        = mutually_broadcastable_shapes(num_shapes=3, min_side=0)
        n_arrays   = 2
        p          = 0.0
        ref1       = tensor(nan)
        ref2       = tensor(nan)
        res1       = tensor(nan)
        res2       = tensor(nan)
        result_shape = ()
        rng        = Generator(PCG64) at 0x7F234263E5E0
        rng_seed   = 1000000000
        self       = <scipy._lib.tests.test__util.TestLazywhere object at 0x7f22e3d9bb50>
        shapes     = [(), ()]
        xp         = <module 'torch' from '/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/torch/__init__.py'>
scipy/_lib/_array_api.py:188: in assert_equal
    return xp.testing.assert_close(actual, desired, rtol=0, atol=0,
E   AssertionError
E   Falsifying example: test_basic(
E       self=<scipy._lib.tests.test__util.TestLazywhere object at 0x7f22e3d9bb50>,
E       xp=<module 'torch' from '/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/torch/__init__.py'>,
E       n_arrays=2,
E       rng_seed=1000000000,
E       dtype=numpy.float32,
E       p=0.0,
E       data=data(...),
E   )
E   Draw 1: BroadcastableShapes(input_shapes=((), (), ()), result_shape=())
E   Draw 2: array(0., dtype=float32)
E   Draw 3: array(0., dtype=float32)
E   Draw 4: array(nan, dtype=float32)
E   
E   You can reproduce this example by temporarily adding @reproduce_failure('6.82.7', b'AXicY2RgZCANKDAy/P8BYQIAEN8CGw==') as a decorator on your test case
        actual     = tensor(nan)
        desired    = tensor(nan)
        err_msg    = ''
        xp         = <module 'scipy._lib.array_api_compat.array_api_compat.torch' from '/home/lucas/dev/myscipy/build-install/lib/python3.11/site-packages/scipy/_lib/array_api_compat/array_api_compat/torch/__init__.py'>
========================================================================================= short test summary info ==========================================================================================
FAILED scipy/_lib/tests/test__util.py::TestLazywhere::test_basic[torch] - AssertionError
====================================================================================== 1 failed, 24 passed in 26.03s =======================================================================================

Contributor Author


Thanks. Can you try again after implementing https://github.com/scipy/scipy/pull/19186/files#r1316228638.

Member


python dev.py test -t scipy/_lib/tests/test__util.py -b all passes all tests after that patch, yup.

However, with SCIPY_DEVICE=cuda, I get this error:

____________________________________________________________________________________ TestLazywhere.test_basic[torch] ______________________________________________________________________________________
scipy/_lib/tests/test__util.py:364: in test_basic
    @array_api_compatible
        f          = <function given.<locals>.run_test_as_given.<locals>.wrapped_test at 0x7f9b0639fc40>
        self       = <scipy._lib.tests.test__util.TestLazywhere object at 0x7f9b066ba390>
        xp         = <module 'torch' from '/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/torch/__init__.py'>
scipy/_lib/tests/test__util.py:384: in test_basic
    res1 = _lazywhere(cond, arrays, f, fillvalue)
        arrays     = [tensor([])]
        cond       = tensor(True)
        cond_shape = ()
        data       = data(...)
        dtype      = <class 'numpy.float32'>
        f          = <function TestLazywhere.test_basic.<locals>.f at 0x7f9be61a6de0>
        f2         = <function TestLazywhere.test_basic.<locals>.f2 at 0x7f9b652c2020>
        fillvalue  = tensor(0.)
        input_shapes = ((), (0,))
        mbs        = mutually_broadcastable_shapes(num_shapes=2, min_side=0)
        n_arrays   = 1
        p          = 0.0
        result_shape = (0,)
        rng        = Generator(PCG64) at 0x7F9B652E5FC0
        rng_seed   = 1000000000
        self       = <scipy._lib.tests.test__util.TestLazywhere object at 0x7f9b066ba390>
        shapes     = [(0,)]
        xp         = <module 'torch' from '/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/torch/__init__.py'>
scipy/_lib/_util.py:119: in _lazywhere
    out[cond] = temp1
E   RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Since cond is created from a numpy array (cond = xp.asarray(rng.random(size=cond_shape) > p)), it is created on the cpu.

Edit: I'm trying to test whether pytorch/pytorch#106773 has fixed this when the input is a numpy array, but I'm running into some issues https://discuss.pytorch.org/t/cuda-not-available-in-nightly-but-is-for-2-0-1/187767.

Contributor Author

@mdhaber mdhaber Sep 5, 2023


... it is created on the cpu.

Do I understand that:

  • cond = xp.asarray(rng.random(size=cond_shape) > p) generates a CPU array
  • arrays = [xp.asarray(data.draw(npst.arrays(dtype=dtype, shape=shape))) for shape in shapes] also generates CPU arrays
  • xp = array_namespace(cond, *arrays) returns a Pytorch namespace
  • temp1 = xp.asarray(f(*(arr[cond] for arr in arrays))) returns a CPU array
  • out = xp.full(cond.shape, fill_value=fillvalue, dtype=dtype) generates a GPU array because I didn't specify the device.
  • out[cond] = temp1 fails

This sounds like it is going to be a common gotcha. I would not expect the namespace returned by array_namespace(cond, *arrays) to have functions that produce arrays that are incompatible with cond and arrays.
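(One mitigation would be to thread the condition's device through to the allocation. This is a hypothetical helper, not code from the PR; the `device` keyword comes from the array API standard's creation functions, and the fallback covers namespaces, such as older NumPy, whose full() doesn't accept it.)

```python
import numpy as np

def full_matching_device(xp, shape, fill_value, like):
    # allocate on the same device as `like`; fall back for namespaces
    # whose full() has no `device` keyword
    device = getattr(like, "device", None)
    try:
        return xp.full(shape, fill_value, device=device)
    except TypeError:
        return xp.full(shape, fill_value)

cond = np.asarray([True, False])
out = full_matching_device(np, cond.shape, np.nan, cond)
```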

Member


I think that may be the case, but I don't know how asarray is supposed to interact with the default device and whether pytorch/pytorch#106773 will apply to numpy array inputs too.

The ideal scenario is that xp.asarray transfers to GPU à la CuPy when we have SCIPY_DEVICE=cuda.

I'll let you know once I hear back from PyTorch and am able to test with the nightly.

Member


@mdhaber crisis averted: the pytorch/pytorch#106773 fix applies to np inputs too, so with a recent pytorch nightly cond = xp.asarray(rng.random(size=cond_shape) > p) will generate a GPU array when we have SCIPY_DEVICE=cuda.

So everything in this PR is looking good, the errors will go away in the next pytorch release.

Contributor Author

Cool. Thanks.

Contributor Author

So everything in this PR is looking good

Thanks. In that case, please consider hitting the approve button. (As a maintainer, it's helpful to see that someone else has also taken a look.)

@mdhaber
Contributor Author

mdhaber commented Sep 4, 2023

@lucascolley A few thoughts about the new _array_api assertions:

  • Should they be named something different in case they need to be used in the same file as np.testing assertions? Here, I've done import assert_equal as xp_assert_equal.
  • Does passing the assertion already imply that the result and reference are the same array type? If not, should it?
  • What do you think of adding options that check the shape and dtype of the resulting array? Although a little inconvenient if all elements of an array are supposed to be the same number (e.g. 0), I would even suggest that checking shape and dtype should be part of the default comparison behavior to force us to be more careful about these things as we head into the Array API world.
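The stricter default behavior proposed in the last bullet could look something like this NumPy-only sketch (the helper name is hypothetical; the real assertion would dispatch per backend via `array_api_compat`):

```python
import numpy as np

def xp_assert_equal_sketch(actual, desired, err_msg=''):
    # Hypothetical stricter assertion: check array type, dtype, and shape
    # by default, before comparing values (NaNs compare as equal below).
    assert type(actual) is type(desired), "array types differ"
    assert actual.dtype == desired.dtype, "dtypes differ"
    assert actual.shape == desired.shape, "shapes differ"
    np.testing.assert_array_equal(actual, desired, err_msg=err_msg)

# Passes: same type, dtype, shape, and values (NaN == NaN here).
xp_assert_equal_sketch(np.array([1.0, np.nan]), np.array([1.0, np.nan]))
```

With checks like these on by default, a test that accidentally compares a float64 result against a float32 reference, or a 0-d array against a 1-d one, fails loudly instead of silently passing.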

@lucascolley
Member

lucascolley commented Sep 5, 2023

Should they be named something different in case they need to be used in the same file as np.testing assertions? Here, I've done the import assert_equal as xp_assert_equal.

I'm happy to add the xp_ prefix.

Does passing the assertion already imply that the result and reference are the same array type? If not, should it?

I don't think that this is implied - maybe it should be. However, testing that the correct namespace is returned should always be done at least somewhere other than these functions. For fft I have a separate class called TestNamespaces, but these could be worked into the existing tests to be slightly more efficient if needed. These tests use Tyler's _assert_matching_namespace.

What do you think of adding options that check the shape and dtype of the resulting array? Although a little inconvenient if all elements of an array are supposed to be the same number (e.g. 0), I would even suggest that checking shape and dtype should be part of the default comparison behavior to force us to be more careful about these things as we head into the Array API world.

Yep, this sounds good. I'm not sure whether they even need to be options, we should probably always do this. torch does this anyway, so I'll just need to change np and cupy.

I'll make a follow-up PR with these improvements. We can always come back and tweak again as needed.

@mdhaber
Contributor Author

mdhaber commented Sep 5, 2023

However, testing that the correct namespace is returned should always be done at least somewhere other than these functions.

Why? That is, why can't it just be baked into all the tests that also check the numerical results?

@lucascolley
Member

why can't it just be baked into all the tests that also check the numerical results?

Some functions will be tested with these assertions, some won't. To me, having a separate class to test namespaces, or at least a separate line in the test functions with that explicit purpose, will ensure that we have full coverage of these checks.

For groups of functions which are all tested using these assertions, having it baked in would be sufficient, but I don't really like how implicit that is.

What do you think?

@lucascolley
Member

lucascolley commented Sep 6, 2023

I thought that python dev.py test -t scipy/_lib/tests/test__util.py -b all was passing for me yesterday, but now I am getting a failure with torch, where res1 and ref1 are both nan:

_____________________________________________________________________________________ TestLazywhere.test_basic[torch] ______________________________________________________________________________________
scipy/_lib/tests/test__util.py:364: in test_basic
    @pytest.mark.filterwarnings('ignore::RuntimeWarning')  # overflows, etc.
        f          = <function given.<locals>.run_test_as_given.<locals>.wrapped_test at 0x7fc27cc78040>
        self       = <scipy._lib.tests.test__util.TestLazywhere object at 0x7fc27d02c2d0>
        xp         = <module 'torch' from '/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/torch/__init__.py'>
/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/hypothesis/core.py:790: in execute_once
    result = self.test_runner(data, run)
        data       = ConjectureData(VALID, 61 bytes)
        example_kwargs = None
        expected_failure = None
        is_final   = True
        print_example = True
        run        = <function StateForActualGivenExecution.execute_once.<locals>.run at 0x7fc27d00bba0>
        self       = <hypothesis.core.StateForActualGivenExecution object at 0x7fc27d02c4d0>
        test       = <function TestLazywhere.test_basic at 0x7fc27cc77e20>
        text_repr  = None
/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/hypothesis/executors.py:47: in default_new_style_executor
    return function(data)
        data       = ConjectureData(VALID, 61 bytes)
        function   = <function StateForActualGivenExecution.execute_once.<locals>.run at 0x7fc27d00bba0>
/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/hypothesis/core.py:786: in run
    return test(*args, **kwargs)
        a          = []
        args       = ()
        argslices  = {'dtype': (6, 7), 'n_arrays': (0, 1), 'p': (7, 18), 'rng_seed': (1, 6)}
        context    = <hypothesis.control.BuildContext object at 0x7fc3b545bb10>
        data       = ConjectureData(VALID, 61 bytes)
        example_kwargs = None
        expected_failure = None
        is_final   = True
        kw         = {'data': data(...), 'dtype': <class 'numpy.float32'>, 'n_arrays': 2, 'p': 0.0, ...}
        kwargs     = {'data': data(...), 'dtype': <class 'numpy.float32'>, 'n_arrays': 2, 'p': 0.0, ...}
        print_example = True
        printer    = <hypothesis.vendor.pretty.RepresentationPrinter object at 0x7fc27d077e10>
        self       = <hypothesis.core.StateForActualGivenExecution object at 0x7fc27d02c4d0>
        test       = <function TestLazywhere.test_basic at 0x7fc27cc77e20>
        text_repr  = None
scipy/_lib/tests/test__util.py:401: in test_basic
    xp_assert_equal(res1, ref1)
        arrays     = [tensor(0.), tensor(nan)]
        cond       = tensor(True)
        cond_shape = ()
        data       = data(...)
        dtype      = <class 'numpy.float32'>
        f          = <function TestLazywhere.test_basic.<locals>.f at 0x7fc3b7f8a200>
        f2         = <function TestLazywhere.test_basic.<locals>.f2 at 0x7fc27cc15760>
        fillvalue  = tensor(0.)
        input_shapes = ((), (), ())
        mbs        = mutually_broadcastable_shapes(num_shapes=3, min_side=0)
        n_arrays   = 2
        p          = 0.0
        ref1       = tensor(nan)
        ref2       = tensor(nan)
        res1       = tensor(nan)
        res2       = tensor(nan)
        result_shape = ()
        rng        = Generator(PCG64) at 0x7FC27CC70200
        rng_seed   = 1000000000
        self       = <scipy._lib.tests.test__util.TestLazywhere object at 0x7fc27d02c2d0>
        shapes     = [(), ()]
        xp         = <module 'torch' from '/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/torch/__init__.py'>
scipy/_lib/_array_api.py:188: in assert_equal
    return xp.testing.assert_close(actual, desired, rtol=0, atol=0,
E   AssertionError
        actual     = tensor(nan)
        desired    = tensor(nan)
        err_msg    = ''
        xp         = <module 'scipy._lib.array_api_compat.array_api_compat.torch' from '/home/lucas/dev/myscipy/build-install/lib/python3.11/site-packages/scipy/_lib/array_api_compat/array_api_compat/torch/__init__.py'>
------------------------------------------------------------------------------------------------ Hypothesis ------------------------------------------------------------------------------------------------
Falsifying example: test_basic(
    self=<scipy._lib.tests.test__util.TestLazywhere object at 0x7fc27d02c2d0>,
    xp=<module 'torch' from '/home/lucas/mambaforge/envs/scipy-dev/lib/python3.11/site-packages/torch/__init__.py'>,
    n_arrays=2,
    rng_seed=1000000000,
    dtype=numpy.float32,
    p=0.0,
    data=data(...),
)
Draw 1: BroadcastableShapes(input_shapes=((), (), ()), result_shape=())
Draw 2: array(0., dtype=float32)
Draw 3: array(0., dtype=float32)
Draw 4: array(nan, dtype=float32)
========================================================================================= short test summary info ==========================================================================================
FAILED scipy/_lib/tests/test__util.py::TestLazywhere::test_basic[torch] - AssertionError
====================================================================================== 1 failed, 21 passed in 13.53s =======================================================================================

@reproduce_failure('6.82.7', b'AXicY2RgZCANKDAy/P8BYQIAEN8CGw==')

@mdhaber
Contributor Author

mdhaber commented Sep 6, 2023

@lucascolley Isn't that the same as #18930 (comment), fixed by adding equal_nan=True in gh-19186?
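The underlying issue is just IEEE 754 NaN semantics; a quick NumPy illustration of why an explicit `equal_nan=True` is needed whenever the reference value is NaN:

```python
import numpy as np

a = np.array([np.nan])
b = np.array([np.nan])

# IEEE 754: NaN != NaN, so a plain equality comparison reports a mismatch...
strict = np.array_equal(a, b)                   # False
# ...while opting in to NaN-equality makes the comparison pass.
lenient = np.array_equal(a, b, equal_nan=True)  # True
```

`torch.testing.assert_close` behaves the same way, which is why passing `equal_nan=True` in the assertion helper makes the `tensor(nan)` vs `tensor(nan)` comparison above succeed.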

@mdhaber
Contributor Author

mdhaber commented Sep 6, 2023

What do you think?

These sorts of separate checks (I'm lumping this in with checks of shape, dtype, and scalar vs. array) have been forgotten for the past 20 years, so I imagine they will continue to be forgotten. But it seems to me that performing them automatically in these very widely used assertions can only help. I mean, I can anticipate an argument like "If we make it standard practice to include a separate namespace check, then it will never be forgotten, whereas if it is baked into these assertions, it may be forgotten when these assertions are not used." But I don't believe that statement.

@lucascolley
Member

lucascolley commented Sep 6, 2023

Isn't that the same as #18930 (comment), fixed by adding equal_nan=True

Ah yes exactly, apologies, I forgot that I tested with that patch yesterday. I'll approve now 👍.

"If we make it standard practice to include a separate namespace check separately, then it will never be forgotten, whereas if it is not written separately when these assertions are used, it may be forgotten when these assertions are not used."

Yup, that's the argument I was thinking about.

But I don't believe that statement.

Fair enough. Happy to go along with whatever you think is best here 👍.

Member

@lucascolley lucascolley left a comment

This all looks good to me now. python dev.py test -t scipy/_lib/tests/test__util.py -b all and the same with SCIPY_DEVICE=cuda should pass once #19186 is in and we get a new PyTorch release.

Member

@tupui tupui left a comment

Thanks both. Looks like this is ready then. Let's get it in.

@tupui tupui added this to the 1.12.0 milestone Sep 6, 2023
@tupui tupui merged commit 5495f4c into scipy:main Sep 6, 2023
24 checks passed
tylerjereddy added a commit to tylerjereddy/scipy that referenced this pull request Sep 7, 2023
* Fixes scipy#19190

* the module-level skipping described in the above issue
was causing spurious skips in some cases as described there,
so move the tests in that module to a class guarded with
a `skipif` mark, which seems to be the preferred approach
anyway according to:
https://docs.pytest.org/en/7.1.x/reference/reference.html?highlight=pytest%20skip#pytest.skip

* on this branch I seem to get the expected behavior with, e.g.:
- `python dev.py test -v -- -k "test_concatenate_int32_overflow" -rsx`
  - `59382 deselected`
- `python dev.py test -m full -v -- -k "test_concatenate_int32_overflow" -rsx`
  - `1 passed, 59381 deselected`
- `python dev.py test -v -t scipy/sparse/tests/test_construct.py::TestConstructUtils::test_concatenate_int32_overflow -- -rsx`
  - `1 deselected`
- `SCIPY_ARRAY_API=1 python dev.py test -v -t "scipy/_lib/tests/test_array_api.py" -b cupy -- -rsx`
  - just the two failures expected from scipygh-19194
- `python dev.py test -v -t "scipy/_lib/tests/test_array_api.py" -b cupy -- -rsx`
  - same (the backend setting implies the env var)
- `python dev.py test -v -t "scipy/_lib/tests/test_array_api.py" -- -rsx`
  - `6 skipped`
- `python dev.py test -j 32 -b all`
  - `3 failed, 44291 passed, 2546 skipped, 149 xfailed, 12 xpassed`
    (failures noted above + PyTorch issue from scipygh-18930)
- `SCIPY_DEVICE=cuda python dev.py test -j 32 -b all`
  - `3 failed, 44198 passed, 2639 skipped, 149 xfailed, 12 xpassed`
    (failures noted above + PyTorch issue from scipygh-18930)

* could there be a `pytest` bug here somewhere? Yeah, probably. But
  given that following their docs suggestion to avoid the global skip
  scope seems to solve the problem, I'm inclined not to spend much time
  providing a small reproducer upstream and debating with them. It is
  probably related to the `-k` option collecting every single test node
  vs. the hand-tuned selection of the single node on the command line.
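The class-level guard described above can be sketched as follows (a minimal, self-contained illustration, not the actual SciPy test; `IS_64BIT` and the placeholder body are mine):

```python
import numpy as np
import pytest

# Hypothetical condition standing in for whatever the real test requires.
IS_64BIT = np.intp(0).itemsize >= 8

# Class-level skipif, per the pytest docs recommendation: only this class is
# skipped when the condition fails, rather than skipping at module scope,
# so collection of unrelated tests in the module is unaffected.
@pytest.mark.skipif(not IS_64BIT, reason="test requires a 64-bit platform")
class TestConstructUtils:
    def test_concatenate_int32_overflow(self):
        assert True  # placeholder body for the sketch
```

Unlike a module-level `pytest.skip(allow_module_level=True)`, the mark is evaluated per-class at collection time, which avoids the spurious skips described in scipy#19190.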
tylerjereddy added a commit to tylerjereddy/scipy that referenced this pull request Sep 7, 2023
lucascolley pushed a commit to lucascolley/scipy that referenced this pull request Sep 8, 2023
Labels
enhancement A new feature or improvement scipy._lib