NEP-18: flatnonzero() on CuPy array missing cupy.compress() #4497

Closed
pentschev opened this issue Feb 18, 2019 · 9 comments

Comments

@pentschev
Member

Calling dask.array.flatnonzero() on a Dask array created from a CuPy array fails because cupy.compress() is not implemented. Sample and traceback:

import cupy
import dask.array as da

x = cupy.random.random(5000) * cupy.random.randint(-1, 1, 5000)

d = da.from_array(x, chunks=(1000), asarray=False)

da.flatnonzero(d).compute()
Traceback (most recent call last):
  File "flatnonzero.py", line 8, in <module>
    da.flatnonzero(d).compute()
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/base.py", line 156, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/base.py", line 398, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/threaded.py", line 76, in get
    pack_exception=pack_exception, **kwargs)
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/local.py", line 460, in get_async
    raise_exception(exc, tb)
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/compatibility.py", line 112, in reraise
    raise exc
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/local.py", line 230, in execute_task
    result = _execute_task(task, data)
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/core.py", line 118, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/core.py", line 118, in <listcomp>
    args2 = [_execute_task(a, cache) for a in args]
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/optimization.py", line 942, in __call__
    dict(zip(self.inkeys, args)))
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/core.py", line 149, in get
    result = _execute_task(task, cache)
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/dask/compatibility.py", line 93, in apply
    return func(*args, **kwargs)
  File "/home/nfs/pentschev/.local/lib/python3.5/site-packages/numpy/core/overrides.py", line 153, in public_api
    implementation, public_api, relevant_args, args, kwargs)
  File "cupy/core/core.pyx", line 1259, in cupy.core.core.ndarray.__array_function__
AttributeError: module 'cupy' has no attribute 'compress'

I suppose the first thing we could do is check whether compress() is really necessary for flatnonzero(): calling cupy.flatnonzero() on a CuPy array works, despite the lack of cupy.compress().
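
To illustrate why compress() may not be strictly required, here is a minimal sketch that computes the same indices from ravel() and nonzero() alone. It uses NumPy so that it runs without a GPU; the helper name flatnonzero_without_compress is hypothetical, and whether the equivalent CuPy calls behave identically is an assumption (the thread only confirms that the library's own flatnonzero() works on a CuPy array).

```python
import numpy as np


def flatnonzero_without_compress(a):
    """Hypothetical helper: indices of nonzero entries of the flattened array."""
    flat = np.ravel(a)          # flatten to 1-D
    return np.nonzero(flat)[0]  # nonzero() returns a tuple; take the only axis


x = np.array([[0.0, 1.5, 0.0], [-2.0, 0.0, 3.0]])
indices = flatnonzero_without_compress(x)
print(indices)
```

The result can be compared against np.flatnonzero(x) to confirm that both give the same indices.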

@mrocklin

@jakirkham
Member

Are you able to do masked selection with CuPy arrays?

import cupy as cp

a = cp.random.random((100,))
m = (a < 0.5)

a[m]  # does this work?

@mrocklin
Member

mrocklin commented Mar 5, 2019

Looks like it

In [1]: import cupy as cp
   ...:
   ...: a = cp.random.random((100,))
   ...: m = (a < 0.5)
   ...:
   ...: a[m]  # does this work?
   ...:
Out[1]:
array([0.04045322, 0.24932342, 0.28296151, 0.4294802 , 0.49973633,
       0.27953066, 0.37460658, 0.08451402, 0.49368075, 0.24242205,
       0.03885954, 0.20229318, 0.43174102, 0.03993055, 0.08816303,
       0.39807801, 0.28210111, 0.47563606, 0.23024547, 0.25666147,
       0.27301158, 0.12098015, 0.29695839, 0.27015565, 0.03648725,
       0.39370685, 0.14722936, 0.12761314, 0.31176351, 0.29144733,
       0.10560592, 0.43213118, 0.31096294, 0.37314704, 0.46814892,
       0.10694211, 0.32721032, 0.30863228, 0.11592595, 0.1794934 ,
       0.17914631, 0.43492253, 0.44624587, 0.05248363, 0.44066885,
       0.23812253, 0.32535871])

@jakirkham
Member

Thanks @mrocklin. One more question: do 1-D masked selections work on N-D arrays?

import cupy as cp
a = cp.random.random((10, 11))
m = (a[0] < 0.5)
a[:, m]  # does this work?

If so, I think we can write a helper function that uses mask selection and use it here instead of np.compress. This would solve this case and a few other cases at the same time.
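
A rough sketch of what such a helper could look like, assuming the mask is already a boolean array whose length matches the chosen axis. The name compress_via_mask is hypothetical and this is not the code that ended up in Dask; NumPy is used here only so the example runs without a GPU.

```python
import numpy as np


def compress_via_mask(condition, a, axis=0):
    """Hypothetical stand-in for compress() built on boolean indexing.

    Assumes `condition` is a 1-D boolean array with len(condition) == a.shape[axis].
    """
    # Build an index equivalent to a[:, ..., condition, ...] with the mask on `axis`.
    index = [slice(None)] * a.ndim
    index[axis] = condition
    return a[tuple(index)]


a = np.arange(12).reshape(3, 4)
mask = np.array([True, False, True, False])
selected = compress_via_mask(mask, a, axis=1)
print(selected)
```

One difference worth noting: np.compress accepts a condition shorter than the axis, while boolean indexing requires the lengths to match, so this sketch only covers the matching-length case.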

@pentschev
Member Author

@jakirkham it does work as well.

>>> import cupy as cp
>>> a = cp.random.random((10, 11))
>>> m = (a[0] < 0.5)
>>> a[:, m]  # does this work?
array([[0.06600065, 0.23931853, 0.30653141, 0.3029505 , 0.40414086,
        0.42268827],
       [0.90021452, 0.47309845, 0.31011203, 0.39550123, 0.1865791 ,
        0.31820397],
       [0.56267709, 0.03729485, 0.93311142, 0.90169266, 0.22027314,
        0.82554595],
       [0.24942237, 0.48803313, 0.93434275, 0.21592739, 0.32250506,
        0.53656856],
       [0.15371137, 0.41410281, 0.37906657, 0.33839371, 0.5345219 ,
        0.07338778],
       [0.98917756, 0.78057068, 0.66349326, 0.33831257, 0.86242157,
        0.56330354],
       [0.44809768, 0.41974921, 0.88093927, 0.9179428 , 0.30592205,
        0.37130313],
       [0.34141672, 0.91433694, 0.90972702, 0.37957753, 0.94163387,
        0.71046038],
       [0.29285269, 0.32572922, 0.29799585, 0.57637301, 0.72525671,
        0.54205314],
       [0.97411764, 0.40978076, 0.2279997 , 0.74468592, 0.622623  ,
        0.12544337]])

And thanks for the suggestion, @jakirkham. I will look into details as soon as I get to the issue here, unless you want to give it a try. :)

@jakirkham
Member

Cool. Thanks for checking @pentschev.

> If so, I think we can write a helper function that uses mask selection and use it here instead of np.compress. This would solve this case and a few other cases at the same time.

Thinking on this again. We may actually want to just consolidate these two branches and perform the selection on the Dask Array directly. Since the compress code was written, Dask Arrays started supporting masked selection (#2658). So it may simplify things a bit to just leverage that functionality here.
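
As an illustration of masked selection on a Dask Array, the sketch below selects the positions of nonzero entries with a boolean Dask mask, without calling compress(). It is only a sketch under stated assumptions, not the change made in the eventual PR; the variable names are made up, and a NumPy-backed array is used so it runs without a GPU.

```python
import numpy as np
import dask.array as da

x = np.array([0.0, 1.5, 0.0, -2.0, 3.0, 0.0])
d = da.from_array(x, chunks=3)

# Positions 0..n-1, chunked the same way as the data.
positions = da.arange(d.size, chunks=d.chunks)

# Boolean-mask selection directly on the Dask array; the result has
# unknown chunk sizes until it is computed.
nonzero_positions = positions[d != 0]

result = nonzero_positions.compute()
print(result)
```

Because the number of selected elements is not known ahead of time, the selected array reports unknown chunk sizes until compute() is called.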

> I will look into details as soon as I get to the issue here, unless you want to give it a try. :)

Probably not this week. Maybe next week or the week after though. ;)

@pentschev
Member Author

Thanks @jakirkham, otherwise I might come back to this one after we handle the _meta issue.

@jakirkham
Member

jakirkham commented Mar 5, 2019

This wound up being pretty easy. So just added PR (#4548). :)

@pentschev
Member Author

Thanks for the PR @jakirkham!

@jakirkham
Member

Happy to help. Please let me know how it goes once you have a chance to retest.

@pentschev pentschev mentioned this issue Apr 24, 2019
19 tasks