Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify moment chunk functions to return dicts #4519

Merged
merged 1 commit into from Feb 25, 2019

Conversation

Projects
None yet
2 participants
@pentschev
Copy link
Member

commented Feb 22, 2019

This is a follow-up on #4513.

@mrocklin

  • Tests added / passed
  • Passes flake8 dask
@mrocklin

This comment has been minimized.

Copy link
Member

commented Feb 22, 2019

Thanks @pentschev !

I'm curious, do cupy operations like std work after these changes?

@mrocklin

This comment has been minimized.

Copy link
Member

commented Feb 22, 2019

@pentschev you might also consider the following change to the sparse tests

diff --git a/dask/array/tests/test_sparse.py b/dask/array/tests/test_sparse.py
index 459609de..35b89f6d 100644
--- a/dask/array/tests/test_sparse.py
+++ b/dask/array/tests/test_sparse.py
@@ -33,6 +33,7 @@ functions = [
     lambda x: x.T,
     lambda x: da.transpose(x, (1, 2, 0)),
     lambda x: x.sum(),
+    lambda x: x.std(),
     lambda x: x.dot(np.arange(x.shape[-1])),
     lambda x: x.dot(np.eye(x.shape[-1])),
     lambda x: da.tensordot(x, np.ones(x.shape[:2]), axes=[(0, 1), (0, 1)]),

When I do this I get the following traceback:

dask/core.py:119: in _execute_task
    return func(*args2)
dask/compatibility.py:93: in apply
    return func(*args, **kwargs)
dask/array/reductions.py:385: in moment_chunk
    n = numel(A, **kwargs).astype(np.int64)
dask/array/reductions.py:323: in numel
    return chunk.sum(np.ones_like(x), **kwargs)
../../Software/anaconda/lib/python3.6/site-packages/numpy/core/numeric.py:263: in ones_like
    res = empty_like(a, dtype=dtype, order=order, subok=subok)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <COO: shape=(1, 2, 2), dtype=float64, nnz=2, fill_value=0.0>, kwargs = {}, densify = False

    def __array__(self, **kwargs):
        from . import _AUTO_DENSIFICATION_ENABLED as densify
        if not densify:
>           raise RuntimeError('Cannot convert a sparse array to dense automatically. '
                               'To manually densify, use the todense method.')
E           RuntimeError: Cannot convert a sparse array to dense automatically. To manually densify, use the todense method.

../sparse/sparse/sparse_array.py:213: RuntimeError

I think that numel currently computes the number of elements in an array by creating an array of ones and then doing the same reduction on that array of ones. We should be able to avoid this (and be considerably faster) by looking at the shape and the axis argument and doing a bit of thinking.

@pentschev

This comment has been minimized.

Copy link
Member Author

commented Feb 22, 2019

Thanks @pentschev !

I'm curious, do cupy operations like std work after these changes?

Well, AFAIK, there are no really "elaborate" tests for that, on my own simple tests it did. I will add that as well to #4486.

@pentschev

This comment has been minimized.

Copy link
Member Author

commented Feb 22, 2019

Regarding the x.std() sparse test, I tried it as well in current master (i.e., without changes in this PR and without those in #4513), it also fails, just like x.mean() that you added to #4513. Should we also tackle this one in a different issue too?

@mrocklin

This comment has been minimized.

Copy link
Member

commented Feb 22, 2019

I think that in general using the sparse library might be a good proxy for testing cupy. The sparse library is probably more sensitive to accidentally interacting with numpy arrays than cupy is, and is much easier to run on CI systems :)

@mrocklin

This comment has been minimized.

Copy link
Member

commented Feb 22, 2019

Regarding the x.std() sparse test, I tried it as well in current master (i.e., without changes in this PR and without those in #4513), it also fails, just like x.mean() that you added to #4513. Should we also tackle this one in a different issue too?

Right, we haven't supported these operations well before. I think that with this PR we're one step closer to supporting operations like std. I'd be happy to see that change either in another PR or in this one. I don't have a strong preference.

@pentschev

This comment has been minimized.

Copy link
Member Author

commented Feb 22, 2019

My preference is generally to do things in smaller tasks rather than trying to fix all at once. I will then open an issue to tackle std() and mean() in a separate PR so we can free ourselves of this issue here instead of having a thread that extends forever, ok?

@mrocklin

This comment has been minimized.

Copy link
Member

commented Feb 22, 2019

Sounds good. Thanks @pentschev

@mrocklin

This comment has been minimized.

Copy link
Member

commented Feb 22, 2019

Merging in 24 hours if there are no further comments.

@pentschev

This comment has been minimized.

Copy link
Member Author

commented Feb 22, 2019

Opened new issue to tackle the sparse error in #4523.

@mrocklin mrocklin merged commit b1430f0 into dask:master Feb 25, 2019

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@pentschev pentschev deleted the pentschev:moment-dict branch Mar 12, 2019

jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this pull request May 14, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.