New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add XArray compatibility features #102

Merged
merged 9 commits into from Feb 22, 2018

Conversation

Projects
None yet
3 participants
@hameerabbasi
Collaborator

hameerabbasi commented Feb 19, 2018

Once the right broadcasting setup was in place; it was trivial to implement the three-argument version of where and NaN-skipping aggregations.

cc @mrocklin Feedback welcome.
cc @shoyer Feedback really welcome, as you know the ins and outs of XArray. Of course, "I can't" is okay; as always. :-)

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Feb 19, 2018

Tests not yet added.

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Feb 19, 2018

Unfortunately, there doesn't seem to be a way to override np.nansum. I added a print command in there, and:

>>> import numpy as np
>>> import sparse
>>> x = np.asarray([5, 6, np.nan])
>>> s = sparse.COO.from_numpy(x)
>>> s.nansum()
nanreduce
11.0
>>> np.nansum(s)
11.0

Edit: It doesn't want to fall back to our implementation even if we err when coercing.

>>> class NoncoercibleCOO(sparse.COO):
...     def __array__(self, *args, **kwargs):
...         raise ValueError('Cannot coerce COO.')
...     
>>> s = NoncoercibleCOO.from_numpy(x)
>>> np.nansum(s)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/hameer/anaconda3/envs/sparse/lib/python3.6/site-packages/numpy/lib/nanfunctions.py", line 581, in nansum
    a, mask = _replace_nan(a, 0)
  File "/Users/hameer/anaconda3/envs/sparse/lib/python3.6/site-packages/numpy/lib/nanfunctions.py", line 64, in _replace_nan
    a = np.array(a, subok=True, copy=True)
  File "<input>", line 3, in __array__
ValueError: Cannot coerce COO.
@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Feb 19, 2018

Hmm. It seems that (for nanmin and nanmax), sometimes we get -inf as a value (on a full axis that contained JUST nan and nothing else), whereas NumPy actually returns nan. I think the actual value of the reduction in this case is debatable (I, for one, think empty min should be inf, etc, as a hobbyist mathematician).

In any case, I'm willing to leave this as an xfail for now. Any input welcome.

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Feb 19, 2018

I added tests and matched the Numpy API. Looks review ready to me. 💃 cc @mrocklin @shoyer

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Feb 20, 2018

Do we want to support the edge case of NaNs in object arrays? It's proving difficult. It's certainly possible, but only with a bit of trickery.

@mrocklin

This comment has been minimized.

Collaborator

mrocklin commented Feb 20, 2018

@shoyer

This comment has been minimized.

Member

shoyer commented Feb 20, 2018

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Feb 21, 2018

It seems we don't support object arrays in many cases at the moment. I've opened #104 to track this but am inclined to ignore this.

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Feb 21, 2018

I fixed the nan issue (by using fmin/fmax); tested for it; and matched the Numpy warning (and tested for it). This time around I feel this really is ready for review.

Edit: I looked at the implementation for nanmin etc. in Numpy. Unfortunately, without hooking into Numpy and replacing its functions with our own if the args are SparseArray, I don't see a way to solve this.

@mrocklin

This comment has been minimized.

Collaborator

mrocklin commented Feb 21, 2018

This looks good to me :)

@hameerabbasi hameerabbasi merged commit 8965294 into pydata:master Feb 22, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Feb 22, 2018

Merged!

@hameerabbasi hameerabbasi deleted the hameerabbasi:xarray-compat branch Feb 22, 2018

hameerabbasi added a commit to hameerabbasi/sparse that referenced this pull request Feb 27, 2018

Add XArray compatibility features (pydata#102)
* Implement where.

* Implement NaN-skipping aggregations.

* Docs.

* Add tests, clarify docs a bit.

* Move NaN aggregations to be functions rather than methods to match Numpy

* Get rid of eval that was bothering me a lot.

* Fix NaN inequality issue.

* Remove object dtype code.

* Test for and fix warning code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment