Initial implementation for arbitrary fill values. #165
Codecov Report
@@ Coverage Diff @@
## master #165 +/- ##
==========================================
+ Coverage 96.96% 97.12% +0.15%
==========================================
Files 11 11
Lines 1219 1252 +33
==========================================
+ Hits 1182 1216 +34
+ Misses 37 36 -1
I have a few minor comments, but I'm excited about this! The implementation looks quite clean.
This will let us define all the nan-aggregation functions (e.g., nanmedian()) for use in xarray. These functions can simply require a fill value of NaN.
sparse/coo/common.py
Outdated
@@ -56,6 +56,7 @@ def linear_loc(coords, shape):
    return out


@check_fill_value(2)
I find it more readable to use keyword arguments with literal values, e.g., @check_zero_fill_value(nargs=2)
sparse/utils.py
Outdated
def equivalent(x, y):
    return (x == y) | ((x != x) & (y != y))
note: if you're calling this on arrays, you could skip the non-NA checks for dtypes that can't hold NaN/NaT.
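One way to read this suggestion, as a sketch only: branch on the NumPy dtype kind and fall back to a plain `==` when neither input can represent NaN or NaT. The `_NAN_CAPABLE_KINDS` constant and the exact set of kinds are assumptions made for illustration; they are not taken from the PR.

```python
import numpy as np

# Assumed set of dtype kinds that can represent a missing value:
# float, complex, timedelta, datetime, and object (may hold a float NaN).
_NAN_CAPABLE_KINDS = "fcmMO"


def equivalent(x, y):
    """Elementwise equality that also treats NaN == NaN (and NaT == NaT) as True."""
    x = np.asarray(x)
    y = np.asarray(y)
    # The "both missing" case can only occur if both inputs can hold NaN/NaT.
    if x.dtype.kind in _NAN_CAPABLE_KINDS and y.dtype.kind in _NAN_CAPABLE_KINDS:
        return (x == y) | ((x != x) & (y != y))
    # Integer, boolean, string, ... inputs: a plain comparison is enough.
    return x == y


floats = equivalent(np.array([1.0, np.nan]), np.array([1.0, np.nan]))
ints = equivalent(np.array([1, 2]), np.array([1, 3]))
print(floats.tolist(), ints.tolist())
```

For integer inputs the `x != x` term is always False anyway, so the shortcut changes only the amount of work done, not the result.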
sparse/utils.py
Outdated
def generator(func):
    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        for arg in args[:nargs]:
Watch out: this has one of the same issues we encountered in NEP 18: it changes the public API of these functions so that they only accept positional arguments.
I would suggest writing helper functions to call inside func, e.g., check_zero_fill_value(a, b).
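To make the concern concrete, here is a minimal, self-contained sketch contrasting the two approaches. `FakeArray`, `dot_decorated`, and `dot_helper` are hypothetical stand-ins invented for this illustration, and the zero check is simplified to `!= 0` rather than the dtype-aware comparison used in the PR.

```python
import functools


class FakeArray:
    """Hypothetical stand-in for a sparse array; it only carries a fill value."""

    def __init__(self, fill_value):
        self.fill_value = fill_value


def _require_zero(arg):
    # Simplified check; the PR compares against the zero of the array's dtype.
    if hasattr(arg, "fill_value") and arg.fill_value != 0:
        raise ValueError("This operation requires zero fill values.")


def check_fill_value(nargs):
    """Decorator form: inspects only the first `nargs` positional arguments."""
    def generator(func):
        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            for arg in args[:nargs]:
                _require_zero(arg)
            return func(*args, **kwargs)
        return wrapped
    return generator


def check_zero_fill_value(*args):
    """Helper form: the caller passes exactly the objects to validate."""
    for arg in args:
        _require_zero(arg)


@check_fill_value(nargs=2)
def dot_decorated(a, b):
    return "ok"


def dot_helper(a, b):
    check_zero_fill_value(a, b)
    return "ok"


bad = FakeArray(fill_value=1)
good = FakeArray(fill_value=0)

# Keyword arguments never appear in args[:nargs], so the decorator skips them.
print(dot_decorated(a=bad, b=good))

# The helper sees the values regardless of how the caller passed them.
try:
    dot_helper(a=bad, b=good)
except ValueError as exc:
    print(exc)
```

With the decorator, the check is reliable only if callers always pass the arrays positionally; the helper has no such constraint.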
sparse/utils.py
Outdated
@functools.wraps(func)
def wrapped(*args, **kwargs):
    for arg in args[:nargs]:
        if hasattr(arg, 'fill_value') and \
Nit: per PEP 8, prefer using extra parentheses () to explicit continuation with \.
sparse/utils.py
Outdated
from .sparse_array import SparseArray

@functools.wraps(func)
def wrapped(arrays, *args, **kwargs):
As above, consider using a function instead of a decorator here.
In this case, the decorator would work OK (as long as the first argument is always called arrays), but decorators are more magical than simple function calls.
docs/quickstart.rst
Outdated
However, operations which convert the sparse array into a dense one will raise exceptions
For example, the following raises a :obj:`ValueError`.
However, operations which convert the sparse array into a dense one will usually change the fill
value instead of raising an error.
The new sentence here is a little confusing to me. These operations now change the fill value instead of converting a sparse array to a dense array, so they don't "convert the sparse array into a dense one" at all now.
Changed the sentence structure a bit.
@@ -265,7 +216,6 @@ All of the following will raise an :obj:`IndexError`, like in Numpy 1.13 and lat
    z[3, 6]
    z[1, 4, 8]
    z[-6]
    z[[True, True, False, True], 3, 4]

.. note:: Numpy advanced indexing is currently not supported.
Below here: maybe note that stack and concatenate require matching fill values, and that some operations (e.g., tensordot) require a fill value of zero?
Done!
Just another minor comment. I agree that this looks good to merge!
sparse/utils.py
Outdated
    Traceback (most recent call last):
        ...
    ValueError: This operation requires zero fill values.
    """
    for arg in args:
        if (hasattr(arg, 'fill_value') and
                not equivalent(arg.fill_value, _zero_of_dtype(arg.dtype))):
            raise ValueError('This operation requires zero fill values.')
It's a minor point, but it's nice to include offending values in all error messages, e.g., arg.fill_value in this case.
I'm not sure what best practice would be here. We wouldn't know which argument produced this fill value, so showing it may not be useful. What do you think?
Maybe:
'This operation requires zero fill values, but argument {} has fill value {}'.format(i, arg.fill_value)
where i comes from iterating over args with enumerate().
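A minimal sketch of how that suggestion might look in the helper. The message text is the one proposed above; the `SimpleNamespace` objects are hypothetical stand-ins for arrays, and the zero test is simplified to `!= 0` instead of the `equivalent(..., _zero_of_dtype(...))` comparison from the diff.

```python
from types import SimpleNamespace


def check_zero_fill_value(*args):
    """Raise ValueError naming the first argument whose fill value is nonzero."""
    for i, arg in enumerate(args):
        # Simplified zero test; the PR compares against the dtype's zero.
        if hasattr(arg, "fill_value") and arg.fill_value != 0:
            raise ValueError(
                "This operation requires zero fill values, "
                "but argument {} has fill value {}".format(i, arg.fill_value)
            )


# Hypothetical stand-ins for sparse arrays.
a = SimpleNamespace(fill_value=0)
b = SimpleNamespace(fill_value=5)

try:
    check_zero_fill_value(a, b)
except ValueError as exc:
    message = str(exc)
    print(message)
```

Reporting the positional index addresses the worry about not knowing which argument was at fault, at least for arguments passed to the helper in a fixed order.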
Done.
sparse/utils.py
Outdated
@@ -276,4 +345,4 @@ def check_consistent_fill_value(arrays):
     fv = arrays[0].fill_value

     if not all(equivalent(fv, s.fill_value) for s in arrays):
-        raise ValueError('Consistent fill-values required.')
+        raise ValueError('This operation requires consistent fill-values.')
Same concern as above about including fill values in the error message.
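Applied to the consistency check, the same idea could look roughly like the sketch below. The extended message wording is an assumption made for illustration, not the text that was merged; `equivalent` is a scalar version of the helper shown earlier in the review, and the `SimpleNamespace` objects are hypothetical stand-ins for arrays.

```python
from types import SimpleNamespace


def equivalent(x, y):
    # Scalar form of the helper from the diff: equal, or both NaN.
    return (x == y) or (x != x and y != y)


def check_consistent_fill_value(arrays):
    """Raise ValueError naming the first array whose fill value differs from the first one."""
    arrays = list(arrays)
    fv = arrays[0].fill_value
    for i, arr in enumerate(arrays):
        if not equivalent(fv, arr.fill_value):
            # Hypothetical message wording that reports both offending values.
            raise ValueError(
                "This operation requires consistent fill-values, but argument 0 "
                "has fill value {} and argument {} has fill value {}".format(
                    fv, i, arr.fill_value
                )
            )


nan = float("nan")

# Two NaN fill values count as consistent, so this call returns normally.
check_consistent_fill_value(
    [SimpleNamespace(fill_value=nan), SimpleNamespace(fill_value=nan)]
)

try:
    check_consistent_fill_value(
        [SimpleNamespace(fill_value=0), SimpleNamespace(fill_value=1)]
    )
except ValueError as exc:
    mismatch_message = str(exc)
    print(mismatch_message)
```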
I appreciate getting pinged, but I probably won't be reviewing this one. That's also fine. I probably shouldn't be active on every pull request here. It looks like @shoyer seems pretty happy, which is a good sign. I trust his attention to detail :)
Well I believe there should be at least one reviewer. I'm thinking of ways to rope new contributors in... Reviewers or code-wise. 😄
Closes #143
Initial implementation for arbitrary fill values.