New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial implementation for arbitrary fill values. #165

Merged
merged 12 commits into from Jul 18, 2018

Conversation

Projects
None yet
3 participants
@hameerabbasi
Collaborator

hameerabbasi commented Jul 15, 2018

Closes #143

Initial implementation for arbitrary fill values.

@hameerabbasi hameerabbasi requested review from mrocklin and shoyer Jul 15, 2018

@codecov

This comment has been minimized.

codecov bot commented Jul 15, 2018

Codecov Report

Merging #165 into master will increase coverage by 0.15%.
The diff coverage is 98.97%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #165      +/-   ##
==========================================
+ Coverage   96.96%   97.12%   +0.15%     
==========================================
  Files          11       11              
  Lines        1219     1252      +33     
==========================================
+ Hits         1182     1216      +34     
+ Misses         37       36       -1
Impacted Files Coverage Δ
sparse/io.py 100% <ø> (ø) ⬆️
sparse/coo/core.py 96.63% <100%> (+0.31%) ⬆️
sparse/coo/umath.py 96.82% <100%> (-0.02%) ⬇️
sparse/utils.py 98.4% <100%> (+1.28%) ⬆️
sparse/dok.py 95.57% <100%> (ø) ⬆️
sparse/sparse_array.py 93.1% <100%> (+1.1%) ⬆️
sparse/coo/common.py 97.29% <100%> (+0.03%) ⬆️
sparse/coo/indexing.py 98.9% <87.5%> (-1.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e2906c0...3faf622. Read the comment docs.

hameerabbasi added some commits Jul 15, 2018

@shoyer

I have a few minor comments, but I'm excited about this! The implementation looks quite clean.

This will let us define functions all the nan-aggregations (e.g., nanmedian()) for use in xarray. These functions can simply require a fill value of NaN.

@@ -56,6 +56,7 @@ def linear_loc(coords, shape):
return out
@check_fill_value(2)

This comment has been minimized.

@shoyer

shoyer Jul 15, 2018

Member

I find it more readable to use keyword argument with literal values, e..g, @check_zero_fill_value(nargs=2)

def equivalent(x, y):
return (x == y) | ((x != x) & (y != y))

This comment has been minimized.

@shoyer

shoyer Jul 15, 2018

Member

note: if you're calling this on arrays, you could skip the non-NA checks for dtypes that can't hold NaN/NaT.

def generator(func):
@functools.wraps(func)
def wrapped(*args, **kwargs):
for arg in args[:nargs]:

This comment has been minimized.

@shoyer

shoyer Jul 15, 2018

Member

Watch out: This has one of the same issues we encountered in NEP 18: it means the public API for these functions will be changed, to only accepted positional arguments.

I would suggest writing helper functions to call inside func inside, e.g., check_zero_fill_value(a, b).

@functools.wraps(func)
def wrapped(*args, **kwargs):
for arg in args[:nargs]:
if hasattr(arg, 'fill_value') and \

This comment has been minimized.

@shoyer

shoyer Jul 15, 2018

Member

Nit: per PEP 8, prefer using extra parentheses () to explicit continuation with \.

from .sparse_array import SparseArray
@functools.wraps(func)
def wrapped(arrays, *args, **kwargs):

This comment has been minimized.

@shoyer

shoyer Jul 15, 2018

Member

As above, consider using a function instead of a decorator here.

In this case, the decorator would work OK (as long as the first argument is always called arrays) but decorators are more magical than simple function calls.

However, operations which convert the sparse array into a dense one will raise exceptions
For example, the following raises a :obj:`ValueError`.
However, operations which convert the sparse array into a dense one will usually change the fill
value instead of raising an error.

This comment has been minimized.

@shoyer

shoyer Jul 16, 2018

Member

The new sentence here is a little confusing to me. These operations how change the fill value instead of converting a sparse array to a dense array, so they don't "convert the sparse array into a dense one" at all now.

This comment has been minimized.

@hameerabbasi

hameerabbasi Jul 18, 2018

Collaborator

Changed the sentence structure a bit.

@@ -265,7 +216,6 @@ All of the following will raise an :obj:`IndexError`, like in Numpy 1.13 and lat
z[3, 6]
z[1, 4, 8]
z[-6]
z[[True, True, False, True], 3, 4]
.. note:: Numpy advanced indexing is currently not supported.

This comment has been minimized.

@shoyer

shoyer Jul 16, 2018

Member

below here: maybe note that stack and concatenate require matching fill values, and that some operations (e.g., tensordot) require a fill value of zero?

This comment has been minimized.

@hameerabbasi

hameerabbasi Jul 18, 2018

Collaborator

Done!

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Jul 18, 2018

@shoyer I think this is now ready to merge.
@mrocklin If you have time, if it's going to take too long, let me know.
@ahwillia If you're willing to review. Reviewing is great for learning the codebase, if you're interested.

@shoyer

Just another minor comment. I agree that this looks good to merge!

Traceback (most recent call last):
...
ValueError: This operation requires zero fill values.
"""
for arg in args:
if (hasattr(arg, 'fill_value') and
not equivalent(arg.fill_value, _zero_of_dtype(arg.dtype))):
raise ValueError('This operation requires zero fill values.')

This comment has been minimized.

@shoyer

shoyer Jul 18, 2018

Member

It's a minor point, but it's nice to include offending values in all error messages, e.g., arg.fill_value in this case

This comment has been minimized.

@hameerabbasi

hameerabbasi Jul 18, 2018

Collaborator

I'm not sure what best practice would be here. We wouldn't know what exact argument would produce this fill value, so showing it may be not be useful. What do you think?

This comment has been minimized.

@shoyer

shoyer Jul 18, 2018

Member

Maybe:

'This operation requires zero fill values, but argument {} has fill value {}'.format(i, arg.fill_value)

where i comes from iterating over args with enumerate().

This comment has been minimized.

@hameerabbasi

hameerabbasi Jul 18, 2018

Collaborator

Done.

@@ -276,4 +345,4 @@ def check_consistent_fill_value(arrays):
fv = arrays[0].fill_value
if not all(equivalent(fv, s.fill_value) for s in arrays):
raise ValueError('Consistent fill-values required.')
raise ValueError('This operation requires consistent fill-values.')

This comment has been minimized.

@shoyer

shoyer Jul 18, 2018

Member

same concern as above about including fill values in the error message

hameerabbasi added some commits Jul 18, 2018

@mrocklin

This comment has been minimized.

Collaborator

mrocklin commented Jul 18, 2018

@mrocklin If you have time, if it's going to take too long, let me know.

I appreciate getting pinged, but I probably won't be reviewing this one. That's also fine. I probably shouldn't be active on every pull request here. It looks like @shoyer seems pretty happy, which is a good sign. I trust his attention to detail :)

@hameerabbasi

This comment has been minimized.

Collaborator

hameerabbasi commented Jul 18, 2018

Well I believe there should be at least one reviewer. I'm thinking of ways to rope new contributors in... Reviewers or code-wise. 😄

@hameerabbasi hameerabbasi merged commit f767837 into pydata:master Jul 18, 2018

4 of 5 checks passed

LGTM analysis: Python Analysis Failed (could not build the base commit (e2906c0))
Details
ci/circleci: build_27 Your tests passed on CircleCI!
Details
ci/circleci: build_36 Your tests passed on CircleCI!
Details
codecov/patch 98.97% of diff hit (target 96.96%)
Details
codecov/project 97.12% (+0.15%) compared to e2906c0
Details

@hameerabbasi hameerabbasi deleted the hameerabbasi:fill-values branch Jul 18, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment