Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow dtype input argument in np.sum #4472

Merged
merged 29 commits into from Sep 30, 2019

Conversation

luk-f-a
Copy link
Contributor

@luk-f-a luk-f-a commented Aug 21, 2019

closes #4220

@esc
Copy link
Member

esc commented Aug 21, 2019

@luk-f-a thanks for getting this going! I'll mark it as in-progress for now, please do let us know when it is ready for a review.

@luk-f-a
Copy link
Contributor Author

luk-f-a commented Aug 21, 2019

I have implemented the case np.sum(arr, dtype). Before I move on to np.sum(arr, axis, dtype) it would be helpful if someone could perform a quick review and tell me if I'm on the right path.

@esc
Copy link
Member

esc commented Aug 22, 2019

@luk-f-a thanks, I have scheduled it for a preliminary review.

@stuartarchibald
Copy link
Contributor

@luk-f-a thanks for the patch. I think this approach is likely to work, but as noted on gitter, have to wonder about the longevity of such an addition. That said, it seems to be less involved than anticipated, so perhaps complete this and ensure there's an exceptionally good set of tests for it and this can then really help guide refactoring these implementations to use the newer APIs at some point in the future. What do you think?

@luk-f-a
Copy link
Contributor Author

luk-f-a commented Aug 27, 2019

@stuartarchibald, sounds good, I'll add more tests. Do you think this will be replaced in the short term? I ask because I have an ugly solution that works for sum(axis, dtype) but involves some duplicated code I'm not proud of. I cannot find a workaround however, maybe we can catch up on gitter.

@luk-f-a
Copy link
Contributor Author

luk-f-a commented Aug 31, 2019

@stuartarchibald, I have added more tests. Maybe too many 😄 . I test over a full range of dtypes, for all 4 implentations: sum, sum(axis), sum(dtype) and sum(axis, dtype). some are failing for certain dtypes.
-sum passes for all dtypes.
-sum(axis) fails for timedelta
-sum(dtype) passes for all (shares implementation with sum)
-sum(axis, dtype) (shares implementation with sum(axis)) fails for timedelta and int32.

The error with timedelta comes from type(zero) in arraymath.py line 203. do you know why this would fail?
The error with int32 happens because the result is returned as int64, I haven't been able to track this yet. I'll have another look in the next days.

Please have a look and let me know if these tests cover what you had in mind.

@stuartarchibald
Copy link
Contributor

@luk-f-a, thanks for this, I've taken a look at the tests, this level of detail seems appropriate given the changes proposed. I think some more effort will be needed code generation wise as it seems that the compiler might be being hit too often, will take a closer look when this is nearer ready.

The error with timedelta comes from type(zero) in arraymath.py line 203. do you know why this would fail?

Suspect the timedelta types are lacking __class__ attr resolvers cf.:

@infer_getattr
class NumberAttribute(AttributeTemplate):
key = types.Number
def resolve___class__(self, ty):
return types.NumberClass(ty)
def resolve_real(self, ty):
return getattr(ty, "underlying_float", ty)
def resolve_imag(self, ty):
return getattr(ty, "underlying_float", ty)
@bound_function("complex.conjugate")
def resolve_conjugate(self, ty, args, kws):
assert not args
assert not kws
return signature(ty)
@bound_function("number.item")
def resolve_item(self, ty, args, kws):
assert not kws
if not args:
return signature(ty)

@luk-f-a
Copy link
Contributor Author

luk-f-a commented Sep 6, 2019

@stuartarchibald , I've fixed the problem with np.int32. All that remains is the test failing on sums of timedelta arrays. What is failing is the existing np.sum implementation, not the new code, should I open a separate issue to fix timedelta and disable that test for now?
I tried to fix it, but I cannot find the right place. You showed me an example of NumberAttribute template, but there's nothing similar for timedelta. Should one be created?

@stuartarchibald
Copy link
Contributor

@luk-f-a excellent, good job. Yes, if the timedelta etc problems are present in existing implementations then please file a new issue for them and feel free to ignore here. I think that there's sufficient quantity of churn/new code here without trying to fix bugs/unimplemented corners too!

@stuartarchibald
Copy link
Contributor

Seems like there's some windows CI failures, suspect it's types.intp/windows having different integer size related.

@luk-f-a
Copy link
Contributor Author

luk-f-a commented Sep 9, 2019

@stuartarchibald I created an issue for timedelta. Regarding windows, I am trying to set up a Windows env to test locally. Have you seen this problem before? What's the usual approach when linux and Windows behave differently?

@stuartarchibald stuartarchibald mentioned this pull request Sep 26, 2019
32 tasks
@stuartarchibald
Copy link
Contributor

@stuartarchibald ,test_sum, test_sum_axis_kws are the original ones, the ones I found in the code when I started. I haven't deleted them but I think the new ones supersede them, since they test the same and more. should I delete them?

If they are entirely superfluous given the new tests then yes please, the build is starting to hit CI wall time limits!

@stuartarchibald
Copy link
Contributor

@luk-f-a 0726ed1 seems to have deleted the entire module containing the tests?

As title
@stuartarchibald
Copy link
Contributor

Also, luk-f-a#2 word wraps the text.

@luk-f-a
Copy link
Contributor Author

luk-f-a commented Sep 26, 2019

Also, luk-f-a#2 word wraps the text.

thanks.

seems to have deleted the entire module containing the tests?

sorry, I don't understand how that happened, locally the file was still there. maybe I should not push in the evening.

@@ -170,6 +179,12 @@ def array_sum_const_axis_neg_one(a, axis):
# "const_multi" variant would raise errors
return a.sum(axis=-1)

def array_mean(a):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure these are supposed to be here either?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're too fast, I was just reverting those. I think I got on the wrong side of PyCharm smart checkout, and it brought those from my branch working on np.mean. I think I cleaned it now.

Copy link
Contributor

@stuartarchibald stuartarchibald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Thanks very much for your patience and persistence in implementing this, much appreciated. I think a large number of users will be grateful for this patch/feature too!

@stuartarchibald stuartarchibald added 4 - Waiting on CI Review etc done, waiting for CI to finish and removed 4 - Waiting on author Waiting for author to respond to review labels Sep 26, 2019
@esc
Copy link
Member

esc commented Sep 27, 2019

🎉

@esc esc closed this Sep 27, 2019
@esc esc reopened this Sep 27, 2019
@luk-f-a
Copy link
Contributor Author

luk-f-a commented Sep 27, 2019

@stuartarchibald , thanks for all your help and patience. I hope this will allow the improvement of other functions (like np.mean getting an axis parameter).

@luk-f-a
Copy link
Contributor Author

luk-f-a commented Sep 27, 2019

@esc, thanks for your review and comments.

@esc
Copy link
Member

esc commented Sep 27, 2019

@luk-f-a no problem, now we just need to massage it through CI..

@luk-f-a
Copy link
Contributor Author

luk-f-a commented Sep 29, 2019

@esc @stuartarchibald I don't think Linux py35_np111_cov has run even once since I added the extended set of tests. It always times out. If you merge this PR, every future PR will have the same problem, won't it?

@esc
Copy link
Member

esc commented Sep 29, 2019

@luk-f-a I don't think the cov timeout is caused by this PR.. Think it is pretty safe to assume this PR will be O.K.

@esc
Copy link
Member

esc commented Sep 30, 2019

For azure only the coverage run timed out. Probably safe to merge.

@stuartarchibald stuartarchibald added 5 - Ready to merge Review and testing done, is ready to merge and removed 4 - Waiting on CI Review etc done, waiting for CI to finish labels Sep 30, 2019
@stuartarchibald stuartarchibald merged commit 6359d3b into numba:master Sep 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to merge Review and testing done, is ready to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support for 'dtype' keyword argument in np.sum()
3 participants