ENH: adds np.nancumsum and np.nancumprod #7421
cc @shoyer and pydata/xarray#791
Corresponds to PR numpy#7421, adapted from numpy#7410
Just a note -- since this is new API, somebody might ask you to write to the numpy-discussion mailing list. But I think this should be pretty uncontroversial 👍.
Consistency checks with integer arrays are great, but these also need tests on arrays with NaNs.
Thanks @shoyer, I accidentally left that out. I updated this to include tests on arrays with NaNs following
Could you put the documentation commit in this PR as well? Note that you can keep adding commits to the PR, and later clean up the history with
One is returned for slices that are all-NaN or empty.

.. versionadded:: 1.11.0
We're already at the release candidate stage for 1.11, so this will make it into 1.12
Return the cumulative sum of array elements over a given axis treating Not a
Numbers (NaNs) as zero.

One is returned for slices that are all-NaN or empty.
I guess this is supposed to say zero?
@seberg, thanks for finding that typo (should have been plural form of verb too, not singular-- same also for nancumprod)
Mailing list thread: https://mail.scipy.org/pipermail/numpy-discussion/2016-March/075169.html
@@ -548,7 +551,7 @@ def nanprod(a, axis=None, dtype=None, out=None, keepdims=np._NoValue):
     Parameters
     ----------
     a : array_like
-        Array containing numbers whose sum is desired. If `a` is not an
+        Array containing numbers whose product is desired. If `a` is not an
         array, a conversion is attempted.
While you're at it, nans are treated like "one", not "zero" on line 542/545.
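To make the "one, not zero" point concrete, here is a small sketch of how the product functions handle NaNs. It assumes NumPy 1.12 or later (where `np.nancumprod` is available); the array `a` is made up for illustration.

```python
import numpy as np

a = np.array([2.0, np.nan, 3.0])

# NaN behaves like the multiplicative identity, so it drops out of the product.
total = np.nanprod(a)            # 2 * 1 * 3
running = np.nancumprod(a)       # running product, unchanged at the NaN position

# An all-NaN input therefore collapses to the identity as well.
all_nan = np.nanprod(np.array([np.nan, np.nan]))

print(total, running, all_nan)
```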
Also, if I am not mistaken, `axis`, two lines below, accepts a tuple now. The docs should be updated according to `np.prod`.
Not for the `.accumulate()` method used by the `cumxxx` functions: only one axis.
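The difference between reductions and accumulations can be checked with a short sketch like the one below. The exact exception type raised for a tuple `axis` is an assumption here, so the code catches both `TypeError` and `ValueError` rather than relying on one of them.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# Reductions accept a tuple of axes.
reduced = np.sum(a, axis=(0, 1))
print(reduced.shape)  # (4,)

# Accumulations work along a single axis only; a tuple is expected to be rejected.
try:
    np.cumsum(a, axis=(0, 1))
    multi_axis_ok = True
except (TypeError, ValueError):
    multi_axis_ok = False

single = np.cumsum(a, axis=1)  # one axis is fine, shape is preserved
print(multi_axis_ok, single.shape)
```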
My mistake. By the way, would it make sense to apply the `_ureduce` function from `numpy.lib.function_base` to other places, like most of `numpy.core.fromnumeric`? That way we could add multi-axis support without waiting for the gufunc redesign that seems to be coming.
Which functions do you have in mind? I don't see any obvious place where `_ureduce` could help...
`ptp`, `argmin`, `argmax`. Also, the following might benefit as well, but the internal order of the array would matter: `partition`, `argpartition`, `sort`, `argsort`, `searchsorted`, `cumsum`, `cumprod`. I was even thinking that it might be worth allowing `ravel` to do a partial ravel along a subset of the axes (basically like `_ureduce` does), but that is probably going to require some major rewriting to do properly.
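As a rough illustration of the "partial ravel" idea, the sketch below merges several axes into one trailing axis and then applies a single-axis function. This is not NumPy's `_ureduce`; `argmax_over_axes` is a hypothetical helper written only for this example, and the returned indices refer to the merged (flattened) axes.

```python
import numpy as np

def argmax_over_axes(a, axes):
    """Hypothetical helper: argmax over several axes via a partial ravel."""
    a = np.asarray(a)
    axes = tuple(ax % a.ndim for ax in axes)           # normalise negative axes
    keep = [ax for ax in range(a.ndim) if ax not in axes]
    moved = np.transpose(a, keep + list(axes))         # reduced axes go last
    merged = moved.reshape(moved.shape[:len(keep)] + (-1,))
    return merged.argmax(axis=-1)                      # index into merged axes

a = np.arange(24).reshape(2, 3, 4)
print(argmax_over_axes(a, (1, 2)))
```

For functions where element order matters (such as `cumsum` or `sort`), the order in which the merged axes are flattened would have to be part of the contract, which is the concern raised in the comment above.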
Thanks @madphysicist, change made by 8fd1ee8 fixes the typo.
Only that I made a mistake. Axis only accepts one value here. You were absolutely right.
Return the cumulative sum of array elements over a given axis treating Not a
Numbers (NaNs) as zero.

Zeros are returned for slices that are all-NaN or empty.
Mention that the sum does not change when nans are encountered and that any leading nans are replaced by zeros.
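The behaviour requested for the docstring can be seen in a small example, assuming NumPy 1.12 or later; the input array is made up for illustration.

```python
import numpy as np

a = np.array([np.nan, 1.0, np.nan, 2.0])

# The leading NaN becomes 0, and the running sum stays flat at the later NaN.
with_nan_handling = np.nancumsum(a)

# Plain cumsum propagates the first NaN through every later element.
plain = np.cumsum(a)

print(with_nan_handling)
print(plain)
```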
@madphysicist, thanks, should be fixed by 41adc0f
Have you pushed that commit yet?
Aside from the comments about docs, LGTM. And by all means feel free to ignore the comments about functions other than your own.
@madphysicist, I've pushed commits to address your comments (including for
@madphysicist, actually there are more following a page refresh... hold on a bit. I'll let you know when the others are pushed.
nansum_along_axis : ndarray.
    A new array holding the result is returned unless `out` is
    specified, in which it is returned. The
    result has the same size as `a`, and the same shape as `a` if
Sorry to nitpick, but the line lengths look funny here. Also, `_along_axis` is inconsistent with the other functions. I think you should just leave it as `nansum`.
Agreed, I wondered about that myself but didn't want to depart from convention. Fixed in 29cca2d
Also in 6ae90ec
Reading PEP8 more carefully, it looks like I'm wrong about spacing around binary operators: Hmm....
@shoyer, reverted spacing back on
@@ -22,6 +22,18 @@
          np.array([0.1042, -0.5954]),
          np.array([0.1610, 0.1859, 0.3146])]

# Rows of _ndat with nans converted to ones
_rdat_ones = [np.array([0.6244, 1.0, 0.2692, 0.0116, 1.0, 0.1170]),
I think this should probably be all one big array -- it's only not one array for `_rdat` because `_rdat` is ragged.
@pwolfram I think this is very close, though a test for negative axes would be nice. Otherwise this looks good to me. Sorry for leading you astray on pep8!
assert_almost_equal(res, tgt)
tgt = np.cumsum(_ndat_zeros,axis=axis)
res = np.nancumsum(_ndat, axis=axis)
assert_almost_equal(res, tgt)
@shoyer, was this what you were thinking?
yes, exactly
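A self-contained version of this kind of check, covering negative axes as requested above, might look like the sketch below. The arrays `data` and `data_zeros` are hypothetical stand-ins for the test module's `_ndat` and `_ndat_zeros`, and `check_nancumsum_axes` is a name chosen only for this example.

```python
import numpy as np
from numpy.testing import assert_almost_equal

# Hypothetical stand-ins for _ndat and _ndat_zeros from the test module.
data = np.array([[1.0, np.nan, 3.0],
                 [np.nan, 5.0, 6.0]])
data_zeros = np.where(np.isnan(data), 0.0, data)

def check_nancumsum_axes(mat, mat_zeros):
    # Loop over every valid axis, negative ones included.
    for axis in range(-mat.ndim, mat.ndim):
        tgt = np.cumsum(mat_zeros, axis=axis)
        res = np.nancumsum(mat, axis=axis)
        assert_almost_equal(res, tgt)
    return True

print(check_nancumsum_axes(data, data_zeros))
```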
This PR adds an implementation of `nancumsum` and `nancumprod`. The actual function is a two-liner adapted from `nansum`. Its structure is adapted from PR: numpy#5418
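The "two-liner" idea can be sketched as follows: replace NaNs with the additive identity, then defer to `np.cumsum`. This is an illustration under assumptions, not the code merged in the PR; `nancumsum_sketch` is a hypothetical name, and it uses `np.where` in place of whatever internal helper NumPy uses for the replacement.

```python
import numpy as np

def nancumsum_sketch(a, axis=None, dtype=None, out=None):
    """Illustrative only: cumulative sum with NaNs treated as zero."""
    a = np.asarray(a)
    # Only floating/complex arrays can hold NaN; leave integer input untouched.
    if np.issubdtype(a.dtype, np.inexact):
        a = np.where(np.isnan(a), 0, a)
    return np.cumsum(a, axis=axis, dtype=dtype, out=out)

x = np.array([[1.0, np.nan], [np.nan, 4.0]])
print(nancumsum_sketch(x, axis=0))
```

A `nancumprod` counterpart would follow the same pattern with 1 as the replacement value and `np.cumprod` as the final call.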
res = nf(mat, axis=axis, out=resout)
assert_almost_equal(res, resout)
assert_almost_equal(res, tgt)
@shoyer, I also generalized the test here for consistency and greater coverage.
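The `out` check in the excerpt above can be reproduced in isolation roughly like this, assuming NumPy 1.12 or later. The matrix `mat` is invented for the example; `resout` mirrors the name used in the test.

```python
import numpy as np
from numpy.testing import assert_almost_equal

mat = np.array([[1.0, np.nan], [np.nan, 4.0]])
resout = np.zeros_like(mat)

# The result should be written into `resout` and also returned.
res = np.nancumprod(mat, axis=-1, out=resout)

assert_almost_equal(res, resout)
print(resout)
```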
@shoyer, please see above tests added over the negative axis. Thanks for helping refine this!
Also, btw @shoyer, no worries about the pep8 confusion. I installed the plugin for vim and I have a clearer idea of formatting standards so it was certainly value-added!
OK, this looks good to me. I'll merge this once tests pass unless anyone speaks up with objections...
Thanks @shoyer!
Needed until numpy v1.12, see numpy/numpy#7421
* Adds nancumsum, nancumprod for numpy compatibility. Needed until numpy v1.12, see numpy/numpy#7421
* Adds nancumsum, nancumprod to xarray functions
This PR adds an implementation of `nancumsum` and `nancumprod`. The actual function is a two-liner adapted from `nansum` and `nanprod`. Its structure is adapted from PR: #5418 (a minor typo in the doc string from this PR is fixed too)