ENH: improve nanmedian performance #3335
Remove an unnecessary copy and an unnecessary sort, and use an in-place median on the compressed copy.
median has had the overwrite_input argument, and has sorted by itself, since at least NumPy 1.4.
      if x.size == 0:
          return np.nan
    - return np.median(x)
    + return np.median(x, overwrite_input=True)
About overwrite_input=True:
Is x at this point a copy of the original array with the NaNs removed? And is it always 1-D, or can it be 2-D with no NaNs in a column?
np.compress returns a copy without the NaNs, and this function only works on 1-D arrays.
NumPy 1.5 is the oldest targeted version, so overwrite_input should be fine.
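For reference, the pattern discussed in this thread can be sketched roughly as follows. This is only an illustration, not the exact SciPy source: the helper name `nanmedian_1d` and the sample data are assumptions. It shows why `overwrite_input=True` is safe here: `np.compress` already hands back a fresh array, so the caller's data is never reordered.

```python
import numpy as np

def nanmedian_1d(arr1d):
    """Median of a 1-D array, ignoring NaNs (hypothetical helper name)."""
    # np.compress returns a new array, so arr1d itself is never modified.
    x = np.compress(~np.isnan(arr1d), arr1d)
    if x.size == 0:
        # Only NaNs (or nothing) were present.
        return np.nan
    # x is a private copy, so median may reorder it in place.
    return np.median(x, overwrite_input=True)

data = np.array([3.0, np.nan, 1.0, 2.0, np.nan])
original = data.copy()
result = nanmedian_1d(data)
```

After the call, `data` still holds its values in the original order, because only the compressed copy was handed to `np.median`.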
Since arrays normally contain fewer NaNs than data, it would be faster to locate just the NaNs (np.where(np.isnan(x))) and move them to the end of the array with a little fancy indexing. That said, it is probably best to implement a faster nanmedian in NumPy and then reuse it in SciPy.
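A minimal sketch of that NaN-moving idea, assuming a 1-D float input; the function name `nanmedian_move_nans` is made up for illustration, and this is not what the PR implements. It still takes one working copy so the caller's array is left alone, but the boolean mask over the data is limited to the short tail region.

```python
import numpy as np

def nanmedian_move_nans(arr1d):
    """NaN-ignoring median that relocates NaNs (hypothetical sketch)."""
    x = np.array(arr1d, dtype=float)  # working copy; input stays untouched
    nan_idx = np.where(np.isnan(x))[0]
    n_valid = x.size - nan_idx.size
    if n_valid == 0:
        return np.nan
    # NaNs sitting inside the first n_valid slots need to be overwritten.
    holes = nan_idx[nan_idx < n_valid]
    # The tail has as many slots as there are NaNs; its valid values
    # are exactly the ones needed to fill the holes.
    tail = x[n_valid:]
    fillers = tail[~np.isnan(tail)]
    x[holes] = fillers
    # The first n_valid slots now hold every valid value and no NaNs.
    return np.median(x[:n_valid], overwrite_input=True)

sample = np.array([np.nan, 1.0, 5.0, np.nan, 3.0, 2.0])
moved_result = nanmedian_move_nans(sample)
```

The number of holes always equals the number of valid values in the tail, so the fancy-indexed assignment lines up without any extra bookkeeping.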
Re bottleneck: I'm actually wondering if we want to bundle bottleneck or (better) have it as a dependency.
-1 for a new dependency, that costs much more than it's worth. Replacing some scipy functions with relevant parts of
This looks correct, so merging for 0.14.x. Thanks Julian.