ENH: improve nanmedian performance #3335

Merged
merged 1 commit into from Feb 23, 2014

Conversation

Projects
None yet
6 participants
Contributor

juliantaylor commented Feb 15, 2014

remove unnecessary copy, an unnecessary sort and use inplace median on
the compressed copy.

ENH: improve nanmedian performance
remove unnecessary copy, an unnecessary sort and use inplace median on
the compressed copy.
Contributor

juliantaylor commented Feb 15, 2014

median has the overwrite_input argument and sorts by itself since at least numpy 1.4
what is the oldest supported numpy?

if x.size == 0:
return np.nan
- return np.median(x)
+ return np.median(x, overwrite_input=True)
@josef-pkt

josef-pkt Feb 15, 2014

Member

about overwrite_input=True
Do we have a copy of the original array without any nans and if it is 1d? or 2-d and non nans in a columns?

@juliantaylor

juliantaylor Feb 15, 2014

Contributor

np.compress returns a copy without the nans, this function only works on 1d arrays

@pv

pv Feb 15, 2014

Owner

Numpy 1.5 is the oldest targeted version, so overwrite_input should be fine

Coverage Status

Coverage remained the same when pulling 3ca2774 on juliantaylor:nanmedian into 6fa1a7c on scipy:master.

Contributor

juliantaylor commented Feb 15, 2014

as normally you have less nans than data in your arrays it would be faster to just select the nans (np.where(np.isnan(x))) and move them to the end of the array with a little of fancy indexing.

Though its probably best to do a faster nanmedian in numpy and then reuse it in scipy.
I guess performance sensitive users use bottleneck anyway.

Member

ev-br commented Feb 19, 2014

Re bottleneck: I'm actually wondering if we want to bundle bottleneck or (better) have it as a dependency.

@pv pv added the PR label Feb 19, 2014

Owner

rgommers commented Feb 22, 2014

-1 for a new dependency, that costs much more than it's worth. Replacing some scipy functions with relevant parts of bottleneck may make sense.

rgommers added a commit that referenced this pull request Feb 23, 2014

Merge pull request #3335 from juliantaylor/nanmedian
ENH: improve nanmedian performance

@rgommers rgommers merged commit dfd2e2b into scipy:master Feb 23, 2014

1 check passed

default The Travis CI build passed
Details
Owner

rgommers commented Feb 23, 2014

This looks correct, so merging for 0.14.x. Thanks Julian.

@rgommers rgommers added this to the 0.14.0 milestone Feb 23, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment