
ENH: Implement min(max)imum_filter1d using the MINLIST algorithm #3527

Closed
jaimefrio wants to merge 1 commit into scipy:master from jaimefrio:minmax_filter1d-minlist

Conversation

jaimefrio
Member

This PR implements the minimum_filter1d and maximum_filter1d functions
using Richard Harter's implementation of the MINLIST algorithm. While
slower (~30%) than the MINLINE algorithm currently in place for random
inputs, its performance is consistent for all inputs.
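For readers unfamiliar with MINLIST, the core idea can be sketched in a few lines of Python using a queue of (value, death) pairs, where `death` is the step at which a value falls out of the window. This is only an illustrative sketch, not the C implementation in this PR: it ignores scipy's centered windows and boundary modes and simply computes `out[i] = min(a[i:i+size])`.

```python
from collections import deque

def sliding_min_sketch(a, size):
    """Sliding-window minimum via a monotonic queue of (value, death) pairs.

    Illustrative sketch of the MINLIST idea only; the actual PR code lives
    in ni_filters.c and handles origins and boundary modes.
    """
    q = deque()   # (value, death_index) pairs; values strictly increasing
    out = []
    for i, x in enumerate(a):
        # Retire the front pair once its window has passed.
        if q and q[0][1] <= i:
            q.popleft()
        # Drop larger values from the back: they leave the window no later
        # than x does and are never smaller, so they can never be the minimum.
        while q and q[-1][0] >= x:
            q.pop()
        q.append((x, i + size))
        if i >= size - 1:
            out.append(q[0][0])   # front of the queue is the window minimum
    return out
```

Each element is appended once and popped at most once, which is where the linear worst-case bound comes from.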

@jaimefrio
Member Author

Some timings comparing with the code from #3517:

In [1]: import numpy as np
In [2]: import scipy.ndimage as ndi
In [3]: np.random.seed(0)
In [4]: a = np.random.rand(1000)

In [5]: %timeit ndi.filters.minimum_filter1d(a, 5)
100000 loops, best of 3: 12.5 us per loop # 3517
100000 loops, best of 3: 14.2 us per loop # this PR

In [6]: %timeit ndi.filters.minimum_filter1d(a, 10)
100000 loops, best of 3: 12.8 us per loop # 3517
100000 loops, best of 3: 14.5 us per loop # this PR

In [7]: %timeit ndi.filters.minimum_filter1d(a, 100)
100000 loops, best of 3: 13.3 us per loop # 3517
100000 loops, best of 3: 15.6 us per loop # this PR

In [8]: a = np.linspace(0, 1, 1000)

In [9]: %timeit ndi.filters.minimum_filter1d(a, 5)
100000 loops, best of 3: 17.5 us per loop # 3517
100000 loops, best of 3: 13.6 us per loop # this PR

In [10]: %timeit ndi.filters.minimum_filter1d(a, 10)
10000 loops, best of 3: 24.6 us per loop # 3517
100000 loops, best of 3: 13.8 us per loop # this PR

In [11]: %timeit ndi.filters.minimum_filter1d(a, 100)
10000 loops, best of 3: 142 us per loop # 3517
100000 loops, best of 3: 15.3 us per loop # this PR

@jaimefrio
Member Author

The code is basically a transcription of this:

http://www.richardhartersworld.com/cri/2001/slidingmin.html

The original algorithm was first published (AFAIK) by S.C. Douglas:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.2777

@coveralls

Coverage Status

Coverage remained the same when pulling 108a54c on jaimefrio:minmax_filter1d-minlist into 387b42a on scipy:master.

@rgommers
Member

pep8 failure unrelated.

@rgommers
Member

@jaimefrio on the mailing list you got one response, favoring a switch and a doc update. The latter is missing. I'm not totally sure about the former; this PR seems to say you don't think a switch is needed?

@jaimefrio
Member Author

I have made the code a little more compact, and fixed #3898, but performance is unchanged from the above benchmarks: a little slower for the general case than the current algorithm, but with guaranteed worst case performance that could be over an order of magnitude faster than the current one. The new algorithm is the same used in bottleneck and pandas, so it is tried and tested.
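The worst-case claim is easy to sanity-check with an operation count: in the pair-queue scheme every element is appended once and popped at most once, so the total work is O(n) regardless of input order, while a running-minimum scheme that rescans the window when the minimum expires degenerates to O(n * size) comparisons on monotonic input, which is exactly what the linspace benchmarks above show. A rough Python sketch (illustrative only, with hypothetical names; not the PR's C code):

```python
from collections import deque

def minlist_ops(a, size):
    """Count queue operations for a MINLIST-style pair queue:
    one append per element, at most one pop per element."""
    q = deque()   # (value, death_index) pairs, values increasing
    ops = 0
    for i, x in enumerate(a):
        if q and q[0][1] <= i:        # retire the expired front pair
            q.popleft()
            ops += 1
        while q and q[-1][0] >= x:    # discard dominated values
            q.pop()
            ops += 1
        q.append((x, i + size))
        ops += 1
    return ops

def rescan_ops(a, size):
    """Comparisons needed if every window is rescanned from scratch."""
    return (len(a) - size + 1) * (size - 1)

a = list(range(1000))   # monotonically increasing: worst case for a rescan
print(minlist_ops(a, 100), rescan_ops(a, 100))
```

On this input the queue does at most 2n operations while the rescan does nearly 90,000 comparisons, consistent with the order-of-magnitude gap in the `np.linspace` timings.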

I have added tests, both regression and some general ones, as there did not seem to be any at all.

@coveralls

Coverage Status

Coverage increased (+0.0%) when pulling 5b8f87a on jaimefrio:minmax_filter1d-minlist into 6761b2c on scipy:master.



This PR implements the minimum_filter1d and maximum_filter1d functions
using Richard Harter's implementation of the MINLIST algorithm. While
slower (~30%) than the MINLINE algorithm currently in place for random
inputs, its performance is consistent for all inputs, and is much
faster for long, ordered sequences.
@coveralls

Coverage Status

Coverage increased (+0.0%) when pulling 598c922 on jaimefrio:minmax_filter1d-minlist into 6761b2c on scipy:master.

@rgommers rgommers added this to the 0.15.0 milestone Sep 9, 2014
@rgommers
Member

There were no objections on the mailing list for choosing this algorithm, so I'm inclined to merge this by tomorrow. The performance trade-off seems very reasonable.

rgommers added a commit that referenced this pull request Sep 17, 2014
@rgommers
Member

Rebased (had merge conflict with threading change from Julian) and merged in fdddfc3. Thanks Jaime!

@WarrenWeckesser
Member

The Travis build on Python 3.4 is now failing. See the first Travis job at #4003 and #4004. The error is

scipy/ndimage/src/ni_filters.c: In function ‘NI_MinOrMaxFilter1D’:
scipy/ndimage/src/ni_filters.c:386:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
cc1: some warnings being treated as errors

The relevant code in ni_filters.c is

    struct pairs {
        double value;
        npy_intp death;
    } *ring = NULL, *minpair, *end, *last;   // Line 386

The compiler doesn't like *ring = NULL. That's easy enough to fix, but why is this only happening with Python 3.4? Does it use different compiler flags, forcing it to use C90 conventions?

@jaimefrio
Member Author

I think it was the rebasing with Julian's GIL release code that messed things up, as the declaration of the struct pairs pointers is now preceded by an errmsg[0] = 0;, which is the code mixed with the declarations. Moving the three lines from Julian to after the struct pairs declarations should fix things, will send a PR straight away.

5 participants