WIP: Add minimal distance argument to local_maxima #4024
Adds the optional keyword `distance` to local_maxima (and local_minima) to specify the minimal Euclidean distance allowed between maxima. In case of a conflict, the maximum with the smaller value is dismissed.
The relevant implementation is in Cython and uses a rather "naive brute force" algorithm:
Early benchmarks indicate a slow-down by 400% and more depending on how the
The current API is not necessarily the best one. It may be useful to make this public or add this keyword to h_maxima as well.
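The conflict rule from the description (of two maxima closer than `distance`, the one with the smaller value is dismissed) can be sketched in plain Python. This is not the Cython code from this PR: `filter_by_distance` and its arguments are hypothetical names, the pass is a deliberately naive O(n²) greedy loop, and treating a pair that is exactly `distance` apart as allowed is an assumption.

```python
import math


def filter_by_distance(coords, values, distance):
    """Greedy brute-force sketch (hypothetical helper, not the PR's code).

    coords   -- list of coordinate tuples, one per candidate maximum
    values   -- image value at each candidate
    distance -- minimal Euclidean distance allowed between kept maxima
    """
    # Visit candidates from the largest value to the smallest so that,
    # in a conflict, the smaller maximum is the one that gets dismissed.
    order = sorted(range(len(coords)), key=lambda i: values[i], reverse=True)
    kept = []
    for i in order:
        # Keep the candidate only if no already kept maximum is too close.
        # Assumption: a pair exactly `distance` apart is still allowed.
        if all(math.dist(coords[i], coords[j]) >= distance for j in kept):
            kept.append(i)
    return sorted(kept)


coords = [(0, 0), (0, 2), (5, 5), (5, 6)]
values = [3, 7, 4, 1]
kept = filter_by_distance(coords, values, distance=3)
```

Here the candidates at `(0, 0)` and `(5, 6)` each sit within 3 pixels of a stronger maximum, so only indices 1 and 2 survive. Every candidate is compared against every kept maximum, which is where the cost grows with the number of peaks.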
Can't believe I forgot my earlier attempt #3016 (comment). 😄 I'll need to investigate how these two approaches compare...
Okay, a quick evaluation hints that the implementation in #3016 (comment) is significantly slower and scales very badly with an increasing number of peaks. So the new approach seems like the way to go.
@lagru very cool! Sorry for the silence, it's been busy. =) I haven't had a chance to fully understand the code yet. With my review, I'll probably aim to break the function into some smaller functions so that the flow can be clearer. In the meantime though, it would be great to add some tests to this, and in particular to verify that functions where
Also, imho the keyword argument should be called
Thanks!
@lagru another point, about your initial implementation: you were using brute-force to compare the distance to every other peak. Instead, a scipy.spatial.cKDTree could be used to rapidly search for nearby maxima, once you have the coordinates.
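A minimal sketch of that suggestion, assuming the coordinates and values of all candidate maxima are already available as arrays. `filter_with_kdtree` is a hypothetical name and this is not the code in this PR; it only shows how `cKDTree.query_ball_point` can replace the all-pairs comparison. Note that the ball query is inclusive, so here a pair exactly `distance` apart counts as a conflict.

```python
import numpy as np
from scipy.spatial import cKDTree


def filter_with_kdtree(coords, values, distance):
    """Greedy distance filter using a KD-tree (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    tree = cKDTree(coords)
    keep = np.ones(len(coords), dtype=bool)
    # Strongest maxima first; a stable sort keeps ties in input order.
    for i in np.argsort(-values, kind="stable"):
        if not keep[i]:
            continue  # already dismissed by a stronger neighbour
        # All candidates within `distance` of maximum i. The query is
        # inclusive and also returns i itself, which is skipped.
        for j in tree.query_ball_point(coords[i], r=distance):
            if j != i:
                keep[j] = False
    return np.flatnonzero(keep)


coords = [(0, 0), (0, 2), (5, 5), (5, 6)]
values = [3, 7, 4, 1]
kept = filter_with_kdtree(coords, values, distance=3)
```

Each kept maximum only inspects the candidates the tree returns for its neighbourhood, instead of every other peak.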
Great suggestions. Definitely something to keep in mind/do next. I'm still tweaking the algorithm so I refrained from adding tests and compartmentalizing the code for now. cKDTree looks like it could be really helpful. I initially shied away because it works with "unraveled" indices (that can be solved though). I'm guessing that because it's written in C we could use it directly in Cython. Not sure whether the C-API is part of SciPy's official API though. I'll investigate and see if there's a performance improvement to be had.
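The index mismatch mentioned here can be bridged with NumPy. The sketch below assumes the maxima are stored as flat (raveled) indices into the image, which is a guess about the implementation; the array names are illustrative only.

```python
import numpy as np

image = np.zeros((4, 5))
flat_indices = np.array([7, 13])  # positions in image.ravel()

# One row per point, one column per dimension: the layout a KD-tree expects.
coords = np.column_stack(np.unravel_index(flat_indices, image.shape))

# And back again, from coordinates to flat indices.
back = np.ravel_multi_index(tuple(coords.T), image.shape)
```

With a `(4, 5)` image, flat index 7 corresponds to row 1, column 2, and flat index 13 to row 2, column 3.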
# Evaluate the space within the minimal distance of the current
# maximum
while queue_pop(&to_search, &current_index):
As @jni suggested, a cKDTree might be a good tool to speed this while-block up.
I can't resist cross-referencing this issue: #4048. Perhaps we can solve it and gain on peak_local_max as well.
@jni in the mailing list you mentioned this:
That'd be great to do; however, I don't see how this is only a modulo calculation away. As I understand it, one has to do a modulo call for every dimension in the correct order and evaluate the remainder each time (a % b = r; r = 0 or r = b - 1 -> border, else not border). But that wouldn't solve the problem, because we actually care about not crossing the border in the wrong direction (evaluating an index at the border is fine in itself). So we'd have to consider the offsets as well. That is quite a bit of overhead compared to padding with a border flag. It seems to me that we would be trading memory cost for runtime and complexity. Do you have examples or experience regarding this, or (if you find the time) can you elaborate on where this statement is coming from?
Lars is correct, it's not easy or trivial; if one goes back through my flood fill PR my immediate approach was to remove the padding. That initially seemed to work, but actually let the fill alias off one side and into the opposite side of the image. Fixing this became such a headache that I reverted to padding.
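The aliasing described above is easy to reproduce with flat indices. The sketch below is illustrative only: `unravel` and `checked_neighbor` are hypothetical helpers, not functions from scikit-image. It shows both the wrap-around and the per-dimension check that has to take the offset into account.

```python
def unravel(flat, shape):
    """Convert a flat C-order index into a coordinate tuple."""
    coord = []
    for n in reversed(shape):
        flat, r = divmod(flat, n)  # one division/modulo per dimension
        coord.append(r)
    return tuple(reversed(coord))


def checked_neighbor(flat, offset, shape):
    """Return the flat index of the neighbour at `offset`, or None if
    the step would leave the image (hypothetical helper)."""
    coord = unravel(flat, shape)
    moved = [c + o for c, o in zip(coord, offset)]
    # Being on the border is fine; stepping across it is not.
    if any(m < 0 or m >= n for m, n in zip(moved, shape)):
        return None
    result = 0
    for m, n in zip(moved, shape):
        result = result * n + m
    return result


shape = (3, 4)
last_in_row0 = 3                       # coordinate (0, 3)
naive_right = last_in_row0 + 1         # flat offset for "one step right"
aliased = unravel(naive_right, shape)  # lands on (1, 0), the opposite side
```

The naive flat offset silently moves from the right edge of row 0 to the left edge of row 1, whereas `checked_neighbor(3, (0, 1), shape)` returns `None`. The check costs a division and a comparison per dimension for every neighbour, which is the overhead being weighed against padding.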
Closed in favor of #4165 which proposes an equally fast (if not faster) solution based on SciPy's cKDTree.
Description
Closes #3816.