FIX: select num_peaks if labels is specified #2098

Merged
merged 5 commits into from Sep 6, 2016

Projects

None yet

9 participants

@sciunto
Member
sciunto commented May 21, 2016

This PR fixes the bug described in #1726.

  • I moved duplicated code into an internal function
  • I added a unittest (was not covered before).
@soupault soupault and 1 other commented on an outdated diff May 21, 2016
skimage/feature/tests/test_peak.py
@@ -85,6 +85,20 @@ def test_num_peaks():
assert (3, 5) in peaks_limited
+def test_num_peaks_and_labels():
+ image = np.zeros((7, 7), dtype=np.uint8)
+ labels = np.zeros((7, 7), dtype=np.uint8) + 20
+ image[1, 1] = 10
+ image[1, 3] = 11
+ image[1, 5] = 12
+ image[3, 5] = 8
+ image[5, 3] = 7
+ assert len(peak.peak_local_max(image, min_distance=1, threshold_abs=0)) == 5
+ peaks_limited = peak.peak_local_max(
+ image, min_distance=1, threshold_abs=0, num_peaks=2)
+ assert len(peaks_limited) == 2
@soupault
soupault May 21, 2016 Member

This test does not seem to address the original issue. To cover the case, we need at least 2 labels for input image and call peak_local_max with labels parameter.

@sciunto
sciunto May 22, 2016 Member

it's a typo

@soupault
Member

@mskoh52 @sciunto what should be the preferable behaviour: to return <= num_peaks total, or <= num_peaks for each label?

@mskoh52
mskoh52 commented May 21, 2016

I don't think there is a "right answer", but I would vote for <= num_peaks for each label. I'm imagining partitioning an image into regions of interest with the labels and trying to find peaks in each one, so in that case I'd want to treat each ROI independently.

@sciunto
Member
sciunto commented May 22, 2016 edited

We can add another option like num_peaks_per_label ?

Right now, if num_peaks is for each label, it's not compatible with the actual docstring.

@codecov-io
codecov-io commented May 22, 2016 edited

Current coverage is 90.67% (diff: 100%)

Merging #2098 into master will increase coverage by 0.01%

@@             master      #2098   diff @@
==========================================
  Files           303        303          
  Lines         21763      21797    +34   
  Methods           0          0          
  Messages          0          0          
  Branches       2012       2012          
==========================================
+ Hits          19731      19765    +34   
  Misses         1669       1669          
  Partials        363        363          

Powered by Codecov. Last update e691467...2486759

@sciunto
Member
sciunto commented May 22, 2016

@mskoh52 @soupault I added that option + unittests

@soupault soupault and 1 other commented on an outdated diff May 25, 2016
skimage/feature/tests/test_peak.py
+ labels = np.zeros((7, 7), dtype=np.uint8) + 20
+ image[1, 1] = 10
+ image[1, 3] = 11
+ image[1, 5] = 12
+ image[3, 5] = 8
+ image[5, 3] = 7
+ peaks_limited = peak.peak_local_max(
+ image, min_distance=1, threshold_abs=0, labels=labels)
+ assert len(peaks_limited) == 5
+ peaks_limited = peak.peak_local_max(
+ image, min_distance=1, threshold_abs=0, labels=labels, num_peaks=2)
+ assert len(peaks_limited) == 2
+
+def test_num_peaks_per_labels():
+ image = np.zeros((7, 7), dtype=np.uint8)
+ labels = np.zeros((7, 7), dtype=np.uint8) + 1 #0*6.5
@soupault
soupault May 25, 2016 Member

What does the comment mean?

@soupault soupault commented on an outdated diff May 25, 2016
skimage/feature/tests/test_peak.py
+def test_num_peaks_and_labels():
+ image = np.zeros((7, 7), dtype=np.uint8)
+ labels = np.zeros((7, 7), dtype=np.uint8) + 20
+ image[1, 1] = 10
+ image[1, 3] = 11
+ image[1, 5] = 12
+ image[3, 5] = 8
+ image[5, 3] = 7
+ peaks_limited = peak.peak_local_max(
+ image, min_distance=1, threshold_abs=0, labels=labels)
+ assert len(peaks_limited) == 5
+ peaks_limited = peak.peak_local_max(
+ image, min_distance=1, threshold_abs=0, labels=labels, num_peaks=2)
+ assert len(peaks_limited) == 2
+
+def test_num_peaks_per_labels():
@soupault
soupault May 25, 2016 Member

I'm not sure about this test.
On the one hand, it doesn't check if num_peaks and num_peaks_per_label are different (the former is checked in the test above with the same test case). On the other hand, it doesn't check anything additional to test_num_peaks_and_labels_4quadrants.
I think you should merge test_num_peaks_and_labels and test_num_peaks_per_labels: for example, create image where the same values of num_peaks and num_peaks_per_label guarantee different results and put all 3 asserts into single test.

@soupault soupault and 1 other commented on an outdated diff May 25, 2016
skimage/feature/tests/test_peak.py
+ labels = np.zeros((7, 7), dtype=np.uint8) + 1 #0*6.5
+ image[1, 1] = 10
+ image[1, 3] = 11
+ image[1, 5] = 12
+ image[3, 5] = 8
+ image[5, 3] = 7
+ peaks_limited = peak.peak_local_max(
+ image, min_distance=1, threshold_abs=0, labels=labels)
+ assert len(peaks_limited) == 5
+ peaks_limited = peak.peak_local_max(
+ image, min_distance=1, threshold_abs=0, labels=labels, num_peaks_per_label=2)
+ assert len(peaks_limited) == 2
+
+
+def test_num_peaks_and_labels_4quadrants():
+ image = np.random.uniform(size=(40, 60))
@soupault
soupault May 25, 2016 Member

Please, specify the seed. We may face assertion fail from time to time, if /dev/random has bad luck.

The test itself is OK, but I'd consider decreasing the image size. Let's say to 20x20.

@sciunto
sciunto May 25, 2016 Member

There is a seed at the top of the file.

@sciunto
sciunto May 25, 2016 Member

I decreased the img size

@soupault
Member

@sciunto Sorry for being so meticulous :). Several comments more.

This recursive code is fun! Hope we'll refactor it one day.

@sciunto
Member
sciunto commented May 25, 2016

no problem ;) It's better to be meticulous instead of blowing up the house after the next release :)

@soupault soupault and 3 others commented on an outdated diff May 27, 2016
skimage/feature/tests/test_peak.py
@@ -272,14 +312,14 @@ def test_adjacent_different_objects():
def test_four_quadrants():
- image = np.random.uniform(size=(40, 60))
- i, j = np.mgrid[0:40, 0:60]
- labels = 1 + (i >= 20) + (j >= 30) * 2
+ image = np.random.uniform(size=(20, 30))
@sciunto
sciunto May 27, 2016 Member

It's defined globally at line 8, should be enough no?

@stefanv
stefanv May 27, 2016 Member

I guess that should be fine?

On Fri, 27 May 2016 at 11:02 François Boulogne notifications@github.com
wrote:

In skimage/feature/tests/test_peak.py
#2098 (comment)
:

@@ -272,14 +312,14 @@ def test_adjacent_different_objects():

def test_four_quadrants():

  • image = np.random.uniform(size=(40, 60))
  • i, j = np.mgrid[0:40, 0:60]
  • labels = 1 + (i >= 20) + (j >= 30) * 2
  • image = np.random.uniform(size=(20, 30))

It's defined globally at line 8, should be enough no?


You are receiving this because you are subscribed to this thread.

Reply to this email directly, view it on GitHub
https://github.com/scikit-image/scikit-image/pull/2098/files/b84af59c301beb2ce04639d22b4227b229c94493#r64944523,
or mute the thread
https://github.com/notifications/unsubscribe/AACwD6yqFD9XysFbFgA1XTY1NqgjQWRSks5qFzGlgaJpZM4Ijx7J
.

@sciunto
sciunto May 28, 2016 edited Member

From @stefanv comment, I saw that in __init__.py, we call setup_test() from _shared.testing. So, we have a seed there. Line 8 is redundant.

@soupault
soupault May 28, 2016 Member

What is we need a seed different from the one defined in that setup_test()?
I personally prefer to initialize the RNG in the very same place where it is used.

@sciunto
sciunto May 28, 2016 edited Member

OK, I put a seed at the top of each function with random calls. I think that we'll need to setup our tests with a fixture as Stefan said in #2113, but that's another tasks (it impacts several test suites). Does it make sense?

@stefanv
stefanv via email Jun 2, 2016 Member
@blink1073
blink1073 Jun 2, 2016 Member

Why, because nose is no longer supported ;)?

@blink1073
blink1073 Jun 2, 2016 Member

It is not a small task for a larger lib, see: matplotlib/matplotlib#5405

@soupault
Member
soupault commented Jun 2, 2016

👍

@sciunto sciunto closed this Jun 11, 2016
@sciunto sciunto reopened this Jun 11, 2016
@sciunto
Member
sciunto commented Jun 11, 2016

@scikit-image/core Anything else for this PR?

@sciunto
Member
sciunto commented Jun 21, 2016

@soupault Do we need another opinion here?

@soupault
Member

@sciunto we do. According to the policy one isn't permitted to vote/merge his/her own PR.

@sciunto
Member
sciunto commented Aug 1, 2016

@jni @ahojnnes Are you available to review this PR please? It's pending for a while. ;)

@jni jni and 1 other commented on an outdated diff Aug 2, 2016
skimage/feature/peak.py
@@ -41,6 +42,8 @@ def peak_local_max(image, min_distance=1, threshold_abs=None,
num_peaks : int, optional
Maximum number of peaks. When the number of peaks exceeds `num_peaks`,
return `num_peaks` peaks based on highest peak intensity.
+ num_peaks_per_label : int, optional
+ Maximum number of peaks for each label.
@jni
jni Aug 2, 2016 Contributor

This is an API breaking change (until we switch to Python3 only and make these keyword-only arguments). I suggest you put this new argument at the end so we can avoid the deprecation.

@sciunto
sciunto Aug 2, 2016 Member

Good point. I hesitated because it looks less natural, but you're right. We can change that later... As time goes, I'm more and more favourable to move to py3 only (I was not clear even few months ago).

@jni jni and 1 other commented on an outdated diff Aug 2, 2016
skimage/feature/peak.py
@@ -92,6 +95,19 @@ def peak_local_max(image, min_distance=1, threshold_abs=None,
array([[10, 10, 10]])
"""
+ def get_high_intensity_peaks(image, mask, num_peaks):
+ """
+ Return the highest intensity peak coordinates.
+ """
+ # get coordinates of peaks
+ coord = np.transpose(mask.nonzero())
+ # select num_peaks peaks
+ if coord.shape[0] > num_peaks:
+ intensities = image.flat[np.ravel_multi_index(coord.transpose(),
+ image.shape)]
+ idx_maxsort = np.argsort(intensities)[::-1]
+ coord = coord[idx_maxsort][:num_peaks]
+ return coord
@jni
jni Aug 2, 2016 Contributor

Unless I'm missing something, there is no reason for this function to be nested? Pull it out and add a leading _ underscore to indicate that it's a private function.

@jni
jni Aug 2, 2016 Contributor

Also, you can save a bit of work by transposing later:

coord = np.nonzero(mask)
if len(coord[0]) > num_peaks:
    intensities = image[coord]
    idx_maxsort = np.argsort(intensities)
    coord = np.transpose(coord)[idx_maxsort, -num_peaks:]
return coord
@sciunto
sciunto Aug 2, 2016 Member

This code doesn't work... :)

@jni
jni Aug 3, 2016 Contributor

Well, I just saw that my indexing is wrong, should be [idx_maxsort][-num_peaks:], but other than that it should work...?

@sciunto
sciunto Aug 3, 2016 Member

A correct function would be:

    coord = np.nonzero(mask)
    if len(coord[0]) > num_peaks:
        intensities = image[coord]
        idx_maxsort = np.argsort(intensities)
        coord = np.transpose(coord)[idx_maxsort][-num_peaks:]
        coord = np.row_stack(coord)
    else:
        coord = np.column_stack(coord)
    return coord
@jni
jni Aug 4, 2016 Contributor

You shouldn't need the row_stack call?

btw I was completely unaware of row_stack and column_stack! So handy! I can't tell you how many times I've used newaxis with hstack, which is super-annoying!

At any rate, here's some nice middle ground, seeing the issue of having to stack one way or another:

coord = np.transpose(np.nonzero(mask))
if coord.shape[0] > num_peaks:
    intensities = image[tuple(coord.T)]
    top_idxs = np.argsort(intensities)[-num_peaks:]
    coord = coord[top_idxs]
@sciunto
sciunto Aug 4, 2016 Member

I'm positive to say that if you remove the row_stack, tests blow up. I do not remember exactly the different shapes, but for sure, you need it. :)

I'm not sure which writing is the best (ie also the most legible).

@jni
jni Aug 5, 2016 Contributor

I'm positive that removing ravel_multi_index and flatten is a readability win. =)

@jni
jni Aug 5, 2016 Contributor

*flat

@jni jni commented on the diff Aug 2, 2016
skimage/feature/tests/test_peak.py
i, j = np.mgrid[-3:4, -3:4]
footprint = (i * i + j * j <= 9)
expected = np.zeros(image.shape, float)
- for imin, imax in ((0, 20), (20, 40)):
- for jmin, jmax in ((0, 30), (30, 60)):
+ for imin, imax in ((0, 10), (10, 20)):
+ for jmin, jmax in ((0, 15), (15, 30)):
@jni
jni Aug 2, 2016 Contributor

Why did this test change?

@sciunto
sciunto Aug 2, 2016 Member

I made the size smaller, see above (just to run the tests faster.. a little)

@jni
Contributor
jni commented Aug 2, 2016

@sciunto I have some concerns... =P

@sciunto
Member
sciunto commented Aug 5, 2016

@jni Your version is pushed. :)

François Bou... and others added some commits May 21, 2016
@sciunto François Boulogne FIX: select num_peaks if labels is specified #1726 1674b63
@sciunto François Boulogne TEST: put a seed for each test with random calls 02d1cc0
@sciunto sciunto Make API compatible with previous version a12aff6
@sciunto sciunto Make get_higher_peak private 4eb77e7
@sciunto sciunto Rewrite _get_high_intensity_peaks thanks @jni
2486759
@sciunto
Member
sciunto commented Sep 5, 2016

This PR felt in the forgotten PR basket :) Anyone for a second +1?

@ahojnnes
Member
ahojnnes commented Sep 5, 2016

LGTM, but would like to hear @jni as well.

@emmanuelle
Member

@jni is back from holidays I think :-)

@jni
Contributor
jni commented Sep 6, 2016

@emmanuelle you're right! =) Looking now. =)

@jni jni merged commit 92a3851 into scikit-image:master Sep 6, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@sciunto sciunto deleted the sciunto:bug_localmaxpeak branch Sep 6, 2016
@sciunto
Member
sciunto commented Sep 6, 2016

thanks @jni

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment