Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

h_maxima/minima strange behaviors on floating point image input #4194

Closed
w-hc opened this issue Sep 29, 2019 · 20 comments · Fixed by #4496
Closed

h_maxima/minima strange behaviors on floating point image input #4194

w-hc opened this issue Sep 29, 2019 · 20 comments · Fixed by #4496

Comments

@w-hc
Copy link

@w-hc w-hc commented Sep 29, 2019

Description

I have been trying to use the h_maxima implementation but this function seems to exhibit unstable behaviors. For example,

  1. when the input is all zeros, h_maxima returns a marker filled with all 1s. This seems incorrect.

  2. Also, when my input consists of floating point values, and h set to a floating point threshold, even with a slight 0.1 change in h value, the number of detected instances can vary quite a bit.

Way to reproduce

img = np.zeros((10, 10), dtype=float)
marker = h_maxima(img, h=1)
print(np.unique(marker))

Version information

# Paste the output of the following python commands
from __future__ import print_function
import sys; print(sys.version)
import platform; print(platform.platform())
import skimage; print("scikit-image version: {}".format(skimage.__version__))
import numpy; print("numpy version: {}".format(numpy.__version__))
3.6.8 |Anaconda custom (64-bit)| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0]
Linux-3.10.0-957.5.1.el7.x86_64-x86_64-with-centos-7.6.1810-Core
scikit-image version: 0.15.0
numpy version: 1.16.4
@jni
Copy link
Member

@jni jni commented Sep 30, 2019

@w-hc thanks for the report!

I don't have time to investigate in detail right now, but

  1. It's unclear to me whether this is a bug or an ambiguous case... I'll have to think more about it.
  2. Without more detail, that is, a specific test image and some code to that shows the maxima "varying quite a bit", we won't be able to help you in this case. It's quite probably that the output is correct for your image. The only guarantee that you can have for h_maxima is that the number of detected maxima should increase with decreasing h.

@w-hc
Copy link
Author

@w-hc w-hc commented Sep 30, 2019

Here is script and sample image that can reproduce what I see. You need to uncompress this zip file to get the numpy array.

import numpy as np
from skimage.morphology import h_maxima
from scipy import ndimage as ndi
heatmap = np.load('heatmap.npy')
assert heatmap.dtype == np.float32
for h in np.linspace(0.2, 4.0, num=10):
    num_instances = ndi.label(h_maxima(heatmap, h))[1]
    print('at h= {:.3f}, discovered {} instances'.format(h, num_instances))

Output:

at h= 0.200, discovered 17 instances
at h= 0.622, discovered 25 instances
at h= 1.044, discovered 17 instances
at h= 1.467, discovered 25 instances
at h= 1.889, discovered 17 instances
at h= 2.311, discovered 24 instances
at h= 2.733, discovered 16 instances
at h= 3.156, discovered 23 instances
at h= 3.578, discovered 21 instances
at h= 4.000, discovered 23 instances

The correct number of local maxima should be 25.

heatmap.npy.zip

Here is what this floating array looks like when it is plotted with plt.imshow
Screen Shot 2019-09-30 at 12 58 26

@jni
Copy link
Member

@jni jni commented Oct 2, 2019

@w-hc

The correct number of local maxima should be 25.

well then just use 0.622, what's the problem??? 😂

Kidding of course. Yikes! Either I misunderstand h_maxima or this is indeed a significant bug. Thank you so much for the report and for the test case!

@jni
Copy link
Member

@jni jni commented Oct 2, 2019

Yep, definitely a bug. By computing a bunch of results, displaying them in napari, and slicing through them, I can see individiual maxima blinking on and off, instead of being off and staying off:

import numpy as np
from skimage.morphology import h_maxima as hmaxima
import napari

image = np.load('heatmap.npy')
# use this in ipython --gui=qt, or in a `with napari.gui_qt():` code block
viewer = napari.view_image(image, name='original', colormap='green', blending='additive')
result = np.array([hmaxima(image, f) for f in np.linspace(0.2, 100.0, num=100)])
res_layer = viewer.add_image(result, name='result', colormap='magenta', blending='additive')

@jni
Copy link
Member

@jni jni commented Oct 2, 2019

I suspect that this has to do with the handling of the resolution variable here:

resolution = 2 * np.finfo(image.dtype).resolution
if h < resolution:
h = resolution
h_corrected = h - resolution / 2.0
shifted_img = image - h

One concerning thing is that if the image is float32, which it is here, resolution / 2 gets cast to float64, and h - resolution / 2 is float64, while image - h is float32. I don't know whether this will then cause problems but it's plausible that it would.

@w-hc
Copy link
Author

@w-hc w-hc commented Oct 2, 2019

I tried changing the heatmap dtype to float64 but the error persists, albeit with different number of maxima.

import numpy as np
from skimage.morphology import h_maxima
from scipy import ndimage as ndi
heatmap = np.load('heatmap.npy').astype(np.float64)
assert heatmap.dtype == np.float64
for h in np.linspace(0.2, 4.0, num=10):
    num_instances = ndi.label(h_maxima(heatmap, h))[1]
    print('at h= {:.3f}, discovered {} instances'.format(h, num_instances))
at h= 0.200, discovered 25 instances
at h= 0.622, discovered 12 instances
at h= 1.044, discovered 25 instances
at h= 1.467, discovered 21 instances
at h= 1.889, discovered 11 instances
at h= 2.311, discovered 14 instances
at h= 2.733, discovered 23 instances
at h= 3.156, discovered 16 instances
at h= 3.578, discovered 17 instances
at h= 4.000, discovered 23 instances

@w-hc
Copy link
Author

@w-hc w-hc commented Oct 2, 2019

Just to confirm, the intensity parameter in peak_local_max works differently than h in h_maxima, right?

I have been trying to understand the exact meaning of h_maxima and find the doc in skimage quite difficult to parse.

A local maximum M of height h is a local maximum for which there is at least one path joining M with a higher maximum on which the minimal value is f(M) - h (i.e. the values along the path are not decreasing by more than h with respect to the maximum’s value) and no path for which the minimal value is greater.

After a bit of research, my current understanding is that h_maxima is equivalent to:

  1. Finding all morphological local maxima
  2. Throw away those maxima whose 'height' <= h
    where 'height' is defined as how much you can monotonically descend from this local peak without rising up again.

It is implemented by morphological reconstruction but in effect it is equivalent to what I described above. (disregarding the fact that the heights of remaining maxima would decrement by h)

However, this understanding seems quite different from what the wordings of the doc imply.

@HazenBabcock
Copy link
Contributor

@HazenBabcock HazenBabcock commented Mar 3, 2020

It looks like a rounding error, if you multiply everything by 10 you get more sensible results.

tmp = (10.0 * heatmap)

for h in np.linspace(2, 40, num=10):
    num_instances = ndi.label(h_maxima(tmp, int(h)))[1]
    print('at h= {:.3f}, discovered {} instances'.format(h, num_instances))
at h= 2.000, discovered 25 instances
at h= 6.222, discovered 25 instances
at h= 10.444, discovered 25 instances
at h= 14.667, discovered 25 instances
at h= 18.889, discovered 24 instances
at h= 23.111, discovered 24 instances
at h= 27.333, discovered 23 instances
at h= 31.556, discovered 23 instances
at h= 35.778, discovered 23 instances
at h= 40.000, discovered 23 instances

@HazenBabcock
Copy link
Contributor

@HazenBabcock HazenBabcock commented Mar 3, 2020

This is very ad-hoc, but if you change h_maxima in morphology/extrema.py to:

        resolution = 2 * numpy.finfo(image.dtype).resolution
        if h < resolution:
            h = resolution
        h_corrected = h - 2 * resolution
        shifted_img = image - h

Then the example works as expected.

Edit. I think you also need to scale resolution based on the magnitude of h for this to work properly.

@emmanuelle
Copy link
Member

@emmanuelle emmanuelle commented Mar 3, 2020

@HazenBabcock thanks a lot for taking a look! Are you interested in working in a pull request in order to correct this bug? We can help you of course in this process, if you wish.

@HazenBabcock
Copy link
Contributor

@HazenBabcock HazenBabcock commented Mar 3, 2020

@emmanuelle Sure, happy to try.

@emmanuelle
Copy link
Member

@emmanuelle emmanuelle commented Mar 3, 2020

Great, thank you so much :-)! One difficult part is maybe to add a unit test which would fail currently and pass if you can fix the bug, but if the current test suite passes and the example of @w-hc is fixed (when testing manually), this will definitely be an improvement.

HazenBabcock added a commit to HazenBabcock/scikit-image that referenced this issue Mar 4, 2020
@HazenBabcock
Copy link
Contributor

@HazenBabcock HazenBabcock commented Mar 4, 2020

I think I have a test and fix in my fork, feedback welcome. It is two commits, new test first so that you can see the failure without the fix.

I should also fix extrema.h_minima()?

This still doesn't work correctly for the h = 0.0 case however. Should that be documented as ambiguous? Or should this give the same results as extrema.local_maxima()?

@emmanuelle
Copy link
Member

@emmanuelle emmanuelle commented Mar 4, 2020

Awesome! Could you please open a pull request from your branch so that we can comment on the pull request? If you think that h_minima needs to be fixed in the same way, yes, please, that'd be great :-).

For h=0.0 this is definitely a corner case; I would suggest to raise a ValueError and mention local_maxima in the error message. Would that seem reasonable to you? I'd prefer users to use explicitly local_maxima without having it used under the hood. Since the current behaviour is buggy, no need to set up a deprecation cycle for such a change.

@HazenBabcock
Copy link
Contributor

@HazenBabcock HazenBabcock commented Mar 4, 2020

Hmm, I think I just found another bug, possibly in both versions. If h is greater than the maximum difference in the image h_maxima will still return the highest maxima in the image. I'm pretty sure that in this case it should return nothing, so some more work to do here.

@emmanuelle
Copy link
Member

@emmanuelle emmanuelle commented Mar 4, 2020

Good catch, thanks. If you think this other bug comes from the reconstruction function (hopefully not) you can open another issue which will be fixed later, as you wish.

@HazenBabcock
Copy link
Contributor

@HazenBabcock HazenBabcock commented Mar 4, 2020

Another issue is that h_maxima is silently upgrading shifted_img to numpy.float64 if h is not an integer. Should we enforce that the type of h matches that of image?

@emmanuelle
Copy link
Member

@emmanuelle emmanuelle commented Mar 4, 2020

Hum, this would be a breaking change. It does not make much sense I think to provide a float h when image is of integer type - or maybe it does for half integers? I'm not sure. One possibility would be to modify the docstring, which is inexact at the moment (it says that h is integer, but it's a scalar). We could mention that there will a cast to float if h is float and image is integer (which can be a problem for the memory footprint, otherwise I don't think this is a problem - or am I missing something?).

@HazenBabcock
Copy link
Contributor

@HazenBabcock HazenBabcock commented Mar 5, 2020

The problem is that fails for some non-integer values of h. I added a test that demonstrates this, as well as test that demonstrates that h_maxima (for floats) doesn't work correctly for large h. I've created a pull request for this. I see already that it is not PEP8 compliant..

@emmanuelle
Copy link
Member

@emmanuelle emmanuelle commented Mar 5, 2020

Great! Don't worry too much about PEP8, it's more an indication. If you fix some of it it's cool, but we don't have our CI depend on this so it's more indicative really.

rfezzani pushed a commit that referenced this issue Apr 22, 2020
…4496)

* Add h-maxima test demonstrating floating point failures as reported in #4194.

* Fix for floating point image issue #4194. Documentation correction.

* Add h_minima() test demonstrating floating point image issues.

* Fix h_minima handling of floating point images. Change h_maxima() and h_minima() to raise a ValueError if the image is floating point and h is exactly 0.0. Documentation correction for h_minima().

* Add h-maxima test for integer image, floating point h value. Add h-maxima test for h value large enough that no maxima should be found.

* Fix PEP8 style issues.

* Fix additional PEP8 style issues.

* Remove '101' from list of maxima and minima heights.

* Change 'test_h_maxima_large' to test for height relative to image minimum.

* Add test for a h value that is larger than the range image. Convert image to float if the h parameter is a float. Also check for h=0 in the integer code path.

* Use numpy.ptp() instead of separate calls to numpy.min() and numpy.max(). Improve handling of floating point h value and integer image. Add comment explaining the logic behind how handling floating point images. Directly return results without using h_max intermediate variable.

* Remove external parenthesis. Use parenthesis continuation line. Update h_minima() to match h_maxima(). Add additional tests for h_minima().

* Fix spelling mistake.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

4 participants