Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix blob dog docstring example and normalization #5503

Merged
merged 6 commits into from
Aug 15, 2021

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Aug 6, 2021

Description

This PR fixes a few small issues in the previously merged #4482.

1.) To make the choice of threshold less dependent on the choice of sigma_scale, we should normalize all of the difference-of-Gaussian (DoG) images by (sigma_scale - 1). This was originally suggested in #4482 but I mistakenly omitted it in the version that got merged.

2.) The docstring example was broken by #4482, but this was not caught since our docstring checks were accidentally removed from CI last year (see #5502). Now that the DoG is no longer biased toward larger sigma, it was necessary to adjust the min_sigma to avoid detecting additional peaks associated with the edges of the coins. The adjusted parameters give one peak per coin again.

3.) The default threshold was reduced a bit and perhaps should be reduced even further now that we don't multiply each DoG image by sigma.

Additional info

As another verification, the gallery example still looks fine with the existing threshold of 0.1:

Current gallery example output: (see blob_dog in the center)
plot_blob_main

Note that before #4482 was merged, there was more bias toward larger blobs. See for example the result in the release 0.18.2 docs:
plot_blob_v018

Checklist

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.

…_ratio

without this, larger sigma_ratio will lead to larger magnitude in the difference-of-Gaussians images
@grlee77 grlee77 requested a review from mkcor August 6, 2021 17:07
@mkcor mkcor mentioned this pull request Aug 6, 2021
9 tasks
Copy link
Member

@mkcor mkcor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @grlee77! Would you mind rewriting the function call to peak_local_max (line 363) as well, please? I would suggest:

    local_maxima = peak_local_max(
        image_cube,
        threshold_rel=threshold,
        exclude_border=exclude_border,
        footprint=np.ones((3,) * (image.ndim + 1))
    )

because

  1. it's lighter cognitively to read the keyword arguments in the order given by the function signature;
  2. we don't need to use both threshold_abs and threshold_rel:
    If both `threshold_abs` and `threshold_rel` are provided, the maximum
    of the two is chosen as the minimum intensity threshold of peaks.
  3. I think it's more natural to use threshold_rel, so we are not haunted by rescaling issues... The 0.5 default value is sensible enough as half the maximum intensity:
    threshold_rel : float, optional
    Minimum intensity of peaks, calculated as `max(image) * threshold_rel`.

@@ -214,7 +214,7 @@ def _format_exclude_border(img_ndim, exclude_border):
)


def blob_dog(image, min_sigma=1, max_sigma=50, sigma_ratio=1.6, threshold=2.0,
def blob_dog(image, min_sigma=1, max_sigma=50, sigma_ratio=1.6, threshold=0.5,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!! It didn't make sense to have an absolute threshold at 2.0, especially considering that this function starts by rescaling the input image into the [0, 1] range (image = img_as_float(image))...

skimage/feature/blob.py Outdated Show resolved Hide resolved
Co-authored-by: Marianne Corvellec <marianne.corvellec@ens-lyon.org>
@mkcor
Copy link
Member

mkcor commented Aug 10, 2021

@grlee77 do you agree with my review comment regarding the peak_local_max call?

@grlee77
Copy link
Contributor Author

grlee77 commented Aug 12, 2021

@grlee77 do you agree with my review comment regarding the peak_local_max call?

It is a bigger API difference to change to a relative threshold. I can imagine some scenario where one may want a fixed absolute threshold across a series of input images, which would then become harder to do. I do think the relative threshold would be a more reasonable default than an absolute one, though. Perhaps we should add an additional threshold_rel kwarg defaulting to None for now?

@mkcor
Copy link
Member

mkcor commented Aug 12, 2021

It is a bigger API difference to change to a relative threshold. I can imagine some scenario where one may want a fixed absolute threshold across a series of input images, which would then become harder to do.

Right, I see.

I do think the relative threshold would be a more reasonable default than an absolute one, though. Perhaps we should add an additional threshold_rel kwarg defaulting to None for now?

Yes, great idea! Then, the function call would become:

    local_maxima = peak_local_max(
        image_cube,
        threshold_abs=threshold,
        threshold_rel=None,
        exclude_border=exclude_border,
        footprint=np.ones((3,) * (image.ndim + 1))
    )

which I much prefer over the unordered threshold_rel=0.0 😌

@grlee77
Copy link
Contributor Author

grlee77 commented Aug 14, 2021

Yes, great idea! Then, the function call would become:

Not exactly. I was proposing adding threshold_rel=None to the top-level blob_dog function. We can then pass on both threshold_rel and threshold_abs to peak_local_max.

@mkcor
Copy link
Member

mkcor commented Aug 14, 2021

Not exactly. I was proposing adding threshold_rel=None to the top-level blob_dog function. We can then pass on both threshold_rel and threshold_abs to peak_local_max.

Yes! Sorry, I originally understood your point correctly, then I got mixed up, but I'm +1 on passing threshold_rel=None to the top-level blob_dog function. 👍

@grlee77
Copy link
Contributor Author

grlee77 commented Aug 15, 2021

@mkcor, see the three latest commits for the addition of threshold_rel. This brings up two more things:

1.) It would be useful to extend threshold_rel to blob_doh and blob_log as well. Let me know if you think that is better in a follow-up PR.
2.) we could optionally deprecate "threshold", renaming it to "threshold_abs" for consistency with peak_local_max.

overlap : float, optional
A value between 0 and 1. If the area of two blobs overlaps by a
fraction greater than `threshold`, the smaller blob is eliminated.
threshold_rel : float or None, optional
Minimum intensity of peaks, calculated as
``max(dog_space) * threshold_rel``. Where ``dog_space`` refers to the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
``max(dog_space) * threshold_rel``. Where ``dog_space`` refers to the
``max(dog_space) * threshold_rel``, where ``dog_space`` refers to the

(no big deal)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS: Should we write 'image_cube' instead of 'dog_space' to match the source code? Or is just formal (~ pseudo-code), sensible variable naming for the docstring?

Minimum intensity of peaks, calculated as
``max(dog_space) * threshold_rel``. Where ``dog_space`` refers to the
stack of difference-of-Gaussian (DoG) images computed internally. This
should have a value between 0 and 1. If None, `threshold_abs` is used
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
should have a value between 0 and 1. If None, `threshold_abs` is used
should have a value between 0 and 1. If None, `threshold` is used

You're moving too fast! 😄 But, yes, I would be in favour of deprecating "threshold", renaming it to "threshold_abs" for consistency with peak_local_max.

@mkcor
Copy link
Member

mkcor commented Aug 15, 2021

Thank you, @grlee77!

1.) It would be useful to extend threshold_rel to blob_doh and blob_log as well. Let me know if you think that is better in a follow-up PR.

Good point; I think this change can go in a follow-up PR, indeed.

2.) we could optionally deprecate "threshold", renaming it to "threshold_abs" for consistency with peak_local_max.

Right, plus explicit is better than implicit. I think this update could also be a follow-up PR. I have approved this PR because it's self-contained as is; maybe I'd like for it to ship soon also because it's a bug fix! Thanks for all your diligence 🙏

@grlee77
Copy link
Contributor Author

grlee77 commented Aug 15, 2021

Okay, let me merge this as a bug fix PR and I will make a new one to update the other two functions and make the names consistent.

@grlee77 grlee77 merged commit 1080b31 into scikit-image:main Aug 15, 2021
@magnunor magnunor mentioned this pull request Feb 5, 2022
3 tasks
magnunor added a commit to magnunor/pyxem that referenced this pull request Feb 5, 2022
Due to changes in recent version (0.19) of skimage.feature.blob_dog:
scikit-image/scikit-image#5503
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants