Allow passing callable for high and low threshold of Canny #3070

Closed
jmetz wants to merge 6 commits

Conversation

jmetz
Contributor

@jmetz jmetz commented May 11, 2018

Description

Adds option of passing callables for high_threshold and low_threshold, which
take the magnitude image as inputs. This allows more general threshold strategies.

Checklist

[It's fine to submit PRs which are a work in progress! But before they are merged, all PRs should provide:]

References

Closes #3054, a feature request to add this option.

For reviewers

(Don't remove the checklist below.)

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.

@pep8speaks

pep8speaks commented May 11, 2018

Hello @jmetz! Thanks for updating the PR.

Line 71:28: E241 multiple spaces after ','
Line 86:28: E241 multiple spaces after ','

Line 282:13: E722 do not use bare except'
Line 284:80: E501 line too long (197 > 79 characters)
Line 286:20: E201 whitespace after '('
Line 289:80: E501 line too long (169 > 79 characters)
Line 294:13: E722 do not use bare except'
Line 296:80: E501 line too long (196 > 79 characters)
Line 297:20: E201 whitespace after '('
Line 300:80: E501 line too long (168 > 79 characters)

Line 114:12: E128 continuation line under-indented for visual indent
Line 127:80: E501 line too long (80 > 79 characters)

Comment last updated on May 21, 2018 at 12:52 Hours UTC

Lower bound for hysteresis thresholding (linking edges).
If callable is given, it is applied to the gradient image
to generate the low_threshold.
Member

Maybe you could extend the Examples section to illustrate this possibility, maybe with percentiles?
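A minimal sketch of the percentile idea, assuming the callable receives the gradient-magnitude image as this PR proposes. `percentile_threshold` is a hypothetical helper, not part of scikit-image, and random values stand in for the magnitude that canny computes internally.

```python
import numpy as np


def percentile_threshold(q):
    """Build a callable that maps a magnitude image to its q-th percentile."""
    def threshold(magnitude):
        return float(np.percentile(magnitude, q))
    return threshold


# Random values stand in for the gradient magnitude computed inside canny.
rng = np.random.default_rng(0)
magnitude = rng.random((64, 64))

# Each callable takes the image as its only argument and returns a scalar.
high = percentile_threshold(90)(magnitude)
low = percentile_threshold(70)(magnitude)
```

Under the interface proposed here, the two callables (not the computed values) would be what gets passed as `high_threshold` and `low_threshold`.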

Member

@jmetz, this docstring statement isn't bad but it's not great either. Specifically, I want it to state the type of the callable: takes in image as its only input and returns either a float or an array of the same shape as image as output.

@emmanuelle
Member

Hi @jmetz thank you for your PR. What kind of functions would typically be used to compute thresholds in your experience, apart from percentiles? If it's only percentiles, we can also think of a keyword argument threshold_mode (absolute, percentile etc.) which could be easier to use for non-expert users.

@jmetz
Contributor Author

jmetz commented May 15, 2018

Hi @emmanuelle - I've used skimage's own threshold_li and threshold_otsu in the past... in principle though it can be any function that takes an image (2D ndarray) and returns a single value representing a threshold to use.

Because of this I'd suggest at least having the option to pass in a callable... if need be we could possibly have two canny interfaces, something like canny and canny_advanced or similar, where

  • canny implements the original simple function which takes scalar threshold values (and possibly the quartile option as it currently does). Internally this could simply wrap canny_advanced.
  • canny_advanced would then take the callables as shown in this PR.

Thoughts?

@jni
Member

jni commented May 16, 2018

Hi @emmanuelle and @jmetz

I am -1 for doing separate functions. We already support int/float thresholds and quantiles. From the users' perspective, this simplicity never goes away. Advanced users would then have the complex use at their disposal if they so desire, without the need to clutter the API.

Another example where the advanced API could be used: using threshold_sauvola with different values of k.
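A rough illustration of how a parametrised method can be turned into a one-argument callable with `functools.partial`. `scaled_mean_threshold` is a hypothetical stand-in, used here only so the sketch runs with NumPy alone; it is not the Sauvola formula, and a real use would bind `k` on `threshold_sauvola` in the same way.

```python
from functools import partial

import numpy as np


def scaled_mean_threshold(image, k=0.2):
    """Hypothetical stand-in for a method with a tunable parameter k."""
    return float(image.mean() + k * image.std())


# Binding k leaves a callable that takes the image as its only argument.
threshold_by_k = {k: partial(scaled_mean_threshold, k=k) for k in (0.1, 0.5)}

magnitude = np.arange(16.0).reshape(4, 4)  # placeholder data
values = {k: func(magnitude) for k, func in threshold_by_k.items()}
```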

@jmetz
Contributor Author

jmetz commented May 16, 2018

I tend to agree @jni - are you happy with the PR as is then?

@jni jni changed the title from "Added option of passing callable to high and low threshold of canny fixes #3054" to "Allow passing callable for high and low threshold of Canny" May 17, 2018
Member

@jni jni left a comment

@jmetz I've got a couple more comments. Also, add the new feature to doc/release/release_dev.rst. Also, as @emmanuelle says, ideally you could extend the current Canny example to show this feature in action. You could, for example, fiddle with the example image so that the brightness/contrast decreases over the rows axis and then a global threshold doesn't work anymore, but a local threshold does.

Or if you have your own example data, that also works, but probably that's more of a hassle because we don't want to keep adding data to the repo.

#
# Else if high_threshold and/or low_threshold are callables
# call them to determine the threshold value / image
#
Member

Do you mind removing the whitespace comments? I would also change "call them" to "apply them to the input image".

if callable(high_threshold):
    high_threshold = high_threshold(magnitude)
if callable(low_threshold):
    low_threshold = low_threshold(magnitude)
Member

I'm not going to hold up the PR for this, but ideally we would do some sanity checking here: use try/except so that if there's an error, we raise a TypeError with the appropriate error message (the input type of the callable passed to Canny is wrong; and reproduce the traceback that came with the call), and if the type of high_threshold and low_threshold are wrong, then we raise a TypeError saying that the output type of the callable is wrong.

If you can add this we would be eternally grateful! =)

Contributor Author

Just working on the example quickly over lunch ;)

@jni
Member

jni commented May 17, 2018

@jmetz I won't hold the PR for the example though if you don't have time right now. Let us know! And thanks again! =)

@soupault soupault added the ⏩ type: Enhancement Improve existing features label May 17, 2018
@soupault soupault added this to the 0.14 milestone May 17, 2018
Jeremy Metz added 3 commits May 18, 2018 12:14
When passing in image to low_threshold callable, only
pass in values that are below high_thresh.
This is similar to how low_threshold is usually set to a fraction
of high threshold for scalar threshold values
Also added low SNR image to show strength of callables
@jmetz
Contributor Author

jmetz commented May 18, 2018

@jni - Added callable usage to example.

Even simple data (in this case just a lower-SNR square image) shows the strength of using a callable (here I used threshold_otsu).

For your convenience: the image in the canny example now looks like this:

[Image: screenshot from 2018-05-18 13-19-26]

@jmetz
Contributor Author

jmetz commented May 18, 2018

(Also apologies for that last commit - can consolidate if needed)

ax1.set_title('noisy image', fontsize=20)
axes[0][0].imshow(im, cmap=plt.cm.gray)
axes[0][0].axis('off')
axes[0][0].set_title('Low noise image')
Member

Can you change this to NumPy indexing? [0, 0], etc. Much neater imho.

I also feel like some of this repetition can be removed: instead of individual calls to axis('off'), remove them all and at the end say, for ax in axes.ravel(): ax.axis('off').
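The two suggestions could look roughly like this. It is a sketch with a placeholder random image rather than the gallery example's data, and the 2x2 grid is an assumption; only the first panel is filled in.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
im = rng.random((32, 32))  # placeholder image, not the example's data

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(6, 6))

# NumPy-style indexing on the axes array: axes[0, 0] instead of axes[0][0].
axes[0, 0].imshow(im, cmap=plt.cm.gray)
axes[0, 0].set_title('Low noise image')

# One loop at the end replaces the individual axis('off') calls.
for ax in axes.ravel():
    ax.axis('off')
```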

Member

@jni jni left a comment

@jmetz

That's a spectacular example! Thank you for this!

I've made one more minor comment, and also this comment was not addressed, but I'm happy to approve as-is. Neither change should hold up a merge, but is something that you might be willing to push.

@jmetz
Contributor Author

jmetz commented May 20, 2018

@jni - thanks, I should be able to add those last changes in later today - though your link to the comment that I didn't address doesn't seem to take me to any specific comments, would you be able to let me know which one it is?

@jni
Member

jni commented May 20, 2018

Ah it’s the thing about producing a meaningful error message when the callables don’t produce the right type of output or fail with the given input

@jmetz
Contributor Author

jmetz commented May 20, 2018

@jni - just taking a look at this now... so cleaning up the example to use a loop is straightforward.
Only question there would be how far to go. E.g. I could conceivably use:

names_and_ims = (
    ('Input Image', im), 
    ('Default', edges_1), 
    ...
)
for ax, (name, im) in zip(axes.ravel(), names_and_ims):
    ax.imshow(im)
    ax.set_title(name)

In terms of analysing the callables, it's a little less straightforward how I should check that they have the correct call signature and return the correct things.
As far as I can tell there are two main groups of options:

a) Put the callable-related code in a try-except and then raise an error on an except
b) Check the output (e.g. isinstance(result, numbers.Number) or (isinstance(result, np.ndarray) and (result.ndim == 2))) as well as checking the correct number of inputs with inspect.

I'd suggest we go with a), as that's more Pythonic, and cleaner to code up - what do you guys think?

@jni
Member

jni commented May 21, 2018

@jmetz there are two types of errors to catch:

(1) the callable fails with a TypeError, which probably means that the callable passed has the wrong input type. We should put this in a try/except clause and raise the appropriate error in the except (a TypeError with an explanation of how to use callables here, and the full traceback to help diagnose things).
(2) the callable works but returns an output that is the wrong type (not float or ndarray). This should also be caught (this time with an if statement) and an error raised.
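The two checks described above might be sketched as follows. `resolve_threshold` is a hypothetical helper written for illustration, not the code in this PR; it assumes the callable receives the gradient magnitude and that `raise ... from` is an acceptable way to keep the original traceback attached.

```python
import numbers

import numpy as np


def resolve_threshold(threshold, magnitude, name="threshold"):
    """Turn a scalar or callable threshold into a validated value."""
    if not callable(threshold):
        return threshold
    try:
        result = threshold(magnitude)
    except TypeError as exc:
        # Case (1): the callable rejected the input it was given.
        # Chaining with "from exc" keeps the original traceback visible.
        raise TypeError(
            f"Callable `{name}` must accept the image array as its only "
            "argument."
        ) from exc
    # Case (2): the call worked, but the output has the wrong type or shape.
    is_scalar = isinstance(result, numbers.Number)
    is_image = (isinstance(result, np.ndarray)
                and result.shape == magnitude.shape)
    if not (is_scalar or is_image):
        raise TypeError(
            f"Callable `{name}` must return a scalar or an array with the "
            "same shape as the image."
        )
    return result
```

Any exception other than `TypeError` raised inside the callable passes through unchanged.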

Regarding the example, I would only go for removing the axis('off') calls. The rest of the stuff, I think, would be less clear if drawn into a loop, except for expert users. Would you agree?

@jmetz
Contributor Author

jmetz commented May 21, 2018

@jni - I've implemented the changes outlined, but just before I commit, final detail: I check that the output of the thresholds is either a 2d array or a numbers.Number (i.e. allow for ints etc, so in case the chosen function returns that...) - you mentioned checking for float... just wanted to clarify whether you're happy accepting numbers.Number or want to stick specifically to float?

@jni
Member

jni commented May 21, 2018

@jmetz when you say "2D array" I hope you mean "array of the same shape as the input image"? =)

Yes numbers.Number is exactly correct, thank you for pointing that out!

if not( isinstance(high_threshold, numbers.Number) or (
        isinstance(high_threshold, np.ndarray) and
        (high_threshold.shape == magnitude.shape))):
    raise ValueError("Callable `high_threshold` must take one input (image array) and return a scalar value or image array of the same shape as input image")
Member

Thanks @jmetz! These are great, except they should be TypeErrors. (the same error as when a function is called with the wrong arguments.) And incidentally the except above should not be bare — it should only catch TypeErrors!

Contributor Author

@jmetz jmetz May 21, 2018

@jni Ah... So I checked about TypeError Vs ValueError, and because the issue is with the "value" of the callables, I thought ValueError would be more appropriate?
I.e. to quote the docs for ValueError:
"Raised when a built-in operation or function receives an argument that has the right type but an inappropriate value, and the situation is not described by a more precise exception such as IndexError."

Also as for not having a bare except (I know that's generally quite bad), if there is any issue with a called function, I thought it might still be useful to have the info from canny also, not just have it raise the internal exception... Happy to only catch ValueError though if that's preferred.

Contributor Author

(put another way, as we're in the case where e.g. high_threshold is a callable, its type is technically already correct... Just that its specific value is wrong... Right?)

Member

I once thought the same as you, but I've learned a bit more about type theory since then. =) Specifically, the type of the callable is not just "callable", but Union[(array -> float), (array -> array)]. That is: the inputs and outputs of the callable are part of the type. (Whereas a ValueError would stem from, for example, returning a float that is outside some allowed range.)

And, as I mentioned, you should catch TypeError, because that is the error we are detecting: when the function is unhappy with the inputs as provided.

The errors will propagate through if they are of a different type, and this is what we want: our error message that "you have provided the wrong type of function" will not necessarily be accurate if we catch all errors. If someone, for example, provides a threshold function that sends the data to some Google Compute Engine to process on their TPUs (to get back a threshold 😂), and there is a NetworkError, we don't want to say that it's the wrong kind of function: it is the right kind, but it failed for extraneous reasons.
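To make the argument concrete, here is a small sketch. The alias `ThresholdCallable` is an assumption that approximates the union described above with Python's `typing` module, and `remote_threshold` is a hypothetical callable of the right type that fails for an unrelated reason.

```python
from typing import Callable, Union

import numpy as np

# Inputs and outputs are part of the callable's type.
ThresholdCallable = Callable[[np.ndarray], Union[float, np.ndarray]]


def apply_threshold_callable(func: ThresholdCallable, magnitude: np.ndarray):
    """Call func, translating only TypeError into a usage hint."""
    try:
        return func(magnitude)
    except TypeError as exc:
        raise TypeError("wrong kind of threshold function") from exc


def remote_threshold(magnitude):
    # Right signature, but the (imaginary) remote service is unreachable.
    raise ConnectionError("network unreachable")


try:
    apply_threshold_callable(remote_threshold, np.ones((2, 2)))
except ConnectionError as exc:
    # The unrelated failure arrives untouched, not relabelled as a TypeError.
    propagated = exc
```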

Contributor Author

@jmetz jmetz May 22, 2018

@jni - many thanks for the explanation, glad I know more about type theory wrt functions now too :) - I'll amend the last commit in the morning.

@codecov-io

codecov-io commented May 21, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@a51ffde).
The diff coverage is 78.57%.

@@           Coverage Diff            @@
##             master   #3070   +/-   ##
========================================
  Coverage          ?   85.9%           
========================================
  Files             ?     336           
  Lines             ?   27329           
  Branches          ?       0           
========================================
  Hits              ?   23477           
  Misses            ?    3852           
  Partials          ?       0
Impacted Files                        Coverage Δ
skimage/feature/_canny.py             95.86% <73.68%> (ø)
skimage/feature/tests/test_canny.py   98.41% <88.88%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a51ffde...6a5a790.

@jmetz
Contributor Author

jmetz commented May 23, 2018

@stefanv - I can't see how this can work... setting, for example high_threshold to "robust", but not low_threshold could easily result in nonsensical thresholds (e.g. low_threshold above high_threshold).

As you point out, the whole logic of the low and high threshold does, to some extent, imply a dependency, i.e. low_threshold should always depend on high_threshold for Canny to work.

My vote would be to leave it as is and accept that this dependency is a current limitation, but I'm happy to switch to having one or several hard-coded explicit implementations (such as what I used for the example).

@soupault soupault modified the milestones: 0.14, 0.14.1 May 29, 2018
@soupault soupault modified the milestones: 0.14.1, 0.15 Aug 20, 2018
@soupault soupault modified the milestones: 0.15, 0.16 Apr 20, 2019

# Correct output produced manually with quantiles
# of 0.8 and 0.6 for high and low respectively
correct_output = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
Contributor

use a new line...

    low_threshold = low_threshold(magnitude[mask_threshold])
except:
    traceback.print_exc()
    raise ValueError("Callable `low_threshold` raised above exception.\nIt must take one input (image array) and return a scalar value or image array of the same shape as input image")
Contributor

I think this line is too long...

@stefanv
Member

stefanv commented Dec 10, 2019

@jmetz Do you think the robust thresholding method can be implemented (perhaps not with ideal efficiency, but still) by calling the existing threshold method multiple times? If so, the complexity this PR generates may be worth avoiding, and we can either add a new method that does it the circuitous way (for now) or document that a robust method exists and give appropriate pointers.

@jni
Member

jni commented Dec 10, 2019

I agree. This comment comes from a discussion in the group meeting about the PR. Unless I'm mistaken, these things could be achieved outside of the function of interest, so perhaps this PR is best replaced by a gallery example, which I think would actually be a very valuable addition to our gallery!

@jmetz
Contributor Author

jmetz commented Dec 10, 2019

Hi @stefanv & @jni - tbh I'm a little lost now... So to recap, my original PR was to add in extended functionality in canny by allowing the user to pass in callables as high_threshold and low_threshold.

I can see how this would be deemed quite an advanced functionality and therefore there's reluctance to accept this PR, but I'm confused what's being proposed now instead?

@jni
Member

jni commented Dec 11, 2019

You linked to a part of the wikipedia article that mentions how to find a robust threshold: set the high threshold to be something principled (e.g. Otsu but I guess anything will do), then, set the low threshold to be the same principled method limited to the pixels below the original threshold. I think this would make a very nice example in the gallery.
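The recipe above can be sketched in plain NumPy. `otsu_threshold` below is a hand-written stand-in for `threshold_otsu` so that the block has no scikit-image dependency, `robust_thresholds` is a hypothetical name, and the synthetic values only imitate a gradient magnitude with weak background and strong edges.

```python
import numpy as np


def otsu_threshold(values, nbins=256):
    """Plain NumPy Otsu: pick the bin that maximises between-class variance."""
    counts, edges = np.histogram(values, bins=nbins)
    counts = counts.astype(float)
    centers = (edges[:-1] + edges[1:]) / 2
    weight_lo = np.cumsum(counts)
    weight_hi = np.cumsum(counts[::-1])[::-1]
    mean_lo = np.cumsum(counts * centers) / np.maximum(weight_lo, 1)
    mean_hi = (np.cumsum((counts * centers)[::-1])
               / np.maximum(weight_hi[::-1], 1))[::-1]
    variance = weight_lo[:-1] * weight_hi[1:] * (mean_lo[:-1] - mean_hi[1:]) ** 2
    return float(centers[np.argmax(variance)])


def robust_thresholds(magnitude):
    """High from all pixels; low from the pixels below the high threshold."""
    high = otsu_threshold(magnitude)
    low = otsu_threshold(magnitude[magnitude < high])
    return low, high


rng = np.random.default_rng(0)
magnitude = np.concatenate([
    np.abs(rng.normal(0.1, 0.02, size=900)),  # weak background gradients
    rng.normal(0.8, 0.05, size=100),          # strong edge responses
])
low, high = robust_thresholds(magnitude)
```

Because the low threshold is computed only from values below the high one, it cannot end up above it.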

In fact, I might be inclined to make this be the default going forward. The old defaults seem pretty random!

@jmetz
Contributor Author

jmetz commented Dec 11, 2019

There is a problem with that idea - using Otsu to calculate the threshold values can only be done either

  • within the canny function, as the array the threshold values are applied to is not the input image but the Sobel-filtered values (which are computed within canny).
  • or on a filtered image which has been filtered in exactly the same way as the data will be filtered within canny.
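A small sketch of why the two options matter: the same rule gives a different threshold on the raw image than on its gradient magnitude. `np.gradient` is used as a simplified stand-in for the filtering done inside canny (an assumption made to keep the block NumPy-only), and the 90th percentile is an arbitrary example rule.

```python
import numpy as np


def gradient_magnitude(image):
    """Simplified stand-in for the edge filtering performed inside canny."""
    grad_rows, grad_cols = np.gradient(image.astype(float))
    return np.hypot(grad_rows, grad_cols)


# Synthetic test image: a bright square on a dark background.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0

magnitude = gradient_magnitude(image)

# The same rule applied to two different arrays.
threshold_from_image = float(np.percentile(image, 90))
threshold_from_magnitude = float(np.percentile(magnitude, 90))
```

Only the second value is on the scale that the hysteresis step compares against, which is the point being made about where the threshold has to be computed.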

This was the reason I first suggested passing callables for the threshold values.
We could, though, as you suggest, simply replace the implementation to use the enhanced threshold determination (i.e. Otsu) within canny.

Either way, IMO it makes most sense to close this PR and create one or more issues to cover these proposals.

@jmetz
Contributor Author

jmetz commented Dec 11, 2019

On a related but separate note, I would also suggest that the canny implementation be broken up (i.e. the body of canny simply call into a few subfunctions) into its constituent subfunctions, namely (roughly):

  • edge filtering (Sobel magnitude and direction, which is needed for non-maximum suppression)
  • non-maxima suppression
  • threshold determination
  • hysteresis thresholding

such that canny itself becomes much shorter (and passes PEP8 wrt function length 😅 ) and advanced users can then stitch together the relevant functions, but replacing e.g. the threshold determination step to much more easily create their own canny variant functions, instead of having to copy and paste the long contents of canny and replacing just a few lines amongst the 150 or so LOC.
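A rough sketch of that decomposition, with the threshold step passed in as a replaceable function. All names are hypothetical, `np.gradient` stands in for the Sobel filter, non-maximum suppression is left out for brevity, and the hysteresis assumes `low <= high`; it is not the scikit-image implementation.

```python
import numpy as np


def edge_filter(image):
    """Step 1 (stand-in): gradient magnitude via np.gradient, not Sobel."""
    grad_rows, grad_cols = np.gradient(image.astype(float))
    return np.hypot(grad_rows, grad_cols)


def fractional_thresholds(magnitude, low=0.1, high=0.2):
    """Step 3: one possible rule, fractions of the maximum magnitude."""
    peak = magnitude.max()
    return low * peak, high * peak


def hysteresis(magnitude, low, high):
    """Step 4: keep weak pixels only if they connect to a strong pixel."""
    weak = magnitude > low
    edges = magnitude > high
    rows, cols = edges.shape
    while True:
        padded = np.pad(edges, 1)
        grown = np.zeros_like(edges)
        # OR together the 3x3 neighbourhood of every pixel.
        for dr in range(3):
            for dc in range(3):
                grown |= padded[dr:dr + rows, dc:dc + cols]
        grown &= weak
        if np.array_equal(grown, edges):
            return edges
        edges = grown


def canny_variant(image, determine_thresholds=fractional_thresholds):
    """Stitch the steps together; swap determine_thresholds to customise."""
    magnitude = edge_filter(image)
    # Step 2, non-maximum suppression, is omitted to keep the sketch short.
    low, high = determine_thresholds(magnitude)
    return hysteresis(magnitude, low, high)


image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
edges = canny_variant(image)
```

A custom variant then only needs a different `determine_thresholds` function rather than a copy of the whole body.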

@rfezzani
Member

rfezzani commented Feb 7, 2020

Sorry for coming late to the discussion.
I agree with your last comment @jmetz, I refactored the canny function in #4342 ;-).

I see two other options concerning the discussion above:

  • if we want to allow passing callable for high and low threshold with more flexibility, we can define a new argument thresholds defining both low_threshold and high_threshold (mainly thresholds=(low_threshold, high_threshold)). Doing so, the robust thresholds selection can be implemented with thresholds as a callable...
  • we can also modify the default thresholds selection (when low_threshold and high_threshold are None) to be the robust selection method.
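One way the first option might be resolved internally, as a sketch only: `resolve_thresholds` and `median_split` are hypothetical names, and the `thresholds` argument is the proposal under discussion, not an existing canny parameter. A single callable returns both values, so the low threshold can depend on the high one.

```python
import numpy as np


def resolve_thresholds(thresholds, magnitude):
    """Return (low, high) from a pair of values or a callable producing one."""
    if callable(thresholds):
        thresholds = thresholds(magnitude)
    low, high = thresholds
    if np.any(np.asarray(low) > np.asarray(high)):
        raise ValueError("low threshold must not exceed high threshold")
    return low, high


def median_split(magnitude):
    """Example callable: low is computed from the values at or below high."""
    high = float(np.median(magnitude))
    low = float(np.median(magnitude[magnitude <= high]))
    return low, high


rng = np.random.default_rng(0)
magnitude = rng.random((16, 16))  # placeholder for the gradient magnitude

fixed = resolve_thresholds((0.1, 0.2), magnitude)
adaptive = resolve_thresholds(median_split, magnitude)
```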

@rfezzani
Member

rfezzani commented Dec 8, 2020

@jmetz I propose in #5114 to deprecate low_threshold and high_threshold in favor of the unique parameter thresholds, this may ease the implementation of this PR 😉

@rfezzani
Member

rfezzani commented Dec 9, 2020

As @jni mentioned, simply adding a thresholds argument that supplants low_threshold and high_threshold is an excellent solution to allow passing callable. What do you think @jmetz?

@jmetz
Contributor Author

jmetz commented Dec 10, 2020

As @jni mentioned, simply adding a thresholds argument that supplants low_threshold and high_threshold is an excellent solution to allow passing callable. What do you think @jmetz?

Thanks for the updates to this question @rfezzani, and I agree - switching to a more generic thresholds parameter could be a nice way to achieve additional versatility (and the modification to the current PR should be minimal).

Base automatically changed from master to main February 18, 2021 18:23
@jarrodmillman jarrodmillman modified the milestones: 0.18, 0.20 Jun 4, 2022
@stefanv
Member

stefanv commented Jan 20, 2023

@jmetz Thank you for this PR. It was a great thought experiment in how we can improve the canny implementation in scikit-image, and several concrete ideas have been proposed that we should follow up on.

This specific PR, then, is probably not the right fit, but I hope to see some follow-up PRs that take on the rest.

Thanks again for going through the whole review cycle and engaging in such constructive conversation; I know it's not ideal to have a PR closed after all that work, but I think we all learned a lot.

@stefanv stefanv closed this Jan 20, 2023
@stefanv stefanv mentioned this pull request Jan 20, 2023
2 tasks
Labels
⏩ type: Enhancement Improve existing features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: Add option of using threshold algorithm in Canny