Threshold generalized histogram #5332

Open
wants to merge 21 commits into
base: main

divyank0
Contributor

Description

#5189
Feature request: generalized_histogram_thresholding algorithms

This PR contains the required function, its docstring, and test cases.
I am still working on the gallery examples, but would appreciate a review of what is here so far.

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.

@pep8speaks

pep8speaks commented Apr 15, 2021

Hello @divyank0! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 12:80: E501 line too long (81 > 79 characters)
Line 20:61: W291 trailing whitespace
Line 23:80: E501 line too long (81 > 79 characters)
Line 44:1: E101 indentation contains mixed spaces and tabs
Line 44:1: W191 indentation contains tabs
Line 44:4: E128 continuation line under-indented for visual indent
Line 45:1: W191 indentation contains tabs
Line 45:4: E124 closing bracket does not match visual indentation
Line 49:1: W191 indentation contains tabs
Line 49:4: E126 continuation line over-indented for hanging indent
Line 50:1: W191 indentation contains tabs
Line 51:1: W191 indentation contains tabs
Line 52:1: W191 indentation contains tabs
Line 53:1: W191 indentation contains tabs
Line 54:1: W191 indentation contains tabs
Line 54:6: E126 continuation line over-indented for hanging indent
Line 61:1: E101 indentation contains mixed spaces and tabs
Line 62:17: E128 continuation line under-indented for visual indent
Line 62:23: W291 trailing whitespace
Line 63:17: E128 continuation line under-indented for visual indent
Line 64:17: E128 continuation line under-indented for visual indent
Line 65:17: E128 continuation line under-indented for visual indent
Line 78:1: E101 indentation contains mixed spaces and tabs
Line 78:1: W191 indentation contains tabs
Line 79:1: W191 indentation contains tabs
Line 79:27: E226 missing whitespace around arithmetic operator
Line 80:1: W191 indentation contains tabs

Line 710:9: E126 continuation line over-indented for hanging indent

Line 1299:80: E501 line too long (83 > 79 characters)
Line 1302:5: E303 too many blank lines (2)
Line 1406:5: E303 too many blank lines (4)
Line 1420:80: E501 line too long (81 > 79 characters)

Comment last updated at 2021-05-01 15:23:54 UTC

counts, bin_centers = histogram(image.ravel(), 256, source_range="image")

# (nu, tau, kappa, omega, threshold)
possible_values = [
Member

I guess you want to parametrize this test 🙂

Contributor Author

done

Member

@mkcor mkcor left a comment

Thank you for your contribution! I just took a quick look at your code. I will review more thoroughly later.

default_nu, default_tau, default_kappa, default_omega)
>>> binary = data<=t
"""
assert nu >= 0
Member

Use assertions to self-test the program, not to handle wrong user input. Instead, raise an exception with an informative error message. Possibly handle the exception by overwriting wrong input values with the respective default parameter values?
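
As a rough illustration of that suggestion, the `assert` statements could be replaced by explicit checks that raise `ValueError`. This is only a sketch: the helper name `check_ght_params` is hypothetical, and only the `nu >= 0` constraint appears in the diff above; the bounds on `tau`, `kappa`, and `omega` are assumptions based on how the GHT paper describes those hyperparameters.

```python
def check_ght_params(nu, tau, kappa, omega):
    """Raise ValueError on invalid GHT hyperparameters (hypothetical helper)."""
    # nu >= 0 is the condition asserted in the diff under review.
    if nu < 0:
        raise ValueError(f"nu must be non-negative, got {nu}.")
    # The remaining bounds are assumptions, not taken from this PR.
    if tau < 0:
        raise ValueError(f"tau must be non-negative, got {tau}.")
    if kappa < 0:
        raise ValueError(f"kappa must be non-negative, got {kappa}.")
    if not 0 <= omega <= 1:
        raise ValueError(f"omega must lie in [0, 1], got {omega}.")


# Valid input passes silently; invalid input raises with a clear message.
check_ght_params(nu=1.0, tau=0.1, kappa=0.0, omega=0.5)
try:
    check_ght_params(nu=-1.0, tau=0.1, kappa=0.0, omega=0.5)
except ValueError as err:
    print(err)
```

Unlike `assert`, these checks still run when Python is started with `-O`, and the caller gets an exception type it can reasonably catch.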

@sciunto
Member

sciunto commented Apr 16, 2021

I think we need to discuss why this addition is valuable for the library. A broad range of thresholding algorithms exists, and we need to focus on the most well-known ones. The proposed one seems to be new. Is there a clear advantage?

@divyank0
Contributor Author

Sure Francois, Juan wrote an initial description of why this feature matters in #5189.

Also, while reading the paper and going through other global thresholding implementations, I found GHT to be more powerful than the other global algorithms, which return only one particular threshold value and give the user limited control.

I found the following points useful:

  1. GHT's four hyperparameters let the user choose from a wide variety of thresholds and tune the result to a specific use case, as depicted in the following plots.
  2. GHT doesn't require the histogram to be normalized.
  3. The tau (τ) hyperparameter serves a similar purpose to coarsening or blurring the input histogram.
  4. GHT can reproduce other algorithms as special cases.
  5. It is robust to changes in its parameters.
    image

The thresholds in images 2 to 6 are calculated using GHT with different hyperparameters; the 7th uses the Otsu implementation available in scikit-image.

My knowledge of this subject is limited compared to others', so I welcome any and all comments.

@jni
Member

jni commented Apr 22, 2021

@sciunto see #5189 for the motivation. The fact that a number of existing algorithms can be generalised to this method, and that this method when properly tuned can outperform many neural nets, is a clear advantage.

Having said this, @divyank0 the main issue I see is that GHT Otsu and existing Otsu should match. If they don't, that suggests that there is a bug in the implementation somewhere. Also, see my comment on the original issue for my thoughts about the API:

The signature should take an image or a histogram, where the histogram can be either just the set of frequencies or the bin centres and the frequencies. See the code for Otsu's thresholding for details. The idea is that you can provide the image (as all thresholding methods do), or, if you have a precomputed histogram, you can provide the histogram, which is more efficient.

The parameters should all be accessible and yes, defaulting to Otsu's is probably the most sensible default.
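
A minimal sketch of the image-or-histogram input handling described above. The helper name `resolve_histogram` is hypothetical and is not part of this PR or of scikit-image; it only illustrates accepting an image, a bare array of counts, or a `(counts, bin_centers)` pair. It uses `np.histogram` rather than scikit-image's own histogram helper to stay self-contained.

```python
import numpy as np


def resolve_histogram(image=None, nbins=256, *, hist=None):
    """Return (counts, bin_centers) from an image or a precomputed histogram.

    Hypothetical helper for illustration only.
    """
    if hist is not None:
        if isinstance(hist, (tuple, list)):
            # Precomputed (counts, bin_centers) pair.
            counts, bin_centers = hist
        else:
            # Counts only: assume bin centers 0, 1, 2, ...
            counts = hist
            bin_centers = np.arange(np.size(counts))
    else:
        if image is None:
            raise ValueError("Either `image` or `hist` must be provided.")
        counts, edges = np.histogram(np.ravel(image), bins=nbins)
        bin_centers = (edges[:-1] + edges[1:]) / 2
    return np.asarray(counts, dtype=float), np.asarray(bin_centers, dtype=float)


# Three equivalent ways in: image, counts only, or counts plus bin centers.
image = np.array([[0, 0, 1], [1, 1, 3]])
print(resolve_histogram(image, nbins=4))
print(resolve_histogram(hist=np.array([3.0, 2.0, 1.0])))
print(resolve_histogram(hist=(np.array([1, 2]), np.array([10, 20]))))
```

With a helper like this, the thresholding function itself only ever works on `counts` and `bin_centers`, and callers with a precomputed histogram skip the pass over the image.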

@divyank0
Contributor Author

I reviewed the code and found that the parameters were not configured correctly.
Changing the nu value from 1e10 to 1e20 fixed the issue for tau=0.01.
Please see the attached images for an explanation.

For quick reference: 1e30 ≈ 2**100.
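
The Otsu comparison discussed above can be sketched on a synthetic histogram. This is not the code in this PR: the function names are hypothetical, the histogram is made-up data, and the GHT objective is written from my understanding of the formulas in the GHT paper, so treat it as an assumption to check against the paper. The idea is that with a very large nu and a small tau the GHT score is dominated by the within-class scatter, which is what Otsu's method minimizes.

```python
import numpy as np


def _split_stats(counts, x):
    """Per-split class weights, means, and scatter for a 1-D histogram."""
    def below(z):  # sum over bins 0..i, for each split index i
        return np.cumsum(z)[:-1]

    def above(z):  # sum over bins i+1..end, for each split index i
        return np.cumsum(z[::-1])[-2::-1]

    w0 = np.maximum(1e-30, below(counts))
    w1 = np.maximum(1e-30, above(counts))
    mu0 = below(counts * x) / w0
    mu1 = above(counts * x) / w1
    d0 = below(counts * x**2) - w0 * mu0**2
    d1 = above(counts * x**2) - w1 * mu1**2
    return w0, w1, mu0, mu1, d0, d1


def otsu_from_hist(counts, x):
    """Otsu: pick the split that maximizes the between-class variance."""
    w0, w1, mu0, mu1, _, _ = _split_stats(counts, x)
    return x[:-1][np.argmax(w0 * w1 * (mu0 - mu1) ** 2)]


def ght_from_hist(counts, x, nu, tau, kappa=0.0, omega=0.5):
    """GHT score as I understand it from the paper (assumption)."""
    w0, w1, _, _, d0, d1 = _split_stats(counts, x)
    p0, p1 = w0 / (w0 + w1), w1 / (w0 + w1)
    v0 = np.maximum(1e-30, (p0 * nu * tau**2 + d0) / (p0 * nu + w0))
    v1 = np.maximum(1e-30, (p1 * nu * tau**2 + d1) / (p1 * nu + w1))
    f0 = -d0 / v0 - w0 * np.log(v0) + 2 * (w0 + kappa * omega) * np.log(w0)
    f1 = (-d1 / v1 - w1 * np.log(v1)
          + 2 * (w1 + kappa * (1 - omega)) * np.log(w1))
    return x[:-1][np.argmax(f0 + f1)]


# Synthetic bimodal histogram (made-up data, every bin non-empty).
x = np.arange(256, dtype=float)
counts = (1.0 + 500 * np.exp(-0.5 * ((x - 60) / 12) ** 2)
          + 300 * np.exp(-0.5 * ((x - 180) / 20) ** 2))

t_otsu = otsu_from_hist(counts, x)
t_ght = ght_from_hist(counts, x, nu=1e20, tau=0.01)
print(t_otsu, t_ght)  # the two thresholds are expected to agree closely
```

A check of this kind, run against the library's existing Otsu implementation, would be a natural regression test for the Otsu special case.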

image
image
image

@divyank0
Contributor Author

Any update, @jni?

6 participants