Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement 2-stage multi-Otsu thresholding for improved efficiency #5198

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

grlee77
Copy link
Contributor

@grlee77 grlee77 commented Jan 27, 2021

Description

Currently threshold_multiotsu is only suitable for a very small number of thresholds (e.g. <= 5 for nbins=256) due the exponential nature of the algorithm used. This PR, does not resolve the exponential nature, but does reduce runtime substantially by running the computations in a two stage manner.

The first stage is done with a coarser number of bins and the second refinement stage is done using the full set of bins, but narrower search regions based on the thresholds from stage 1. The results will be identical to the existing implementation (tests were added to verify this).

Benchmark running with various number of classes on the uint8 cameraman image:

classes duration (master) (ms) duration (PR) (ms)
3 1.97 0.873
4 90.1 1.14
5 5750 4.22
6 >1 hour? 44.8
7 >1 day? 626
8 >1 month? 9889

These results are all for the default bin_width_stage1=None setting. Manual tuning can give moderate improvements, so I left bin_width_stage1 as a keyword-only argument, but we could just not expose it if we don't want an additional API parameter.

Alternatives

It should be noted that there are alternative algorithms that have linear (Luessi et. al) or polynomial complexity (Menotti et. al.). that could be considered if further improvement is needed. The linear time algorithm appears to be considerably more complicated, though.

  • Luessi, Martin, et al. "Framework for efficient optimal multilevel image thresholding." Journal of Electronic Imaging 18.1 (2009): 013004.https://doi.org/10.1117/1.3073891

  • Menotti, D. et. al. Efficient Polynomial Implementation of Several Multithresholding Methods for Gray-Level Image Segmentation. Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2015, pp. 350--357. https://doi.org/10.1007/978-3-319-25751-8_42

TODO

  • add ASV benchmark for classes = 2, 3, 4

Checklist

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.

TST: add multi-otsu test cases

BUG: fix probability calculation and binning
@sciunto
Copy link
Member

sciunto commented Jan 28, 2021

extremely interesting!

Base automatically changed from master to main February 18, 2021 18:23
@andres-fr
Copy link
Contributor

Hi, couldn't avoid noticing that the regular Otsu thresholding since v0.18 admits the histogram as parameter, while multiotsu does not. Feeding the histogram directly has many advantageous use cases, one of them is extracting thresholds for multiple images at the same time (e.g. for video sequences).

The code can be easily translated with a few modifications, but I wonder if that would collide with this PR? If not' I'd be happy to submit my own PR. Also feel free to use my version, code below.

Cheers!

def threshold_multiotsu(image=None, classes=3, nbins=256, *, hist=None):
    """
    Modified from ``skimage.filters.threshold_multiotsu`` to admit the
    histogram directly as parameter, following the implementation of
    ``skimage.filters.threshold_otsu`` from skimage 0.18.1.
    """
    if image is None and hist is None:
        raise Exception("Either image or hist must be provided.")

    if image is not None and image.ndim > 2 and image.shape[-1] in (3, 4):
        msg = "threshold_otsu is expected to work correctly only for " \
            "grayscale images; image shape {0} looks like an RGB image"
        warn(msg.format(image.shape))

    # Check if the image has more than one intensity value; if not, return that
    # value
    if image is not None:
        first_pixel = image.ravel()[0]
        if np.all(image == first_pixel):
            return first_pixel

    # calculating the histogram and the probability of each gray level.
    if hist is not None:
        if isinstance(hist, (tuple, list)):
            prob, bin_centers = hist
        else:
            prob = hist
            bin_centers = np.arange(prob.size)
    else:
        prob, bin_centers = histogram(
            image.ravel(), nbins, source_range='image', normalize=True)
    prob = prob.astype('float32')
    nvalues = np.count_nonzero(prob)
    if nvalues < classes:
        msg = ("The input image has only {} different values. "
               "It can not be thresholded in {} classes")
        raise ValueError(msg.format(nvalues, classes))
    elif nvalues == classes:
        thresh_idx = np.where(prob > 0)[0][:-1]
    else:
        # Get threshold indices
        try:
            thresh_idx = _get_multiotsu_thresh_indices_lut(prob, classes - 1)
        except MemoryError:
            # Don't use LUT if the number of bins is too large (if the
            # image is uint16 for example): in this case, the
            # allocated memory is too large.
            thresh_idx = _get_multiotsu_thresh_indices(prob, classes - 1)

    thresh = bin_centers[thresh_idx]

    return thresh

@grlee77
Copy link
Contributor Author

grlee77 commented Aug 26, 2021

Hi @andres-fr, please go ahead and open a PR with the feature you mention. I think I would recommend creating your PR independently of this one by branching from main. If yours gets reviewed/merged first, I can deal with resolving any conflicts in this one.

Given that adding the hist argument is simpler than this PR, hopefully it could get a fairly quick review and approval.

@grlee77
Copy link
Contributor Author

grlee77 commented May 6, 2022

will resolve the conflicts and add the benchmarks here when I get a chance

@alexdesiqueira
Copy link
Member

Thank you @grlee77!

@grlee77 grlee77 mentioned this pull request May 12, 2022
@grlee77
Copy link
Contributor Author

grlee77 commented May 12, 2022

The MacOS failures seem to be related to the recent release of pip 22.1. Currently testing pinning to pip<22.1 in #6379

Copy link
Member

@alexdesiqueira alexdesiqueira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @grlee77! Looks good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants