restoration.rolling_ball taking 2+mins #5193

Closed
ajoshi80 opened this issue Jan 22, 2021 · 8 comments

@ajoshi80

The restoration.rolling_ball function takes over 2 minutes for a single 1800x1800 image with a radius-15 rolling ball.

Way to reproduce

    import cv2 as cv  # imports implied by the snippet
    from skimage import restoration, util

    image = cv.imread(path)
    image_inverted = util.invert(image)
    background_inverted = restoration.rolling_ball(image_inverted, radius=15)
    filtered_image_inverted = image_inverted - background_inverted
    filtered_image = util.invert(filtered_image_inverted)

Version information

# Paste the output of the following python commands
from __future__ import print_function
import sys; print(sys.version)
import platform; print(platform.platform())
import skimage; print("scikit-image version: {}".format(skimage.__version__))
import numpy; print("numpy version: {}".format(numpy.__version__))
Python 3.7.7 (default, May  7 2020, 21:25:33) 
[GCC 7.3.0] :: Anaconda, Inc. on linux

Linux-4.4.0-122-generic-x86_64-with-debian-stretch-sid
scikit-image version: 0.18.1
numpy version: 1.19.0
@jni
Member

jni commented Jan 23, 2021

It might just be that rolling ball is a slow algorithm. @FirefoxMetzger did a pretty heroic job optimizing the implementation. Could you check how long Fiji takes to do the same thing?

https://imagej.net/Rolling_Ball_Background_Subtraction

Having said that, a lot of the optimization required might have had to do with us trying to cater to nD + floating point and we just made things harder on ourselves...

@FirefoxMetzger
Contributor

Yes, the rolling ball does some heavy computation in the main loop. Does your image have a color channel? In that case, you likely want to convert it to grayscale beforehand or specify the kernel manually to avoid unnecessary computation along the channel dimension.

Is it possible for you to share the image with us, so we can benchmark it on our machines? This usually makes it easier to figure out what is going on.

Sidenote: The kernel is twice the size of the radius in each dimension (to fit the ball's diameter), which - assuming you use a grayscale image - means something like 3 billion (3.11e9) multiplications and additions for an 1800x1800 image, irrespective of the number format. The algorithm isn't too cache-friendly, so add to that the time it takes to constantly load data from RAM, and that could explain the two minutes.
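
For reference, that figure can be reproduced with a quick back-of-the-envelope calculation (a sketch assuming the 2*radius + 1 kernel side length used in the examples further down):

radius = 15
kernel_side = 2 * radius + 1             # the kernel spans the ball's diameter
pixels = 1800 * 1800
print(f"{kernel_side**2 * pixels:.2e}")  # one multiply-add per kernel element per pixel -> 3.11e+09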

A few things to check that might impact performance:

  • Convert color images to grayscale if you can; otherwise pass kernel= and explicitly declare an ellipsoid kernel that is small (e.g. of size 1) along the channel dimension. The algorithm zero-pads the image along each dimension to fit the kernel; padding along the channel dimension likely makes no sense for your data and costs performance.
  • Check if OpenMP acceleration is enabled. The algorithm takes a num_threads keyword argument that does pretty much what it says. You can run it with different values and see if the execution time changes (see the timing sketch after this list); if it doesn't, consider building scikit-image with OpenMP enabled.
  • Check if your CPU supports at least SSE2 instructions. This is almost certainly the case for Intel and AMD chips; for ARM it is worth checking. The algorithm works without SSE2, but since the numbers can't be churned as quickly, you will wait longer for a result.
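
For the OpenMP bullet, a minimal timing sketch (using a synthetic image in place of your data; the exact numbers will of course differ):

import time

import numpy as np
from skimage import restoration

# Synthetic 1800x1800 grayscale image standing in for the real data.
rng = np.random.default_rng(0)
gray = rng.random((1800, 1800))

# If scikit-image was built with OpenMP, larger num_threads values should shorten
# the runtime; if the timings barely change, OpenMP is likely disabled in the build.
for n in (1, 2, 4):
    start = time.perf_counter()
    restoration.rolling_ball(gray, radius=15, num_threads=n)
    print(f"num_threads={n}: {time.perf_counter() - start:.2f} s")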

@ajoshi80
Author

ajoshi80 commented Jan 25, 2021

The image has color channels. I have tried Fiji/ImageJ's version and it can do it in less than 20s with a rolling ball radius of 30. If we convert to grayscale and compute the background, do we then subtract it from each channel?

@FirefoxMetzger
Contributor

That explains it; where possible we try to be unopinionated. The original rolling ball algorithm - by that I mean the paper from the 90's - only specified the algorithm for grayscale images; how to extend it to color images, volumes, time series, etc. depends on your use case.

For you, it likely doesn't make sense to have the ball extend along the color dimension. You have two options:

  1. Do what you suggested: convert to grayscale, compute the background there, and subtract it from each channel.
  2. Filter each channel individually.

The code for option 1 looks something like this

from skimage import io, restoration, color, util

image = io.imread(path)  # Note: if you use OpenCV's cv.imread the channel order will be BGR not RGB
image = util.img_as_float64(image)
gray_image = color.rgb2gray(image)
image_inverted = util.invert(image)
gray_inverted = util.invert(gray_image)
background = restoration.rolling_ball(gray_inverted, radius=15)
filtered_image_inverted = image_inverted - background[..., None]
filtered_image = util.invert(filtered_image_inverted)

The entire script takes about 900ms on my machine, including the loading of the image from disk.

The code for option 2 looks something like this

from skimage import io, restoration, util

image = io.imread("test.png")  # Note: if you use OpenCV's cv.imread the channel order will be BGR not RGB
image = util.img_as_float64(image)
image_inverted = util.invert(image)
kernel = restoration.ellipsoid_kernel((31, 31, 1), 31)  # filter each color channel individually
background = restoration.rolling_ball(image_inverted, kernel=kernel)
filtered_image_inverted = image_inverted - background
filtered_image = util.invert(filtered_image_inverted)

which takes about 15.5s on my machine, again including the initial loading from disk.

You can reduce the time of the second option by filtering each channel separately, which makes the image a little more cache and SIMD friendly.

from skimage import io, restoration, util
import numpy as np

image = io.imread("test.png")  # Note: if you use OpenCV's cv.imread the channel order will be BGR not RGB
image = util.img_as_float64(image)
image_inverted = util.invert(image)
background_R = restoration.rolling_ball(image_inverted[..., 0], radius=15)
background_G = restoration.rolling_ball(image_inverted[..., 1], radius=15)
background_B = restoration.rolling_ball(image_inverted[..., 2], radius=15)
background = np.stack([background_R, background_G, background_B], 2)
filtered_image_inverted = image_inverted - background
filtered_image = util.invert(filtered_image_inverted)

This takes about 1.4 s on my machine. Or, more elegantly (but slower), you can temporarily store the image channel-first:

from skimage import io, restoration, util
import numpy as np

image = io.imread("test.png")  # Note: if you use OpenCV's cv.imread the channel order will be BGR not RGB
image = np.moveaxis(image, 2, 0) # make the image channel-first
image = util.img_as_float64(image)
image_inverted = util.invert(image)
kernel = restoration.ellipsoid_kernel((1, 31, 31), 31)  # filter each color channel individually
background = restoration.rolling_ball(image_inverted, kernel=kernel)
filtered_image_inverted = image_inverted - background
filtered_image = util.invert(filtered_image_inverted)
filtered_image = np.moveaxis(filtered_image, 0, 2)  # make the image channel-last

This takes about 2.5 seconds on my machine.


I don't know how ImageJ does this under the hood, but maybe someone else knows? Either way, I hope this helps you reduce the runtime a bit.

@grlee77
Contributor

grlee77 commented Jan 25, 2021

Thanks for the detailed feedback here @FirefoxMetzger.

A couple of additional possibilities would be:

  • convert to HSV and filter the V (value) channel then convert back to RGB
  • convert to YUV and filter the Y (luminance) channel then convert back to RGB

I guess in practice one would have to compare the quality of these various approaches in an application to determine which is most appropriate to use. I also don't know which approach ImageJ takes.

@grlee77
Contributor

grlee77 commented Jan 25, 2021

Actually, there is a link to the Java code in one of the links above. It looks like it uses a color-conversion approach as in my prior comment (HSB, with filtering on the "brightness" channel).

I think HSB is just a synonym for what scikit-image calls HSV, so the scikit-image conversion functions are skimage.color.rgb2hsv and skimage.color.hsv2rgb.

This is the Java code from the link above:

    public void subtractRGBBackround(ColorProcessor ip, int ballRadius) {
        int width = ip.getWidth();
        int height = ip.getHeight();
        byte[] H = new byte[width*height];
        byte[] S = new byte[width*height];
        byte[] B = new byte[width*height];
        ip.getHSB(H, S, B);
        ByteProcessor brightness = new ByteProcessor(width, height, B, null);
        subtractBackround(brightness, radius);
        ip.setHSB(H, S, (byte[])brightness.getPixels());
    }

This should also be fast, as in the cases @FirefoxMetzger benchmarked above.
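
For completeness, a rough scikit-image sketch of the same HSB/HSV idea (the file name and radius are placeholders; like the examples above it inverts first, assuming dark features on a bright background):

from skimage import color, io, restoration, util

image = io.imread("test.png")      # placeholder path, assumed to be RGB
hsv = color.rgb2hsv(image)         # hue, saturation, value as floats in [0, 1]

# Filter only the value (brightness) channel, mirroring ImageJ's HSB approach.
value_inverted = util.invert(hsv[..., 2])
background = restoration.rolling_ball(value_inverted, radius=15)
hsv[..., 2] = util.invert(value_inverted - background)

filtered_image = color.hsv2rgb(hsv)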

@FirefoxMetzger
Contributor

Nice find, @grlee77! This is indeed very similar to the grayscale-conversion idea. It only filters a single channel, so one would expect the timings to be close to the grayscale approach, i.e., around 1 second of runtime.

@grlee77
Contributor

grlee77 commented Jan 25, 2021

Feel free to continue the discussion here, but it looks like this issue can be closed, as performance seems satisfactory as long as a 2D rolling ball is applied to the color image.

If desired, please open a separate issue regarding a potential enhancement for color image processing. Specifically, we could add an rgb kwarg to the method and have it implement the HSV approach mentioned above if set to True.
