faster conversion between RGB and HSV colorspaces #5362

grlee77 · 2021-04-28T21:32:12Z

Description

The current code for rgb2hsv and hsv2rgb is pretty inefficient. This PR uses Cython to accelerate these two functions, while also reducing the peak memory required. Related to #1133, but this PR only improves two of the functions in the module.

For the included benchmarks, I get ~20 times faster computation, and the memory used by hsv2rgb is several times smaller. The large memory footprint of the current hsv2rgb implementation comes from the following lines, which create several temporary arrays matching the size of the input image:

scikit-image/skimage/color/colorconv.py

Lines 315 to 322 in 821c7f2

    
    hi = np.stack([hi, hi, hi], axis=-1).astype(np.uint8) % 6
    out = np.choose(
        hi, np.stack([np.stack((v, t, p), axis=-1),
                      np.stack((q, v, p), axis=-1),
                      np.stack((p, v, t), axis=-1),
                      np.stack((p, q, v), axis=-1),
                      np.stack((t, p, v), axis=-1),
                      np.stack((v, p, q), axis=-1)]))
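To illustrate how a per-pixel loop avoids those temporaries, here is a minimal pure-Python sketch. It is not the Cython code from this PR: the name `hsv2rgb_inner_sketch` is hypothetical, and the loop is only meant to show that each pixel can be written straight into the output using the same six `(v, t, p)`-style cases as the `np.choose` call above. It is checked against the standard library's `colorsys.hsv_to_rgb`.

```python
import colorsys
from math import floor

import numpy as np


def hsv2rgb_inner_sketch(hsv):
    """Convert an (N, 3) array of HSV pixels in [0, 1] to RGB.

    Hypothetical illustration only: one output array is allocated and
    every pixel is written in place, so no image-sized temporaries exist.
    """
    rgb = np.empty_like(hsv)
    for i in range(hsv.shape[0]):
        h, s, v = hsv[i, 0], hsv[i, 1], hsv[i, 2]
        h6 = h * 6.0
        hi = floor(h6)            # integer hue sector before wrapping
        f = h6 - hi               # fractional position inside the sector
        p = v * (1.0 - s)
        q = v * (1.0 - f * s)
        t = v * (1.0 - (1.0 - f) * s)
        sector = hi % 6
        # Same ordering of cases as the np.choose table in colorconv.py.
        if sector == 0:
            rgb[i, 0], rgb[i, 1], rgb[i, 2] = v, t, p
        elif sector == 1:
            rgb[i, 0], rgb[i, 1], rgb[i, 2] = q, v, p
        elif sector == 2:
            rgb[i, 0], rgb[i, 1], rgb[i, 2] = p, v, t
        elif sector == 3:
            rgb[i, 0], rgb[i, 1], rgb[i, 2] = p, q, v
        elif sector == 4:
            rgb[i, 0], rgb[i, 1], rgb[i, 2] = t, p, v
        else:
            rgb[i, 0], rgb[i, 1], rgb[i, 2] = v, p, q
    return rgb


rng = np.random.default_rng(0)
hsv_pixels = rng.random((100, 3))
rgb_pixels = hsv2rgb_inner_sketch(hsv_pixels)
reference = np.array([colorsys.hsv_to_rgb(*px) for px in hsv_pixels])
```

In pure Python this loop is of course slow; the point is only the memory access pattern, which is what a compiled loop would reproduce.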

Note

The calls to reshape(-1, 3) are used to collapse multiple spatial dimensions into a single spatial axis to loop over in the Cython code. The output of the Cython function is then restored to the original shape.
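The reshape pattern described in the note can be sketched as follows. The wrapper name `apply_per_pixel` and the stand-in kernel `swap_red_blue` are hypothetical; in the PR the inner function would be the Cython routine.

```python
import numpy as np


def apply_per_pixel(image, inner):
    """Run `inner` on an image of shape (..., 3).

    `inner` is assumed to take and return a C-contiguous (N, 3) array,
    like the inner functions discussed in this PR.
    """
    image = np.ascontiguousarray(image)
    if image.shape[-1] != 3:
        raise ValueError("the last axis must have length 3")
    flat = image.reshape(-1, 3)        # collapse all spatial axes into one
    out = inner(flat)                  # loop over pixels happens in here
    return out.reshape(image.shape)    # restore the original shape


def swap_red_blue(pixels):
    # Stand-in kernel for the demo: reverses the channel order.
    return pixels[:, ::-1].copy()


volume = np.arange(2 * 4 * 5 * 3, dtype=np.float64).reshape(2, 4, 5, 3)
swapped = apply_per_pixel(volume, swap_red_blue)
```

Because the reshape of a contiguous array is a view, this adds no extra copy beyond the output allocated by the inner function.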

Checklist

Docstrings for all functions
Gallery example in ./doc/examples (new features only)
Benchmark in ./benchmarks, if your changes aren't covered by an existing benchmark
Unit tests
Clean style in the spirit of PEP8

For reviewers

Check that the PR title is short, concise, and will make sense 1 year later.
Check that new functions are imported in corresponding __init__.py.
Check that new features, API changes, and deprecations are mentioned in doc/release/release_dev.rst.

pep8speaks · 2021-04-28T21:32:16Z

Hello @grlee77! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file benchmarks/benchmark_color.py:

Line 45:80: E501 line too long (88 > 79 characters)

Comment last updated at 2021-08-16 15:34:48 UTC

grlee77 · 2021-04-28T21:41:57Z

I wrote simple CUDA kernel code for all color conversions in cuCIM. This is done by JIT compiling elementwise kernels for each color conversion (example for separate_stains), giving a large acceleration over the CPU code.

The cases involving HSV were the ones where the relative performance difference was largest, so I have backported a CPU equivalent here for that one.

grlee77 · 2021-04-28T21:49:59Z

Actually, these functions are also a reasonable use case for Pythran. I tried this just out of interest and the following is what the Pythran code for rgb2hsv_inner that I came up with looks like:

import numpy as np
from math import floor

#pythran export rgb2hsv_inner(float64[:, 3] order (C))
#pythran export rgb2hsv_inner(float32[:, 3] order (C))
def rgb2hsv_inner(rgb):
    hsv = np.empty_like(rgb)
    n = rgb.shape[0]
    for i in range(n):
        minv = rgb[i, :].min()
        maxv = rgb[i, :].max()
        delta = maxv - minv
        if delta == 0.0:
            # gray pixel: hue and saturation are both zero
            hsv[i, :2] = 0.0
        else:
            hsv[i, 1] = delta / maxv
            # hue depends on which channel holds the maximum
            if rgb[i, 0] == maxv:
                hsv[i, 0] = (rgb[i, 1] - rgb[i, 2]) / delta
            elif rgb[i, 1] == maxv:
                hsv[i, 0] = 2.0 + (rgb[i, 2] - rgb[i, 0]) / delta
            elif rgb[i, 2] == maxv:
                hsv[i, 0] = 4.0 + (rgb[i, 0] - rgb[i, 1]) / delta
            hsv[i, 0] /= 6.0
            # wrap hue into [0, 1)
            hsv[i, 0] -= floor(hsv[i, 0])
        hsv[i, 2] = maxv
    return hsv

It is pretty similar to the Cython case, but a bit simpler. This Pythran version was actually slightly faster than the Cython code here on small images, but ~25% slower on large ones. I am not sure what the underlying cause of the difference is.
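For anyone wanting to reproduce this kind of small-versus-large comparison, a rough timing harness could look like the sketch below. It is not the benchmark file added in this PR; `time_conversion`, the image shapes, and the placeholder workload `channel_max` are all assumptions chosen for illustration. Any conversion function taking a `(..., 3)` array can be passed in.

```python
import timeit

import numpy as np


def time_conversion(func, shapes=((32, 32, 3), (512, 512, 3)),
                    dtypes=(np.float32, np.float64), repeat=3, number=1):
    """Return best-of-`repeat` seconds per call for each shape and dtype."""
    rng = np.random.default_rng(0)
    results = {}
    for shape in shapes:
        for dtype in dtypes:
            img = rng.random(shape).astype(dtype)
            times = timeit.repeat(lambda: func(img),
                                  repeat=repeat, number=number)
            results[(shape, np.dtype(dtype).name)] = min(times) / number
    return results


def channel_max(img):
    # Placeholder workload; swap in rgb2hsv / hsv2rgb when measuring.
    return img.max(axis=-1)


timings = time_conversion(channel_max)
for (shape, dtype_name), seconds in timings.items():
    print(f"{shape} {dtype_name}: {seconds * 1e6:.1f} microseconds")
```

Taking the minimum over repeats reduces noise from other processes, but absolute numbers will still vary by machine.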

rfezzani

Thank you @grlee77 😉

skimage/color/colorconv.py

rfezzani · 2021-04-28T22:00:29Z

skimage/color/_colorconv.pyx

+        Py_ssize_t i, n, ch
+
+    n = rgb.shape[0]
+    for i in range(n):


What about prange here?

I can try it. There may not be enough computation for it to provide a benefit, though.

I tried prange and did see some improvements by a factor of 2-5 for 10 cores (20 threads) on large images. For some reason the rgb2hsv function became much slower than before if setting OMP_NUM_THREADS=1 before running the benchmarks though! That seems odd to me, but I wasn't able to quickly diagnose the source of the problem so I have left it without prange for now.

rfezzani · 2021-04-28T22:07:35Z

Pythran version loops twice over the channels looking for min and max value while in your implementation there is a unique traversal. This may explain the performance gain 😉

grlee77 · 2021-04-28T22:17:51Z

Pythran version loops twice over the channels looking for min and max value while in your implementation there is a unique traversal.

I thought that too, but tried changing it and didn't see much difference
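For reference, the single-traversal variant being discussed might look like the following plain-Python sketch. The function name `rgb2hsv_single_pass` is hypothetical and this is not the exact code that was timed; it only shows min and max being found in one pass over the three channels, and checks the result against `colorsys.rgb_to_hsv` from the standard library.

```python
import colorsys
from math import floor

import numpy as np


def rgb2hsv_single_pass(rgb):
    """Convert an (N, 3) array of RGB pixels in [0, 1] to HSV."""
    hsv = np.empty_like(rgb)
    for i in range(rgb.shape[0]):
        r, g, b = rgb[i, 0], rgb[i, 1], rgb[i, 2]
        # Track min and max together instead of calling .min() and .max().
        minv = maxv = r
        for c in (g, b):
            if c < minv:
                minv = c
            if c > maxv:
                maxv = c
        delta = maxv - minv
        if delta == 0.0:
            hsv[i, 0] = 0.0
            hsv[i, 1] = 0.0
        else:
            hsv[i, 1] = delta / maxv
            if r == maxv:
                h = (g - b) / delta
            elif g == maxv:
                h = 2.0 + (b - r) / delta
            else:
                h = 4.0 + (r - g) / delta
            h /= 6.0
            hsv[i, 0] = h - floor(h)   # wrap hue into [0, 1)
        hsv[i, 2] = maxv
    return hsv


rng = np.random.default_rng(1)
rgb_pixels = rng.random((200, 3))
hsv_pixels = rgb2hsv_single_pass(rgb_pixels)
reference = np.array([colorsys.rgb_to_hsv(*px) for px in rgb_pixels])
```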

rfezzani · 2021-04-29T07:58:43Z

skimage/color/_colorconv.pyx

+                elif rgb[i, 2] == maxv:
+                    hsv[i, 0] = 4.0 + (rgb[i, 0] - rgb[i, 1]) / delta
+                hsv[i, 0] /= 6.0
+                hsv[i, 0] -= floor(<double>hsv[i, 0])


I am not sure that this will work but what about defining a _floor function specific to np_floats?

    if np_floats is cnp.float32_t:
        _floor = floorf
    else:
        _floor = floor

it may avoid this cast to double precision when the input is single precision...

I think we would need to add a C99 flag (or language='c++') to ensure floorf is available. I think that used to be a problem for older MSVC versions, but is probably fine now.

Cython 0.29.x doesn't define floorf in its libc.math, but the current 3.0 alpha does. That doesn't really matter, though, since we can always just add it ourselves with:

    cdef extern from "<math.h>" nogil:
        float floorf(float)

skimage/color/_colorconv.pyx

Only a specific branch potentially needs to wrap negative values. DOC: document the source of the algorithm and the publication where HSV was proposed
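One way to read "only a specific branch potentially needs to wrap negative values" is sketched below in plain Python. This is an interpretation, not the contents of the commit: the two helper names are hypothetical. The reasoning is that when red is the maximum the raw hue `(g - b) / delta` lies in [-1, 1] and can be negative, while the green and blue branches give values in [1, 3] and [3, 5], so after dividing by 6 only the red branch can fall below zero.

```python
from math import floor

import numpy as np


def hue_with_floor(r, g, b):
    """Hue in [0, 1], wrapping every pixel with floor (assumes r, g, b differ)."""
    maxv, minv = max(r, g, b), min(r, g, b)
    delta = maxv - minv
    if r == maxv:
        h = (g - b) / delta
    elif g == maxv:
        h = 2.0 + (b - r) / delta
    else:
        h = 4.0 + (r - g) / delta
    h /= 6.0
    return h - floor(h)


def hue_with_branch_wrap(r, g, b):
    """Same hue, but only the red-max branch checks for a negative value."""
    maxv, minv = max(r, g, b), min(r, g, b)
    delta = maxv - minv
    if r == maxv:
        h = (g - b) / delta / 6.0
        if h < 0.0:
            h += 1.0          # the only case that can be negative
    elif g == maxv:
        h = (2.0 + (b - r) / delta) / 6.0
    else:
        h = (4.0 + (r - g) / delta) / 6.0
    return h


rng = np.random.default_rng(2)
pixels = rng.random((500, 3))
hues_floor = np.array([hue_with_floor(*px) for px in pixels])
hues_branch = np.array([hue_with_branch_wrap(*px) for px in pixels])
```

The branch version avoids a floor call (and any float-to-double cast around it) for every pixel, at the cost of one comparison in a single branch.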

jonahpearl · 2023-05-08T16:55:41Z

The skimage version is still pretty slow -- I tried just grabbing OP's suggestion but it doesn't seem any faster. Any suggested fixes? Or is the current skimage version actually a sped-up implementation, and it was even slower before? Thanks!

grlee77 added 2 commits April 28, 2021 16:50

Cython implementation for rgb2hsv and hsv2rgb

91b278d

add benchmarks for rgb2hsv and hsv2rgb

ffcc1f4

benchmark both float32 and float64

grlee77 added the performance label Apr 28, 2021

fix copy-paste error in color/setup.py

fe93ff1

rfezzani reviewed Apr 28, 2021

grlee77 added 2 commits April 28, 2021 18:33

use nogil context

b1cd11b

update output variable name in docstring

6585d9d

rfezzani reviewed Apr 29, 2021

alexdesiqueira mentioned this pull request May 3, 2021

2021's calendar of community management #5169

Closed

grlee77 added 📈 type: Performance and removed performance labels Jul 8, 2021

Merge remote-tracking branch 'upstream/main' into color_rgb_hsv_cython

7580a74

hmaarrfk reviewed Aug 15, 2021

More efficient modulus in rgb2hsv inner

0fab7ef

Only a specific branch potentially needs to wrap negative values. DOC: document the source of the algorithm and the publication where HSV was proposed

grlee77 mentioned this pull request Sep 6, 2021

Add Pythran support to build, convert two functions #3226

Merged

9 tasks
