Richardson-Lucy deconvolution: allow single-precision computation #4880

grlee77 · 2020-08-02T07:11:03Z

Description

This PR improves the performance of deconvolution by keeping the computations in single-precision when the input image is single-precision.

It also avoids unnecessary copies of the input arrays by adding copy=False to existing .astype() calls.
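
A minimal sketch of the pattern described above, using only NumPy. The helper name `_as_float` is hypothetical and is not taken from the PR diff. It only illustrates that `astype(..., copy=False)` hands back the input array when the dtype already matches, and that float32 input can stay float32 while other dtypes are promoted to float64.

```python
import numpy as np


def _as_float(image):
    """Hypothetical helper: keep float32 as float32, promote the rest to float64."""
    float_dtype = np.float32 if image.dtype == np.float32 else np.float64
    # copy=False avoids a new allocation when the dtype already matches.
    return image.astype(float_dtype, copy=False)


img32 = np.zeros((4, 4), dtype=np.float32)
img64 = np.zeros((4, 4), dtype=np.float64)
img_u8 = np.zeros((4, 4), dtype=np.uint8)

out32 = _as_float(img32)    # same object as img32, no copy made
out64 = _as_float(img64)    # same object as img64, no copy made
out_u8 = _as_float(img_u8)  # dtype differs, so a float64 copy is created
```

Without `copy=False`, `astype` always allocates a new array, even when no conversion is needed.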

This can give up to a factor of 2 performance improvement when calling Richardson-Lucy, either single-threaded or in various multi-threaded scenarios. See the timings for double-precision in #4083 (comment) vs. single-precision in #4083 (comment).

If this seems reasonable to others, I can also adapt the example from the comments into an ASV benchmark.

Checklist

Docstrings for all functions
Gallery example in ./doc/examples (new features only)
Benchmark in ./benchmarks, if your changes aren't covered by an
existing benchmark
Unit tests
Clean style in the spirit of PEP8

For reviewers

Check that the PR title is short, concise, and will make sense 1 year
later.
Check that new functions are imported in corresponding __init__.py.
Check that new features, API changes, and deprecations are mentioned in
doc/release/release_dev.rst.

emmanuelle · 2020-08-02T12:44:17Z

Thanks for the PR! An ASV benchmark would be great if you find the time :-).

grlee77 · 2020-08-05T23:23:04Z

I added some simple ASV benchmarks for richardson_lucy. There is also a separate commit that fixes the existing restoration benchmarks, which were failing to run due to a misnamed variable!

As expected, float32 operation is now faster. This PR also reduced peak memory use a bit in the float64 case by avoiding internal copies.

Result of benchmarks for release 0.17.2

[  0.00%] ·· Benchmarking existing-py_home_lee8rx_miniconda3_envs_pyir_bin_python3.7
[ 40.00%] ··· Running (benchmark_restoration.DeconvolutionSuite.time_denoise_nl_means_f32--)..
[ 60.00%] ··· benchmark_restoration.DeconvolutionSuite.peakmem_denoise_nl_means_f32                       293M
[ 70.00%] ··· benchmark_restoration.DeconvolutionSuite.peakmem_richardson_lucy_f64                        293M
[ 80.00%] ··· benchmark_restoration.DeconvolutionSuite.peakmem_setup                                      124M
[ 90.00%] ··· benchmark_restoration.DeconvolutionSuite.time_denoise_nl_means_f32                    2.32±0.01s
[100.00%] ··· benchmark_restoration.DeconvolutionSuite.time_richardson_lucy_f64                     2.33±0.02s

Result of benchmarks for this PR

[  0.00%] ·· Benchmarking existing-py_home_lee8rx_miniconda3_envs_pyir_bin_python3.7
[ 40.00%] ··· Running (benchmark_restoration.DeconvolutionSuite.time_denoise_nl_means_f32--)..
[ 60.00%] ··· benchmark_restoration.DeconvolutionSuite.peakmem_richardson_lucy_f32                        200M
[ 70.00%] ··· benchmark_restoration.DeconvolutionSuite.peakmem_richardson_lucy_f64                        276M
[ 80.00%] ··· benchmark_restoration.DeconvolutionSuite.peakmem_setup                                      124M
[ 90.00%] ··· benchmark_restoration.DeconvolutionSuite.time_richardson_lucy_f32                     1.64±0.01s
[100.00%] ··· benchmark_restoration.DeconvolutionSuite.time_richardson_lucy_f64                     2.30±0.01s
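
The ratios implied by the numbers above can be worked out with a few lines of arithmetic. This is only a reading of the reported figures, not an additional measurement. Subtracting `peakmem_setup` to get a "net" figure assumes that benchmark reflects the memory held by the suite's setup alone.

```python
# Figures copied from the "this PR" benchmark output above.
time_f32, time_f64 = 1.64, 2.30                 # seconds
peak_f32, peak_f64, peak_setup = 200, 276, 124  # as reported by asv ("M")
peak_f64_release = 293                          # float64 peak on release 0.17.2

speedup = time_f64 / time_f32
net_f32 = peak_f32 - peak_setup  # memory above the setup baseline (assumption)
net_f64 = peak_f64 - peak_setup

print(f"float32 speedup over float64: {speedup:.2f}x")
print(f"net peak memory, float64 vs float32: {net_f64}M vs {net_f32}M")
print(f"float64 peak saved vs 0.17.2: {peak_f64_release - peak_f64}M")
```

On these figures the float32 path is roughly 1.4x faster than float64, and its peak memory above the setup baseline is half that of float64.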

grlee77 · 2020-08-05T23:25:32Z

For the benchmarks, I ended up making a new DeconvolutionSuite class, since its setup method differs from the one in the existing RestorationSuite class (it needs to store a PSF and convolve it with the input image).

grlee77 · 2020-08-06T01:07:38Z

I think the doc failure may be because I haven't rebased on master since that PR was merged?

emmanuelle

Thanks @grlee77 !

jni · 2020-08-12T10:15:25Z

@grlee77 why is time RL f32 not shown in your benchmarks result?

grlee77 · 2020-08-14T18:12:12Z

> @grlee77 why is time RL f32 not shown in your benchmarks result?

It is there, but was misnamed as "time_denoise_nl_means_f32" due to a copy-paste error that was later fixed in bf38f4c.

I will edit the comment above to reflect the correct naming.

jni · 2020-08-16T02:49:22Z

@grlee77 this is then ready after that fix and a rebase/merge on master!

alexdesiqueira · 2020-08-17T21:34:11Z

Thank you @grlee77!

grlee77 added the 📈 type: Performance label Aug 2, 2020

emmanuelle reviewed Aug 2, 2020

grlee77 force-pushed the richardson_lucy_preserve_single branch from 073ec41 to c1c2c1c Compare August 6, 2020 01:11

emmanuelle approved these changes Aug 6, 2020

sciunto mentioned this pull request Aug 10, 2020

2020's calendar of community management #4486

Closed

jni approved these changes Aug 16, 2020

grlee77 added 6 commits August 17, 2020 01:24

Preserve floating point precision in Richardson-Lucy Deconvolution

2ce0568

If the input image is single precision, all computations and the output will be in single precision. Internal copies are avoided when possible.

mention richardson_lucy dtype change in the release notes

3a3a485

Fix richardson_lucy docstring example

e2581c9

rgb2gray is not needed for the grayscale camera image. img_as_float used to rescale to [0, 1] so clip=False is not needed in the richardson_lucy call.

simplify test parametrization

9f92d39

fix indentation in existing restoration benchmarks

2a6250a

add benchmarks for richardson_lucy

97f0546
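
To illustrate the behaviour described in the first commit above (single-precision input gives single-precision computation and output), here is a toy 1-D Richardson-Lucy loop written with NumPy only. It is a sketch under stated assumptions, not the scikit-image implementation: the function name `richardson_lucy_1d`, the starting estimate of 0.5, and the small `eps` guard are all illustrative choices.

```python
import numpy as np


def richardson_lucy_1d(signal, psf, num_iter=10):
    """Toy 1-D Richardson-Lucy (hypothetical, not skimage's code)."""
    # Keep float32 input in float32; promote everything else to float64.
    float_dtype = np.float32 if signal.dtype == np.float32 else np.float64
    signal = signal.astype(float_dtype, copy=False)
    psf = psf.astype(float_dtype, copy=False)

    estimate = np.full(signal.shape, 0.5, dtype=float_dtype)
    psf_mirror = psf[::-1]
    eps = 1e-12  # guards against division by zero (illustrative value)

    for _ in range(num_iter):
        blurred = np.convolve(estimate, psf, mode="same")
        ratio = signal / (blurred + eps)
        estimate *= np.convolve(ratio, psf_mirror, mode="same")
    return estimate


psf = np.array([0.25, 0.5, 0.25])
truth = np.array([0, 0, 1, 0, 0, 0, 1, 0], dtype=float)
observed = np.convolve(truth, psf, mode="same")

restored32 = richardson_lucy_1d(observed.astype(np.float32), psf)
restored64 = richardson_lucy_1d(observed, psf)
print(restored32.dtype, restored64.dtype)
```

Every intermediate array in the loop inherits the chosen dtype, so the float32 call never allocates a float64 temporary.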

grlee77 force-pushed the richardson_lucy_preserve_single branch from bf38f4c to 97f0546 Compare August 17, 2020 05:29

alexdesiqueira merged commit 8202192 into scikit-image:master Aug 17, 2020

grlee77 deleted the richardson_lucy_preserve_single branch August 19, 2020 02:37

grlee77 mentioned this pull request Jan 28, 2021

More consistent support for single precision computation across skimage #5205

Closed

grlee77 mentioned this pull request Sep 7, 2021

[FEA] Update API for consistency with upcoming scikit-image 0.19 rapidsai/cucim#98

Closed
