Richardson-Lucy deconvolution: allow single-precision computation #4880
Thanks for the PR! An ASV benchmark would be great if you find the time :-).
I added some simple ASV benchmarks. As expected, float32 operation is now faster. This PR also reduces peak memory use a bit in the float64 case by avoiding internal copies.
Result of benchmarks for release 0.17.2
Result of benchmarks for this PR
For the benchmarks, I ended up making a new
I think the doc failure might be because I haven't rebased on master since that PR was merged?
Thanks @grlee77 !
@grlee77 why is time RL f32 not shown in your benchmarks result?
@grlee77 this is then ready after that fix and a rebase/merge on master!
If the input image is single precision, all computations and the output will be in single precision. Internal copies are avoided when possible.
rgb2gray is not needed for the grayscale camera image. img_as_float used to rescale to [0, 1] so clip=False is not needed in the richardson_lucy call.
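A rough way to check the behaviour described in these two commit messages, assuming a scikit-image release that includes this PR. The synthetic image and the 5x5 box PSF are stand-ins chosen for illustration, not the data used in the gallery example:

```python
import numpy as np
from scipy.signal import convolve2d
from skimage import restoration

rng = np.random.default_rng(0)

# Synthetic grayscale image already scaled to [0, 1], stored in single precision.
image = rng.random((64, 64)).astype(np.float32)

# Illustrative 5x5 box-blur PSF (an assumption, not taken from the PR).
psf = np.ones((5, 5), dtype=np.float32) / 25

# Blur the image; astype(copy=False) only converts if the result is not float32.
blurred = convolve2d(image, psf, mode="same").astype(np.float32, copy=False)

# The iteration count is passed positionally because its keyword name
# differs between scikit-image releases.
deconvolved = restoration.richardson_lucy(blurred, psf, 10)

print(blurred.dtype, deconvolved.dtype)
```

Because the input is already in [0, 1], the default clipping is left on here, in line with the second commit message.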
Thank you @grlee77!
Description
This PR improves the performance of deconvolution by keeping the computations in single-precision when the input image is single-precision.
It also avoids unnecessary copies of the input arrays by adding `copy=False` to existing `.astype()` calls.
This can give up to a factor of 2 performance improvement when calling Richardson-Lucy either single-threaded or in various multi-threaded scenarios. See the timings for double-precision in #4083 (comment) vs. single-precision in #4083 (comment).
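For readers unfamiliar with the `copy=False` idiom, here is a minimal NumPy-only sketch. The helper `to_working_float` is hypothetical and is not the code in this PR; it only illustrates that `astype(..., copy=False)` hands back the input array when the dtype already matches:

```python
import numpy as np


def to_working_float(image):
    """Hypothetical helper: keep float32 input as float32, otherwise use float64."""
    float_type = np.float32 if image.dtype == np.float32 else np.float64
    # With copy=False, NumPy returns the input array itself when no
    # conversion is required, so no extra memory is allocated.
    return image.astype(float_type, copy=False)


img32 = np.zeros((4, 4), dtype=np.float32)
img64 = np.zeros((4, 4), dtype=np.float64)
img8 = np.zeros((4, 4), dtype=np.uint8)

same32 = np.shares_memory(to_working_float(img32), img32)  # no copy made
same64 = np.shares_memory(to_working_float(img64), img64)  # no copy made
converted = to_working_float(img8)  # integer input still needs a conversion
```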
If this seems reasonable to others, I can also adapt the example from the comments into an ASV benchmark.
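One possible shape for such a benchmark, as a sketch only. The class name, image size, PSF, and iteration count are all assumptions made for illustration and are not taken from the benchmark that was later added:

```python
import numpy as np
from skimage import restoration


class RichardsonLucySuite:
    """Hypothetical ASV suite; every name and size here is illustrative."""

    # ASV runs each benchmark once per parameter value.
    params = [np.float32, np.float64]
    param_names = ["dtype"]

    def setup(self, dtype):
        rng = np.random.default_rng(0)
        self.image = rng.random((128, 128)).astype(dtype)
        self.psf = np.ones((5, 5), dtype=dtype) / 25

    def time_richardson_lucy(self, dtype):
        # Iteration count passed positionally; its keyword name
        # differs between scikit-image releases.
        restoration.richardson_lucy(self.image, self.psf, 5)

    def peakmem_richardson_lucy(self, dtype):
        restoration.richardson_lucy(self.image, self.psf, 5)
```

The `time_` and `peakmem_` prefixes are how ASV distinguishes timing from peak-memory benchmarks, which covers both the speed and the memory claims made above.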
Checklist
./doc/examples (new features only)
./benchmarks, if your changes aren't covered by an existing benchmark