New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add StackBlur for imgproc #20379
Add StackBlur for imgproc #20379
Conversation
modules/imgproc/src/stackblur.cpp
Outdated
#include "precomp.hpp" | ||
#include "opencv2/core/hal/intrin.hpp" | ||
|
||
static unsigned short const stackblur_mul[255] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess, you based your implementation on some other code. What's the license of the original implementation? At least, you should put the link/reference to the original code, copyright and information about the license (or the license itself, if it's short)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zihaomu, thank you for the contribution! |
@zihaomu Friendly reminder about tests. |
Thanks. I will keep working on it. |
@vpisarev your turn. |
Hi @vpisarev, CI test still fails. I don't know if there is any error in the use of the macro. Should I change the macro flag from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for contribution!
@alalek Thank you for your code review! And sorry for the late reply. I will update the code and upload it next week. |
cbd343c
to
6ff95be
Compare
Hi @vpisarev, @alalek and @asmorkalov, sorry for the late update. Please take a look at this patch, it has passed the CI test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many functions of the implementation are empty for the case, if CV_SIMD is not defined. It's the case for ARMv6, ARMv7 without NEON, regular MIPS, most of current RISC-V. Please implement the branch with generic C++ code.
Hi @asmorkalov, thanks for the code reviewing. I have updated the code to be compatible with no SIMD version. |
Also It makes sense to add performance test for the new function as soon as you bring vectorized version. Performance report (against CV_DISABLE_OPTIMIZATION) will be very useful. |
Performance on desktop with simd vs without simd code:
|
Hi @asmorkalov. Thanks for the performance test. This test result is reasonable. |
It looks like tests for small resolution are very fast and performance statistics is unstable. That's why some perf tests fail on CI. I propose to exclude VGA resolution and some other too fast cases from performance testing. |
Update, fixed the performance test. Hi @asmorkalov, can you help me to set the right test case? I have no idea why performance tests fail. |
9b500f0
to
4bb39ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Please squash commits and I'll merge the patch.
@asmorkalov It does not appear that |
Yes. We have a lot of merge conflicts between 4.x and 5.x regular merge procedure bas not been established yet. |
/** @brief Blurs an image using the StackBlur. | ||
The function applies and StackBlur to an image. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation formatting is broken.
New empty line is required after @brief
.
Or @details
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be fixed at new patch.
Stack Blur Algorithm by Mario Klingemann <mario@quasimondo.com> | ||
@param src input image. The number of channels can be arbitrary, but the depth should be one of | ||
CV_8U, CV_16U, CV_16S or CV_32F. | ||
@param dst output image of the same size and type as src. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same size
Which is border handling method used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for code reviewing. @alalek. For now, we only use BORDER_REPLICATE type as border type, since the original paper is it. More details of paper can be found at http://underdestruction.com/2004/02/25/stackblur-2004/.
Should we support other border types? Or just describe this detail in a new patch?
ParallelStackBlurRow& operator=(const ParallelStackBlurRow &) { return *this; } | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed
CV_Error_( CV_StsNotImplemented, | ||
("Unsupported input format in StackBlur, the supported formats are: CV_8U, CV_16U, CV_16S and CV_32F.")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CV_Error_
use cv::format internally.
It is not needed here, so common CV_Error
should be used instead.
} | ||
|
||
int sp1 = sp + 1; | ||
sp1 &= -(sp1 < stackLenW); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a test code. So we don't need any kind of "optimization" obfuscation here.
if (stackStart >= kernelSize) stackStart -= kernelSize; | ||
|
||
int sp1 = sp + 1; | ||
sp1 &= -(sp1 < kernelSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure that compilers optimize this properly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The no optimized code is following:
int sp1 = sp + 1;
if (sp1 >= kernelSize)
sp1 = 0;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and we should not obfuscate code for compilers.
More operations may block optimizations (because we can't pass assumptions into optimization transformations).
See here: https://godbolt.org/z/G3jdx5v8a
parallel_for_(Range(0, widthElem), ParallelStackBlurColumn<float, float>(dst, dst, radiusH), numOfThreads); | ||
} | ||
else | ||
CV_Error_( CV_StsNotImplemented, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CV_StsNotImplemented
We should not use enums from C API anymore.
|
||
dstPtr += widthElem; | ||
++sp; | ||
if (sp >= kernelSize) sp = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if
and its body
should be placed on separate lines. This is necessary for correct "code coverage" reports.
Add New Blur Filter: StackBlur
Hi, the basic idea is blurring input images through one-dimension blur in the horizontal and the vertical respectively. StackBlur can blur the image in linear time. In theory, stackblur should run at the same speed as BoxBlur, and stackblur can eliminate the boxed-feeling of Boxblur in a large kernel size, and achieve the same effect as Gaussian blur.
Stack Blur was invented by Mario Klingemann mario@quasimondo.com.
Details description can be found at: http://underdestruction.com/2004/02/25/stackblur-2004
HW/SW:
Apple M1 chip / MacOS
Speed Test
The testing image size: [1920 x 2048], Input format: CV_8UC3.
Test the rule, run 1000 times to pick the value with the lowest speed.
The source log of the image and some test script can be found this google drive.
More performance results:
Here are some experimental results for CV_8U input:
Here are some experimental results for CV_32F input:
Update 2022.09.21
Stackblur
is slower thanGaussianBlur
when the kernel is small and the format of the input isuchar
. In other cases, it is faster than Gaussian blur, and the larger the kernel size, the greater the speedup ratio.Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.