DNN: fused depthwise and add #23096
@rogday Please take a look.
LGTM 👍
if (fusedAdd)
    out += outptr[out_j];
if (relu)
    out = out > 0.f ? out : out*relu_coeff;
outptr[out_j] = out;
outptr[out_j] += out;
Seems suspicious - we add outptr 2 times, is that correct?
Thx for reviewing. Fixed.
👍
Merge with test data: opencv/opencv_extra#1034
Fixes: #23074
In the previous optimization, we fused the `Conv` and `Add` layers. This PR further adds support for fusing `Depth-wise Conv` and `Add` layers.

Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.
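
For readers following the review thread, here is a minimal sketch of what a fused depth-wise convolution plus `Add` epilogue might look like for a single channel row. It is not the code from this PR: `depthwiseRowFused` and its parameters are hypothetical, and it assumes stride 1, no padding, and that the output buffer already holds the residual (`Add`) input when `fusedAdd` is set. The names `fusedAdd`, `relu`, `relu_coeff`, and `out_j` are borrowed from the reviewed snippet.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not OpenCV's API: 1-D depth-wise convolution over one
// channel (stride 1, no padding) with an optional fused Add and leaky ReLU.
// Assumption: when fusedAdd is true, `out` already contains the residual input
// and has at least in.size() - kernel.size() + 1 elements.
void depthwiseRowFused(const std::vector<float>& in,
                       const std::vector<float>& kernel,
                       float bias, bool fusedAdd, bool relu, float relu_coeff,
                       std::vector<float>& out)
{
    if (kernel.empty() || in.size() < kernel.size())
        return; // nothing to compute

    const std::size_t outLen = in.size() - kernel.size() + 1;
    for (std::size_t out_j = 0; out_j < outLen; ++out_j)
    {
        // Convolution accumulator for this output position.
        float acc = bias;
        for (std::size_t k = 0; k < kernel.size(); ++k)
            acc += in[out_j + k] * kernel[k];

        // Fused Add: read the residual from the output buffer exactly once.
        if (fusedAdd)
            acc += out[out_j];

        // Fused activation, applied after the Add.
        if (relu)
            acc = acc > 0.f ? acc : acc * relu_coeff;

        // Plain store. Using "+=" here would add the residual a second time.
        out[out_j] = acc;
    }
}
```

The residual is read before the store and the result overwrites it, so a single pass over the output buffer covers convolution, `Add`, and activation.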