DNN: optimize the speed of general Depth-wise #23952
@WanliZhong, can you check if this patch fixes your issue?
I have tested this patch on ARM and x86 (using the min value). The palm detection model has many 5x5 depth-wise layers. The handpose and person detection models have several 5x5 depth-wise layers. So the effect is most evident on the palm detection model.
Intel chip:
M2 chip:
@asmorkalov, I believe 4.8.1 should be released as soon as possible with this and a few other fixes that you mentioned (Python-related).
Related performance tests should be updated to avoid similar incidents in the future.
@wanli will be assigned to add such a performance test.
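For readers new to the terms used in this thread: a depth-wise convolution filters each input channel with its own kernel, and "use the min value" refers to reporting the fastest of several timed runs. The following is a minimal sketch in plain Python under those assumptions. It is not the OpenCV implementation or its perf-test harness, and the names `depthwise_conv2d` and `min_time_ms` are hypothetical.

```python
import time


def depthwise_conv2d(inp, kernels):
    """Naive depth-wise convolution (stride 1, no padding).

    inp:     list of C channels, each an H x W list of lists
    kernels: list of C kernels, each K x K (one kernel per channel)
    """
    out = []
    for channel, kernel in zip(inp, kernels):
        k = len(kernel)
        h, w = len(channel), len(channel[0])
        # Each channel is filtered only by its own kernel; channels never mix.
        out.append([
            [
                sum(channel[y + i][x + j] * kernel[i][j]
                    for i in range(k) for j in range(k))
                for x in range(w - k + 1)
            ]
            for y in range(h - k + 1)
        ])
    return out


def min_time_ms(fn, runs=5):
    """Return the fastest wall-clock time of `runs` calls, in milliseconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return best


if __name__ == "__main__":
    # Illustrative sizes only: 4 channels of 32x32 with 5x5 kernels.
    channels, size, k = 4, 32, 5
    inp = [[[1.0] * size for _ in range(size)] for _ in range(channels)]
    kernels = [[[1.0] * k for _ in range(k)] for _ in range(channels)]
    elapsed = min_time_ms(lambda: depthwise_conv2d(inp, kernels))
    print("min time (ms):", elapsed)
```

Taking the minimum rather than the mean reduces the influence of scheduler noise and cache warm-up on the reported number.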
Perf test results (using the min value) on MacBook Air M2:
Looks like the default CI has a failure unrelated to this PR.
Perf results for i5-2500K CPU @ 3.30GHz (no AVX2):
Perf results for ARM v7 (Jetson TK1):
👍
DNN: optimize the speed of general Depth-wise #23952

Try to solve the issue: #23941

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake