BatchNorm channels last memory format optimization on CPU #46234
Conversation
Commits in this push:
- parallel inference contiguous path
- parallel inference channels last path
- add dim apply
- optimize update stats
- add channels last support for backward
- Revert "add channels last support for backward" (reverts commit cc5e29dce44395250f8e2abf9772f0b99f4bcf3a)
- Revert "optimize update stats" (reverts commit 7cc6540701448b9cfd5833e36c745b5015ae7643)
- Revert "add dim apply" (reverts commit b043786d8ef72dee5cf85b5818fcb25028896ecd)
- bug fix
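To illustrate why the channels-last path matters, here is a hypothetical plain-Python sketch (not the PR's actual C++ kernel) of batch-norm inference over an NHWC buffer. In NHWC, the C values of one pixel are contiguous, so the inner loop over channels is a cache-friendly stride-1 scan that a real kernel can vectorize; the function name and signature are illustrative only.

```python
# Hypothetical sketch of batch-norm inference over a flat NHWC buffer.
# In channels-last layout, all C values of one pixel are adjacent in memory,
# so the inner loop runs over contiguous elements.

def batchnorm_nhwc(x, mean, var, weight, bias, N, H, W, C, eps=1e-5):
    """x is a flat list in NHWC order; returns a flat NHWC list."""
    # Precompute per-channel scale and shift once, outside the pixel loop.
    scale = [weight[c] / (var[c] + eps) ** 0.5 for c in range(C)]
    shift = [bias[c] - mean[c] * scale[c] for c in range(C)]
    out = [0.0] * (N * H * W * C)
    for p in range(N * H * W):        # one iteration per pixel
        base = p * C
        for c in range(C):            # contiguous, vectorizable inner loop
            out[base + c] = x[base + c] * scale[c] + shift[c]
    return out
```

With zero mean, unit variance, unit weight, and zero bias (and eps=0), the transform is the identity, which makes a quick sanity check easy.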
Updated performance numbers using the PyTorch internal operator benchmark. Machine: Xeon(R) Gold 6248 CPU, 20 cores per socket, 2.5 GHz. "1C" refers to a single-core run; "20C" refers to a single-socket run. jemalloc and numactl are applied to reduce fluctuation in the test results.
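A sketch of how such numbers can be reproduced with PyTorch's built-in benchmark utilities; the shape, channel count, and iteration count below are illustrative, not the configuration used for the numbers above, and the `bench` helper is hypothetical.

```python
# Sketch: time BatchNorm2d inference in both memory formats with
# torch.utils.benchmark. Shapes and iteration counts are illustrative.
import torch
from torch.utils.benchmark import Timer

def bench(memory_format, iters=50):
    bn = torch.nn.BatchNorm2d(64).eval()
    x = torch.randn(1, 64, 56, 56).to(memory_format=memory_format)
    timer = Timer(stmt="bn(x)", globals={"bn": bn, "x": x})
    return timer.timeit(iters).mean  # mean seconds per call

with torch.no_grad():
    for fmt in (torch.contiguous_format, torch.channels_last):
        print(f"{fmt}: {bench(fmt) * 1e6:.1f} us")
```

In a real comparison one would also pin threads (e.g. via numactl, as the comment above describes) and use larger iteration counts to stabilize the measurement.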
@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Code looks good, but I see that you changed both the channels-first and channels-last implementations. Please make sure to benchmark both memory layouts.
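For context on the reviewer's request: switching a tensor between the two layouts is straightforward, so both code paths can be exercised on the same data. A minimal sketch:

```python
# Converting between memory formats: same shape and values, different strides.
import torch

x = torch.randn(2, 3, 4, 4)                   # NCHW, contiguous_format
y = x.to(memory_format=torch.channels_last)   # same logical NCHW shape, NHWC strides

assert y.is_contiguous(memory_format=torch.channels_last)
assert torch.equal(x, y)  # values are unchanged; only the layout differs
```

Because the conversion preserves values, the same inputs can be fed through both the channels-first and channels-last kernels when benchmarking.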
I will approve to keep the process moving, but I will not land this until benchmarks are provided.
On the channels-last format, inference performance is OK, but training is not good enough, since the current training implementation parallelizes on the dim of … I need to continue to improve this PR; once finished, I will stack it with the other channels-last pull requests.
Please review the new one: #48919