quant: switch observers to use min_max #42957
Summary:
Switches observers to use the new min_max function to calculate min and max at the same time. We see around 45-50% speedup on representative input shapes on the microbenchmarks.

Test Plan:
CI for correctness

performance:
```
cd benchmarks/operator_benchmark
# repeat (before diff, after diff) x (cpu, cuda)
python -m pt.qobserver_test --tag_filter all --device cpu
```

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
💊 CI failures summary and remediations

As of commit 6a60cbc (more details on the Dr. CI page):
Extra GitHub checks: 1 failed
ci.pytorch.org: 1 failed

This comment was automatically generated by Dr. CI. This comment has been revised 31 times.
Summary:
Switches observers to use the new min_max function to calculate min and max at the same time. We see around 45-50% speedup on representative input shapes on the microbenchmarks for all observers except `HistogramObserver`.

Test Plan:
CI for correctness

performance:
```
cd benchmarks/operator_benchmark
# repeat (before diff, after diff) x (cpu, cuda)
python -m pt.qobserver_test --tag_filter all --device cpu

# before, cpu: https://our.intern.facebook.com/intern/paste/P138633280/
# before, cuda: https://our.intern.facebook.com/intern/paste/P138639473/
# after, cpu: https://our.intern.facebook.com/intern/paste/P138635458/
# after, cuda: https://our.intern.facebook.com/intern/paste/P138636344/
```

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D23093995](https://our.internmc.facebook.com/intern/diff/D23093995)

[ghstack-poisoned]
Codecov Report
Coverage diff:

| | gh/vkuzo/123/base | #42957 | +/- |
|---|---|---|---|
| Coverage | 69.32% | 69.35% | +0.03% |
| Files | 381 | 381 | |
| Lines | 47190 | 47323 | +133 |
| Hits | 32713 | 32822 | +109 |
| Misses | 14477 | 14501 | +24 |
This pull request has been merged in fd8e206.
Stack from ghstack:
Summary:
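Switches observers to use the new min_max function to calculate
min and max at the same time. We see around 45-50% speedup on
representative input shapes on the microbenchmarks for all observers except
`HistogramObserver`.

Test Plan:
CI for correctness
performance: run `python -m pt.qobserver_test --tag_filter all --device cpu` from `benchmarks/operator_benchmark`, repeated for (before diff, after diff) x (cpu, cuda)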
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D23093995
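
For readers unfamiliar with the idea behind the change, here is a minimal pure-Python sketch of computing min and max in one traversal and feeding both into a running range. It is an illustration only, not the code in this PR: `min_max` and `RunningMinMaxObserver` below are hypothetical names, and the real observers operate on tensors through a fused operator rather than a Python loop.

```python
from typing import Iterable, Optional, Tuple


def min_max(values: Iterable[float]) -> Tuple[float, float]:
    """Return (min, max) of `values` in a single pass (hypothetical helper)."""
    it = iter(values)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError("min_max() requires at least one value") from None
    for v in it:
        if v < lo:
            lo = v
        elif v > hi:  # v cannot exceed hi if it was below lo, since lo <= hi
            hi = v
    return lo, hi


class RunningMinMaxObserver:
    """Toy stand-in for an observer that tracks the range of all inputs seen."""

    def __init__(self) -> None:
        self.min_val: Optional[float] = None
        self.max_val: Optional[float] = None

    def observe(self, values: Iterable[float]) -> None:
        # One traversal of the input instead of a min pass followed by a max pass.
        lo, hi = min_max(values)
        self.min_val = lo if self.min_val is None else min(self.min_val, lo)
        self.max_val = hi if self.max_val is None else max(self.max_val, hi)
```

The sketch only shows the data flow (one read of the input yields both statistics). It says nothing about the measured speedup, which comes from the benchmarks listed in the test plan.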