[quant][qat] Ensure observer respects device affinity #47514
Conversation
Summary: Previously the scale and zero_point were returned on the CPU even if the input tensor was on the GPU. This is because `copy_()` doesn't respect the device when copying over the tensor. Also fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id) in the calculate_qparams function. Test Plan: python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
Summary: Previously the scale and zero_point were returned on the CPU even if the input tensor was on the GPU. This is because `copy_()` doesn't respect the device when copying over the tensor. Also fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id) in the calculate_qparams function. Test Plan: python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 15d3d50 Pull Request resolved: #47514
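A minimal sketch of the `copy_()` behavior behind the bug (the tensor names here are illustrative, not the observer's actual buffers): `copy_()` fills the destination tensor in place, so the result keeps the destination's original device even when the source lives on the GPU.

```python
import torch

# Illustrative only: copy_() keeps the destination's device, so a
# CPU-allocated buffer stays on the CPU even when the new value
# (computed from a GPU input) lives on the GPU.
scale = torch.ones(1)  # allocated on the CPU
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
new_scale = torch.tensor([0.5], device=device)

scale.copy_(new_scale)   # values are copied, device is unchanged
print(scale.device)      # cpu, regardless of new_scale.device

# Moving the buffer explicitly is what respects the input's device:
scale = scale.to(new_scale.device)
print(scale.device)      # now matches new_scale.device
```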
💊 CI failures summary and remediations
As of commit be9df93 (more details on the Dr. CI page):
ci.pytorch.org: 1 failed
codecov.io: 1 failed
QAT Benchmark on MobilenetV2
After change
wow, interesting, great catch
is the …
Hmm, it might. I think I can use …
Update: it would appear that using `x.cuda` isn't scriptable and seems to map to the wrong function. I can instead do the …
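A rough sketch of the `.to(device)` pattern being discussed, under the assumption that the observer derives the target device from its `min_val`/`max_val` statistics (the body is paraphrased for illustration, not the actual `calculate_qparams` implementation):

```python
import torch

def calculate_qparams_sketch(min_val: torch.Tensor, max_val: torch.Tensor):
    # Derive the target device from the observed statistics instead of
    # hard-coding 'cuda'; tensor.to(device) preserves the device id and is
    # TorchScript-friendly, unlike calling .cuda() on the buffers.
    device = min_val.device
    scale = torch.ones(min_val.shape, dtype=torch.float32, device=device)
    zero_point = torch.zeros(min_val.shape, dtype=torch.int64, device=device)
    # ... fill scale / zero_point from min_val and max_val ...
    return scale, zero_point
```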
that would make sense to me, and the performance metrics don't show a slowdown, which is good. Doesn't have to block this bugfix, but I think in the future it might make sense to rethink how we do device management in observers. Ideally we can enforce strong assumptions that all observer buffers are on the correct device after initialization, and any …
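One hypothetical way to make that assumption checkable (the helper name and usage are made up for illustration; this is not part of the PR):

```python
import torch

def check_buffer_devices(observer: torch.nn.Module, expected: torch.device) -> None:
    # Hypothetical helper: assert that every registered buffer of an observer
    # (e.g. min_val, max_val, eps) lives on the expected device, so later
    # .to(device) calls inside forward/calculate_qparams become no-ops.
    for name, buf in observer.named_buffers():
        assert buf.device == expected, (
            f"buffer {name} is on {buf.device}, expected {expected}"
        )
```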
Summary: Previously the scale and zero_point were returned on the CPU even if the input tensor was on the GPU. This is because `copy_()` doesn't respect the device when copying over the tensor. Also fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id) in the calculate_qparams function. Test Plan: python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D24800495](https://our.internmc.facebook.com/intern/diff/D24800495) [ghstack-poisoned]
Summary: Previously the scale and zero_point were returned on the CPU even if the input tensor was on the GPU. This is because `copy_()` doesn't respect the device when copying over the tensor. Also fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id) in the calculate_qparams function. Test Plan: python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 2cbd6f3 Pull Request resolved: #47514
Summary: Previously the scale and zero_point were returned on the CPU even if the input tensor was on the GPU. This is because `copy_()` doesn't respect the device when copying over the tensor. Also fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id) in the calculate_qparams function. Test Plan: python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D24800495](https://our.internmc.facebook.com/intern/diff/D24800495) [ghstack-poisoned]
Summary: fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id) in the calculate_qparams function Fixes #46533 Test Plan: python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 90eb81d Pull Request resolved: #47514
lg, the test failure in `QuantizePerChannel4d` looks potentially related
Summary: fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id) in the calculate_qparams function Fixes #46533 Test Plan: python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D24800495](https://our.internmc.facebook.com/intern/diff/D24800495) [ghstack-poisoned]
Summary: fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id) in the calculate_qparams function Fixes #46533 Test Plan: python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: a3dc910 Pull Request resolved: #47514
Codecov Report
@@ Coverage Diff @@
## gh/supriyar/206/base #47514 +/- ##
=====================================================
Coverage 80.78% 80.79%
=====================================================
Files 1809 1809
Lines 190036 190036
=====================================================
+ Hits 153526 153535 +9
+ Misses 36510 36501 -9
This pull request has been merged in 6bb18b2.
Stack from ghstack:
Summary:
fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id)
in the calculate_qparams function
Fixes #46533
Test Plan:
python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D24800495
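For reference, a hedged sketch of the kind of check the test plan above describes; the real test lives in test/test_quantization.py, and the observer class and exact assertions here are assumptions, not the actual test code:

```python
import torch
from torch.quantization import MinMaxObserver  # assumed observer class

def check_qparams_device_affinity():
    if not torch.cuda.is_available():
        return  # the check only makes sense with a GPU present
    device = torch.device('cuda:0')
    obs = MinMaxObserver().to(device)
    obs(torch.randn(4, 4, device=device))        # observe a GPU tensor
    scale, zero_point = obs.calculate_qparams()
    # After the fix, the qparams should follow the observer/input device.
    assert scale.device == device
    assert zero_point.device == device
```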