fx quant: do not insert observers at quantized inputs #49239
Conversation
Summary:

Context: the existing implementation of `quantized_input_idxs` is convert-only, so observers are inserted between the input and the first quantized node. This is a problem during QAT, because the initial input observer is a fake_quant that starts with scale=1 and zero_point=0. This does not match the quantization parameters of the already-quantized graph input, which can lead to incorrect numerics.

Fix: do not insert an observer for a quantized input.

Test Plan:

```
python test/test_quantization.py TestQuantizeFx
```
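To illustrate the numerics problem, here is a minimal pure-Python sketch of affine fake-quantization (the `fake_quant` helper and the example scale/zero_point values are hypothetical, not from the PR). A freshly initialized fake_quant with scale=1, zero_point=0 distorts an input that was quantized upstream with different parameters, while re-quantizing with the matching parameters is a no-op:

```python
def fake_quant(x, scale, zero_point, qmin=0, qmax=255):
    """Affine fake-quantize: round onto the quantized grid, clamp, dequantize."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))          # clamp to the quantized range
    return (q - zero_point) * scale

# Input already quantized upstream with (assumed) scale=0.05, zero_point=128.
scale, zp = 0.05, 128
x = fake_quant(-1.23, scale, zp)         # lands on the quantized grid: -1.25

# Re-observing with a freshly initialized fake_quant (scale=1, zero_point=0):
bad = fake_quant(x, 1.0, 0)              # negative values clamp to 0 -> 0.0
good = fake_quant(x, scale, zp)          # matching params round-trip exactly

print(bad, good)
```

With mismatched parameters the negative input collapses to 0.0, exactly the kind of incorrect numerics the fix avoids by not inserting an observer at a quantized input.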
LG
Codecov Report

```
@@             Coverage Diff             @@
##   gh/vkuzo/187/base   #49239   +/-   ##
==========================================
  Coverage      80.62%    80.62%
==========================================
  Files           1875      1875
  Lines         202790    202797      +7
==========================================
+ Hits          163501    163510      +9
+ Misses         39289     39287      -2
```
This pull request has been merged in 7542076.
Summary: Pull Request resolved: pytorch#49239

Context: the existing implementation of `quantized_input_idxs` is convert-only. Therefore, observers are inserted between the input and the first quantized node. This is a problem during QAT, because the initial input is a fake_quant, and it starts with scale=1 and zp=0. This does not match the quantization parameters of the graph input, which can lead to incorrect numerics.

Fix: do not insert an observer for a quantized input.

Test Plan:

```
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25499486

fbshipit-source-id: 303b49cc9d95a9fd06fef3b0859c08be34e19d8a
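For context, a caller declares which graph inputs arrive pre-quantized through the prepare-time custom config, and this change makes that declaration take effect at prepare time (including QAT prepare) rather than only at convert. A minimal sketch of such a config; the exact key spelling is an assumption (the PR text calls the feature `quantized_input_idxs`, and some torch versions spell the config key "input_quantized_idxs"), so check your torch version:

```python
# Hypothetical prepare-time custom config (key name assumed, not
# confirmed by this PR's text -- consult your torch version):
prepare_custom_config_dict = {
    # graph input 0 arrives already quantized, so with this fix no
    # observer/fake_quant is inserted between it and the first
    # quantized node during prepare
    "input_quantized_idxs": [0],
}
```

This dict would be passed alongside the qconfig when preparing the model with FX graph mode quantization.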