eager quant: remove fake_quant after add/mul nodes during QAT #49213
Conversation
Summary: Changes behavior of FX graph mode quantization to insert a fake_quant after add and mul nodes. This is useful to model numerics better, since these nodes will be quantized during inference, and adding/multiplying by a scalar can result in values in between quantization bins during training.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_quantized_add_qat
python test/test_quantization.py TestQuantizeFxOps.test_quantized_mul_qat
```

Reviewers:
Subscribers:
Tasks:
Tags:

[ghstack-poisoned]
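For context, a minimal sketch of how the inserted fake_quants can be seen with FX graph mode QAT. This assumes the torch.quantization API of the release this PR targeted (roughly 1.8); newer releases take an example_inputs argument and live under torch.ao.quantization:

```python
import torch
import torch.quantization as tq
from torch.quantization.quantize_fx import prepare_qat_fx

class M(torch.nn.Module):
    def forward(self, x):
        # add with a scalar: quantized during inference, so QAT should
        # fake-quantize its output to model the quantization bins
        return torch.add(x, 1.0)

m = M().train()  # prepare_qat_fx expects a model in training mode
qconfig_dict = {"": tq.get_default_qat_qconfig("fbgemm")}
prepared = prepare_qat_fx(m, qconfig_dict)
# With this change, the graph gains an activation_post_process
# (FakeQuantize) node after the add.
print(prepared.graph)
```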
torch/quantization/fx/quantize.py (outdated)

```python
else:
    # for QAT, we always insert a fake_quant after add/mul
    new_observer = qconfig.activation()
    insert_observer(
```
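As a side note for readers: `qconfig.activation` is a factory, so each call constructs a fresh observer/fake quantize module, which is exactly what the reviewers flag below. A minimal illustration (the qconfig choice here is just an example):

```python
import torch.quantization as tq

qconfig = tq.get_default_qat_qconfig("fbgemm")
new_observer = qconfig.activation()  # constructs a fresh FakeQuantize module
print(type(new_observer))  # torch.quantization.fake_quantize.FakeQuantize
```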
I think we'll need to use the same fake quantize module as the input here to properly simulate quantization, since the output of quantized::add with a scalar input inherits its quantization parameters from the input.
Not sure I follow: the current logic inserts observers as long as one of the inputs is a tensor. Why is this a problem for QAT?
We can match eager now and improve later. Currently eager mode is incorrect, because the fake quantize is ignored, and the QAT model would have different numerics before and after convert. The output of add_scalar/mul_scalar needs to use the same fake quantize module as the input so that the fake quantize numerics match the quantized op.
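To make that concrete, a hedged eager-style sketch (not this PR's code) of reusing the input's fake quantize module on the output, which is what matching quantized::add_scalar requires:

```python
import torch
import torch.quantization as tq

class AddScalar(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # a single FakeQuantize instance, shared by input and output
        self.fq = tq.FakeQuantize()

    def forward(self, x):
        x = self.fq(x)
        x = x + 1.0
        # reuse the input's fake quantize, rather than a fresh
        # qconfig.activation(), so the output is fake-quantized with the
        # input's scale/zero_point, matching quantized::add_scalar
        return self.fq(x)
```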
```python
        self.conv2 = torch.nn.Conv2d(1, 1, 1)

    def forward(self, x):
        x = torch.add(x, 1.0)
```
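From this hunk and the comments below, the test module plausibly has the following overall shape; this is a reconstruction for readability, and the lines not shown in the hunk are assumptions:

```python
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(1, 1, 1)
        self.conv2 = torch.nn.Conv2d(1, 1, 1)

    def forward(self, x):
        x = torch.add(x, 1.0)  # scalar add, the op under discussion
        x = self.conv1(x)
        x = torch.add(x, 1.0)  # assumed: paired with the relu below ("add-relu")
        x = torch.relu(x)
        x = self.conv2(x)
        return x
```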
Are the inputs and outputs quantized? In that case, shouldn't we have 5 fake_quants (add, conv1, add-relu, conv2, output)?
The output of add / input of conv1 is considered to be observed here, I think.
The ideal state would be for the output of add / input of conv1 to be fake quantized with the same fake quantize module as the input of the add.
Looks like this breaks some of the numeric_suite tests. We may need to keep the …
…g QAT" Summary: Changes behavior of Eager mode quantization to remove observation after `add_scalar/mul_scalar`. This is not used, and it removes one difference between Eager and FX modes. Test Plan: ``` python test/test_quantization.py TestQuantizeFxOps.test_quantized_add_qat python test/test_quantization.py TestQuantizeFxOps.test_quantized_mul_qat python test/test_quantization.py TestQuantizationAwareTraining.test_add_scalar_uses_input_qparams python test/test_quantization.py TestQuantizationAwareTraining.test_mul_scalar_uses_input_qparams ``` Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D25486276](https://our.internmc.facebook.com/intern/diff/D25486276) [ghstack-poisoned]
Thanks, the fix was to do the same change on …
…QAT" Summary: Changes behavior of Eager mode quantization to remove observation after `add_scalar/mul_scalar`. This is not used, and it removes one difference between Eager and FX modes. Test Plan: ``` python test/test_quantization.py TestQuantizeFxOps.test_quantized_add_qat python test/test_quantization.py TestQuantizeFxOps.test_quantized_mul_qat python test/test_quantization.py TestQuantizationAwareTraining.test_add_scalar_uses_input_qparams python test/test_quantization.py TestQuantizationAwareTraining.test_mul_scalar_uses_input_qparams ``` Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D25486276](https://our.internmc.facebook.com/intern/diff/D25486276) [ghstack-poisoned]
This pull request has been merged in 36b2092.
eager quant: remove fake_quant after add/mul nodes during QAT (pytorch#49213)

Summary: Pull Request resolved: pytorch#49213

Changes behavior of Eager mode quantization to remove observation after add_scalar/mul_scalar. This is not used, and it removes one difference between Eager and FX modes.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_quantized_add_qat
python test/test_quantization.py TestQuantizeFxOps.test_quantized_mul_qat
python test/test_quantization.py TestQuantizationAwareTraining.test_add_scalar_uses_input_qparams
python test/test_quantization.py TestQuantizationAwareTraining.test_mul_scalar_uses_input_qparams
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25486276

fbshipit-source-id: 34a5d6ce0d08739319ec0f8b197cfc1309d71040
Stack from ghstack:

Summary: Changes behavior of Eager mode quantization to remove observation after `add_scalar/mul_scalar`. This is not used, and it removes one difference between Eager and FX modes.

Test Plan:

Reviewers:
Subscribers:
Tasks:
Tags:

Differential Revision: D25486276
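In eager mode, scalar add/mul are quantized through torch.nn.quantized.FloatFunctional, so the landed behavior can be sketched as follows (a hedged sketch; the key point is that add_scalar no longer observes its output):

```python
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.ff = torch.nn.quantized.FloatFunctional()

    def forward(self, x):
        # After this PR, add_scalar does not run activation_post_process:
        # the quantized op's output inherits the input's qparams, so
        # observing or fake-quantizing the output is unnecessary.
        return self.ff.add_scalar(x, 1.0)
```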