
fx quant: do not insert observers at quantized inputs #49239

Closed
wants to merge 4 commits

Conversation

@vkuzo (Contributor) commented Dec 11, 2020

Stack from ghstack:

Summary:

Context: the existing implementation of `quantized_input_idxs` only takes effect at convert time.
During prepare, observers are therefore still inserted between a quantized input and the first
quantized node. This is a problem during QAT, because the observer inserted at the input is a
fake_quant that starts with scale=1 and zero_point=0, which does not match the quantization
parameters of the already-quantized graph input and can lead to incorrect numerics.

Fix: do not insert an observer for a quantized input.

Test Plan:

```
python test/test_quantization.py TestQuantizeFx
```

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D25499486](https://our.internmc.facebook.com/intern/diff/D25499486)
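For context, a minimal sketch of declaring a graph input as already quantized when preparing a model for QAT with FX graph mode quantization. The toy module `M` is illustrative only, and the `input_quantized_idxs` key of `prepare_custom_config_dict` is assumed to be the spelling of the option referred to above as `quantized_input_idxs`:

```python
import torch
import torch.nn as nn
from torch.quantization import get_default_qat_qconfig
from torch.quantization.quantize_fx import prepare_qat_fx

# Hypothetical module whose input arrives already quantized from an upstream stage.
class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, 1)

    def forward(self, x):
        return self.conv(x)

model = M().train()
qconfig_dict = {"": get_default_qat_qconfig("fbgemm")}
# Declare that graph input 0 is already quantized. Before this fix, prepare would
# still insert a fake_quant (scale=1, zero_point=0) at this input during QAT;
# with the fix, no observer is inserted for it.
prepare_custom_config_dict = {"input_quantized_idxs": [0]}
prepared = prepare_qat_fx(
    model, qconfig_dict, prepare_custom_config_dict=prepare_custom_config_dict
)
```

Under these assumptions, the already-quantized activation at input 0 flows directly into the first quantized op instead of passing through a freshly initialized fake_quant.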
@facebook-github-bot (Contributor) commented Dec 11, 2020

💊 CI failures summary and remediations

As of commit bd25bc0 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

This comment was automatically generated by Dr. CI.

vkuzo added a commit that referenced this pull request Dec 11, 2020
ghstack-source-id: c9f76756f5c34e1f65289984421e4a9df4fc2744
Pull Request resolved: #49239
@jerryzh168 (Contributor) left a comment

LG

vkuzo added a commit that referenced this pull request Dec 15, 2020
ghstack-source-id: 936518ba30591e75900c36168183f68f7e97e5ab
Pull Request resolved: #49239

codecov bot commented Dec 16, 2020

Codecov Report

Merging #49239 (bd25bc0) into gh/vkuzo/187/base (fe74970) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@                Coverage Diff                 @@
##           gh/vkuzo/187/base   #49239   +/-   ##
==================================================
  Coverage              80.62%   80.62%           
==================================================
  Files                   1875     1875           
  Lines                 202790   202797    +7     
==================================================
+ Hits                  163501   163510    +9     
+ Misses                 39289    39287    -2     

@facebook-github-bot (Contributor)
This pull request has been merged in 7542076.

@facebook-github-bot deleted the gh/vkuzo/187/head branch December 20, 2020 15:18
hwangdeyu pushed a commit to hwangdeyu/pytorch that referenced this pull request Jan 6, 2021
Summary:
Pull Request resolved: pytorch#49239

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D25499486

fbshipit-source-id: 303b49cc9d95a9fd06fef3b0859c08be34e19d8a