
onnx export of per channel fake quantize functions #42835

Closed

Conversation

Contributor

@skyw skyw commented Aug 10, 2020

Fixes #39502

This PR adds support for exporting fake_quantize_per_channel_affine to a pair of QuantizeLinear and DequantizeLinear ops. Per-tensor support was added by PR #39738.
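
For context, per-channel fake quantization applies an independent (scale, zero_point) pair along one axis of the tensor. A minimal illustration (the shapes and values below are made up):

```python
import torch

# Fake-quantize a (2, 3, 4) tensor along axis=1, with one (scale, zero_point)
# pair per channel; values are arbitrary, for illustration only.
x = torch.randn(2, 3, 4)
scale = torch.tensor([0.1, 0.2, 0.05])
zero_point = torch.zeros(3, dtype=torch.int32)  # integer zero points
y = torch.fake_quantize_per_channel_affine(
    x, scale, zero_point, axis=1, quant_min=-128, quant_max=127)
```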

The axis attribute of QuantizeLinear and DequantizeLinear, which is required for per-channel support, was added in opset 13 by onnx/onnx#2772.
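
For readers unfamiliar with the exporter, a symbolic function of roughly this shape lowers the op to a Q/DQ pair. This is a simplified sketch, not the exact code in this PR; the range check and the parse_args pattern here are assumptions:

```python
from torch.onnx.symbolic_helper import parse_args

@parse_args("v", "v", "v", "i", "i", "i")
def fake_quantize_per_channel_affine(g, inputs, scale, zero_point, axis,
                                     quant_min, quant_max):
    # ONNX QuantizeLinear/DequantizeLinear only model uint8/int8 ranges
    if (quant_min, quant_max) not in [(0, 255), (-128, 127)]:
        raise RuntimeError(
            "ONNX export only supports (0, 255) and (-128, 127) ranges")
    # opset 13 added the axis attribute, enabling per-channel scale/zero_point
    quantized = g.op("QuantizeLinear", inputs, scale, zero_point, axis_i=axis)
    return g.op("DequantizeLinear", quantized, scale, zero_point, axis_i=axis)
```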

[update 1/20/2021]: opset 13 is now supported on master, so the added function is properly tested. The code has also been rebased onto the new master.

The function was also tested offline with the following code:

```python
import torch
from torch import quantization
from torchvision import models

# Load a pretrained ResNet-18 and prepare it for quantization-aware training
qat_resnet18 = models.resnet18(pretrained=True).eval().cuda()

# Per-tensor fake quantization for activations, per-channel for weights
qat_resnet18.qconfig = quantization.QConfig(
    activation=quantization.default_fake_quant,
    weight=quantization.default_per_channel_weight_fake_quant)
quantization.prepare_qat(qat_resnet18, inplace=True)
qat_resnet18.apply(quantization.enable_observer)
qat_resnet18.apply(quantization.enable_fake_quant)

# Run one batch so the observers record ranges, then freeze the qparams
dummy_input = torch.randn(16, 3, 224, 224).cuda()
_ = qat_resnet18(dummy_input)
for module in qat_resnet18.modules():
    if isinstance(module, quantization.FakeQuantize):
        module.calculate_qparams()
qat_resnet18.apply(quantization.disable_observer)

qat_resnet18.cuda()

input_names = ["actual_input_1"]
output_names = ["output1"]

# opset 13 is required for the axis attribute on QuantizeLinear/DequantizeLinear
torch.onnx.export(qat_resnet18, dummy_input, "quant_model.onnx",
                  verbose=True, opset_version=13)
```

It generates the desired graph, with each fake-quantize node exported as a QuantizeLinear/DequantizeLinear pair.
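
To confirm the structure, the exported model can be inspected with the onnx package (an assumed verification step, not part of this PR):

```python
import onnx

model = onnx.load("quant_model.onnx")
onnx.checker.check_model(model)
# Each fake-quantize node should now appear as a Q/DQ pair carrying axis
for node in model.graph.node:
    if node.op_type in ("QuantizeLinear", "DequantizeLinear"):
        axis = next((a.i for a in node.attribute if a.name == "axis"), None)
        print(node.op_type, "axis =", axis)
```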


dr-ci bot commented Aug 10, 2020

💊 CI failures summary and remediations

As of commit 0e35b9a (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 2/2 non-CircleCI failure(s)


Contributor Author

skyw commented Aug 11, 2020

@skipIfUnsupportedMinOpsetVersion(13)
def test_qat_resnet_per_channel(self):
    # Quantize ResNet50 model
    x = Variable(torch.randn(BATCH_SIZE, 3, 224, 224).fill_(1.0))
Contributor

Is explicit Variable still useful?

Contributor Author

@skyw skyw Aug 11, 2020

I guess not; I copied it from other tests.

It seems all the other tests use Variable. Should I just keep it consistent with the rest?

Collaborator

I would suggest removing the deprecated usage of Variable. The other tests could be updated in a separate PR.
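
For reference, Tensor and Variable were merged in PyTorch 0.4, so the wrapper is a no-op and the test input can be a plain tensor (BATCH_SIZE below is a placeholder for the test suite's own constant):

```python
import torch

BATCH_SIZE = 4  # placeholder; the test suite defines its own value

# Equivalent to Variable(torch.randn(...)) without the deprecated wrapper
x = torch.randn(BATCH_SIZE, 3, 224, 224).fill_(1.0)
```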

Contributor Author

Removed Variable in all fake quantization ONNX export tests.

@zou3519 zou3519 requested a review from houseroad August 11, 2020 14:04
@zou3519 zou3519 added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Aug 11, 2020
Contributor Author

skyw commented Aug 19, 2020

@houseroad, any more comments? Thanks.

@skyw skyw force-pushed the skyw/fake_quant_per_channel_onnx_export branch from 650f2e6 to 0a53c2d on August 20, 2020 16:10
@skyw skyw force-pushed the skyw/fake_quant_per_channel_onnx_export branch from cb9015d to 583b5ce on August 27, 2020 03:48

codecov bot commented Aug 27, 2020

Codecov Report

Merging #42835 (0e35b9a) into master (f68e5f1) will decrease coverage by 0.34%.
The diff coverage is 41.93%.

@@            Coverage Diff             @@
##           master   #42835      +/-   ##
==========================================
- Coverage   80.85%   80.50%   -0.35%     
==========================================
  Files        1931     1931              
  Lines      210934   210965      +31     
==========================================
- Hits       170544   169836     -708     
- Misses      40390    41129     +739     

Contributor Author

skyw commented Aug 27, 2020

ONNX Runtime now supports the opset 13 versions of QuantizeLinear and DequantizeLinear on master (microsoft/onnxruntime#4759). Verified that the Q/DQ nodes with axis generated by this change are valid ONNX ops.
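
A quick end-to-end check (an assumed workflow, not taken from this PR) is to run the exported model through ONNX Runtime:

```python
import numpy as np
import onnxruntime as ort

# Session creation fails if the opset 13 Q/DQ nodes were invalid
sess = ort.InferenceSession("quant_model.onnx")
inp = np.random.randn(16, 3, 224, 224).astype(np.float32)
out = sess.run(None, {sess.get_inputs()[0].name: inp})[0]
print(out.shape)  # expect (16, 1000) for ResNet-18
```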

Contributor Author

skyw commented Sep 4, 2020

TF2ONNX now supports this as well; see onnx/tensorflow-onnx#1081.

Pull in upstream master
Contributor

daquexian commented Nov 17, 2020

@houseroad @raghuramank100 Could you please review this PR? It is an important step toward deploying PyTorch QAT models on various backends like TensorRT and many mobile frameworks.

@skyw skyw force-pushed the skyw/fake_quant_per_channel_onnx_export branch from 44ac44b to da9c59b on January 19, 2021 20:09
@facebook-github-bot
Contributor

Hi @skyw!

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@skyw skyw force-pushed the skyw/fake_quant_per_channel_onnx_export branch from da9c59b to 1899521 on January 20, 2021 00:31
@skyw skyw force-pushed the skyw/fake_quant_per_channel_onnx_export branch from 1899521 to 154d556 on January 20, 2021 17:26
@facebook-github-bot
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@skyw skyw force-pushed the skyw/fake_quant_per_channel_onnx_export branch from cdb0de3 to b37a9e6 on January 28, 2021 21:11
Contributor Author

skyw commented Feb 2, 2021

@spandantiwari, rebased onto the new master.

Contributor

@neginraoof neginraoof left a comment

Thanks for adding this op.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@SplitInfinity has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


Member

@houseroad houseroad left a comment

Looks good.

@facebook-github-bot
Contributor

@SplitInfinity merged this pull request in 7363da7.

SplitInfinity pushed a commit that referenced this pull request Feb 18, 2021
Pull Request resolved: #42835

Reviewed By: houseroad

Differential Revision: D26293823

Pulled By: SplitInfinity

fbshipit-source-id: 300498a2e24b7731b12fa2fbdea4e73dde80e7ea
malfet pushed a commit that referenced this pull request Feb 18, 2021

Co-authored-by: Hao Wu <skyw@users.noreply.github.com>
Labels
cla signed · Merged · open source · triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
Development

Successfully merging this pull request may close these issues.

Export fake quantization function to ONNX