Add support for quantized operator conversion from PT to C2 via ONNX #29694


Closed · 22 commits

Conversation

@supriyar (Contributor) commented Nov 13, 2019

Stack from ghstack:

Summary:
This PR adds the preliminary support required to run quantized PyTorch models on a Caffe2 (C2) backend.
Quantized ops, i.e. those in the "quantized" namespace, are registered for export under the custom ONNX domain '_caffe2'.
The change also adds a JIT pass that unpacks the quantized weights and inserts the unpacked values into the graph; the actual tensor values are looked up from the params dict.

Test Plan:
Currently tested end-to-end for the Linear and Conv operators:

python test/onnx/test_pytorch_onnx_caffe2.py TestQuantizedOps


Differential Revision: D18467130
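
For context, here is a minimal sketch of the end-to-end flow this PR enables. The toy model, shapes, and tolerances are hypothetical; the only PR-specific piece is the export path itself (quantized ops emitted under the '_caffe2' domain, packed weights unpacked by the new JIT pass).

```python
# Minimal sketch (hypothetical model and shapes) of the conversion flow:
# quantize an eager-mode model, export it through torch.onnx, then run the
# resulting graph on the Caffe2 backend and compare.
import io

import numpy as np
import onnx
import torch
import caffe2.python.onnx.backend as c2

# Build and quantize a small fp32 model (post-training static quantization).
# On x86 this may require torch.backends.quantized.engine = "fbgemm".
model = torch.nn.Sequential(
    torch.quantization.QuantStub(),
    torch.nn.Linear(4, 8),
    torch.quantization.DeQuantStub(),
).eval()
model.qconfig = torch.quantization.default_qconfig
torch.quantization.prepare(model, inplace=True)
model(torch.randn(16, 4))  # calibrate the observers with sample data
torch.quantization.convert(model, inplace=True)

x = torch.randn(2, 4)
pt_out = model(x)

# Export; quantized ops end up in the custom '_caffe2' ONNX domain.
f = io.BytesIO()
torch.onnx.export(
    model, x, f,
    operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)
f.seek(0)

# Run the exported graph on Caffe2; outputs should agree up to
# quantization rounding error.
c2_out = c2.prepare(onnx.load(f)).run([x.numpy()])[0]
np.testing.assert_allclose(pt_out.detach().numpy(), c2_out, rtol=1e-2, atol=1e-2)
```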

@supriyar requested a review from @apaszke as a code owner (Nov 13, 2019, 00:45)
@facebook-github-bot added the "oncall: jit" label (Nov 13, 2019)
@supriyar requested reviews from @houseroad and @dreiss (Nov 13, 2019, 00:58)
supriyar added a commit that referenced this pull request (Nov 13, 2019). ghstack-source-id: efd184d. Pull Request resolved: #29694. The commit message repeats the PR summary above.
supriyar added a commit that referenced this pull request (Nov 13, 2019). ghstack-source-id: cbdf56a. Pull Request resolved: #29694. The commit message repeats the PR summary above.
@houseroad (Member) left a comment:

Great progress; I left a few comments. Please address them.

@houseroad (Member): The CI is broken too, check: https://circleci.com/gh/pytorch/pytorch/3580194?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

supriyar added a commit that referenced this pull request (Nov 13, 2019). ghstack-source-id: ceba000. Pull Request resolved: #29694. The commit message repeats the PR summary above.
supriyar added a commit that referenced this pull request (Nov 14, 2019). ghstack-source-id: a85b056. Pull Request resolved: #29694. The commit message repeats the PR summary above.
supriyar added a commit that referenced this pull request (Nov 14, 2019). ghstack-source-id: 5481eed. Pull Request resolved: #29694. The commit message repeats the PR summary above.
@supriyar requested a review from @houseroad (Nov 14, 2019, 00:57)
// Review context: inside getScaleFromInput in
// torch/csrc/jit/passes/onnx/unpack_quantized_weights.cpp
auto tmp = input_node->inputs()[0]->node();
return getScaleFromInput(tmp);
}
return 1.0;
Reviewer (Contributor): When does this happen?

Reviewer (Contributor): Do we need a default value here? Why not throw an exception? Wouldn't that be safer?

@supriyar (Author): Yes, that makes sense. I'll update it.

(The merged version does assert on unrecognized operators; see the "Unrecognized quantized operator while trying to compute q_scale" RuntimeError quoted in the comments near the bottom of this page.)

@houseroad (Member): Also, could you rebase onto master to avoid the failed CI builds? They keep sending us emails.

@@ -2438,5 +2438,89 @@ def setup_rnn_tests():
embed_params=True, opset_version=10))


class TestQuantizedOps(unittest.TestCase):
@houseroad (Member): Actually, why not create a test_pytorch_onnx_quantization_caffe2.py? I didn't see you reuse anything in this file.

@supriyar (Author): Done

@supriyar requested a review from @houseroad (Nov 15, 2019, 01:33)
supriyar added a commit that referenced this pull request (Nov 15, 2019). ghstack-source-id: 0b76868. Pull Request resolved: #29694. The commit message repeats the PR summary above.
@houseroad (Member) left a comment:

Looks great! I left two more comments. Also, please import the PR and make sure the internal signals are good; then we are good to go! Cheers :-)

        self.generic_test(QAddModule(), (x, y), input_names=["x", "y"])

    def test_quantized_relu(self):
        self.generic_unary_test(torch.nn.ReLU())
@houseroad (Member): Nit: also add a test to make sure regular relu can be correctly handled as well?

@supriyar (Author): test_linear in test_pytorch_onnx_caffe2.py already covers regular relu.
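
generic_unary_test itself is not visible in the quoted diff context; below is a hypothetical reconstruction of what such a helper plausibly looks like, assuming generic_test is the shared helper (see the QAddModule call above) that quantizes the module, exports it to ONNX, runs it on Caffe2, and compares the outputs.

```python
# Hypothetical reconstruction (not the PR's exact code) of a unary-op test
# helper in the style of TestQuantizedOps.generic_unary_test: wrap the op
# between quant/dequant stubs, then defer to the shared generic_test path.
import numpy as np
import torch


def generic_unary_test(self, op):
    class QModule(torch.nn.Module):
        def __init__(self, op):
            super(QModule, self).__init__()
            self.quant = torch.quantization.QuantStub()
            self.op = op
            self.dequant = torch.quantization.DeQuantStub()

        def forward(self, x):
            return self.dequant(self.op(self.quant(x)))

    x = np.random.rand(1, 2).astype("float32")
    # generic_test (assumed) exports the quantized module to ONNX and
    # checks the Caffe2 backend output against PyTorch's eager output.
    self.generic_test(QModule(op), (x,), input_names=["x"])
```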

supriyar added a commit that referenced this pull request (Nov 18, 2019). ghstack-source-id: 941e83b. Pull Request resolved: #29694. The commit message repeats the PR summary above.
@facebook-github-bot (Contributor): This pull request has been merged in 91c6d2e.

@Pluto1944 commented Jan 21, 2020:

@houseroad @supriyar Hi, I hit a problem when dumping my quantized PyTorch model to ONNX.

Traceback (most recent call last):
  File "get_pck_quant.py", line 129, in <module>
    onnx_model = export_to_onnx(model, x, input_names)
  File "get_pck_quant.py", line 118, in export_to_onnx
    operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/__init__.py", line 148, in export
    strip_doc_string, dynamic_axes, keep_initializers_as_inputs)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/utils.py", line 66, in export
    dynamic_axes=dynamic_axes, keep_initializers_as_inputs=keep_initializers_as_inputs)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/utils.py", line 416, in _export
    fixed_batch_size=fixed_batch_size)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/utils.py", line 296, in _model_to_graph
    fixed_batch_size=fixed_batch_size, params_dict=params_dict)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/utils.py", line 130, in _optimize_graph
    torch._C._jit_pass_onnx_unpack_quantized_weights(graph, params_dict)
RuntimeError: false INTERNAL ASSERT FAILED at /pytorch/torch/csrc/jit/passes/onnx/unpack_quantized_weights.cpp:86, please report a bug to PyTorch. Unrecognized quantized operator while trying to compute q_scale for operator aten::max_pool2d (getScaleFromInput at /pytorch/torch/csrc/jit/passes/onnx/unpack_quantized_weights.cpp:86)

So, how should I deal with aten::max_pool2d?

@supriyar (Author): Are you trying to convert the model to Caffe2? Currently this conversion flow only supports conversion to C2.

Your max_pool2d operator doesn't appear to be quantized. If it were quantized, the name would be quantized::max_pool2d: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/quantized/cpu/qpool.cpp#L414

To add similar support for quantized::max_pool2d, you can follow the steps we took for aten::avg_pool2d:
https://github.com/pytorch/pytorch/blob/master/torch/onnx/symbolic_caffe2.py#L159
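
To make that suggestion concrete, here is a rough sketch of what such a symbolic might look like, modeled on the avg_pool2d pattern in torch/onnx/symbolic_caffe2.py. The attribute names, the node-attribute accessor, and the _caffe2::Int8MaxPool signature are best-effort assumptions, not verified against the file; treat the linked source as authoritative.

```python
# Hypothetical sketch of a quantized max_pool2d symbolic for the '_caffe2'
# domain, following the avg_pool2d pattern in torch/onnx/symbolic_caffe2.py.
# Attribute names and the Int8MaxPool signature are assumptions.
from torch.nn.modules.utils import _pair


def max_pool2d(g, input, kernel_size, stride, padding, dilation, ceil_mode):
    if not stride:
        stride = kernel_size
    kwargs = {
        "strides_i": _pair(stride),
        "pads_i": _pair(padding) * 2,
        "kernel_i": _pair(kernel_size)[0],
        "order_s": "NCHW",
        # Max pooling does not change scale/zero_point, so forward the
        # quantization parameters recorded on the producing node.
        "Y_scale_f": input.node()["Y_scale"],
        "Y_zero_point_i": input.node()["Y_zero_point"],
    }
    return g.op("_caffe2::Int8MaxPool", input, **kwargs)
```

Symbolics in that file take effect once they are registered for the quantized domain during export; at the time of this PR, symbolic_caffe2.register_quantized_ops was the registration entry point.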

@Pluto1944: Thank you for your reply! I'm trying to convert the model to C2. Here is my code (#33932); I don't know why my max_pool2d operator doesn't appear to be quantized. Is there something wrong in my convert script?

@supriyar (Author): Answered in #33932.
