Add support for quantized operator conversion from PT to C2 via ONNX #29694
Conversation
Summary: This PR adds preliminary support required to run quantized PyTorch models on a C2 backend. For quantized ops we use a custom domain name 'caffe2' to register the ops if they are in the "quantized" namespace. The change also adds a JIT pass to unpack the quantized weights and insert the unpacked values into the graph. The actual tensor values are looked up from the params dict.

Test Plan: python test/onnx/test_pytorch_onnx_caffe2.py TestQuantizedOps

[ghstack-poisoned]
Great progress, left a few comments. Please address them.
auto tmp = input_node->inputs()[0]->node();
return getScaleFromInput(tmp);
}
return 1.0;
When does this happen?
Do we need a default value here? Why not throw an exception? Wouldn't that be safer?
Yes, that makes sense. I'll update it.
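For illustration, the reviewer's suggestion — fail loudly rather than silently returning a default scale of 1.0 — might look like the following simplified Python sketch. The `Node` class and `get_scale_from_input` here are stand-ins for the C++ JIT types in the PR, not the actual implementation:

```python
class Node:
    """Minimal stand-in for a JIT graph node: a kind string plus input nodes."""
    def __init__(self, kind, inputs=(), scale=None):
        self.kind = kind
        self.inputs = list(inputs)
        self.scale = scale  # set only on nodes that carry a quantization scale

def get_scale_from_input(node):
    """Walk back through producers until a node carrying a scale is found.

    Instead of falling back to a default of 1.0 (which can silently
    produce wrong numerics), raise when no scale can be recovered.
    """
    if node.scale is not None:
        return node.scale
    if node.inputs:
        return get_scale_from_input(node.inputs[0])
    raise ValueError(
        "cannot recover quantization scale for node kind %r" % node.kind)

q = Node("aten::quantize_per_tensor", scale=0.05)
relu = Node("quantized::relu", inputs=[q])
print(get_scale_from_input(relu))  # follows the input chain back to the quantize node
```

An unquantized producer chain now surfaces as an explicit `ValueError` instead of a silently wrong scale, which is the safety property the review comment asks for.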
Also, could you rebase onto master to avoid the failing CI builds? They will keep sending us emails.
@@ -2438,5 +2438,89 @@ def setup_rnn_tests():
embed_params=True, opset_version=10))

class TestQuantizedOps(unittest.TestCase):
Actually, why not create a test_pytorch_onnx_quantization_caffe2.py? I didn't see you reuse anything in this file.
Done
Looks great! Left two more comments. Also, please import the PR and make sure the internal signals are good; then we are good to go! Cheers :-)
self.generic_test(QAddModule(), (x, y), input_names=["x", "y"])

def test_quantized_relu(self):
    self.generic_unary_test(torch.nn.ReLU())
Nit: could you also add a test to make sure regular relu is handled correctly as well?
test_linear in test_pytorch_onnx_caffe2.py already covers regular relu.
This pull request has been merged in 91c6d2e.
@houseroad @supriyar Hi, I have a problem when dumping my quantized PyTorch model to ONNX:

Traceback (most recent call last):

So, how should I deal with aten::max_pool2d?
Are you trying to convert the model to Caffe2? Currently this conversion flow only supports conversion to C2. Your max_pool2d operator doesn't appear to be quantized; if it were quantized, the name would be quantized::max_pool2d. To add similar support for quantized::max_pool2d, you can follow the steps we took for aten::avg_pool2d.
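The routing rule behind this answer — ops in the `quantized` namespace go to the custom `_caffe2` domain, everything else takes the standard ONNX path — can be sketched as a tiny dispatch function. This is illustrative only, not the exporter's actual code; in particular, the real exporter maps each quantized op to a specific Caffe2 Int8 operator rather than reusing the base name:

```python
def route_op(op_name):
    """Illustrative dispatch: decide which export path an op kind takes.

    Ops in the 'quantized' namespace are exported as custom ops in the
    '_caffe2' domain; everything else goes through the standard ONNX
    symbolic registry. (Simplified sketch, not the real exporter.)
    """
    namespace, _, base = op_name.partition("::")
    if namespace == "quantized":
        return "_caffe2::" + base  # custom Caffe2 domain (name mapping simplified)
    return "onnx-standard"         # regular symbolic lookup

print(route_op("quantized::max_pool2d"))  # -> _caffe2::max_pool2d
print(route_op("aten::max_pool2d"))       # -> onnx-standard
```

This also shows why the question above fails: `aten::max_pool2d` never enters the quantized path, so it needs either a quantized counterpart in the model or a regular ONNX symbolic.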
Thank you for your reply!
Answered in #33932
Stack from ghstack:
Summary:
This PR adds preliminary support required to run quantized PyTorch models on a C2 backend.
For quantized ops we use a custom domain name '_caffe2' to register the ops if they are in the "quantized" namespace.
The change also adds a JIT pass to unpack the quantized weights and insert the unpacked values into the graph.
The actual tensor values are looked up from the params dict.
Test Plan:
Currently tested end-to-end for the Linear and Conv operators
python test/onnx/test_pytorch_onnx_caffe2.py TestQuantizedOps
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D18467130
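The weight-unpacking pass described in the summary can be sketched abstractly as follows. The graph and params-dict representations here are simplified stand-ins for the JIT structures, and `unpack_quantized_weights` is a hypothetical name, not the pass's real signature:

```python
def unpack_quantized_weights(graph_ops, params_dict, unpack_fn):
    """Simplified sketch of the unpacking pass.

    graph_ops:   list of (op_kind, arg_name) pairs standing in for graph nodes.
    params_dict: maps argument names to (possibly packed) parameter values.
    unpack_fn:   turns a packed value into its unpacked tensors.

    For quantized ops, replace the packed-parameter entry with the
    unpacked values so the downstream converter sees plain tensors.
    """
    for op_kind, arg_name in graph_ops:
        if op_kind.startswith("quantized::") and arg_name in params_dict:
            params_dict[arg_name] = unpack_fn(params_dict[arg_name])
    return params_dict

# Toy packed value: (weight, bias) bundled in a tuple, "unpacked" into a dict.
packed = ([1, 2, 3], [0.5])
params = {"fc1._packed_params": packed}
ops = [("quantized::linear", "fc1._packed_params")]
result = unpack_quantized_weights(
    ops, params, lambda p: {"weight": p[0], "bias": p[1]})
print(result["fc1._packed_params"]["weight"])  # [1, 2, 3]
```

The key idea matches the summary: the graph only references packed parameters by name, so the pass looks the actual tensor values up in the params dict and rewrites them in unpacked form.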