Add support for quantized operator conversion from PT to C2 via ONNX #29694


Closed · 22 commits

Conversation

@supriyar (Contributor) commented Nov 13, 2019

Stack from ghstack:

Summary:
This PR adds the preliminary support required to run quantized PyTorch models on a Caffe2 (C2) backend.
Quantized ops, i.e. those in the "quantized" namespace, are registered for export under the custom ONNX domain '_caffe2'.
The change also adds a JIT pass that unpacks the quantized weights and inserts the unpacked values into the graph; the actual tensor values are looked up from the params dict.

Test Plan:
Currently tested end-to-end for the Linear and Conv operators:

python test/onnx/test_pytorch_onnx_caffe2.py TestQuantizedOps


Differential Revision: D18467130
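
For context, here is a minimal sketch of the end-to-end flow this PR enables. The toy model, shapes, and tolerances are hypothetical; the only PR-specific piece is the export path itself (quantized ops emitted under the '_caffe2' domain, packed weights unpacked by the new JIT pass).

```python
# Minimal sketch (hypothetical model and shapes) of the conversion flow:
# quantize an eager-mode model, export it through torch.onnx, then run the
# resulting graph on the Caffe2 backend and compare.
import io

import numpy as np
import onnx
import torch
import caffe2.python.onnx.backend as c2

# Build and quantize a small fp32 model (post-training static quantization).
# On x86 this may require torch.backends.quantized.engine = "fbgemm".
model = torch.nn.Sequential(
    torch.quantization.QuantStub(),
    torch.nn.Linear(4, 8),
    torch.quantization.DeQuantStub(),
).eval()
model.qconfig = torch.quantization.default_qconfig
torch.quantization.prepare(model, inplace=True)
model(torch.randn(16, 4))  # calibrate the observers with sample data
torch.quantization.convert(model, inplace=True)

x = torch.randn(2, 4)
pt_out = model(x)

# Export; quantized ops end up in the custom '_caffe2' ONNX domain.
f = io.BytesIO()
torch.onnx.export(
    model, x, f,
    operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)
f.seek(0)

# Run the exported graph on Caffe2; outputs should agree up to
# quantization rounding error.
c2_out = c2.prepare(onnx.load(f)).run([x.numpy()])[0]
np.testing.assert_allclose(pt_out.detach().numpy(), c2_out, rtol=1e-2, atol=1e-2)
```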

@supriyar requested a review from @apaszke as a code owner (Nov 13, 2019, 00:45)
@facebook-github-bot added the "oncall: jit" label (Nov 13, 2019)
@supriyar requested reviews from @houseroad and @dreiss (Nov 13, 2019, 00:58)
supriyar added a commit that referenced this pull request (Nov 13, 2019). ghstack-source-id: efd184d. Pull Request resolved: #29694. The commit message repeats the PR summary above.
supriyar added a commit that referenced this pull request (Nov 13, 2019). ghstack-source-id: cbdf56a. Pull Request resolved: #29694. The commit message repeats the PR summary above.
@houseroad (Member) left a comment:

Great progress; I left a few comments. Please address them.

@houseroad (Member): The CI is broken too, check: https://circleci.com/gh/pytorch/pytorch/3580194?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

supriyar added a commit that referenced this pull request (Nov 13, 2019). ghstack-source-id: ceba000. Pull Request resolved: #29694. The commit message repeats the PR summary above.
supriyar added a commit that referenced this pull request (Nov 14, 2019). ghstack-source-id: a85b056. Pull Request resolved: #29694. The commit message repeats the PR summary above.
supriyar added a commit that referenced this pull request (Nov 14, 2019). ghstack-source-id: 5481eed. Pull Request resolved: #29694. The commit message repeats the PR summary above.
@supriyar requested a review from @houseroad (Nov 14, 2019, 00:57)
// Review context: inside getScaleFromInput in
// torch/csrc/jit/passes/onnx/unpack_quantized_weights.cpp
auto tmp = input_node->inputs()[0]->node();
return getScaleFromInput(tmp);
}
return 1.0;
Reviewer (Contributor): When does this happen?

Reviewer (Contributor): Do we need a default value here? Why not throw an exception? Wouldn't that be safer?

@supriyar (Author): Yes, that makes sense. I'll update it.

(The merged version does assert on unrecognized operators; see the "Unrecognized quantized operator while trying to compute q_scale" RuntimeError quoted in the comments near the bottom of this page.)

@houseroad (Member): Also, could you rebase onto master to avoid the failed CI builds? They keep sending us emails.

@@ -2438,5 +2438,89 @@ def setup_rnn_tests():
embed_params=True, opset_version=10))


class TestQuantizedOps(unittest.TestCase):
@houseroad (Member): Actually, why not create a test_pytorch_onnx_quantization_caffe2.py? I didn't see you reuse anything in this file.

@supriyar (Author): Done

@supriyar requested a review from @houseroad (Nov 15, 2019, 01:33)
supriyar added a commit that referenced this pull request (Nov 15, 2019). ghstack-source-id: 0b76868. Pull Request resolved: #29694. The commit message repeats the PR summary above.
@houseroad (Member) left a comment:

Looks great! I left two more comments. Also, please import the PR and make sure the internal signals are good; then we are good to go! Cheers :-)

        self.generic_test(QAddModule(), (x, y), input_names=["x", "y"])

    def test_quantized_relu(self):
        self.generic_unary_test(torch.nn.ReLU())
@houseroad (Member): Nit: also add a test to make sure regular relu can be correctly handled as well?

@supriyar (Author): test_linear in test_pytorch_onnx_caffe2.py already covers regular relu.
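
generic_unary_test itself is not visible in the quoted diff context; below is a hypothetical reconstruction of what such a helper plausibly looks like, assuming generic_test is the shared helper (see the QAddModule call above) that quantizes the module, exports it to ONNX, runs it on Caffe2, and compares the outputs.

```python
# Hypothetical reconstruction (not the PR's exact code) of a unary-op test
# helper in the style of TestQuantizedOps.generic_unary_test: wrap the op
# between quant/dequant stubs, then defer to the shared generic_test path.
import numpy as np
import torch


def generic_unary_test(self, op):
    class QModule(torch.nn.Module):
        def __init__(self, op):
            super(QModule, self).__init__()
            self.quant = torch.quantization.QuantStub()
            self.op = op
            self.dequant = torch.quantization.DeQuantStub()

        def forward(self, x):
            return self.dequant(self.op(self.quant(x)))

    x = np.random.rand(1, 2).astype("float32")
    # generic_test (assumed) exports the quantized module to ONNX and
    # checks the Caffe2 backend output against PyTorch's eager output.
    self.generic_test(QModule(op), (x,), input_names=["x"])
```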

supriyar added a commit that referenced this pull request (Nov 18, 2019). ghstack-source-id: 941e83b. Pull Request resolved: #29694. The commit message repeats the PR summary above.
@facebook-github-bot (Contributor): This pull request has been merged in 91c6d2e.

@Pluto1944 commented Jan 21, 2020:

@houseroad @supriyar Hi, I hit a problem when dumping my quantized PyTorch model to ONNX.

Traceback (most recent call last):
  File "get_pck_quant.py", line 129, in <module>
    onnx_model = export_to_onnx(model, x, input_names)
  File "get_pck_quant.py", line 118, in export_to_onnx
    operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/__init__.py", line 148, in export
    strip_doc_string, dynamic_axes, keep_initializers_as_inputs)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/utils.py", line 66, in export
    dynamic_axes=dynamic_axes, keep_initializers_as_inputs=keep_initializers_as_inputs)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/utils.py", line 416, in _export
    fixed_batch_size=fixed_batch_size)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/utils.py", line 296, in _model_to_graph
    fixed_batch_size=fixed_batch_size, params_dict=params_dict)
  File ".../.local/lib/python3.5/site-packages/torch/onnx/utils.py", line 130, in _optimize_graph
    torch._C._jit_pass_onnx_unpack_quantized_weights(graph, params_dict)
RuntimeError: false INTERNAL ASSERT FAILED at /pytorch/torch/csrc/jit/passes/onnx/unpack_quantized_weights.cpp:86, please report a bug to PyTorch. Unrecognized quantized operator while trying to compute q_scale for operator aten::max_pool2d (getScaleFromInput at /pytorch/torch/csrc/jit/passes/onnx/unpack_quantized_weights.cpp:86)

So, how should I deal with aten::max_pool2d?

@supriyar (Author): Are you trying to convert the model to Caffe2? Currently this conversion flow only supports conversion to C2.

Your max_pool2d operator doesn't appear to be quantized. If it were quantized, the name would be quantized::max_pool2d: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/quantized/cpu/qpool.cpp#L414

To add similar support for quantized::max_pool2d, you can follow the steps we took for aten::avg_pool2d:
https://github.com/pytorch/pytorch/blob/master/torch/onnx/symbolic_caffe2.py#L159
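
To make that suggestion concrete, here is a rough sketch of what such a symbolic might look like, modeled on the avg_pool2d pattern in torch/onnx/symbolic_caffe2.py. The attribute names, the node-attribute accessor, and the _caffe2::Int8MaxPool signature are best-effort assumptions, not verified against the file; treat the linked source as authoritative.

```python
# Hypothetical sketch of a quantized max_pool2d symbolic for the '_caffe2'
# domain, following the avg_pool2d pattern in torch/onnx/symbolic_caffe2.py.
# Attribute names and the Int8MaxPool signature are assumptions.
from torch.nn.modules.utils import _pair


def max_pool2d(g, input, kernel_size, stride, padding, dilation, ceil_mode):
    if not stride:
        stride = kernel_size
    kwargs = {
        "strides_i": _pair(stride),
        "pads_i": _pair(padding) * 2,
        "kernel_i": _pair(kernel_size)[0],
        "order_s": "NCHW",
        # Max pooling does not change scale/zero_point, so forward the
        # quantization parameters recorded on the producing node.
        "Y_scale_f": input.node()["Y_scale"],
        "Y_zero_point_i": input.node()["Y_zero_point"],
    }
    return g.op("_caffe2::Int8MaxPool", input, **kwargs)
```

Symbolics in that file take effect once they are registered for the quantized domain during export; at the time of this PR, symbolic_caffe2.register_quantized_ops was the registration entry point.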

@Pluto1944: Thank you for your reply! I'm trying to convert the model to C2. Here is my code (#33932); I don't know why my max_pool2d operator doesn't appear to be quantized. Is there something wrong in my convert script?

@supriyar (Author): Answered in #33932.
