[Quant] Add fused conv2d_add op for onednn backend #90262

Conversation

@leslie-fang-intel (Collaborator) commented Dec 6, 2022

Stack from ghstack (oldest at bottom):

Summary
Post-op fusion can reduce data-movement overhead and improve inference performance. This PR adds a fused `conv2d_add` op for the onednn backend, to be used for int8 inference with that backend. Calling this op with any other quantization backend raises an error.
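To make the fusion concrete, below is a minimal sketch of the unfused pattern that `conv2d_add` replaces, written against the existing eager quantized ops. The shapes, scales, and zero points are illustrative, it assumes a PyTorch build where the onednn quantized engine is available, and the fused call is left commented out because its exact argument order is an assumption here rather than taken from the op registration.

```
import torch

# The fused op targets the onednn backend only; other quantization
# engines raise an error (per this PR).
torch.backends.quantized.engine = "onednn"

x = torch.rand(1, 3, 8, 8)
residual = torch.rand(1, 3, 8, 8)  # e.g. a skip-connection input
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
qres = torch.quantize_per_tensor(residual, scale=0.2, zero_point=0, dtype=torch.quint8)

w = torch.rand(3, 3, 3, 3)
qw = torch.quantize_per_tensor(w, scale=0.05, zero_point=0, dtype=torch.qint8)
packed = torch.ops.quantized.conv2d_prepack(
    qw, None, [1, 1], [1, 1], [1, 1], 1)  # stride, padding, dilation, groups

# Unfused: conv2d materializes its int8 output, then add reads it back in.
y = torch.ops.quantized.conv2d(qx, packed, 0.2, 0)
out = torch.ops.quantized.add(y, qres, 0.2, 0)

# Fused (this PR): one kernel, no intermediate tensor round trip.
# Hypothetical argument order; see the op registration for the real one.
# out = torch.ops.quantized.conv2d_add(qx, qres, packed, 0.2, 0)
```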

Test Plan

python -m pytest test_quantization.py::TestQuantizedConv

cc @VitalyFedyunin @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @gujinghui @PenghuiCheng @jianyuh @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen

@pytorch-bot (bot) commented Dec 6, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90262

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8baef3e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

leslie-fang-intel added a commit that referenced this pull request Dec 6, 2022
ghstack-source-id: ca56717ef6b01387b769bd7b679b3e82d5d696b4
Pull Request resolved: #90262
@github-actions github-actions bot added module: cpu CPU specific problem (e.g., perf, algorithm) module: mkldnn Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration labels Dec 6, 2022
@leslie-fang-intel leslie-fang-intel marked this pull request as draft December 6, 2022 06:26
@leslie-fang-intel leslie-fang-intel changed the title [Quant] Add fused conv_add op for onednn backend [WIP] [Quant] Add fused conv_add op for onednn backend Dec 6, 2022
leslie-fang-intel added a commit that referenced this pull request Dec 6, 2022
ghstack-source-id: a7f5fb324870a2dbdecb84ed9ee76446f6450002
Pull Request resolved: #90262
leslie-fang-intel added a commit that referenced this pull request Dec 7, 2022
ghstack-source-id: 324e97d3dcda37d6abad83af61453aaf0e9d209e
Pull Request resolved: #90262
@leslie-fang-intel leslie-fang-intel changed the title [WIP] [Quant] Add fused conv_add op for onednn backend [Quant] Add fused conv_add op for onednn backend Dec 7, 2022
@leslie-fang-intel leslie-fang-intel added intel This tag is for PR from Intel ciflow/trunk Trigger trunk jobs on your pull request labels Dec 7, 2022
@leslie-fang-intel leslie-fang-intel changed the title [Quant] Add fused conv_add op for onednn backend [Quant] Add fused conv2d_add op for onednn backend Dec 7, 2022
leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Dec 20, 2022
ghstack-source-id: 2b22fc9ffcf1350eac41ce219dfe6d3110abd668
Pull Request resolved: pytorch#90262
```
@@ -4687,7 +4709,7 @@ def _test_qconv_impl(
        Y_scale=st.floats(4.2, 5.6),
        Y_zero_point=st.integers(0, 4),
        use_bias=st.booleans(),
-       use_relu=st.booleans(),
+       post_op=st.sampled_from(["none", "relu"]),
```
Contributor:

This can be a separate PR, but it might make sense to split the conv and conv_relu tests as well.

leslie-fang-intel (Collaborator, Author):

Thanks for the suggestion. I have split the conv and conv_relu tests, and did the same for the other similar test cases.
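For context, a minimal sketch of what the split looks like: each variant becomes its own hypothesis test that forwards a fixed post_op to the shared `_test_qconv_impl` helper. The helper's real argument list is much longer, so the forwarding below is heavily abbreviated and illustrative.

```
from hypothesis import given, strategies as st

def _test_qconv_impl(post_op, use_bias):
    ...  # shared body in the real test file (elided here)

# Instead of one test drawing post_op from st.sampled_from(["none", "relu"]),
# each fused variant gets its own test with a fixed post_op.
@given(use_bias=st.booleans())
def test_qconv2d(use_bias):
    _test_qconv_impl(post_op="none", use_bias=use_bias)

@given(use_bias=st.booleans())
def test_qconv2d_relu(use_bias):
    _test_qconv_impl(post_op="relu", use_bias=use_bias)
```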

```
@@ -4780,7 +4886,7 @@ def test_qconv2d(
        Y_scale=st.floats(4.2, 5.6),
        Y_zero_point=st.sampled_from([0]),
        use_bias=st.booleans(),
-       use_relu=st.booleans(),
+       post_op=st.sampled_from(["none", "relu"]),
```
Contributor:

Same here.

leslie-fang-intel (Collaborator, Author):

Thanks for the suggestion; these are split into separate tests as well.

Comment on lines 4839 to 4840:

```
if post_op == "add":
    qconv = torch.ops.quantized.conv2d_add
```
@jerryzh168 (Contributor) commented Jan 9, 2023:

If this is only "add", we can remove the post_op argument and also this check.

leslie-fang-intel (Collaborator, Author):

Thanks for the suggestion. I have removed the post_op argument and the check. In the next PR, I will put conv2d_add_relu into a separate test.
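With the post_op argument removed, the dedicated add test can bind the fused op unconditionally. A minimal sketch of the resulting shape, with the same caveats as above (the helper and its forwarding are illustrative):

```
import torch
from hypothesis import given, strategies as st

def _test_qconv_impl(qconv, use_bias):
    ...  # shared body (elided)

@given(use_bias=st.booleans())
def test_qconv2d_add(use_bias):
    # No branch needed: this test exists solely for the fused add op.
    _test_qconv_impl(qconv=torch.ops.quantized.conv2d_add, use_bias=use_bias)
```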

leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jan 13, 2023
ghstack-source-id: bb2010eccb3737ed8d8706fa43c7c24982a64b72
Pull Request resolved: pytorch#90262
leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jan 26, 2023
ghstack-source-id: bb2010eccb3737ed8d8706fa43c7c24982a64b72
Pull Request resolved: pytorch#90262
@leslie-fang-intel (Collaborator, Author) commented:

@pytorchbot merge

@pytorchmergebot commented:

Merge started
Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

@facebook-github-bot facebook-github-bot deleted the gh/leslie-fang-intel/4/head branch June 8, 2023 17:53
Labels

- `ciflow/trunk` - Trigger trunk jobs on your pull request
- `intel` - This tag is for PR from Intel
- `Merged`
- `module: cpu` - CPU specific problem (e.g., perf, algorithm)
- `module: mkldnn` - Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration
- `open source`
- `release notes: quantization` - release notes category
Projects
Status: Done

5 participants