
[Quant] Add fused LinearLeakyReLU module for onednn backend #88661

Closed
wants to merge 16 commits into from

Conversation

Xia-Weiwen
Collaborator

@Xia-Weiwen Xia-Weiwen commented Nov 8, 2022

Stack from ghstack (oldest at bottom):

Summary
Post-op fusion can reduce data movement overhead and improve inference performance. This PR adds a fused `QLinearLeakyReLU` module for the onednn backend, which will be used for int8 inference with the onednn backend. This module cannot be called with other quantization backends; otherwise an error is thrown.

Test plan
python test_quantization.py TestStaticQuantizedModule

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10
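As background on the post-op-fusion claim in the summary: a fused kernel applies the activation while the linear output is still in registers or cache, instead of writing the full output tensor and re-reading it for the activation pass. The plain-Python sketch below illustrates that idea only; it is not the PR's actual onednn kernel, and the function names are invented for illustration.

```python
# Illustrative sketch only -- NOT the PR's onednn implementation.
# Shows why fusing a post op (LeakyReLU) into the linear kernel
# saves a full pass over the output buffer.

def linear(x, w, b):
    # y[i] = sum_j(w[i][j] * x[j]) + b[i]
    return [sum(wi * xj for wi, xj in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def leaky_relu(y, negative_slope=0.01):
    # Separate second pass: re-reads every element of y.
    return [v if v > 0 else negative_slope * v for v in y]

def fused_linear_leaky_relu(x, w, b, negative_slope=0.01):
    # Apply the activation while the accumulator is still "hot",
    # instead of writing the linear output and re-reading it.
    out = []
    for row, bi in zip(w, b):
        acc = sum(wi * xj for wi, xj in zip(row, x)) + bi
        out.append(acc if acc > 0 else negative_slope * acc)
    return out

x = [1.0, -2.0]
w = [[0.5, 0.5], [1.0, 1.0]]
b = [0.0, 0.5]
assert fused_linear_leaky_relu(x, w, b) == leaky_relu(linear(x, w, b))
```

Both paths compute identical values; the fused version simply touches memory once, which is where the inference speedup comes from.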

@pytorch-bot

pytorch-bot bot commented Nov 8, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88661

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 08a2e13:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Xia-Weiwen added a commit that referenced this pull request Nov 8, 2022
ghstack-source-id: ae263fe5175dfac481e98e40ec31dc3b54585530
Pull Request resolved: #88661
@Xia-Weiwen Xia-Weiwen marked this pull request as draft November 8, 2022 09:02
@Xia-Weiwen Xia-Weiwen added the intel This tag is for PR from Intel label Nov 8, 2022
Xia-Weiwen added a commit that referenced this pull request Nov 8, 2022
ghstack-source-id: 2f46d71f686e94140c1a9242bbeb49efde77bf1a
Pull Request resolved: #88661
@jerryzh168
Contributor

@Xia-Weiwen
Collaborator Author

should we add a test in https://github.com/pytorch/pytorch/blob/master/test/quantization/eager/test_quantize_eager_ptq.py#L76

Thanks for the suggestion, but I did not see tests for fused modules there.
Shall we add it here instead?

class TestStaticQuantizedModule(QuantizationTestCase):

If so, do you suggest adding a new test implementation for LinearLeakyReLU, or modifying `_test_linear_api_impl` below so that it supports more fusions?
def _test_linear_api_impl(self, batch_size, in_features, out_features, use_bias, use_fused, per_channel):

Collaborator

@jgong5 jgong5 left a comment


LGTM on the changes. Consider adding a test as Jerry suggested.

@Xia-Weiwen
Collaborator Author

Hi @jerryzh168 I have added a test here: test/quantization/core/test_quantized_module.py. Please take a look.

@Xia-Weiwen Xia-Weiwen marked this pull request as ready for review November 11, 2022 05:30
Comment on lines 89 to 98
class_map = {
    True: nniq.LinearReLU,
    False: nnq.Linear,
    'none': nnq.Linear,
    'relu': nniq.LinearReLU,
    'leaky_relu': nniq.LinearLeakyReLU,
}
q_module_name_map = {
    'none': 'QuantizedLinear',
    'relu': 'QuantizedLinearReLU',
    'leaky_relu': 'QuantizedLinearLeakyReLU',
}
Contributor


I think we should probably pass these in as arguments; it looks a bit odd that we hardcode them in this helper function.

Contributor


Also, for naming, I think we can rename test_linear_api to test_linear as well, to make it clearer.

Contributor


And we can split test_linear into test_linear and test_linear_relu, to make sure each test only tests one thing.

Collaborator Author


OK I will do that.

Collaborator Author


I have made these arguments of the helper function and split test_qlinear into two tests, for qlinear and qlinear_relu respectively.
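The shape of that split might look like the following hypothetical sketch (a toy stub module, not the actual TestStaticQuantizedModule code): a shared parameterized helper, with one thin public test per variant so each test exercises exactly one module.

```python
import unittest

def quantized_linear_stub(x, post_op='none'):
    # Stand-in for the real quantized module under test:
    # a toy "linear" (scale by 2) with an optional fused ReLU post op.
    y = x * 2.0
    return max(y, 0.0) if post_op == 'relu' else y

class TestStaticQuantizedModuleSketch(unittest.TestCase):
    def _test_qlinear_impl(self, post_op, expected):
        # Shared parameterized helper; each public test passes one post op.
        self.assertEqual(quantized_linear_stub(-1.5, post_op), expected)

    def test_qlinear(self):
        # Plain linear only.
        self._test_qlinear_impl('none', -3.0)

    def test_qlinear_relu(self):
        # Fused variant only.
        self._test_qlinear_impl('relu', 0.0)
```

Run with `python -m unittest` as usual; a failure then points directly at the one fusion variant that broke.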

@jerryzh168
Contributor

Please add the summary and test plan as suggested in the previous PR as well. Also, are we planning to use this in eager mode or just FX graph mode? And how urgent is this change? Can it wait for PyTorch 2.0, where we integrate quantization into the PyTorch 2.0 export path and these changes may not be needed?

def from_reference(cls, ref_mod, output_scale, output_zero_point):
    linear = ref_mod[0]
    leaky_relu = ref_mod[1]
    qlinear = cls(
Contributor


nit: qlinear -> qlinear_leakyrelu to make it clearer

Collaborator Author


It's fixed.

@Xia-Weiwen
Collaborator Author

Please add the summary and test plan as suggested in the previous PR as well.

Ok. I will add them later.

Also, are we planning to use this in eager mode or just FX graph mode?

It's just for FX mode.

And how urgent is this change? Can it wait for PyTorch 2.0, where we integrate quantization into the PyTorch 2.0 export path and these changes may not be needed?

Not that urgent, but it would be nice to land it ASAP. I wonder what the quantization path will look like in PyTorch 2.0. Do you mean that there will be a different mechanism for fusion, so these PRs are not needed? If so, we would like to align with 2.0 to support more fusion patterns. How can we do that? Thanks.

@jerryzh168
Contributor

yeah, looks good, please feel free to land

@Xia-Weiwen
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge failed

Reason: Approval needed from one of the following (Rule 'superuser'):
teng-li, anj-s, vtsyvina, hwangjeff, clee2000, ...

Details for Dev Infra team Raised by workflow job

@Xia-Weiwen
Collaborator Author

Xia-Weiwen commented Dec 14, 2022

yeah, looks good, please feel free to land

Hi @jerryzh168, you have not approved this yet. Could you approve it? Thanks!

Contributor

@jerryzh168 jerryzh168 left a comment


Sorry, I missed this before. Please revert the changes to the torch/nn/intrinsic/ folder; we want to deprecate this namespace and should not add new things to it.

@Xia-Weiwen
Collaborator Author

Sorry, I missed this before. Please revert the changes to the torch/nn/intrinsic/ folder; we want to deprecate this namespace and should not add new things to it.

I have removed them.

@Xia-Weiwen
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@Xia-Weiwen
Collaborator Author

Xia-Weiwen commented Dec 16, 2022

@pytorchbot revert -m="This is breaking tests. Need to rebase." -c=nosignal

@pytorch-bot

pytorch-bot bot commented Dec 16, 2022

❌ 🤖 pytorchbot command failed:

@pytorchbot revert: error: the following arguments are required: -c/--classification

usage: @pytorchbot revert -m MESSAGE -c
                          {nosignal,ignoredsignal,landrace,weird,ghfirst}

Try @pytorchbot --help for more info.

@Xia-Weiwen
Collaborator Author

@pytorchbot revert -m="This is breaking tests. Need to rebase." -c=nosignal

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Collaborator

@Xia-Weiwen your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Dec 16, 2022
…88661)"

This reverts commit 353c2e7.

Reverted #88661 on behalf of https://github.com/Xia-Weiwen due to This is breaking tests. Need to rebase.
@Xia-Weiwen Xia-Weiwen reopened this Dec 16, 2022
@Xia-Weiwen
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@facebook-github-bot facebook-github-bot deleted the gh/Xia-Weiwen/2/head branch June 8, 2023 14:57
Labels: ciflow/trunk, intel, Merged, open source, release notes: AO frontend, release notes: quantization, Reverted
5 participants