[Quant][Inductor] Enable quantized conv weight prepack inside inductor constant folding #104581

Conversation

@leslie-fang-intel (Collaborator) commented Jul 4, 2023

Stack from ghstack (oldest at bottom):

Summary
Enable quantized conv weight prepack inside inductor constant folding.

Test Plan

python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_unary

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov
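
For context, "constant folding" here means that once the quantized weight and its quantization parameters are compile-time constants, the weight-prepack op can be evaluated a single time during compilation and replaced by its precomputed result. Below is a minimal `torch.fx` sketch of that idea; the `prepack_fn` argument and the `_folded_*` attribute names are hypothetical placeholders, not Inductor's actual pass or the oneDNN qconv prepack op.

```python
import torch.fx as fx

def fold_constant_prepacks(gm: fx.GraphModule, prepack_fn) -> fx.GraphModule:
    """Evaluate prepack calls whose inputs are all constants and replace
    them with frozen attributes (illustrative sketch, not Inductor's pass)."""
    for node in list(gm.graph.nodes):
        if node.op != "call_function" or node.target is not prepack_fn:
            continue
        # Only fold when every input is a get_attr node, i.e. a constant.
        if not all(isinstance(a, fx.Node) and a.op == "get_attr" for a in node.args):
            continue
        # Assumes flat attribute names (no dots) -- enough for a sketch.
        const_args = [getattr(gm, a.target) for a in node.args]
        packed = prepack_fn(*const_args)        # run the prepack once, at compile time
        attr_name = f"_folded_{node.name}"
        setattr(gm, attr_name, packed)          # freeze the prepacked weight
        with gm.graph.inserting_before(node):
            folded = gm.graph.get_attr(attr_name)
        node.replace_all_uses_with(folded)
        gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
    return gm
```

At run time the prepacked weight is then read as a frozen constant instead of being repacked on every call.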

@pytorch-bot (bot) commented Jul 4, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/104581

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit e53478a with merge base 97a291f:

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the `release notes: quantization` label Jul 4, 2023
@github-actions github-actions bot added the `module: cpu`, `module: inductor`, and `ciflow/inductor` labels Jul 4, 2023
leslie-fang-intel added a commit that referenced this pull request Jul 4, 2023
ghstack-source-id: 89ffc1f80fcbcb289ff2308b6f9e879ee67461d9
Pull Request resolved: #104581
@leslie-fang-intel leslie-fang-intel marked this pull request as draft July 4, 2023 07:16
@leslie-fang-intel leslie-fang-intel changed the title from "Enable quantized conv weight prepack inside inductor constant folding" to "[Quant][Inductor] Enable quantized conv weight prepack inside inductor constant folding" Jul 4, 2023
leslie-fang-intel added a commit that referenced this pull request Jul 4, 2023
ghstack-source-id: 24bdf37b9197c64f3186b3c4f6e40ed70bc638ef
Pull Request resolved: #104581
leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jul 4, 2023
ghstack-source-id: 17c060ea0a43b0a44cd73dbb2a4e4235a1e77dab
Pull Request resolved: pytorch#104581
leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jul 6, 2023
ghstack-source-id: 17c060ea0a43b0a44cd73dbb2a4e4235a1e77dab
Pull Request resolved: pytorch#104581
leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jul 6, 2023
ghstack-source-id: 25217f0f3564e986499b597d2fbae84cbd01a5ab
Pull Request resolved: pytorch#104581
leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Aug 25, 2023
ghstack-source-id: 298da79ef2a83af088e0daffb275d7ab7df56d27
Pull Request resolved: pytorch#104581
@leslie-fang-intel (Collaborator, Author) commented:

@pytorchbot merge

@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorchmergebot pushed a commit that referenced this pull request Aug 25, 2023
…ctor (#104588)

**Summary**
Enable the `dequant-quantization-quant` pattern fusion and lowering inside inductor.

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_unary
```

Pull Request resolved: #104588
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581
pytorchmergebot pushed a commit that referenced this pull request Aug 25, 2023
**Summary**
Enable the `dequant pattern` promotion pass in inductor. In the qconv weight prepack pass we match the `dequant->conv2d` pattern; if the dequant node has multiple user nodes, the pattern fails to match.
Take the following example:
```
        conv1
       /     \
   conv2    conv3
```
After the quantization flow, the generated pattern is
```
      dequant1
          |
        conv1
          |
        quant2
          |
       dequant2
       /     \
   conv2    conv3
```
We need to duplicate `dequant2` into `dequant2` and `dequant3` so that both the `dequant2->conv2` and `dequant3->conv3` patterns can be matched.

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_dequant_promotion
```

Pull Request resolved: #104590
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588
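
A rough `torch.fx` sketch of the duplication step described above (hedged: Inductor's actual dequant promotion pass is pattern-matcher based and more involved, and the `is_dequant` predicate is left to the caller). The idea is simply to give every extra consumer of a multi-user dequant node its own copy, so each `dequant->conv` pair can be matched independently:

```python
import torch.fx as fx

def promote_dequants(gm: fx.GraphModule, is_dequant) -> fx.GraphModule:
    """Duplicate every dequant node that has more than one user so each
    consumer sees a private dequant node (illustrative sketch only)."""
    for node in list(gm.graph.nodes):
        if not (node.op == "call_function" and is_dequant(node)):
            continue
        users = list(node.users)
        # The first user keeps the original node; later users get clones.
        for user in users[1:]:
            with gm.graph.inserting_after(node):
                clone = gm.graph.node_copy(node, lambda n: n)  # same inputs as the original
            user.replace_input_with(node, clone)
    gm.graph.lint()
    gm.recompile()
    return gm
```

In the example above this turns the single `dequant2` feeding `conv2` and `conv3` into two identical dequant nodes, one per convolution.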
pytorchmergebot pushed a commit that referenced this pull request Aug 25, 2023
… inside inductor (#105455)

**Summary**
Enable the `dequant-conv2d-unary_postop(relu)-quant` pattern fusion and lowering inside inductor.

**Test Plan**
```
clear && python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_unary
```

Pull Request resolved: #105455
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590
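
For reference, here is what the `dequant-conv2d-unary_postop(relu)-quant` computation looks like in eager mode with the affine quantize/dequantize math written out. This is only a semantic illustration of the pattern that the fusion replaces with a single int8 convolution kernel; the traced graph actually uses the decomposed quantization ops, and the scales/zero-points below are made-up values.

```python
import torch
import torch.nn.functional as F

def quantize(x, scale, zp, qmin=0, qmax=255):
    # Affine per-tensor quantization to uint8.
    return torch.clamp(torch.round(x / scale) + zp, qmin, qmax).to(torch.uint8)

def dequantize(q, scale, zp):
    return (q.to(torch.float32) - zp) * scale

def qconv_relu_reference(q_x, x_scale, x_zp, w, b, y_scale, y_zp):
    """dequant -> conv2d -> relu -> quant: the fp32 reference path that the
    Inductor fusion collapses into one int8 conv kernel."""
    x = dequantize(q_x, x_scale, x_zp)
    y = F.relu(F.conv2d(x, w, b, stride=1, padding=1))
    return quantize(y, y_scale, y_zp)

# Example invocation with hypothetical shapes and quantization parameters.
q_x = torch.randint(0, 256, (1, 3, 8, 8), dtype=torch.uint8)
w = torch.randn(4, 3, 3, 3)
b = torch.randn(4)
out = qconv_relu_reference(q_x, 0.02, 128, w, b, 0.05, 128)
```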
pytorchmergebot pushed a commit that referenced this pull request Aug 25, 2023
…rn fusion inside inductor (#105456)

**Summary**
Enable the `dequant-conv2d-binary_postop(add)-unary_postop(relu)-quant` pattern fusion and lowering inside inductor.

**Test Plan**
```
clear && python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_binary
```

Pull Request resolved: #105456
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590, #105455
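
Similarly, a short eager-mode sketch of the `dequant-conv2d-binary_postop(add)-unary_postop(relu)-quant` semantics, where a second (dequantized) tensor is added before the ReLU. Again, this is only an illustration with made-up parameter names, not the graph Inductor actually matches:

```python
import torch
import torch.nn.functional as F

def dq(q, scale, zp):
    return (q.to(torch.float32) - zp) * scale

def q(x, scale, zp):
    return torch.clamp(torch.round(x / scale) + zp, 0, 255).to(torch.uint8)

def qconv_add_relu_reference(q_x, x_scale, x_zp, w, b,
                             q_other, o_scale, o_zp, y_scale, y_zp):
    """dequant -> conv2d -> add -> relu -> quant: the reference path that the
    binary-postop fusion collapses into one int8 conv-add-relu kernel.
    `q_other` must have the same shape as the convolution output."""
    y = F.conv2d(dq(q_x, x_scale, x_zp), w, b, padding=1)
    y = F.relu(y + dq(q_other, o_scale, o_zp))
    return q(y, y_scale, y_zp)
```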
pytorchmergebot pushed a commit that referenced this pull request Aug 26, 2023
…ol2d) (#105639)

**Summary**
In this PR, we mainly enable two things:

- Enable the skeleton of the quantization recipe for single quantizable operators in `X86InductorQuantizer`.
- Add the quantization recipe for `maxpool2d` and annotate it so that its input and output share the same observer.

**Test Plan**
```
python -m pytest test_x86inductor_quantizer.py -k test_maxpool2d_recipe
```

Pull Request resolved: #105639
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456
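
For orientation, a rough sketch of where this recipe plugs into the PT2E quantization flow from the user's side. The module paths below match the `X86InductorQuantizer` of this PR's timeframe, but treat the exact APIs (especially the graph-capture step, which has changed across releases) as assumptions rather than a fixed recipe:

```python
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
import torch.ao.quantization.quantizer.x86_inductor_quantizer as xiq

quantizer = xiq.X86InductorQuantizer()
quantizer.set_global(xiq.get_default_x86_inductor_quantization_config())

# exported = <capture the model with the PT2E export API of your PyTorch version>
# prepared = prepare_pt2e(exported, quantizer)  # inserts observers per the recipe,
#                                               # with maxpool2d sharing its input observer
# ... run calibration inputs through `prepared` ...
# converted = convert_pt2e(prepared)            # materializes quant/dequant ops
# compiled = torch.compile(converted)           # Inductor applies the fusions in this stack
```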
pytorchmergebot pushed a commit that referenced this pull request Aug 26, 2023
**Summary**
Enable matching of the `dq-maxpool2d-q` pattern and lower it into `torch.ops.quantized.max_pool2d`.

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qmaxpool2d
python -m pytest test_quantized_op.py -k test_max_pool2d_pt2e
```

Pull Request resolved: #105906
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639
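
The lowering is valid because max pooling commutes with per-tensor affine quantization when the input and output share scale and zero-point (the recipe from the previous PR), so the `dq-maxpool2d-q` reference path can run directly on the integer tensor. A small sanity-check sketch with made-up quantization parameters:

```python
import torch
import torch.nn.functional as F

scale, zp = 0.05, 128
q_x = torch.randint(0, 256, (1, 2, 8, 8), dtype=torch.uint8)

# Reference path: dequantize -> max_pool2d -> quantize (shared scale/zero-point).
x = (q_x.to(torch.float32) - zp) * scale
ref = torch.clamp(torch.round(F.max_pool2d(x, 2) / scale) + zp, 0, 255).to(torch.uint8)

# Lowered path: max_pool2d directly in the integer domain.
lowered = F.max_pool2d(q_x.to(torch.float32), 2).to(torch.uint8)

assert torch.equal(ref, lowered)  # identical because scale > 0 and max is monotonic
```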
pytorchmergebot pushed a commit that referenced this pull request Aug 26, 2023
**Summary**
After the oneDNN 3.1 upgrade, we no longer need to compute the reciprocal of the weight scale. This PR removes the redundant reciprocal calculation to optimize QConv performance, using the IDeep version API to gate the change:

- This QConv implementation is expected to work functionally with both the current IDeep version and the upcoming IDeep upgrade in PR #107565.
- With the IDeep upgrade in PR #107565, QConv has better performance since the redundant reciprocal calculation is removed.

Pull Request resolved: #105996
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639, #105906
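
Conceptually, the change is about which convention the backend expects for the weight scales: before the upgrade, PyTorch handed oneDNN/IDeep the reciprocal of each weight scale, so every QConv prepack paid for an extra elementwise division; after the upgrade the scales are passed through as-is. A tiny illustration (variable names are made up, not the actual IDeep API):

```python
import torch

weight_scales = torch.tensor([0.012, 0.034, 0.056])  # per-output-channel scales

# Old convention (pre-upgrade): the backend expected reciprocals of the scales,
# so an extra elementwise division ran for every prepacked weight.
scales_for_old_backend = 1.0 / weight_scales

# New convention (post-upgrade): the scales are passed through unchanged,
# removing the redundant reciprocal from the QConv prepack path.
scales_for_new_backend = weight_scales
```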
pytorchmergebot pushed a commit that referenced this pull request Aug 26, 2023
…ht scale reciprocal calculation (#107565)

**Summary**
Upgrade IDeep, which includes one IDeep change: intel/ideep#226.

- IDeep PR intel/ideep#226 does two things:

  - Remove the redundant QConv weight scale reciprocal calculation.
  - Bump the IDEEP_VERSION_REVISION version from 0 to 1.

  So only QConv-related calculations are impacted; we already use the IDeep version API in #105996 to make the corresponding change in PyTorch.

Pull Request resolved: #107565
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639, #105906, #105996
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
…r constant folding (#104581)

**Summary**
Enable quantized conv weight prepack inside inductor constant folding.

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_unary
```

Pull Request resolved: #104581
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580
@facebook-github-bot facebook-github-bot deleted the gh/leslie-fang-intel/52/head branch August 29, 2023 14:17
Labels
ciflow/inductor · ciflow/trunk (Trigger trunk jobs on your pull request) · Merged · module: cpu (CPU specific problem (e.g., perf, algorithm)) · module: inductor · open source · release notes: quantization (release notes category)