[Inductor] [Quant] Enable lowering of quant per tensor and refactor quant pattern #124041

Closed

leslie-fang-intel wants to merge 14 commits into gh/leslie-fang-intel/92/base from gh/leslie-fang-intel/92/head

Collaborator

leslie-fang-intel commented Apr 15, 2024 (edited)

Stack from ghstack (oldest at bottom):

Summary
Per the discussion in #123444, the decomposed quant/dequant patterns changed after #123445. To avoid depending on those changes, we can move the optimization of decomposed quant/dequant from the Inductor decomposition phase into the lowering phase. This lets us:

Avoid the pattern matcher failure introduced in #123445 (Fixed arange decomp for float dtype).
Make the quantization pattern clearer in the pattern matcher phase, since the quant/dequant nodes have not been decomposed.

Changes in this PR

Move optimization of decomposed quant/dequant from inductor decomposition into lowering phase.
Make the corresponding changes in the quantization pattern matcher so that backward compatibility is not broken.
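For readers unfamiliar with the ops being moved, the arithmetic behind per-tensor quant/dequant can be sketched in plain Python. This is only an illustration under assumptions: the function names and list-based signatures below are hypothetical, and the `clamp(round(x / scale) + zero_point)` form is the standard affine scheme, not code taken from this PR.

```python
# Hypothetical helpers illustrating affine per-tensor quantization.
# They are NOT the Inductor lowering added by this PR.

def quantize_per_tensor(values, scale, zero_point, quant_min, quant_max):
    """Map floats to integers using one scale/zero_point for the whole tensor."""
    inv_scale = 1.0 / scale
    quantized = []
    for v in values:
        # Python's round() rounds halves to the nearest even integer.
        q = round(v * inv_scale) + zero_point
        # Clamp into the representable integer range.
        quantized.append(max(quant_min, min(quant_max, q)))
    return quantized


def dequantize_per_tensor(qvalues, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in qvalues]


if __name__ == "__main__":
    q = quantize_per_tensor([0.0, 0.5, 1.0], scale=0.5, zero_point=10,
                            quant_min=0, quant_max=255)
    print(q, dequantize_per_tensor(q, scale=0.5, zero_point=10))
```

Keeping these two ops as single nodes until lowering means the pattern matcher sees one quant or dequant node rather than the multiply, round, add, and clamp sequence sketched above.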

TestPlan

python -u -m pytest -s -v test/inductor/test_mkldnn_pattern_matcher.py -k test_q

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler @amjames @desertfire @chauhang


          Enable lowering of quant per tensor and refactor quant pattern

92daa22

[ghstack-poisoned]

pytorch-bot bot commented Apr 15, 2024 (edited)

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124041

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 5f81bea with merge base fdff992:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added ciflow/inductor module: inductor labels

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

997439f

ghstack-source-id: b947d4e653552e2decf30d30888d9ac911b6f40c
Pull Request resolved: #124041

pytorchbot added the open source label


          Update on "Enable lowering of quant per tensor and refactor quant pattern"

15b811d

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

leslie-fang-intel added the ciflow/trunk label

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

710717a

ghstack-source-id: e79df3747a4f820efb45917048b06665ebeb9323
Pull Request resolved: #124041

leslie-fang-intel added the topic: not user facing label

leslie-fang-intel marked this pull request as draft

April 15, 2024 07:49


          Update on "Enable lowering of quant per tensor and refactor quant pattern"

3db2e03

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

pytorch-bot bot added module: cpu release notes: quantization labels

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

54f1a75

ghstack-source-id: 77b385d916550ec2cc83d340c4a04d8162ddd0bf
Pull Request resolved: #124041


          Update on "Enable lowering of quant per tensor and refactor quant pattern"

a93d7aa

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

a5277e6

ghstack-source-id: 8d18ac99ac988f0ef3aa8054dfc38ebbb5ca1d2e
Pull Request resolved: #124041


          Update on "Enable lowering of quant per tensor and refactor quant pattern"

6eb1a9c

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

bdddc1b

ghstack-source-id: b5cea99eca1aa3635e4070b08f27a983507f28e8
Pull Request resolved: #124041


          Update on "Enable lowering of quant per tensor and refactor quant pattern"

3611bb9

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

89ff597

ghstack-source-id: c27e0a6860cf6ce6891dd613a842ef3c59599119
Pull Request resolved: #124041


          Update on "Enable lowering of quant per tensor and refactor quant pattern"

d3c228f

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

04c0b62

ghstack-source-id: c90f2a0235edc5b2fe98b1f5295b57effc513aab
Pull Request resolved: #124041


          Update on "Enable lowering of quant per tensor and refactor quant pattern"

0f569fd

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

7881df2

ghstack-source-id: 5fea49bfba3f0e1d454c336ff8a2d4f20f7ee6b9
Pull Request resolved: #124041


          Update on "Enable lowering of quant per tensor and refactor quant pattern"

ad1b7f3

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

e34dadb

ghstack-source-id: 17bd7bb1e7a1d35194d4be4467a1c4f2b41fb009
Pull Request resolved: #124041


          Update on "Enable lowering of quant per tensor and refactor quant pattern"

00c4ebc

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request


          Enable lowering of quant per tensor and refactor quant pattern

2b1161c

ghstack-source-id: 06d74132126f6e5e02900540461cf907cb841868
Pull Request resolved: #124041

leslie-fang-intel marked this pull request as ready for review

April 16, 2024 08:55

pytorchmergebot added the merging label

Collaborator

pytorchmergebot commented May 9, 2024

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot added the Merged label

pytorchmergebot closed this in

33e6791

pytorchmergebot removed the merging label

pytorchmergebot pushed a commit that referenced this pull request


          [Inductor][Quant] Change the QConv output scale name (#124246)

9ba9f7f

**Summary**
Rename the QConv output scale from `inv_output_scale` to `output_scale`, now that the optimization of quant/dequant has moved from the decomposition phase to the lowering phase.

Pull Request resolved: #124246
Approved by: https://github.com/jgong5, https://github.com/peterbell10
ghstack dependencies: #124041

pytorchmergebot pushed a commit that referenced this pull request


          [Inductor][Quant] Fix PT2E Dynamic Quant regression (#125207)

3da949b

**Summary**
Fix two regressions caused by the previous refactor:

- Fix the issue in the dequant promotion pass with dynamic quant when the dequant node uses the `tensor` overload.
- Fix a numerical issue in dynamic quant: with the previous implementation, the input was converted to the scales' dtype (which is `double`) before the quant operation.

**TestPlan**
```
clear && python -u -m pytest -s -v test/inductor/test_mkldnn_pattern_matcher.py -k test_dynamic_qlinear_input_dim_exceeds_2
clear && python -u -m pytest -s -v test/inductor/test_mkldnn_pattern_matcher.py -k test_qlinear_dequant_promotion_dynamic_cpu
```

Pull Request resolved: #125207
Approved by: https://github.com/peterbell10, https://github.com/jgong5
ghstack dependencies: #124041, #124246

Contributor

huydhn commented May 9, 2024

@pytorchbot revert -m 'Sorry for reverting your change but I think there is a land race with the change https://hud.pytorch.org/pytorch/pytorch/commit/33e6791645b5950b0f39301f55b8a4a79c0ca847' -c landrace

Collaborator

pytorchmergebot commented May 9, 2024

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request


          Revert "[Inductor][Quant] Fix PT2E Dynamic Quant regression (#125207)"

97509c8

This reverts commit 3da949b.

Reverted #125207 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think there is a land race with the change https://hud.pytorch.org/pytorch/pytorch/commit/33e6791645b5950b0f39301f55b8a4a79c0ca847 ([comment](#124041 (comment)))

pytorchmergebot added a commit that referenced this pull request


          Revert "[Inductor][Quant] Change the QConv output scale name (#124246)"

ca579c1

This reverts commit 9ba9f7f.

Reverted #124246 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think there is a land race with the change https://hud.pytorch.org/pytorch/pytorch/commit/33e6791645b5950b0f39301f55b8a4a79c0ca847 ([comment](#124041 (comment)))

pytorchmergebot added a commit that referenced this pull request


          Revert "[Inductor] [Quant] Enable lowering of quant per tensor and refactor quant pattern (#124041)"

ea3f625

This reverts commit 33e6791.

Reverted #124041 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think there is a land race with the change https://hud.pytorch.org/pytorch/pytorch/commit/33e6791645b5950b0f39301f55b8a4a79c0ca847 ([comment](#124041 (comment)))

Collaborator

pytorchmergebot commented May 9, 2024

@leslie-fang-intel your PR has been successfully reverted.

pytorchmergebot added the Reverted label

pytorchmergebot reopened this

Collaborator Author

leslie-fang-intel commented May 9, 2024

> @pytorchbot revert -m 'Sorry for reverting your change but I think there is a land race with the change https://hud.pytorch.org/pytorch/pytorch/commit/33e6791645b5950b0f39301f55b8a4a79c0ca847' -c landrace

Hi @huydhn, has this change https://hud.pytorch.org/pytorch/pytorch/commit/33e6791645b5950b0f39301f55b8a4a79c0ca847 landed? Should I rebase this ghstack and try to land again?

Contributor

huydhn commented May 9, 2024

Yes, please do a rebase to main and try to land this again

Contributor

huydhn commented May 9, 2024

The stack probably has a land race with this change #122832, which landed a few hours ago


          Update on "[Inductor] [Quant] Enable lowering of quant per tensor and refactor quant pattern"

5f81bea


**Summary**
Per the discussion in #123444, the `decomposed quant/dequant` patterns changed after #123445. To avoid depending on those changes, we can move the optimization of `decomposed quant/dequant` from the Inductor decomposition phase into the lowering phase. This lets us:

- Avoid the pattern matcher failure introduced in #123445
- Make the quantization pattern clearer in the pattern matcher phase, since the `quant/dequant` nodes have not been decomposed.

**Changes in this PR**

- Move optimization of `decomposed quant/dequant` from inductor decomposition into lowering phase.
- Make the corresponding changes in the quantization pattern matcher so that backward compatibility is not broken.

**TestPlan**
```
python -u -m pytest -s -v test/inductor/test_mkldnn_pattern_matcher.py -k test_q
```


cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler amjames desertfire chauhang

[ghstack-poisoned]

Collaborator Author

leslie-fang-intel commented May 9, 2024

> The stack probably has a land race with this change #122832, which landed a few hours ago

Rebase to main and check the pre-CI again.

Collaborator Author

leslie-fang-intel commented May 9, 2024

@pytorchbot merge

Collaborator Author

leslie-fang-intel commented May 9, 2024

> Yes, please do a rebase to main and try to land this again

Hi @huydhn, after the rebase, the pre-CI checks are all green. I am going to re-land.

pytorchmergebot added the merging label

Collaborator

pytorchmergebot commented May 9, 2024

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot closed this in

d83ab88

pytorchmergebot removed the merging label

pytorchmergebot pushed a commit that referenced this pull request


          [Inductor][Quant] Change the QConv output scale name (#124246)

c337395

**Summary**
Rename the QConv output scale from `inv_output_scale` to `output_scale`, now that the optimization of quant/dequant has moved from the decomposition phase to the lowering phase.

Pull Request resolved: #124246
Approved by: https://github.com/jgong5, https://github.com/peterbell10
ghstack dependencies: #124041

pytorchmergebot pushed a commit that referenced this pull request


          [Inductor][Quant] Fix PT2E Dynamic Quant regression (#125207)

b958810

**Summary**
Fix two regressions caused by the previous refactor:

- Fix the issue in the dequant promotion pass with dynamic quant when the dequant node uses the `tensor` overload.
- Fix a numerical issue in dynamic quant: with the previous implementation, the input was converted to the scales' dtype (which is `double`) before the quant operation.
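
To see why the dtype used for the quant arithmetic can matter, the sketch below compares a float64 computation against an emulated float32 one. It is an illustration under assumptions, not the code from this fix: the helper names are hypothetical, float32 is emulated with `struct`, and which dtype the fixed lowering actually uses is not stated here.

```python
import struct


def to_f32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def quant_in_f64(x, scale, zero_point, qmin, qmax):
    """Quantize with all intermediate math kept in float64."""
    q = round(x * (1.0 / scale)) + zero_point
    return max(qmin, min(qmax, q))


def quant_in_f32(x, scale, zero_point, qmin, qmax):
    """Quantize with each intermediate result rounded to float32."""
    inv_scale = to_f32(1.0 / scale)
    q = round(to_f32(x * inv_scale)) + zero_point
    return max(qmin, min(qmax, q))


def count_mismatches(values, scale, zero_point=0, qmin=-128, qmax=127):
    """Count inputs whose quantized value differs between the two paths."""
    scale = to_f32(scale)  # treat the scale as a float32 quantity
    return sum(
        quant_in_f64(to_f32(v), scale, zero_point, qmin, qmax)
        != quant_in_f32(to_f32(v), scale, zero_point, qmin, qmax)
        for v in values
    )


if __name__ == "__main__":
    inputs = [i * 0.01 for i in range(-500, 500)]
    print(count_mismatches(inputs, scale=0.1))
```

Any mismatch comes from inputs whose scaled value sits close to a rounding boundary, where the last bits of the intermediate product decide which integer is produced.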

**TestPlan**
```
clear && python -u -m pytest -s -v test/inductor/test_mkldnn_pattern_matcher.py -k test_dynamic_qlinear_input_dim_exceeds_2
clear && python -u -m pytest -s -v test/inductor/test_mkldnn_pattern_matcher.py -k test_qlinear_dequant_promotion_dynamic_cpu
```

Pull Request resolved: #125207
Approved by: https://github.com/peterbell10, https://github.com/jgong5
ghstack dependencies: #124041, #124246


Reviewers

jgong5 approved these changes

peterbell10 approved these changes

jerryzh168: awaiting requested review (code owner)

salilsdesai: awaiting requested review (code owner)

kimishpatel: awaiting requested review (code owner)

digantdesai: awaiting requested review (code owner)

jianyuh: awaiting requested review (code owner)

Xia-Weiwen: awaiting requested review