[Quant][PT2E] Enable weight scale optimization in QConv PT2E #105996

leslie-fang-intel · 2023-07-26T02:20:35Z

Stack from ghstack (oldest at bottom):

Summary
After the oneDNN 3.1 upgrade, the weight scale reciprocal calculation is no longer needed. This PR removes the redundant reciprocal calculation to optimize QConv performance, and uses the IDeep version API to do so:

- This QConv implementation is expected to work correctly with both the current IDeep version and the upcoming IDeep upgrade in PR: [submodule][Quant][PT2E] Upgrade IDeep to remove redundant QConv weight scale reciprocal calculation #107565.
- With the IDeep upgrade in #107565, QConv performs better because the redundant reciprocal calculation is removed.
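
The version-gated behavior described above can be illustrated with a minimal Python sketch. It is not the code from this PR (the actual change lives in the QConv implementation); `prepare_weight_scales`, the `Version` tuple layout, and the threshold constant are hypothetical names introduced only for illustration. The major/minor/patch numbers are placeholders; only the revision bump from 0 to 1 is taken from the description of #107565. The sketch assumes that an older IDeep inverts the scales it receives, so the caller must pre-invert them, while a newer IDeep takes the scales as they are.

```python
from typing import List, Tuple

# Hypothetical version layout: (major, minor, patch, revision).
Version = Tuple[int, int, int, int]

# Assumption: first version that no longer expects pre-inverted weight scales.
# Only the revision value (0 -> 1) comes from the PR text; the rest is a placeholder.
FIRST_VERSION_WITHOUT_RECIPROCAL: Version = (3, 1, 0, 1)


def prepare_weight_scales(scales: List[float], backend_version: Version) -> List[float]:
    """Return the weight scales in the form the backend is assumed to expect."""
    if backend_version >= FIRST_VERSION_WITHOUT_RECIPROCAL:
        # Newer backend: pass the scales through unchanged, skipping the division.
        return list(scales)
    # Older backend: assumed to invert the scales again internally,
    # so the caller pre-inverts them to end up with the original values.
    return [1.0 / s for s in scales]


if __name__ == "__main__":
    scales = [0.5, 0.25]
    print(prepare_weight_scales(scales, (3, 1, 0, 0)))  # old revision: inverted
    print(prepare_weight_scales(scales, (3, 1, 0, 1)))  # new revision: unchanged
```

Gating on the version rather than removing the reciprocal unconditionally is what lets the same caller code stay correct before and after the submodule upgrade lands.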

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @gujinghui @PenghuiCheng @jianyuh @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen

pytorch-bot · 2023-07-26T02:20:38Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/105996

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 430294f with merge base 97a291f:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


…ndant QConv weight scale reciprocal calculation"

**Summary**
Upgrade IDeep to include two IDeep changes, intel/ideep#222 and intel/ideep#223:
- intel/ideep#222 does two things:
  - Removes the redundant QConv weight scale reciprocal calculation.
  - Bumps IDEEP_VERSION_REVISION from 0 to 1.

  So only QConv-related calculation is impacted, and #105996 already uses the IDeep version API to make the corresponding change in PyTorch.
- intel/ideep#223 includes AArch64-specific changes with the oneDNN 3.1.1 upgrade.

cc gujinghui PenghuiCheng XiaobingSuper jianyuh jgong5 mingfeima sanchitintel ashokei jingxu10 min-jean-cho yanbing-j Guobing-Chen Xia-Weiwen [ghstack-poisoned]


…ndant QConv weight scale reciprocal calculation"

**Summary**
Upgrade IDeep to include one IDeep change, intel/ideep#226, which does two things:
- Removes the redundant QConv weight scale reciprocal calculation.
- Bumps IDEEP_VERSION_REVISION from 0 to 1.

So only QConv-related calculation is impacted, and #105996 already uses the IDeep version API to make the corresponding change in PyTorch.

cc gujinghui PenghuiCheng XiaobingSuper jianyuh jgong5 mingfeima sanchitintel ashokei jingxu10 min-jean-cho yanbing-j Guobing-Chen Xia-Weiwen [ghstack-poisoned]


leslie-fang-intel · 2023-08-26T08:37:10Z

@pytorchbot merge

pytorchmergebot · 2023-08-26T08:39:02Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

…ht scale reciprocal calculation (#107565)

**Summary**
Upgrade IDeep to include one IDeep change, intel/ideep#226, which does two things:
- Removes the redundant QConv weight scale reciprocal calculation.
- Bumps IDEEP_VERSION_REVISION from 0 to 1.

So only QConv-related calculation is impacted, and #105996 already uses the IDeep version API to make the corresponding change in PyTorch.

Pull Request resolved: #107565
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639, #105906, #105996

**Summary**
After the oneDNN 3.1 upgrade, the weight scale reciprocal calculation is no longer needed. This PR removes the redundant reciprocal calculation to optimize QConv performance, and uses the IDeep version API to do so:
- This QConv implementation is expected to work correctly with both the current IDeep version and the upcoming IDeep upgrade in PR: #107565.
- With the IDeep upgrade in #107565, QConv performs better because the redundant reciprocal calculation is removed.

Pull Request resolved: #105996
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639, #105906


leslie-fang-intel requested review from jerryzh168, salilsdesai, kimishpatel, digantdesai and jianyuh as code owners July 26, 2023 02:20

pytorch-bot bot added the release notes: quantization release notes category label Jul 26, 2023

leslie-fang-intel marked this pull request as draft July 26, 2023 02:21

leslie-fang-intel added a commit that referenced this pull request Jul 26, 2023

enabel weight scale optimization

fda12be

ghstack-source-id: 05dcbc7472ef4573dde8c06e4bd8d6b0c9ee76f7 Pull Request resolved: #105996

leslie-fang-intel changed the title ~~enabel weight scale optimization~~ [Test Only] Enable weight scale optimization in QConv PT2E Jul 26, 2023

leslie-fang-intel added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 26, 2023

enabel weight scale optimization

463c793

[ghstack-poisoned]

leslie-fang-intel added a commit that referenced this pull request Jul 26, 2023

enabel weight scale optimization

7a3b3c5

ghstack-source-id: 7455c3a56d6ef5cd25a5a03df06fd65467dbf689 Pull Request resolved: #105996

github-actions bot added module: cpu CPU specific problem (e.g., perf, algorithm) module: mkldnn Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration labels Jul 26, 2023

Update on "[Test Only] Enable weight scale optimization in QConv PT2E"

cb17d71

[ghstack-poisoned]

pytorchbot added the open source label Jul 26, 2023

leslie-fang-intel added a commit that referenced this pull request Jul 26, 2023

enabel weight scale optimization

94be08e

ghstack-source-id: b60eea67f24668eb4269380fb1254e9d013f24e7 Pull Request resolved: #105996

Update on "[Test Only] Enable weight scale optimization in QConv PT2E"

4d48a6c

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 gujinghui PenghuiCheng jianyuh min-jean-cho yanbing-j Guobing-Chen Xia-Weiwen [ghstack-poisoned]

leslie-fang-intel mentioned this pull request Jul 28, 2023

[quant][pt2e] store scale/zero_point as tensor attributes to support serialization #105894

Closed

jerryzh168 approved these changes Aug 23, 2023


leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Aug 25, 2023

enabel weight scale optimization

ccb4c5d

ghstack-source-id: 2e5ddcd8b1e64d459b1934ff16469e6cb8feb8a5 Pull Request resolved: pytorch#105996

leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Aug 25, 2023

enabel weight scale optimization

6719083

ghstack-source-id: 6888c85b439a646fa3994a643162da995b82bb67 Pull Request resolved: pytorch#105996

leslie-fang-intel mentioned this pull request Aug 25, 2023

[Quant][PT2E]Make _fuse_conv_bn_ support graph capture by torch._dynamo.export #107951

Closed

pytorchmergebot added the merging label Aug 26, 2023

pytorchmergebot added Merged and removed merging labels Aug 26, 2023

pytorchmergebot closed this in 780a5a0 Aug 26, 2023

facebook-github-bot deleted the gh/leslie-fang-intel/63/head branch August 29, 2023 14:17
