[quant][pt2e] store scale/zero_point as tensor attributes to support serialization #105894
Conversation
…serialization

Summary: Currently, scale/zero_point for per-tensor quantization is stored as burned-in literals, which means these values can't be serialized in state_dict. This PR changes them to buffers/Tensors so that they can be serialized.

Test Plan: python test/test_quantization.py TestQuantizePT2E
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/105894
Note: links to docs will display an error until the docs builds have completed. ✅ 2 Unrelated Failures: as of commit ac16b45, the following jobs failed, but they were likely due to flakiness present on trunk and have been marked as unstable.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
…serialization (ghstack-source-id: 5522977; Pull Request resolved: #105894)
@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
…to support serialization" (Differential Revision: [D47770963](https://our.internmc.facebook.com/intern/diff/D47770963))
…serialization (ghstack-source-id: e281fa3; Pull Request resolved: #105894)
@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot revert -m "breaking executorch tests internally" -c ghfirst
@pytorchbot successfully started a revert job. Check the current status here.
Can't revert PR that was landed via phabricator as D47770963. Please revert by going to the internal diff and clicking Unland.
@pytorchbot revert -m "breaking executorch tests internally" -c ghfirst
@pytorchbot successfully started a revert job. Check the current status here.
@jerryzh168 your PR has been successfully reverted.
…support serialization (#105894)" This reverts commit 3ca71ed. Reverted #105894 on behalf of https://github.com/huydhn due to breaking executorch tests internally.
Hi @jerryzh168, I think this PR has broken the quantization inductor flow in this ghstack: #105996. Can we have more discussion about the solution before re-landing this PR? cc @jgong5 @Guobing-Chen
…serialization (pytorch#105894): Differential Revision: [D47770963](https://our.internmc.facebook.com/intern/diff/D47770963). Pull Request resolved: pytorch#105894. Approved by: https://github.com/kimishpatel
Sure. This changes quantize_per_tensor.default to quantize_per_tensor.tensor, and that's pretty much it. Is there anything else that's broken?
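In other words, after this PR the exported graph passes scale/zero_point as tensor arguments instead of baking them in as literals. The sketch below contrasts the two overloads; the registering import and the exact op signatures are assumptions based on the decomposed quantization ops (torch/ao/quantization/fx/_decomposed.py) around the time of this PR, so treat it as illustrative rather than authoritative.

```python
import torch

# Importing this module registers the quantized_decomposed ops; the exact path
# may differ across PyTorch versions (assumption for this sketch).
import torch.ao.quantization.fx._decomposed  # noqa: F401

x = torch.randn(2, 4)

# .default overload: scale/zero_point are Python literals burned into the graph,
# so they never show up in a state_dict.
q_literal = torch.ops.quantized_decomposed.quantize_per_tensor(
    x, 0.1, 0, -128, 127, torch.int8
)

# .tensor overload: scale/zero_point are passed as Tensors, which lets them be
# stored as module buffers and serialized with the rest of the state.
scale = torch.tensor(0.1)
zero_point = torch.tensor(0, dtype=torch.int64)
q_buffered = torch.ops.quantized_decomposed.quantize_per_tensor.tensor(
    x, scale, zero_point, -128, 127, torch.int8
)
```

Anything that pattern-matches on quantize_per_tensor.default (such as the Inductor QConv lowering discussed below) needs to be updated to match the .tensor overload instead.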
**Summary** Draft of the fix for QConv lowering in Inductor after PR #105894. cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225 chenyang78 kadeng muchulee8 aakhundov
**Summary** Cherry-pick #105894 for further testing.
…support serialization (pytorch#105894). Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/3ca71ed735257cb7ad377b57a45057c265893a40; also python test/test_quantization.py TestQuantizePT2E. Differential Revision: D47933210. fbshipit-source-id: 6c0993e58c5f8fd95c4d57b0fdb51e14a3573989
Stack from ghstack (oldest at bottom):
Summary:
Currently, scale/zero_point for per-tensor quantization is stored as burned-in literals, which means these values can't be serialized in state_dict. This PR changes them to buffers/Tensors so that they can be serialized.
Test Plan:
python test/test_quantization.py TestQuantizePT2E
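To illustrate the serialization point, here is a minimal, hypothetical sketch (not code from this PR) showing why values registered as buffers end up in state_dict while literals burned into forward do not:

```python
import torch
import torch.nn as nn

class FakeQuant(nn.Module):
    """Hypothetical module: per-tensor quant params stored as buffers."""

    def __init__(self, scale: float, zero_point: int):
        super().__init__()
        # Buffers are saved/loaded with state_dict, unlike Python literals.
        self.register_buffer("scale", torch.tensor(scale))
        self.register_buffer("zero_point", torch.tensor(zero_point, dtype=torch.int64))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Quantize/dequantize round trip using the stored parameters.
        q = torch.clamp(torch.round(x / self.scale) + self.zero_point, -128, 127)
        return (q - self.zero_point) * self.scale

m = FakeQuant(scale=0.1, zero_point=0)
print(list(m.state_dict().keys()))   # ['scale', 'zero_point']
torch.save(m.state_dict(), "fq.pt")  # both values survive serialization
```

A scale written into forward as a bare float literal would never appear in that state_dict, which is the gap this PR closes for pt2e-quantized graphs.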