Add quantized.linear_unpacked_dynamic_fp16 #122762
Conversation
Summary: We add a new op, quantized.linear_unpacked_dynamic_fp16, which is essentially linear_dynamic_fp16 with a different (unpacked) weight/bias format. The op packs the weight on the fly for each call, taking a standard at::Tensor weight & bias. Test Plan: Included in commit. test_quantized_op::test_unpacked_qlinear_dynamic_fp16 [ghstack-poisoned]
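For illustration, a minimal sketch of how the new op relates to the existing packed flow, assuming a build with FBGEMM and the (input, weight, bias) signature implied by the summary; this is not taken from the PR's code:

```python
import torch

x = torch.randn(2, 4)
weight = torch.randn(8, 4)
bias = torch.randn(8)

# Existing flow: pack the fp16 weight once, then call the packed dynamic op.
packed = torch.ops.quantized.linear_prepack_fp16(weight, bias)
y_packed = torch.ops.quantized.linear_dynamic_fp16(x, packed)

# New op (assumed signature): takes plain at::Tensor weight & bias and
# repacks them inside every call, trading some speed for a simpler interface.
y_unpacked = torch.ops.quantized.linear_unpacked_dynamic_fp16(x, weight, bias)

torch.testing.assert_close(y_packed, y_unpacked)
```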
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/122762
Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit c4c3361 with merge base f2c1060. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@muchulee8 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@pytorchbot merge -f 'Landed internally' (Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Summary: We add wrappers for fbgemm's packing so we can pass it through PT2 to the lowering phase of AOTInductor. Test Plan: Included in commit. test_quantized_ops::test_wrapped_fbgemm_linear_fp16 Differential Revision: [D55433204](https://our.internmc.facebook.com/intern/diff/D55433204) Pull Request resolved: #122763 Approved by: https://github.com/jerryzh168 ghstack dependencies: #122762
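As a rough illustration of that follow-up, a sketch of how such wrappers might be called; the op names and signatures below (wrapped_fbgemm_pack_gemm_matrix_fp16, wrapped_fbgemm_linear_fp16_weight) are assumptions inferred from the commit summary, not confirmed by this PR:

```python
import torch

x = torch.randn(2, 16)
weight = torch.randn(32, 16)
bias = torch.randn(32)

# Hypothetical wrapper ops (names/signatures assumed): pack the weight into
# fbgemm's fp16 GEMM layout as a plain tensor so the packed value can flow
# through a PT2 graph, then run the dynamic fp16 linear against it.
packed = torch.ops._quantized.wrapped_fbgemm_pack_gemm_matrix_fp16(weight)
y = torch.ops._quantized.wrapped_fbgemm_linear_fp16_weight(x, packed, bias, weight.shape[0])
```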
Summary: We add a new op, quantized.linear_unpacked_dynamic_fp16, which is essentially linear_dynamic_fp16 with a different (unpacked) weight/bias format. The op packs the weight on the fly for each call, taking a standard at::Tensor weight & bias. Test Plan: Included in commit. test_quantized_op::test_unpacked_qlinear_dynamic_fp16 Differential Revision: [D55433203](https://our.internmc.facebook.com/intern/diff/D55433203) Pull Request resolved: pytorch#122762 Approved by: https://github.com/jerryzh168
Stack from ghstack (oldest at bottom):
Summary:
We add a new op, quantized.linear_unpacked_dynamic_fp16, which is essentially linear_dynamic_fp16 with a different (unpacked) weight/bias format.
The op packs the weight on the fly for each call, taking a standard at::Tensor weight & bias.
Test Plan:
Included in commit.
test_quantized_op::test_unpacked_qlinear_dynamic_fp16
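Roughly, the kind of check that test performs (a sketch under the assumed op signature, not the actual test code): compare the new op against a float reference whose weight has been rounded through fp16.

```python
import torch
import torch.nn.functional as F

x = torch.randn(3, 16)
weight = torch.randn(32, 16)
bias = torch.randn(32)

# Op under test (assumed signature): packs weight/bias inside the call.
y = torch.ops.quantized.linear_unpacked_dynamic_fp16(x, weight, bias)

# Reference: emulate fp16 weight storage with a float16 round trip.
w_ref = weight.to(torch.float16).to(torch.float32)
y_ref = F.linear(x, w_ref, bias)

torch.testing.assert_close(y, y_ref, atol=1e-2, rtol=1e-2)
```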
cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10
Differential Revision: D55433203