Conversation

@hl475 (Contributor) commented Aug 22, 2024

Summary:
This diff adds two new operators, torch.ops._quantized.wrapped_linear_prepack and torch.ops._quantized.wrapped_quantized_linear_prepacked. Together they form a decomposition of the op torch.ops._quantized.wrapped_quantized_linear added in the previous diff.

We decompose the op this way so that the packed weight can be computed ahead of time, rather than being recomputed in every forward pass in AOTI.
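
To make the intended usage concrete, here is a rough, hedged sketch of how the two ops fit together (argument order follows the operator schemas quoted in the review comments further down; the concrete values, dtypes, and shapes are invented and may not match what the underlying fbgemm-based kernels actually expect):

```python
import torch

# Illustrative values only: scales, zero points, dtypes, and shapes are made up.
in_features, out_features = 64, 128
weight = torch.randn(out_features, in_features)
weight_scale = torch.tensor([0.02])
weight_zero_point = torch.tensor([0])
bias = torch.randn(out_features)

# Step 1: prepack once, e.g. at model initialization / AOTI compile time,
# so the packing cost is not paid on every forward.
packed_weight = torch.ops._quantized.wrapped_linear_prepack(
    weight, weight_scale, weight_zero_point, bias
)

# Step 2: on each forward, run the linear against the precomputed packed weight.
x = torch.randn(8, in_features)
out = torch.ops._quantized.wrapped_quantized_linear_prepacked(
    x,
    torch.tensor([0.1]),   # input_scale
    torch.tensor([0]),     # input_zero_point
    packed_weight,
    torch.tensor([0.05]),  # output_scale
    torch.tensor([0]),     # output_zero_point
    out_features,          # out_channel
)
```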

Reviewed By: jerryzh168

Differential Revision: D61395887

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10


pytorch-bot bot commented Aug 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/134232

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 3c8b8d1 with merge base fee677e:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot bot added the module: cpu and release notes: quantization labels on Aug 22, 2024
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61395887


Attention! native_functions.yaml was changed

If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs, one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.


Caused by:

@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61395887

@hl475 force-pushed the export-D61395887 branch from 4a8c269 to 2e8edd2 on August 22, 2024 17:20
…cked (pytorch#134232)

Summary:
Pull Request resolved: pytorch#134232

This diff adds two new operators, torch.ops._quantized.wrapped_linear_prepack and torch.ops._quantized.wrapped_quantized_linear_prepacked. Together they form a decomposition of the op torch.ops._quantized.wrapped_quantized_linear added in the previous diff.

We decompose the op this way so that the packed weight can be computed ahead of time, rather than being recomputed in every forward pass in AOTI.

Reviewed By: jerryzh168

Differential Revision: D61395887
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61395887

@hl475 force-pushed the export-D61395887 branch from 2e8edd2 to 3c8b8d1 on August 22, 2024 18:20
@houseroad (Member) left a comment


Looks good. Thanks!

@pytorch-bot bot added the ciflow/trunk label on Aug 22, 2024
@hl475 (Contributor, Author) commented Aug 22, 2024

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator)

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@hl475 (Contributor, Author) commented Aug 23, 2024

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

Comment on lines +3401 to +3403
- func: wrapped_linear_prepack(Tensor weight, Tensor weight_scale, Tensor weight_zero_point, Tensor bias) -> Tensor

- func: wrapped_quantized_linear_prepacked(Tensor input, Tensor input_scale, Tensor input_zero_point, Tensor packed_weight, Tensor output_scale, Tensor output_zero_point, int out_channel) -> Tensor

You don't need these -- these generate aten::wrapped_linear_prepack ops. It looks like you're only defining _quantized::wrapped_linear_prepack
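
To make that concern concrete, here is a hedged probe (assuming a build that includes this PR) of which namespaces end up exposing the op; per the comment above, only the _quantized:: variant is meant to exist, while the native_functions.yaml entries would additionally generate an aten:: overload and a top-level torch.* binding (which the next comment picks up on):

```python
import torch

# Probe both namespaces; only the _quantized:: registration is intended by this diff.
for ns_name in ("_quantized", "aten"):
    ns = getattr(torch.ops, ns_name)
    print(ns_name, "has wrapped_linear_prepack:",
          hasattr(ns, "wrapped_linear_prepack"))

# The native_functions.yaml entries would also surface the symbol at the top level:
print("torch.wrapped_linear_prepack exists:",
      hasattr(torch, "wrapped_linear_prepack"))
```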

@zou3519 (Contributor) commented Sep 6, 2024

@albanD, why didn't the public API tests trigger on this? It looks like we added a new torch.wrapped_linear_prepack with no docstring

Comment on lines +468 to +469
auto ret = cpp_custom_type_hack::create(
std::move(unique_ptr_wrapper), weight.options());

This is a really bad idea. Are you sure you want this? @huayuli00 @houseroad

pytorch-bot bot pushed a commit that referenced this pull request Sep 6, 2024
…to private by adding _ as prefix

Summary: In #134232, we added two new ops wrapped_linear_prepack and wrapped_quantized_linear_prepacked. From the review comments and offline discussion, we are changing them to private by adding `_` as prefix

Differential Revision: D62325142
pytorch-bot bot pushed a commit that referenced this pull request Sep 7, 2024
…to private by adding _ as prefix (#135401)

Summary:
Pull Request resolved: #135401

In #134232, we added two new ops wrapped_linear_prepack and wrapped_quantized_linear_prepacked. From the review comments and offline discussion, we are changing them to private by adding `_` as prefix

Differential Revision: D62325142
hl475 added a commit to hl475/pytorch that referenced this pull request Sep 7, 2024
…to private by adding _ as prefix (pytorch#135401)

Summary:
Pull Request resolved: pytorch#135401

In pytorch#134232, we added two new ops wrapped_linear_prepack and wrapped_quantized_linear_prepacked. From the review comments and offline discussion, we are changing them to private by adding `_` as prefix

Reviewed By: houseroad

Differential Revision: D62325142
pytorchmergebot pushed a commit that referenced this pull request Sep 8, 2024
…to private by adding _ as prefix (#135401)

Summary: In #134232, we added two new ops wrapped_linear_prepack and wrapped_quantized_linear_prepacked. From the review comments and offline discussion, we are changing them to private by adding `_` as prefix

Differential Revision: D62325142

Pull Request resolved: #135401
Approved by: https://github.com/houseroad
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Sep 20, 2024
…to private by adding _ as prefix (pytorch#135401)

Summary: In pytorch#134232, we added two new ops wrapped_linear_prepack and wrapped_quantized_linear_prepacked. From the review comments and offline discussion, we are changing them to private by adding `_` as prefix

Differential Revision: D62325142

Pull Request resolved: pytorch#135401
Approved by: https://github.com/houseroad
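
For call sites, a hedged sketch of what the rename described in the commits above implies, assuming the schemas stay the same and only the names gain a leading underscore:

```python
import torch

def prepack_weight(weight, weight_scale, weight_zero_point, bias):
    # Spelling introduced by this PR (#134232):
    #   torch.ops._quantized.wrapped_linear_prepack(...)
    # Assumed spelling after #135401, with the private "_" prefix:
    return torch.ops._quantized._wrapped_linear_prepack(
        weight, weight_scale, weight_zero_point, bias
    )
```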
Labels: ciflow/trunk, fb-exported, Merged, module: cpu, release notes: quantization
