Add new ops wrapped_linear_prepack and wrapped_quantized_linear_prepacked #134232
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/134232
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 Unrelated Failures) As of commit 3c8b8d1 with merge base fee677e:
FLAKY - The following job failed but was likely due to flakiness present on trunk.
BROKEN TRUNK - The following job failed but was present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D61395887
Attention! native_functions.yaml was changed
If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs: one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.
4a8c269 to 2e8edd2
…cked (pytorch#134232)

Summary:
Pull Request resolved: pytorch#134232

This diff adds two new operators, torch.ops._quantized.wrapped_linear_prepack and torch.ops._quantized.wrapped_quantized_linear_prepacked. Together they form a decomposition of the op torch.ops._quantized.wrapped_quantized_linear added in the previous diff. We decomposed it this way because the packed weight can be computed ahead of time, so we don't need to recompute it in every forward pass in AOTI.

Reviewed By: jerryzh168

Differential Revision: D61395887
2e8edd2 to 3c8b8d1
Looks good. Thanks!
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
- func: wrapped_linear_prepack(Tensor weight, Tensor weight_scale, Tensor weight_zero_point, Tensor bias) -> Tensor
- func: wrapped_quantized_linear_prepacked(Tensor input, Tensor input_scale, Tensor input_zero_point, Tensor packed_weight, Tensor output_scale, Tensor output_zero_point, int out_channel) -> Tensor
You don't need these -- these generate aten::wrapped_linear_prepack ops. It looks like you're only defining _quantized::wrapped_linear_prepack
@albanD, why didn't the public API tests trigger on this? It looks like we added a new torch.wrapped_linear_prepack with no docstring.
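For context, a minimal sketch of how the two schemas above would be invoked from Python through the _quantized namespace (matching the comment that only _quantized::wrapped_linear_prepack is defined). The shapes, dtypes, and scale/zero-point values below are illustrative assumptions, a build with the corresponding quantized backend (e.g. fbgemm) is assumed, and note that per #135401 the ops were later made private by prefixing their names with an underscore, so the names may differ on newer builds:

```python
import torch

# Assumed shapes and dtypes; the real dtype requirements come from the C++ kernels,
# so treat these as placeholders rather than a verified recipe.
out_channels, in_channels = 8, 16
weight = torch.randint(-128, 127, (out_channels, in_channels), dtype=torch.int8)
weight_scale = torch.tensor(0.02)
weight_zero_point = torch.tensor(0, dtype=torch.int32)
bias = torch.zeros(out_channels)

# Prepack once; the packed weight can be cached and reused.
packed_weight = torch.ops._quantized.wrapped_linear_prepack(
    weight, weight_scale, weight_zero_point, bias)

x = torch.randn(4, in_channels)
input_scale = torch.tensor(0.05)
input_zero_point = torch.tensor(0, dtype=torch.int32)
output_scale = torch.tensor(0.1)
output_zero_point = torch.tensor(0, dtype=torch.int32)

# Per-forward call that consumes the prepacked weight.
out = torch.ops._quantized.wrapped_quantized_linear_prepacked(
    x, input_scale, input_zero_point, packed_weight,
    output_scale, output_zero_point, out_channels)
```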
// cpp_custom_type_hack stashes the C++ packed-weight object (held by a
// unique_ptr) inside an at::Tensor so it can be returned through the op's
// Tensor-only schema.
auto ret = cpp_custom_type_hack::create(
    std::move(unique_ptr_wrapper), weight.options());
This is a really bad idea. Are you sure you want this? @huayuli00 @houseroad
…to private by adding _ as prefix (#135401)

Summary:
Pull Request resolved: #135401

In #134232, we added two new ops wrapped_linear_prepack and wrapped_quantized_linear_prepacked. From the review comments and offline discussion, we are changing them to private by adding `_` as a prefix.

Reviewed By: houseroad

Differential Revision: D62325142

Approved by: https://github.com/houseroad
Summary:
This diff adds two new operators, torch.ops._quantized.wrapped_linear_prepack and torch.ops._quantized.wrapped_quantized_linear_prepacked. Together they form a decomposition of the op torch.ops._quantized.wrapped_quantized_linear added in the previous diff.
We decomposed it this way because the packed weight can be computed ahead of time, so we don't need to recompute it in every forward pass in AOTI (see the usage sketch below).
Reviewed By: jerryzh168
Differential Revision: D61395887
cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10
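To illustrate that motivation, a hedged sketch of the intended usage pattern under the schemas in this PR: the prepack op runs once (for example at model construction or AOTI compile time) and only the prepacked linear op runs per forward. The module name, argument handling, and dtypes are hypothetical; this is not the actual AOTI integration.

```python
import torch
from torch import nn

class WrappedQuantizedLinear(nn.Module):
    """Hypothetical wrapper: prepack once in __init__, reuse the packed weight
    in every forward; this is the split the decomposition enables."""

    def __init__(self, weight, weight_scale, weight_zero_point, bias,
                 output_scale, output_zero_point):
        super().__init__()
        # One-time packing step, moved out of the per-forward path.
        self.packed_weight = torch.ops._quantized.wrapped_linear_prepack(
            weight, weight_scale, weight_zero_point, bias)
        self.out_channels = weight.shape[0]
        self.output_scale = output_scale
        self.output_zero_point = output_zero_point

    def forward(self, x, input_scale, input_zero_point):
        # Per-forward call only consumes the already-packed weight.
        return torch.ops._quantized.wrapped_quantized_linear_prepacked(
            x, input_scale, input_zero_point, self.packed_weight,
            self.output_scale, self.output_zero_point, self.out_channels)
```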