[AOTI] add C shim for QConvPointWise #138540
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138540
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 8177bed with merge base 07b0d63.
BROKEN TRUNK - The following job failed but was also present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Some of the APIs definitely felt clunky when I worked on adding ABI compatibility support for them. Thanks for making them better.
// Conv2D with binary postop
m.def(TORCH_SELECTIVE_SCHEMA("onednn::qconv2d_pointwise.binary(Tensor qx, float x_scale, int x_zero_point, Tensor qaccum, float accum_scale, int accum_zero_point, Tensor qw, Tensor w_scale, Tensor w_zero_point, Tensor? bias, int[] stride, int[] padding, int[] dilation, int groups, float output_scale, int output_zero_point, ScalarType? output_dtype, str binary_attr, Scalar? alpha, str? unary_attr, Scalar?[] unary_scalars, str? unary_algorithm) -> Tensor"));
m.def(TORCH_SELECTIVE_SCHEMA("onednn::qconv2d_pointwise.binary(Tensor qx, float x_scale, int x_zero_point, Tensor qw, Tensor w_scale, Tensor w_zero_point, Tensor qaccum, Tensor? bias, int[] stride, int[] padding, int[] dilation, int groups, float output_scale, int output_zero_point, ScalarType? output_dtype, float accum_scale, int accum_zero_point, str binary_attr, Scalar? alpha, str? unary_attr, Scalar?[] unary_scalars, str? unary_algorithm) -> Tensor"));
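// Note: per the PR description, the first schema above is the previous one and the
// second is the updated one: qaccum now follows the weight arguments (qw, w_scale,
// w_zero_point), and accum_scale / accum_zero_point now follow output_dtype,
// mirroring the parameter order of qlinear_pointwise.binary.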
Is it safe to directly change this one?
I guess my question is, what is the BC policy for onednn ops?
Another option is to add something like a v2 version of this op to avoid the BC break.
Regarding `qconv2d_pointwise`: since the CPU quantization support in Inductor is still a prototype feature, I guess it should be fine to make an API change since it's not yet stable. May I know if the current way looks fine to you, or would you prefer adding a v2 version?
Since this is an onednn-specific op, I am OK with your decision.
ghstack-source-id: ae23a41
Pull Request resolved: pytorch#138540
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
This PR adds a C shim for `QConvPointWisePT2E` and `QConvPointWiseBinaryPT2E`, similar to pytorch#138439. Besides that, we aligned the implementation of `qconv_pointwise` with `qlinear_pointwise` in the following aspects:
1. The parameter orders of `qconv_pointwise` and `qlinear_pointwise` were quite different, so we aligned the schema of `qconv_pointwise` to use a parameter order similar to `qlinear_pointwise`, making the two more consistent.
2. We now always convert `x_scale` and `x_zero_point` to Tensors, just like in the lowering of `qlinear_pointwise`. This avoids the need for two separate C APIs (one for `double x_scale` and `int64_t x_zero_point`, and another for `Tensor` versions); instead, a single API taking `Tensor`-based `x_scale` and `x_zero_point` suffices. If we later add dynamic quantization for qconv (which will use `Tensor` for `x_scale` and `x_zero_point`), we can reuse the code from this PR without changing the C shim layer API.

Pull Request resolved: pytorch#138540
Approved by: https://github.com/jgong5, https://github.com/desertfire
ghstack dependencies: pytorch#138691, pytorch#138806
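To illustrate the second point above, here is a minimal sketch of the scalar-to-Tensor idea; the variable names and dtypes are placeholders chosen for illustration and are not taken from this PR:

```python
import torch

# Illustrative only: wrap statically known scalar quantization parameters into
# 0-dim tensors, so the C shim only ever sees Tensor-typed x_scale / x_zero_point.
x_scale = 0.05        # hypothetical static scale
x_zero_point = 128    # hypothetical static zero point

x_scale_t = torch.tensor(x_scale, dtype=torch.float32)
x_zero_point_t = torch.tensor(x_zero_point, dtype=torch.int64)

# With dynamic quantization, x_scale_t / x_zero_point_t would instead be computed
# at runtime from the activation, and the same Tensor-based shim signature would
# still apply, with no change to the C ABI.
print(x_scale_t.item(), x_zero_point_t.item())
```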
Stack from ghstack (oldest at bottom):
`len(serialized_weights)` when calculating `consts_size` #139054

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov