
Conversation


@chunyuan-w chunyuan-w commented Oct 22, 2024

Stack from ghstack (oldest at bottom):

This PR adds a C shim for QConvPointWisePT2E and QConvPointWiseBinaryPT2E, similar to #138439. Besides that, we aligned the implementation of qconv_pointwise with qlinear_pointwise in the following aspects:

  1. The parameter orders of qconv_pointwise and qlinear_pointwise were quite different, so we aligned the schema of qconv_pointwise to follow a parameter order similar to qlinear_pointwise for consistency.
  2. We now always convert x_scale and x_zero_point to Tensors, just as in the lowering of qlinear_pointwise. This avoids the need for two separate C APIs (one for double x_scale and int64_t x_zero_point, and another for Tensor versions); instead, a single API taking Tensor-based x_scale and x_zero_point is enough. If we later add dynamic quantization for qconv (which will use Tensor for x_scale and x_zero_point), we can reuse the code from this PR without changing the C shim layer API (see the sketch after this list).
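
To illustrate point 2, here is a minimal C++ sketch of wrapping the scalar x_scale / x_zero_point into 0-dim Tensors so that one Tensor-based API covers both the static case and a future dynamic-quantization case. The wrapper name below is purely illustrative, and the exact dtypes used by the real lowering may differ; the actual C shim entry point and its full argument list live in the generated shim sources.

#include <ATen/ATen.h>
#include <cstdint>

// Illustrative sketch only: shows the scalar -> Tensor wrapping idea, not the real shim symbol.
at::Tensor qconv_pointwise_via_tensor_api_sketch(
    const at::Tensor& qx,
    double x_scale,
    int64_t x_zero_point /* , ... remaining conv/post-op arguments ... */) {
  // Wrap the scalar quantization params as 0-dim tensors, mirroring the approach taken in the
  // qlinear_pointwise lowering, so only one Tensor-based API is needed.
  at::Tensor x_scale_t = at::scalar_tensor(x_scale, at::dtype(at::kDouble));
  at::Tensor x_zp_t = at::scalar_tensor(x_zero_point, at::dtype(at::kLong));
  // ... pass qx, x_scale_t, x_zp_t (plus the remaining arguments) to the
  // Tensor-based qconv_pointwise entry point here ...
  (void)x_scale_t;
  (void)x_zp_t;
  return qx;  // placeholder return for this sketch
}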

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov


pytorch-bot bot commented Oct 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138540

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 8177bed with merge base 07b0d63:

BROKEN TRUNK - The following job failed but was already present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

chunyuan-w added a commit that referenced this pull request Oct 22, 2024
ghstack-source-id: 427af09
Pull Request resolved: #138540
Contributor

@desertfire desertfire left a comment


Some of the APIs definitely felt clunky when I worked on adding ABI compatibility support for them. Thanks for making them better.


// Conv2D with binary postop
// Previous schema: qaccum and its quantization params came right after the activation's quantization params
m.def(TORCH_SELECTIVE_SCHEMA("onednn::qconv2d_pointwise.binary(Tensor qx, float x_scale, int x_zero_point, Tensor qaccum, float accum_scale, int accum_zero_point, Tensor qw, Tensor w_scale, Tensor w_zero_point, Tensor? bias, int[] stride, int[] padding, int[] dilation, int groups, float output_scale, int output_zero_point, ScalarType? output_dtype, str binary_attr, Scalar? alpha, str? unary_attr, Scalar?[] unary_scalars, str? unary_algorithm) -> Tensor"));
// New schema in this PR: parameter order aligned with qlinear_pointwise (qx, x_scale, x_zero_point, qw, w_scale, w_zero_point, ...)
m.def(TORCH_SELECTIVE_SCHEMA("onednn::qconv2d_pointwise.binary(Tensor qx, float x_scale, int x_zero_point, Tensor qw, Tensor w_scale, Tensor w_zero_point, Tensor qaccum, Tensor? bias, int[] stride, int[] padding, int[] dilation, int groups, float output_scale, int output_zero_point, ScalarType? output_dtype, float accum_scale, int accum_zero_point, str binary_attr, Scalar? alpha, str? unary_attr, Scalar?[] unary_scalars, str? unary_algorithm) -> Tensor"));
Contributor


Is it safe to directly change this one?

Contributor


I guess my question is, what is the BC policy for onednn ops?

Collaborator Author


Another option is to add something like a v2 version of this op to avoid the BC break.

Regarding qconv2d_pointwise: since CPU quantization support in Inductor is still a prototype feature, I think it should be fine to change the API while it's not yet stable. May I know whether the current approach looks fine to you, or whether you'd prefer adding a v2 version?
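
For reference, a hypothetical sketch of the v2 option mentioned above: the existing onednn::qconv2d_pointwise.binary schema would stay untouched and the qlinear-aligned parameter order would be registered under a new overload name. The name binary_v2 is made up for illustration; this is not what the PR ended up doing.

#include <torch/library.h>

TORCH_LIBRARY_FRAGMENT(onednn, m) {
  // Hypothetical: expose the reordered signature as a separate overload instead of
  // changing the existing one, so older callers keep working.
  m.def(TORCH_SELECTIVE_SCHEMA("onednn::qconv2d_pointwise.binary_v2(Tensor qx, float x_scale, int x_zero_point, Tensor qw, Tensor w_scale, Tensor w_zero_point, Tensor qaccum, Tensor? bias, int[] stride, int[] padding, int[] dilation, int groups, float output_scale, int output_zero_point, ScalarType? output_dtype, float accum_scale, int accum_zero_point, str binary_attr, Scalar? alpha, str? unary_attr, Scalar?[] unary_scalars, str? unary_algorithm) -> Tensor"));
}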

Contributor


Since this is an onednn-specific op, I am ok with your decision.

@chunyuan-w chunyuan-w added the ciflow/trunk label Oct 29, 2024
chunyuan-w added a commit to chunyuan-w/pytorch that referenced this pull request Oct 30, 2024
ghstack-source-id: ae23a41
Pull Request resolved: pytorch#138540
@chunyuan-w
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


rahulsingh-intel pushed a commit to rahulsingh-intel/pytorch that referenced this pull request Nov 5, 2024

Pull Request resolved: pytorch#138540
Approved by: https://github.com/jgong5, https://github.com/desertfire
ghstack dependencies: pytorch#138691, pytorch#138806
@github-actions github-actions bot deleted the gh/chunyuan-w/39/head branch November 30, 2024 02:08

Labels

ciflow/inductor, ciflow/trunk, Merged, module: cpu, module: inductor, open source, release notes: quantization
