
Overload vec::dequantize to eliminate rounding error for quantized sigmoid #114098

Closed

Conversation

@Xia-Weiwen (Collaborator) commented Nov 20, 2023

Stack from ghstack (oldest at bottom):

Description
Fix #107030
Dequantize X by `(x_val - zp) * scale` instead of `x_val * scale + (-zp * scale)` to eliminate the rounding error.
Currently this overload is used for sigmoid only.
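
To make the fix concrete, here is a minimal scalar sketch (an editorial illustration, not the actual `vec::dequantize` code; `scale` and `zp` below are arbitrary example values) that compares the two formulas over all uint8 inputs:

```cpp
// Scalar sketch contrasting the old and new dequantization formulas.
// scale and zp are hypothetical example values, not taken from the PR.
#include <cmath>
#include <cstdint>
#include <cstdio>

int main() {
  const float scale = 0.1f;   // hypothetical quantization scale
  const int32_t zp = 17;      // hypothetical zero point

  float max_diff = 0.0f;
  for (int32_t q = 0; q <= 255; ++q) {
    // Old form: two independently rounded products are combined.
    const float old_form =
        static_cast<float>(q) * scale + (-static_cast<float>(zp) * scale);
    // New form: the subtraction q - zp is exact in integers, so only the
    // final multiplication is rounded.
    const float new_form = static_cast<float>(q - zp) * scale;
    max_diff = std::fmax(max_diff, std::fabs(old_form - new_form));
  }
  // A nonzero max_diff reflects the extra rounding in the old form; for the
  // quantized sigmoid that small drift can flip a requantized output by 1.
  std::printf("max |old - new| = %.9g\n", max_diff);
  return 0;
}
```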

Performance impact: (benchmark screenshot not reproduced here)
Measured on Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake)

Test plan
`python test_quantization.py TestQuantizedOps.test_sigmoid_dequantize_rounding_error`

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10


pytorch-bot bot commented Nov 20, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/114098

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a7ba41b with merge base 5a96a42:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Xia-Weiwen added a commit that referenced this pull request Nov 20, 2023
Overload vec::dequantize to eliminate rounding error for quantized sigmoid

ghstack-source-id: ae0fc902a817baaf4da14943587808f572bf78bc
Pull Request resolved: #114098
@github-actions github-actions bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Nov 20, 2023
@Xia-Weiwen Xia-Weiwen marked this pull request as draft November 20, 2023 06:55
@jgong5 (Collaborator) left a comment

Maybe we need to revisit other callers and fix them too?

@Xia-Weiwen Xia-Weiwen marked this pull request as ready for review November 21, 2023 05:21
@Xia-Weiwen Xia-Weiwen added the intel This tag is for PR from Intel label Nov 21, 2023
@Xia-Weiwen (Collaborator, Author)

> Maybe we need to revisit other callers and fix them too?

Yes. Maybe we can fix them one by one if the same issue occurs for other ops, so that we can minimize the impact.

@Xia-Weiwen (Collaborator, Author)

Hi @jerryzh168 @salilsdesai @kimishpatel @digantdesai @jianyuh Could you please review this PR? Thanks!

@Xia-Weiwen (Collaborator, Author)

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 23, 2023
@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

xunsongh pushed a commit to xunsongh/pytorch that referenced this pull request Nov 24, 2023
Overload vec::dequantize to eliminate rounding error for quantized sigmoid (pytorch#114098)

Pull Request resolved: pytorch#114098
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
@facebook-github-bot facebook-github-bot deleted the gh/Xia-Weiwen/18/head branch November 26, 2023 15:26
@kimishpatel (Contributor)

> Dequantize X by `(x_val - zp) * scale` instead of `x_val * scale + (-zp * scale)` to eliminate rounding error

Why were we doing `x_val * scale + (-zp * scale)` in the first place? That doesn't sound right.

@Xia-Weiwen (Collaborator, Author)

> Dequantize X by `(x_val - zp) * scale` instead of `x_val * scale + (-zp * scale)` to eliminate rounding error
>
> Why were we doing `x_val * scale + (-zp * scale)` in the first place? That doesn't sound right.

I guess it was about making use of FMA (fused multiply-add) to speed up the computation.
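
For illustration, a rough scalar sketch of that guess (an assumption, not confirmed in this thread; `dequantize_fma_style` is a hypothetical name, not the actual vectorized ATen path):

```cpp
// Presumed motivation for the old form: hoist the zero-point term out of the
// loop so each element needs only one fused multiply-add.
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

std::vector<float> dequantize_fma_style(const std::vector<uint8_t>& q,
                                        float scale, int32_t zp) {
  // Rounded once, up front; this pre-rounding is the source of the drift.
  const float neg_zp_premul = -static_cast<float>(zp) * scale;
  std::vector<float> out(q.size());
  for (std::size_t i = 0; i < q.size(); ++i) {
    // One fused op per element against the precomputed constant.
    out[i] = std::fma(static_cast<float>(q[i]), scale, neg_zp_premul);
  }
  return out;
}
```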

@jgong5 (Collaborator) commented Nov 29, 2023

> I guess it was about making use of FMA (fused multiply-add) to speed up the computation.

It is bound by memory access, so I am not sure FMA really matters here.

Labels
ciflow/trunk (Trigger trunk jobs on your pull request), intel (This tag is for PR from Intel), Merged, module: cpu (CPU specific problem, e.g., perf, algorithm), open source, release notes: quantization