Vectorized operation on quantized tensors returns wrong values (different rounding) #107030
I traced the issue to a difference in the de-quantization.

In the current case we have a clipped quantized value, i.e. the stored integer equals the zero point. For regular de-quantization the formula is `(value - zero_point) * scale`, which yields exactly 0 here. However, the vectorized de-quantization doesn't use this exact formula:

pytorch/aten/src/ATen/cpu/vec/vec256/vec256_qint.h Lines 678 to 701 in 9f26503

What is used instead is `value * scale + (-zero_point * scale)`, where `-zero_point * scale` is precomputed once (as `scale_zp_premul`) and applied via a fused multiply-add. While mathematically equivalent, the result differs due to rounding. Hence, while the non-vectorized part yields the correct de-quantized value "0", the vectorized part yields "0.0112305", which is exactly the rounding error of the precomputed product, i.e. the exact `zero_point * scale` minus its float32-rounded value.
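To see the discrepancy numerically, here is a minimal sketch with illustrative `scale`/`zero_point` stand-ins (not the constants from the failing test). A float32 × float32 product is exact in float64, so the float64 expression below emulates the single rounding of a hardware FMA:

```python
import numpy as np

# Illustrative stand-in values, not the ones from the failing test.
zero_point = np.float32(255.0)
scale = np.float32(1234.5678)

# Precomputed once by the vectorized kernel: -zero_point*scale, rounded to float32.
scale_zp_premul = np.float32(-(zero_point * scale))

value = zero_point  # an input quantized exactly to the zero point

# Scalar path: subtract first, then multiply. Yields exactly 0.0.
scalar = (value - zero_point) * scale

# Vectorized path: fma(value, scale, scale_zp_premul). The float64 product of
# two float32 values is exact (24+24 < 53 mantissa bits), so rounding the sum
# back to float32 reproduces the single rounding of a real FMA.
vectorized = np.float32(np.float64(value) * np.float64(scale) + np.float64(scale_zp_premul))

print(scalar)      # 0.0
print(vectorized)  # the float32 rounding error of zero_point * scale (non-zero)
```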
The test tries to compensate for similar errors by computing the reference on a dequantized value of the quantized tensor. This could have caught the issue, however it does not here. For reference, FBGEMM uses `(value - zero_point) * scale`, and the vector/multi-element version simply iterates over this, so it also doesn't suffer from the rounding issue.

So I'd argue this is a bug in the vectorized dequantization implementation in PyTorch. Especially suspicious is the `fmadd` with the precomputed `scale_zp_premul` term, i.e. an FMA that folds in `-zero_point * scale` after that product has already been rounded. So I'd guess expanding the quantized vector to float, subtracting the zero point, and only then multiplying by the scale would avoid the issue (see the sketch below).
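A numpy sketch of that subtract-first order of operations (my own naming, not PyTorch's API), which is also what the eventual fix in pytorch#114098 adopts:

```python
import numpy as np

def dequantize_subtract_first(q, scale, zero_point):
    """Subtract the zero point before multiplying, as FBGEMM and the scalar
    path do; the result is exactly 0.0 whenever q == zero_point."""
    return (q.astype(np.float32) - np.float32(zero_point)) * np.float32(scale)

q = np.full(66, 255, dtype=np.uint8)  # every element quantized to the zero point
print(dequantize_subtract_first(q, 1234.5678, 255))  # all exactly 0.0
```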
@Xia-Weiwen could you please take a look at this one?
I will take a look later.
Any updates here? This is still an issue in 2.1.
@Xia-Weiwen Any comments?
Hi @Flamefire. Sorry for the late reply. We plan to fix it by PyTorch 2.2. For the main branch, it will be earlier than that. Thanks!
…gmoid (pytorch#114098)

**Description**
Fix pytorch#107030. Dequantize X by `(x_val - zp) * scale` instead of `x_val * scale + (-zp * scale)` to eliminate the rounding error. Now this overload is used for sigmoid only.

Performance impact (Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz, Ice Lake):
![image](https://github.com/pytorch/pytorch/assets/12522207/655abd16-7d9d-4a9a-8c59-327ebf39157a)

**Test plan**
`python test_quantization.py TestQuantizedOps.test_sigmoid_dequantize_rounding_error`

Pull Request resolved: pytorch#114098
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
🐛 Describe the bug
The following code fails:
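(The original snippet was lost when this issue was archived; below is a hedged reconstruction. The constants are stand-ins chosen so that `zero_point * scale` carries a float32 rounding error of 0.01123046875, matching the "0.0112305" quoted above; the original test used different values.)

```python
import torch

# Hedged reconstruction with engineered stand-in constants:
# zero_point * scale = 314571.51123046875 rounds to 314571.5 in float32,
# so the FMA-based dequantization yields 0.01123046875 instead of 0.0.
scale, zero_point = 1233.61376953125, 255
x = torch.zeros(66)  # quantizes exactly to the zero point, exact dequant is 0.0
qx = torch.quantize_per_tensor(x, scale, zero_point, torch.quint8)
y = torch.sigmoid(qx).dequantize()
print(y[:64].unique())  # tensor([0.5039]) on affected builds (vectorized path)
print(y[64:].unique())  # tensor([0.5000]) (scalar path)
assert torch.equal(y, torch.full((66,), 0.5))  # fails on affected builds
```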
In particular, the first 64 values are "0.5039" while the remainder are "0.5000". This happens for any remainder that does not fill a complete 64-value chunk.
Found by reducing an example of a failing test in `test_quantization`. This seems to happen for all PyTorch versions so far and does not depend on the host CPU; I reproduced this even on ppc64le.
Versions
cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel