
Quantized Convolution error on Mobile with some scale values. #33466

Closed
supriyar opened this issue Feb 18, 2020 · 5 comments

Comments


supriyar commented Feb 18, 2020

🐛 Bug

QNNPACK throws an error for certain combinations of input and weight scale values. The error occurs when the convolution scale, computed as input_scale * kernel_scale / output_scale, is greater than or equal to 1.0.
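
Plugging in the scales from the repro script below confirms the failing case:

input_scale = 0.052
kernel_scale = 2.39
output_scale = 0.112
conv_scale = input_scale * kernel_scale / output_scale
print(conv_scale)  # ~1.109643, which QNNPACK rejects as >= 1.0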

This problem arises when a model is trained with QAT and run on the QNNPACK mobile backend.

To Reproduce

Script to reproduce the error:

import torch

qconv = torch.ops.quantized.conv2d
qconv_prepack = torch.ops.quantized.conv2d_prepack

strides = (1, 1)
pads = (0, 0)
dilations = (1, 1)
groups = 1


for name in ["fbgemm", "qnnpack"]:
    torch.backends.quantized.engine = name
    print("Running on backend ", name)
    # Scales chosen so that input_scale * kernel_scale / output_scale
    # (0.052 * 2.39 / 0.112 ~= 1.1096) exceeds 1.0.
    x = torch.randn(1, 4, 4, 4)
    qx = torch.quantize_per_tensor(x, scale=0.052, zero_point=0, dtype=torch.quint8)
    weight = torch.randn(2, 4, 2, 2)
    qweight = torch.quantize_per_tensor(weight, scale=2.39, zero_point=0, dtype=torch.qint8)
    # Prepack the quantized weight (bias=None), then run the quantized conv
    # with output_scale=0.112 and output_zero_point=0.
    w_prepack = qconv_prepack(qweight, None, strides, pads, dilations, groups)
    print(qconv(qx, w_prepack, strides, pads, dilations, groups, 0.112, 0))

Expected behavior

Output from FBGEMM:

tensor([[[[0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000]],

         [[1.2320, 0.2240, 0.0000],
          [0.0000, 0.0000, 2.6880],
          [0.4480, 0.0000, 0.0000]]]], size=(1, 2, 3, 3), dtype=torch.quint8,
       quantization_scheme=torch.per_tensor_affine, scale=0.112, zero_point=0)

Output from QNNPACK:

Error in QNNPACK: failed to create convolution with 0.052 input scale, 2.39 kernel scale, and 0.112 output scale: convolution scale 1.109643 is greater or equal to 1.0

cc @jerryzh168 @jianyuh @dzhulgakov @raghuramank100 @jamesr66a, @kimishpatel

@kimishpatel

@raghuramank100 @supriyar, isn't it strange to have a kernel scale of 2.39? What does the weight tensor look like?

@supriyar

I believe the scale value is computed as (max - min) / (qmax - qmin). Given that, I wouldn't consider a scale of 2.39 strange. Also, the convolution scale here is about 1.11, only slightly over 1.0. Since FBGEMM supports this, I expect it to be an issue when running models on mobile. Would it be possible to relax the 1.0 requirement a little?
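
For reference, a minimal sketch of that computation (the helper name compute_scale and the [0, 255] quint8 range are my assumptions, not PyTorch's observer code):

import torch

def compute_scale(t, qmin=0, qmax=255):
    # scale = (max - min) / (qmax - qmin); a tensor with a wide enough
    # value range can legitimately produce a scale above 1.0.
    return (t.max().item() - t.min().item()) / (qmax - qmin)

print(compute_scale(torch.tensor([-300.0, 309.45])))  # ~2.39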

@kimishpatel

@supriyar, my bad, I misunderstood this. Let's discuss in person with Raghu why we are running into this. As for fixing this in PyTorch QNNPACK, it needs to be looked at carefully, because it may affect the assumptions made in the requantization logic. (Probably doable if we follow requantization semantics similar to FBGEMM's.)
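
For context, FBGEMM-style requantization can accommodate multipliers at or above 1.0 by splitting the float multiplier into a fixed-point significand and a power-of-two shift. The sketch below (using math.frexp) is an illustration of that idea, not actual QNNPACK or FBGEMM code:

import math

def decompose_multiplier(m):
    # Split m into significand * 2**exponent with significand in [0.5, 1).
    # A multiplier >= 1.0 simply yields a positive exponent (a left shift)
    # rather than a right shift, so the >= 1.0 case is representable.
    significand, exponent = math.frexp(m)
    q31 = round(significand * (1 << 31))  # Q31 fixed-point significand
    return q31, exponent

print(decompose_multiplier(1.109643))  # the convolution scale from this issue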

yf225 added labels on Feb 20, 2020: oncall: quantization (Quantization support in PyTorch), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module), oncall: mobile (Related to mobile support, including iOS and Android)
@kimishpatel

#37683 and #35856 should resolve this issue.


vkuzo commented Jul 8, 2020

Closing, since this was fixed by @kimishpatel.

vkuzo closed this as completed on Jul 8, 2020