quantize_per_tensor returns inconsistent results on ARM for quint8 #60077

@malfet

Description

🐛 Bug

quantize_per_tensor returns a tensor filled with inconsistent values when quantizing torch.ones(10) * 2**32:

$ python3 -c "import torch;print(torch.torch.quantize_per_tensor(torch.ones(10) * 2**32, 0.5, 1, torch.quint8))" 
tensor([127.0000, 127.0000, 127.0000, 127.0000, 127.0000, 127.0000, 127.0000,
        127.0000,  -0.5000,  -0.5000], size=(10,), dtype=torch.quint8,
       quantization_scheme=torch.per_tensor_affine, scale=0.5, zero_point=1)

This happens because of the integer overflow here:

auto r = zero_point + static_cast<int32_t>(Round(value * inv_scale));
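For an input of 2**32 with scale 0.5, the product `value * inv_scale` is 2**33, which is far outside the `int32_t` range, so the cast wraps or saturates (target-dependent) before the result is ever clamped to the quantized range. A minimal Python sketch of the fix, clamping to the quantized range before any narrow integer cast (names and structure here are illustrative, not the actual PyTorch internals):

```python
def quantize_val(value, scale, zero_point, qmin=0, qmax=255):
    """Sketch of overflow-safe affine quantization for quint8.

    The clamp to [qmin, qmax] happens while the intermediate is still an
    arbitrary-precision number, so a huge input like 2**32 can never wrap
    around a 32-bit intermediate.
    """
    inv_scale = 1.0 / scale
    r = zero_point + round(value * inv_scale)
    return min(max(r, qmin), qmax)

# 2**32 with scale=0.5, zero_point=1 saturates to qmax=255,
# which dequantizes to (255 - 1) * 0.5 = 127.0 as expected.
q = quantize_val(2**32, 0.5, 1)
print(q, (q - 1) * 0.5)  # 255 127.0
```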

Expected behavior

$ python3 -c "import torch;print(torch.torch.quantize_per_tensor(torch.ones(10) * 2**32, 0.5, 1, torch.quint8))" 
tensor([127., 127., 127., 127., 127., 127., 127., 127., 127., 127.],
       size=(10,), dtype=torch.quint8,
       quantization_scheme=torch.per_tensor_affine, scale=0.5, zero_point=1)
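The expected 127.0 follows directly from the quint8 range: every element should saturate to qmax = 255, which dequantizes back through the scale and zero point:

```python
# quint8 parameters from the repro above
qmax, zero_point, scale = 255, 1, 0.5

# Dequantization of the saturated value: (q - zero_point) * scale
print((qmax - zero_point) * scale)  # 127.0
```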

cc @malfet @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo

Labels

- module: arm (Related to ARM architecture builds of PyTorch; includes Apple M1)
- module: correctness (silent) (issue that returns an incorrect result silently)
- oncall: quantization (Quantization support in PyTorch)
