Finalize the constraints on QuantizationZeroPoint #1405

sdasgup3 · 2023-04-14T01:05:07Z

The goal of the ticket is to resolve the following discussions around the quantization zero_point:

update
3. Following the TFLite op behavior, should zero_points be restricted to certain ranges (e.g., for the tanh op)?


jingpu · 2023-06-08T18:28:04Z

Hi, I didn't get a chance to comment during the discussion meeting today, so I'll add my comment here. I feel an 8-bit zero point is more of a TFLite x INT8 quant-specific choice, and in the StableHLO spec it is better to be more general than that. For example, if we think about INT4 quantization, do we want to restrict the zero point to 8 bits too? In that case, 8 bits seems like a rather magic/arbitrary number (why not 4 bits, 5 bits, etc.?).

igorsafo · 2023-06-08T23:36:44Z

Hi, thanks for today's discussion!
I think limiting the zero point to the dst data type range can restrict the quantization schemes that are supported. Suppose we have the following quantization:

x_f32 = alpha * (x_u8 - z_x)
min(x_f32) -> u8_min
max(x_f32) -> u8_max

alpha = (max(x_f32) - min(x_f32)) / (u8_max - u8_min)
z_x = u8_max - max(x_f32)/alpha

If 0_f32 < min(x_f32) then z_x might not fit into destination range. For example:

>>> x = [4, 6]
>>> alpha = (max(x) - min(x)) / (255-0)
>>> z = 255 - max(x)/alpha
>>> alpha
0.00784313725490196
>>> z
-510.0
>>> 4 / alpha + z
0.0
>>> 6 / alpha + z
255.0

As you can see, the zero point is -510, which doesn't fit into the u8 data type.
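
The same calculation can be wrapped in a small helper that also reports whether the zero point fits the storage range. This is only a minimal sketch: the names `affine_params` and `fits_storage` and the default u8 bounds are assumptions made for illustration, not part of the StableHLO spec or any library.

```python
def affine_params(x_min, x_max, q_min=0, q_max=255):
    """Return (alpha, z) mapping [x_min, x_max] onto [q_min, q_max].

    Uses x_f32 = alpha * (x_q - z), as in the formulas above.
    """
    alpha = (x_max - x_min) / (q_max - q_min)
    z = q_max - x_max / alpha
    return alpha, z


def fits_storage(z, q_min=0, q_max=255):
    """True if the (rounded) zero point lies within the storage range."""
    return q_min <= round(z) <= q_max


# Float range [4, 6] excludes 0.0: the zero point falls outside u8.
alpha, z = affine_params(4.0, 6.0)
print(round(z), fits_storage(z))

# Float range [-2, 6] includes 0.0: the zero point falls inside u8.
alpha, z = affine_params(-2.0, 6.0)
print(round(z), fits_storage(z))
```

The second call suggests the general pattern: whenever 0_f32 lies inside the float range, the zero point computed this way lands inside the storage range.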

sdasgup3 · 2023-06-12T16:39:59Z

Thanks @igorsafo for bringing this up and thanks for the super useful example.

Let me share a bit of history on why we ended up at type(zero_point) = storage_type.
I remember that the initial proposal of the spec had "for all i, type(zero_points[i]) = i64". That was to accommodate the use case you shared, where 0_f32 does not lie within [min(x_f32), max(x_f32)], which puts the zero point outside of [q_min, q_max]; hence an i64 can safely be used to store the zero point.

Later, based on the discussion, we decided to add the constraint (C8) storage_min <= zero_points <= storage_max, to enforce that 0_f32 will always be within [min(x_f32), max(x_f32)]; if C8 holds, then we can say that type(zero_point) = storage_type. The decision was backed by how TFLite ensures that 0_f32 is included in the float range. (refs).
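
To illustrate why including 0_f32 in the float range keeps the zero point within the storage range, here is a minimal sketch. It assumes the float range is simply widened to contain 0.0 before the parameters are computed; the function name `params_with_zero_in_range` is hypothetical, and this is not a copy of any TFLite or StableHLO implementation.

```python
def params_with_zero_in_range(x_min, x_max, q_min=0, q_max=255):
    """Compute (alpha, z) after widening the float range to include 0.0."""
    # Assumption: widen the range so that 0.0 is always representable.
    x_min = min(x_min, 0.0)
    x_max = max(x_max, 0.0)
    if x_max == x_min:
        # Degenerate all-zero range: pick an arbitrary scale.
        return 1.0, q_min
    alpha = (x_max - x_min) / (q_max - q_min)
    z = round(q_max - x_max / alpha)
    # Guard against floating-point rounding at the boundaries.
    z = max(q_min, min(q_max, z))
    return alpha, z


# The running example [4, 6] becomes [0, 6] after widening.
alpha, z = params_with_zero_in_range(4.0, 6.0)
print(alpha, z)
assert 0 <= z <= 255  # satisfies storage_min <= zero_point <= storage_max
```

The trade-off in this sketch is resolution: widening [4, 6] to [0, 6] spends part of the u8 range on values that never occur.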

Coming back to the current discussion: I agree that, per the current form of the specification, the running example is not allowed, as the zero point goes out of [u8_min, u8_max]. However, I am happy to explore and work on how we can enable that.

Just to get a bit more context: do you mind sharing the motivation for not including 0_f32 in the float range? I was wondering if something like https://github.com/tensorflow/tensorflow/blob/f93185001ff1d7b75e9966b80a144601d7d6e696/tensorflow/lite/tools/optimize/quantization_utils.cc#L74 would work in your case.

igorsafo · 2023-06-15T00:47:59Z

@sdasgup3 Thank you for the links! It makes a lot of sense!
Unfortunately, I don't have a real use case. I am a developer of oneDNN, and we implemented zero points as s32 to support frameworks/applications that define them this way. What is the plan to support frameworks/applications that define a zero point wider than the quantization data type? For example, PyTorch specifies the zero point as an integer: https://pytorch.org/docs/stable/quantization.html#quantized-tensor

sdasgup3 · 2023-06-27T14:27:46Z

Thanks for the information!
I am trying to get the rationale behind PyTorch representing the zero point as an int: for example, is it to express a zero point outside the storage type range, or is it just a catch-all for all supported quantized dtypes? Let me know if you have any information. In any case, I will get back ASAP.

update
posted my question: https://dev-discuss.pytorch.org/t/semantics-of-zero-point/1349

sdasgup3 · 2024-01-25T02:03:00Z

> PyTorch specifies zero point as an integer https://pytorch.org/docs/stable/quantization.html#quantized-tensor

I think it is just a catch-all to express the zero point in any supported quantized dtype. Per the formula x_u8 = x_f32 / alpha + z_x, if a zero_point value is outside the range of the storage type, the result will simply overflow the storage type. In that case, all values will be quantized to the same value (the min or max of the storage type).
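
A small sketch of that overflow effect, under two stated assumptions: overflow is handled by saturating (clamping) to the storage range, and the inputs lie close to 0_f32. The helper `quantize_u8`, the scale 0.1, and the zero point 300 are hypothetical values chosen for illustration.

```python
def quantize_u8(x, alpha, z, q_min=0, q_max=255):
    """x_u8 = clamp(round(x / alpha + z)); saturation is an assumption."""
    q = round(x / alpha + z)
    return max(q_min, min(q_max, q))


alpha = 0.1
z = 300  # hypothetical zero point above u8_max
values = [-1.0, 0.0, 1.0]

# Every input near 0.0 lands above 255 before clamping, so all saturate.
print([quantize_u8(v, alpha, z) for v in values])  # [255, 255, 255]
```

Inputs far enough from 0_f32 (here, below roughly -4.5) would still fall inside the u8 range, so the collapse applies to values around zero rather than to every possible input.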

For now, let us close this issue with the resolution that the type of the zero point should be the same as or narrower than the storage type, which enforces that 0_f32 will always be within the float range. Feel free to re-open this issue with concrete use cases, or address them in separate issues.

sdasgup3 added the Spec label Apr 14, 2023

sdasgup3 self-assigned this Apr 14, 2023

burmako changed the title ~~Fixing the constraints on zero_point~~ Finalize the constraints on QuantizationZeroPoint Apr 14, 2023

burmako mentioned this issue Apr 14, 2023

Introduce QuantizedType #1352

Merged

sdasgup3 mentioned this issue Jun 20, 2023

Add constraints to some quantized ops related to per-axis scheme #1535

Merged

sdasgup3 closed this as completed Jan 25, 2024

GleasonK added the Quantization label Apr 8, 2024
