
Finalize the constraints on QuantizationZeroPoint #1405

Closed

sdasgup3 opened this issue Apr 14, 2023 · 6 comments

sdasgup3 (Member) commented Apr 14, 2023

The goal of the ticket is to resolve the following discussions around the quantization zero_point:

  1. Can zero point usage be restricted to 8-bit and lower?
  2. Following the TFLite behavior, should zero_point granularity be restricted to per-tensor?

Update:
  3. Following the TFLite op behavior, should zero_points be restricted to certain ranges (e.g., for the tanh op)?

sdasgup3 added the Spec label Apr 14, 2023
sdasgup3 self-assigned this Apr 14, 2023
burmako changed the title from "Fixing the constraints on zero_point" to "Finalize the constraints on QuantizationZeroPoint" Apr 14, 2023
jingpu commented Jun 8, 2023

Hi, I didn't get a chance to comment during the discussion meeting today, so I'll add my comment here. I feel an 8-bit zero point is more of a TFLite x INT8 quantization-specific choice, and the StableHLO spec should be more general than that. For example, if we think about INT4 quantization, do we want to restrict the zero point to 8 bits too? There, 8-bit seems like a much more magic/arbitrary number (why not 4-bit, 5-bit, etc.).

igorsafo commented Jun 8, 2023

Hi, thanks for today's discussion!
I think limiting the zero point to the destination data type range can restrict the quantization schemes supported. If we have the following quantization:

x_f32 = alpha * (x_u8 - z_x)
min(x_f32) -> u8_min
max(x_f32) -> u8_max

alpha = (max(x_f32) - min(x_f32)) / (u8_max - u8_min)
z_x = u8_max - max(x_f32)/alpha

If 0_f32 < min(x_f32), then z_x might not fit into the destination range. For example:

>>> x = [4, 6]
>>> alpha = (max(x) - min(x)) / (255-0)
>>> z = 255 - max(x)/alpha
>>> alpha
0.00784313725490196
>>> z
-510.0
>>> 4 / alpha + z
0.0
>>> 6 / alpha + z
255.0

As you can see, the zero point is -510, which doesn't fit into the u8 data type.

sdasgup3 (Member Author) commented
Thanks @igorsafo for bringing this up, and for the super useful example.

Let me share a bit of history on why we ended up at type(zero_point) = storage_type.
I remember that the initial proposal of the spec had For all i, type(zero_points[i]) = i64. That was to accommodate the use-case you shared, where 0_f32 does not lie between [min(x_f32), max(x_f32)], which pushes the zero point outside of [q_min, q_max]; an i64 can then safely be used to store the zero point.

Later, based on the discussion, we decided to add the constraint (C8) storage_min <= zero_points <= storage_max to enforce that 0_f32 will always be within [min(x_f32), max(x_f32)]; if C8 holds, we can say that type(zero_point) = storage_type. The decision was backed by how TFLite ensures that 0_f32 is included in the float range (refs).

Coming back to the current discussion: I agree that, per the current form of the specification, the running example is not allowed, as the zero point goes outside [u8_min, u8_max]. However, I am happy to explore and work on how we can enable that.

Just to get a bit more context: do you mind sharing the motivation for not including 0_f32 in the float range? I was wondering if something like https://github.com/tensorflow/tensorflow/blob/f93185001ff1d7b75e9966b80a144601d7d6e696/tensorflow/lite/tools/optimize/quantization_utils.cc#L74 would work in your case.
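
For reference, here is a minimal Python sketch (my illustration, not the actual TFLite code behind that link) of the nudging idea: the float range is first extended to include 0.0, which guarantees that the resulting zero point lands inside [qmin, qmax].

def nudged_quant_params(rmin, rmax, qmin=0, qmax=255):
    # Extend the float range so that 0.0 is always representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    # Clamp to guard against floating-point rounding at the edges.
    return scale, max(qmin, min(qmax, zero_point))

# With the running example [4, 6], the range becomes [0, 6]:
scale, zp = nudged_quant_params(4.0, 6.0)
print(scale, zp)  # ~0.0235, 0 -- the zero point now fits in u8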

igorsafo commented
@sdasgup3 Thank you for the links! It makes a lot of sense!
Unfortunately, I don't have a real use case. I am a developer of oneDNN, and we implemented zero points as s32 to support frameworks/applications that define them this way. What is the plan to support frameworks/applications that define the zero point with a wider type than the quantized data type? For example, PyTorch specifies the zero point as an integer: https://pytorch.org/docs/stable/quantization.html#quantized-tensor

sdasgup3 (Member Author) commented Jun 27, 2023

Thanks for the information!
I am trying to understand the rationale behind PyTorch representing the zero point as int: for example, is it to express a zero point outside the storage type range, or is it just a catch-all for all supported quantized dtypes? Let me know if you have any information. In any case, I will get back asap.

Update: posted my question at https://dev-discuss.pytorch.org/t/semantics-of-zero-point/1349

sdasgup3 (Member Author) commented
> PyTorch specifies zero point as an integer https://pytorch.org/docs/stable/quantization.html#quantized-tensor

I think it is just a catch-all to express the zero point in any supported quantized dtype. Per the formula x_u8 = x_f32 / alpha + z_x, if a zero_point value is out of the range of the storage type, it will simply overflow the storage type, and all values will be quantized to the same value (the min or max of the storage type).
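
A quick sketch (my illustration, reusing igorsafo's numbers) of that failure mode, assuming the out-of-range zero point is saturated into u8 before use:

alpha = 2 / 255      # scale from the running example
z_true = -510        # real zero point, outside u8

def quantize(x, scale, zp):
    # Affine map followed by clamping to the u8 storage range.
    q = round(x / scale + zp)
    return max(0, min(255, q))

# With the true (wide) zero point, the mapping round-trips correctly:
print(quantize(4.0, alpha, z_true))    # 0
print(quantize(6.0, alpha, z_true))    # 255

# If the zero point is first saturated into u8, every input
# collapses to the top of the range:
z_stored = max(0, min(255, z_true))    # saturates to 0
print(quantize(4.0, alpha, z_stored))  # 255
print(quantize(6.0, alpha, z_stored))  # 255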

For now, let us close this issue with the resolution that the zero point type should be the same as or narrower than the storage type, enforcing that 0_f32 will always be within the float range. Feel free to re-open this issue with concrete use-cases, or address them in separate issues.
