INT4 and other low-precision conversion support status #64193
Labels: comp:lite (TF Lite related issues), ModelOptimizationToolkit (TF Model Optimization Toolkit), stat:awaiting tensorflower (Status - Awaiting response from tensorflower), TFLiteConverter (For issues related to TFLite converter), type:feature (Feature requests)
What is the current status of model conversion, specifically post-training quantization (PTQ), with INT4 precision?
The question was raised before in #60125.
It also looks like INT4 support is being added to various parts of TensorFlow, as evidenced by #63870 and
tensorflow/tensorflow/compiler/mlir/quantization/tensorflow/quantization_options.proto
(line 76 at commit 6738c28).
However, at the moment there seems to be no way to quantize a model to INT4 (specifically the weights).
Can anyone on the TF team who actively works on this shed light on the current direction, and where one would need to dig to add INT4 PTQ support?
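For context, a minimal sketch of what INT4 weight PTQ amounts to, assuming a symmetric per-tensor scheme mapping floats to the 4-bit signed range [-8, 7] (this is illustrative NumPy, not a supported TFLite converter API; the function names here are made up):

```python
import numpy as np

def quantize_weights_int4(w: np.ndarray):
    """Symmetric per-tensor INT4 quantization: map floats to [-8, 7]."""
    # Choose the scale so the largest weight magnitude maps to 7
    # (the positive INT4 limit; -8 is left unused for symmetry).
    scale = np.max(np.abs(w)) / 7.0
    # Round to the nearest 4-bit level; stored in an int8 container
    # since NumPy has no native 4-bit dtype.
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from INT4 levels."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.07, 2.1], dtype=np.float32)
q, scale = quantize_weights_int4(w)
w_hat = dequantize_int4(q, scale)
```

With only 16 representable levels, per-channel (rather than per-tensor) scales are usually needed in practice to keep the quantization error acceptable, which is presumably part of what converter-level INT4 support would have to handle.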