
Failed with Post Training Quantization after Quantization Aware Training #408

@kalaluthien

Description


Describe the requests
I am working with recent neural networks targeting mobile devices, and I have found several obstacles to performing integer quantization after quantization-aware training (QAT).

I know these APIs are not available now, but if you have plans to address the following issues, please let me know when they will be available :)
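For context, the flow I use is roughly the following sketch (`float_model` and `train_ds` stand in for the real model and data; the full script is in the gist linked below):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Wrap the float Keras model with fake-quant ops and fine-tune (QAT).
quant_aware_model = tfmot.quantization.keras.quantize_model(float_model)
quant_aware_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
quant_aware_model.fit(train_ds, epochs=1)

# Integer quantization via the TFLite converter (the "succeed to convert" step).
converter = tf.lite.TFLiteConverter.from_keras_model(quant_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Preparing the interpreter is where the nodes below fail ("failed to prepare").
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()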

  1. AveragePooling2D
x = layers.Conv2D(32, 5, padding='same', activation='relu')(input)
x = layers.AveragePooling2D((2, 2), (2, 2), padding='same')(x)  # <- converts successfully, fails to prepare
x = layers.Conv2D(64, 5, padding='same', activation='relu')(x)

tensorflow/lite/kernels/pooling.cc:94 input->params.scale != output->params.scale (-1045139600 != 653455232)
Node number 2 (AVERAGE_POOL_2D) failed to prepare.

  • Same as the MaxPooling2D problem below.
  2. MaxPooling2D
x = layers.Conv2D(32, 5, padding='same', activation='relu')(input)
x = layers.MaxPooling2D((2, 2), (2, 2), padding='same')(x)  # <- converts successfully, fails to prepare
x = layers.Conv2D(64, 5, padding='same', activation='relu')(x)

tensorflow/lite/kernels/pooling.cc:94 input->params.scale != output->params.scale (-1045139600 != 653454832)
Node number 2 (MAX_POOL_2D) failed to prepare.

  • Same as the AveragePooling2D problem above; a sketch for inspecting the mismatched scales follows.
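For both pooling cases, the per-tensor scales that the converter wrote can be dumped without allocating tensors; a minimal sketch (the model path here is illustrative):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='qat_pooling_model.tflite')  # illustrative path
for detail in interpreter.get_tensor_details():
    # 'quantization' is a (scale, zero_point) pair for each tensor.
    print(detail['name'], detail['quantization'])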
  3. Residual connection
input = tf.keras.Input(input_shape)
shortcut = input
x = layers.Conv2D(16, 1, padding='same', use_bias=False)(input)
x = layers.BatchNormalization()(x)
x = layers.ReLU(6.0)(x)
x = x + shortcut  # <- fails to convert: '+' is lowered to a TensorFlowOpLayer, not an Add layer

Layer tf_op_layer_AddV2:<class 'tensorflow.python.keras.engine.base_layer.TensorFlowOpLayer'> is not supported.
You can quantize this layer by passing a tfmot.quantization.keras.QuantizeConfig instance to the quantize_annotate_layer API.

  • This problem causes the HardSwish failure in item 4 below.
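For reference, this is roughly what the explicit-layer variant of the shortcut looks like; whether the built-in layers.Add is handled by the default quantization scheme is an assumption on my part:

from tensorflow.keras import layers

shortcut = input
x = layers.Conv2D(16, 1, padding='same', use_bias=False)(input)
x = layers.BatchNormalization()(x)
x = layers.ReLU(6.0)(x)
x = layers.Add()([x, shortcut])  # explicit Keras layer instead of the '+' operator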
  4. HardSwish
x = layers.Conv2D(32, 3, 2, padding='same', use_bias=False)(input)
x = layers.BatchNormalization()(x)
x = layers.ReLU(6.0)(x + 3) * (1 / 6)  # <- equivalent to `HardSwish`

Layer tf_op_layer_AddV2_1:<class 'tensorflow.python.keras.engine.base_layer.TensorFlowOpLayer'> is not supported. You can quantize this layer by passing a tfmot.quantization.keras.QuantizeConfig instance to the quantize_annotate_layer API.

  • There are two levels to the problem.
    • I configured a QuantizeConfig so that TensorFlowOpLayer could use Add and Multiply ops (a sketch of such a config follows this item); however, because these ops sit between BN and ReLU6, the Conv2D-BN-ReLU layers could not be fused correctly, so quantized MobileNetV3 became slower than the floating-point version on an Android device.
    • The main building block of MobileNetV3, Conv2D-BN-HardSwish, is not a supported pattern.
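A sketch of the kind of QuantizeConfig I mean, with no weight quantizers and a single output quantizer; the class name, quantizer choice, and parameters here are illustrative, not necessarily what the gist uses:

import tensorflow_model_optimization as tfmot

class PassthroughQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    # No weights to quantize; only quantize the layer's output activation.
    def get_weights_and_quantizers(self, layer):
        return []

    def get_activations_and_quantizers(self, layer):
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        pass

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    def get_output_quantizers(self, layer):
        return [tfmot.quantization.keras.quantizers.MovingAverageQuantizer(
            num_bits=8, per_axis=False, symmetric=False, narrow_range=False)]

    def get_config(self):
        return {}

# Wiring it up as the error message suggests (`some_layer` is hypothetical):
annotated = tfmot.quantization.keras.quantize_annotate_layer(
    some_layer, quantize_config=PassthroughQuantizeConfig())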
  5. GlobalAveragePooling-Dense
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(1024, activation='relu')(x)  # <- converts successfully, fails to prepare

tensorflow/lite/kernels/kernel_util.cc:129 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale) was not true.
Node number 4 (FULLY_CONNECTED) failed to prepare.

  • This bug prevents me from benchmarking the official MobileNetV2 network imported from tf.keras; a sketch of that end goal follows.
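The end goal is just this kind of benchmark on a stock Keras model (a sketch; input shape and weights are the usual defaults):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

base = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3), weights='imagenet')
quant_aware = tfmot.quantization.keras.quantize_model(base)
# Fine-tune and convert as in the sketch near the top; preparing the interpreter
# then hits the FULLY_CONNECTED failure described above at the classifier head.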

System information

TensorFlow installed from (source or binary): binary

TensorFlow version: 2.2.0 (release)

TensorFlow Model Optimization version: 0.3.0 (release)

Python version: 3.6.0

Code to reproduce the issue
Gist to reproduce the full test:
https://gist.github.com/kalaluthien/b270c71afb6866ae61ef0dc088a762f2

Labels

feature request, technique:qat (regarding tfmot.quantization.keras quantization-aware training APIs and docs)
