Description
Describe the requests
I am working with recent neural networks targeting mobile devices, and I found there are obstacles to performing integer quantization after QAT.
I know these APIs are not available now, but if you have plans to address the following issues, please let me know when they will be available :)
- AveragePooling2D
```python
x = layers.Conv2D(32, 5, padding='same', activation='relu')(input)
x = layers.AveragePooling2D((2, 2), (2, 2), padding='same')(x)  # <- succeeds to convert, fails to prepare
x = layers.Conv2D(64, 5, padding='same', activation='relu')(x)
```
```
tensorflow/lite/kernels/pooling.cc:94 input->params.scale != output->params.scale (-1045139600 != 653455232)
Node number 2 (AVERAGE_POOL_2D) failed to prepare.
```
  - Same as the `MaxPooling2D` problem.
- MaxPooling2D
```python
x = layers.Conv2D(32, 5, padding='same', activation='relu')(input)
x = layers.MaxPooling2D((2, 2), (2, 2), padding='same')(x)  # <- succeeds to convert, fails to prepare
x = layers.Conv2D(64, 5, padding='same', activation='relu')(x)
```
```
tensorflow/lite/kernels/pooling.cc:94 input->params.scale != output->params.scale (-1045139600 != 653454832)
Node number 2 (MAX_POOL_2D) failed to prepare.
```
  - Same as the `AveragePooling2D` problem.
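For context on the two pooling failures above: TFLite's pooling kernels require the input and output tensors to share a single quantization scale, so pooling can operate directly on int8 values without rescaling. A plain-Python sketch of that invariant (illustrative only; names and values are mine, not the TFLite kernel code):

```python
def dequantize(q, scale, zero_point):
    # real value represented by a quantized integer
    return scale * (q - zero_point)

def avg_pool_quantized(qs):
    # integer-domain average pooling: only valid when the input and
    # output tensors use the same scale and zero point
    return round(sum(qs) / len(qs))

scale, zp = 0.05, 0
qs = [10, 20, 30, 40]           # quantized values in one pooling window
q_out = avg_pool_quantized(qs)

# with a shared scale, the integer-domain average matches the real average
real_avg = sum(dequantize(q, scale, zp) for q in qs) / len(qs)
assert abs(dequantize(q_out, scale, zp) - real_avg) < scale
```

When the converter assigns different scales to the pooling input and output, this shortcut is no longer valid, which is what the `pooling.cc:94` check rejects.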
- Residual connection
```python
input = tf.keras.Input(input_shape)
shortcut = input
x = layers.Conv2D(16, 1, padding='same', use_bias=False)(input)
x = layers.BatchNormalization()(x)
x = layers.ReLU(6.0)(x)
x = x + shortcut  # <- fails to convert: '+' is reduced to TensorFlowOpLayer, not Add
```
```
Layer tf_op_layer_AddV2:<class 'tensorflow.python.keras.engine.base_layer.TensorFlowOpLayer'> is not supported. You can quantize this layer by passing a tfmot.quantization.keras.QuantizeConfig instance to the quantize_annotate_layer API.
```
  - This problem causes the HardSwish failure below.
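One workaround that may help for the residual connection (a sketch, untested against tfmot; it assumes the rejection is specific to the implicitly created `TensorFlowOpLayer`): build the addition with the explicit `tf.keras.layers.Add` layer instead of the `+` operator, so Keras records a regular layer in the graph:

```python
import tensorflow as tf
from tensorflow.keras import layers

inp = tf.keras.Input((32, 32, 16))
shortcut = inp
x = layers.Conv2D(16, 1, padding='same', use_bias=False)(inp)
x = layers.BatchNormalization()(x)
x = layers.ReLU(6.0)(x)
# explicit Add layer instead of `x + shortcut`,
# so no TensorFlowOpLayer appears in the model
x = layers.Add()([x, shortcut])
model = tf.keras.Model(inp, x)
```

Whether the quantization annotations then pick this layer up is a separate question, but it avoids the unsupported-layer error at the graph-construction level.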
- HardSwish
```python
x = layers.Conv2D(32, 3, 2, padding='same', use_bias=False)(input)
x = layers.BatchNormalization()(x)
x = layers.ReLU(6.0)(x + 3) * (1 / 6)  # <- equivalent to `HardSwish`
```
```
Layer tf_op_layer_AddV2_1:<class 'tensorflow.python.keras.engine.base_layer.TensorFlowOpLayer'> is not supported. You can quantize this layer by passing a tfmot.quantization.keras.QuantizeConfig instance to the quantize_annotate_layer API.
```
  - There are two levels to this problem:
    - I had configured a `QuantizeConfig` to support `TensorFlowOpLayer` with the `Add` and `Multiply` ops, but since these ops sit between BN and ReLU6, the Conv2D-BN-ReLU layers could not be fused correctly -> quantized MobileNetV3 became slower than the floating-point version on an Android device.
    - The main building block of MobileNetV3, Conv2D-BN-HardSwish, is not a supported pattern.
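For reference, MobileNetV3 defines hard-swish as `x * relu6(x + 3) / 6`; a plain-Python version of the target function (a sketch for clarity, not the Keras graph):

```python
def relu6(x):
    return min(max(x, 0.0), 6.0)

def hard_swish(x):
    # MobileNetV3: hard-swish(x) = x * relu6(x + 3) / 6
    return x * relu6(x + 3.0) / 6.0

assert hard_swish(6.0) == 6.0    # acts like identity for large positive x
assert hard_swish(-4.0) == 0.0   # saturates to zero for x <= -3
```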
- GlobalAveragePooling-Dense
```python
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(1024, activation='relu')(x)  # <- succeeds to convert, fails to prepare
```
```
tensorflow/lite/kernels/kernel_util.cc:129 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale) was not true.
Node number 4 (FULLY_CONNECTED) failed to prepare.
```
  - This bug prevents me from benchmarking the official MobileNetV2 network imported from tf.keras.
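For context, the failing `kernel_util.cc:129` check enforces the TFLite convention that a bias tensor's scale equals `input_scale * weight_scale`, so the int32 bias can be added directly to the int32 accumulator of the matmul. A plain-Python sketch of that relationship (illustrative values, not TFLite code):

```python
input_scale = 0.02
weight_scale = 0.005
bias_scale = input_scale * weight_scale  # convention required by FULLY_CONNECTED

# the int32 accumulator of q_input * q_weight is already in units of
# input_scale * weight_scale, so a bias with the same scale adds directly
acc = 1234    # example accumulator value
q_bias = 500  # example quantized bias
real = (acc + q_bias) * bias_scale

# the same tolerance TFLite checks before preparing the op
assert abs(input_scale * weight_scale - bias_scale) <= \
    1e-6 * min(input_scale * weight_scale, bias_scale)
```

The error message indicates the converted model carries a bias scale for this Dense layer that does not satisfy the identity above.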
System information
TensorFlow installed from (source or binary): binary
TensorFlow version: 2.2.0 (release)
TensorFlow Model Optimization version: 0.3.0 (release)
Python version: 3.6.0
Code to reproduce the issue
Gist to reproduce full test
https://gist.github.com/kalaluthien/b270c71afb6866ae61ef0dc088a762f2