
Post-Quantization of mobilenetv2-keras model slower than your given quantized model #10349

Open
oconnor127 opened this issue Nov 5, 2021 · 1 comment


oconnor127 commented Nov 5, 2021

Hey,
I am quantizing the mobilenetv2_1.4_224 Keras model with full-integer post-training quantization. However, the runtime differs from that of your provided quantized mobilenetv2_1.4_224. Analyzing this, I found that you converted your model with TOCO, while the TFLite converter uses MLIR by default. Explicitly setting converter.experimental_new_converter = False (i.e. using TOCO) is not possible in newer TF versions (>= 2.6) because TOCO has been removed, and using an earlier TF version (< 2.6) raises TypeError: ('Keyword argument not understood:', 'keepdims').
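(For context: selecting the legacy converter, i.e. the path that no longer works as described above, would look roughly like this sketch:)

```python
import tensorflow as tf

model = tf.keras.applications.mobilenet_v2.MobileNetV2(
    input_shape=(224, 224, 3), alpha=1.4, weights='imagenet')

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.experimental_new_converter = False  # force TOCO; removed in TF >= 2.6
```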

The difference between full-integer post-training quantization following the TF guide (MLIR) and your provided quantized model is severe: your model runs in 4 ms, while the one produced per the TF guide needs 10 ms on the same hardware.
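(For context, a minimal sketch of how such latencies can be measured with the Python tf.lite.Interpreter; the model path and iteration count are placeholders:)

```python
import time
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mobilenetv2_1.4_224_quant.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Feed a dummy uint8 image matching the quantized model's input signature.
interpreter.set_tensor(inp["index"],
                       np.random.randint(0, 256, size=inp["shape"], dtype=np.uint8))
interpreter.invoke()  # warm-up run

runs = 100
start = time.perf_counter()
for _ in range(runs):
    interpreter.invoke()
print(f"mean latency: {(time.perf_counter() - start) / runs * 1000:.2f} ms")
```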

Question: Could you please give me any information on how to reproduce your quantization of the mobilenetv2?

My quantization code is basically:

```python
import tensorflow as tf
from pathlib import Path

tflite_filepath = Path("mobilenetv2_1.4_224_quant.tflite")  # output path (placeholder)

model = tf.keras.applications.mobilenet_v2.MobileNetV2(
    input_shape=(224, 224, 3), alpha=1.4, include_top=True, weights='imagenet',
    input_tensor=None, pooling=None, classes=1000,
    classifier_activation='softmax')

# Full-integer post-training quantization with the (default) MLIR converter.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.experimental_new_converter = True  # MLIR converter (the default)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen  # defined beforehand, see below
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()
tflite_filepath.write_bytes(tflite_model)
```
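The snippet above assumes representative_data_gen is already defined. A minimal sketch of one, using random inputs scaled to MobileNetV2's expected [-1, 1] range (for meaningful calibration, real preprocessed images should be yielded instead; the sample count of 100 is an arbitrary choice):

```python
import numpy as np

def representative_data_gen():
    # Yield calibration batches of shape (1, 224, 224, 3). Random data is a
    # placeholder — real images preprocessed with
    # tf.keras.applications.mobilenet_v2.preprocess_input should be used.
    for _ in range(100):
        yield [np.random.uniform(-1.0, 1.0, (1, 224, 224, 3)).astype(np.float32)]
```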
@oconnor127 oconnor127 added models:official models that come under official repository type:docs labels Nov 5, 2021
@oconnor127 oconnor127 changed the title Post-Quantization of mobilenetv2-keras model slower your given quantized model Post-Quantization of mobilenetv2-keras model slower than your given quantized model Nov 5, 2021

saberkun (Member) commented Nov 8, 2021

The OD API team may not use keras.applications; keras.applications was not released by the original MobileNet team, so there might be something it misses. It would be good to ask the Keras team.
@tombstone for the MobileNet used by the OD API.

@saberkun saberkun added models:research:odapi ODAPI and removed models:official models that come under official repository labels Nov 8, 2021