Model conversion error #8

Closed
metanav opened this issue May 29, 2020 · 33 comments
Labels: enhancement (New feature or request)

@metanav commented May 29, 2020

After running download.sh, I am trying to convert face_detection_front.pb to a quantized TFLite model. I have tried all the scripts in the 30_BlazeFace/01_float32 directory, but they all fail with the following error:

ValueError: This converter can only convert a single ConcreteFunction. Converting multiple functions is under development.

I am using TensorFlow 2.2.0 on macOS. I also tried a Linux machine with 2.1.0 and 2.2.0.

NOTE: I am trying to rebuild the quantized model to run on a microcontroller. The quantized BlazeFace model provided by download.sh in your repo fails on the microcontroller with: "Didn't find op for builtin opcode 'CONV_2D' version '3'" or "Didn't find op for builtin opcode 'QUANTIZE' version '2'".
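
For reference, on TF 2.x this ConcreteFunction error can sometimes be worked around by selecting a single signature before conversion. A minimal sketch, where the saved_model path and the signature key are assumptions:

```python
import tensorflow as tf

# The TF 2.2 converter refuses a SavedModel that exposes several
# ConcreteFunctions, so load it and pick one signature explicitly.
loaded = tf.saved_model.load('saved_model')           # assumed path
concrete_func = loaded.signatures['serving_default']  # assumed signature key
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
converter.optimizations = [tf.lite.Optimize.DEFAULT]
open('face_detection_front_quant.tflite', 'wb').write(converter.convert())
```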

@PINTO0309 (Owner)

Please try using TensorFlow v1.15.2.
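
A v1.15-style conversion from the frozen graph might look like the sketch below; the input/output tensor names and the shape are assumptions based on the 128x128 front model:

```python
import tensorflow as tf  # TensorFlow 1.15.x

# Tensor names and shape below are assumptions for face_detection_front.pb.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    'face_detection_front.pb',
    input_arrays=['input'],
    output_arrays=['classificators', 'regressors'],
    input_shapes={'input': [1, 128, 128, 3]})
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # weight quantization
open('face_detection_front_128_weight_quant.tflite', 'wb').write(
    converter.convert())
```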

@metanav (Author) commented May 30, 2020

Thanks! It works, but I still get "Didn't find op for builtin opcode 'QUANTIZE' version '2'" while allocating tensors on the microcontroller. Maybe some ops are missing.

@PINTO0309 (Owner)

I think it's just that the op implementation on the TF Lite Micro side doesn't support that version; the registration below pins /*version=*/0, so a model that requires QUANTIZE version 2 cannot be resolved. There is nothing wrong with the conversion script itself.
https://github.com/tensorflow/tensorflow/blob/3ebdec47df6c1afaa31ea6f0aa980f0f90729e87/tensorflow/lite/micro/kernels/quantize.cc#L156-L169

```cpp
// This Op (QUANTIZE) quantizes the input and produces quantized output.
// AffineQuantize takes scale and zero point and quantizes the float value to
// quantized output, in int8 or uint8 format.
TfLiteRegistration* Register_QUANTIZE() {
  static TfLiteRegistration r = {/*init=*/quantize::Init,
                                 /*free=*/nullptr,
                                 /*prepare=*/quantize::Prepare,
                                 /*invoke=*/quantize::Eval,
                                 /*profiling_string=*/nullptr,
                                 /*builtin_code=*/0,
                                 /*custom_name=*/nullptr,
                                 /*version=*/0};
  return &r;
}
```

@metanav (Author) commented May 30, 2020

Yes, I used TF 2.2.0 for the conversion. But the models generated by 03_interger_quantization.py and 04_full_integer_quantization.py fail with the same error.

@metanav (Author) commented May 30, 2020

Also, I am using the TF Lite Micro library generated from the TF 2.2.0 branch. Maybe it is a TF Lite Micro issue.

@PINTO0309 (Owner)

I empathize with you.

@metanav (Author) commented May 30, 2020

Thanks for your understanding. I have been struggling with this issue for three days.

@metanav (Author) commented May 30, 2020

Yes, I have seen that before. But I just saw that TF Lite Micro pushed new changes an hour ago: tensorflow/tensorflow@3ebdec4

I will try with the latest changes to see if the issue is resolved.

@PINTO0309 (Owner)

It's very interesting!!

@metanav (Author) commented May 30, 2020

With the latest changes, the error is gone! Thanks for your help, I appreciate it. But now I am facing another issue, related to the size of the model: the full-integer quantized model needs a little over 500 KB of RAM (tensor allocation, weights/activations), but the microcontroller has only 512 KB. Is there any way to shrink BlazeFace further?

@PINTO0309 (Owner)

It is a very difficult problem. The following is a model pruning project, but it is still under development, and I could only find a Keras sample. I haven't tried it yet, so I'm not sure whether it will help; a rough sketch of the API is below.
https://github.com/tensorflow/model-optimization.git
https://blog.tensorflow.org/2019/05/tf-model-optimization-toolkit-pruning-API.html
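
A minimal sketch of that pruning API, assuming the network is available as a tf.keras.Model (which turns out to be the sticking point below). The tiny model and the random training data are stand-ins:

```python
import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Tiny stand-in model; BlazeFace would go here if it loaded as a Keras model.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation='relu', input_shape=(128, 128, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),
])

# Wrap the model so that 50% of the weights are zeroed during fine-tuning.
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.ConstantSparsity(0.5, begin_step=0)
}
pruned = tfmot.sparsity.keras.prune_low_magnitude(model, **pruning_params)
pruned.compile(optimizer='adam', loss='mse')
pruned.fit(np.random.rand(8, 128, 128, 3), np.random.rand(8, 1), epochs=1,
           callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# Remove the pruning wrappers before export; the zeroed weights remain.
final_model = tfmot.sparsity.keras.strip_pruning(pruned)
```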

@metanav (Author) commented May 30, 2020

Thanks for the clue! I guess I need to load the .pb saved model into TF Keras.

@PINTO0309 (Owner)

For example, there are model conversion tools like this one.
https://github.com/microsoft/MMdnn.git

@metanav (Author) commented May 30, 2020

I am trying to use model-optimization. I was able to load the model into Keras, but I can't see the model summary.

```python
>>> import tensorflow as tf
>>> from tensorflow import keras
>>> print(tf.version.VERSION)
2.2.0
>>> model = tf.keras.models.load_model('saved_model')
>>> model
<tensorflow.python.training.tracking.tracking.AutoTrackable object at 0x101726890>
>>> model.summary()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'AutoTrackable' object has no attribute 'summary'
```
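
The loaded object is a plain SavedModel wrapper rather than a Keras model, so it can still be inspected and called through its signatures; a sketch, where the signature key is an assumption:

```python
>>> infer = model.signatures['serving_default']  # assumed signature key
>>> infer.structured_input_signature             # inspect expected inputs
>>> infer.structured_outputs                     # inspect outputs
```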

@PINTO0309 (Owner) commented May 30, 2020

I see the same phenomenon. It may be because I used tf.compat.v1.saved_model.simple_save when I first exported the model to the saved_model format. I may need to revisit the initial steps and generate the saved_model with the v2.x API, but I don't see the answer right now.

```python
>>> model = tf.keras.models.load_model('saved_model')
>>> model.
model.asset_paths             model.initializer             model.signatures              model.variables
model.graph                   model.prune(                  model.tensorflow_git_version
model.graph_debug_info        model.restore(                model.tensorflow_version
```
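
A sketch of what a v2-style export might look like once the network is rebuilt as a Keras model; build_blazeface_keras_model is a hypothetical builder function:

```python
import tensorflow as tf

# Hypothetical: rebuild the network as a tf.keras.Model, then export with
# the v2 API so the Keras metadata survives the round trip.
model = build_blazeface_keras_model()           # hypothetical builder
model.save('saved_model_v2', save_format='tf')  # Keras-aware SavedModel

reloaded = tf.keras.models.load_model('saved_model_v2')
reloaded.summary()  # works, unlike a simple_save() export
```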

@metanav (Author) commented May 31, 2020

Yes, it would be good if we could reduce the model size further for TFLite Micro.
Also, there is no saved model for BlazeFace available from the MediaPipe repo.

@metanav (Author) commented May 31, 2020

The quantized model is only around 50% of the 32-bit float model. Ideally, it should be reduced by 70-75%.
face_detection_front.tflite: 413K
face_detection_front_128_full_integer_quant.tflite: 197K

If we can achieve at least a 70% reduction, it would be really great. I wonder whether it is related to the TF1-to-TF2 conversion?

@PINTO0309 (Owner) commented May 31, 2020

> Ideally, it should be reduced by 70-75%.

It may be difficult to reduce the size to that level, even with pruning. The models I have generated are as follows.

  1. face_detection_front_128_weight_quant.tflite 163.8KB download here
  2. face_detection_front_128_integer_quant.tflite 201.6KB download here

Is the following what you mean by a 70% reduction?

  1. face_detection_front_128_weight_quant.tflite 49.14KB (70% reduced)
  2. face_detection_front_128_integer_quant.tflite 60.48KB (70% reduced)

> I wonder whether it is related to the TF1-to-TF2 conversion?

My script that generates the saved_model uses the v1 API. However, a saved_model generated by the v1 API seems to lack some of the Keras metadata, for backward-compatibility reasons. If I rewrite my .tflite-to-saved_model conversion script on a v2 base, I might be able to avoid the "no attribute" error.

@metanav (Author) commented May 31, 2020

I mean a 70% reduction from the default 32-bit float TFLite model.

For full integer quantization,

Currently: 413K --> 197K
Ideally:   413K --> 124K

Please see here for reference:
https://www.tensorflow.org/lite/performance/model_optimization#quantization
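
For reference, the standard post-training full-integer quantization recipe looks like the sketch below; the random calibration data is a stand-in for real face images, and the saved_model path is an assumption:

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Stand-in calibration data; real calibration should use face images.
    for _ in range(100):
        yield [np.random.rand(1, 128, 128, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model('saved_model')  # assumed
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict to int8 builtins so every op is quantized end to end.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
open('face_detection_front_128_full_integer_quant.tflite', 'wb').write(
    converter.convert())
```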

@PINTO0309 (Owner)

I understand. In a normal model, quantization reduces the size to about a quarter. However, BlazeFace is so small to begin with that the size-reduction effect after quantization appears smaller, presumably because the fixed overhead (graph structure and quantization parameters) accounts for a larger share of the file.

@metanav (Author) commented May 31, 2020

The MediaPipe BlazeFace model is 224 KB, available here: https://github.com/google/mediapipe/blob/master/mediapipe/models/face_detection_front.tflite.
I believe this model is FP16, so it is around half the size of FP32. My assumption may not be correct, but Int8 quantization should reduce the size further. Where did you get the FP32 TFLite model? I could not find it in the MediaPipe GitHub repo.
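
For comparison, a float16 post-training quantization along these lines produces a file of roughly that size (a sketch; the saved_model path is an assumption):

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('saved_model')  # assumed
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]  # store weights as fp16
open('face_detection_front_fp16.tflite', 'wb').write(converter.convert())
```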

@metanav (Author) commented May 31, 2020

Also, here is an example that shows a very simple model conversion:
https://www.tensorflow.org/lite/performance/post_training_integer_quant
83424 bytes --> 23368 bytes (about a 72% reduction)
So I guess the original size of the model should not matter.

@PINTO0309 (Owner)

The .tflite at the link you sent me is the Float16 model. It's about the same size as the Float16 model I produced.

> Where did you get the FP32 TFLite model? I could not find it in the MediaPipe GitHub repo.

The model on MediaPipe's official site was updated from Float32 to Float16 ten days ago. The Float32 model I used is the one I downloaded fifteen days ago.

> Also, here is an example that shows a very simple model conversion,

I'm not sure about the internal behavior of the TensorFlow Lite Converter, as I'm performing the same procedure to quantize it.

@metanav (Author) commented May 31, 2020

As per the documentation, pruning may not reduce the RAM requirement of the model; it only reduces the model's size on disk (via compression). It would be interesting to see whether the size can be reduced further after you change your script to TF v2. Thanks for your time!
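
One quick way to check that is to compare the raw and gzip-compressed sizes of a pruned .tflite; the filename here is hypothetical:

```python
import gzip
import os

path = 'face_detection_front_pruned.tflite'  # hypothetical pruned model
raw = os.path.getsize(path)
with open(path, 'rb') as f:
    compressed = len(gzip.compress(f.read()))
# Zeroed weights compress well, but the interpreter still allocates the
# full dense tensors at runtime, so RAM usage is unchanged.
print(f'raw: {raw} bytes, gzipped: {compressed} bytes')
```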

@PINTO0309 (Owner) commented May 31, 2020

A memo for later:
https://github.com/minus31/BlazeFace

It may be possible to train the model from scratch with pruning enabled in Keras, along the lines of the sketch below.
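
A variation on the earlier pruning sketch for training from scratch, ramping sparsity up over the run; the schedule values are assumptions:

```python
import tensorflow_model_optimization as tfmot

# Ramp sparsity from 0% to 80% over the first 10,000 training steps
# instead of pruning a converged model (values are assumptions).
schedule = tfmot.sparsity.keras.PolynomialDecay(
    initial_sparsity=0.0, final_sparsity=0.8,
    begin_step=0, end_step=10000)

# `model` is the freshly built Keras BlazeFace, as in the earlier sketch.
pruned_model = tfmot.sparsity.keras.prune_low_magnitude(
    model, pruning_schedule=schedule)
```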

PINTO0309 added the enhancement (New feature or request) label on Jun 1, 2020
@metanav (Author) commented Jun 2, 2020

I am also thinking of training it from scratch. Is the training data available somewhere? Or the schema of the training data, so that I can prepare my own?

@PINTO0309 (Owner)

The six keypoints (eyes, nose, ears, and mouth) do not seem to be detectable, but a training script using the FDDB dataset is available. I'm exhausted from my day job, so I'll try it tomorrow.
https://github.com/vietanhdev/blazeface_keras

@PINTO0309 (Owner)

I have started training BlazeFace with Keras. I'm going to try pruning after the 160 epochs are done.
[Screenshot 2020-06-03 23:55:29]

@metanav (Author) commented Jun 4, 2020

Great! It would be interesting to see the final model size.

@PINTO0309 (Owner)

Pruning doesn't seem to work. 😢
[Screenshot 2020-06-04 23:39:49]

[Screenshot 2020-06-04 23:36:42]

@metanav (Author) commented Jun 6, 2020

That's sad. But I think pruning won't help with RAM usage; it would only make the model smaller for storage.
