
How to deploy my own .tflite file #20

Open
caiya55 opened this issue May 7, 2019 · 40 comments
Labels: Getting Started · help wanted · Tutorial


@caiya55

caiya55 commented May 7, 2019

  • Doc you were trying to follow: coral tflite file
  • Your host OS: Ubuntu
  • Your Python3 version: python 3.6

Now I have my own compiled .tflite model; the file has already passed the "Compile your model for the Edge TPU" step, and visualize.py shows that the operators have all become UINT8 type. Now I'd like to know how to deploy the model and make it run on the TPU. The Python API on the official website provides two engines, one is edgetpu.classification.engine and the other is edgetpu.detection.engine, but my model is the open-pose (human pose estimation) model, so the output is different.

Is anyone working on deploying their own model? I would appreciate it if someone could give me some clues.

@caiya55

caiya55 commented May 8, 2019

I found a solution. It is not working very well yet, but I will keep updating it in this issue.

If you want to deploy your own tflite model on the Coral, you basically need to follow these steps:

  1. Convert your .pb file to a .tflite file, make sure it passes the Edge TPU Model Compiler, and download the compiled .tflite file; this is your edgetpu .tflite file.
  2. Upload the model to the device with the mdt push command (scp also works).
  3. In your Python script, use the official Python API to load the model.
    My code looks like this:
from edgetpu.basic.basic_engine import BasicEngine
import numpy as np
from PIL import Image

image_path = "./images/p1.jpg"
model_path = 'open_pose_tflite'
target_size=(432, 368)
output_size = [54,46,57]
'''load the image'''
image = Image.open(image_path)
image = image.resize(target_size, Image.ANTIALIAS)
image = np.array(image).flatten()

'''load the model'''
engine = BasicEngine(model_path)
result = engine.RunInference(input_tensor=image)
process_time = result[0]
my_model_output = result[1].reshape(output_size)

@caiya55

caiya55 commented May 8, 2019

Basically, this works if you convert your model correctly, but I am getting weird output now. One thing that really bothers me is that the output type in my .tflite file is UINT8, whereas it is FLOAT32 in the .pb file.

I am still working on it. According to the official tflite models, their output is FLOAT32, so I am checking the conversion procedure; maybe there are some tricks.

Also, you can use tensorflow/lite/tools/visualize.py to check the tensors of your tflite model. It is very convenient.
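As an alternative to visualize.py, the TensorFlow Lite Python interpreter can also report the output tensor's type and quantization parameters, and those parameters are exactly what is needed to map the UINT8 values back to floats. A minimal sketch (the model file name is a placeholder, and the zero-filled array only stands in for a real inference result):

import numpy as np
import tensorflow as tf

# Hypothetical file name: the quantized .tflite before Edge TPU compilation.
interpreter = tf.lite.Interpreter(model_path='open_pose_quant.tflite')

out_detail = interpreter.get_output_details()[0]
print(out_detail['dtype'])                      # e.g. numpy.uint8
scale, zero_point = out_detail['quantization']

# Dequantize a raw UINT8 output vector (e.g. what BasicEngine.RunInference returns).
raw = np.zeros(out_detail['shape'], dtype=np.uint8)   # placeholder for the real output
dequantized = (raw.astype(np.float32) - zero_point) * scale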

@caiya55

caiya55 commented May 9, 2019

I found a good tool called Netron. It is a very powerful piece of software for visualizing your model and finding the names of the inputs and outputs. It is more convenient than visualize.py, so I recommend Netron now.
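If you installed the netron pip package, it can also be launched from Python; a minimal sketch (the model file name is a placeholder):

import netron

# Opens an interactive graph view of the model in the browser.
netron.start('open_pose.tflite')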

@Charlesl0129

Hi @caiya55, have you solved the problem yet? I'm in a similar situation, trying to get pose estimation running,
but model.RunInference gives a segmentation fault...

@caiya55

caiya55 commented May 18, 2019

Yes, the UINT8 output problem is solved: it seems the official API converts your UINT8 output into float32, so in the end you get float32 output. Here is the source code from the Coral support team:

const auto& output_indices = interpreter->outputs();
  const int num_outputs = output_indices.size();
  int out_idx = 0;
  for (int i = 0; i < num_outputs; ++i) {
    const auto* out_tensor = interpreter->tensor(output_indices[i]);
    CHECK(out_tensor);
    if (out_tensor->type == kTfLiteUInt8) {
      const int num_values = out_tensor->bytes;
      const uint8_t* output = interpreter->typed_output_tensor<uint8_t>(i);
      CHECK(output);
      for (int j = 0; j < num_values; ++j) {
        output_data[out_idx++] = (output[j] - out_tensor->params.zero_point) *
                                 out_tensor->params.scale;
      }
    } else if (out_tensor->type == kTfLiteFloat32) {
      const int num_values = out_tensor->bytes / sizeof(float);
      const float* output = interpreter->typed_output_tensor<float>(i);
      CHECK(output);
      for (int j = 0; j < num_values; ++j) {
        output_data[out_idx++] = output[j];
      }
    } else {
      LOG(FATAL) << "Tensor " << out_tensor->name
                 << " has unsupported output type: " << out_tensor->type;
    }
    CHECK_LE(out_idx, output_size);
  }

After that I got a float32 output, but it was not correct, and I quickly found the reason. Since the opt.pb provided by the open-pose GitHub project is not a quantization-aware-training model, you have to set converter.default_ranges_stats when you convert the .pb file into the .tflite file: min/max ranges are required, and the correct ones are missing from the original .pb file. I tried different ranges, such as (0, 6), (-1, 2), and others, but I couldn't find a range that gets the final result even close to the correct one. If you can find one, please let me know. The recommended way is to retrain the open-pose estimation model with the quantization-aware training flag set to 1.
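For reference, the post-training path described above looks roughly like this with the TF 1.x converter. Everything here is a placeholder (paths, tensor names, input shape, and the (0, 6) range), and as noted the guessed range is exactly what makes the result unreliable:

import tensorflow as tf  # TF 1.x-style converter

converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    'opt.pb',                                    # hypothetical frozen graph
    input_arrays=['image'],
    output_arrays=['Openpose/concat_stage7'],
    input_shapes={'image': [1, 432, 368, 3]})
converter.inference_type = tf.uint8
# (mean, std) used to map the uint8 input back to the float range the graph expects
converter.quantized_input_stats = {'image': (128.0, 128.0)}
# guessed min/max for activations with no recorded ranges -- the weak point
converter.default_ranges_stats = (0, 6)

tflite_model = converter.convert()
open('open_pose_quant.tflite', 'wb').write(tflite_model)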

As for the segmentation fault, sorry, I haven't hit that error before. Do you use the open-pose model from here? Which model do you use, and does your model pass the "Compile your model for the Edge TPU" step?

Anyway, I'm glad there is someone working on a similar thing. Let's keep sharing information.

@mbcel

mbcel commented Jun 9, 2019

I am also having difficulties getting my own model to run on the Edge TPU. I am using the local Ubuntu compiler to compile my .tflite file. However, I realized that the compilation does not work if my image input size is too large: the compilation just hangs and gives an internal error without further specifying what error actually occurred. (Do you get any further errors during compilation, or just a general internal error?)

Did you try a bigger input size with your net (e.g. double your currently used image size), and did it still compile? I feel like there is a limit on the maximum number of nodes/calculations that can be done in one layer, and if that is exceeded it simply does not compile without telling me what is actually wrong. Do you maybe know more about that? And what inference times do you get right now with your current image size?

Regarding your problem, I think you should definitely retrain your model with quantization-aware training, since post-training quantization is really not straightforward here.

Good to know I am not the only one having difficulties :)

@AlbertoLanaro

@caiya55 did you use the model from here? In that case, which model did you use? Have you tried to retrain the model with quantization aware training?

Thank you.

@caiya55

caiya55 commented Jun 13, 2019

Yes, I got the model from ildoonet/tf-pose-estimation. I already have a quantization-aware-training model based on MobileNet v2. The training script is available in tf_pose/train.py; you can change the quantization flag there. After that, you can run run_checkpoint.py to obtain eval_graph.pb, then freeze it and convert it to tflite. I get an unsupported-op error when I try to convert the frozen .pb file into a .tflite file. I am still working on it. If anyone has hit similar unsupported ops (Cast, Size, or FusedBatchNormV3) during quantization, please share the information.
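In TF 1.x the quantization-aware rewrite is usually done with the contrib quantize API; below is a minimal, self-contained sketch with a toy two-layer network standing in for the real graph (purely illustrative, not the tf-pose-estimation training code):

import tensorflow as tf  # TF 1.x with tf.contrib available

graph = tf.Graph()
with graph.as_default():
    # Toy stand-in for the real network.
    image = tf.placeholder(tf.float32, [1, 368, 432, 3], name='image')
    net = tf.layers.conv2d(image, 16, 3, padding='same', activation=tf.nn.relu)
    heatmaps = tf.layers.conv2d(net, 57, 1, padding='same', name='heatmaps')

    # Insert fake-quant ops so min/max ranges are recorded during training.
    tf.contrib.quantize.create_training_graph(input_graph=graph, quant_delay=0)
    # ... train and save a checkpoint ...

# For export: rebuild the same network in a fresh graph, call
# tf.contrib.quantize.create_eval_graph(input_graph=eval_graph), restore the
# checkpoint, write eval_graph.pb, then freeze it and convert to .tflite.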

@AlbertoLanaro

Thank you @caiya55. I'm trying to train the model with quantization aware training but I'm having some issues with the training.py script. Can you please share your quantization aware trained model? Thank you!

@caiya55

caiya55 commented Jun 13, 2019

Do you have wechat?

@YaroslavSchubert

Check out this thread about quantization-aware training and the new tool for post-training quantization, might be useful for you
tensorflow/tensorflow#27880 (comment)

@Mxgra

Mxgra commented Sep 9, 2019

> [quoted @mbcel's comment from Jun 9, 2019 above]

Heyo marcel1991,

did you make any progress, or could you confirm that it was the model's input size that caused your problems? What input size were you working with?
I'm currently trying to build a keyword recognizer and am running into the unspecified internal error too. But when I check the input size of a MobileNet .tflite model, it is 224x224x3, which is way bigger than the arrays I'm working with (as long as I don't horribly confuse two things here; I had a two-month study break during this project).

Many thanks for an answer!
Best regards,
Max

@Namburger

Hi @caiya55, hope you figured out your issues.
For future reference, the quantization step that you did to convert the graph.pb model to a tflite model performs optimization on it, so getting UINT8 output from FLOAT32 is correct.
As for compiling models, the edgetpu_compiler can now be downloaded and installed directly on a host machine, so you can just compile your tflite model and scp it to your board. Here is a link to install the compiler:
https://coral.withgoogle.com/docs/edgetpu/compiler/#download

As for running inference, you can just use one of the open-source demo scripts and modify it to your needs:
https://coral.googlesource.com/edgetpu/+/refs/heads/release-chef/edgetpu/demo/

Hope this helps

@jk78346

jk78346 commented Oct 11, 2019

I tried to convert a MobileNetV2 model into a post-training-quantized tflite model, and I got the following message:

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 70
Number of operations that will run on CPU: 2

Operator                       Count      Status

ADD                            10         Mapped to Edge TPU
PAD                            5          Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation
CONV_2D                        35         Mapped to Edge TPU
DEPTHWISE_CONV_2D              17         Mapped to Edge TPU
DEQUANTIZE                     1          Operation is working on an unsupported data type
MEAN                           1          Mapped to Edge TPU
FULLY_CONNECTED                1          Mapped to Edge TPU
SOFTMAX                        1          Mapped to Edge TPU

I can't figure out why only the QUANTIZE and DEQUANTIZE operations are not supported on the Coral Dev Board.

Here is my Python API code:

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
quant_model = converter.convert()

Maybe it doesn't matter, but later, when I run it on the Edge TPU using the label_image.py example code, it shows the following error:

RuntimeError: Encountered unresolved custom op: edgetpu-custom-op.Node number 1 (edgetpu-custom-op) failed to prepare.

Does anyone have a clue which of my steps is problematic? Thanks.
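For what it's worth, that "edgetpu-custom-op ... failed to prepare" error usually shows up when the compiled model is loaded with a plain TensorFlow Lite interpreter instead of the Edge TPU runtime. A minimal sketch of loading it with the libedgetpu delegate (the model file name is a placeholder):

import tensorflow as tf

# libedgetpu is the runtime shipped with the Dev Board / USB Accelerator.
delegate = tf.lite.experimental.load_delegate('libedgetpu.so.1')
interpreter = tf.lite.Interpreter(
    model_path='model_quant_edgetpu.tflite',     # hypothetical compiled model
    experimental_delegates=[delegate])
interpreter.allocate_tensors()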

@Namburger

@jk78346 what compiler version are you running?
Have you tried running the visualize tool on your model to see whether it meets all the requirements?

@jk78346

jk78346 commented Oct 11, 2019


Hi, my edgetpu compiler version is 2.0.267685300.
Also, I used Netron to visualize the model, and it looks like this:
[screenshot of the model in Netron]

@Namburger

Looks like you're up to date on your compiler.
Could you try this one, actually?
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tools/visualize.py
The reason I ask is that its output actually shows all the ops and tensor inputs.

@jk78346

jk78346 commented Oct 11, 2019


Yes, I got the foo.html file, and it seems to be the same situation: there is only an edgetpu-custom-op in between, with no other detailed layers displayed.

I think my question is:
What is the output type of converter.convert()?
It gives me "bytes", and so far I don't know whether this is a valid model to run.

@Namburger

@jk78346 the bytes type is correct for the model; I'm saving mine in this manner:

from pathlib import Path

tflite_model = converter.convert()
out = Path('/path/to/save/model.tflite')
out.write_bytes(tflite_model)

Does foo.html show any float32 or other unusual types in the "type" column?
All the steps you've taken so far look correct to me. How did you train your model?

@jk78346

jk78346 commented Oct 11, 2019


Yes it has float32, hmm.

0 Identity_int8 INT8 [1, 1000] 0 {'zero_point': [-128], 'details_type': 'NONE', 'scale': [0.003906], 'quantized_dimension': 0}
1 input_1_int8 INT8 [1, 224, 224, 3] 0 {'zero_point': [-128], 'details_type': 'NONE', 'scale': [0.003922], 'quantized_dimension': 0}
2 input_1 FLOAT32 [1, 224, 224, 3] 0 {'details_type': 'NONE', 'quantized_dimension': 0}
3 Identity FLOAT32 [1, 1000] 0 {'details_type': 'NONE', 'quantized_dimension': 0}

And I just use this model:

model = tf.keras.applications.MobileNetV2(
    weights="imagenet", input_shape=(224, 224, 3))

So I think I tried to follow this example here and still got the same situation: I can't get rid of the float32 operations.

@jk78346

jk78346 commented Oct 15, 2019

After some searching, I'm not sure whether

converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

actually produce a quantized input/output type, because right now every intermediate layer seems OK except the input/output layers.

@xadrianzetx

I've got a similar problem to @jk78346's.

After converting with the flags

converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

using the latest nightly, tf.lite.Interpreter().get_input_details() yields:

[{'dtype': numpy.float32,
  'index': 302,
  'name': 'input_1',
  'quantization': (0.0, 0),
  'shape': array([  1, 512, 512,   3], dtype=int32)}]

and tf.lite.Interpreter().get_output_details():

[{'dtype': numpy.float32,
  'index': 303,
  'name': 'Identity',
  'quantization': (0.0, 0),
  'shape': array([  1, 512, 512,   1], dtype=int32)}]

However, the Edge TPU compiler still succeeds, with this log:

Edge TPU Compiler version 2.0.267685300
Input: MobileUNetV2.tflite
Output: MobileUNetV2_edgetpu.tflite

Operator                       Count      Status

CONV_2D                        46         Mapped to Edge TPU
CONV_2D                        7          More than one subgraph is not supported
DEPTHWISE_CONV_2D              29         Mapped to Edge TPU
DEPTHWISE_CONV_2D              7          More than one subgraph is not supported
DEQUANTIZE                     1          Operation is working on an unsupported data type
LOGISTIC                       1          More than one subgraph is not supported
ADD                            14         Mapped to Edge TPU
ADD                            2          More than one subgraph is not supported
RESIZE_NEAREST_NEIGHBOR        2          Mapped to Edge TPU
RESIZE_NEAREST_NEIGHBOR        3          Operation is otherwise supported, but not mapped due to some unspecified limitation
PAD                            5          Mapped to Edge TPU
QUANTIZE                       4          Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation
CONCATENATION                  2          Mapped to Edge TPU
CONCATENATION                  2          More than one subgraph is not supported

But I still get a RuntimeError on the Edge TPU:

RuntimeError: Internal: :68 tf_lite_type != kTfLiteUInt8 (9 != 3)Node number 1 (edgetpu-custom-op) failed to prepare.
Failed to allocate tensors.

@mapeima

mapeima commented Nov 15, 2019

I have found here that TensorFlow 2.0 supports only float input/output. That is why my simple MNIST test model compiles like this:

Input: mnist_post_quant_model_io.tflite
Output: mnist_post_quant_model_io_edgetpu.tflite

Operator                       Count      Status

DEQUANTIZE                     1          Operation is working on an unsupported data type
SOFTMAX                        1          Mapped to Edge TPU
FULLY_CONNECTED                2          Mapped to Edge TPU
QUANTIZE                       1          Operation is otherwise supported, but not mapped due to some unspecified limitation

The solution is to downgrade to version 1.15.

@jk78346

jk78346 commented Nov 15, 2019

For preparing the .tflite model, I use TF 1.13.1 as well as tflite_convert from the command line; for running, TF 2.0 is used, since I want to use the delegate.

@sdu2011

sdu2011 commented Dec 12, 2019

> [quoted @caiya55's May 18, 2019 comment above, including the Coral dequantization code]

Are you Chinese, by any chance? Did this code work for you without problems? I have a quantization-aware-training model that I converted to tflite. The float/int mapping should be zero_mean = 128, scale = 1/128, and the code I wrote myself to load the tflite for classification works fine. But for some reason, when I run inference on Coral with the compiled edgetpu.tflite and this part of the code:

>       for (int j = 0; j < num_values; ++j) {
>         output_data[out_idx++] = (output[j] - out_tensor->params.zero_point) *
>                                  out_tensor->params.scale;

I added some prints and found that out_tensor->params.zero_point = 0 and out_tensor->params.scale = 1/255.

@Eashwar93

Eashwar93 commented Apr 22, 2020

@caiya55 Were you able to successfully convert the openpose model for the Edge TPU using the edgetpu compiler? I am currently trying to do it, but the edgetpu compiler just aborts without any debug info. I downloaded the frozen graph from the same source as you (ildoonet/tf-pose-estimation). I also created a new thread regarding the issue.

If you have converted it successfully, please let me know what I did wrong. It would also be nice if you could share the converted model, in case you are unable to find the issue in my code.

@caiya55

caiya55 commented Apr 23, 2020

@Eashwar93 Thanks for asking. I spent several months last year trying to convert openpose for Coral, and I finally succeeded. But the problem is that openpose is too slow on Coral because of its pose post-processing step. The reason is obvious: Coral's TPU is good, but its CPU is not very powerful. The speed I measured is around 1.5 s/frame, of which the post-processing step takes 1.3 s.

My next step was to rewrite the post-processing step in C++ to improve the speed. I didn't continue with this task because, at that time, the Google Coral team had just released its own pose estimation model for Coral, which is easy to use and almost real-time (7-9 frames/s). So we just use their model. You can easily find it on the official Google Coral website.

If you are still interested in converting the openpose model for Coral, my experience is: use a Keras model! I found that a Keras model passes the TPU compiler more easily and with fewer errors when I use tf.compat.v1.lite.TFLiteConverter.from_keras_model_file. So, in general, my steps are: download the ckpt files, build a Keras model, load the ckpt parameters into each layer of the Keras model, and then save the model to an .h5 file. Trust me, compared with quantization-aware training of openpose from scratch, this is the easier way. Finally, you can convert the .h5 model to a .tflite file.

If you are interested, you can check my Colab file here; using a Colab notebook for the conversion is a good way to avoid configuration and version-mismatch errors.
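For anyone following that route, a rough sketch of the final .h5-to-.tflite step (file names, input shape, and the random calibration data are placeholders, not the actual contents of the Colab):

import numpy as np
import tensorflow as tf

def representative_dataset_gen():
    # Placeholder calibration data; real preprocessed images should be used here.
    for _ in range(100):
        yield [np.random.random((1, 432, 368, 3)).astype(np.float32)]

converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file('open_pose_keras.h5')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open('open_pose_keras_quant.tflite', 'wb') as f:
    f.write(converter.convert())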

@Eashwar93

@caiya55 Thanks a lot for your detailed explanation. I tried Google's PoseNet, but I think the accuracy is not great enough for a real-world robot to use it, so I wanted to evaluate whether openpose offers an appropriate solution. Thanks for the heads-up that openpose is slow on the Edge TPU. I think my next step would be to write the pose inference in C++ and probably try to find out whether there is a way to do the post-processing on the GPU, if that would make it real-time.

I will try out your convert-from-Keras-file Colab notebook to see whether I'm able to convert it to an edgetpu model.

Once again thanks a lot.

@caiya55

caiya55 commented Apr 23, 2020 via email

@Eashwar93

@caiya55 Thanks, but unfortunately I don't use WeChat, and even if I created an account I don't have a friend here to verify it through the QR-code scanning procedure. Thanks for your help, it means a lot.

@Ekta246

Ekta246 commented May 17, 2020

> [quoted the original issue description above]

Hi @caiya55, I would like to know how you were able to convert your .pb model to a .tflite model. I want to run inference for EfficientDet on the Google Coral.

@Eashwar93

Eashwar93 commented May 17, 2020

@Ekta246

You can use the code below, which worked for my model, to convert it to a tflite model.

import tensorflow as tf
import numpy as np

def representative_dataset_gen():
    for _ in range(100):
        fake_image = np.random.random((1, 432, 368, 3)).astype(np.float32)
        yield [fake_image]

graph_pb = 'graph_freeze.pb'
inp = ['image']
out = ['Openpose/concat_stage7']
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_pb, inp, out, input_shapes={"image": [1, 432, 368, 3]})
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

f = open("tflite_model/mobilenet_thin_openpose_opt_fullint_tf1.tflite", "wb")
f.write(tflite_model)
f.close()
print("conversion complete")

You need to modify the code according to your model: change the graph_pb, inp, out, and input_shapes arguments passed to TFLiteConverter.from_frozen_graph().
But you can't run the .tflite model directly on a Coral Edge TPU; you will have to compile it with the edgetpu compiler as shown here.

@Ekta246

Ekta246 commented May 21, 2020

> [quoted @caiya55's May 8, 2019 comment above]

First, which version of the TensorFlow converter did you use for converting the graph to the .tflite model?
Second, did you happen to choose the full-integer post-training quantization option mentioned at https://www.tensorflow.org/lite/performance/post_training_quantization?

@Ekta246

Ekta246 commented May 21, 2020

> [quoted @Eashwar93's conversion code from the comment above]

@Eashwar93 thanks for your quick response!
First, I converted the saved_model to .tflite instead of using the graph.pb, and I did convert it to a quantized .tflite file.
But after passing it through the edgetpu_compiler, it shows "Invalid model. Model not quantized".
Any help with this?

@Eashwar93

@Ekta246 Could you share the code that you used to convert the saved_model file to a quantised .tflite model?

@Ekta246

Ekta246 commented May 21, 2020

@Ekta246 Could you share the codes that you used to convert the saved_model file to a quantised .tflite model?

Yes, why not?

import tensorflow as tf
import numpy as np

saved_model_dir = './savedmodeldir'
saved_model_obj = tf.saved_model.load(export_dir=saved_model_dir)
print(saved_model_obj.signatures.keys())

# Load the specific concrete function from the SavedModel.
concrete_func = saved_model_obj.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]

# Set the shape of the input in the concrete function.
concrete_func.inputs[0].set_shape([1, 512, 512, 3])

# Convert the model to a TFLite model.
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.experimental_new_converter = True

num_calibration_steps = 100

def representative_dataset_gen():
    # Placeholder calibration data; real preprocessed images should be used here.
    for _ in range(num_calibration_steps):
        yield [np.random.random((1, 512, 512, 3)).astype(np.float32)]

converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
tflite_quant_model = converter.convert()

open("converted.tflite", "wb").write(tflite_quant_model)

@Ekta246

Ekta246 commented May 21, 2020

When using
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
I get errors like "OSError: Saved_model doesn't exist at './path'", which I haven't been able to solve for a long time.
I also get the error "This converter can only convert a single ConcreteFunction. Converting multiple functions is under development."

@Eashwar93

@Ekta246 I think you are not quantising the input and output layers, and I'm not sure that leaving them unquantised is supported by the edgetpu compiler yet. In order to quantise the input and output layers you should probably add

converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

before running

tflite_quant_model = converter.convert()

Also, have a look at the Netron visualizer to see whether the quantisation has happened. I'm not an expert either, but I think quantising your input and output layers should solve your issue.
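As a programmatic alternative to Netron for checking whether the input/output layers actually ended up quantised, the TFLite interpreter can read the converted model's tensor details back (using the converted.tflite written by the code above):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='converted.tflite')
print(interpreter.get_input_details()[0]['dtype'])    # should be numpy.uint8 after full quantization
print(interpreter.get_output_details()[0]['dtype'])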

@Ekta246

Ekta246 commented May 21, 2020 via email

@Eashwar93

@Ekta246 Oh OK, I could not find that in the code you shared, hence the suggestion. Sure, I can share the quantized model. I am facing issues with the quantized model as well when I pass it through the Edge TPU compiler, but it's a different issue. I would like to ask you to look at these issues to see whether it is the same for you.
The quantised model is available on both of these issue pages:
google-coral/edgetpu#100 (comment)
tensorflow/tensorflow#38978 (comment)

If you could share your model, that would be nice as well, so I can have a look.
