
Internal compiler error #31368

Closed · DocDriven opened this issue Aug 6, 2019 · 20 comments

Labels: comp:lite (TF Lite related issues), stat:awaiting tensorflower (Status - Awaiting response from tensorflower), TF 1.14 (for issues seen with TF 1.14), type:bug (Bug)

@DocDriven

Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:bug_template

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: -
  • TensorFlow installed from (source or binary): Docker image latest-gpu-py3
  • TensorFlow version (use command below): 1.14.0
  • Python version: 3.6
  • Bazel version (if compiling from source): -
  • GCC/Compiler version (if compiling from source): -
  • CUDA/cuDNN version: 10.1
  • GPU model and memory: RTX 2080 Ti / 12 GB

I have created a fully-quantized TF Lite model from a saved model, but when I try to compile it with the edgetpu_compiler, I get an error:

user@ubuntu:~/tf/tensorflow1_14$ edgetpu_compiler saved_converted_linearmodel_tpu_1.14.0.tflite 
Edge TPU Compiler version 2.0.258810407
INFO: Initialized TensorFlow Lite runtime.

Internal compiler error. Aborting!

The error message is unfortunately not very helpful. The non-compiled version is loadable and produces correct results.

I have attached the model that I am trying to compile, as well as its visualization (via visualize.py).

litemodel.tar.gz
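For reference, the conversion followed roughly the standard full-integer post-training quantization flow. Below is a minimal sketch of that flow; the saved-model path, input shape, and calibration data are placeholders (the real model is in the attached archive), and the converter attributes are the TF 1.x-style ones used later in this thread:

import numpy as np
import tensorflow as tf  # TF 1.x

# Placeholder path; the actual SavedModel is not included here.
saved_model_dir = './saved_linearmodel'

def representative_dataset_gen():
    # Yield a few calibration samples matching the model's input shape
    # (the shape here is a placeholder).
    for _ in range(100):
        yield [np.random.rand(1, 10).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open('saved_converted_linearmodel_tpu_1.14.0.tflite', 'wb') as f:
    f.write(converter.convert())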

@oanush oanush self-assigned this Aug 7, 2019
@oanush oanush added comp:lite TF Lite related issues TF 1.14 for issues seen with TF 1.14 type:bug Bug labels Aug 7, 2019
@oanush oanush assigned suharshs and unassigned oanush Aug 7, 2019
@cuongdv1 commented Sep 4, 2019

@DocDriven I have the same issue.
Have you solved it? Could you share your solution?

@DocDriven (Author)

@cuongdv1 Unfortunately, I haven't figured it out yet. As far as I know, the source code for the compiler is not open source, so I couldn't debug it. The best bet is to wait for a new release of the compiler and try again.

@jbrownkramer

I'll add that I get this error when I try to compile an object detection tflite model produced by Google Cloud AutoML, also using Edge TPU Compiler version 2.0.258810407.

@DocDriven (Author) commented Sep 11, 2019

Are there any updates on this topic? I have come across this problem multiple times now, even with networks that ship with Keras (e.g. VGG16). The test code is below.

import os
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import mobilenet, resnet50, inception_v3, vgg16
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.applications.resnet50 import decode_predictions


### Load and test model

imagenet_dir = './tiny-imagenet-200/test/images'

vgg16_model = vgg16.VGG16(weights='imagenet')
print(vgg16_model.summary())

filename = './bird.jpg'
original = load_img(filename, target_size=(224, 224))
numpy_image = img_to_array(original)
image_batch = np.expand_dims(numpy_image, axis=0)
processed_image = vgg16.preprocess_input(image_batch.copy())

predictions = vgg16_model.predict(processed_image)
label = decode_predictions(predictions)
print(label)

keras_file = 'vgg16.h5'
keras.models.save_model(vgg16_model, keras_file)


### TF lite conversion

def representative_dataset_gen():
    # Feed 500 preprocessed Tiny ImageNet images for calibration.
    for image in os.listdir(imagenet_dir)[:500]:
        original = load_img(os.path.join(imagenet_dir, image), target_size=(224, 224))
        numpy_image = img_to_array(original)
        image_batch = np.expand_dims(numpy_image, axis=0)
        processed_image = vgg16.preprocess_input(image_batch.copy())
        print(processed_image.shape)
        print(type(processed_image))
        yield [processed_image]

converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file(keras_file)
converter.representative_dataset = representative_dataset_gen
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()
open("vgg16_fiq.tflite", "wb").write(tflite_model)


### Test tflite model

# Load the file written above (the path must match the converted model).
interpreter = tf.lite.Interpreter(model_path="vgg16_fiq.tflite")
interpreter.allocate_tensors()

input_detail = interpreter.get_input_details()[0]
output_detail = interpreter.get_output_details()[0]
print('Input detail: ', input_detail)
print('Output detail: ', output_detail)

interpreter.set_tensor(input_detail['index'], processed_image)
interpreter.invoke()
pred_litemodel = interpreter.get_tensor(output_detail['index'])
label_lite = decode_predictions(pred_litemodel)
print(label_lite)

I used the Tiny ImageNet dataset for the post-training quantization. Also, my test picture is the one from the Coral demo, which I have attached; it should output magpie (bird.jpg).

I can produce a tflite file with this code, but the TPU compiler throws "Internal compiler error" again. Can you please confirm whether this is reproducible?
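For completeness, the failing compile step is the same as in the original post (the file name here is the one written by the script above):

edgetpu_compiler vgg16_fiq.tflite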

@Lap1n commented Sep 25, 2019

I have the same error using the tensorflow 2.0 nightly and the tensorflow 1.x nightly. Any update on this? Since the error is very generic, it is very hard to debug...

@DocDriven (Author)

@Lap1n
At least for the VGG16 model in my previous post, I was able to compile it with the new compiler version 2.0.267685300. Unfortunately, this did not resolve the original problem for me. Tested it with the TF 1.15 nightly docker image.

@ynorz commented Nov 7, 2019

I had the same error using the MobileNet v2 model in Keras with the tiny-imagenet-200 dataset. The TPU compiler version was 2.0.267685300. The quantized tflite file was produced successfully, but it could not be compiled.

@bhavitvyamalik

@ynorz can you show me the code you used for converting and quantizing your model?

@oanush oanush added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Nov 7, 2019
@ynorz commented Nov 7, 2019

> @ynorz can you show me the code you used for converting and quantizing your model?

import pathlib

import numpy as np
import tensorflow as tf

def get_label(file_path):
    # convert the path to a list of path components
    parts = tf.strings.split(file_path, '/')
    # the third-to-last component is the class directory
    return parts[-3] == CLASS_NAMES

def decode_img(img):
    # convert the compressed string to a 3D uint8 tensor
    img = tf.image.decode_jpeg(img, channels=3)
    # use `convert_image_dtype` to convert to floats in the [0,1] range
    img = tf.image.convert_image_dtype(img, tf.float32)
    # resize the image to the desired size
    return tf.image.resize(img, [224, 224])

def process_path(file_path):
    label = get_label(file_path)
    # load the raw data from the file as a string
    img = tf.io.read_file(file_path)
    img = decode_img(img)
    return label, img

data_dir = '/my_data_dir'
data_dir = pathlib.Path(data_dir)
list_ds = tf.data.Dataset.list_files(str(data_dir/'*/*/*'))
image_count = len(list(data_dir.glob('*/*/*.JPEG')))
CLASS_NAMES = np.array([item.name for item in data_dir.glob('*')])
labeled_ds = list_ds.map(process_path, num_parallel_calls=100)

tf.compat.v1.enable_eager_execution()

# `model` is the Keras MobileNet v2 model mentioned above;
# TFLITE_MODEL is the output file path.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_data_gen():
    for _, image in labeled_ds.take(100):
        image = tf.expand_dims(image, 0)
        yield [image]

converter.representative_dataset = tf.lite.RepresentativeDataset(representative_data_gen)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
converted_tflite_model = converter.convert()
open(TFLITE_MODEL, "wb").write(converted_tflite_model)

@ynorz commented Nov 8, 2019

@bhavitvyamalik
The tflite model I got from this code could run on the CPU, but compiling it would trigger the compiler error.

@bhavitvyamalik

There are two possible reasons why your model is giving an internal compiler error.

  1. The most basic one is that your model contains operators that are not supported by the Edge TPU. I remember my custom model giving the same error until it contained only basic operators that are supported by the Edge TPU. So make sure the model you are converting is supported. If it is, move on to the next point.

  2. A versioning error. I tried converting my code in Tensorflow 2.0, which didn't run on the Edge TPU even after compiling. Make sure you are using Tensorflow 1.15.0 for converting and quantizing. This was my code for the same:

def representative_data_gen():
  # mnist_data is a list of images that were resized while being appended
  for input_value in mnist_data[:100]:
    data = np.array([input_value])
    yield [data]

opt = tf.lite.Optimize.DEFAULT
ops = tf.lite.OpsSet.TFLITE_BUILTINS_INT8
dtype = tf.uint8

converter = tf.lite.TFLiteConverter.from_keras_model_file(model_path)

converter.optimizations = [opt]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [ops]
converter.inference_input_type = dtype
converter.inference_output_type = dtype

tflite_quant_model = converter.convert()
open("model_quantised.tflite", "wb").write(tflite_quant_model)

@ynorz commented Nov 21, 2019

https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/performance/post_training_integer_quant.ipynb

I tried to do the post-training integer quantization on MNIST following the official guide above. The guide only runs on tensorflow 1.15.0, which, to some extent, proves your point that tensorflow 1.15 works better. However, I still get the internal compiler error with compiler version 2.0.267685300.

@bhavitvyamalik

If you are getting an internal compiler error, then some of your operations are not supported by the Edge TPU during compilation. The model can still run on the CPU of your Edge TPU device, but that will increase inference time to a large extent.

Most importantly, you can compile only these models successfully for the Edge TPU:

  • Mobilenet_v1
  • Mobilenet_v2
  • Inception_v3
  • ResNet50

If you use any other model, it might not work properly. Try using one of these models, followed by quantization using the code I posted earlier (a sketch of producing the input .h5 file follows below). It should work flawlessly.
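For instance, a minimal sketch of producing the model_path file that the snippet above expects (the file name is a placeholder; assumes tf.keras.applications in TF 1.15):

import tensorflow as tf  # TF 1.15

# Build a stock, ImageNet-pretrained MobileNetV2 and save it as an
# HDF5 file; the .h5 suffix selects the HDF5 format in TF 1.x.
model = tf.keras.applications.MobileNetV2(weights='imagenet')
model.save('mobilenet_v2.h5')

# 'mobilenet_v2.h5' can then be passed as model_path to
# tf.lite.TFLiteConverter.from_keras_model_file(model_path).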

@yanghaoyue001

Figured out a solution that sounds stupid: I moved the 'models' folder with the .tflite file to '/home/username/edgetpu', and then the compiler works with the same compile command provided on the official website. This 'edgetpu' folder was created by the beginner object-detection retraining example (using the American bulldog and Abyssinian dataset) provided on the official website.

My setup: custom dataset, mobilenet_v1 or mobilenet_v2 downloaded from the Coral website, Coral accelerator.

@jdcast commented Feb 9, 2020

> [quoting @yanghaoyue001's workaround above]

Running this same example, I initially received the error below when compiling with edgetpu_compiler output_tflite_graph.tflite:

Edge TPU Compiler version 2.0.291256449

Model compiled successfully in 276 ms.

Input model: output_tflite_graph.tflite
Input size: 5.34MiB
Output model: output_tflite_graph_edgetpu.tflite
Output size: 5.75MiB
On-chip memory available for caching model parameters: 7.62MiB
On-chip memory used for caching model parameters: 5.66MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 64
Operation log: output_tflite_graph_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 63
Number of operations that will run on CPU: 1
See the operation log file for individual operation details.
Error opening file for writing: output_tflite_graph_edgetpu.tflite

Internal compiler error. Aborting!

But I was able to get around it by running with sudo, which gives the following output:

Edge TPU Compiler version 2.0.291256449

Model compiled successfully in 341 ms.

Input model: output_tflite_graph.tflite
Input size: 5.34MiB
Output model: output_tflite_graph_edgetpu.tflite
Output size: 5.75MiB
On-chip memory available for caching model parameters: 7.62MiB
On-chip memory used for caching model parameters: 5.66MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 64
Operation log: output_tflite_graph_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 63
Number of operations that will run on CPU: 1
See the operation log file for individual operation details.

Note: I didn't have to move the files around as mentioned in the previous post.
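The "Error opening file for writing" line in the first log suggests this was a plain file-permission problem in the working directory. An alternative to sudo (a sketch, assuming the compiler's -o/--out_dir option for choosing the output directory) would be:

# write the compiled model to a directory the current user owns
edgetpu_compiler -o ~/edgetpu_out output_tflite_graph.tflite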

@anilsathyan7 commented Feb 19, 2020

I'm having the same issue with a tflite model that contains a transpose convolution. Tensorflow 1.x does not seem to support transpose convolution, and with the latest tf2.0-nightly the quantized tflite model gives the error 'Internal compiler error. Aborting!'. It would be helpful if the compiler printed the exact cause of the failure, i.e. whether some operator or operator version is not supported. It seems to work up to some convolutional layer (602) and produces the compiler error once that layer is included.

pnet_test.tflite.zip

@tekotan commented May 5, 2020

> [quoting @anilsathyan7's comment above]

I am also having a similar issue: I have a custom model which uses transpose convolution that I want to compile for the Edge TPU. Was there any solution?

@danieldanuega

> [quoting @anilsathyan7's comment above]

Have you tried edgetpu_compiler -s your_tflite_graph.tflite?
It prints out the operation log.

@tensorflowbutler (Member)

Hi there,

We are checking to see if you still need help on this, as you are using an older version of TensorFlow which is officially considered end of life. We recommend that you upgrade to the latest 2.x version and let us know if the issue still persists in newer versions. Please open a new issue for any help you need against 2.x, and we will get you the right help.

This issue will be closed automatically 7 days from now. If you still need help with this issue, please provide us with more information.
