Frequently Asked Questions

If you don't find an answer to your question here, please look through our detailed documentation for the topic or file a GitHub issue.

Model Conversion

What formats are supported for conversion from TensorFlow to TensorFlow Lite?

The TensorFlow Lite converter supports the following formats:

SavedModels: TFLiteConverter.from_saved_model
Frozen GraphDefs generated by freeze_graph.py: TFLiteConverter.from_frozen_graph
tf.keras HDF5 models: TFLiteConverter.from_keras_model_file
tf.Session: TFLiteConverter.from_session

The recommended approach is to integrate the Python converter into your model pipeline in order to detect compatibility issues early on.
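One way to wire the converter into a pipeline is a small check that attempts the conversion and reports a failure instead of crashing the run. The sketch below is illustrative rather than an official recipe: `check_convertible` and `convert_saved_model` are hypothetical helper names, and the SavedModel directory is a placeholder you would supply.

```python
from typing import Callable, Tuple


def check_convertible(convert_fn: Callable[[], bytes]) -> Tuple[bool, str]:
    """Run a conversion callable and report success or the error message."""
    try:
        model_bytes = convert_fn()
    except Exception as exc:  # conversion failures surface as various exception types
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"converted model is {len(model_bytes)} bytes"


def convert_saved_model(saved_model_dir: str) -> bytes:
    """Convert a SavedModel directory to a TensorFlow Lite flatbuffer."""
    # Imported lazily so check_convertible stays usable without TensorFlow installed.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    return converter.convert()
```

A pipeline step could then call `check_convertible(lambda: convert_saved_model("path/to/saved_model"))` and fail the build when the first element of the result is `False`.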

Why doesn't my model convert?

Because TensorFlow Lite supports fewer operations than TensorFlow, some models cannot be converted. For unimplemented operations, see the question on missing operators. Unsupported operators include embeddings and LSTM/RNNs. For conversion issues not related to missing operations, search our GitHub issues or file a new one.

How do I determine the inputs/outputs for a GraphDef protocol buffer?

The easiest way to inspect a graph from a .pb file is to use the summarize_graph tool.

If that approach yields an error, you can visualize the GraphDef with TensorBoard and look for the inputs and outputs in the graph. To visualize a .pb file, use the import_pb_to_tensorboard.py script as follows:

python import_pb_to_tensorboard.py --model_dir <model path> --log_dir <log dir path>
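If neither tool is convenient, the nodes of a GraphDef can also be scanned from Python. The following is a rough heuristic, not what summarize_graph does internally: it assumes inputs are Placeholder nodes and outputs are nodes that no other node consumes. `find_inputs_and_outputs` and `load_graph_def` are hypothetical helper names.

```python
def _node_name(input_ref):
    """Strip the control-dependency marker ('^') and the output index (':0')."""
    return input_ref.lstrip("^").split(":")[0]


def find_inputs_and_outputs(nodes):
    """Guess graph inputs and outputs from NodeDef-like objects.

    Each node needs .name, .op and .input, as GraphDef nodes have.
    """
    nodes = list(nodes)
    consumed = {_node_name(ref) for node in nodes for ref in node.input}
    inputs = [n.name for n in nodes if n.op == "Placeholder"]
    # Unconsumed nodes are output candidates; skip placeholders and constants.
    outputs = [
        n.name
        for n in nodes
        if n.name not in consumed and n.op not in ("Placeholder", "Const")
    ]
    return inputs, outputs


def load_graph_def(pb_path):
    """Parse a binary GraphDef (.pb) file."""
    import tensorflow as tf  # imported lazily; only needed to read real files

    graph_def = tf.compat.v1.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    return graph_def
```

For a real file, call `find_inputs_and_outputs(load_graph_def("model.pb").node)`. Graphs with unused or training-only nodes may produce extra output candidates, so treat the result as a starting point.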

How do I inspect a `.tflite` file?

TensorFlow Lite models can be visualized using the visualize.py script in our repository.

Clone the TensorFlow repository
Run the visualize.py script with bazel:

bazel run //tensorflow/lite/tools:visualize model.tflite visualized_model.html
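If a quick textual summary is enough, the Python interpreter can also report a model's input and output tensors. This is a different technique from visualize.py, sketched under the assumption that the TensorFlow Python package is installed. `describe_tensors` and `load_tensor_details` are hypothetical helper names.

```python
def describe_tensors(details):
    """Format the dicts returned by get_input_details()/get_output_details()."""
    lines = []
    for d in details:
        shape = [int(dim) for dim in d["shape"]]
        dtype = getattr(d["dtype"], "__name__", str(d["dtype"]))
        lines.append(f"{d['name']}: shape={shape}, dtype={dtype}")
    return lines


def load_tensor_details(model_path):
    """Return (input_details, output_details) for a .tflite file."""
    import tensorflow as tf  # imported lazily; only needed to read real models

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return interpreter.get_input_details(), interpreter.get_output_details()
```

Printing `describe_tensors(...)` for each element returned by `load_tensor_details("model.tflite")` lists the tensor names, shapes, and types without building the visualizer.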

Models & Operations

Why are some operations not implemented in TensorFlow Lite?

To keep TensorFlow Lite lightweight, the converter supports only a subset of TensorFlow operations. The Compatibility Guide provides a list of operations currently supported by TensorFlow Lite.

If you don’t see a specific operation (or an equivalent) listed, it's likely that it has not been prioritized. The team tracks requests for new operations on GitHub issue #21526. Leave a comment if your request hasn’t already been mentioned.

In the meantime, you could try implementing a custom operator or using a different model that only contains supported operators. If binary size is not a constraint, try using TensorFlow Lite with select TensorFlow ops.

How do I test that a TensorFlow Lite model behaves the same as the original TensorFlow model?

The best way to test the behavior of a TensorFlow Lite model is to use our API with test data and compare the outputs to TensorFlow for the same inputs. Take a look at our Python Interpreter example that generates random data to feed to the interpreter.
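A comparison along these lines might look like the sketch below. It assumes a model with a single input and a single output, and the tolerances are arbitrary starting values, not recommendations from the TensorFlow Lite team. `outputs_match` and `run_tflite` are hypothetical helper names.

```python
import numpy as np


def outputs_match(reference, candidate, rtol=1e-4, atol=1e-5):
    """Return (match, max_abs_diff) for two output arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        return False, float("inf")
    max_abs_diff = float(np.max(np.abs(candidate - reference))) if reference.size else 0.0
    # np.allclose scales rtol by its second argument, so pass the reference there.
    return bool(np.allclose(candidate, reference, rtol=rtol, atol=atol)), max_abs_diff


def run_tflite(model_path, input_array):
    """Run one inference on a single-input, single-output .tflite model."""
    import tensorflow as tf  # imported lazily; only needed to run real models

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_detail = interpreter.get_input_details()[0]
    output_detail = interpreter.get_output_details()[0]
    interpreter.set_tensor(
        input_detail["index"], input_array.astype(input_detail["dtype"])
    )
    interpreter.invoke()
    return interpreter.get_tensor(output_detail["index"])
```

Feed the same random array to the original TensorFlow model and to `run_tflite`, then pass both results to `outputs_match`. Quantized models usually need looser tolerances than float models.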

Optimization

How do I reduce the size of my converted TensorFlow Lite model?

Post-training quantization can be used during conversion to TensorFlow Lite to reduce the size of the model. Post-training quantization quantizes weights from floating point to 8 bits of precision and dequantizes them at runtime to perform floating-point computations. Note, however, that this can affect model accuracy.
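To make the size and accuracy trade-off concrete, the arithmetic can be sketched with NumPy. This uses a simplified min/max affine scheme chosen for illustration; it is an assumption, not necessarily the exact scheme the converter applies.

```python
import numpy as np


def quantize_uint8(weights):
    """Map float weights onto 256 evenly spaced levels between min and max."""
    w_min = float(weights.min())
    w_max = float(weights.max())
    scale = (w_max - w_min) / 255.0
    if scale == 0.0:  # constant tensor: any non-zero scale works
        scale = 1.0
    levels = np.clip(np.round((weights - w_min) / scale), 0, 255)
    return levels.astype(np.uint8), scale, w_min


def dequantize(quantized, scale, w_min):
    """Recover approximate float32 weights from the 8-bit levels."""
    return quantized.astype(np.float32) * scale + w_min
```

Storing `uint8` values instead of `float32` cuts the weight storage to a quarter, and each restored weight differs from the original by at most about half of `scale`. That rounding error is the source of the accuracy impact.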

If retraining the model is an option, consider quantization-aware training. However, note that quantization-aware training is only available for a subset of convolutional neural network architectures.

For a deeper understanding of different optimization methods, look at Model optimization.

How do I optimize TensorFlow Lite performance for my machine learning task?

The high-level process to optimize TensorFlow Lite performance looks something like this:

Make sure that you have the right model for the task. For image classification, check out our list of hosted models.
Tweak the number of threads. Many TensorFlow Lite operators support multi-threaded kernels. You can use SetNumThreads() in the C++ API to do this. However, increasing the number of threads makes performance more variable depending on the environment.
Use Hardware Accelerators. TensorFlow Lite supports model acceleration for specific hardware using delegates. For example, to use Android’s Neural Networks API, call UseNNAPI on the interpreter. Or take a look at our GPU delegate tutorial.
(Advanced) Profile Model. The TensorFlow Lite benchmarking tool has a built-in profiler that can show per-operator statistics. If you know how you can optimize an operator’s performance for your specific platform, you can implement a custom operator.
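The thread-count step can be explored with a small timing loop. The sketch below uses Python rather than the C++ API mentioned above, and assumes a TensorFlow release whose Python `tf.lite.Interpreter` accepts a `num_threads` argument. `median_latency_ms` and `make_interpreter` are hypothetical helper names, and this is not a substitute for the benchmarking tool.

```python
import statistics
import time


def median_latency_ms(invoke, warmup=3, runs=20):
    """Call invoke() repeatedly and return the median latency in milliseconds."""
    for _ in range(warmup):  # warm-up calls are not timed
        invoke()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def make_interpreter(model_path, num_threads):
    """Create an interpreter that uses the given number of threads."""
    import tensorflow as tf  # imported lazily; only needed to run real models

    interpreter = tf.lite.Interpreter(model_path=model_path, num_threads=num_threads)
    interpreter.allocate_tensors()
    return interpreter
```

Calling `median_latency_ms(make_interpreter("model.tflite", n).invoke)` for a few values of `n` shows whether extra threads help on a given device. Using the median rather than the mean reduces the influence of the run-to-run variability noted above.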

For a more in-depth discussion on how to optimize performance, take a look at Best Practices.
