# Exercise 3: Converting Models to TFLite and Running Inference

Reference: https://www.tensorflow.org/lite/models/convert

In this exercise you will:
- Learn how to convert a TensorFlow/Keras model to TensorFlow Lite (TFLite).
- Learn how to use the TFLite Interpreter to run inference on a TFLite model.

As a first step, let's import the Python modules we need.

In [2]:
import os
import numpy as np
import tensorflow as tf

## TensorFlow Lite Converter

<img width="400" height="500" src="../notebook_images/tflite_convert.png" style="padding: 0px; float: right;">

The TensorFlow Lite converter takes a TensorFlow model and generates a TensorFlow Lite model (an optimized [FlatBuffer](https://google.github.io/flatbuffers) format identified by the `.tflite` file extension). You have the following two options for using the converter:

1. Python API (**recommended**): This makes it easier to convert models as part of the model development pipeline, apply optimizations, and add metadata, among many other features.
2. Command line: This only supports basic model conversion.

### Python API

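Helper code: To identify the installed TensorFlow version, run `print(tf.__version__)`. To learn more about the TensorFlow Lite converter API, run `help(tf.lite.TFLiteConverter)` (`help()` prints the documentation itself and returns `None`, so wrapping it in `print()` only adds a stray `None` to the output).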

If you've installed [TensorFlow 2.x](https://www.tensorflow.org/install/pip#tensorflow-2-packages-are-available), you have the following two options (if you've installed [TensorFlow 1.x](https://www.tensorflow.org/install/pip#older-versions-of-tensorflow), refer to [GitHub](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/python_api.md)):

- Convert a TensorFlow 2.x model using [tf.lite.TFLiteConverter](https://www.tensorflow.org/api_docs/python/tf/lite/TFLiteConverter). A TensorFlow 2.x model is stored using the SavedModel format and is generated either using the high-level `tf.keras.*` APIs (a Keras model) or the low-level `tf.*` APIs (from which you generate concrete functions). As a result, you have the following three options (examples are in the next few sections):
  - [tf.lite.TFLiteConverter.from_saved_model()](https://www.tensorflow.org/lite/api_docs/python/tf/lite/TFLiteConverter#from_saved_model) (**recommended**): Converts a [SavedModel](https://www.tensorflow.org/guide/saved_model).
  - [tf.lite.TFLiteConverter.from_keras_model()](https://www.tensorflow.org/lite/api_docs/python/tf/lite/TFLiteConverter#from_keras_model): Converts a [Keras](https://www.tensorflow.org/guide/keras/sequential_model) model.
  - [tf.lite.TFLiteConverter.from_concrete_functions()](https://www.tensorflow.org/lite/api_docs/python/tf/lite/TFLiteConverter#from_concrete_functions): Converts [concrete functions](https://www.tensorflow.org/guide/intro_to_graphs).
- Convert a TensorFlow 1.x model using [tf.compat.v1.lite.TFLiteConverter](https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter) (examples are on [GitHub](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/python_api.md)):
  - [tf.compat.v1.lite.TFLiteConverter.from_saved_model()](https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter#from_saved_model): Converts a [SavedModel](https://www.tensorflow.org/guide/saved_model).
  - [tf.compat.v1.lite.TFLiteConverter.from_keras_model_file()](https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter#from_keras_model_file): Converts a [Keras](https://www.tensorflow.org/guide/keras/sequential_model) model.
  - [tf.compat.v1.lite.TFLiteConverter.from_session()](https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter#from_session): Converts a GraphDef from a session.
  - [tf.compat.v1.lite.TFLiteConverter.from_frozen_graph()](https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter#from_frozen_graph): Converts a Frozen GraphDef from a file. If you have checkpoints, first convert them to a Frozen GraphDef file and then use this API as shown [here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/python_api.md#checkpoints).

  All examples below use TensorFlow 2.x.
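
To check which of the entry points listed above the installed TensorFlow actually exposes, a small sketch like the following can list the `from_*` constructors on both converter classes. It assumes a TensorFlow 2.x installation; the variable names are illustrative only.

```python
import tensorflow as tf

# Collect the "from_*" constructors exposed by the TF 2.x and TF 1.x converter classes.
v2_constructors = sorted(
    name for name in dir(tf.lite.TFLiteConverter) if name.startswith('from_')
)
v1_constructors = sorted(
    name for name in dir(tf.compat.v1.lite.TFLiteConverter) if name.startswith('from_')
)

print('TensorFlow version:', tf.__version__)
print('TF 2.x converter  :', v2_constructors)
print('TF 1.x converter  :', v1_constructors)
```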

### Convert a SavedModel (**recommended**)

The following example shows how to convert a [SavedModel](https://www.tensorflow.org/guide/saved_model) into a TensorFlow Lite model.

In [3]:
# Convert the model (let's use the saved model from the previous exercise)
import_path = os.path.join(os.getcwd(),'saved_models','model_1')
converter = tf.lite.TFLiteConverter.from_saved_model(import_path) # path to the SavedModel directory
tflite_model = converter.convert()

# Let's create the folder "saved_tflite_models" and store the TFLite models there
os.makedirs(os.path.join(os.getcwd(),'saved_tflite_models'), exist_ok = True)

# Save the model
tflite_model_path = os.path.join(os.getcwd(),'saved_tflite_models','model_1.tflite')
with open(tflite_model_path, 'wb') as f:
  f.write(tflite_model)
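
After writing a `.tflite` file, it can be useful to confirm that it really is a TensorFlow Lite FlatBuffer. FlatBuffer files carry a four-character file identifier in bytes 4 to 7, and the TFLite schema uses `TFL3`. The sketch below checks for it; `looks_like_tflite` is a hypothetical helper, not part of TensorFlow, and the demo uses a hand-made stand-in file instead of a real model so that it runs on its own.

```python
import os
import tempfile

def looks_like_tflite(path):
    """Hypothetical helper: True if bytes 4..7 of the file hold the 'TFL3' identifier."""
    with open(path, 'rb') as f:
        header = f.read(8)
    return len(header) == 8 and header[4:8] == b'TFL3'

with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in file: a 4-byte root offset followed by the identifier (not a usable model).
    fake_model_path = os.path.join(tmp_dir, 'fake.tflite')
    with open(fake_model_path, 'wb') as f:
        f.write(b'\x1c\x00\x00\x00' + b'TFL3' + b'\x00' * 8)

    # A file that is clearly not a FlatBuffer.
    text_path = os.path.join(tmp_dir, 'notes.txt')
    with open(text_path, 'wb') as f:
        f.write(b'just some text')

    fake_ok = looks_like_tflite(fake_model_path)
    text_ok = looks_like_tflite(text_path)

print(fake_ok, text_ok)  # True False
```

In the notebook, the same check would be `looks_like_tflite(tflite_model_path)`. It only inspects the header, so it says nothing about whether the model itself is valid.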

### Convert a Keras model 

The following example shows how to convert a [Keras](https://www.tensorflow.org/guide/keras/sequential_model) model into a TensorFlow Lite model.

In [4]:
# Create a model using high-level tf.keras.* APIs
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1]),
    tf.keras.layers.Dense(units=16, activation='relu'),
    tf.keras.layers.Dense(units=1)
])

# Compile the model
model.compile(optimizer='sgd', loss='mean_squared_error')

# Train the model
model.fit(x=[-1, 0, 1], y=[-3, -1, 1], epochs=5)

# Save the model as a SavedModel (this step isn't really necessary)
tfcore_model2_path = os.path.join(os.getcwd(),'saved_models','model_2')
tf.saved_model.save(model, tfcore_model2_path)

# Convert the model (note: we are converting the model directly from the "model" variable)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Let's save the model as "model_2.tflite"
tflite_model2_path = os.path.join(os.getcwd(),'saved_tflite_models','model_2.tflite')
with open(tflite_model2_path, 'wb') as f:
  f.write(tflite_model)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
INFO:tensorflow:Assets written to: c:\Users\gabpat\projects\TEDS22\Workshop3\saved_models\model_2\assets
INFO:tensorflow:Assets written to: C:\Users\gabpat\AppData\Local\Temp\tmpma3gvqld\assets


### Convert concrete functions 

The following example shows how to convert [concrete functions](https://www.tensorflow.org/guide/intro_to_graphs) into a TensorFlow Lite model.

In [5]:
# Create a model using low-level tf.* APIs
class Squared(tf.Module):
  @tf.function
  def __call__(self, x):
    return tf.square(x)
model = Squared()

# Let's call the function (this step isn't really necessary)
input_data = tf.constant(5.0)
answer = model(input_data)
print(answer)

# Save the model as a SavedModel (this step isn't really necessary)
tfcore_model3_path = os.path.join(os.getcwd(),'saved_models','model_3')
tf.saved_model.save(model, tfcore_model3_path)

# Get the "concrete function"
concrete_func = model.__call__.get_concrete_function(input_data)

# Convert the model (note: we are converting the model directly from the "concrete_func" variable)
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
tflite_model = converter.convert()

# Let's save the model as "model_3.tflite"
tflite_model3_path = os.path.join(os.getcwd(),'saved_tflite_models','model_3.tflite')
with open(tflite_model3_path, 'wb') as f:
  f.write(tflite_model)

tf.Tensor(25.0, shape=(), dtype=float32)
INFO:tensorflow:Assets written to: c:\Users\gabpat\projects\TEDS22\Workshop3\saved_models\model_3\assets
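
In the cell above, the concrete function takes its input shape and dtype from the sample tensor `input_data`. An alternative is to declare them up front with an `input_signature`, so no sample tensor is needed. The sketch below is a variation on the `Squared` module under that assumption; `SquaredVector` and the other names are illustrative, and the fixed shape `[3]` is an arbitrary choice.

```python
import numpy as np
import tensorflow as tf

class SquaredVector(tf.Module):
    # The input signature fixes the traced input to three float32 values.
    @tf.function(input_signature=[tf.TensorSpec(shape=[3], dtype=tf.float32)])
    def __call__(self, x):
        return tf.square(x)

vector_model = SquaredVector()

# With an input signature, the concrete function can be requested without arguments.
vector_func = vector_model.__call__.get_concrete_function()

# Convert in memory (nothing is written to disk in this sketch).
vector_converter = tf.lite.TFLiteConverter.from_concrete_functions([vector_func])
vector_tflite = vector_converter.convert()

# Run the converted model on a three-element vector.
vector_interpreter = tf.lite.Interpreter(model_content=vector_tflite)
vector_interpreter.allocate_tensors()
input_index = vector_interpreter.get_input_details()[0]['index']
output_index = vector_interpreter.get_output_details()[0]['index']
vector_interpreter.set_tensor(input_index, np.array([1.0, 2.0, 3.0], dtype=np.float32))
vector_interpreter.invoke()
squares = vector_interpreter.get_tensor(output_index)
print(squares)
```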




We can also convert models using the [command line tool](https://www.tensorflow.org/lite/convert#command_line_tool_) (`tflite_convert`), but the recommended way of doing this is the Python API above.

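[Optimization](https://www.tensorflow.org/lite/performance/model_optimization), such as post-training quantization, can be requested when converting models through the converter's `optimizations` attribute. Even without it, the converter applies some default graph-level optimization (for example, fusing operations) for us.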
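
A minimal sketch of requesting an optimization at conversion time, assuming the `tf.lite.Optimize.DEFAULT` setting. Without a representative dataset, this setting typically quantizes the weights to 8 bits. The stand-in model below is not one of the models from this exercise. It is untrained and deliberately larger, because very small weight tensors may be left unquantized.

```python
import tensorflow as tf

# Stand-in model (an assumption for this sketch): large enough for weight
# quantization to change the size of the converted model noticeably.
quant_demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),
    tf.keras.layers.Dense(units=256, activation='relu'),
    tf.keras.layers.Dense(units=1),
])

# Baseline conversion without any requested optimization.
baseline_converter = tf.lite.TFLiteConverter.from_keras_model(quant_demo_model)
baseline_tflite = baseline_converter.convert()

# Same model, converted with the default optimization set enabled.
optimized_converter = tf.lite.TFLiteConverter.from_keras_model(quant_demo_model)
optimized_converter.optimizations = [tf.lite.Optimize.DEFAULT]
optimized_tflite = optimized_converter.convert()

print('baseline bytes :', len(baseline_tflite))
print('optimized bytes:', len(optimized_tflite))
```

Quantization can change the model's outputs slightly, so it is worth re-checking accuracy after enabling it.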

Finally, we can load the TFLite models and run [inference](https://www.tensorflow.org/lite/guide/inference) on edge devices such as mobile phones, microcomputers, and microcontrollers.

### Load and run a model in Python

Reference: https://www.tensorflow.org/lite/guide/inference

The Python API for running inference is provided in the [tf.lite module](https://www.tensorflow.org/lite/api_docs/python/tf/lite). From that module, you mostly need only [tf.lite.Interpreter](https://www.tensorflow.org/api_docs/python/tf/lite/Interpreter) to load a model and run inference.

The following example shows how to use the TFLite interpreter from Python to load a `.tflite` file and run inference with random input data.

In [6]:
# Load the TFLite model and allocate tensors.
# Let's load the TFLite model we saved to "./saved_tflite_models/model_2.tflite" above.
interpreter = tf.lite.Interpreter(model_path=tflite_model2_path)
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)

# "set_tensor" sets the data at the model's input.
interpreter.set_tensor(input_details[0]['index'], input_data)

# "invoke()" sends the data through the model.
interpreter.invoke()

# `get_tensor()` returns a copy of the tensor data.
# `tensor()` instead returns a function that gives a NumPy view of the tensor's buffer (no copy).
# Let's get the output from the model and print it.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

[[-0.482947]]
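
The set/invoke/get sequence above can be wrapped in a small function that also checks the input against what the model expects. The sketch below assumes a model with a single input and a single output. `run_tflite` is a hypothetical helper, not a TensorFlow API, and the untrained one-layer model is only a stand-in so the example runs on its own.

```python
import numpy as np
import tensorflow as tf

def run_tflite(interpreter, input_array):
    """Hypothetical helper: run one inference on a single-input, single-output model."""
    input_detail = interpreter.get_input_details()[0]
    output_detail = interpreter.get_output_details()[0]

    # Cast to the dtype the model expects and check the shape before feeding it.
    data = np.asarray(input_array, dtype=input_detail['dtype'])
    expected_shape = tuple(input_detail['shape'])
    if data.shape != expected_shape:
        raise ValueError(f'expected input shape {expected_shape}, got {data.shape}')

    interpreter.set_tensor(input_detail['index'], data)
    interpreter.invoke()
    return interpreter.get_tensor(output_detail['index'])

# Stand-in model (untrained), converted and loaded in memory.
tiny_model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(units=1),
])
tiny_tflite = tf.lite.TFLiteConverter.from_keras_model(tiny_model).convert()
tiny_interpreter = tf.lite.Interpreter(model_content=tiny_tflite)
tiny_interpreter.allocate_tensors()

prediction = run_tflite(tiny_interpreter, [[0.5]])
print(prediction.shape, prediction.dtype)
```

With the interpreter from the cell above, the call would be `run_tflite(interpreter, input_data)`.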


As an alternative to loading the model as a pre-converted `.tflite` file, you can combine your code with the [TensorFlow Lite Converter Python API](https://www.tensorflow.org/lite/models/convert) ([tf.lite.TFLiteConverter](https://www.tensorflow.org/lite/api_docs/python/tf/lite/TFLiteConverter)), allowing you to convert your TensorFlow model into the TensorFlow Lite format and then run inference.

This example contains a complete workflow:

1. Create a model using the `tf.keras.models.Sequential` API.
2. Train the model.
3. Convert the model to a TFLite model with the `tf.lite.TFLiteConverter`.
4. Load the model and run inference with the `tf.lite.Interpreter`.

In [7]:
# Create a model using high-level tf.keras.* APIs
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1], name='input_layer'),
    tf.keras.layers.Dense(units=16, activation='relu', name='hidden_layer'),
    tf.keras.layers.Dense(units=1, name='output_layer')
])

# Compile the model
model.compile(optimizer='sgd', loss='mean_squared_error')

# Train the model
model.fit(x=[-1, 0, 1], y=[-3, -1, 1], epochs=5)

# Convert to TFLite format
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)

# "set_tensor" sets the data at the model's input.
interpreter.set_tensor(input_details[0]['index'], input_data)

# "invoke()" sends the data through the model.
interpreter.invoke()

# Get the output from the model and print it.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
INFO:tensorflow:Assets written to: C:\Users\gabpat\AppData\Local\Temp\tmph2n99psa\assets




[[-0.03515544]]
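
A natural follow-up to this workflow is to check that the converted model agrees with the original Keras model on the same input. The sketch below does that for an untrained stand-in model with the same layer sizes; training is skipped because it does not affect the comparison. All names are illustrative, and the tolerance you accept is a judgment call.

```python
import numpy as np
import tensorflow as tf

# Stand-in model with the same layer sizes as above (left untrained).
check_model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(units=1),
    tf.keras.layers.Dense(units=16, activation='relu'),
    tf.keras.layers.Dense(units=1),
])

# Convert in memory and load the result into an interpreter.
check_tflite = tf.lite.TFLiteConverter.from_keras_model(check_model).convert()
check_interpreter = tf.lite.Interpreter(model_content=check_tflite)
check_interpreter.allocate_tensors()
check_input = check_interpreter.get_input_details()[0]
check_output = check_interpreter.get_output_details()[0]

# Feed the same sample to both models.
sample = np.array([[0.5]], dtype=np.float32)
check_interpreter.set_tensor(check_input['index'], sample)
check_interpreter.invoke()
tflite_output = check_interpreter.get_tensor(check_output['index'])
keras_output = check_model(sample).numpy()

# For a float (non-quantized) conversion the two outputs should be nearly identical.
max_abs_diff = float(np.max(np.abs(keras_output - tflite_output)))
print('max abs difference:', max_abs_diff)
```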
