
# Overview
This CodeLab demonstrates how to build a Keras LSTM model for MNIST recognition and how to convert it to a TensorFlow Lite model that uses the fused LSTM op.

The CodeLab is very similar to the Keras LSTM [CodeLab](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/experimental_new_converter/keras_lstm.ipynb). However, we're creating fused LSTM ops rather than the unfused version.

Also note: We're not trying to build a real-world application, only to demonstrate how to use TensorFlow Lite. You can build a much better model using CNNs. For a more canonical LSTM codelab, please see [here](https://github.com/keras-team/keras/blob/master/examples/imdb_lstm.py).


# Step 0: Prerequisites
It's recommended to try this feature with the newest TensorFlow nightly pip build.

In [1]:
!pip install tf-nightly

Collecting tf-nightly
  Downloading tf_nightly-2.10.0.dev20220424-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (502.9 MB)
Collecting flatbuffers<2,>=1.12
  Downloading flatbuffers-1.12-py2.py3-none-any.whl (15 kB)
Collecting gast<=0.4.0,>=0.2.1
  Downloading gast-0.4.0-py3-none-any.whl (9.8 kB)
Collecting tf-estimator-nightly~=2.10.0.dev
  Downloading tf_estimator_nightly-2.10.0.dev2022042408-py2.py3-none-any.whl (438 kB)
Collecting tb-nightly~=2.9.0.a
  Downloading tb_nightly-2.9.0a20220424-py3-none-any.whl (5.8 MB)
Collecting keras-nightly~=2.10.0.dev
  Downloading keras_nightly-2.10.0.dev2022042407-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: tf-estimator-nightly, tb-nightly, keras-nightly, gast, flatbuffers, tf-ni

# Step 1: Build the MNIST LSTM model.

In [2]:
import numpy as np
import tensorflow as tf

In [3]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(28, 28), name='input'),
    tf.keras.layers.LSTM(20, time_major=False, return_sequences=True),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax, name='output')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 28, 20)            3920      
                                                                 
 flatten (Flatten)           (None, 560)               0         
                                                                 
 output (Dense)              (None, 10)                5610      
                                                                 
Total params: 9,530
Trainable params: 9,530
Non-trainable params: 0
_________________________________________________________________
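
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the standard LSTM layout (four gates, each with an input kernel, a recurrent kernel, and a bias vector). The helper names are illustrative only and are not part of the Keras API.

```python
def lstm_param_count(input_dim: int, units: int) -> int:
    """Parameters of a standard LSTM layer: 4 gates, each with an
    input kernel, a recurrent kernel, and a bias vector."""
    per_gate = input_dim * units + units * units + units
    return 4 * per_gate


def dense_param_count(input_dim: int, units: int) -> int:
    """Parameters of a dense layer: weight matrix plus bias vector."""
    return input_dim * units + units


STEPS, INPUT_SIZE, UNITS, CLASSES = 28, 28, 20, 10

lstm_params = lstm_param_count(INPUT_SIZE, UNITS)
# return_sequences=True keeps all 28 time steps, so Flatten yields 28 * 20 values.
flat_dim = STEPS * UNITS
dense_params = dense_param_count(flat_dim, CLASSES)
total_params = lstm_params + dense_params

print(lstm_params, flat_dim, dense_params, total_params)
# 3920 560 5610 9530
```

These values match the `lstm`, `flatten`, and `output` rows and the total in the summary above.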


# Step 2: Train & Evaluate the model.
We will train the model using MNIST data.

In [4]:
# Load MNIST dataset.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train = x_train.astype(np.float32)
x_test = x_test.astype(np.float32)

# Change this to True if you want to test the flow rapidly.
# Train with a small dataset and only 1 epoch. The model will work poorly
# but this provides a fast way to test if the conversion works end to end.
_FAST_TRAINING = False
_EPOCHS = 5
if _FAST_TRAINING:
  _EPOCHS = 1
  _TRAINING_DATA_COUNT = 1000
  x_train = x_train[:_TRAINING_DATA_COUNT]
  y_train = y_train[:_TRAINING_DATA_COUNT]

model.fit(x_train, y_train, epochs=_EPOCHS)
model.evaluate(x_test, y_test, verbose=0)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.06411228328943253, 0.9805999994277954]

In [17]:
x_train.shape

(60000, 28, 28)
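
The LSTM reads this shape as (batch, time steps, features): each image row is one time step carrying 28 pixel features. The following sketch illustrates that reading with a synthetic stand-in array, not the real MNIST data, and applies the same scaling and cast as the training cell.

```python
import numpy as np

# Synthetic stand-in for a small batch of MNIST images (uint8 pixels in [0, 255]).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Same preprocessing as in the training cell: scale to [0, 1], cast to float32.
x = (images / 255.0).astype(np.float32)

batch, steps, features = x.shape
first_step = x[0, 0]  # Row 0 of image 0: the LSTM input at time step 0.
print(batch, steps, features, first_step.shape, x.dtype)
# 4 28 28 (28,) float32
```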

# Step 3: Convert the Keras model to a TensorFlow Lite model.

In [19]:
def representative_dataset():
  # Yield 100 random samples used to calibrate the quantization ranges.
  for _ in range(100):
    data = np.random.rand(1, 28, 28)
    yield [data.astype(np.float32)]

run_model = tf.function(lambda x: model(x))
# This is important: fix the input size.
BATCH_SIZE = 1
STEPS = 28
INPUT_SIZE = 28
concrete_func = run_model.get_concrete_function(
    tf.TensorSpec([BATCH_SIZE, STEPS, INPUT_SIZE], model.inputs[0].dtype))

# Directory for the SavedModel.
MODEL_DIR = "keras_lstm"
model.save(MODEL_DIR, save_format="tf", signatures=concrete_func)

# Convert with full-integer (int8) quantization, including the model
# input and output tensors.
converter = tf.lite.TFLiteConverter.from_saved_model(MODEL_DIR)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_model = converter.convert()
with open('model_dynamic_intonly.tflite', 'wb') as f:
  f.write(tflite_model)



INFO:tensorflow:Assets written to: keras_lstm/assets


INFO:tensorflow:Assets written to: keras_lstm/assets
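
The representative dataset in the conversion cell is uniform random noise, so the activation ranges are calibrated on inputs that do not look like digits. A possible alternative, sketched here on the assumption that calibrating on real training images is preferable, is to yield samples from `x_train`. The helper name `make_representative_dataset` is hypothetical and not part of TensorFlow, and a synthetic array stands in for `x_train` so that the block runs on its own.

```python
import numpy as np


def make_representative_dataset(samples, count=100):
    """Return a generator function yielding up to `count` single-image batches.

    `samples` is expected to have shape (N, 28, 28). Each yielded item is a
    list holding one float32 array of shape (1, 28, 28), matching the fixed
    input signature used for conversion.
    """
    def representative_dataset():
        for i in range(min(count, len(samples))):
            yield [samples[i:i + 1].astype(np.float32)]
    return representative_dataset


# Synthetic stand-in for x_train so the sketch is self-contained.
fake_x_train = np.random.rand(250, 28, 28)
rep_fn = make_representative_dataset(fake_x_train, count=100)
batches = list(rep_fn())
print(len(batches), batches[0][0].shape, batches[0][0].dtype)
```

With the real data, `converter.representative_dataset = make_representative_dataset(x_train)` would take the place of the random generator.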


# Step 4: Check the converted TensorFlow Lite model.
Now load the TensorFlow Lite model and use the TensorFlow Lite Python interpreter to verify the results.

In [20]:
# Number of test images to compare.
TEST_CASES = 10

# Load the converted model into the TensorFlow Lite interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

for i in range(TEST_CASES):
  # Run the model with TensorFlow to get the expected result.
  expected = model.predict(x_test[i:i+1])
  # Run the same input through the TensorFlow Lite model.
  interpreter.set_tensor(input_details[0]["index"], x_test[i:i+1, :, :])
  interpreter.invoke()
  result = interpreter.get_tensor(output_details[0]["index"])

  # Assert that the TFLite result is consistent with the TF result.
  np.testing.assert_almost_equal(expected, result, decimal=4)
  print("Done. The result of TensorFlow matches the result of TensorFlow Lite.")

  # Please note: the TFLite fused LSTM kernel is stateful, so we need to
  # reset its internal states between runs.
  interpreter.reset_all_variables()



ValueError: ignored
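
The `ValueError` recorded above is consistent with a dtype mismatch: the converter was configured with `inference_input_type = tf.int8`, while the test loop passes float32 images to `set_tensor`. This diagnosis is an assumption, since the full error message is not shown. A minimal sketch of the usual workaround follows: quantize the input and dequantize the output using the `(scale, zero_point)` pair that `get_input_details()` and `get_output_details()` report under the `quantization` key. The helper names and the mock detail dictionaries (with made-up scale and zero-point values) are illustrative and not part of the TensorFlow Lite API.

```python
import numpy as np


def quantize_input(x, detail):
    """Map float data to the integer type described by a tensor detail dict."""
    scale, zero_point = detail["quantization"]
    dtype = detail["dtype"]
    if scale == 0:  # A scale of 0 means the tensor is not quantized.
        return x.astype(dtype)
    info = np.iinfo(dtype)
    q = np.round(x / scale) + zero_point
    return np.clip(q, info.min, info.max).astype(dtype)


def dequantize_output(q, detail):
    """Map integer model output back to float: real = scale * (q - zero_point)."""
    scale, zero_point = detail["quantization"]
    if scale == 0:
        return q.astype(np.float32)
    return (q.astype(np.float32) - zero_point) * scale


# Mock details with example values; real ones come from the interpreter.
mock_input = {"dtype": np.int8, "quantization": (1.0 / 255.0, -128)}
mock_output = {"dtype": np.int8, "quantization": (1.0 / 256.0, -128)}

x = np.array([[0.0, 1.0, 2.0]], dtype=np.float32)
q_in = quantize_input(x, mock_input)  # 2.0 is out of range and gets clipped
probs = dequantize_output(np.array([[-128, 127]], dtype=np.int8), mock_output)
print(q_in, q_in.dtype, probs)
```

In the test loop, this would mean passing `quantize_input(x_test[i:i+1], input_details[0])` to `set_tensor` and comparing `dequantize_output(result, output_details[0])` against `expected`. Quantized outputs are unlikely to match the float model to four decimal places, so a looser tolerance or a comparison of predicted labels may be more appropriate.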

# Step 5: Let's inspect the converted TFLite model.

Let's check the model. You can see that the LSTM is in its fused format.

![Fused LSTM](https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/lite/examples/experimental_new_converter/keras_lstm.png)
