# 10 - TensorFlow Lite

Previously, we trained a model based on MobileNetV2 to differentiate between cats and dogs.

Before running inference with this model, we will convert it to TensorFlow Lite.
The converted model is smaller and typically runs faster on constrained hardware.

## Pre-reading

- [TensorFlow Lite](https://www.tensorflow.org/lite/guide)
- [Pre-trained models for TensorFlow Lite](https://www.tensorflow.org/lite/models/trained)
- [Model conversion overview](https://www.tensorflow.org/lite/models/convert)


## Prerequisites

Before this notebook will run, we need to upload the previously trained model and some sample images.

### Upload sample images

The GitHub repository has several cat and dog pictures co-located with this notebook.
Upload them to a directory named `img/` or go find your own test pictures.

In [1]:
# This should show several .jpgs of cats and dogs, after uploading
!ls img

cat1.jpg  cat2.jpg  cat3.jpg  dog1.jpg	dog2.jpg  dog3.jpg  Vin.jpg


### Upload the Model

First, upload the saved model from the previous lesson (`cat-dog-tuned.zip`).

Then unzip the model.

In [2]:
# Run after uploading the file
!unzip cat-dog-tuned.zip

Archive:  cat-dog-tuned.zip
replace cat-dog-tuned/keras_metadata.pb? [y]es, [n]o, [A]ll, [N]one, [r]ename: ^C


## Convert the saved model

We'll use [TFLiteConverter](https://www.tensorflow.org/lite/api_docs/python/tf/lite/TFLiteConverter) to export the model to a single binary.

In [3]:
import tensorflow as tf

print(tf.__version__)
# help() prints the documentation itself and returns None, so no print() is needed
help(tf.lite.TFLiteConverter)

2023-06-21 15:05:31.904920: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-06-21 15:05:31.946597: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-06-21 15:05:31.947460: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


2.12.0
Help on class TFLiteConverterV2 in module tensorflow.lite.python.lite:

class TFLiteConverterV2(TFLiteFrozenGraphConverterV2)
 |  TFLiteConverterV2(funcs, trackable_obj=None)
 |  
 |  Converts a TensorFlow model into TensorFlow Lite model.
 |  
 |  Attributes:
 |    optimizations: Experimental flag, subject to change. Set of optimizations to
 |      apply. e.g {tf.lite.Optimize.DEFAULT}. (default None, must be None or a
 |      set of values of type `tf.lite.Optimize`)
 |    representative_dataset: A generator function used for integer quantization
 |      where each generated sample has the same order, type and shape as the
 |      inputs to the model. Usually, this is a small subset of a few hundred
 |      samples randomly chosen, in no particular order, from the training or
 |      evaluation dataset. This is an optional attribute, but required for full
 |      integer quantization, i.e, if `tf.int8` is the only supported type in
 |      `target_spec.supported_types`. Refer 

To convert the model, we load the SavedModel into a converter and then follow [the docs](https://www.tensorflow.org/lite/models/convert/convert_models#convert_a_savedmodel_recommended_).

In [4]:
saved_model_dir = "cat-dog-tuned"  # path to the SavedModel directory
tflite_model_path = "cat-dog.tflite"  # where to save the converted model

# Load the SavedModel into a converter
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
# Use dynamic range quantization
# https://www.tensorflow.org/lite/performance/post_training_quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Convert the model; the result is the serialized model as bytes
tflite_model = converter.convert()

# Save the model
with open(tflite_model_path, "wb") as f:
    f.write(tflite_model)

2023-06-21 15:05:47.378725: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2023-06-21 15:05:47.378752: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2023-06-21 15:05:47.379467: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: cat-dog-tuned
2023-06-21 15:05:47.411330: I tensorflow/cc/saved_model/reader.cc:89] Reading meta graph with tags { serve }
2023-06-21 15:05:47.411361: I tensorflow/cc/saved_model/reader.cc:130] Reading SavedModel debug info (if present) from: cat-dog-tuned
2023-06-21 15:05:47.494248: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:353] MLIR V1 optimization pass is not enabled
2023-06-21 15:05:47.517692: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2023-06-21 15:05:48.013180: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: cat-dog-tuned
2023-06-21 15:05
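
A quick way to see what the conversion changed is to compare on-disk sizes. The following is a minimal sketch using only the standard library. It assumes the `cat-dog-tuned` directory and `cat-dog.tflite` file from the cell above; `dir_size_bytes` and `size_report` are illustrative helper names, not part of TensorFlow.

```python
import os


def dir_size_bytes(path):
    """Total size in bytes of all regular files under a directory."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def size_report(saved_model_dir, tflite_path):
    """Return (SavedModel bytes, .tflite bytes, ratio); both paths must exist."""
    saved = dir_size_bytes(saved_model_dir)
    lite = os.path.getsize(tflite_path)
    ratio = lite / saved if saved else float("nan")
    return saved, lite, ratio


# Paths assumed from the conversion cell; nothing is printed if they are absent.
if os.path.isdir("cat-dog-tuned") and os.path.isfile("cat-dog.tflite"):
    saved, lite, ratio = size_report("cat-dog-tuned", "cat-dog.tflite")
    print(f"SavedModel: {saved / 1e6:.1f} MB, TFLite: {lite / 1e6:.1f} MB ({ratio:.0%})")
```

The ratio you see depends on the model and on which optimizations were enabled, so treat it as a sanity check rather than a benchmark.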

## Conduct inference on novel images

Now that we've converted the model, let's test it with novel images!
We can do this either with the full Keras API or with the TensorFlow Lite runtime, depending on our hardware and OS.

### Upload images

You should see 7 `.jpg` images in the folder. Upload those to Colab prior to continuing, or feel free to grab your own images of cats and dogs!


### Use TF Keras API

This requires a full TensorFlow installation. It uses the original model saved in the `cat-dog-tuned` directory.

In [17]:
# use Keras API
import tensorflow as tf
import numpy as np
import os
from time import process_time


def picture_to_img_array(picture):
    """Takes the path to a picture.
    Returns an image array ready for inference.
    """
    # Load and resize the image
    img = tf.keras.utils.load_img(picture, target_size=(160, 160))
    img_array = tf.keras.utils.img_to_array(img)
    img_array = tf.expand_dims(img_array, 0)  # Create a batch of size 1
    return img_array


def keras_inference(model, img_array):
    # Conduct inference and extract the result from the np array
    prediction = model.predict(img_array)
    return np.squeeze(prediction)


# Labels: 0 = Cat, 1 = Dog
model = tf.keras.models.load_model("cat-dog-tuned")

# Where test images should be uploaded to
# (named img_dir to avoid shadowing the built-in dir())
img_dir = "img/"
# Iterate over all images in directory
for root, dirs, files in os.walk(img_dir):
    for file in files:
        if file.endswith(".jpg") or file.endswith(".jpeg"):
            file_path = os.path.join(root, file)

            img_array = picture_to_img_array(file_path)
            # This is what actually does the inference
            start_time = process_time()
            result = keras_inference(model, img_array)
            elapsed_time = process_time() - start_time
            # Sigmoid maps the raw output (a logit) to a probability in (0, 1);
            # 0.5 is the decision boundary between cat (0) and dog (1)
            sig_result = tf.nn.sigmoid(result)
            sig_predict = tf.where(sig_result < 0.5, 0, 1)
            sig_predict = sig_predict.numpy()

            print("Pic:", file_path)
            print("Elapsed time", elapsed_time)
            print("prediction", result)
            print("Inferred label:", sig_predict)

Pic: img/dog2.jpg
Elapsed time 0.6665851090000103
prediction 9.456733
Inferred label: 1
Pic: img/cat1.jpg
Elapsed time 0.10415499900000214
prediction -9.482505
Inferred label: 0
Pic: img/dog1.jpg
Elapsed time 0.10698500099999819
prediction 6.181109
Inferred label: 1
Pic: img/cat2.jpg
Elapsed time 0.11429039399999397
prediction -11.236158
Inferred label: 0
Pic: img/Vin.jpg
Elapsed time 0.09350875300000894
prediction -9.335227
Inferred label: 0
Pic: img/cat3.jpg
Elapsed time 0.09842011700000342
prediction -6.9037724
Inferred label: 0
Pic: img/dog3.jpg
Elapsed time 0.10101701499999649
prediction 7.2035856
Inferred label: 1
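
In the output above, the first image takes about 0.67 s while the rest take about 0.10 s, which likely reflects one-time setup work on the first `predict` call. Note also that `process_time` measures CPU time for the process rather than wall-clock time. The sketch below shows one way to get steadier numbers: run untimed warm-up calls first and time with `perf_counter`. The `benchmark` helper and its defaults are hypothetical; `fn` stands in for any zero-argument callable, such as `lambda: keras_inference(model, img_array)`.

```python
from statistics import mean
from time import perf_counter


def benchmark(fn, warmup=1, repeats=5):
    """Call fn() `warmup` times untimed, then return the wall-clock
    duration in seconds of each of `repeats` timed calls."""
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(repeats):
        start = perf_counter()
        fn()
        timings.append(perf_counter() - start)
    return timings


# Stand-in workload so the sketch runs anywhere; swap in the inference call.
timings = benchmark(lambda: sum(i * i for i in range(100_000)))
print(f"mean {mean(timings):.6f}s, min {min(timings):.6f}s over {len(timings)} runs")
```

Reporting both the mean and the minimum makes it easier to tell a consistently slow call from occasional interference by other processes.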


### Use TF Lite Interpreter

This more closely mirrors what we'll do on our embedded system.
The only difference is that here we use the `tf.lite` module bundled with TensorFlow instead of the standalone `tflite-runtime` package.

In [19]:
# Use tf.lite interpreter
import tensorflow as tf  # on embedded device use: import tflite_runtime.interpreter as tflite
import numpy as np
from PIL import Image
from time import process_time

model_path = "cat-dog.tflite"

# For running on tflite-runtime replace this with tflite.Interpreter
interpreter = tf.lite.Interpreter(model_path=model_path)

# Allocate memory for the model's tensors up front;
# this must be called before set_tensor() and invoke()
interpreter.allocate_tensors()
# Details about model inputs and outputs
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]["shape"]  # [batch, height, width, channels]

for p in [
    "Vin.jpg",
    "cat1.jpg",
    "cat2.jpg",
    "cat3.jpg",
    "dog1.jpg",
    "dog2.jpg",
    "dog3.jpg",
]:
    # Labels: 0 = Cat, 1 = Dog
    # Load the image using PIL; force 3-channel RGB in case of grayscale or RGBA files
    image = Image.open("img/" + p).convert("RGB")
    # Resize the image to match what the model was trained on.
    # PIL's resize() expects (width, height), while input_shape is [batch, height, width, channels]
    resized_image = image.resize((input_shape[2], input_shape[1]))
    input_data = np.array(resized_image, dtype=np.float32)
    input_data = np.expand_dims(input_data, axis=0)  # Create a batch of size 1

    # Conduct inference
    start_time = process_time()
    interpreter.set_tensor(input_details[0]["index"], input_data)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]["index"])
    # Pull out the raw value from the np array
    prediction = np.squeeze(output_data)
    elapsed_time = process_time() - start_time

    # sigmoid(x) < 0.5 exactly when x < 0, so thresholding the raw output at 0
    # gives the same label without computing an exponential.
    # If you need an "unknown" option or a confidence threshold, use something like this:
    # label = 0 if prediction < -3 else (1 if prediction > 3 else -1)
    label = 0 if prediction < 0 else 1

    print("Pic:", p)
    print("Elapsed time:", elapsed_time)
    print("Prediction:", prediction)
    print("Inferred label:", label)

Pic: Vin.jpg
Elapsed time: 0.014929627999990203
Prediction: -10.589223
Inferred label: 0
Pic: cat1.jpg
Elapsed time: 0.014831803999996396
Prediction: -9.261114
Inferred label: 0
Pic: cat2.jpg
Elapsed time: 0.014625451999989991
Prediction: -10.044463
Inferred label: 0
Pic: cat3.jpg
Elapsed time: 0.01447718400000042
Prediction: -6.615289
Inferred label: 0
Pic: dog1.jpg
Elapsed time: 0.018185125000002245
Prediction: 5.3814516
Inferred label: 1
Pic: dog2.jpg
Elapsed time: 0.014137142999999242
Prediction: 10.115644
Inferred label: 1
Pic: dog3.jpg
Elapsed time: 0.014443624999998406
Prediction: 7.5566382
Inferred label: 1
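
Thresholding the raw output at zero, as the cell above does, is not an approximation: the sigmoid is monotonic and equals 0.5 at zero, so both rules give the same label. Likewise, a cutoff of ±3 on the raw output corresponds to a probability cutoff of roughly 0.95. The sketch below checks this with the standard library only; `label_from_logit` and its `margin` parameter are illustrative names, and the sample values are taken from the output above.

```python
import math


def sigmoid(x):
    """Logistic function: maps a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def label_from_logit(logit, margin=0.0):
    """0 = cat, 1 = dog, -1 = unknown when |logit| <= margin (margin > 0 only)."""
    if logit < -margin:
        return 0
    if logit > margin:
        return 1
    # With margin == 0 this is the logit == 0 case, which the notebook labels 1
    return -1 if margin > 0 else 1


# Raw outputs taken from the TF Lite run above
for logit in (-10.589223, -6.615289, 5.3814516, 10.115644):
    by_sigmoid = 0 if sigmoid(logit) < 0.5 else 1
    assert by_sigmoid == label_from_logit(logit)
    print(f"{logit:>10.4f} -> p(dog) = {sigmoid(logit):.5f}, label {label_from_logit(logit)}")

# A margin of 3 on the logit corresponds to this probability cutoff
print(f"sigmoid(3) = {sigmoid(3):.4f}")
```

On a device without fast floating-point math, comparing the raw output against a fixed margin avoids the exponential entirely while still allowing an "unknown" answer.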


## Next step: embedded device!

Now put the `.tflite` model on an embedded device, such as a Raspberry Pi or an Arduino, and run inference!

### Running on Raspberry Pi OS Bullseye 11

You can get started with:

```bash
pip install tflite-runtime==2.11.0
```

Then change the TF Lite interpreter code above to `import tflite_runtime.interpreter as tflite` and use `tflite.Interpreter` instead of `tf.lite.Interpreter`, as annotated in the comments.
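
If you would rather keep a single script that runs both in the notebook and on the Pi, one option is to pick the interpreter at import time. This is a sketch, not the only way to do it: `find_interpreter_class` is a hypothetical helper that tries `tflite_runtime.interpreter.Interpreter` first and falls back to `tf.lite.Interpreter`, returning `None` if neither package is available.

```python
import importlib


def find_interpreter_class():
    """Return the Interpreter class from tflite-runtime if installed,
    else from full TensorFlow, else None."""
    try:
        return importlib.import_module("tflite_runtime.interpreter").Interpreter
    except ImportError:
        pass  # tflite-runtime is not installed; try full TensorFlow next
    try:
        return importlib.import_module("tensorflow").lite.Interpreter
    except (ImportError, AttributeError):
        return None


Interpreter = find_interpreter_class()
if Interpreter is None:
    print("Neither tflite-runtime nor tensorflow is installed")
else:
    print("Using", Interpreter.__module__)
    # The rest of the inference code stays the same, for example:
    # interpreter = Interpreter(model_path="cat-dog.tflite")
```

Trying `tflite_runtime` first means the small package is preferred on the embedded device even if full TensorFlow also happens to be installed.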

Finally, run it! Then consider using your picamera for live inference!