# TensorFlow Lite

Previously, we trained a model based on MobileNetV2 to differentiate between cats and dogs.

Before running inference with this model, we will convert it to TensorFlow Lite.
The converted model is a single compact file that typically runs faster and uses less memory on constrained hardware.

## Pre-reading

- [TensorFlow Lite](https://www.tensorflow.org/lite/guide)
- [Pre-trained models for TensorFlow Lite](https://www.tensorflow.org/lite/models/trained)
- [Model conversion overview](https://www.tensorflow.org/lite/models/convert)


## Prerequisites

Before continuing, we need to upload the previously trained model and some sample images.

### Upload sample images

The GitHub repository has several cat and dog pictures co-located with this notebook.
Upload them to a directory named `img/` or go find your own test pictures.

In [None]:
# This should show several .jpgs of cats and dogs, after uploading
!ls img

### Upload the Model

First, upload the saved model from the previous lesson (`cat-dog-tuned.zip`).

Then unzip the model.

In [None]:
# Run after uploading the file
!unzip cat-dog-tuned.zip

## Convert the saved model

We'll use [TFLiteConverter](https://www.tensorflow.org/lite/api_docs/python/tf/lite/TFLiteConverter) to export the model to a single binary.

In [None]:
import tensorflow as tf

print(tf.__version__)
# help() prints the documentation itself and returns None, so no print() is needed
help(tf.lite.TFLiteConverter)

To convert the model we will open it and then follow [the docs](https://www.tensorflow.org/lite/models/convert/convert_models#convert_a_savedmodel_recommended_).

In [None]:
saved_model_dir = "cat-dog-tuned"  # path to the SavedModel directory
tflite_path = "cat-dog.tflite"  # where to save the converted model

# Open the model
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
# Use Dynamic Range Quantization
# https://www.tensorflow.org/lite/performance/post_training_quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Convert the model; the result is the serialized model as bytes
tflite_model = converter.convert()

# Save the model.
with open(tflite_path, "wb") as f:
    f.write(tflite_model)
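
Dynamic range quantization stores the model's weights as 8-bit integers instead of 32-bit floats, which is the main reason the converted file is smaller. The sketch below is only a toy illustration of that idea using one shared symmetric scale. The helper names `quantize_int8` and `dequantize` and the weight values are made up for this example, and TensorFlow Lite's actual scheme differs in its details.

```python
# Toy illustration of 8-bit weight quantization.
# This is NOT TensorFlow Lite's exact algorithm; names and values are hypothetical.


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.82, -0.41, 0.003, -1.27, 0.5]  # made-up example weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Each restored value differs from the original by at most half a step (scale / 2)
for w, q, r in zip(weights, quantized, restored):
    print(f"{w:+.4f} -> {q:+4d} -> {r:+.4f}")
```

Each weight now fits in one byte instead of four, at the cost of a small rounding error. Very small weights, such as `0.003` here, can round to zero.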

## Conduct inference on novel images

Now that we've converted the model, let's test it with novel images!
We can do this either with the full Keras API or with the TensorFlow Lite runtime, depending on our hardware and OS.

### Upload images

You should see 7 `.jpg` images in the repository folder alongside this notebook. Upload those to Colab before continuing, or feel free to grab your own images of cats and dogs!


### Use TF Keras API

This requires the full TensorFlow install. It loads the original model from its SavedModel directory.

In [None]:
# use Keras API
import tensorflow as tf
import numpy as np
import os
from time import process_time

# Labels: 0 = Cat, 1 = Dog
model = tf.keras.models.load_model("cat-dog-tuned")

# Where test images should be uploaded to
img_dir = "img/"  # avoid the name `dir`, which shadows a Python built-in
# Recursively iterate over all images in directory
for root, dirs, files in os.walk(img_dir):
    for file in files:
        if file.endswith(".jpg") or file.endswith(".jpeg"):
            # Load and resize the image
            file_path = os.path.join(root, file)
            img = tf.keras.utils.load_img(file_path, target_size=(160, 160))
            img_array = tf.keras.utils.img_to_array(img)
            img_array = tf.expand_dims(img_array, 0)  # Create a batch of size 1

            # Conduct inference and extract the result from the np array
            start_time = process_time()
            prediction = model.predict(img_array)
            result = np.squeeze(prediction)
            elapsed_time = process_time() - start_time

            # Activation function
            sig_result = tf.nn.sigmoid(result)
            sig_predict = tf.where(sig_result < 0.5, 0, 1)
            sig_predict = sig_predict.numpy()

            print("Img:", file)
            print("Inference time", elapsed_time)
            print("Raw prediction:", result)
            print("Inferred label:", sig_predict)
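
A note on the timing in the cell above: `process_time` reports CPU time used by the current process, not elapsed wall-clock time. Time spent sleeping or waiting is not counted, and for multi-threaded work the CPU time can differ noticeably from what a stopwatch would show. The sketch below compares it with `time.perf_counter`, which measures elapsed time. The `timed` and `wait` helpers are hypothetical names for this example only.

```python
import time


def timed(clock, work):
    """Run work() and return how much time passed according to clock()."""
    start = clock()
    work()
    return clock() - start


def wait():
    # Stands in for time that is not spent on this process's CPU
    time.sleep(0.1)


cpu_elapsed = timed(time.process_time, wait)
wall_elapsed = timed(time.perf_counter, wait)

print(f"process_time: {cpu_elapsed:.4f} s (CPU time only)")
print(f"perf_counter: {wall_elapsed:.4f} s (elapsed time)")
```

Which clock is appropriate depends on the question being asked: CPU time for how much computation was done, elapsed time for how long a user would wait.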

### Use TF Lite Interpreter

This more closely mirrors what we'll do on our embedded system.
The only difference is we will use the included `tf.lite` module instead of the standalone `tflite-runtime`.

In [None]:
# Use tf.lite interpreter
import os
import tensorflow as tf  # on embedded device use: import tflite_runtime.interpreter as tflite
import numpy as np
from PIL import Image
from time import process_time

# Labels: 0 = Cat, 1 = Dog
model_path = "cat-dog.tflite"

# For running on tflite-runtime replace this with tflite.Interpreter
interpreter = tf.lite.Interpreter(model_path=model_path)

# Allocate memory for the model's tensors; this must be called before inference
interpreter.allocate_tensors()
# Details about model inputs and outputs
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]["shape"]

# Where test images should be uploaded to
img_dir = "img/"  # avoid the name `dir`, which shadows a Python built-in
# Recursively iterate over all images in directory
for root, dirs, files in os.walk(img_dir):
    for file in files:
        if file.endswith(".jpg") or file.endswith(".jpeg"):
            file_path = os.path.join(root, file)

            # Load the image using PIL and force 3 color channels (RGB)
            image = Image.open(file_path).convert("RGB")
            # Resize to the model's input size; the shape is (batch, height, width, channels)
            # while PIL expects (width, height)
            height, width = int(input_shape[1]), int(input_shape[2])
            resized_image = image.resize((width, height))
            input_data = np.array(resized_image, dtype=np.float32)
            input_data = np.expand_dims(input_data, axis=0)  # Create a batch of size 1

            # Conduct inference
            start_time = process_time()
            interpreter.set_tensor(input_details[0]["index"], input_data)
            interpreter.invoke()
            output_data = interpreter.get_tensor(output_details[0]["index"])
            # Pull out the raw value from the np array
            prediction = np.squeeze(output_data)
            elapsed_time = process_time() - start_time

            # Computing exponents for the sigmoid function is expensive, so use a simple heuristic instead:
            # sigmoid(x) < 0.5 exactly when x < 0, so the sign of the raw output gives the same label.
            # If you need an "unknown" option or confidence threshold, use something like this.
            # label = 0 if prediction < -3 else (1 if prediction > 3 else -1)
            label = 0 if prediction < 0 else 1

            print("Img:", file)
            print("Inference time", elapsed_time)
            print("Raw prediction:", prediction)
            print("Inferred label:", label)
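
The sign check in the cell above gives the same labels as the sigmoid-based approach from the Keras cell because the sigmoid function is monotonic and equals 0.5 at an input of 0. The following plain-Python sketch checks this on a few made-up raw outputs. The function names are hypothetical and the values are not real model predictions.

```python
import math


def sigmoid(x):
    """Logistic function: maps a raw model output to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def label_from_sigmoid(raw):
    """Labels: 0 = Cat, 1 = Dog, using a 0.5 cutoff on the sigmoid output."""
    return 0 if sigmoid(raw) < 0.5 else 1


def label_from_sign(raw):
    """Same labels, decided from the sign of the raw output alone."""
    return 0 if raw < 0 else 1


# Made-up raw outputs, not real predictions
for raw in [-5.0, -3.0, -0.1, 0.0, 0.1, 3.0, 5.0]:
    assert label_from_sigmoid(raw) == label_from_sign(raw)
    print(f"raw {raw:+.1f} -> sigmoid {sigmoid(raw):.3f} -> label {label_from_sign(raw)}")
```

A cutoff of 3 on the raw output, as in the commented-out alternative, corresponds to a sigmoid value of roughly 0.95, so anything between -3 and 3 would be treated as "unknown".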

## Next step: Raspberry Pi!

Now go put the tflite model on an embedded device, such as Raspberry Pi, and conduct inference!

### Running on Raspberry Pi OS Bullseye 11

This was tested with a Pi 4 running 64-bit Bullseye.

First, clone this repository to your Pi:

```bash
git clone --depth 1 https://github.com/USAFA-ECE/ai-hardware.git
```

Then navigate to `cat-dog/`.

Next, create a virtual environment. This will isolate Python dependencies.

```bash
python3 -m venv env
source env/bin/activate
```

Install dependencies (located in the `book/` dir):

```bash
pip install -r tflite-requirements.txt
```

Next, change the above code to use `tflite` instead of `tf.lite`, as annotated in the comments.

Finally, run it!
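
If you would rather keep one script that runs both in Colab and on the Pi, one option is an import fallback. This is a minimal sketch, assuming `tflite-runtime` is what gets installed on the Pi and full TensorFlow is available elsewhere. The helper `load_interpreter_class` is a hypothetical name introduced for this example.

```python
def load_interpreter_class():
    """Return (Interpreter class, source name), preferring the lightweight runtime.

    Returns (None, None) if neither package is installed.
    """
    try:
        # Standalone runtime, as typically installed on embedded devices
        from tflite_runtime.interpreter import Interpreter

        return Interpreter, "tflite_runtime"
    except ImportError:
        pass
    try:
        # Full TensorFlow install, as used in the notebook cells above
        import tensorflow as tf

        return tf.lite.Interpreter, "tensorflow"
    except (ImportError, AttributeError):
        return None, None


Interpreter, source = load_interpreter_class()
if Interpreter is None:
    print("No TensorFlow Lite interpreter found; install tflite-runtime or tensorflow")
else:
    print("Using Interpreter from", source)
    # Then, as in the cell above: interpreter = Interpreter(model_path="cat-dog.tflite")
```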

### PiCamera for live inference

This is an exercise to do on your own: write a new Python script that takes a picture of a real animal and tells you whether it is a dog or a cat.