Use TensorFlow Lite to make predictions with an off-the-shelf deep-learning model whose architecture and weights are predetermined. Only the ```Interpreter``` class is needed, and the small ```tflite_runtime``` package provides it without the rest of TensorFlow.
1. Download the wheel for Python 3.7 from [this link](https://dl.google.com/coral/python/tflite_runtime-1.14.0-cp37-cp37m-linux_armv7l.whl).
2. Install it with pip3, pointing to the local copy of the '.whl' file.
3. Verify the installation by opening a Python session and typing ```from tflite_runtime.interpreter import Interpreter```.
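
The verification step can also be scripted. The sketch below is one possible check, not part of the original setup: the helper name `runtime_available` is made up for illustration, and it only tests whether the top-level package can be found, without importing it.

```python
import importlib.util


def runtime_available(module_name="tflite_runtime"):
    """Return True if the named top-level package can be located.

    Hypothetical helper for illustration; it does not import the package.
    """
    return importlib.util.find_spec(module_name) is not None


print("tflite_runtime installed:", runtime_available())
```

If this prints `False`, the wheel was most likely installed for a different Python interpreter than the one running the check.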

Now, proceed with the script.

In [1]:
from tflite_runtime.interpreter import Interpreter
from PIL import Image
import numpy as np
import time
import os
from picamera import PiCamera
from io import BytesIO

The ```Interpreter``` class makes predictions with a pre-trained model stored in the TensorFlow Lite format. These models typically follow a standard deep-learning architecture, such as MobileNet. However, this need not be very restrictive: a service like Google's [Teachable Machine](https://teachablemachine.withgoogle.com) can produce such a model and allows some tweaking of it through transfer learning.

For this exercise, download the [MobileNet](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_1.0_224_quant_and_labels.zip) Convolutional Neural Network (CNN). The zip archive has two files: (1) the set of weights and (2) the category labels for classification. Each weight is represented with 8 bits, so the model trades precision for a smaller file and a lower memory footprint during computation.
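
To make the precision-versus-storage trade-off concrete, here is a minimal sketch of 8-bit affine quantization, where a real value is approximated as `scale * (q - zero_point)`. The helper names and the `scale` and `zero_point` values are illustrative assumptions, not numbers read from the MobileNet file.

```python
import numpy as np


def quantize(values, scale, zero_point):
    """Map float values to uint8 codes (illustrative helper)."""
    q = np.round(values / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)


def dequantize(q, scale, zero_point):
    """Recover approximate float values from uint8 codes."""
    return scale * (q.astype(np.float32) - zero_point)


# Assumed example values, chosen only for this demonstration.
weights = np.array([-0.5, 0.0, 0.25, 0.49], dtype=np.float32)
scale, zero_point = 1.0 / 256, 128

q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)

print("float32 bytes:", weights.nbytes, "uint8 bytes:", q.nbytes)
print("largest rounding error:", float(np.max(np.abs(weights - restored))))
```

The uint8 copy takes a quarter of the storage, and in this example each restored value differs from the original by at most half of `scale`.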

Let's get set up to load the model components extracted from the archive.

In [2]:
data_source = '/home/pi/Downloads/MobileNet'

path2model = os.path.join(data_source, 'mobilenet_v1_1.0_224_quant.tflite')
path2labels = os.path.join(data_source, 'labels_mobilenet_quant_v1_224.txt')

path2labels
path2model



'/home/pi/Downloads/MobileNet/mobilenet_v1_1.0_224_quant.tflite'

Let's load the category labels and examine the first 7 of them.

In [3]:
with open(path2labels) as text_labels:
    labels = text_labels.readlines()
labels = [label.rstrip() for label in labels]
labels[0:7]

['background',
 'tench',
 'goldfish',
 'great white shark',
 'tiger shark',
 'hammerhead',
 'electric ray']

Let's load the weights and allocate tensors.

In [20]:
interpreter = Interpreter(path2model)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]["shape"][1]
width = input_details[0]["shape"][2]

The model expects an input image of a specific shape. Let's check what it is. We will resize our images to match it later.

In [21]:
print("Use an input image of size {} x {}.".format(height, width))

Use an input image of size 224 x 224.
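
For readers without the model at hand, the sketch below shows how the height and width are read out of the input details. The dictionary is a hand-written stand-in for what ```get_input_details()``` returns, and its shape of (batch, height, width, channels) = (1, 224, 224, 3) is an assumption consistent with the output above; the helper name is hypothetical.

```python
import numpy as np

# Hand-written stand-in for interpreter.get_input_details(); assumed values.
fake_input_details = [
    {"index": 0, "shape": np.array([1, 224, 224, 3]), "dtype": np.uint8}
]


def expected_image_size(input_details):
    """Return (height, width, channels) from the first input tensor's shape."""
    _, height, width, channels = input_details[0]["shape"]
    return int(height), int(width), int(channels)


print(expected_image_size(fake_input_details))
```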


Now that we know the dimensions, let's prepare an image to classify. We will first take a picture with the picamera, so let's set it up. Note that the cell below must be run only once. To run it again, restart the kernel and rerun the notebook from the top.

In [6]:
eye_stream = BytesIO()

eye = PiCamera()
eye.rotation = 180
eye.resolution = (300, 300)

time.sleep(2)  # Warm up

Capture an image into the byte stream, then open it and resize it for feeding to the classifier.

In [26]:
eye.capture(eye_stream, 'jpeg')
eye_stream.seek(0)
snap = Image.open(eye_stream).convert('RGB').resize((width, height))  # PIL expects (width, height)
snap.show()
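
One detail worth keeping in mind when resizing: PIL reports and accepts sizes as (width, height), whereas the NumPy array built from an image has shape (height, width, channels). The sketch below illustrates this with a blank synthetic image standing in for a camera frame; the 300 x 200 frame size is an arbitrary assumption.

```python
import numpy as np
from PIL import Image

# Stand-in for a camera frame: blank RGB image, 300 wide and 200 tall.
frame = Image.new("RGB", (300, 200))

model_height, model_width = 224, 224  # dimensions reported by the model
resized = frame.resize((model_width, model_height))
pixels = np.asarray(resized)

print(frame.size, np.asarray(frame).shape)  # PIL size vs. NumPy shape
print(resized.size, pixels.shape, pixels.dtype)
```

For a square input such as 224 x 224 the order makes no visible difference, but it matters for models with non-square inputs.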

Now, set the tensor to feed the image to the deep-learning model for image classification. Call the ```invoke()``` method to propagate the input through the layers of the CNN in feed-forward mode. 

In [27]:

im2classify = np.expand_dims(snap, axis=0)
interpreter.set_tensor(input_details[0]["index"], im2classify)
interpreter.invoke()
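
The call to ```np.expand_dims``` adds a leading batch dimension, because the model takes a batch of images even when there is only one. A minimal sketch with a blank array standing in for the resized camera image (an assumption for illustration):

```python
import numpy as np

# Stand-in for the resized camera image: 224 x 224 pixels, 3 colour channels.
image = np.zeros((224, 224, 3), dtype=np.uint8)

# Insert a new axis at position 0 so the array becomes a batch of one image.
batch = np.expand_dims(image, axis=0)

print(image.shape, "->", batch.shape)
```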

With calculations complete, process the output to extract the *k*th best match.

In [39]:
output_details = interpreter.get_output_details()

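output = np.squeeze(interpreter.get_tensor(output_details[0]["index"]))

# Dequantize; cast to float first so the subtraction cannot wrap around in uint8.
scale, zero_point = output_details[0]["quantization"]
output = scale * (output.astype(np.float32) - zero_point)

top = 1
ordered = np.argsort(-output)  # class indices sorted from best to worst match
(labels[ordered[top - 1]], output[ordered[top - 1]])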

('Polaroid camera', 0.21484375)
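
When ranking matches, note that ```np.argpartition``` only guarantees that the *k* best indices come first; it does not order them among themselves. The sketch below is one way to get an ordered top-*k* list, assuming made-up scores and a hypothetical helper name `top_k`.

```python
import numpy as np


def top_k(scores, k):
    """Return (index, score) pairs for the k highest scores, best first."""
    scores = np.asarray(scores, dtype=np.float64)
    # Select the k largest scores (unordered), then sort only those k.
    candidates = np.argpartition(-scores, k - 1)[:k]
    ranked = candidates[np.argsort(-scores[candidates])]
    return [(int(i), float(scores[i])) for i in ranked]


# Made-up scores for five classes.
scores = np.array([0.05, 0.21, 0.02, 0.40, 0.11])
print(top_k(scores, 3))
```

With these scores the ranked class indices are 3, 1 and 4, so the *k*th best match is simply the last entry of `top_k(scores, k)`.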

Package the code in functions for use in the envisaged computer vision application. See the docstrings for details about each function.

In [47]:
def load_labels(path2labels):
    """
    Load the text file with the classification labels into a list.
    Args
    - path2labels is the location (string) of the local copy of the file containing classification labels.
    Returns a list of classification labels.
    """
    with open(path2labels) as text_labels:
        labels = text_labels.readlines()
    return [label.rstrip() for label in labels]    

In [48]:
def set_input_tensor(interpreter, snap):
    """
    Convert the image from camera to a tensor for feeding to the convolutional neural network.
    Args
    - interpreter is the object of class Interpreter from Tensorflow Lite runtime 
    - snap is the open image from the picamera
    Returns None
    Note that Interpreter is the main class exposed by the TensorFlow Lite runtime.
    """
    im2classify = np.expand_dims(snap, axis=0)
    interpreter.set_tensor(interpreter.get_input_details()[0]["index"], im2classify)

In [49]:
def report_topkth(interpreter, topkth):
    """
    Process the output to report the k-th best match with its probability.
    Args
    - interpreter is the object of class Interpreter from Tensorflow Lite runtime
    - topkth is an integer specifying the desired rank of the match (1 is the best match)
    Returns a tuple with the match ID and its dequantized probability.
    """
    output_details = interpreter.get_output_details()

    output = np.squeeze(interpreter.get_tensor(output_details[0]["index"]))
    scale, zero_point = output_details[0]["quantization"]
    # Cast to float before subtracting so uint8 arithmetic cannot wrap around.
    output = scale * (output.astype(np.float32) - zero_point)

    # Sort class indices by descending score, then pick the requested rank.
    ordered = np.argsort(-output)
    match = ordered[topkth - 1]
    return match, output[match]

In [50]:
def classify_me(interpreter, snap, topkth=1):
    """
    Classify the image and return the k-th best match as (match ID, probability).
    """
    set_input_tensor(interpreter, snap)
    interpreter.invoke()

    return report_topkth(interpreter, topkth)

    

In [57]:
labels = load_labels(path2labels)

classID, classProbability = classify_me(interpreter, snap)

print("Detected '{}' with probability of {}%.".format(labels[classID], int(classProbability*100)))

Detected 'Polaroid camera' with probability of 21%.


## Conclusion
We have tested an approach for offline image classification on Raspberry Pi. The online approach, based on calling a web service, requires WiFi whereas the offline approach does not. We have overcome compatibility issues between the Raspberry Pi's ARM architecture and key computational components as follows:
1. Use PIL for working with image data instead of the pesky OpenCV.
2. Use the TensorFlow Lite runtime instead of the full-fledged TensorFlow library.
3. Use Teachable Machine to tweak deep-learning models that apply standard CNN architectures, if needed.

We are now poised to add computer vision features to projects such as the smart-seat to prevent infant deaths in a "hot car" and the surveillance system for home security.

## References

[1.] A [guide](https://www.digikey.com/en/maker/projects/how-to-perform-object-detection-with-tensorflow-lite-on-raspberry-pi/b929e1519c7c43d5b2c6f89984883588) to object detection with Tensorflow Lite on Raspberry Pi from DigiKey.