# ML on Edge
## Ehsan Shaghaei
Nov 2022

## Motivation
Most of the time, so-called “smart” devices are programmed to act as remote-controlled devices driven by the cloud or an app, or they simply stream sensor readings to the cloud, where the actual processing happens. Given the limited RAM and processing power available on these resource-constrained devices, only a limited set of tasks can be accomplished on the device itself.


![image.png](https://yastatic.net/s3/lpc/d875ecb7-7b11-4d77-95c8-adb76f5a1895.png)

## Platforms

In 2019, TensorFlow (TF) announced support for microcontrollers. AI on the edge had long been described as the logical next step in the evolution of IoT devices, but given the lack of open-source frameworks there was very little innovation in this direction. Google’s announcement opened a lot of doors for embedded programmers to try building AI applications on the edge.


Currently, there is also the edge-ml platform, an open-source, browser-based toolchain for machine learning on microcontrollers.
![](https://edge-ml.org/images/process.svg)
It supports ml-flow, and in a few simple steps edge-ml lets you record data, label samples, train models, and deploy validated embedded machine learning directly on the edge.

# Play around
I had a few ESP32-CAM modules lying around, and there was a guy on LinkedIn deploying different models on an MCU, so I thought: why not train and deploy a Fashion MNIST model to recognize fashion apparel directly from the onboard camera feed? The outcome beat my expectations: the application was able to recognize the images with reasonable accuracy.



# **Building a TFLite model for the Fashion MNIST dataset**
This notebook uses the Fashion MNIST dataset, which contains 70,000 grayscale images in 10 categories. Each image is 28 x 28 pixels.

# Setup


```python
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import pathlib
```


# Downloading dataset

```python
import tensorflow_datasets as tfds
tfds.disable_progress_bar()

# Split the 'train' set 80/10/10 into training, validation and test examples
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True,
                         split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'])

(train_examples, validation_examples, test_examples) = splits

num_examples = info.splits['train'].num_examples
num_classes = info.features['label'].num_classes
```

Downloading and preparing dataset 29.45 MiB (download: 29.45 MiB, generated: 36.42 MiB, total: 65.87 MiB) to ~/tensorflow_datasets/fashion_mnist/3.0.1...
Dataset fashion_mnist downloaded and prepared to ~/tensorflow_datasets/fashion_mnist/3.0.1. Subsequent calls will reuse this data.


```python
class_names = ['T-shirt_top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Store the labels in a text file to be downloaded later
with open('labels.txt', 'w') as f:
  f.write('\n'.join(class_names))

# This will be our input size
IMG_SIZE = 28
```

# Preprocessing

```python
def format_example(image, label):
  # Cast to float32 (the dtype tf.image.resize returns anyway), then scale to [0, 1]
  image = tf.cast(image, tf.float32)
  image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
  image = image / 255.0
  return image, label

BATCH_SIZE = 28
```

### Create a Dataset from images and labels



```python
train_batches = train_examples.cache().shuffle(num_examples//4).batch(BATCH_SIZE).map(format_example).prefetch(1)
validation_batches = validation_examples.cache().batch(BATCH_SIZE).map(format_example).prefetch(1)
test_batches = test_examples.cache().batch(1).map(format_example)
```

# Building and Training the model

```python
model = tf.keras.Sequential([
  tf.keras.layers.Conv2D(6, 3, activation='relu', input_shape=(IMG_SIZE, IMG_SIZE, 1)),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(6, 3, activation='relu'),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(10, activation='softmax')
])

# The last layer already applies softmax, so the loss receives
# probabilities rather than logits: from_logits must be False.
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    optimizer='adam',
    metrics=['accuracy'])
```
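To get a feel for how small this network is, the layer shapes and parameter counts can be worked out by hand. The sketch below does that arithmetic in plain Python, assuming the Keras defaults used above (no padding, stride 1 for `Conv2D`, pool size 2 for `MaxPooling2D`). The helper names are made up for illustration and are not part of any library.

```python
def conv2d_valid(h, w, c_in, filters, k):
    """Output shape and parameter count of an unpadded, stride-1 Conv2D."""
    params = filters * (k * k * c_in) + filters  # kernel weights + one bias per filter
    return (h - k + 1, w - k + 1, filters), params


def max_pool(h, w, c, pool=2):
    """Output shape of a MaxPooling2D whose stride equals its pool size."""
    return (h // pool, w // pool, c)


def dense(n_in, units):
    """Parameter count of a fully connected layer (weights + biases)."""
    return n_in * units + units


def summarize(img_size=28, channels=1, num_classes=10):
    """Walk through the five layers of the model defined above."""
    shape1, p1 = conv2d_valid(img_size, img_size, channels, filters=6, k=3)
    shape2 = max_pool(*shape1)
    shape3, p2 = conv2d_valid(*shape2, filters=6, k=3)
    flat = shape3[0] * shape3[1] * shape3[2]
    p3 = dense(flat, num_classes)
    return {
        "shapes": [shape1, shape2, shape3, (flat,), (num_classes,)],
        "params": [p1, p2, p3],
        "total_params": p1 + p2 + p3,
    }


if __name__ == "__main__":
    info = summarize()
    print(info["shapes"])
    print(info["params"], info["total_params"])
```

Under these assumptions the model has 7,660 parameters, which is roughly 30 KB as float32 weights and about a quarter of that if the weights are stored as 8-bit integers.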

```python
model.fit(train_batches,
          epochs=40,
          validation_data=validation_batches)
```


# Exporting to TFLite

```python
export_dir = 'saved_model/1'
tf.saved_model.save(model, export_dir)
```



```python
# Representative samples the converter uses to calibrate quantization ranges
def representative_data_gen():
  for input_value, _ in test_batches.take(100):
    yield [input_value]

# Convert the model
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = tf.lite.RepresentativeDataset(representative_data_gen)
tflite_model = converter.convert()
```
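Setting `Optimize.DEFAULT` together with a representative dataset asks the converter for post-training quantization: the sample inputs let it estimate the range of values each tensor takes, so that floats can be mapped to 8-bit integers. The sketch below illustrates the affine mapping commonly used for this, `real ≈ scale * (q - zero_point)`, in plain Python. It shows the idea only and is not the converter's implementation; all function names are hypothetical.

```python
def quant_params(min_val, max_val, qmin=-128, qmax=127):
    """Derive (scale, zero_point) for an observed value range."""
    # Widen the range so that 0.0 is always exactly representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    if max_val == min_val:
        return 1.0, 0
    scale = (max_val - min_val) / (qmax - qmin)
    zero_point = int(round(qmin - min_val / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an int8 value, clamping to the representable range."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to an approximate float."""
    return scale * (q - zero_point)


if __name__ == "__main__":
    # Pixel values normalized to [0, 1], as produced by format_example
    scale, zero_point = quant_params(0.0, 1.0)
    for x in (0.0, 0.25, 1.0):
        q = quantize(x, scale, zero_point)
        print(x, q, dequantize(q, scale, zero_point))
```

The round trip loses at most about half a quantization step per value, which is the price paid for storing each number in one byte instead of four.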

```python
tflite_model_file = 'model.tflite'

with open(tflite_model_file, "wb") as f:
  f.write(tflite_model)
```

# Download the model

```python
# Only works inside Google Colab; skipped silently elsewhere
try:
  from google.colab import files

  files.download(tflite_model_file)
  files.download('labels.txt')
except ImportError:
  pass
```


# Deployment to Edge medium
The limitations of the edge device can make the deployment stage challenging. Typical limitations are:
- no filesystem support
- scarce resources, e.g. little memory and little processing power

Because there is no filesystem to load the model from and RAM is scarce, we export the model data to a C source file.

```
xxd -i model.tflite > model_data.cc
```
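If `xxd` is not available (for example on Windows), the same kind of file can be produced with a few lines of Python. The sketch below approximates the layout of `xxd -i` output: a byte array plus a length variable, both named after the input file. The function name `to_c_array` is made up for illustration.

```python
import re


def to_c_array(data: bytes, filename: str, per_line: int = 12) -> str:
    """Render bytes as a C array definition in the style of `xxd -i`."""
    # Derive the identifier from the file name, replacing characters
    # that are not valid in a C identifier with '_'.
    name = re.sub(r"[^0-9A-Za-z_]", "_", filename)
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk))
    body = ",\n".join(lines)
    return (f"unsigned char {name}[] = {{\n{body}\n}};\n"
            f"unsigned int {name}_len = {len(data)};\n")


if __name__ == "__main__":
    # Assumes model.tflite sits in the current directory.
    with open("model.tflite", "rb") as src:
        text = to_c_array(src.read(), "model.tflite")
    with open("model_data.cc", "w") as dst:
        dst.write(text)
```

Note that the identifier follows the input file name, so `model.tflite` yields `model_tflite` and `model_tflite_len`. If the firmware expects different names, the generated ones have to be renamed accordingly.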

To deploy this model we use the **tfmicro** library, a TensorFlow Lite interpreter developed by the TFLite team, which interprets our model and gives us predictions. The two required components are added under the project’s “components” directory.

The hardware I used for the demo is AI-Thinker’s ESP32-CAM module.

The next step is to place the “model_data.cc” file generated above in the “main/tf_model/” folder. Make sure that the variable names of the model array and the array length in “include/model_data.h” are the same as in the “model_data.cc” file. Next, check the “/include/model_settings.h” file to make sure that settings such as the input size and the number of categories match the model being deployed. If you are using a different model, you need to modify the settings to match it.
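A mismatch between the names in the header and in the generated source only shows up as a linker error, so it can be worth checking before building. The sketch below compares the array and length identifiers found in the two files using regular expressions. It assumes declarations of the form `unsigned char NAME[]` and `unsigned int NAME_len`, as `xxd -i` produces; the function names are hypothetical and the patterns would need adjusting for other declaration styles.

```python
import re

# Patterns for declarations such as:
#   extern const unsigned char model_data_tflite[];
#   unsigned int model_data_tflite_len = 1234;
ARRAY_RE = re.compile(r"unsigned\s+char\s+(\w+)\s*\[")
LEN_RE = re.compile(r"unsigned\s+int\s+(\w+)\s*[;=]")


def declared_names(text):
    """Return (array_name, length_name) found in C source text, or None where missing."""
    arr = ARRAY_RE.search(text)
    length = LEN_RE.search(text)
    return (arr.group(1) if arr else None,
            length.group(1) if length else None)


def names_match(header_text, source_text):
    """True if both files declare the same array and length identifiers."""
    header = declared_names(header_text)
    source = declared_names(source_text)
    return None not in header and header == source


if __name__ == "__main__":
    # Paths as given in the text, relative to the project root (an assumption).
    with open("include/model_data.h") as h, open("main/tf_model/model_data.cc") as s:
        print("names match:", names_match(h.read(), s.read()))
```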

The setup process for the tfmicro library is simple.

First, we map the model data using the `GetModel` function, passing the name of the model data array as the argument.

```
model = tflite::GetModel(model_data_tflite);
```
Second, we pull in the operation resolver, which contains the operations needed to run the model. Here I used “AllOpsResolver”, which includes all operations; best practice is to include only the operations your model needs and thereby save some code space.
```
static tflite::ops::micro::AllOpsResolver micro_op_resolver;
```

Now we build the model interpreter:

```
// Build an interpreter to run the model with.
static tflite::MicroInterpreter static_interpreter(
model, micro_op_resolver, tensor_arena, kTensorArenaSize, error_reporter);
interpreter = &static_interpreter;

// Allocate memory from the tensor_arena for the model's tensors.
TfLiteStatus allocate_status = interpreter->AllocateTensors();
```

This completes the setup process; we are now ready to run inference on the input data. To do so, we first fill the interpreter’s input buffer with our input data and then call the interpreter’s “Invoke” function to run inference. The predictions are stored in the interpreter’s output buffer.

```
// Fetch the interpreter's input tensor, whose buffer we fill with our input data
TfLiteTensor* input = interpreter->input(0);

// Run the model on the input data
TfLiteStatus invoke_status = interpreter->Invoke();

// Fetch the output tensor, whose buffer holds the result of the inference
TfLiteTensor* output = interpreter->output(0);
```
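The output buffer holds one score per category, so the last step is to pick the highest score and map its index to a label. The sketch below shows that post-processing in plain Python for readability; on the device the same loop would be written in C++. The function name `top_prediction` is made up, and the labels are the ones written to `labels.txt` earlier.

```python
CLASS_NAMES = ['T-shirt_top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


def top_prediction(scores, labels=CLASS_NAMES):
    """Return (label, score) for the highest-scoring category."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per label")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], scores[best]


if __name__ == "__main__":
    # Made-up scores standing in for the contents of the output buffer
    example_scores = [0.01, 0.02, 0.05, 0.02, 0.03, 0.01, 0.04, 0.80, 0.01, 0.01]
    print(top_prediction(example_scores))
```

If the output tensor is quantized, the raw integer values can be compared directly, because dequantization with a positive scale preserves their order.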

![](https://miro.medium.com/max/720/1*do6x-6rJdK-uqaWgT8mj8A.gif)

# References
[Repo](https://github.com/akshayvernekar/)

[tflite](https://www.tensorflow.org/lite)

[edge-ml](https://edge-ml.org/)