<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/ds-kiel/TinyML-Labs/blob/WS24-25/Lab2.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/ds-kiel/TinyML-Labs/blob/WS24-25/Lab2.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
  <td>
    <a href="https://raw.githubusercontent.com/ds-kiel/TinyML-Labs/WS24-25/Lab2.ipynb" download><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
  </td>
</table>

---


Before starting, save your own copy of this notebook: click **Copy to Drive** in the top bar, or go to File --> Save a copy in Drive. Name it *'Group\<Your group number\>_Lab2.ipynb'*. <ins>This is the master notebook, so you cannot save your changes without copying it!</ins> After copying, make sure you are working in your own copy so that your work is saved.



---

---

THIS IS A PRELIMINARY VERSION OF THE LAB! THE FINAL VERSION WILL FOLLOW BEFORE THE WEEKEND!
---

---

# Lab 2: Quantization and On-Device Execution

In the first lab, you looked at the first part of the pipeline from data to executing models on low-power devices: you explored how to preprocess data and train neural networks with Edge Impulse. In this lab, we continue the pipeline. You will explore how to [convert](https://ai.google.dev/edge/litert/models/convert_tf) a model to a [LiteRT](https://ai.google.dev/edge/litert) model, how to [quantize](https://ai.google.dev/edge/litert/models/post_training_integer_quant) [a model](https://www.tensorflow.org/model_optimization/guide/quantization/post_training), how to use [quantization-aware training](https://www.tensorflow.org/model_optimization/guide/quantization/training), and finally how to deploy the model and run it on a microcontroller.

You will explore the full pipeline from data to device using TensorFlow. You will train a model, then convert, deploy, and execute it on a microcontroller, specifically the [Arduino Nano 33 BLE Sense](https://store.arduino.cc/products/arduino-tiny-machine-learning-kit).

## Environment

The instructions for this lab come as a [Jupyter Notebook](https://jupyter.org/). You can run it locally in your own Python environment, but we recommend using [Google Colab](https://colab.research.google.com) to spare your computer's hardware, get an instantly working Python environment, and collaborate easily. If you decide to use your local computer, take a look at Python virtual environments to avoid interfering with your usual Python environment.

Moreover, you need to obtain an API key from an Edge Impulse project. Register at [edgeimpulse.com](https://edgeimpulse.com/), log in and create a new project. Open the project, navigate to **Dashboard** and click on the **Keys** tab to view your API keys. Double-click on the API key to highlight it, right-click, and select **Copy**. Paste the key below in the cell starting with `ei.API_KEY`.

![Copy API key from Edge Impulse project](https://raw.githubusercontent.com/edgeimpulse/notebooks/main/.assets/images/python-sdk-copy-ei-api-key.png)

For this lab you will not use the project in the Edge Impulse Studio. We just need the API Key.

## What do you need to hand in?

This Jupyter Notebook is intended as a document that you use both for working on the lab as well as for answering the questions. For handing in your lab, please **upload this Jupyter notebook**. Make sure that all images you include and outputs you generate are visible in the version you hand in.

## Setup

In [None]:
# If you have not done so already, install the following dependencies
!python -m pip install tensorflow tensorflow-model-optimization scikit-learn edgeimpulse numpy matplotlib seaborn cbor2 pandas

### Imports

In [None]:
import numpy as np
import pandas as pd

import os
import cbor2

import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout
from keras.callbacks import EarlyStopping

# from tensorflow.lite import TFLiteConverter
import tensorflow_model_optimization as tfmot

from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import edgeimpulse as ei

import matplotlib.pyplot as plt
import seaborn as sns

### Helper Functions

In [None]:
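# Newer Matplotlib releases no longer ship the 'seaborn-darkgrid' style name,
# so the dark grid is set through seaborn instead
sns.set_theme(style='darkgrid')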

def plot_training_history(history, model_name):
    fig, (ax1, ax2) = plt.subplots(1, 2)
    fig.suptitle(f'Model {model_name}')
    fig.set_figwidth(15)

    ax1.plot(range(1, len(history.history['accuracy'])+1), history.history['accuracy'])
    ax1.plot(range(1, len(history.history['val_accuracy'])+1), history.history['val_accuracy'])
    ax1.set_title('Model accuracy')
    ax1.set(xlabel='epoch', ylabel='accuracy')
    ax1.legend(['training', 'validation'], loc='best')

    ax2.plot(range(1, len(history.history['loss'])+1), history.history['loss'])
    ax2.plot(range(1, len(history.history['val_loss'])+1), history.history['val_loss'])
    ax2.set_title('Model loss')
    ax2.set(xlabel='epoch', ylabel='loss')
    ax2.legend(['training', 'validation'], loc='best')
    plt.show()

### Edge Impulse API Key

Insert your Edge Impulse API Key as in Lab 1:

In [None]:
ei.API_KEY = "ei_dae2..." # Change this to your Edge Impulse API key

## Edge Impulse Dataset

### Prepare the data

---
**Task 1:** Navigate to the *Data acquisition* page of your Edge Impulse project from Lab 1 and export the data.

**Task 2:** Import the data with the code below.

---
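The loading code below cuts every recording into overlapping windows. As a minimal sketch of that idea, assuming a 100 Hz sampling rate (one row every 10 ms) and assuming that every complete window is kept, the following standalone example uses a hypothetical helper `sliding_windows` and random stand-in data to show how window size and stride determine the resulting array shape:

```python
import numpy as np

SAMPLE_INTERVAL_MS = 10  # assumption: 100 Hz sampling, i.e. one row every 10 ms


def sliding_windows(samples, window_size_ms, window_stride_ms):
    """Cut a (num_samples, num_axes) array into overlapping windows."""
    window_size = window_size_ms // SAMPLE_INTERVAL_MS
    window_stride = window_stride_ms // SAMPLE_INTERVAL_MS
    # The last valid start index is num_samples - window_size, hence the +1.
    starts = range(0, len(samples) - window_size + 1, window_stride)
    return np.array([samples[s:s + window_size] for s in starts])


# Hypothetical 10 s recording with three accelerometer axes (random stand-in data).
recording = np.random.default_rng(0).normal(size=(1000, 3))
windows = sliding_windows(recording, window_size_ms=2000, window_stride_ms=100)

# Shape is (number of windows, samples per window, axes).
print(windows.shape)
```

With these settings, a 1000-sample recording yields 81 windows of 200 samples each. A smaller stride produces more, and more strongly overlapping, training examples.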

In [None]:
labels = ['idle', 'circle', 'left-right', 'up-down'] # Change this to your labels
num_classes = len(labels)

data_path = '...' # Change this to the path of your downloaded folder

# Select the window size and stride you used in Edge Impulse
window_size_ms = 2000 
window_stride_ms = 100


# Function to create windows from the data
def create_windows(df, window_size_ms, window_stride_ms, label):
    # Dividing by 10 assumes one sample every 10 ms (100 Hz)
    window_size = int(window_size_ms / 10)
    window_stride = int(window_stride_ms / 10)
    windows = []
    windows_labels = []
    # +1 so that the last complete window is included
    for i in range(0, len(df) - window_size + 1, window_stride):
        windows.append(df.iloc[i:i+window_size].values)
        windows_labels.append(label)
    return np.array(windows), windows_labels

# Load the data from the files
def load_data(data_path, folder):
    data = np.zeros((1, int(window_size_ms / 10), 3))
    data_labels = []
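    # os.path.join works whether or not data_path ends with a path separator
    for file in os.listdir(os.path.join(data_path, folder)):
        if file.endswith('.cbor'):
            label = file.split('.')[0].strip()
            with open(os.path.join(data_path, folder, file), 'rb') as f_obj: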
                data_file = cbor2.load(f_obj)
                df = pd.DataFrame(data_file['payload']['values'], columns=[item['name'] for item in data_file['payload']['sensors']])
                df = df.drop(columns=['gyrX', 'gyrY', 'gyrZ', 'magX', 'magY', 'magZ'])

                window_data, window_labels = create_windows(df, window_size_ms, window_stride_ms, labels.index(label))
                data = np.concatenate((data, window_data), axis=0)
                data_labels += window_labels

    data = np.delete(data, 0, axis=0)
    return data, data_labels


x_train, y_train = load_data(data_path, 'training')
x_test, y_test = load_data(data_path, 'testing')

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

---
**Task 3 (optional):** Perform scaling on your data if you like to. *Please note: You have to do the same scaling later in your Arduino program.*

---
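One common choice for such scaling is per-axis standardization. The following is a minimal sketch on random stand-in arrays (the names `x_train_demo` and `x_test_demo` are hypothetical and only mimic the shape of the real data); it is one possible approach, not the required one. The statistics are computed on the training data only and reused for the test data, and the same constants would later have to be applied on the Arduino.

```python
import numpy as np

# Hypothetical stand-ins shaped like the lab data: (num_windows, window_length, 3).
rng = np.random.default_rng(42)
x_train_demo = rng.normal(loc=2.0, scale=5.0, size=(100, 200, 3))
x_test_demo = rng.normal(loc=2.0, scale=5.0, size=(20, 200, 3))

# Compute the statistics on the training data only, one value per axis.
mean = x_train_demo.mean(axis=(0, 1))
std = x_train_demo.std(axis=(0, 1))

x_train_scaled = (x_train_demo - mean) / std
x_test_scaled = (x_test_demo - mean) / std  # reuse the training statistics

# These constants would have to be hard-coded in the Arduino program.
print("mean per axis:", mean)
print("std per axis: ", std)
```

Min-max scaling follows the same pattern with the per-axis minimum and range in place of mean and standard deviation.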

In [None]:
# perform your scaling here

### Build the model

---
**Task 4:** Add your best model from Lab 1 that uses a raw data preprocessing block.

---

In [None]:
# Build model
def build_model(summary=True):
    model = Sequential()

    # ADD YOUR LAYERS HERE

    model.add(Dense(num_classes, activation='softmax'))

    # Compile the model
    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    if summary:
        model.summary()

    return model

In [None]:
model = build_model()

### Train the model

So far, you have explored manually how many epochs are necessary to train the model successfully. However, TensorFlow (through Keras) can automate this with a callback called [early stopping](https://keras.io/api/callbacks/early_stopping/). See also [here](https://machinelearningmastery.com/how-to-stop-training-deep-neural-networks-at-the-right-time-using-early-stopping/) and [here](https://towardsdatascience.com/a-practical-introduction-to-early-stopping-in-machine-learning-550ac88bc8fd).

---
**Task 7:** Use an early stopping callback in your fitting function to find the optimal number of epochs. Use reasonable configurations. How many epochs does it train for?

**Answer:** ...

---
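To build intuition for the `patience` and `min_delta` settings, the stopping rule can be sketched in plain Python. This is a simplified illustration for a monitored value that should decrease, not the actual Keras implementation; the function name `epochs_until_stop` and the loss values are hypothetical.

```python
def epochs_until_stop(val_losses, patience, min_delta=0.0):
    """Return the number of epochs run before a patience-based stop.

    Simplified rule: an epoch counts as an improvement only if it beats
    the best value seen so far by more than min_delta.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epoch  # training would stop after this epoch
    return len(val_losses)  # patience was never exhausted


# Hypothetical validation losses: improving until epoch 4, then flat.
demo_losses = [0.90, 0.70, 0.55, 0.50, 0.51, 0.50, 0.52, 0.50]
print(epochs_until_stop(demo_losses, patience=3))
```

In this example the best loss is reached in epoch 4, and three epochs without improvement end the run after epoch 7. A larger `patience` tolerates longer plateaus, while a larger `min_delta` ignores very small gains.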

In [None]:
early_stopping_cb = EarlyStopping(
    monitor=...,
    patience=...,
    min_delta=...,
    mode=...
)

num_epochs = 200
history = model.fit(x_train, y_train, batch_size=128, epochs=num_epochs, validation_split=0.1, callbacks=[early_stopping_cb])
plot_training_history(history, 1)

### Evaluate the Model



In [None]:
score_model = model.evaluate(x_test, y_test) #, verbose=0)
print("Test loss:", score_model[0])
print("Test accuracy:", score_model[1])

# Predictions come from the model itself; score_model only holds [loss, accuracy]
cm = confusion_matrix(np.argmax(y_test, axis=1), np.argmax(model.predict(x_test), axis=1))
# print(cm)

cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

cm = pd.DataFrame(cm, index = labels,
                  columns = labels)

plt.figure(figsize = (4,4))
ax = sns.heatmap(cm*100,
           annot=True,
           fmt='.1f',
           cmap="Blues",
           cbar=False,
              )
ax.set_ylabel("True Class", fontdict= {'fontweight':'bold'})
ax.set_xlabel("Predicted Class", fontdict= {'fontweight':'bold'})

plt.show()

---
**Task 8:** How does the accuracy of your model compare to the accuracy you achieved with Edge Impulse?

**Answer:** ...

---
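The evaluation cell above divides each row of the confusion matrix by its row total before plotting. A small sketch with made-up counts for three hypothetical classes shows what this normalization does: every row then sums to 100 %, and the diagonal holds the per-class recall.

```python
import numpy as np

# Made-up raw counts for three classes: rows = true class, columns = predicted class.
counts = np.array([
    [45,  5,  0],
    [ 2, 16,  2],
    [ 0, 10, 30],
])

# Divide each row by its total so that every row sums to 100 %.
row_totals = counts.sum(axis=1, keepdims=True)
percent = 100.0 * counts / row_totals

print(np.round(percent, 1))
# The diagonal is the share of each true class that was predicted correctly.
print(np.round(np.diag(percent), 1))
```

Row normalization makes classes with different numbers of test windows comparable, which raw counts do not.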

### On-device resource consumption

After training your model, we want to know whether it fits on a microcontroller or is too large. We will use the [Edge Impulse Python SDK](https://docs.edgeimpulse.com/docs/tools/edge-impulse-python-sdk) for profiling, so if you have not yet added your API key at the top of the notebook, now is the time.

To start, we need to find the right target device for profiling. You are looking for the *Arduino Nano 33 BLE*.

In [None]:
# List the available profile target devices
ei.model.list_profile_devices()

Next you can estimate the memory usage and inference time for each of your models.

In [None]:
# Estimate the RAM, ROM, and inference time for our model on the target hardware family

your_model = ...
your_device = ...

try:
    profile = ei.model.profile(model=your_model,
                               device=your_device)
    print(profile.summary())
except Exception as e:
    print(f"Could not profile: {e}")

---
**Task 9:** Estimate the memory usage and inference time for your models. **Compare your model's performance to your Edge Impulse models regarding ROM and RAM usage and their inference time**. Please do **<ins>not</ins>** use a table for this, but plot it, for example with [Matplotlib](https://matplotlib.org/stable/gallery/lines_bars_and_markers/bar_colors.html) or [Seaborn](https://seaborn.pydata.org/examples/grouped_barplot.html). Bar plots are a good option: on the x-axis you can list the models, and on the y-axis you can show the respective memory usage for ROM and RAM.

**Task 10:** Briefly explain your plot(s) from Task 9.

**Answer:** ...

---

### Save Model

To come back to a model later and continue working on it, it is useful to save it. The `model.save()` [function](https://www.tensorflow.org/guide/keras/serialization_and_saving) writes the model to disk. With the `.keras` file extension used below, the model is stored in the native Keras format.

If you use Google Colab, you can find the saved model as a `.keras`-file on the left under `Files/`.

In [None]:
export_path = 'saved_model.keras'
model.save(export_path)

### Model Quantization

Your microcontroller cannot use the TensorFlow model directly. Instead, [LiteRT](https://ai.google.dev/edge/litert) (formerly TensorFlow Lite) is used to deploy models on mobile and edge devices.

---
**Task 11:** Load your model and convert it with LiteRT and save the model to a `.tflite`-file. (HINT: Check out [this](https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/hello_world) *Hello World* example and [these instructions](https://ai.google.dev/edge/litert/models/convert_tf).)

**Task 12:** Create a second LiteRT conversion that uses [optimizations](https://ai.google.dev/edge/api/tflite/python/tf/lite/Optimize) and enforce integer-only weights.
(Maybe a helpful [resource](https://ai.google.dev/edge/litert/models/post_training_quantization).)

**Task 13:** Create a third LiteRT conversion that, in addition to the conversion in *Task 12*, enforces integer-only quantization. (*Hint: Use a [representative dataset](https://ai.google.dev/edge/api/tflite/python/tf/lite/RepresentativeDataset).*)

**Task 14:** Evaluate all three converted models and compare them to the TensorFlow model they are based on regarding profiled memory usage and accuracy. Use plots.

**Task 15:** Explain your findings from Task 14. Why is there such a difference in performance and in memory usage?

**Answer:** ...

---
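As background for these tasks, 8-bit quantization in LiteRT approximates a real value as `scale * (int8_value - zero_point)`. The sketch below applies this mapping to a few made-up weights with NumPy only. It is a simplified illustration, assuming one scale and zero point for the whole tensor. The helper names `quantize_int8` and `dequantize_int8` are hypothetical, and the converter's actual per-layer choices may differ.

```python
import numpy as np


def quantize_int8(x, scale, zero_point):
    """Map float values to int8 codes using an affine scale and zero point."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)


def dequantize_int8(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return scale * (q.astype(np.float32) - zero_point)


# Made-up float32 weights covering the range [-0.5, 1.5].
weights = np.linspace(-0.5, 1.5, 9, dtype=np.float32)

# Choose scale and zero point so that [min, max] maps onto [-128, 127].
scale = float(weights.max() - weights.min()) / 255.0
zero_point = int(round(-128 - float(weights.min()) / scale))

q = quantize_int8(weights, scale, zero_point)
restored = dequantize_int8(q, scale, zero_point)

print("bytes as float32:", weights.nbytes)
print("bytes as int8:   ", q.nbytes)
print("max abs error:   ", np.abs(weights - restored).max())
```

The int8 codes need a quarter of the memory of the float32 values, at the cost of a rounding error of at most about half a quantization step per value. The representative dataset in Task 13 plays the role of `weights.min()` and `weights.max()` here: it lets the converter estimate the value ranges of the activations.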

In [None]:
# ADD YOUR MODEL CONVERSIONS HERE

In [None]:
# Save the converted model

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)


### Quantization Aware Training

### Model Export - Library Creation

### Model conversion and library creation with Edge Impulse