<h1 style="text-align: center;">Emotion Recognition Inference with Arduino Nano 33 BLE Sense</h1>

## Introduction
In this notebook, we deploy the speech emotion recognition (SER) model that was developed and converted into a TinyML model in [this notebook](https://github.com/Hannibal0420/Speech-Emotion-Recognition-TinyML/blob/main/01_TFLite_Model_Preparation.ipynb), together with an Arduino Nano 33 BLE Sense, a compact, energy-efficient microcontroller board well suited to real-time applications. The goal is to show how to process audio streamed from the board's microphone and run inference with the TensorFlow Lite (TFLite) model, which classifies four emotions: Happy, Surprised, Neutral, and Unpleasant.

We begin by setting up a serial connection with the Arduino and calibrating the microphone for reliable data collection. Next, we define asynchronous functions that read the streamed audio, extract features, run the model, and send each result back to the board. Finally, we time the whole loop (serial reads, feature extraction, and inference) to gauge how the pipeline performs end to end.

By the end of this notebook, you will have a better understanding of how to implement an SER model with a resource-constrained device, opening the door to applications in the fields of affective computing and IoT.

## Setting Up
1. Connect your Arduino Nano 33 BLE Sense to your computer using a USB cable.
2. Open the Arduino IDE on your computer and select the correct board and port under the "Tools" menu.
3. Upload the [Data Streaming Code](https://github.com/Hannibal0420/Speech-Emotion-Recognition-TinyML/blob/main/data_streaming.ino) provided in the project repository to your Arduino board.
4. After uploading, open the Serial Monitor in the Arduino IDE and make sure the baud rate is set to 38400.
5. Press the reset button on the Arduino board to initialize it with the greeting message "Welcome to the data streaming...". If you see this message, congratulations! You can move on to the next step.

> **Note:** <br>Sometimes the board may go to sleep. If you experience any issues, press the reset button on the Arduino board again before running the code below.
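
The greeting check in step 5 could also be scripted rather than read off the Serial Monitor. The following is a minimal sketch, not part of the project code: `wait_for_greeting` is a hypothetical helper that accepts any object with a bytes-returning `readline()`, such as an open `serial.Serial` port, and an in-memory stream stands in for the board so the example runs without hardware. When used against the real port, the Serial Monitor would need to be closed first so the two programs do not compete for it.

```python
import io

GREETING = "Welcome to the data streaming"  # prefix of the message quoted in step 5


def wait_for_greeting(stream, greeting=GREETING, max_lines=20):
    """Return True if one of the next `max_lines` lines contains `greeting`.

    `stream` is any object with a bytes-returning readline(), for example an
    open serial.Serial port. This helper is hypothetical.
    """
    for _ in range(max_lines):
        line = stream.readline().decode("utf-8", errors="ignore").strip()
        if greeting in line:
            return True
    return False


# Stand-in for the board so the sketch runs without hardware
fake_board = io.BytesIO(b"\r\nWelcome to the data streaming...\r\n")
print(wait_for_greeting(fake_board))
```

If the function returns `False` on real hardware, pressing the reset button and trying again, as the note above suggests, is a reasonable first step.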

## Importing Libraries
Import the libraries needed for audio processing, serial communication with the Arduino, and real-time TinyML inference, then list the available serial ports:

In [1]:
import asyncio
import time
from serial.tools import list_ports

import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np
import serial
from IPython.display import Audio, display, HTML

import tensorflow as tf
from tensorflow import keras


ports = list_ports.comports()
for port in ports:
    print(port)

COM9 - USB-SERIAL CH340 (COM9)
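
The port name printed above is hardcoded in the next section, so it has to be edited by hand on a different machine. As a hedged alternative, the port could be chosen by matching its description. The sketch below assumes only that each entry exposes `.device` and `.description`, as the objects returned by `list_ports.comports()` do. `pick_port` is a hypothetical helper, and the first example entry is made up for illustration.

```python
from types import SimpleNamespace


def pick_port(ports, keyword):
    """Return the device name of the first port whose description contains `keyword`."""
    for port in ports:
        if keyword.lower() in (port.description or "").lower():
            return port.device
    return None


# Stand-in entries shaped like the objects returned by list_ports.comports()
example_ports = [
    SimpleNamespace(device="COM3", description="Some other serial device (COM3)"),
    SimpleNamespace(device="COM9", description="USB-SERIAL CH340 (COM9)"),
]
print(pick_port(example_ports, "CH340"))
```

With real hardware, `pick_port(list_ports.comports(), "CH340")` would be the equivalent call, and the keyword depends on the USB-serial adapter in use.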


## Initialization
In this section, we establish the serial connection with the Arduino and calibrate the microphone so that incoming samples can be normalized to the range (-1, 1) before feature extraction.

In [2]:
SERIAL_PORT = 'COM9'
BAUD_RATE = 38400

def arduino_activate():
    try:
        arduino = serial.Serial(SERIAL_PORT, BAUD_RATE, timeout=1)
        time.sleep(2)  # Allow time for connection to establish
    except serial.SerialException as e:
        print(f"Failed to connect to serial port {SERIAL_PORT}: \n{e}")
        return None

    command = input("Type 'x' to Activate Arduino: ")
    if command.lower() == 'x':
        arduino.write(command.encode('utf-8'))
        print("Success Activated!")
        print("Open the Serial Monitor to check if it is working, then close it.")
    return arduino


arduino = arduino_activate()

Type 'x' to Activate Arduino: x
Success Activated!
Open the Serial Monitor to check if it is working, then close it.


In [4]:
def arduino_read(buffer, buffer_size, overlapping, norm=(None, None), max_attempts=10):
    # Move the last `overlapping` samples to the front; the rest is overwritten below
    buffer = np.roll(buffer, overlapping)
    num_data = buffer_size - overlapping
    for i in range(num_data):
        decoded_data = ''
        attempts = 0
        # Skip empty lines (read timeouts) until a value arrives or we give up
        while decoded_data == '' and attempts < max_attempts:
            arduino_data = arduino.readline()
            decoded_data = arduino_data.decode("utf-8", errors="ignore").strip()
            attempts += 1
        if decoded_data == '':
            print('Failed to retrieve data...')
            break  # Stop early; remaining positions keep their current values
        try:
            value = float(decoded_data)
        except ValueError:
            print("Open the Serial Monitor to check if it is working. If it's not, press the reset button and rerun the 'arduino_activate' function.")
            continue
        if norm[0] is not None:
            value = normalize(value, 1, -1, norm[0], norm[1])
        buffer[i + overlapping] = value
    return buffer

def normalize(array, new_max, new_min, old_max, old_min):
    # Linearly map values from [old_min, old_max] to [new_min, new_max]
    array = np.asarray(array, dtype=np.float64)
    array = (((array - old_min) * (new_max - new_min)) / (old_max - old_min)) + new_min
    return array.astype(np.float16)


# Calibrate the microphone
print("Calibrating: Please Speak to the Microphone")
time.sleep(1)
tuning_data = arduino_read(np.zeros(48000), 48000, 0)
TUNING_MAX, TUNING_MIN = tuning_data.max(), tuning_data.min()
print(f'Normalized signal from the range ({TUNING_MIN}, {TUNING_MAX}) to (-1, 1)')

Calibrating: Please Speak to the Microphone
Normalized signal from the range (-73.0, 80.0) to (-1, 1)
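
To make the calibration concrete, here is a small worked example of the same linear rescaling, using the range reported in the output above. It is a plain-Python sketch rather than the notebook's `normalize` function, so it skips the `float16` cast, and `rescale` is a name introduced only for illustration.

```python
def rescale(value, new_max, new_min, old_max, old_min):
    """Linearly map `value` from [old_min, old_max] to [new_min, new_max]."""
    return (value - old_min) * (new_max - new_min) / (old_max - old_min) + new_min


# Calibration range taken from the output above
OLD_MIN, OLD_MAX = -73.0, 80.0

for raw in (-73, 0, 80):
    print(raw, "->", round(rescale(raw, 1, -1, OLD_MAX, OLD_MIN), 4))
```

The endpoints map to -1 and 1. Because the calibrated range is not symmetric around zero, a raw sample of 0 lands slightly below 0 after rescaling.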


## Running Inference
In this section, we set up asynchronous functions for MFCC feature extraction. Then, we initialize the TensorFlow Lite interpreter, run emotion recognition inference, and send the results back to the Arduino.
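
Before running the cell, it can help to know what tensor the model receives. The sketch below works out the expected feature shape from the constants used in the next cell. It relies on librosa's default centered framing, where the number of frames is `1 + n_samples // hop_length`. The variable names `num_frames` and `feature_shape` are introduced here for illustration.

```python
BUFFER_SIZE = 24000
HOP_LENGTH = 512
NUM_MFCC = 13
SAMPLE_RATE = 16000

# With centered framing (librosa's default), the frame count is 1 + n_samples // hop_length
num_frames = 1 + BUFFER_SIZE // HOP_LENGTH
feature_shape = (1, num_frames, NUM_MFCC)  # (batch, time, coefficients)

print(feature_shape)
print(BUFFER_SIZE / SAMPLE_RATE, "seconds of audio per buffer")
```

This matches the model input shape `(None, None, 13)` noted in the code: one batch entry, a variable number of time steps, and 13 coefficients per step.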

In [5]:
# Signal processing
BUFFER_SIZE = 24000
OVERLAPPED = 512

NUM_MFCC = 13
N_FFT = 2048
HOP_LENGTH = 512
SAMPLE_RATE = 16000
###########################

EMOTIONS = ['neutral', 'happy', 'surprise', 'unpleasant']
COMMANDS = ['a', 'b', 'c', 'd']

# Manually calibrated per-class weights (same order as EMOTIONS) applied to the model output
RECOG_MASK = np.array([1, 30000, 40000, 500])


async def tflite_process_data():
    data = np.zeros(BUFFER_SIZE)
    data = arduino_read(data, BUFFER_SIZE, OVERLAPPED, norm=(TUNING_MAX, TUNING_MIN))
    mfcc = librosa.feature.mfcc(y=data, sr=SAMPLE_RATE, n_mfcc=NUM_MFCC, n_fft=N_FFT, hop_length=HOP_LENGTH)
    features = np.array([mfcc.T], dtype=np.float32)

    # Model Input Shape = (None, None, 13)
    interpreter.set_tensor(interpreter.get_input_details()[0]['index'], features)
    interpreter.invoke()
    prediction = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])[0]
    result = np.multiply(prediction, RECOG_MASK)
    emotion = EMOTIONS[np.argmax(result)]
    command = COMMANDS[np.argmax(result)]
    arduino.write(command.encode('utf-8'))
    print(result)
    print(f'Emotion: {emotion}\n')
    
async def tflite_run(rounds=10):
    tasks = []
    start_time = time.perf_counter()
    for _ in range(rounds):
        task = asyncio.create_task(tflite_process_data())
        tasks.append(task)
        await asyncio.sleep(0.6)  # Yield to the event loop instead of blocking it
    await asyncio.gather(*tasks)
    display(HTML("<hr>"))
    # The elapsed time covers serial reads, MFCC extraction, and inference
    print(f"Inference time for {rounds} rounds: {time.perf_counter() - start_time} seconds")


interpreter = tf.lite.Interpreter(model_path="output/Models/SER_quant.tflite")
interpreter.allocate_tensors()
await tflite_run(rounds=10)

[0.9981817  0.25002718 0.76402117 0.89542108]
Emotion: neutral

[2.22413272e-01 4.73774783e+02 8.84739757e+02 3.69837880e+02]
Emotion: surprise

[ 0.96156669  2.46809701  8.57897685 19.06829886]
Emotion: unpleasant

[ 0.92424935 13.32177955 67.30651949 36.81198135]
Emotion: surprise

[0.99739873 0.16329179 1.48442705 1.27935165]
Emotion: surprise

[0.99600047 0.53795125 1.51739878 1.97183457]
Emotion: unpleasant

[0.99796921 0.63563612 2.21974042 0.97709987]
Emotion: surprise

[ 0.93858731  4.07682106 11.98343234 30.48859723]
Emotion: unpleasant

[0.99763453 0.52881313 1.72818225 1.15234626]
Emotion: surprise

[ 0.97198164  1.74599561  7.44089019 13.88710365]
Emotion: unpleasant
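
The printed vectors are the model scores after multiplication by `RECOG_MASK`, so the weights can change which class wins. As a worked example, the sketch below divides the second logged result by the mask to recover the raw scores and compares the two winners. It uses plain Python lists, and `argmax` is a small helper defined here for illustration.

```python
EMOTIONS = ['neutral', 'happy', 'surprise', 'unpleasant']
RECOG_MASK = [1, 30000, 40000, 500]

# Second weighted result from the output above
weighted = [2.22413272e-01, 4.73774783e+02, 8.84739757e+02, 3.69837880e+02]

# Undo the mask to recover the raw model scores
raw = [w / m for w, m in zip(weighted, RECOG_MASK)]


def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=lambda i: values[i])


print([round(r, 4) for r in raw], "sum =", round(sum(raw), 4))
print("raw winner:     ", EMOTIONS[argmax(raw)])
print("weighted winner:", EMOTIONS[argmax(weighted)])
```

The recovered scores sum to roughly 1, which is consistent with a softmax output, though the source does not state this. For this sample the unweighted winner is `unpleasant`, while the weighted winner is `surprise`, which shows how strongly the mask values shape the reported emotion.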



Inference time for 10 rounds: 176.56919121742249 seconds
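
The reported time covers the whole loop, not the model call alone. A rough back-of-the-envelope calculation from the numbers above gives a feel for where the time goes. The sketch assumes each round reads `BUFFER_SIZE - OVERLAPPED` samples and subtracts the 0.6-second pause per round. The remaining time still includes MFCC extraction and inference, so the resulting rate is only a lower bound on the serial read rate.

```python
BUFFER_SIZE = 24000
OVERLAPPED = 512
SAMPLE_RATE = 16000
ROUNDS = 10
PAUSE_PER_ROUND = 0.6               # seconds slept in tflite_run per round
TOTAL_SECONDS = 176.56919121742249  # from the output above

samples_per_round = BUFFER_SIZE - OVERLAPPED
seconds_per_round = TOTAL_SECONDS / ROUNDS
busy_seconds_per_round = seconds_per_round - PAUSE_PER_ROUND

# Lower bound on the read rate: busy time also includes MFCC extraction and inference
min_effective_rate = samples_per_round / busy_seconds_per_round

print(f"{seconds_per_round:.2f} s per round")
print(f"at least {min_effective_rate:.0f} samples/s received, versus SAMPLE_RATE = {SAMPLE_RATE}")
```

If feature extraction and inference are fast by comparison (an assumption, not something measured here), the samples arrive at a rate well below the 16 kHz that the MFCC step assumes, which would make serial transfer the main cost of each round.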
