## Setup Python Environment 

To set up the Python environment and install the dependencies the notebook needs, run the following cell:

In [None]:
# Remove all files and directories in the current working directory
# (this also deletes any data files uploaded earlier)
!rm -r *

# Install the 'xxd' package
!apt-get -qq install xxd

# Install required Python packages for data manipulation, numerical computations, and data visualization
!pip install pandas numpy matplotlib

# Install TensorFlow
!pip install tensorflow==2.12.0


## Upload Data

1. Click the "Files" tab in Colab's left-side menu.
2. Drag and drop your `.csv` files from your computer onto the "Files" tab. <br>
Alternatively, click the "Upload" button in the "Files" tab and browse for the files you want to upload.

The uploaded CSV files are then available in Colab for the steps below, which read them from `/content/`.

## Parse and prepare the data

Before running the next cell, update the `GESTURES` list so that it names the gestures you collected in `.csv` format.

1.  Replace the values in the `GESTURES` list with the names of the gestures you have collected. For example:

`GESTURES = [
    "gesture_1",
    "gesture_2",
    "gesture_3"
]`

2.  Make sure each name in the `GESTURES` list matches the filename of the corresponding CSV file, without the `.csv` extension. The next cell reads each file from `/content/<name>.csv`.

Once the `GESTURES` list is updated, the code will process and transform the CSV file of each listed gesture for training.
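
Before running the parsing cell, it can help to confirm that every name in `GESTURES` has a matching file. The sketch below is one minimal way to check, assuming the files sit in `/content` as in the parsing code. The helper name `missing_gesture_files` is hypothetical and not part of the notebook.

```python
from pathlib import Path


def missing_gesture_files(gestures, data_dir="/content"):
    """Return the expected CSV paths that are not present in data_dir.

    Hypothetical helper: the notebook itself does not define it.
    """
    expected = [Path(data_dir) / f"{name}.csv" for name in gestures]
    return [str(path) for path in expected if not path.is_file()]


# Example usage with the placeholder names from the list above
missing = missing_gesture_files(["gesture_1", "gesture_2", "gesture_3"])
if missing:
    print("Missing CSV files:", ", ".join(missing))
else:
    print("All gesture files found.")
```

An empty result means `pd.read_csv` in the next cell should find every file it asks for.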

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

# Set a random seed for reproducibility
SEED = 1111
np.random.seed(SEED)
tf.random.set_seed(SEED)

# Define the list of gestures
GESTURES = [
    "c_up_push",
    "other"
]

# Define the number of samples per gesture
SAMPLES_PER_GESTURE = 100

# Get the number of gestures
NUM_GESTURES = len(GESTURES)

# Create one-hot encoded representations of the gestures
ONE_HOT_ENCODED_GESTURES = np.eye(NUM_GESTURES)

# Initialize lists to store inputs and outputs
inputs = []
outputs = []

# Iterate over each gesture
for gesture_index in range(NUM_GESTURES):
    gesture = GESTURES[gesture_index]
    print(f"Processing index {gesture_index} for gesture '{gesture}'.")

    # Get the one-hot encoded output for the current gesture
    output = ONE_HOT_ENCODED_GESTURES[gesture_index]

    # Read the data from the CSV file for the current gesture
    df = pd.read_csv("/content/" + gesture + ".csv")

    # Calculate the number of complete recordings for the current gesture
    # (leftover rows that do not fill a whole recording are ignored)
    num_recordings = df.shape[0] // SAMPLES_PER_GESTURE

    print(f"\tThere are {num_recordings} recordings of the {gesture} gesture.")

    # Iterate over each recording
    for i in range(num_recordings):
        tensor = []

        # Iterate over each sample in the recording
        for j in range(SAMPLES_PER_GESTURE):
            index = i * SAMPLES_PER_GESTURE + j

            # Scale each reading to the range 0 to 1, assuming accelerometer
            # values in [-4, 4] and gyroscope values in [-2000, 2000]
            tensor += [
                (df['accelerometerX'][index] + 4) / 8,
                (df['accelerometerY'][index] + 4) / 8,
                (df['accelerometerZ'][index] + 4) / 8,
                (df['gyroscopeX'][index] + 2000) / 4000,
                (df['gyroscopeY'][index] + 2000) / 4000,
                (df['gyroscopeZ'][index] + 2000) / 4000
            ]

        # Append the tensor as input and the output to the respective lists
        inputs.append(tensor)
        outputs.append(output)

# Convert the input and output lists to numpy arrays
inputs = np.array(inputs)
outputs = np.array(outputs)

print("Data set parsing and preparation complete.")
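
The scaling in the cell above maps each raw reading from a symmetric range onto 0 to 1. The following is a minimal sketch of that arithmetic. The function name `scale_to_unit` is illustrative only, and the limits (4 for the accelerometer, 2000 for the gyroscope) are taken from the constants in the parsing code.

```python
def scale_to_unit(value, limit):
    """Map a reading in [-limit, limit] onto [0, 1]: (value + limit) / (2 * limit)."""
    return (value + limit) / (2 * limit)


# Accelerometer readings use limit=4, matching (x + 4) / 8 in the parsing cell
print(scale_to_unit(-4, 4))  # 0.0
print(scale_to_unit(0, 4))   # 0.5
print(scale_to_unit(4, 4))   # 1.0

# Gyroscope readings use limit=2000, matching (x + 2000) / 4000
print(scale_to_unit(1000, 2000))  # 0.75

# Each recording is flattened to SAMPLES_PER_GESTURE * 6 values
SAMPLES_PER_GESTURE = 100
CHANNELS = 6  # accelerometer X/Y/Z and gyroscope X/Y/Z
print(SAMPLES_PER_GESTURE * CHANNELS)  # 600
```

Readings outside the assumed range are not clipped, so they would land outside 0 to 1.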

## Randomize and split the input and output pairs for training

To train the model effectively, the input and output pairs are shuffled and split into three sets for training, validation, and testing:

- Training Set: 60% of the input and output pairs. It is used to fit the model, whose weights are updated based on these pairs.

- Validation Set: 20% of the input and output pairs. It is used to measure the model's performance during training, which shows how well the model generalizes to data it was not trained on.

- Testing Set: also 20% of the input and output pairs. It is used after training is complete to evaluate the model's predictions on new, unseen data.

Keeping the three sets separate lets you monitor the model during training and evaluate it afterwards on data it has never seen.
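
To make the 60/20/20 arithmetic concrete, here is a small sketch on a toy list of 10 items. The helper `split_points` is a hypothetical name; it mirrors how the two cut indices are computed in the cell that follows, and plain list slicing stands in for `np.split`.

```python
def split_points(num_inputs, train_fraction=0.6, test_fraction=0.2):
    """Return the two indices at which the shuffled data is cut."""
    train_split = int(train_fraction * num_inputs)
    test_split = int(test_fraction * num_inputs + train_split)
    return train_split, test_split


# Toy data set of 10 items standing in for the shuffled recordings
items = list(range(10))
train_split, test_split = split_points(len(items))

train = items[:train_split]           # first 60%
test = items[train_split:test_split]  # next 20%
validate = items[test_split:]         # whatever remains

print(len(train), len(test), len(validate))  # 6 2 2
```

Because the indices are truncated to integers, the last set absorbs any rounding remainder when the number of recordings does not divide evenly.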

In [None]:
# Get the total number of inputs
num_inputs = len(inputs)

# Create an array of indices from 0 to num_inputs - 1
randomize = np.arange(num_inputs)

# Shuffle the indices randomly
np.random.shuffle(randomize)

# Randomize the order of inputs and outputs using the randomized indices
inputs = inputs[randomize]
outputs = outputs[randomize]

# Split the data into three sets: training, testing, and validation
TRAIN_SPLIT = int(0.6 * num_inputs)
TEST_SPLIT = int(0.2 * num_inputs + TRAIN_SPLIT)

# Split the inputs based on the defined splits
inputs_train, inputs_test, inputs_validate = np.split(inputs, [TRAIN_SPLIT, TEST_SPLIT])

# Split the outputs based on the defined splits
outputs_train, outputs_test, outputs_validate = np.split(outputs, [TRAIN_SPLIT, TEST_SPLIT])

print("Data set randomization and splitting complete.")


## Build & Train the Model

Build and train a [TensorFlow](https://www.tensorflow.org) model using the high-level [Keras](https://www.tensorflow.org/guide/keras) API.

In [None]:
# Create a sequential model
model = tf.keras.Sequential()

# Add a dense layer with 50 units and ReLU activation function
model.add(tf.keras.layers.Dense(50, activation='relu'))

# Add a dense layer with 15 units and ReLU activation function
model.add(tf.keras.layers.Dense(15, activation='relu'))

# Add a dense layer with NUM_GESTURES units and softmax activation function
# Softmax is used because we expect only one gesture to occur per input
model.add(tf.keras.layers.Dense(NUM_GESTURES, activation='softmax'))

# Compile the model with optimizer, loss, and metrics
model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])

# Train the model
# Use inputs_train and outputs_train as training data
# Train for 600 epochs with a batch size of 1
# Use inputs_validate and outputs_validate as validation data
history = model.fit(inputs_train, outputs_train, epochs=600, batch_size=1, validation_data=(inputs_validate, outputs_validate))
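
As a rough sanity check on the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes each input is the flattened 600-value recording produced by the parsing cell (100 samples times 6 channels) and two gestures, as in this notebook. `dense_params` is an illustrative helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases for one fully connected layer."""
    return n_inputs * n_units + n_units


INPUT_SIZE = 100 * 6  # SAMPLES_PER_GESTURE readings x 6 sensor channels
NUM_GESTURES = 2      # length of the GESTURES list in this notebook
layer_units = [50, 15, NUM_GESTURES]

total = 0
n_inputs = INPUT_SIZE
for n_units in layer_units:
    count = dense_params(n_inputs, n_units)
    print(f"Dense({n_units}): {count:,} parameters")
    total += count
    n_inputs = n_units  # this layer's outputs feed the next layer

print(f"Total: {total:,} parameters")  # Total: 30,847 parameters
```

Almost all of the parameters sit in the first layer, so the input length (samples per gesture times channels) is what mainly drives model size.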

## Graph the loss

Graph the loss to see when the model stops improving.

In [None]:
# Increase the size of the graphs to (20, 10)
plt.rcParams["figure.figsize"] = (20, 10)

# Retrieve the loss values from the training history
loss = history.history['loss']
val_loss = history.history['val_loss']

# Create an array of epochs from 1 to the length of the loss values
epochs = range(1, len(loss) + 1)

# Plot the training loss as green dots
plt.plot(epochs, loss, 'g.', label='Training loss')

# Plot the validation loss as a solid blue line
plt.plot(epochs, val_loss, 'b', label='Validation loss')

# Set the title and labels for the graph
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')

# Add a legend to the graph
plt.legend()

# Display the graph
plt.show()
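
Alongside the plot, the epoch with the lowest validation loss can be read off numerically. This is a minimal sketch with made-up loss values; in the notebook the list would be `history.history['val_loss']`, and `best_epoch` is a hypothetical helper name.

```python
def best_epoch(val_loss):
    """Return the 1-based epoch with the lowest validation loss."""
    lowest = min(val_loss)
    return val_loss.index(lowest) + 1


# Made-up validation losses for illustration only
example_val_loss = [0.30, 0.21, 0.15, 0.12, 0.13, 0.14]
print(best_epoch(example_val_loss))  # 4
```

If the best epoch falls well before the final one, training for fewer epochs may be enough.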

## Graph the loss again, skipping a bit of the start

Graph the same data as in the previous code cell, but skip the first 100 epochs so the plot zooms in on the region where the model starts to converge.

In [None]:
# Define the number of epochs to skip at the start
SKIP = 100

# Plot the training loss starting from SKIP epoch
plt.plot(epochs[SKIP:], loss[SKIP:], 'g.', label='Training loss')

# Plot the validation loss starting from SKIP epoch
plt.plot(epochs[SKIP:], val_loss[SKIP:], 'b.', label='Validation loss')

# Set the title and labels for the graph
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')

# Add a legend to the graph
plt.legend()

# Display the graph
plt.show()

## Graph the mean absolute error

[Mean absolute error](https://en.wikipedia.org/wiki/Mean_absolute_error) is another metric to judge the performance of the model.
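
For a single prediction, the mean absolute error is the average of the absolute differences between the one-hot target and the model's output. The sketch below shows the calculation in plain Python with made-up values. It illustrates the metric and is not how Keras computes it internally.

```python
def mean_absolute_error(targets, predictions):
    """Average of |target - prediction| over all output values."""
    errors = [abs(t - p) for t, p in zip(targets, predictions)]
    return sum(errors) / len(errors)


# One-hot target for the first gesture and a made-up softmax output
target = [1.0, 0.0]
prediction = [0.75, 0.25]
print(mean_absolute_error(target, prediction))  # 0.25
```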



In [None]:
# Retrieve the mean absolute error (MAE) values from the training history
mae = history.history['mae']
val_mae = history.history['val_mae']

# Plot the training MAE starting from SKIP epoch
plt.plot(epochs[SKIP:], mae[SKIP:], 'g.', label='Training MAE')

# Plot the validation MAE starting from SKIP epoch
plt.plot(epochs[SKIP:], val_mae[SKIP:], 'b.', label='Validation MAE')

# Set the title and labels for the graph
plt.title('Training and validation mean absolute error')
plt.xlabel('Epochs')
plt.ylabel('MAE')

# Add a legend to the graph
plt.legend()

# Display the graph
plt.show()

## Convert the Trained Model to TensorFlow Lite

In the next cell, the model is converted to the TensorFlow Lite format, and the size of the model in bytes is printed out.

In [None]:
# Convert the model to TensorFlow Lite format without quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the converted model to disk, closing the file when done
with open("gesture_model.tflite", "wb") as f:
    f.write(tflite_model)

# Get the size of the saved model file
import os
basic_model_size = os.path.getsize("gesture_model.tflite")

# Print the size of the saved model
print("Model is %d bytes" % basic_model_size)

## Encode the Model in an Arduino Header File 

The next cell creates a constant byte array that contains the TensorFlow Lite model. The provided code converts the model to an Arduino header file format, which can be easily included and used in your Arduino sketch.

In [None]:
# Create the model.h file and write the initial line
!echo "const unsigned char model[] = {" > /content/model.h

# Convert the content of gesture_model.tflite to hexadecimal representation and append it to model.h
!cat gesture_model.tflite | xxd -i >> /content/model.h

# Write the closing line to model.h
!echo "};" >> /content/model.h

# Get the size of the model.h file, using the same path it was written to
import os
model_h_size = os.path.getsize("/content/model.h")

# Print the size of the model.h file
print(f"Header file, model.h, is {model_h_size:,} bytes.")
print("\nOpen the side panel (refresh if needed). Double click model.h to download the file.")
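
If `xxd` is not available, the same kind of header can be produced in Python. The sketch below formats raw bytes as a C array in the same spirit as the shell commands above. The function `bytes_to_c_array` and its `per_line` parameter are assumptions made for illustration, and the payload is a made-up example rather than a real model.

```python
def bytes_to_c_array(data, name="model", per_line=12):
    """Format raw bytes as a C unsigned char array definition."""
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append("  " + ", ".join(f"0x{byte:02x}" for byte in chunk))
    body = ",\n".join(lines)
    return f"const unsigned char {name}[] = {{\n{body}\n}};\n"


# Tiny made-up payload; in the notebook the bytes would be read from
# gesture_model.tflite in binary mode
print(bytes_to_c_array(b"\x1c\x00\x00\x00TFL3"))
```

Writing the returned string to `model.h` would give a file with the same overall layout as the one built by the `echo` and `xxd -i` commands.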

## Classifying IMU Data

Now it's time to switch back to `imu_classification` and run our new model on the Arduino Nano 33 BLE Sense to classify the accelerometer and gyroscope data.
