# ML: Activity classification

This notebook contains the code used to train machine learning models that recognize the activities of the TUG test. A Convolutional Neural Network (CNN) architecture was chosen to classify the activities. This notebook contains:

- An analysis of the two main approaches used to split the TUG test subphases.
- The training of two models that classify activities from smartphone and smartwatch sensor data.

**Inputs:**

- `01_SENSOR-DATA/03_WINDOWED/`: data used to train the models.

**Outputs:** (`02_ML/`)

- `01_splitting-approaches-reports.json`: results of the models trained with the different splitting approaches.
- `02_splitting-approaches-comparison.csv` (**Table II**): comparison of the different splitting approaches.
- `03_sp_data_model.tflite`: ML model for activity classification from smartphone accelerometer and gyroscope data.
- `03_sw_data_model.tflite`: ML model for activity classification from smartwatch accelerometer and gyroscope data.

## Imports

In [1]:
import os
os.environ['OUTDATED_IGNORE'] = '1'

import random as py_random

import numpy as np
import pandas as pd
import pingouin as pg
import tensorflow as tf

from alive_progress import alive_bar

from functions.fs_utils import list_subjects_folders, to_json, load_json

from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

from tensorflow import keras
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv1D, Flatten
from tensorflow.keras.optimizers import Adam

from tflite_support import flatbuffers
from tflite_support import metadata as _metadata
from tflite_support import metadata_schema_py_generated as _metadata_fb

pg.options['round.column.p-val'] = 3
## Disable GPU usage: bad performance on M1 with low batch sizes
tf.config.set_visible_devices([], 'GPU')

## Random seed

In [2]:
RANDOM_SEED = 53

def set_seed():
    os.environ['PYTHONHASHSEED'] = str(RANDOM_SEED)
    py_random.seed(RANDOM_SEED)
    np.random.seed(RANDOM_SEED)
    tf.random.set_seed(RANDOM_SEED)
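
As a minimal illustration of why `set_seed` is called before each experiment, the sketch below re-seeds Python's `random` module and checks that the same sequence is drawn again. It uses only the standard library; `sample_after_seeding` is a hypothetical helper, and the NumPy and TensorFlow generators seeded above are assumed to behave analogously.

```python
import random

RANDOM_SEED = 53  # same value as in the notebook


def sample_after_seeding(n=5):
    """Seed the generator, then draw n pseudo-random numbers."""
    random.seed(RANDOM_SEED)
    return [random.random() for _ in range(n)]


first_run = sample_after_seeding()
second_run = sample_after_seeding()
print(first_run == second_run)  # True: same seed, same sequence
```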

## Constants

In [3]:
DATA_DIR = '01_SENSOR-DATA'

WINDOWED_DATA_DIR = os.path.join(DATA_DIR, '03_WINDOWED')
TURNING_AND_SITTING_WINDOWED_DIR = os.path.join(WINDOWED_DATA_DIR, '01_TURNING-AND-SITTING')
TURN_TO_SIT_WINDOWED_DIR = os.path.join(WINDOWED_DATA_DIR, '02_TURN-TO-SIT')

MODEL_DIR = '02_ML'
SPLIT_REPORTS = os.path.join(MODEL_DIR, '01_splitting-approaches-reports.json')
SPLIT_COMPARISON = os.path.join(MODEL_DIR, '02_splitting-approaches-comparison.csv')
MODEL_FILE = os.path.join(MODEL_DIR, '03_{0}_2.tflite')
LABELS_FILE = os.path.join(MODEL_DIR, '03_labels_2.txt')

## Data Functions

In [4]:
def load_subjects_data(path, source):
    x = {}
    y = {}
    subjects = list_subjects_folders(path)
    
    for subject in subjects:
        subject_dir = os.path.join(path, subject)

        x[subject] = np.load(os.path.join(subject_dir, f'{subject}_{source}.npy'))
        y[subject] = np.load(os.path.join(subject_dir, f'{subject}_{source}_gt.npy'))
                
    return x, y

In [5]:
NUM_CLASSES = 5

def ground_truth_to_categorical(y, mapping): #One-hot encoding: Transform discrete values to an encoded array
    y_copy = y.copy()
    for subject, gt in y_copy.items():
        mapped_gt = list(map(lambda i : mapping[i], gt))
        y_copy[subject] = to_categorical(mapped_gt, NUM_CLASSES)
        
    return y_copy
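
The encoding step can also be sketched without Keras. The example below assumes that the ground-truth arrays hold string labels (as the mapping dictionaries used later suggest) and builds the one-hot rows by indexing into an identity matrix. `one_hot` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

# Label-to-index mapping, as used later in the notebook.
mapping = {"SEATED": 0, "STANDING_UP": 1, "WALKING": 2, "TURNING": 3, "SITTING_DOWN": 4}


def one_hot(labels, mapping):
    """Map string labels to indices and pick the matching identity rows."""
    indices = np.array([mapping[label] for label in labels])
    return np.eye(len(mapping), dtype="float32")[indices]


encoded = one_hot(["SEATED", "WALKING", "SITTING_DOWN"], mapping)
print(encoded)
```

Each row contains a single 1 at the position given by the mapping, which is the format expected by the `categorical_crossentropy` loss.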

In [6]:
def merge_subjects_datasets(x, y, subjects):
    x_dataset = np.empty((0,6,50), dtype='float64')
    y_dataset = np.empty((0, 5))
    for subject in subjects:
        x_dataset = np.append(x_dataset, x[subject], axis=0)
        y_dataset = np.append(y_dataset, y[subject], axis=0)
    
    return x_dataset, y_dataset
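
A possible alternative to the repeated `np.append` calls is a single `np.concatenate` per array, which avoids copying the growing dataset on every iteration. This is only a sketch on toy data with hypothetical subject IDs; unlike the function above, it raises an error if the list of subjects is empty.

```python
import numpy as np


def merge_subjects(x, y, subjects):
    """Stack the per-subject arrays of the selected subjects along axis 0."""
    x_merged = np.concatenate([x[s] for s in subjects], axis=0)
    y_merged = np.concatenate([y[s] for s in subjects], axis=0)
    return x_merged, y_merged


# Toy data: two hypothetical subjects with 3 and 2 windows of shape (6, 50).
x = {"s01": np.zeros((3, 6, 50)), "s02": np.ones((2, 6, 50))}
y = {"s01": np.eye(5)[[0, 1, 2]], "s02": np.eye(5)[[3, 4]]}

x_all, y_all = merge_subjects(x, y, ["s01", "s02"])
print(x_all.shape, y_all.shape)  # (5, 6, 50) (5, 5)
```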

## Model architecture

After testing several architectures (e.g., MLP, CNN+Dense, CNN+LSTM, and ConvLSTM), the CNN+Dense architecture showed the best results. Regarding the internal architecture of the model, simple models obtained very good results in comparison with more complex ones. The candidate configurations were evaluated using 80% of the data for training (20% of which was held out for validation) and 20% for testing. The chosen CNN+Dense architecture is the following:

- 1D Convolutional Layer (Input layer, 6x50): 64 filters of size 10, with *relu* as activation function.
- Flattening layer
- Dense layer: 256 neurons, with *relu* as activation function
- Dense layer (Output layer, 1x5): 5 neurons, with *softmax* as activation function.

> [!NOTE]
> The testing of the different architectures is not included in this notebook.

In [7]:
def create_model():
    model = Sequential([
        Conv1D(filters=64, kernel_size=10, padding='same', activation='relu', input_shape=(6, 50)),
        Flatten(),
        Dense(256, activation='relu'),
        Dense(NUM_CLASSES, activation='softmax')
    ])
    
    model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])
    return model


BATCH_SIZE = 20
EPOCHS = 50

def create_trainer(batch_size=BATCH_SIZE, epochs=EPOCHS):
    def train_model(x, y, validation_data=None, verbose=0):
        model = create_model()
        model.fit(x, y, batch_size=batch_size, epochs=epochs, validation_data=validation_data, verbose=verbose)
        return model
    
    return train_model

In [8]:
cnn_model = create_model()
cnn_model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv1d (Conv1D)             (None, 6, 64)             32064     
                                                                 
 flatten (Flatten)           (None, 384)               0         
                                                                 
 dense (Dense)               (None, 256)               98560     
                                                                 
 dense_1 (Dense)             (None, 5)                 1285      
                                                                 
Total params: 131,909
Trainable params: 131,909
Non-trainable params: 0
_________________________________________________________________
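
The parameter counts in the summary can be checked by hand. The sketch below redoes the arithmetic in plain Python, relying on the fact that Keras reads `input_shape=(6, 50)` as 6 steps with 50 channels each (channels-last) and that `padding='same'` keeps the number of steps unchanged.

```python
# Shapes taken from create_model().
steps, channels = 6, 50
filters, kernel_size = 64, 10
dense_units, num_classes = 256, 5

conv_params = (kernel_size * channels + 1) * filters  # kernel weights + one bias per filter
flat_size = steps * filters                            # output of the Flatten layer
dense_params = (flat_size + 1) * dense_units           # weights + biases
output_params = (dense_units + 1) * num_classes        # weights + biases

total = conv_params + dense_params + output_params
print(conv_params, flat_size, dense_params, output_params, total)
# 32064 384 98560 1285 131909
```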


## Splitting approaches: Turning and sitting vs. turning to sit

In the literature, there are two different approaches to splitting the subphases of the TUG test. The main difference lies in the last part of the test, when the subject has to turn and sit down on the chair. Some works split these two phases, having a _turning_ and a _sitting_down_ phase. Others keep them together as a single _turn_to_sit_ phase. Next, we analyse which approach is better.

1. We have 4 datasets: 2 with data collected from a smartwatch and 2 with data collected from a smartphone. The 2 smartwatch datasets contain the same data but differ in how the data is labelled (the same applies to the smartphone datasets).
2. With each dataset, we train 100 models, splitting the subjects 80/20 into training and test sets each time.
3. Finally, we are going to compare the accuracy and the f1-score of the phases of interest (_turning_, _sitting_down_ and _turn_to_sit_) of the models.
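
The subject-wise 80/20 split in step 2 can be sketched as follows. Splitting by subject rather than by window means that no subject contributes data to both the training and the test set. The notebook itself uses `train_test_split` from scikit-learn; `split_subjects` and the subject IDs below are hypothetical and only illustrate the idea with the standard library.

```python
import random


def split_subjects(subjects, test_fraction=0.2, seed=53):
    """Shuffle subject IDs and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical subject IDs; the real ones come from the data folders.
subjects = [f"s{i:02d}" for i in range(1, 31)]
train_subjects, test_subjects = split_subjects(subjects)

print(len(train_subjects), len(test_subjects))   # 24 6
print(set(train_subjects) & set(test_subjects))  # set(): no subject in both
```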

### Functions

In [9]:
def generate_report(y_test, y_pred, activity_names):
    y_test = np.argmax(y_test, axis=1)
    y_pred = np.argmax(y_pred, axis=1)
    return classification_report(y_test, y_pred, target_names=activity_names, output_dict=True)


def training_report_from_datasets(datasets, models_per_dataset=100):
    set_seed()
    trainer = create_trainer()
    reports = {}

    for dataset_id, (x, y) in datasets.items():
        reports[dataset_id] = []
        activity_names = TURNING_AND_SITTING_MAPPING.keys() if 'turning_and_sitting' in dataset_id else TURN_TO_SIT_MAPPING.keys()
        with alive_bar(models_per_dataset, title=f'Training models for dataset {dataset_id}', force_tty=True) as progress:
            for i in range(models_per_dataset):
                train_subjects, test_subjects = train_test_split(list(x.keys()), test_size=0.2)

                x_train, y_train = merge_subjects_datasets(x, y, train_subjects)
                x_test, y_test = merge_subjects_datasets(x, y, test_subjects)

                model = trainer(x_train, y_train, verbose=0)
                y_pred = model.predict(x_test)

                reports[dataset_id].append(generate_report(y_test, y_pred, activity_names))
                progress()

    return reports


def get_accuracy_and_f1_scores(reports):
    results = {}
    for dataset_key, dataset_reports in reports.items():
        dataset_accuracies = []
        dataset_turning = []
        dataset_sitting = []
        dataset_turn_to_sit = []

        if 'turning_and_sitting' in dataset_key:
            for dataset_report in dataset_reports:
                dataset_accuracies.append(dataset_report['accuracy'])
                dataset_turning.append(dataset_report['TURNING']['f1-score'])
                dataset_sitting.append(dataset_report['SITTING_DOWN']['f1-score'])

            results[dataset_key] = {
                'accuracy': dataset_accuracies,
                'f1-turning': dataset_turning,
                'f1-sitting-down': dataset_sitting,
            }
        else:
            for dataset_report in dataset_reports:
                dataset_accuracies.append(dataset_report['accuracy'])
                dataset_turning.append(dataset_report['TURNING']['f1-score'])
                dataset_turn_to_sit.append(dataset_report['TURN_TO_SIT']['f1-score'])

            results[dataset_key] = {
                'accuracy': dataset_accuracies,
                'f1-turning': dataset_turning,
                'f1-turn_to_sit': dataset_turn_to_sit
            }

    return results


# Computed MWU power using G*Power 3.1:
# - Test family > t tests
# - Statistical test > Means: Wilcoxon-Mann_Whitney test (two groups)
# - Type of power analysis > Post hoc: Compute achieved power -- given alpha, sample size, and effect size.
COMPUTED_POWERS = {
    'sp_accuracy': 1
}

def compare_results(datasource_reports, metrics):
    def test_builder(parametric, source, metric, alternative='two-sided'):
        def mwu(g1, g2):
            res = pg.mwu(g1, g2, alternative=alternative).loc['MWU']
            power = COMPUTED_POWERS[f'{source}_{metric}']
            return [np.round(np.median(g1), 3), np.round(np.median(g2), 3), f'U={res["U-val"]}, p-val={res["p-val"]}, power={power}']

        def ttest(g1, g2):
            res = pg.ttest(g1, g2, alternative=alternative).loc['T-test']
            return [np.round(np.mean(g1), 3), np.round(np.mean(g2), 3), f't({res["dof"]})={res["T"]}, p-val={res["p-val"]}, power={res["power"]}']

        return ttest if parametric else mwu
    
    results = []
    for source, datasource_report in datasource_reports.items():
        for metric in metrics:
            partial_result = [source, metric]
            if metric not in datasource_report['ts']:
                results.append(partial_result + ['-', np.round(np.mean(datasource_report['tts'][metric]), 3), '-'])
            elif metric not in datasource_report['tts']:
                results.append(partial_result + [np.round(np.mean(datasource_report['ts'][metric]), 3), '-', '-'])
            else:
                ts_dataset = datasource_report['ts'][metric]
                tts_dataset = datasource_report['tts'][metric]

                tester = test_builder(bool(pg.normality(ts_dataset)['normal'].values[0]) and bool(pg.normality(tts_dataset)["normal"].values[0]), source, metric)
                results.append(partial_result + tester(ts_dataset, tts_dataset))
    return pd.DataFrame(results, columns=['source', 'metric', 'turning_sitting', 'turn_to_sit', 'two-tailed test'])
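
To make the metrics collected above concrete, the sketch below reproduces on toy data what `generate_report` delegates to scikit-learn: one-hot targets and softmax-like outputs are reduced to class indices with an argmax, and accuracy and per-class f1-scores are computed from them. The helpers and the data are hypothetical, and the per-class formula is the standard F1 definition rather than code taken from the notebook.

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=row.__getitem__)


def f1_for_class(y_true, y_pred, cls):
    """F1 = 2*TP / (2*TP + FP + FN) for one class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy one-hot targets and softmax-like outputs for 3 classes.
y_test = [[1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1]]
y_prob = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]

true_idx = [argmax(r) for r in y_test]
pred_idx = [argmax(r) for r in y_prob]

accuracy = sum(t == p for t, p in zip(true_idx, pred_idx)) / len(true_idx)
f1_scores = [round(f1_for_class(true_idx, pred_idx, c), 3) for c in range(3)]
print(accuracy, f1_scores)  # 0.75 [0.667, 0.667, 1.0]
```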

### Loading the data

In [10]:
x_sw_turning_and_sitting, y_sw_turning_and_sitting = load_subjects_data(TURNING_AND_SITTING_WINDOWED_DIR, source='sw')
x_sp_turning_and_sitting, y_sp_turning_and_sitting = load_subjects_data(TURNING_AND_SITTING_WINDOWED_DIR, source='sp')

x_sw_turn_to_sit, y_sw_turn_to_sit = load_subjects_data(TURN_TO_SIT_WINDOWED_DIR, source='sw')
x_sp_turn_to_sit, y_sp_turn_to_sit = load_subjects_data(TURN_TO_SIT_WINDOWED_DIR, source='sp')

One-hot encoding:

- Turning and sitting dataset:
  - 0 (SEATED) &rarr; [1,0,0,0,0]
  - 1 (STANDING_UP) &rarr; [0,1,0,0,0]
  - 2 (WALKING) &rarr; [0,0,1,0,0]
  - 3 (TURNING) &rarr; [0,0,0,1,0]
  - 4 (SITTING_DOWN) &rarr; [0,0,0,0,1]
- Turn to sit dataset:
  - 0 (SEATED) &rarr; [1,0,0,0,0]
  - 1 (STANDING_UP) &rarr; [0,1,0,0,0]
  - 2 (WALKING) &rarr; [0,0,1,0,0]
  - 3 (TURNING) &rarr; [0,0,0,1,0]
  - 4 (TURN_TO_SIT) &rarr; [0,0,0,0,1]

In [11]:
TURNING_AND_SITTING_MAPPING = {"SEATED": 0, "STANDING_UP": 1, "WALKING": 2, "TURNING": 3, "SITTING_DOWN": 4}
TURN_TO_SIT_MAPPING = {"SEATED": 0, "STANDING_UP": 1, "WALKING": 2, "TURNING": 3, "TURN_TO_SIT": 4}

y_sw_turning_and_sitting = ground_truth_to_categorical(y_sw_turning_and_sitting, TURNING_AND_SITTING_MAPPING)
y_sp_turning_and_sitting = ground_truth_to_categorical(y_sp_turning_and_sitting, TURNING_AND_SITTING_MAPPING)
y_sw_turn_to_sit = ground_truth_to_categorical(y_sw_turn_to_sit, TURN_TO_SIT_MAPPING)
y_sp_turn_to_sit = ground_truth_to_categorical(y_sp_turn_to_sit, TURN_TO_SIT_MAPPING)

### Training the models

The `training_report_from_datasets` function trains 100 models for each dataset (400 in total), which is a **long process**. Alternatively, you can skip the training and load previously stored results from a file (see the loading cell below).

In [None]:
datasets = {
    'sw_turning_and_sitting': [x_sw_turning_and_sitting, y_sw_turning_and_sitting],
    'sp_turning_and_sitting': [x_sp_turning_and_sitting, y_sp_turning_and_sitting],
    'sw_turn_to_sit': [x_sw_turn_to_sit, y_sw_turn_to_sit],
    'sp_turn_to_sit': [x_sp_turn_to_sit, y_sp_turn_to_sit],
}

reports = training_report_from_datasets(datasets)

The results of the training are stored in `02_ML/01_splitting-approaches-reports.json`:

In [None]:
to_json(reports, SPLIT_REPORTS)

Run this cell to load the results of a previously stored training run:

In [10]:
reports = load_json(SPLIT_REPORTS)
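
The `to_json` and `load_json` helpers come from the project's `functions.fs_utils` module, which is not shown here. The sketch below assumes they are thin wrappers around the standard `json` module and shows that a nested report survives the round trip. The `_sketch` functions and the toy report are hypothetical.

```python
import json
import os
import tempfile


def to_json_sketch(data, path):
    """Assumed behaviour of to_json: dump a dict to a JSON file."""
    with open(path, "w") as f:
        json.dump(data, f)


def load_json_sketch(path):
    """Assumed behaviour of load_json: read a JSON file back into a dict."""
    with open(path) as f:
        return json.load(f)


# Toy report with the same nesting as classification_report(output_dict=True).
reports = {"sw_turning_and_sitting": [{"accuracy": 0.85, "TURNING": {"f1-score": 0.8}}]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reports.json")
    to_json_sketch(reports, path)
    restored = load_json_sketch(path)

print(restored == reports)  # True
```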

### Results

In [11]:
results = get_accuracy_and_f1_scores(reports)

In [12]:
comparison = compare_results({
    'sw': {
        'ts': results['sw_turning_and_sitting'],
        'tts': results['sw_turn_to_sit']
    },
    'sp': {
        'ts': results['sp_turning_and_sitting'],
        'tts': results['sp_turn_to_sit']
    },
    
}, ['accuracy', 'f1-turning', 'f1-sitting-down', 'f1-turn_to_sit'])
comparison

| | source | metric | turning_sitting | turn_to_sit | two-tailed test |
|---|---|---|---|---|---|
| 0 | sw | accuracy | 0.848 | 0.809 | t(198)=13.459315696940894, p-val=0.0, power=1.0 |
| 1 | sw | f1-turning | 0.795 | 0.565 | t(198)=44.34041443328367, p-val=0.0, power=1.0 |
| 2 | sw | f1-sitting-down | 0.804 | - | - |
| 3 | sw | f1-turn_to_sit | - | 0.735 | - |
| 4 | sp | accuracy | 0.857 | 0.789 | U=9036.0, p-val=0.0, power=1 |
| 5 | sp | f1-turning | 0.846 | 0.529 | t(198)=57.06501597390656, p-val=0.0, power=1.0 |
| 6 | sp | f1-sitting-down | 0.753 | - | - |
| 7 | sp | f1-turn_to_sit | - | 0.655 | - |


From these results we can see:

- The accuracy of the _turning_sitting_ approach is significantly better than that of the _turn_to_sit_ approach **for both** data sources.
- The f1-score of the _turning_ activity in the _turning_sitting_ approach is significantly better than in the _turn_to_sit_ approach **for both** data sources.
- The f1-score of the _sitting_down_ activity is higher than that of the _turn_to_sit_ activity **for both** data sources.

The above table corresponds to **Table II** of the paper.

In [13]:
comparison.to_csv(SPLIT_COMPARISON, index=False)

## Training the model

In [15]:
x_sw, y_sw = load_subjects_data(TURNING_AND_SITTING_WINDOWED_DIR, source='sw')
x_sp, y_sp = load_subjects_data(TURNING_AND_SITTING_WINDOWED_DIR, source='sp')

TURNING_AND_SITTING_MAPPING = {"SEATED": 0, "STANDING_UP": 1, "WALKING": 2, "TURNING": 3, "SITTING_DOWN": 4}
y_sw = ground_truth_to_categorical(y_sw, TURNING_AND_SITTING_MAPPING)
y_sp = ground_truth_to_categorical(y_sp, TURNING_AND_SITTING_MAPPING)

In [16]:
x_sw_train, y_sw_train = merge_subjects_datasets(x_sw, y_sw, list(x_sw.keys()))
x_sp_train, y_sp_train = merge_subjects_datasets(x_sp, y_sp, list(x_sp.keys()))

training_sets = {
    "sw_data_model": [x_sw_train, y_sw_train],
    "sp_data_model": [x_sp_train, y_sp_train]
}

In [22]:
models = {}
for model_id, (x_set, y_set) in training_sets.items():
    trainer = create_trainer(epochs=100)
    set_seed()
    print(f'Training model {model_id}')
    models[model_id] = trainer(x_set, y_set, verbose=1)

Training model sw_data_model

### Store the model

In [23]:
def save(model, path, export_format='.tflite'):
    if not path.endswith(export_format):
        path += export_format
    
    if export_format == '.tflite':
        converter = tf.lite.TFLiteConverter.from_keras_model(model)
        tflite_model = converter.convert()
        with open(path, 'wb') as file:
            file.write(tflite_model)
    elif export_format == ".h5":
        model.save(path)

In [24]:
for model_id, model in models.items():
    save(model, MODEL_FILE.format(model_id))

2023-04-14 17:53:27.001448: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.


INFO:tensorflow:Assets written to: /var/folders/61/xgwbr5s950x987vmbr_898w40000gn/T/tmpy87eki8r/assets


2023-04-14 17:53:28.814698: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:357] Ignored output_format.
2023-04-14 17:53:28.814716: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:360] Ignored drop_control_dependency.
2023-04-14 17:53:28.815506: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /var/folders/61/xgwbr5s950x987vmbr_898w40000gn/T/tmpy87eki8r
2023-04-14 17:53:28.816163: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2023-04-14 17:53:28.816168: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /var/folders/61/xgwbr5s950x987vmbr_898w40000gn/T/tmpy87eki8r
2023-04-14 17:53:28.818610: I tensorflow/cc/saved_model/loader.cc:228] Restoring SavedModel bundle.
2023-04-14 17:53:28.841903: I tensorflow/cc/saved_model/loader.cc:212] Running initialization op on SavedModel bundle at path: /var/folders/61/xgwbr5s950x987vmbr_898w40000gn/T/tmpy87eki8r
2023-04-

INFO:tensorflow:Assets written to: /var/folders/61/xgwbr5s950x987vmbr_898w40000gn/T/tmptjuwl7hk/assets


2023-04-14 17:53:30.899387: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:357] Ignored output_format.
2023-04-14 17:53:30.899407: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:360] Ignored drop_control_dependency.
2023-04-14 17:53:30.899480: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /var/folders/61/xgwbr5s950x987vmbr_898w40000gn/T/tmptjuwl7hk
2023-04-14 17:53:30.900115: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2023-04-14 17:53:30.900120: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /var/folders/61/xgwbr5s950x987vmbr_898w40000gn/T/tmptjuwl7hk
2023-04-14 17:53:30.902063: I tensorflow/cc/saved_model/loader.cc:228] Restoring SavedModel bundle.
2023-04-14 17:53:30.920882: I tensorflow/cc/saved_model/loader.cc:212] Running initialization
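
For reference, the file names produced by the loop above follow from the `MODEL_FILE` template defined in the constants cell. The short sketch below only expands that template for the two model IDs used in this notebook. Because the resulting paths already end in `.tflite`, `save` does not append the extension again.

```python
import os

MODEL_DIR = "02_ML"
MODEL_FILE = os.path.join(MODEL_DIR, "03_{0}_2.tflite")  # template from the constants cell

# Expand the template for each model ID used in this notebook.
paths = [MODEL_FILE.format(model_id) for model_id in ("sw_data_model", "sp_data_model")]
for path in paths:
    print(path)
```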

### Add metadata to the model

The metadata contains useful information about the model, such as its name, version, and author. The most important piece of metadata is the associated file, which contains the labels of the model's outputs.

#### Labels file creation

Create the file containing the labels for the model

In [25]:
with open(LABELS_FILE, 'w') as labels_file:
    for key in TURNING_AND_SITTING_MAPPING.keys():
        labels_file.write(f'{key}\n')
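
The labels file is what lets a consumer of the model turn an output index back into an activity name. The sketch below writes the labels to a temporary file (a stand-in for `LABELS_FILE`), reads them back, and maps a hypothetical probability vector to its label. It assumes, as in the cell above, one label per line in output-index order.

```python
import os
import tempfile

labels = ["SEATED", "STANDING_UP", "WALKING", "TURNING", "SITTING_DOWN"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "labels.txt")  # stand-in for LABELS_FILE

    # Write one label per line, in output-index order.
    with open(path, "w") as f:
        for label in labels:
            f.write(f"{label}\n")

    # Read the labels back; line i names output neuron i.
    with open(path) as f:
        loaded = [line.strip() for line in f if line.strip()]

# Hypothetical model output for one window.
probabilities = [0.05, 0.10, 0.15, 0.60, 0.10]
predicted = loaded[probabilities.index(max(probabilities))]
print(predicted)  # TURNING
```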

#### Model metadata

- Name
- Version
- Author

In [26]:
def build_model_meta(name, version, author):
    model_meta = _metadata_fb.ModelMetadataT()
    model_meta.name = name
    model_meta.version = version
    model_meta.author = author
    return model_meta

#### Input metadata

Metadata regarding the inputs of the model

In [27]:
def build_input_meta(name, desc):
    input_meta = _metadata_fb.TensorMetadataT()
    input_meta.name = name
    input_meta.description = (desc)
    return input_meta

#### Output metadata

Metadata regarding the output of the model

In [28]:
def build_output_meta(name, desc, associated_file):
    output_meta = _metadata_fb.TensorMetadataT()
    output_meta.name = name
    output_meta.description = desc
    output_meta.content = _metadata_fb.ContentT()
    output_meta.content.content_properties = _metadata_fb.FeaturePropertiesT()
    output_meta.content.contentPropertiesType = (
        _metadata_fb.ContentProperties.FeatureProperties)
    output_stats = _metadata_fb.StatsT()
    output_stats.max = [1.0]
    output_stats.min = [0.0]
    output_meta.stats = output_stats
    
    output_meta.associatedFiles = [associated_file]
    return output_meta


def build_associated_file(file_path, desc):
    associated_file = _metadata_fb.AssociatedFileT()
    associated_file.name = os.path.basename(file_path)
    associated_file.description = desc
    associated_file.type = _metadata_fb.AssociatedFileType.TENSOR_AXIS_LABELS
    return associated_file

#### Create metadata

In [31]:
metadata = {
    "sp_data_model": {
        "model-meta": {
            "name": "CNN Smartphone Actions Classifier",
            "version": "2.0.0",
            "author": "Miguel Matey Sanz"
        },
        "input-meta": {
            "name": "Raw data",
            "desc": "Accelerometer and gyroscope raw data sampled at 100Hz from a smartphone."
        },
        "associated-file-meta": {
            "path": LABELS_FILE,
            "desc": "Actions that the model can recognize."
        },
        "output-meta": {
            "name": "Probability",
            "desc": "Probability of each one of the actions."
        }
    },
    "sw_data_model": {
        "model-meta": {
            "name": "CNN Smartwatch Actions Classifier",
            "version": "2.0.0",
            "author": "Miguel Matey Sanz"
        },
        "input-meta": {
            "name": "Raw data",
            "desc": "Accelerometer and gyroscope raw data sampled at 100Hz from a smartwatch."
        },
        "associated-file-meta": {
            "path": LABELS_FILE,
            "desc": "Actions that the model can recognize."
        },
        "output-meta": {
            "name": "Probability",
            "desc": "Probability of each one of the actions."
        }
    }
}

def add_metadata_to_model(model_file_path, metadata):
    model_meta = build_model_meta(
        metadata['model-meta']["name"],
        metadata['model-meta']["version"],
        metadata['model-meta']["author"],
    )
    
    input_meta = build_input_meta(
        metadata['input-meta']["name"],
        metadata['input-meta']["desc"]
    )
    
    associated_file = build_associated_file(
        metadata['associated-file-meta']["path"],
        metadata['associated-file-meta']["desc"]
    )
    
    output_meta = build_output_meta(
        metadata['output-meta']["name"],
        metadata['output-meta']["desc"],
        associated_file
    )
    
    subgraph = _metadata_fb.SubGraphMetadataT()
    subgraph.inputTensorMetadata = [input_meta]
    subgraph.outputTensorMetadata = [output_meta]
    model_meta.subgraphMetadata = [subgraph]
    
    b = flatbuffers.Builder(0)
    b.Finish(
        model_meta.Pack(b),
        _metadata.MetadataPopulator.METADATA_FILE_IDENTIFIER)
    metadata_buf = b.Output()
    
    populator = _metadata.MetadataPopulator.with_model_file(model_file_path)
    populator.load_metadata_buffer(metadata_buf)
    populator.load_associated_files([LABELS_FILE])
    populator.populate()
    
    updated_model_buff = populator.get_model_buffer()
    with open(model_file_path, "wb") as f:
        f.write(updated_model_buff)

In [32]:
for model_id in models.keys():
    add_metadata_to_model(MODEL_FILE.format(model_id), metadata[model_id])