# Modeling

## Reference

The code in this notebook is adapted from the following YouTube tutorial: https://www.youtube.com/watch?v=doDUihpj6ro

## Usage

This notebook trains an LSTM-based neural network on data generated by the notebook "1_Example-Data-Generation.ipynb".

The path to the input data, `DATA_PATH`, is set to 'MP_Data_test'. Change it if you want to use different data, e.g. your own data generated with "1_Example-Data-Generation.ipynb".

The trained model (architecture and weights, not only the weights) is saved under the name `model_name = 'first_model_whoop_whoop.h5'`. Change it if you want ;)

## 1. Install and Import Dependencies

### Install Dependencies

In [1]:
%pip install tensorflow-macos opencv-python mediapipe-silicon scikit-learn matplotlib
#!pip install tensorflow==2.4.1 tensorflow-gpu==2.4.1 opencv-python mediapipe sklearn matplotlib # original code line from tutorial (he used a Windows system)

Note: you may need to restart the kernel to use updated packages.


### Import Dependencies

In [18]:
# general 
import numpy as np
import pandas as pd
import os # easier file path handling

# for device camera feed
import cv2 # opencv
from matplotlib import pyplot as plt # imshow for easy visualization
import time # to insert breaks / "sleep" in between frames
import mediapipe as mp # for detecting landmarks (keypoints) in the camera frames

# for data pre-processing
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical

# for model evaluation
from sklearn.metrics import multilabel_confusion_matrix, ConfusionMatrixDisplay, accuracy_score

## 2. Preprocess Data and Create Labels and Features

### Some global setup (edit to your needs)

In [3]:
# path for input data
DATA_PATH = os.path.join('MP_Data_test')
#DATA_PATH = os.path.join('MP_Data')

# actions to detect (get later from our .json file)
actions = np.array(['hello', 'thanks', 'iloveyou'])

# 30 sequences (videos) per action / word / sign (get it later from input file)
no_sequences = 30

# each video with 30 frames (get it later from input file)
sequence_length = 30

# create label map (dict, later our .json file)
label_map = {label:num for num, label in enumerate(actions)}
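
The label map goes from action name to integer index. For decoding predictions later, the opposite direction is useful as well. A minimal sketch; `inverse_label_map` is a name introduced here for illustration and is not part of the original notebook:

```python
import numpy as np

actions = np.array(['hello', 'thanks', 'iloveyou'])

# forward map: action name -> integer class index
label_map = {label: num for num, label in enumerate(actions)}

# inverse map (hypothetical helper): integer class index -> action name
inverse_label_map = {num: str(label) for label, num in label_map.items()}

print(label_map['thanks'])
print(inverse_label_map[2])
```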

### Here, we could feed in our Kaggle data

### Loading Data

In [4]:
sequences, labels = [], [] # sequences will be x data, labels will be y data
# loop over all actions (words)
for action in actions: 
    # loop over all sequences (videos)
    for sequence in range(no_sequences): 
        window = [] # represents all frames of particular sequence (video)
        # loop through each frame
        for frame_num in range(sequence_length): 
            # load up current frame (frame_num)
            res = np.load(os.path.join(DATA_PATH, action, str(sequence), "{}.npy".format(frame_num)))
            window.append(res) # append to one video
        sequences.append(window) # append the finished video to the list of all sequences
        labels.append(label_map[action]) # integer label of the video's action
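
The loop above expects one `.npy` file per frame, laid out as `DATA_PATH/action/sequence/frame.npy`. The sketch below wraps the same logic in a function and runs it on a tiny fake dataset in a temporary folder. The function name `load_sequences` and the demo sizes (2 actions, 2 sequences, 3 frames, 4 values per frame) are assumptions made for illustration only:

```python
import os
import tempfile
import numpy as np

def load_sequences(data_path, actions, no_sequences, sequence_length):
    """Load frames stored as data_path/action/sequence/frame.npy."""
    label_map = {label: num for num, label in enumerate(actions)}
    sequences, labels = [], []
    for action in actions:
        for sequence in range(no_sequences):
            # all frames of one video, in frame order
            window = [
                np.load(os.path.join(data_path, action, str(sequence), f"{frame_num}.npy"))
                for frame_num in range(sequence_length)
            ]
            sequences.append(window)
            labels.append(label_map[action])
    return np.array(sequences), np.array(labels)

# create a tiny fake dataset with the expected folder layout
demo_actions = ['hello', 'thanks']
with tempfile.TemporaryDirectory() as tmp:
    for action in demo_actions:
        for sequence in range(2):
            frame_dir = os.path.join(tmp, action, str(sequence))
            os.makedirs(frame_dir)
            for frame_num in range(3):
                np.save(os.path.join(frame_dir, f"{frame_num}.npy"), np.zeros(4))
    X_demo, y_demo = load_sequences(tmp, demo_actions, no_sequences=2, sequence_length=3)

print(X_demo.shape)  # (videos, frames per video, values per frame)
print(y_demo)
```

If a frame file is missing, `np.load` raises a `FileNotFoundError`, which is a quick way to spot incomplete recordings.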

### Define X and y

In [5]:
X = np.array(sequences) # shape: (90, 30, 1662)
y = to_categorical(labels).astype(int) # one-hot encoded labels (words), shape: (90, 3)
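
To see what the one-hot encoding does, here is an illustration in plain NumPy that stands in for `to_categorical` on a few made-up labels. The variable names are assumptions for this sketch:

```python
import numpy as np

labels_demo = [0, 0, 1, 2]  # integer class indices, as produced by label_map
num_classes = 3

# one-hot encoding: pick the matching rows of the identity matrix
y_onehot = np.eye(num_classes, dtype=int)[labels_demo]

# argmax along the class axis recovers the integer labels
recovered = np.argmax(y_onehot, axis=1)

print(y_onehot)
print(recovered)
```

The same `argmax` step is used further down to turn model outputs back into class indices.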

### Splitting Train and Test Data

In [6]:
# hold out 5% of the 90 sequences (5 samples) for testing; the split is random on every run
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.05)
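
With only 5 test samples, a purely random split can leave a class out of the test set. One possible variant, sketched here on stand-in data, is to pass `stratify` and a fixed `random_state`. The seed value and the shrunken feature size are assumptions for this example, not settings from the original notebook:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# stand-in data with the notebook's sample count: 90 sequences, 30 per class
# (feature size shrunk to 4 to keep the example light)
X_demo = np.zeros((90, 30, 4))
labels_demo = np.repeat([0, 1, 2], 30)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, labels_demo,
    test_size=0.05,        # ceil(0.05 * 90) = 5 test samples
    stratify=labels_demo,  # keep class proportions similar in both splits
    random_state=42,       # assumed seed, only for reproducibility
)

print(X_tr.shape, X_te.shape)
print(np.bincount(y_te, minlength=3))  # test samples per class
```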

## Build and Train LSTM Neural Network

### Setup

In [7]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import TensorBoard

In [8]:
# for tensorflow callbacks
log_dir = os.path.join('Logs')
tb_callback = TensorBoard(log_dir=log_dir)

### Model building

In [9]:
# setup sequential model
model = Sequential()
model.add(LSTM(64, return_sequences=True, activation='relu', input_shape=(30, 1662))) # input_shape=(frames per sequence, keypoint values per frame)
model.add(LSTM(128, return_sequences=True, activation='relu'))
model.add(LSTM(64, return_sequences=False, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(actions.shape[0], activation='softmax')) # actions.shape[0]: from user-defined actions (see above)

# compile the model
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
# categorical_crossentropy matches this setup: multiclass classification with one-hot encoded labels

# show summary of model
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 30, 64)            442112    
                                                                 
 lstm_1 (LSTM)               (None, 30, 128)           98816     
                                                                 
 lstm_2 (LSTM)               (None, 64)                49408     
                                                                 
 dense (Dense)               (None, 64)                4160      
                                                                 
 dense_1 (Dense)             (None, 32)                2080      
                                                                 
 dense_2 (Dense)             (None, 3)                 99        
                                                                 
Total params: 596,675
Trainable params: 596,675
Non-trainable params: 0
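
The parameter counts in the summary can be checked by hand. An LSTM layer has four gates, each with input weights, recurrent weights and a bias; a dense layer has one weight matrix and a bias. A small sketch (the helper names `lstm_params` and `dense_params` are introduced here for illustration):

```python
def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    # weight matrix plus bias vector
    return input_dim * units + units

counts = [
    lstm_params(1662, 64),   # lstm
    lstm_params(64, 128),    # lstm_1
    lstm_params(128, 64),    # lstm_2
    dense_params(64, 64),    # dense
    dense_params(64, 32),    # dense_1
    dense_params(32, 3),     # dense_2
]

print(counts)       # [442112, 98816, 49408, 4160, 2080, 99]
print(sum(counts))  # 596675
```

Most of the parameters sit in the first LSTM layer, because each frame has 1662 input values.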

### Model fitting

In [10]:
model.fit(X_train, y_train, epochs=2000, callbacks=[tb_callback])
# advantage of using the mediapipe holistic model: no additional data generator / data pipeline is needed, because all training data fits into memory.

Epoch 1/2000


2023-04-06 19:26:40.110980: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz


Epoch 2/2000
...
Epoch 73/2000


<keras.callbacks.History at 0x29d53cc70>

### Some TensorBoard stuff I didn't figure out (not important)

In [11]:
# %load_ext tensorboard

In [12]:
# to run it in the notebook: 
# %tensorboard --logdir=./Logs/train --port=8008

# to run it in terminal: python3 -m tensorboard --logdir=./Logs/train --port=8008
# then copy+paste this into your internet browser: localhost:8008

## Prediction

In [13]:
y_pred = model.predict(X_test)



### Looking at a single example prediction

In [14]:
actions[np.argmax(y_pred[0])] # prediction

'hello'

In [15]:
actions[np.argmax(y_test[0])] # actual

'hello'
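
The same decoding works for a whole batch at once, and the highest probability in each row can serve as a rough confidence value. A sketch with made-up softmax outputs (the numbers in `probs_demo` are invented for illustration and are not model results):

```python
import numpy as np

actions = np.array(['hello', 'thanks', 'iloveyou'])

# made-up softmax outputs for two samples (each row sums to 1)
probs_demo = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
])

class_ids = np.argmax(probs_demo, axis=1)  # most likely class per sample
predicted_actions = actions[class_ids]     # class index -> action name
confidences = probs_demo.max(axis=1)       # probability of the chosen class

for name, conf in zip(predicted_actions, confidences):
    print(f"{name}: {conf:.2f}")
```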

## Evaluation using Confusion Matrix and Accuracy

### Back-Converting One-Hot Codes to Labels

In [16]:
# convert one-hot encoded categories back to labels, 
# e.g. 0, 1 and 2 instead of [1,0,0], [0,1,0], [0,0,1]
y_test_original = y_test.copy()
y_pred_original = y_pred.copy()
y_test = np.argmax(y_test, axis=1).tolist()
y_pred = np.argmax(y_pred, axis=1).tolist()

### Multilabel Confusion Matrix

In [23]:
# multilabel confusion matrix 
multilabel_confusion_matrix(y_test, y_pred)

array([[[2, 0],
        [1, 2]],

       [[4, 0],
        [0, 1]],

       [[3, 1],
        [0, 1]]])
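
Each 2x2 block belongs to one class and is laid out as `[[TN, FP], [FN, TP]]`, counting that class against all others. The sketch below rebuilds the blocks by hand. The label lists are hypothetical, chosen only so that they reproduce the matrix above; they are not the notebook's actual test data, and `per_class_confusion` is a helper name made up for this example:

```python
import numpy as np

# hypothetical labels that reproduce the matrix shown above
y_true_demo = np.array([0, 0, 0, 1, 2])
y_pred_demo = np.array([0, 0, 2, 1, 2])

def per_class_confusion(y_true, y_pred, num_classes):
    """One [[TN, FP], [FN, TP]] block per class (one-vs-rest)."""
    matrices = []
    for c in range(num_classes):
        true_c, pred_c = (y_true == c), (y_pred == c)
        tn = int(np.sum(~true_c & ~pred_c))
        fp = int(np.sum(~true_c & pred_c))
        fn = int(np.sum(true_c & ~pred_c))
        tp = int(np.sum(true_c & pred_c))
        matrices.append([[tn, fp], [fn, tp]])
    return np.array(matrices)

mcm = per_class_confusion(y_true_demo, y_pred_demo, 3)
accuracy = float(np.mean(y_true_demo == y_pred_demo))

print(mcm)
print(accuracy)  # 0.8
```

Read this way, the first block says one 'hello' sample was missed (FN = 1), and the third block says one sample was wrongly predicted as 'iloveyou' (FP = 1).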

### Accuracy Score

In [24]:
# accuracy score
accuracy_score(y_test, y_pred)

0.8

## Save Model

In [25]:
# save the full model (architecture, weights and optimizer state) in HDF5 format
model_name = 'first_model_whoop_whoop.h5'
model.save(model_name)