# HDA - Project 3
## Task B1: Activity detection
This task is a binary classification: a gesture denotes activity, so the model detects whether a window carries a gesture label or not (gestures are labeled in column 6).
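
As a rough illustration of the binary setup, the sketch below collapses multi-class labels into "no gesture" (0) versus "gesture" (1) and one-hot encodes the result. The helper names `binarize_labels` and `one_hot` and the sample labels are hypothetical; this is not necessarily how `utils.preprocessing` implements `make_binary`.

```python
import numpy as np

def binarize_labels(labels, null_label=0):
    """Map the null label to 0 and every other label to 1."""
    labels = np.asarray(labels)
    return (labels != null_label).astype(int)

def one_hot(indices, n_classes):
    """One-hot encode integer class indices."""
    return np.eye(n_classes, dtype=int)[indices]

# Made-up labels for six windows; 0 is assumed to mean "no gesture".
raw = np.array([0, 3, 3, 0, 7, 12])
binary = binarize_labels(raw)
encoded = one_hot(binary, 2)

print(binary)                 # binary label per window
print(encoded.mean(axis=0))   # fraction of windows in each class
```

The column means of the one-hot matrix give the same kind of "Fraction of labels" figure that the preprocessing output reports.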

This first cell contains the parameters that can be tuned for code execution:
- subject: select the subject on which to test the model, between [1,4];
- folder: directory name where '.mat' files are stored;
- label: index of the label column to be selected for the task, between [0,6];
- window_size: length of the temporal windows on which the convolution is performed;
- stride: step length used to choose the next window.
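
To make the roles of window_size and stride concrete, here is a minimal sketch of sliding-window segmentation on made-up data. The function `sliding_windows` is hypothetical; `utils.preprocessing` may treat the last partial window and the per-window labels differently.

```python
import numpy as np

def sliding_windows(data, window_size, stride):
    """Cut a (samples, features) array into overlapping windows."""
    n_samples = data.shape[0]
    starts = range(0, n_samples - window_size + 1, stride)
    return np.stack([data[s:s + window_size] for s in starts])

# 40 made-up samples with 3 features each.
signal = np.arange(40 * 3).reshape(40, 3)
windows = sliding_windows(signal, window_size=15, stride=5)

print(windows.shape)  # (number of windows, window_size, features)
```

With a stride smaller than the window size, consecutive windows overlap, so the number of windows is roughly the number of samples divided by the stride.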

In [1]:
subject = 1
folder = "./data/full/"
label = 0     # default for task B1
window_size = 15
stride = 5
# make_binary = True

In [2]:
import utils
import deeplearning
import numpy as np
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.metrics import classification_report
import matplotlib.pyplot as plt
from keras.optimizers import Adam
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint
from keras.models import load_model

Using TensorFlow backend.


Creation of training set and test set

In [3]:
[x_train, y_train, x_test, y_test, n_classes] = utils.preprocessing(subject,
                                                                    folder,
                                                                    label,
                                                                    window_size,
                                                                    stride,
                                                                    printInfo = True,
                                                                    make_binary = True)

Training samples:  157125 
Test samples:       57536 
Features:             110

TRAINING SET:
Dataset of Images have shape:  (31422, 15, 110) 
Dataset of Labels have shape:    (31422, 2) 
Fraction of labels:   [0.11036853 0.88963147]

TEST SET:
Dataset of Images have shape:  (11504, 15, 110) 
Dataset of Labels have shape:    (11504, 2) 
Fraction of labels:   [0.1772427 0.8227573]


Preparation of data in an input-suitable form

In [4]:
n_features = 110  # number of features taken into consideration for the solution of the problem
n_classes = 2

detection_model = deeplearning.MotionDetection((window_size,n_features,1), n_classes)
detection_model.summary() # model visualization

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
batch_normalization_1 (Batch (None, 15, 110, 1)        4         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 5, 108, 50)        1700      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 2, 108, 50)        0         
_________________________________________________________________
reshape_1 (Reshape)          (None, 2, 5400)           0         
_________________________________________________________________
lstm_1 (LSTM)                (None, 2, 20)             433680    
_________________________________________________________________
lstm_2 (LSTM)                (None, 20)                3280      
_________________________________________________________________
dense_1 (Dense)              (None, 512)               10752     
__________
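
The parameter counts in the summary above can be checked by hand with the standard formulas for Conv2D, LSTM and Dense layers. In the sketch below the 11x3 kernel is an inference, not a value stated in the notebook: it is the size that is consistent with the output shape (15 to 5, 110 to 108) under unpadded, stride-1 convolution and with the 1700 parameters reported.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (input_dim * units + units * units + units)

def dense_params(input_dim, units):
    """Weights plus one bias per output unit."""
    return (input_dim + 1) * units

# Kernel size (11, 3) is inferred from the summary, not taken from the model code.
print(conv2d_params(11, 3, 1, 50))   # conv2d_1
print(lstm_params(108 * 50, 20))     # lstm_1, fed 5400 values per time step
print(lstm_params(20, 20))           # lstm_2
print(dense_params(20, 512))         # dense_1
```

Almost all of the parameters sit in the first LSTM layer, because the reshape feeds it 5400 values per time step.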

In [5]:
detection_model.compile(optimizer = Adam(lr=0.01), 
                        loss = "categorical_crossentropy", 
                        metrics = ["accuracy"])

input_train = x_train.reshape(x_train.shape[0], window_size, n_features, 1)
input_test = x_test.reshape(x_test.shape[0], window_size, n_features, 1)

checkpointer = ModelCheckpoint(filepath='./data/weights_d.hdf5', verbose=1, save_best_only=True)

detection_model.fit(x = input_train, 
                    y = y_train, 
                    epochs = 20, 
                    batch_size = 128,
                    verbose = 1,
                    validation_data=(input_test, y_test),
                    callbacks=[checkpointer])

Train on 31422 samples, validate on 11504 samples
Epoch 1/20

Epoch 00001: val_loss improved from inf to 0.31890, saving model to ./data/weights_d.hdf5
Epoch 2/20

Epoch 00002: val_loss improved from 0.31890 to 0.19087, saving model to ./data/weights_d.hdf5
Epoch 3/20

Epoch 00003: val_loss did not improve
Epoch 4/20

Epoch 00004: val_loss did not improve
Epoch 5/20

Epoch 00005: val_loss did not improve
Epoch 6/20

Epoch 00006: val_loss did not improve
Epoch 7/20

Epoch 00007: val_loss did not improve
Epoch 8/20

Epoch 00008: val_loss did not improve
Epoch 9/20

Epoch 00009: val_loss improved from 0.19087 to 0.17830, saving model to ./data/weights_d.hdf5
Epoch 10/20

Epoch 00010: val_loss did not improve
Epoch 11/20

Epoch 00011: val_loss did not improve
Epoch 12/20

Epoch 00012: val_loss did not improve
Epoch 13/20

Epoch 00013: val_loss did not improve
Epoch 14/20

Epoch 00014: val_loss did not improve
Epoch 15/20

Epoch 00015: val_loss did not improve
Epoch 16/20

Epoch 00016: val_

<keras.callbacks.History at 0x18424624390>

In [6]:
y_pred = detection_model.predict(input_test)
y_pred = np.argmax(y_pred, 1)

print(classification_report(y_test, to_categorical(y_pred)))

             precision    recall  f1-score   support

          0       0.96      0.61      0.75      2039
          1       0.92      0.99      0.96      9465

avg / total       0.93      0.93      0.92     11504
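
For reference, the per-class figures in the report can be reproduced from counts of true positives, false positives and false negatives. The sketch below does this with NumPy on made-up labels; `per_class_scores` is a hypothetical helper, not part of `utils` or scikit-learn.

```python
import numpy as np

def per_class_scores(y_true_onehot, y_pred_idx):
    """Return {class: (precision, recall, f1)} from one-hot truth and predicted indices."""
    y_true_idx = np.argmax(y_true_onehot, axis=1)
    scores = {}
    for c in range(y_true_onehot.shape[1]):
        tp = np.sum((y_pred_idx == c) & (y_true_idx == c))
        fp = np.sum((y_pred_idx == c) & (y_true_idx != c))
        fn = np.sum((y_pred_idx != c) & (y_true_idx == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[c] = (precision, recall, f1)
    return scores

# Made-up example: five windows, two classes.
y_true = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])
y_pred = np.array([0, 1, 1, 1, 1])
scores = per_class_scores(y_true, y_pred)
print(scores)
```

Read this way, the recall of 0.61 for class 0 in the report above means that a sizeable share of the windows without a gesture are predicted as activity.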



In [7]:
detection_model_best = load_model('./data/weights_d.hdf5')

y_pred = detection_model_best.predict(input_test)
y_pred = np.argmax(y_pred, 1)

print(classification_report(y_test, to_categorical(y_pred)))

             precision    recall  f1-score   support

          0       0.86      0.78      0.82      2039
          1       0.95      0.97      0.96      9465

avg / total       0.94      0.94      0.94     11504
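
The second report comes from the weights written by ModelCheckpoint rather than from the model after the final epoch. With save_best_only=True, Keras overwrites the file only when the monitored quantity (val_loss by default) improves. The sketch below mimics that bookkeeping in plain Python; the losses for epochs 1, 2 and 9 are copied from the training log, the others are made up for illustration, and `best_epochs` is a hypothetical helper.

```python
def best_epochs(val_losses):
    """Return the 1-based epochs that set a new minimum, and that minimum."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:      # a checkpoint would be written here
            best = loss
            saved.append(epoch)
    return saved, best

# Epochs 1, 2 and 9 use values from the log; the remaining values are invented.
losses = [0.31890, 0.19087, 0.25, 0.22, 0.21, 0.20, 0.23, 0.24, 0.17830, 0.19]
saved, best = best_epochs(losses)
print(saved, best)
```

Note that validation_data is the test set in this notebook, so the checkpoint is selected on the same data used for the final report.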



## Task B2: Gesture recognition
This task consists of a 17-class classification, where gestures are labeled in column 6.

To tune the following parameters, refer to the first cell of task B1:
- subject: select the subject on which to test the model, between [1,4];
- folder: directory name where '.mat' files are stored;
- label: index of the label column to be selected for the task, between [0,6];
- window_size: length of the temporal windows on which the convolution is performed;
- stride: step length used to choose the next window.

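Here we need to preserve the distinct labels, so we set 'make_binary' to False. Counting the null class (label 0), there are then 18 different labels. The cell below also passes null_class = False, and with label = 0 from the first cell the run shown here yields 4 classes.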
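
Judging from the shapes printed by the cells in this notebook, null_class = False appears to drop the windows that carry the null label: the training set shrinks from 31422 to 27948 windows. A minimal sketch of such a filter is shown below, assuming integer labels with 0 as the null class; `drop_null_class` is hypothetical and may not match what `utils.preprocessing` does internally.

```python
import numpy as np

def drop_null_class(windows, labels, null_label=0):
    """Remove null-labeled windows and re-index the remaining labels from 0."""
    labels = np.asarray(labels)
    keep = labels != null_label
    classes, reindexed = np.unique(labels[keep], return_inverse=True)
    return windows[keep], reindexed, classes

# Six made-up windows with the same shape as the notebook's inputs.
windows = np.zeros((6, 15, 110))
labels = np.array([0, 1, 2, 0, 4, 5])
x, y, classes = drop_null_class(windows, labels)

print(x.shape, y, classes)
```

Re-indexing keeps the class indices contiguous, which is what a one-hot encoding with len(classes) columns expects.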

In [8]:
[x_train, y_train, x_test, y_test, n_classes] = utils.preprocessing(subject,
                                                                    folder,
                                                                    label,
                                                                    window_size,
                                                                    stride,
                                                                    printInfo = True,
                                                                    make_binary = False,
                                                                    null_class = False)

Training samples:  157125 
Test samples:       57536 
Features:             110

TRAINING SET:
Dataset of Images have shape:  (27948, 15, 110) 
Dataset of Labels have shape:    (27948, 4) 
Fraction of labels:   [0.47098182 0.30918134 0.19253614 0.0273007 ]

TEST SET:
Dataset of Images have shape:  (9465, 15, 110) 
Dataset of Labels have shape:    (9465, 4) 
Fraction of labels:   [0.41817221 0.24638141 0.28874802 0.04669836]


In [9]:
n_classes = y_train.shape[1]  # 4 here; overrides the value returned by utils.preprocessing (to be fixed)

In [10]:
classification_model = deeplearning.MotionClassification((window_size,n_features,1), n_classes)
classification_model.summary() # model visualization

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
batch_normalization_2 (Batch (None, 15, 110, 1)        4         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 5, 110, 50)        600       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 2, 110, 50)        0         
_________________________________________________________________
reshape_2 (Reshape)          (None, 2, 5500)           0         
_________________________________________________________________
lstm_3 (LSTM)                (None, 2, 300)            6961200   
_________________________________________________________________
lstm_4 (LSTM)                (None, 300)               721200    
_________________________________________________________________
dropout_1 (Dropout)          (None, 300)               0         
__________

In [11]:
classification_model.compile(optimizer = Adam(lr=0.01), 
                             loss = "categorical_crossentropy", 
                             metrics = ["accuracy"])

input_train = x_train.reshape(x_train.shape[0], window_size, n_features, 1)
input_test = x_test.reshape(x_test.shape[0], window_size, n_features, 1)

checkpointer = ModelCheckpoint(filepath='./data/weights_c.hdf5', verbose=1, save_best_only=True)

classification_model.fit(x = input_train, 
                         y = y_train, 
                         epochs = 20, 
                         batch_size = 128,
                         verbose = 1,
                         validation_data=(input_test, y_test),
                         callbacks=[checkpointer])

Train on 27948 samples, validate on 9465 samples
Epoch 1/20

Epoch 00001: val_loss improved from inf to 0.27595, saving model to ./data/weights_c.hdf5
Epoch 2/20

Epoch 00002: val_loss did not improve
Epoch 3/20

Epoch 00003: val_loss improved from 0.27595 to 0.25982, saving model to ./data/weights_c.hdf5
Epoch 4/20

Epoch 00004: val_loss did not improve
Epoch 5/20

Epoch 00005: val_loss did not improve
Epoch 6/20

Epoch 00006: val_loss did not improve
Epoch 7/20

Epoch 00007: val_loss improved from 0.25982 to 0.21853, saving model to ./data/weights_c.hdf5
Epoch 8/20

Epoch 00008: val_loss did not improve
Epoch 9/20

Epoch 00009: val_loss did not improve
Epoch 10/20

Epoch 00010: val_loss did not improve
Epoch 11/20

Epoch 00011: val_loss improved from 0.21853 to 0.18484, saving model to ./data/weights_c.hdf5
Epoch 12/20

Epoch 00012: val_loss did not improve
Epoch 13/20

Epoch 00013: val_loss did not improve
Epoch 14/20

Epoch 00014: val_loss did not improve
Epoch 15/20

Epoch 00015: 

<keras.callbacks.History at 0x185507bcdd8>

In [12]:
y_pred = classification_model.predict(input_test)
y_pred = np.argmax(y_pred, 1)

print(classification_report(y_test, to_categorical(y_pred)))

             precision    recall  f1-score   support

          0       0.91      0.91      0.91      3958
          1       0.88      0.87      0.87      2332
          2       0.98      1.00      0.99      2733
          3       1.00      0.85      0.92       442

avg / total       0.93      0.93      0.93      9465



In [13]:
classification_model_best = load_model('./data/weights_c.hdf5')

y_pred = classification_model_best.predict(input_test)
y_pred = np.argmax(y_pred, 1)

print(classification_report(y_test, to_categorical(y_pred)))

             precision    recall  f1-score   support

          0       0.95      0.88      0.91      3958
          1       0.82      0.91      0.87      2332
          2       0.99      1.00      0.99      2733
          3       0.99      1.00      0.99       442

avg / total       0.93      0.93      0.93      9465



## Classification with null class
(detection and classification are performed together)

In [14]:
[x_train, y_train, x_test, y_test, n_classes] = utils.preprocessing(subject,
                                                                    folder,
                                                                    label,
                                                                    window_size,
                                                                    stride,
                                                                    printInfo = True,
                                                                    make_binary = False)

Training samples:  157125 
Test samples:       57536 
Features:             110

TRAINING SET:
Dataset of Images have shape:  (31422, 15, 110) 
Dataset of Labels have shape:    (31422, 5) 
Fraction of labels:   [0.11055948 0.41891032 0.27499841 0.17124944 0.02428235]

TEST SET:
Dataset of Images have shape:  (11504, 15, 110) 
Dataset of Labels have shape:    (11504, 5) 
Fraction of labels:   [0.1772427  0.34405424 0.2027121  0.23756954 0.03842142]


In [15]:
classification_model = deeplearning.MotionClassification((window_size,n_features,1), n_classes)
classification_model.summary() # model visualization

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
batch_normalization_3 (Batch (None, 15, 110, 1)        4         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 5, 110, 50)        600       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 110, 50)        0         
_________________________________________________________________
reshape_3 (Reshape)          (None, 2, 5500)           0         
_________________________________________________________________
lstm_5 (LSTM)                (None, 2, 300)            6961200   
_________________________________________________________________
lstm_6 (LSTM)                (None, 300)               721200    
_________________________________________________________________
dropout_3 (Dropout)          (None, 300)               0         
__________

In [16]:
classification_model.compile(optimizer = Adam(lr=0.01), 
                             loss = "categorical_crossentropy", 
                             metrics = ["accuracy"])

input_train = x_train.reshape(x_train.shape[0], window_size, n_features, 1)
input_test = x_test.reshape(x_test.shape[0], window_size, n_features, 1)

checkpointer = ModelCheckpoint(filepath='./data/weights_dc.hdf5', verbose=1, save_best_only=True)

classification_model.fit(x = input_train, 
                         y = y_train, 
                         epochs = 20, 
                         batch_size = 300,
                         verbose = 1,
                         validation_data=(input_test, y_test),
                         callbacks=[checkpointer])

Train on 31422 samples, validate on 11504 samples
Epoch 1/20

Epoch 00001: val_loss improved from inf to 0.32708, saving model to ./data/weights_dc.hdf5
Epoch 2/20

Epoch 00002: val_loss did not improve
Epoch 3/20

Epoch 00003: val_loss did not improve
Epoch 4/20

Epoch 00004: val_loss did not improve
Epoch 5/20

Epoch 00005: val_loss did not improve
Epoch 6/20

Epoch 00006: val_loss improved from 0.32708 to 0.29561, saving model to ./data/weights_dc.hdf5
Epoch 7/20

Epoch 00007: val_loss did not improve
Epoch 8/20

Epoch 00008: val_loss did not improve
Epoch 9/20

Epoch 00009: val_loss did not improve
Epoch 10/20

Epoch 00010: val_loss did not improve
Epoch 11/20

Epoch 00011: val_loss did not improve
Epoch 12/20

Epoch 00012: val_loss did not improve
Epoch 13/20

Epoch 00013: val_loss did not improve
Epoch 14/20

Epoch 00014: val_loss did not improve
Epoch 15/20

Epoch 00015: val_loss did not improve
Epoch 16/20

Epoch 00016: val_loss did not improve
Epoch 17/20

Epoch 00017: val_los

<keras.callbacks.History at 0x18449938b70>

In [17]:
y_pred = classification_model.predict(input_test)
y_pred = np.argmax(y_pred, 1)

print(classification_report(y_test, to_categorical(y_pred)))

             precision    recall  f1-score   support

          0       0.88      0.80      0.84      2039
          1       0.91      0.90      0.91      3958
          2       0.78      0.86      0.82      2332
          3       0.98      0.99      0.99      2733
          4       0.90      0.85      0.87       442

avg / total       0.90      0.89      0.89     11504



In [35]:
classification_model_best = load_model('./data/weights_dc.hdf5')

y_pred = classification_model_best.predict(input_test)
y_pred = np.argmax(y_pred, 1)

print(classification_report(y_test, to_categorical(y_pred)))

             precision    recall  f1-score   support

          0       0.87      0.85      0.86      2039
          1       0.92      0.90      0.91      3958
          2       0.79      0.84      0.81      2332
          3       0.98      1.00      0.99      2733
          4       0.97      0.82      0.89       442

avg / total       0.90      0.90      0.90     11504

