### Import Data

To use a GPU, go to Edit > Notebook Settings and select GPU as the Hardware Accelerator.<br>

Run the following cell, click "Choose File", and select the file titled "R_variables_all_std.csv".<br>

This file contains ~12,000 strides from 22 participants walking at three different speeds (slow, comfortable, and fast). There are 606 features per stride: 6 angular velocity signals from 2 IMUs on the right shank, each time normalized to 101 time points over the gait cycle (right heel strike to right heel strike), which matches the (101, 6) input shape used by the model below. A file titled "L_variables_all_std.csv" is also available in the GitHub repository.<br>
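
As a rough illustration of the time normalization described above, the sketch below resamples one stride onto a fixed gait-cycle grid and flattens it into a single feature row. It is a minimal sketch under stated assumptions, not necessarily the procedure used to build the CSV: linear interpolation with `np.interp`, the helper names `time_normalize` and `stride_to_features`, the random stand-in stride, and the time-major flattening order are all assumptions. It uses 101 samples per signal so that the row reshapes to the (101, 6) input that the training code expects.

```python
import numpy as np

def time_normalize(signal, n_points=101):
    """Linearly resample one stride of a 1D signal onto n_points evenly spaced samples."""
    signal = np.asarray(signal, dtype=float)
    old_grid = np.linspace(0.0, 1.0, num=signal.size)  # original samples over the gait cycle
    new_grid = np.linspace(0.0, 1.0, num=n_points)     # normalized gait-cycle grid
    return np.interp(new_grid, old_grid, signal)

def stride_to_features(stride, n_points=101):
    """Turn a (n_samples, n_signals) stride into a flat, time-major feature row."""
    stride = np.asarray(stride, dtype=float)
    columns = [time_normalize(stride[:, k], n_points) for k in range(stride.shape[1])]
    normalized = np.column_stack(columns)              # shape (n_points, n_signals)
    return normalized.reshape(-1)                      # shape (n_points * n_signals,)

# Hypothetical stride: 137 raw samples of 6 angular velocity channels (random stand-in data)
rng = np.random.default_rng(0)
raw_stride = rng.normal(size=(137, 6))

features = stride_to_features(raw_stride)
restored = features.reshape(101, 6)  # same reshape as the training loop applies per stride
print(features.shape, restored.shape)
```

If the columns in the CSV were instead ordered signal by signal, the row would need to be reshaped to (6, 101) and transposed, so it is worth checking the column order before training.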

The original data come from the open-source dataset by Miraldo et al. (https://doi.org/10.6084/m9.figshare.7778255.v3). The dataset contains IMU signals and time-synchronized indices of heel strike, toe off, and other gait events. The other Jupyter notebooks in this repository outline the steps for processing the data into this format.

In [3]:
#import data
from google.colab import files
uploaded = files.upload()

Saving R_variables_all_std.csv to R_variables_all_std.csv


### Deep Learning

Run the following section to train and test the model using leave-one-subject-out cross-validation.
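
Leave-one-subject-out cross-validation holds out every stride from one participant per fold and trains on the strides from all other participants. The sketch below shows only the splitting logic; the function name `leave_one_subject_out` and the toy subject labels are hypothetical and are not taken from the dataset.

```python
import numpy as np

def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx) for each unique subject."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        # all strides from the held-out subject go to the test set, the rest to training
        yield subject, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Hypothetical labels: eight strides from three participants
subject_ids = np.array(["S01", "S01", "S02", "S02", "S02", "S03", "S03", "S03"])

folds = list(leave_one_subject_out(subject_ids))
for subject, train_idx, test_idx in folds:
    print(subject, "train:", train_idx, "test:", test_idx)
```

scikit-learn's `LeaveOneGroupOut` provides the same kind of split when the subject IDs are passed as `groups`.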

In [4]:
#import libraries
import os
import numpy as np 
import pandas as pd 
from scipy.signal import resample
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats as stats
import statistics as st

#import the PCA function
from sklearn.decomposition import PCA

#import machine learning libraries
#pre-processing
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler

#ML algorithms
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

#import LOOCV
from sklearn.model_selection import LeaveOneOut

#import evaluation metrics
from sklearn.metrics import accuracy_score
from sklearn.metrics import mean_squared_error

#deep learning
# gather software versions
import tensorflow as tf
import keras

#scikit learn
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix
from sklearn.utils import shuffle
from sklearn.utils import class_weight

#deep learning
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution1D, Conv1D, MaxPooling1D, GlobalAveragePooling1D
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import Adam

#plotting (matplotlib.pyplot is already imported above)
%matplotlib inline
%pylab inline
%config InlineBackend.figure_format = 'retina'

#ignoring warnings
import warnings                       
warnings.filterwarnings("ignore")

#-----------------------------------------------------------------------------
#deep learning architecture
model = Sequential()
#each stride is a 1D time series (101 time points x 6 signals), so 1D convolutional layers are used
model.add(Conv1D(filters=16, kernel_size=2, input_shape=(101, 6)))
model.add(MaxPooling1D(pool_size=2))

model.add(Conv1D(filters=32, kernel_size=2, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.3))

model.add(Conv1D(filters=64, kernel_size=2, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.3))
model.add(GlobalAveragePooling1D())

model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))

#output layer: one unit per speed class (slow, comfortable, fast)
model.add(Dense(3, activation='softmax'))
model.summary()

#compile model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=Adam(learning_rate=0.001))
#-----------------------------------------------------------------------------

#import data
data = pd.read_csv(os.getcwd()+'/R_variables_all_std.csv', sep=',', index_col=0)

#get a list of all subjects
all_subjects = data['subject_ID'].unique()

#create an empty list to store accuracy in
max_accuracy_list = []
average_accuracy_list = []
average_accuracy_list_third = []

#loop over all subjects (each subject is held out once as the test set)
for idx, subject_ID in enumerate(all_subjects):
    
    #assign testing and training data
    x_train = data.loc[data['subject_ID'] != subject_ID].drop(['subject_ID', 'speed', 'trial'], axis=1)
    x_test = data.loc[data['subject_ID'] == subject_ID].drop(['subject_ID', 'trial', 'speed'], axis=1)
    y_train = data.loc[data['subject_ID'] != subject_ID]['speed']
    y_test = data.loc[data['subject_ID'] == subject_ID]['speed']
    
    #convert training and testing data to arrays and reshape into (num_of_strides, num_of_timepoints, num_of_signals)
    X_train = np.asarray(x_train)
    X_train = X_train.reshape(X_train.shape[0], 101, 6)
    X_test = np.asarray(x_test)
    X_test = X_test.reshape(X_test.shape[0], 101, 6)
    
    #encode training and testing labels (switch from letters to one hot encoding)
    encoder = LabelEncoder()
    encoder.fit(y_train.values)
    y_train = encoder.transform(y_train.values)
    y_test = encoder.transform(y_test.values)
    #Convert y_train and y_test to categorical variables 
    y_train = to_categorical(y_train)
    y_test = to_categorical(y_test)
    
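    #re-initialize the weights for each fold so that nothing learned in earlier folds
    #(when the current test subject was part of the training data) carries over
    fold_model = keras.models.clone_model(model)
    fold_model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=Adam(learning_rate=0.001))
    #fit model; the held-out subject's strides are passed as validation data
    history = fold_model.fit(X_train, y_train, batch_size=64, epochs=100, validation_data=(X_test, y_test), shuffle=True)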
    
    #append max accuracy over all epochs
    max_accuracy_list.append(max(history.history['val_accuracy']))
    #append mean accuracy over all epochs
    average_accuracy_list.append(np.mean(history.history['val_accuracy']))
    #append mean accuracy over final third of epochs (epochs 67 to 100)
    average_accuracy_list_third.append(np.mean(history.history['val_accuracy'][66:]))

#get average of accuracies over all cross-validation folds
mean_max_accuracy = np.mean(max_accuracy_list)
mean_average_accuracy = np.mean(average_accuracy_list)
mean_average_accuracy_third = np.mean(average_accuracy_list_third)

print('Average Max Accuracy: ', mean_max_accuracy)
print('Mean Average Accuracy: ', mean_average_accuracy)
print('Mean Average Accuracy Over Final Third of Epochs: ', mean_average_accuracy_third)



Populating the interactive namespace from numpy and matplotlib
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv1d_6 (Conv1D)           (None, 100, 16)           208       
                                                                 
 max_pooling1d_6 (MaxPooling  (None, 50, 16)           0         
 1D)                                                             
                                                                 
 conv1d_7 (Conv1D)           (None, 49, 32)            1056      
                                                                 
 max_pooling1d_7 (MaxPooling  (None, 24, 32)           0         
 1D)                                                             
                                                                 
 dropout_4 (Dropout)         (None, 24, 32)            0         
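
The output shapes and parameter counts in the summary above can be checked by hand. The sketch below assumes the Keras defaults used in the model definition (stride 1 and 'valid' padding for `Conv1D`, pooling stride equal to `pool_size` for `MaxPooling1D`); the helper names `conv1d_output` and `maxpool1d_output` are hypothetical.

```python
def conv1d_output(length, in_channels, filters, kernel_size):
    """Output length and parameter count of a stride-1, 'valid'-padded Conv1D layer."""
    out_length = length - kernel_size + 1
    params = (kernel_size * in_channels + 1) * filters  # kernel weights plus one bias per filter
    return out_length, params

def maxpool1d_output(length, pool_size):
    """Output length of MaxPooling1D with stride equal to pool_size and 'valid' padding."""
    return length // pool_size

length, channels = 101, 6                                   # input_shape=(101, 6)
length, params_1 = conv1d_output(length, channels, 16, 2)   # first Conv1D
shape_1 = (length, 16)
length = maxpool1d_output(length, 2)                        # first MaxPooling1D
shape_2 = (length, 16)
length, params_2 = conv1d_output(length, 16, 32, 2)         # second Conv1D
shape_3 = (length, 32)
length = maxpool1d_output(length, 2)                        # second MaxPooling1D
shape_4 = (length, 32)

print(shape_1, params_1)  # (100, 16) 208
print(shape_2)            # (50, 16)
print(shape_3, params_2)  # (49, 32) 1056
print(shape_4)            # (24, 32)
```

Dropout leaves the shape unchanged and adds no parameters, which is why `dropout_4` repeats the (24, 32) shape with a parameter count of 0.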
                                                         

### Results

Across all cross-validation folds, the average accuracy for classifying gait speed was about 97%.
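
The training loop records three summaries of each fold's validation accuracy: the maximum over all epochs, the mean over all epochs, and the mean over the final third of epochs. The sketch below shows how the three differ on a single accuracy curve. The curve is made up for illustration and is not a result from this notebook, and `summarize_fold` is a hypothetical helper.

```python
import numpy as np

def summarize_fold(val_accuracy):
    """Return the max, mean, and final-third mean of a per-epoch validation accuracy curve."""
    acc = np.asarray(val_accuracy, dtype=float)
    start = (2 * acc.size) // 3  # index of the first epoch in the final third (66 for 100 epochs)
    return {
        "max": float(acc.max()),
        "mean": float(acc.mean()),
        "final_third_mean": float(acc[start:].mean()),
    }

# Made-up accuracy curve over six epochs, for illustration only
val_accuracy = [0.50, 0.60, 0.70, 0.80, 0.90, 0.90]
summary = summarize_fold(val_accuracy)
print(summary)
```

The maximum is the most optimistic of the three because the best epoch is picked using the held-out subject's own data, while the mean over all epochs also counts the early, under-trained epochs.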