# LSTMs for Human Activity Recognition Time Series Classification

This notebook follows the steps in the Machine Learning Mastery tutorial:<br>
https://machinelearningmastery.com/how-to-develop-lstm-models-for-human-activity-recognition-time-series-classification/


A group of volunteers is asked to perform six (6) common human activities:
1. Walking
2. Walking Upstairs
3. Walking Downstairs
4. Sitting
5. Standing
6. Laying

Each volunteer wears a smartphone attached to the waist while performing the activities. The smartphone has an embedded accelerometer and gyroscope. The accelerometer captures 3-axial linear acceleration and the gyroscope captures 3-axial angular velocity, both at a constant rate of 50Hz. With the provided data, I will train an LSTM model to predict the activity of the volunteer.
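
As a rough orientation, the arithmetic below relates the 50Hz sampling rate to the array shapes printed later in this notebook, `(7352, 128, 9)` for train and `(2947, 128, 9)` for test. It is only a sketch: the variable names are my own, and the numbers are copied from the outputs further down.

```python
# Values taken from the text above and from the shapes printed later on.
sampling_rate_hz = 50        # sensor sampling rate
window_length = 128          # timesteps per sample (2nd array dimension)
n_train, n_test = 7352, 2947 # samples in the train and test splits

# Duration covered by one sample, in seconds.
window_seconds = window_length / sampling_rate_hz
print(window_seconds)  # 2.56

# Total number of windows and the share used for training.
total_windows = n_train + n_test
train_fraction = n_train / total_windows
print(total_windows)             # 10299
print('%.3f' % train_fraction)   # 0.714
```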

In [25]:
from pandas import read_csv
from numpy import dstack, mean, std
from tensorflow import keras
from tensorflow.keras import losses
from tensorflow.keras.layers import Dense, Dropout, LSTM, TimeDistributed, Conv1D, MaxPooling1D, Flatten, ConvLSTM2D
from tensorflow.keras.utils import to_categorical

# Basic LSTM Model

This section is divided into the following parts:
1. Load the Data
2. Fit and Evaluate a Model
3. Summarize Results
4. Final Examination

## Load the Data

In [26]:
# load a single file as a numpy array
def load_file(filepath):
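  # sep=r'\s+' splits on runs of whitespace; it replaces delim_whitespace=True,
  # which is deprecated in recent pandas releases
  dataframe = read_csv(filepath, header=None, sep=r'\s+')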
  return dataframe.values
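
To make the file format concrete, here is a small sketch that parses whitespace-delimited text from memory instead of from disk. The sample values are made up for illustration, and `sep=r'\s+'` is used as the whitespace separator (on pandas versions that still accept `delim_whitespace=True`, the two should behave the same way).

```python
import io

import pandas as pd

# Two made-up rows in a whitespace-delimited, scientific-notation style
# (illustrative values only, not taken from the dataset).
sample_text = (
    "1.0e-001 2.0e-001 3.0e-001\n"
    "4.0e-001   5.0e-001 6.0e-001\n"
)

# header=None: the files have no header row.
# sep=r'\s+': any run of spaces separates two values.
frame = pd.read_csv(io.StringIO(sample_text), header=None, sep=r"\s+")
values = frame.values

print(values.shape)  # (2, 3)
print(values.dtype)  # float64
```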

In [27]:
# load a list of files into a 3D array of [samples, timesteps, features]
def load_group(filenames, prefix=''):
  loaded = list()
  for name in filenames:
    data = load_file(prefix + name)
    loaded.append(data)
  # stack group so that features are the 3rd dimension
  loaded = dstack(loaded)
  return loaded
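
The effect of `dstack` is easiest to see on toy data. The sketch below assumes each file yields a 2D array of shape `[samples, timesteps]`; the sample count of 5 is arbitrary and the fill values are invented so that each feature can be traced back to its source array.

```python
import numpy as np

n_samples, n_timesteps = 5, 128  # 5 is arbitrary; 128 matches the window length

# Stand-ins for the nine loaded files: array i is filled with the value i.
signals = [np.full((n_samples, n_timesteps), fill_value=float(i))
           for i in range(9)]

# dstack joins the 2D arrays along a new third axis.
stacked = np.dstack(signals)
print(stacked.shape)  # (5, 128, 9)

# Feature k of every sample and timestep comes from the k-th array.
print(stacked[0, 0, :])  # [0. 1. 2. 3. 4. 5. 6. 7. 8.]
```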

In [28]:
# load a dataset group, such as train or test
def load_dataset_group(group, prefix=''):
  filepath = prefix + group + '/Inertial Signals/'
  # load all 9 files as a single array
  filenames = list()
  # total acceleration
  filenames += ['total_acc_x_'+group+'.txt',
                'total_acc_y_'+group+'.txt', 'total_acc_z_'+group+'.txt']
  # body acceleration
  filenames += ['body_acc_x_'+group+'.txt',
                'body_acc_y_'+group+'.txt', 'body_acc_z_'+group+'.txt']
  # body gyroscope
  filenames += ['body_gyro_x_'+group+'.txt',
                'body_gyro_y_'+group+'.txt', 'body_gyro_z_'+group+'.txt']
  # load input data
  X = load_group(filenames, filepath)
  # load class output
  y = load_file(prefix + group + '/y_'+group+'.txt')
  return X, y
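
The nine file names follow a regular `<signal>_<axis>_<group>.txt` pattern, so they could also be generated rather than listed by hand. The helper below, `signal_filenames`, is a hypothetical name of my own and not part of the notebook; it is a sketch that reproduces the same names in the same order.

```python
def signal_filenames(group):
    """Return the nine inertial-signal file names for 'train' or 'test'."""
    signals = ['total_acc', 'body_acc', 'body_gyro']
    axes = ['x', 'y', 'z']
    # Outer loop over signals, inner loop over axes, matching the order
    # total_acc x/y/z, body_acc x/y/z, body_gyro x/y/z.
    return ['%s_%s_%s.txt' % (signal, axis, group)
            for signal in signals for axis in axes]

names = signal_filenames('train')
print(len(names))  # 9
print(names[0])    # total_acc_x_train.txt
print(names[-1])   # body_gyro_z_train.txt
```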

In [29]:
# load the dataset, returns train and test X and y elements
def load_dataset(prefix=''):
  # load all train
  trainX, trainy = load_dataset_group('train', prefix + 'HARDataset/')
  print(trainX.shape, trainy.shape)
  # load all test
  testX, testy = load_dataset_group('test', prefix + 'HARDataset/')
  print(testX.shape, testy.shape)
  # zero-offset class values
  trainy = trainy - 1
  testy = testy - 1
  # one hot encode y
  trainy = to_categorical(trainy)
  testy = to_categorical(testy)
  print(trainX.shape, trainy.shape, testX.shape, testy.shape)
  return trainX, trainy, testX, testy
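
The two label steps, zero-offsetting and one-hot encoding, can be illustrated without Keras. The sketch below uses `np.eye` as a NumPy stand-in for `to_categorical`; it is not how Keras implements it, and the number of classes is passed explicitly here rather than inferred from the largest label. The three labels are made-up examples.

```python
import numpy as np

n_classes = 6

# Made-up labels in the file's 1..6 convention, shaped (samples, 1).
labels = np.array([[1], [3], [6]])

# Zero-offset so the classes become 0..5.
zero_based = labels - 1

# One-hot encode: row i of the identity matrix is the encoding of class i.
one_hot = np.eye(n_classes)[zero_based.ravel()]
print(one_hot.shape)  # (3, 6)
print(one_hot[1])     # [0. 0. 1. 0. 0. 0.]

# argmax recovers the zero-based class index.
print(one_hot.argmax(axis=1))  # [0 2 5]
```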

## Fit and Evaluate a Model

In [30]:
# fit and evaluate a model
def evaluate_model(trainX, trainy, testX, testy):
  verbose, epochs, batch_size = 0, 15, 64
  n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
  model = keras.Sequential([
    LSTM(100, input_shape=(n_timesteps,n_features)),
    Dropout(0.5),
    Dense(100, activation='relu'),
    Dense(n_outputs, activation='softmax')
  ])
  model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
  # fit network
  model.fit(trainX, trainy, epochs=epochs, batch_size=batch_size, verbose=verbose)
  # evaluate model
  _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
  return accuracy
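
To get a feel for the size of this network, the trainable parameters can be counted by hand. The sketch below assumes the standard LSTM and Dense layouts with bias terms (the Keras defaults) and the shapes used here, 9 input features and 6 output classes; the helper names are my own. Dropout adds no parameters.

```python
def lstm_params(units, input_dim):
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # One weight per input-output pair plus one bias per output.
    return units * input_dim + units

n_features, n_outputs = 9, 6
counts = {
    'lstm': lstm_params(100, n_features),
    'dense_hidden': dense_params(100, 100),
    'dense_output': dense_params(n_outputs, 100),
}
total = sum(counts.values())

print(counts)  # {'lstm': 44000, 'dense_hidden': 10100, 'dense_output': 606}
print(total)   # 54706
```

If the model is built in a live session, `model.summary()` should report comparable figures.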


## Summarize Results

In [31]:
# summarize scores
def summarize_results(scores):
  print(scores)
  m, s = mean(scores), std(scores)
  print('Accuracy: %.3f%% (+/-%.3f)' % (m, s))
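
One detail worth knowing: NumPy's `std` computes the population standard deviation by default (`ddof=0`), which is slightly smaller than the sample standard deviation for a small number of repeats. The sketch below uses the standard-library `statistics` module to show both on the ten rounded accuracies printed for the basic LSTM run in this notebook.

```python
import statistics

# Rounded accuracies printed by the basic LSTM run in this notebook.
scores = [91.517, 88.056, 89.345, 89.820, 90.431,
          86.291, 90.499, 90.329, 90.533, 90.770]

m = statistics.mean(scores)
pop_std = statistics.pstdev(scores)    # divides by n, like numpy.std's default
sample_std = statistics.stdev(scores)  # divides by n - 1

print('mean: %.3f' % m)
print('population std: %.3f' % pop_std)
print('sample std: %.3f' % sample_std)
```

With only ten repeats the two estimates differ by roughly five percent, so it helps to state which one is being reported.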

In [32]:
# run an experiment
def run_experiment(repeats=10):
  # load data
  trainX, trainy, testX, testy = load_dataset()
  # repeat experiment
  scores = list()
  for r in range(repeats):
    score = evaluate_model(trainX, trainy, testX, testy)
    score = score * 100.0
    print('>#%d: %.3f' % (r+1, score))
    scores.append(score)
  # summarize results
  summarize_results(scores)

## Final Examination

In [33]:
run_experiment()

(7352, 128, 9) (7352, 1)
(2947, 128, 9) (2947, 1)
(7352, 128, 9) (7352, 6) (2947, 128, 9) (2947, 6)
>#1: 91.517
>#2: 88.056
>#3: 89.345
>#4: 89.820
>#5: 90.431
>#6: 86.291
>#7: 90.499
>#8: 90.329
>#9: 90.533
>#10: 90.770
[91.51679873466492, 88.05565237998962, 89.34509754180908, 89.8201584815979, 90.43094515800476, 86.29114627838135, 90.49881100654602, 90.32914638519287, 90.53274393081665, 90.77027440071106]
Accuracy: 89.759% (+/-1.454)


# CNN-LSTM Network Model

In [34]:
# fit and evaluate a model
def evaluate_model(trainX, trainy, testX, testy):
  # training hyperparameters
  verbose, epochs, batch_size = 0, 25, 64
  n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
  # reshape data into time steps of sub-sequences
  n_steps, n_length = 4, 32
  trainX = trainX.reshape((trainX.shape[0], n_steps, n_length, n_features))
  testX = testX.reshape((testX.shape[0], n_steps, n_length, n_features))
  # define model
  model = keras.Sequential([
    TimeDistributed(Conv1D(filters=64, kernel_size=3,
              activation='relu'), input_shape=(None, n_length, n_features)),
    TimeDistributed(Conv1D(filters=64, kernel_size=3, activation='relu')),
    TimeDistributed(Dropout(0.5)),
    TimeDistributed(MaxPooling1D(pool_size=2)),
    TimeDistributed(Flatten()),
    LSTM(100),
    Dropout(0.5),
    Dense(100, activation='relu'),
    Dense(n_outputs, activation='softmax')
  ])
  model.compile(loss='categorical_crossentropy',
                optimizer='adam', metrics=['accuracy'])
  # fit network
  model.fit(trainX, trainy, epochs=epochs,
            batch_size=batch_size, verbose=verbose)
  # evaluate model
  _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
  return accuracy
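
The reshape above splits each 128-step window into 4 blocks of 32 steps, and the convolutional layers then shorten each block before it reaches the LSTM. The sketch below traces both on toy data. It assumes the Keras defaults for `Conv1D` and `MaxPooling1D` (no padding, stride 1 for the convolutions, pooling stride equal to the pool size); the toy array and variable names are my own.

```python
import numpy as np

n_samples, n_timesteps, n_features = 2, 128, 9
n_steps, n_length = 4, 32

# Toy data where every value equals its timestep index (0..127).
timeline = np.arange(n_timesteps, dtype=float)[None, :, None]
X = np.broadcast_to(timeline, (n_samples, n_timesteps, n_features)).copy()

blocks = X.reshape((n_samples, n_steps, n_length, n_features))
print(blocks.shape)        # (2, 4, 32, 9)
print(blocks[0, 1, 0, 0])  # 32.0, i.e. block 1 starts at timestep 32

# Length bookkeeping for a single 32-step block.
kernel_size, pool_size, filters = 3, 2, 64
after_conv1 = n_length - kernel_size + 1      # 30
after_conv2 = after_conv1 - kernel_size + 1   # 28
after_pool = after_conv2 // pool_size         # 14
lstm_input_features = after_pool * filters    # 896
print(after_conv1, after_conv2, after_pool, lstm_input_features)
```

Under these assumptions the LSTM sees a sequence of 4 steps, each described by 896 values.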


In [35]:
run_experiment()

(7352, 128, 9) (7352, 1)
(2947, 128, 9) (2947, 1)
(7352, 128, 9) (7352, 6) (2947, 128, 9) (2947, 6)
>#1: 90.227
>#2: 90.092
>#3: 90.159
>#4: 90.906
>#5: 89.786
>#6: 87.411
>#7: 90.193
>#8: 90.567
>#9: 88.700
>#10: 86.698
[90.22734761238098, 90.09161591529846, 90.15948176383972, 90.90600609779358, 89.78622555732727, 87.41092681884766, 90.19341468811035, 90.56667685508728, 88.70037198066711, 86.69833540916443]
Accuracy: 89.474% (+/-1.336)


# ConvLSTM Network Model

A further extension of the CNN-LSTM idea is to perform the convolutions of the CNN (i.e. how the CNN reads the input sequence data) as part of the LSTM.

In [36]:
# fit and evaluate a model
def evaluate_model(trainX, trainy, testX, testy):
  # training hyperparameters
  verbose, epochs, batch_size = 0, 25, 64
  n_timesteps, n_features, n_outputs = trainX.shape[1], trainX.shape[2], trainy.shape[1]
  # reshape into subsequences (samples, time steps, rows, cols, channels)
  n_steps, n_length = 4, 32
  trainX = trainX.reshape((trainX.shape[0], n_steps, 1, n_length, n_features))
  testX = testX.reshape((testX.shape[0], n_steps, 1, n_length, n_features))
  # define model
  model = keras.Sequential([
    ConvLSTM2D(filters=64, kernel_size=(1, 3),
              activation='relu', input_shape=(n_steps, 1, n_length, n_features)),
    Dropout(0.5),
    Flatten(),
    Dense(100, activation='relu'),
    Dense(n_outputs, activation='softmax')
  ])
  model.compile(loss='categorical_crossentropy',
                optimizer='adam', metrics=['accuracy'])
  # fit network
  model.fit(trainX, trainy, epochs=epochs,
            batch_size=batch_size, verbose=verbose)
  # evaluate model
  _, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
  return accuracy
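
`ConvLSTM2D` expects 5D input, so each window is treated as 4 time steps of a 1-row "image" with 32 columns and 9 channels. The sketch below checks the reshape on a zero-filled toy array and works out the layer's output size by hand. It assumes the Keras defaults (no padding, stride 1, and only the last time step returned); the variable names are my own.

```python
import numpy as np

n_samples, n_timesteps, n_features = 2, 128, 9
n_steps, n_length = 4, 32

# Toy input with the same layout as trainX: [samples, timesteps, features].
X = np.zeros((n_samples, n_timesteps, n_features))

# [samples, time steps, rows, cols, channels]
X5 = X.reshape((n_samples, n_steps, 1, n_length, n_features))
print(X5.shape)  # (2, 4, 1, 32, 9)

# Output size of ConvLSTM2D(filters=64, kernel_size=(1, 3)) per sample.
filters, kernel_rows, kernel_cols = 64, 1, 3
out_rows = 1 - kernel_rows + 1           # 1
out_cols = n_length - kernel_cols + 1    # 30
flattened = out_rows * out_cols * filters
print(out_rows, out_cols, flattened)     # 1 30 1920
```

Under these assumptions `Flatten` passes 1920 values per sample to the first `Dense` layer.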

In [37]:
run_experiment()

(7352, 128, 9) (7352, 1)
(2947, 128, 9) (2947, 1)
(7352, 128, 9) (7352, 6) (2947, 128, 9) (2947, 6)
>#1: 91.890
>#2: 90.635
>#3: 91.042
>#4: 90.193
>#5: 90.567
>#6: 89.990
>#7: 88.768
>#8: 89.379
>#9: 89.108
>#10: 90.601
[91.89005494117737, 90.63454270362854, 91.0417377948761, 90.19341468811035, 90.56667685508728, 89.98982310295105, 88.76823782920837, 89.37903046607971, 89.10756707191467, 90.60060977935791]
Accuracy: 90.217% (+/-0.895)
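
The three summary lines printed in this notebook can be put side by side. The sketch below copies the reported mean and standard deviation for each model and compares the gap between the best and worst mean with the run-to-run spread; it is a quick sanity check, not a statistical test.

```python
# (mean accuracy %, std) as printed in this notebook for each model.
results = {
    'LSTM': (89.759, 1.454),
    'CNN-LSTM': (89.474, 1.336),
    'ConvLSTM': (90.217, 0.895),
}

best = max(results, key=lambda name: results[name][0])
means = [m for m, _ in results.values()]
gap_between_means = max(means) - min(means)
smallest_std = min(s for _, s in results.values())

print(best)                                 # ConvLSTM
print('gap: %.3f' % gap_between_means)      # gap: 0.743
print('smallest std: %.3f' % smallest_std)  # smallest std: 0.895
```

The gap between the best and worst mean is smaller than even the smallest per-model standard deviation, so ten repeats alone do not clearly separate the three architectures.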


In [38]:
# License:
# ========
# Use of this dataset in publications must be acknowledged by referencing the following publication [1] 

# [1] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz.
# A Public Domain Dataset for Human Activity Recognition Using Smartphones.
# 21st European Symposium on Artificial Neural Networks, Computational Intelligence
# and Machine Learning, ESANN 2013. Bruges, Belgium 24-26 April 2013. 

# This dataset is distributed AS-IS and no responsibility implied or
# explicit can be addressed to the authors or their institutions for
# its use or misuse. Any commercial use is prohibited.

# Other Related Publications:
# ===========================
# [2] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra, Jorge L.
# Reyes-Ortiz.  Energy Efficient Smartphone-Based Activity Recognition
# using Fixed-Point Arithmetic. Journal of Universal Computer Science.
# Special Issue in Ambient Assisted Living: Home Care.   Volume 19, Issue 9. May 2013

# [3] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and
# Jorge L. Reyes-Ortiz. Human Activity Recognition on Smartphones
# using a Multiclass Hardware-Friendly Support Vector Machine. 4th
# International Workshop of Ambient Assisted Living, IWAAL 2012,
# Vitoria-Gasteiz, Spain, December 3-5, 2012. Proceedings. Lecture
# Notes in Computer Science 2012, pp 216-223. 

# [4] Jorge Luis Reyes-Ortiz, Alessandro Ghio, Xavier Parra-Llanas,
# Davide Anguita, Joan Cabestany, Andreu Català. Human Activity and
# Motion Disorder Recognition: Towards Smarter Interactive Cognitive
# Environments. 21st European Symposium on Artificial Neural Networks,
# Computational Intelligence and Machine Learning, ESANN 2013. Bruges,
# Belgium 24-26 April 2013.  

# ==================================================================================================
# Jorge L. Reyes-Ortiz, Alessandro Ghio, Luca Oneto, Davide Anguita and Xavier Parra. November 2013.