## Long Short-Term Memory 

In this assignment, we will learn about LSTM models. We will build an LSTM model that classifies human activities from sensor time series in the UCI HAR dataset.

In [None]:
import numpy as np
import os
import pandas as pd

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout, Flatten
from tensorflow.keras.utils import to_categorical

In [None]:
from sklearn.metrics import confusion_matrix

Below is a function for loading the time series data collected by the sensors. There are 9 files: body acceleration, body gyroscope, and total acceleration, each for the x, y, and z axes.

In [None]:
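def load_func(path, file_ind=False):
    """Load one whitespace-delimited file, or every file in a directory."""
    data_list = []
    if file_ind:
        filenames = [path]
    else:
        # Sort so the signal order is the same for train and test.
        files = sorted(os.listdir(path))
        filenames = [os.path.join(path, f) for f in files]
    for f in filenames:
        # sep=r'\s+' replaces the deprecated delim_whitespace=True.
        dataframe = pd.read_csv(f, header=None, sep=r'\s+')
        data_list.append(dataframe.values)
    if len(data_list) > 1:
        # Stack the per-file (samples, timesteps) arrays along a third axis.
        return np.dstack(data_list)
    else:
        return data_list[0]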
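
To see what `np.dstack` does inside `load_func`, here is a minimal sketch with small synthetic arrays standing in for the nine signal files. The sizes 5 and 4 are illustrative assumptions, not the dataset's real dimensions.

```python
import numpy as np

# Nine synthetic "files", each of shape (samples, timesteps) = (5, 4).
# The real files are much larger; these sizes are made up for illustration.
signals = [np.full((5, 4), i, dtype=float) for i in range(9)]

stacked = np.dstack(signals)

# Each file becomes one slice along the last axis.
print(stacked.shape)     # (5, 4, 9)
print(stacked[0, 0, :])  # one value per file, 0 through 8
```

The last axis therefore indexes the signal file, which is why the order in which the files are read matters.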

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
PATH = './drive/MyDrive/content/UCI HAR Dataset'
os.listdir(PATH)

['.DS_Store',
 'train',
 'test',
 'activity_labels.txt',
 'features.txt',
 'features_info.txt',
 'README.txt']

In [None]:
X_train = load_func(PATH + "/train/Inertial Signals")
X_test = load_func(PATH + "/test/Inertial Signals")
y_train_cat = load_func(PATH + '/train/y_train.txt', True)
y_test_cat = load_func(PATH + '/test/y_test.txt', True)

Print the dimensions of both the predictor variables and the target.

In [None]:
# Answer below:
print(X_train.shape, y_train_cat.shape,
      X_test.shape, y_test_cat.shape)


(7352, 128, 9) (7352, 1) (2947, 128, 9) (2947, 1)


The target variable is categorical, with labels 1 through 6. One-hot encode the target variable.

In [None]:
# Answer below:
y_train = to_categorical(y_train_cat - 1, 6)
y_test = to_categorical(y_test_cat - 1, 6)
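
The labels in the files start at 1, while `to_categorical` expects class indices starting at 0, hence the `- 1`. The following NumPy-only sketch mirrors that encoding on a few made-up labels; it is an illustration, not a replacement for the Keras call.

```python
import numpy as np

# Hypothetical labels shaped like y_train_cat: a column of integers in 1..6.
labels = np.array([[1], [3], [6]])

# Shift to 0-based indices, then pick the matching rows of an identity matrix.
one_hot = np.eye(6)[labels.ravel() - 1]

print(one_hot.shape)  # (3, 6)
print(one_hot[1])     # label 3 sets the third position
```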

Create a model containing an LSTM layer with 100 units. Its input shape is the tuple containing the number of columns (time steps) in X and the number of files (signals) in X, that is, the last two dimensions of X.

The next layer is a dropout layer with a rate of 0.5. Then add a dense layer with 100 units and, finally, an output dense layer.

In [None]:
# Answer below:
# Time steps per sample, signals per time step, and number of classes.
n_timesteps, n_features, n_outputs = X_train.shape[1], X_train.shape[2], y_train.shape[1]

model = Sequential()
model.add(LSTM(100, input_shape=(n_timesteps, n_features)))
model.add(Dropout(0.5))
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax'))


Print the model summary to ensure you have the correct number of parameters.

In [None]:
# Answer below:
model.summary()


Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm (LSTM)                  (None, 100)               44000
_________________________________________________________________
dropout (Dropout)            (None, 100)               0
_________________________________________________________________
dense (Dense)                (None, 100)               10100
_________________________________________________________________
dense_1 (Dense)              (None, 6)                 606
=================================================================
Total params: 54,706
Trainable params: 54,706
Non-trainable params: 0
_________________________________________________________________
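
The parameter counts in the summary can be checked by hand. The sketch below uses the standard formulas for an LSTM layer and a dense layer; the helper names `lstm_params` and `dense_params` are hypothetical and not part of Keras.

```python
def lstm_params(units, n_features):
    # Four weight sets (three gates plus the cell candidate), each with
    # input weights, recurrent weights, and a bias.
    return 4 * (units * (n_features + units) + units)


def dense_params(n_inputs, units):
    # One weight per input-unit pair plus one bias per unit.
    return n_inputs * units + units


lstm = lstm_params(100, 9)
hidden = dense_params(100, 100)
output = dense_params(100, 6)
total = lstm + hidden + output
print(lstm, hidden, output, total)  # 44000 10100 606 54706
```

The dropout layer has no weights, so it contributes nothing to the total.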


Compile and fit the model. Select an appropriate loss, optimizer, and metric.

Run the model for 10 epochs with a batch size of 80.
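
For a sense of scale, the number of weight updates per epoch follows from the training set size printed earlier and the batch size. This is a small arithmetic sketch, assuming the final partial batch is kept rather than dropped.

```python
import math

n_train = 7352   # rows in X_train, from the shapes printed earlier
batch_size = 80

steps_per_epoch = math.ceil(n_train / batch_size)
last_batch = n_train - (steps_per_epoch - 1) * batch_size
print(steps_per_epoch, last_batch)  # 92 72
```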

In [None]:
# Answer below:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=80, epochs=10)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fc0448e8668>

Print the confusion matrix for the test data.

In [None]:
# Answer below:
y_pred = np.argmax(model.predict(X_test), axis=-1)
# y_test_cat holds labels 1..6; shift to 0..5 to match the predicted indices.
confusion_matrix(y_test_cat.ravel() - 1, y_pred)


array([[463,  19,  14,   0,   0,   0],
       [ 20, 443,   7,   0,   1,   0],
       [  0,  15, 405,   0,   0,   0],
       [  0,   1,   0, 354, 136,   0],
       [  2,   1,   0,  46, 483,   0],
       [  0,  27,   0,   0,   0, 510]])
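
The confusion matrix can be summarized as overall accuracy and per-class recall. The sketch below takes the counts for the six activity classes from the output above (rows are true classes, columns are predictions); the variable names are illustrative.

```python
import numpy as np

# Counts for the six activity classes, copied from the output above.
cm = np.array([
    [463,  19,  14,   0,   0,   0],
    [ 20, 443,   7,   0,   1,   0],
    [  0,  15, 405,   0,   0,   0],
    [  0,   1,   0, 354, 136,   0],
    [  2,   1,   0,  46, 483,   0],
    [  0,  27,   0,   0,   0, 510],
])

# Correct predictions lie on the diagonal.
accuracy = np.trace(cm) / cm.sum()

# Recall per class: correct predictions divided by the row total.
recall = np.diag(cm) / cm.sum(axis=1)

print(round(float(accuracy), 3))
print(np.round(recall, 3))
```

Most of the errors sit between the fourth and fifth classes, which is where the recall is lowest.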