<a href="https://colab.research.google.com/github/elysethulin/PRACTICE/blob/master/Long_Short_Term_Memory.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Long Short Term Memory

In this notebook, we use a Long Short-Term Memory (LSTM) neural network to classify time series data. The model is built with the `tensorflow` and `keras` libraries, like the networks we constructed yesterday.

The notebook is split into the following sections:
1. Investigate the data
2. Construct an LSTM model
3. Evaluate the model

Author: Joshua Pickard (jpic@umich.edu)

In [None]:
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Dropout
from keras.layers import LSTM  
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.utils import plot_model
import matplotlib.pyplot as plt
from sklearn import model_selection
import glob
import random

## Human Activity Recognition Using Smartphones Data Set

We will be using machine learning to perform Human Activity Recognition (HAR) for the following 3 activities:

1. Running
2. Stationary
3. Walking

To do this, we will analyze 3 signals collected from people as they performed these activities. Participants wore a Samsung Galaxy S II. Using its embedded accelerometer and gyroscope, 3-axial linear acceleration and 3-axial angular velocity were recorded at a constant rate of 50 Hz.

The acceleration measurements taken for each dimension (x, y, z) will be the input data our models consider. Each sample of data contains all 3 acceleration measurements for 1000 points in time.

For more information on the dataset, please see: https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones


In the cell below, a `.zip` file is downloaded from the given URL and its contents are unpacked. The data comes as `csv` files, where each column is an acceleration measurement in 1 dimension, each row is a point in time, and each file is one sample.

In [None]:
!wget -nc https://github.com/STMicroelectronics/stm32ai/raw/master/AI_resources/HAR/dataset.zip
!unzip -n dataset.zip

## Loading the Data

The data has been downloaded to the `dataset/` directory and can be viewed with the file icon on the left-hand side of the page. The following cell loads the data into the `x_recordings` and `y_recordings` variables, which hold the recordings and their activity labels.

In [None]:
# Load data into memory
labels = ['stationary', 'walking', 'running']
x_recordings = []
y_recordings = []
recordings_filenames = []
for i, label in enumerate(labels):
    filenames = glob.glob('dataset/' + label + '/*.csv')
    for filename in filenames:
        data = np.loadtxt(filename, delimiter=',')
        x_recordings.append(data)
        y_recordings.append(i)
        recordings_filenames.append(filename)

x_recordings = np.array(x_recordings).reshape(len(x_recordings), -1, 3)
y_recordings = np.array(y_recordings)

print(x_recordings.shape)
print(y_recordings.shape)

From the output above, we see the shape of `x_recordings`, the data we downloaded, is `(92, 1000, 3)`. There are 3 measurements taken at any time, 1000 time points per sample, and 92 samples in all.
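To make the `(samples, time points, axes)` layout concrete, the sketch below builds a small synthetic array with the same structure and indexes it. The name `fake_recordings` and its random contents are made up for illustration; only the shape convention comes from the notebook.

```python
import numpy as np

# Synthetic stand-in for the loaded data: 4 samples, 1000 time points, 3 axes.
# (The real array has 92 samples; the values here are random, not real data.)
rng = np.random.default_rng(0)
per_file = [rng.normal(size=(1000, 3)) for _ in range(4)]  # one (1000, 3) array per CSV file

fake_recordings = np.stack(per_file)      # shape (4, 1000, 3)

one_sample = fake_recordings[0]           # all 3 axes over time for the first sample
x_axis_trace = fake_recordings[0, :, 0]   # x-axis acceleration of the first sample
one_instant = fake_recordings[0, 10]      # (x, y, z) at time index 10

print(fake_recordings.shape, one_sample.shape, x_axis_trace.shape, one_instant.shape)
```

The same indexing applies to `x_recordings`, for example `x_recordings[n]` in the plotting cell below selects one full recording.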

## Visualize the Data

The code below generates a few plots to visualize what different samples look like when plotted over time. Each activity appears to have its own characteristic trends. If we were doing feature selection or signal processing, we might be interested in smoothing or otherwise modifying this dataset to remove noise.

In [None]:
# Plot some captures
unique_rands = random.sample(range(len(x_recordings)), 10)
plt.figure(figsize=(18, 10))
for i, n in enumerate(unique_rands):
    plt.subplot(5, 2, i + 1)
    plt.margins(x=0, y=-0.25)
    plt.plot(x_recordings[n])
    plt.ylim(-4000, 4000)  # 4000 mg acc. range
    plt.title(recordings_filenames[n].split('/')[-1])
plt.tight_layout()
plt.show()
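
The smoothing mentioned above is not part of this notebook's pipeline, but one simple option is a moving average. The sketch below is only an illustration on a synthetic noisy signal; the helper `moving_average` and the window size are assumptions, not something the dataset requires.

```python
import numpy as np

def moving_average(recording, window=5):
    """Smooth each axis of a (time, axes) array with a centered moving average."""
    kernel = np.ones(window) / window
    # mode='same' keeps the original length; values near the edges are
    # averaged against implicit zeros, so they are biased toward zero.
    return np.column_stack(
        [np.convolve(recording[:, axis], kernel, mode='same')
         for axis in range(recording.shape[1])]
    )

# Synthetic 3-axis signal plus Gaussian noise (not real accelerometer data).
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 1000)
clean = np.column_stack([np.sin(t), np.cos(t), 0.5 * np.sin(2 * t)])
noisy = clean + rng.normal(scale=0.3, size=clean.shape)

smoothed = moving_average(noisy, window=25)
print(noisy.shape, smoothed.shape)
```

A larger window removes more noise but also blurs fast changes in the signal, which may matter for distinguishing running from walking.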

## Class Balance
As we have seen, it is important to understand the class balance of a data set. In the case of a HAR classification problem, we are interested in knowing how many examples of each type of activity we have in the data set. In the following cell, we explore this.

In [None]:
# convert the numpy array into a dataframe
df = pd.DataFrame(y_recordings)
counts = df.groupby(0).size()
counts = counts.values

# summarize
print('Full Data Set:')
for i in range(len(counts)):
    percent = counts[i] / len(df) * 100
    print('Class=%d, total=%d, percentage=%.3f' % (i+1, counts[i], percent))
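
As an alternative sketch, the same counts can be computed directly with NumPy, without building a dataframe. The label array `example_y` below is made up for illustration and does not reflect the real class counts.

```python
import numpy as np

labels = ['stationary', 'walking', 'running']
# Made-up integer labels for illustration; not the real data.
example_y = np.array([0, 0, 1, 2, 2, 2, 1, 0, 2, 2])

# np.unique returns the sorted class ids and how often each occurs.
classes, counts = np.unique(example_y, return_counts=True)
percentages = counts / counts.sum() * 100

for c, n, p in zip(classes, counts, percentages):
    print(f'{labels[c]}: total={n}, percentage={p:.1f}')
```

In the notebook, passing `y_recordings` in place of `example_y` would give the counts for the real data.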

Now that we have loaded and visualized a few realizations of our data, we partition the data into training and test datasets.

In [None]:
X_train, X_test, y_train, y_test = model_selection.train_test_split(x_recordings, y_recordings)

# Verify that the data split correctly
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)
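
With a small dataset, a random split can leave the classes unevenly represented. A possible variation, sketched here on synthetic arrays, uses the `stratify` and `random_state` arguments of `train_test_split`; the arrays `X_demo` and `y_demo` are placeholders, not the real recordings.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins with the same layout as x_recordings / y_recordings.
X_demo = np.zeros((120, 1000, 3))
y_demo = np.repeat([0, 1, 2], 40)   # 40 samples per class (made up)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo,
    test_size=0.25,      # same as the default when no size is given
    stratify=y_demo,     # keep class proportions equal in both splits
    random_state=0,      # make the split reproducible
)

print(X_tr.shape, X_te.shape)
print(np.bincount(y_tr), np.bincount(y_te))
```

Fixing `random_state` also makes results comparable between runs of the notebook.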

## Build and Test Your Model

In this next section, we build and evaluate an LSTM model on this time series data.

### Reformat the Labels

In the next cell, we reformat the labels to be compatible with the `Sequential` model from `keras`. This converts the labels into one-hot encodings, a different format from the integer labels used above.

The labels for stationary, walking, and running are numeric values, but there is no ordinal relationship between these activities. As a result, we transform the labels to a representation that does not assign an order to each activity.

In [None]:
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

In [None]:
print(y_train.shape)
print(y_test.shape)
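
To see what a one-hot encoding looks like, the sketch below reproduces the idea with plain NumPy on a few made-up labels. It is meant as an illustration of the format, not as a replacement for `to_categorical`.

```python
import numpy as np

example_labels = np.array([0, 2, 1, 2])   # 0=stationary, 1=walking, 2=running
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the labels picks one row per sample.
one_hot = np.eye(num_classes)[example_labels]
print(one_hot)

# argmax along the class axis recovers the original integer labels.
recovered = one_hot.argmax(axis=1)
print(recovered)
```

Each row contains a single 1 in the column of its class, so no class is numerically "larger" than another.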

## Constructing the LSTM Model

This syntax is nearly identical to the syntax used to construct CNN or feedforward networks. The key difference is that the first layer is an `LSTM` layer, which makes use of the time ordering in the data.

In [None]:
model = Sequential()

# LSTM layer
model.add(LSTM(100, input_shape=(1000, 3)))
model.add(Dropout(0.5))

# TODO: add more hidden layers
# fill in the activations as sigmoid, softmax, relu, or tanh
model.add(Dense(10, activation='sigmoid'))

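# Do not change this line
# softmax turns the 3 outputs into class probabilities that sum to 1,
# which is what categorical_crossentropy expects
model.add(Dense(3, activation='softmax'))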
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['AUC'])

In [None]:
model.summary()
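
The parameter counts in the summary can be checked by hand. The sketch below assumes the model as written above (an `LSTM(100)` layer on 3 input features, then `Dense(10)` and `Dense(3)`, with no extra hidden layers added) and the standard LSTM formulation with bias terms; the helper names are made up for this example.

```python
def lstm_params(units, input_dim):
    # 4 gates (input, forget, cell candidate, output), each with an input
    # weight matrix, a recurrent weight matrix, and a bias vector.
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # One weight per input-output pair plus one bias per output unit.
    return input_dim * units + units

lstm_count = lstm_params(100, 3)        # 41600
hidden_count = dense_params(10, 100)    # 1010
output_count = dense_params(3, 10)      # 33
total = lstm_count + hidden_count + output_count

print(lstm_count, hidden_count, output_count, total)
```

Note that the sequence length (1000) does not appear in the count: the LSTM reuses the same weights at every time step, and `Dropout` has no trainable parameters.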

In [None]:
plot_model(model, show_shapes=True, show_layer_activations=True)

In [None]:
# fit network
epochs = 10
history = model.fit(X_train, y_train, epochs=epochs)

In [None]:
plt.plot(history.history['auc'])
plt.ylabel('AUC')
plt.xlabel('Epoch')
plt.title('AUC During Training')
plt.show()

Finally, we can evaluate this trained model on the test data set.

In [None]:
_, AUC = model.evaluate(X_test, y_test)
print('AUC', AUC) 
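
A single AUC number does not show which activities get confused with each other. One way to look closer, sketched below with made-up values, is to convert predicted probabilities to class labels and tabulate a confusion matrix. In the notebook, `y_prob` could come from `model.predict(X_test)`; here both arrays are hypothetical.

```python
import numpy as np

# Hypothetical one-hot labels and predicted probabilities (made up).
y_true_onehot = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1]])
y_prob = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
])

y_true = y_true_onehot.argmax(axis=1)   # integer class per sample
y_pred = y_prob.argmax(axis=1)          # most probable class per sample

# Rows are true classes, columns are predicted classes.
confusion = np.zeros((3, 3), dtype=int)
np.add.at(confusion, (y_true, y_pred), 1)

accuracy = (y_true == y_pred).mean()
print(confusion)
print('accuracy', accuracy)
```

Off-diagonal entries show the mistakes; in this toy example one walking sample is predicted as running.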