# Practical 8: CNNs and RNNs

The aim of this lab is to learn how to implement Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) in Keras.

We will do this by building CNN and RNN models for the task of *Human Activity Recognition* (HAR). This is the problem of predicting what a person is doing from a trace of their movement recorded by accelerometer sensors.

The data, and a description of the task we are doing, can be found __[here](https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones)__.

If you have not already done so, please download a copy of the data from the __[UCI website](https://archive.ics.uci.edu/ml/machine-learning-databases/00240/)__ before the lab (the *UCI HAR Dataset.zip* file). After downloading, unzip it and place it in the same folder you are running this program from. The loading code in Step 2 looks for a folder named `UCIHARDataset`, so rename the unzipped folder or change the `prefix` argument if the names differ.

<hr style="border:1px solid black"> </hr>

### Step 1 Set-Up
Import all the toolboxes needed to support the tutorial.

In [1]:
import numpy as np
import pandas as pd

from numpy import mean
from numpy import std
from numpy import dstack

from keras.models import Sequential
from keras.layers import InputLayer
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import BatchNormalization
from keras.layers import LSTM
from keras.layers import Dropout
from keras.layers import GRU
from tensorflow.keras.utils import to_categorical

from keras.layers import Conv1D
from keras.layers import MaxPooling1D
from keras.layers import preprocessing

import seaborn as sns
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix


**Observing model training:** the following function will be used to observe the behaviour of the models after they have been trained. It plots the loss and accuracy of the model after every training epoch.

In [2]:
def plot_learningCurve(history, epoch):
  # Plot training & validation accuracy values
  epoch_range = range(1, epoch+1)
  plt.plot(epoch_range, history.history['accuracy'])
  plt.plot(epoch_range, history.history['val_accuracy'])
  plt.title('Model accuracy')
  plt.ylabel('Accuracy')
  plt.xlabel('Epoch')
  plt.legend(['Train', 'Val'], loc='upper left')
  plt.show()

  # Plot training & validation loss values
  plt.plot(epoch_range, history.history['loss'])
  plt.plot(epoch_range, history.history['val_loss'])
  plt.title('Model loss')
  plt.ylabel('Loss')
  plt.xlabel('Epoch')
  plt.legend(['Train', 'Val'], loc='upper left')
  plt.show()

<hr style="border:1px solid black"> </hr>

### Step 2 Load the data, partition and set up pre-processing

The data was collected from 30 subjects aged between 19 and 48 years old performing one of six standard activities while wearing a waist-mounted smartphone that recorded the movement data. The activities performed were:
- Walking
- Walking Upstairs
- Walking Downstairs
- Sitting
- Standing
- Laying

There are three main signal types in the raw data: total acceleration, body acceleration, and body gyroscope. Each has 3 axes (x, y, z) of data. This means that there are a total of nine variables for each time step.

Further, each series of data has been partitioned into overlapping windows of 2.56 seconds of data, or 128 time steps. Each window corresponds to one row of the engineered-feature version of the dataset.

This means that one row of data has (128 * 9), or 1,152, elements.
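
The window layout described above can be sketched with synthetic data. This is only an illustration: the random arrays stand in for the nine signal files, and the four windows are an arbitrary choice.

```python
import numpy as np

# 4 synthetic windows (arbitrary); 128 time steps per window, as in the text
n_windows, n_timesteps = 4, 128

# One (n_windows, n_timesteps) array per signal file; nine files in total
signals = [np.random.rand(n_windows, n_timesteps) for _ in range(9)]

# Stack along a new third axis so that the nine variables form the last dimension
X = np.dstack(signals)

print(X.shape)    # (4, 128, 9)
print(X[0].size)  # 1152 values in one window
```

The loading functions in this step build the same `(windows, time steps, variables)` layout from the real files.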

The following block of code contains the functions needed to extract this data.

In [None]:
# load a single file as a numpy array
def load_file(filepath):
    # the files are whitespace-separated; sep=r'\s+' avoids the deprecated delim_whitespace flag
    dataframe = pd.read_csv(filepath, header=None, sep=r'\s+')
    return dataframe.values
 
# load a list of files and return as a 3d numpy array
def load_group(filenames, prefix='UCIHARDataset/'):
    loaded = list()
    for name in filenames:
        data = load_file(prefix + name)
        loaded.append(data)
    # stack group so that features are the 3rd dimension
    loaded = dstack(loaded)
    return loaded

# load a dataset group, such as train or test
def load_dataset_group(group, prefix='UCIHARDataset/'):
    filepath = prefix + group + '/Inertial Signals/'
    # load all 9 files as a single array
    filenames = list()
    # total acceleration
    filenames += ['total_acc_x_'+group+'.txt', 'total_acc_y_'+group+'.txt', 'total_acc_z_'+group+'.txt']
    # body acceleration
    filenames += ['body_acc_x_'+group+'.txt', 'body_acc_y_'+group+'.txt', 'body_acc_z_'+group+'.txt']
    # body gyroscope
    filenames += ['body_gyro_x_'+group+'.txt', 'body_gyro_y_'+group+'.txt', 'body_gyro_z_'+group+'.txt']
    # load input data
    X = load_group(filenames, filepath)
    # load class output
    y = load_file(prefix + group + '/y_'+group+'.txt')
    return X, y

 
# load the dataset, returns train and test X and y elements
def load_dataset(prefix='UCIHARDataset/'):
    # load all train (pass the prefix through so a custom folder name is respected)
    trainX, trainy = load_dataset_group('train', prefix)
    # load all test
    testX, testy = load_dataset_group('test', prefix)
    # zero-offset class values
    trainy = trainy - 1
    testy = testy - 1
    # one hot encode y
    trainy = to_categorical(trainy)
    testy = to_categorical(testy)
    return trainX, trainy, testX, testy
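
To make the last two steps of `load_dataset` concrete, here is a small NumPy sketch of zero-offsetting and one-hot encoding. The label values are made up for illustration, and `np.eye` is used as a stand-in for `to_categorical`, assuming all six classes are present.

```python
import numpy as np

# Hypothetical labels in the style of the y_*.txt files: one column, classes 1..6
y_raw = np.array([[1], [3], [6], [2]])

# Zero-offset the class values so they run from 0 to 5
y_zero = y_raw - 1

# One-hot encode: each label selects one row of the 6x6 identity matrix
n_classes = 6
one_hot = np.eye(n_classes)[y_zero.ravel()]

print(one_hot.shape)  # (4, 6)
print(one_hot[2])     # [0. 0. 0. 0. 0. 1.]
```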

Call the `load_dataset` function to extract the predefined training and test splits. A validation set is held out from the training data later, via the `validation_split` argument of `model.fit`.

In [None]:
# Extract defined training and test splits
X_train, y_train, X_test, y_test = load_dataset()

# Parameters needed for setting the input and output shapes in our models
n_timesteps, n_features, n_outputs = X_train.shape[1], X_train.shape[2], y_train.shape[1]

The following block of code uses the Keras preprocessing layers API to build a Keras-native input pre-processing step. We add the normaliser to the model inside the `create_model()` function below.

In [None]:
normalizer = preprocessing.Normalization()
normalizer.adapt(X_train)
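
As a rough guide to what the adapted normaliser does, the sketch below reproduces the idea in NumPy: one mean and variance per variable, computed over all windows and time steps. It assumes the layer's default of normalising along the last axis and ignores the small epsilon the layer uses for numerical safety. The random array is a synthetic stand-in for `X_train`.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(10, 128, 9))  # synthetic stand-in for X_train

# One statistic per variable: average over the window and time-step axes
mean = X.mean(axis=(0, 1))
var = X.var(axis=(0, 1))

# Standardise each variable to zero mean and unit variance
X_norm = (X - mean) / np.sqrt(var)

print(mean.shape)                                    # (9,)
print(np.allclose(X_norm.mean(axis=(0, 1)), 0.0))    # True
print(np.allclose(X_norm.std(axis=(0, 1)), 1.0))     # True
```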

<hr style="border:1px solid black"> </hr>

###  Step 4 Building a CNN

As in the previous lab, we build our CNN via a Keras __[sequential](https://keras.io/guides/sequential_model/)__ model.

**Questions:** 
- Describe the architecture of the model coded below.
- What is the purpose of the `BatchNormalization()` layer?

In [None]:
# define the keras model
def create_model():
    model = Sequential()
    model.add(InputLayer(input_shape=(n_timesteps,n_features)))
    model.add(normalizer)
    model.add(Conv1D(filters=32, kernel_size=3))
    model.add(BatchNormalization())
    model.add(MaxPooling1D(pool_size=2, strides=2))
    model.add(Flatten())
    model.add(Dense(n_outputs, activation='softmax'))
    model.summary()  # summary() prints the table itself and returns None
    return model
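
When reading the model summary, it can help to check the layer sizes by hand. The sketch below does the arithmetic for the CNN above, assuming the Keras defaults of `'valid'` padding and a convolution stride of 1. The helper `window_out_len` is a hypothetical name introduced here, not a Keras function.

```python
def window_out_len(n_in, window, stride=1):
    """Output length of a sliding window (convolution or pooling) with 'valid' padding."""
    return (n_in - window) // stride + 1

n_timesteps, n_features, n_outputs = 128, 9, 6
filters, kernel_size = 32, 3

conv_len = window_out_len(n_timesteps, kernel_size)    # time steps left after Conv1D
pool_len = window_out_len(conv_len, 2, stride=2)       # time steps left after MaxPooling1D
flat_size = pool_len * filters                         # length of the flattened vector

conv_params = (kernel_size * n_features + 1) * filters  # weights plus one bias per filter
dense_params = (flat_size + 1) * n_outputs              # weights plus one bias per class

print(conv_len, pool_len, flat_size)   # 126 63 2016
print(conv_params, dense_params)       # 896 12102
```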

Create and compile the model, then call `fit` to start training.

**Question:** What is the purpose of the `shuffle=True` and `validation_split=0.2` arguments in the `model.fit` function?

In [None]:
model = create_model()

model.compile(optimizer='adam', loss = 'categorical_crossentropy', metrics=['accuracy'])

EPOCHS = 64
BATCH_SIZE = 32

history = model.fit(
    X_train, y_train, 
    batch_size=BATCH_SIZE, 
    epochs=EPOCHS,
    validation_split = 0.2,
    shuffle=True,
    verbose=2
)
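
The Keras documentation states that `validation_split` takes the validation samples from the end of the arrays before any shuffling happens. The sketch below mimics that behaviour on a toy index array; the rounding rule and the sample count of 12 are assumptions made for illustration.

```python
import math
import numpy as np

n_samples, validation_split = 12, 0.2
X = np.arange(n_samples)  # toy data: each sample is just its own index

# Hold out the last 20% of samples; the rest are used for fitting
split_at = math.floor(n_samples * (1.0 - validation_split))
X_fit, X_val = X[:split_at], X[split_at:]

# Shuffling then only reorders the fitting samples at the start of an epoch
rng = np.random.default_rng(0)
epoch_order = rng.permutation(X_fit)

print(X_val)                # [ 9 10 11]
print(sorted(epoch_order))  # the fitting indices 0..8, none from the validation tail
```

Because the held-out samples are always the last ones, the order of the windows in `X_train` affects which subjects and activities end up in the validation set.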

Plot the accuracies and model losses obtained over training.

In [None]:
plot_learningCurve(history, EPOCHS)

Observe the model's performance on the test set. Does this seem reasonable for this experiment?

In [None]:
# Evaluate the model on the test data using `evaluate`
print("Evaluate on test data")
results = model.evaluate(X_test, y_test)
print("test loss, test acc:", results)

y_pred = model.predict(X_test)
y_pred = np.argmax(y_pred, axis=1)
y_test_vec = np.argmax(y_test, axis=1)

cf_matrix = confusion_matrix(y_test_vec, y_pred)
sns.heatmap(cf_matrix, annot=True, fmt='d')  # fmt='d' shows the counts as integers
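
The confusion matrix can also be summarised per class. The following sketch uses a made-up 3-class matrix (not results from this lab) and relies on the scikit-learn convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Made-up counts for three of the activities; rows = true class, columns = predicted class
labels = ["Walking", "Walking Upstairs", "Walking Downstairs"]
cm = np.array([
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
])

recall = np.diag(cm) / cm.sum(axis=1)     # fraction of each true class that was found
precision = np.diag(cm) / cm.sum(axis=0)  # fraction of each predicted class that was right
accuracy = np.trace(cm) / cm.sum()

for name, r, p in zip(labels, recall, precision):
    print(f"{name}: recall={r:.2f}, precision={p:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Applied to `cf_matrix`, the same two lines show which activities are most often confused with one another.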

### Exercise: Alter the model
- First, change the number of filters in the first convolutional layer to 64, and the kernel size to 5. Observe how this alters the performance of the system.
- Then update the model again: add a new convolutional layer, a batch normalisation layer and a max pooling layer between the current max pooling layer and the flattening layer. The new convolutional layer should have 32 filters and a kernel size of 3. The new max pooling layer should have a pool size of 2 and a stride of 1. Observe how this alters the performance of the system.
- Finally, add ReLU activations to both convolutional layers. Observe how this alters the performance of the system.

<hr style="border:1px solid black"> </hr>

###  Step 5 Building an LSTM-RNN

Again, we build our LSTM via a Keras __[sequential](https://keras.io/guides/sequential_model/)__ model.

**Question:** Describe the architecture of the model coded below.

In [None]:
# define the keras model
def create_model():
    model = Sequential()
    model.add(InputLayer(input_shape=(n_timesteps,n_features)))
    model.add(normalizer)
    model.add(LSTM(32))
    model.add(BatchNormalization())
    model.add(Dense(n_outputs, activation='softmax'))
    model.summary()  # summary() prints the table itself and returns None
    return model
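
As a check against the model summary, the sketch below counts the trainable parameters of the LSTM and dense layers by hand. It uses the standard LSTM layout of four gates, each with input weights, recurrent weights and a bias. The helper name `lstm_param_count` is hypothetical.

```python
def lstm_param_count(input_dim, units):
    """Parameters of a standard LSTM layer: 4 gates x (input + recurrent weights + bias)."""
    return 4 * (input_dim + units + 1) * units

n_features, n_outputs, units = 9, 6, 32

lstm_params = lstm_param_count(n_features, units)
dense_params = (units + 1) * n_outputs  # weights plus one bias per class

print(lstm_params)   # 5376
print(dense_params)  # 198
```

Note that the count does not depend on the number of time steps, because the same weights are reused at every step.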

Create and compile the model, then call `fit` to start training.

**Question:** Why has `shuffle` been set to `False` in the code below?

In [None]:
model = create_model()

model.compile(optimizer='adam', loss = 'categorical_crossentropy', metrics=['accuracy'])

EPOCHS = 64
BATCH_SIZE = 32

history = model.fit(
    X_train, y_train, 
    batch_size=BATCH_SIZE, 
    epochs=EPOCHS,
    validation_split = 0.2,
    shuffle=False,
    verbose=2
)

Plot the accuracies and model losses obtained over training

In [None]:
plot_learningCurve(history, EPOCHS)

In [None]:
# Evaluate the model on the test data using `evaluate`
print("Evaluate on test data")
results = model.evaluate(X_test, y_test)
print("test loss, test acc:", results)

y_pred = model.predict(X_test)
y_pred = np.argmax(y_pred, axis=1)
y_test_vec = np.argmax(y_test, axis=1)

cf_matrix = confusion_matrix(y_test_vec, y_pred)
sns.heatmap(cf_matrix, annot=True, fmt='d')  # fmt='d' shows the counts as integers

### Exercise: Alter the model
- First, change the number of nodes in the first LSTM layer to 64. Then add a new LSTM layer with 32 nodes between the batch normalisation and dense layers, and perform batch normalisation after this layer as well. **Note:** in the first LSTM layer you will need to set `return_sequences=True` so that the second LSTM layer receives a three-dimensional sequence input.
- Observe how this alters the performance of the system.
- Then add the layer `model.add(Dropout(0.5))` directly after both batch normalisation layers. What is the purpose of doing this?
- Alter the code to use Gated Recurrent Units (GRUs) instead of LSTMs. Observe how this alters the performance of the system.
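
The note on `return_sequences` can be illustrated with a small shape calculation. This is only a sketch of the per-sample output shapes (batch axis omitted), and `rnn_output_shape` is a hypothetical helper, not part of Keras.

```python
def rnn_output_shape(timesteps, units, return_sequences):
    """Per-sample output shape of a recurrent layer (batch axis omitted)."""
    # With return_sequences=True the layer emits one vector per time step,
    # otherwise only the vector from the final time step
    return (timesteps, units) if return_sequences else (units,)

first = rnn_output_shape(128, 64, return_sequences=True)
second = rnn_output_shape(first[0], 32, return_sequences=False)
without = rnn_output_shape(128, 64, return_sequences=False)

print(first)    # (128, 64): still a sequence, so a second recurrent layer can consume it
print(second)   # (32,): a single vector, ready for the dense layer
print(without)  # (64,): no time axis left for a second recurrent layer
```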

<hr style="border:1px solid black"> </hr>

### Exercise: Early Stopping

Implement an __[early stopping](https://keras.io/api/callbacks/early_stopping/)__ strategy for the validation loss on both networks. 

**Note:** the default `patience` value is 0. This means training is terminated as soon as the monitored measure gets worse from one epoch to the next, which may not be the ideal value.

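**Note:** don't run the `plot_learningCurve(history, EPOCHS)` function with early stopping. If training stops early, the history holds fewer than `EPOCHS` values, so the x and y data passed to the plot have different lengths and it raises an error.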
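
To build intuition for `patience`, the sketch below simulates a patience-based stopping rule on a made-up list of validation losses. It is a simplified illustration in plain Python, not the Keras callback itself, whose exact bookkeeping may differ between versions. The function name `epochs_run_with_patience` is hypothetical.

```python
def epochs_run_with_patience(val_losses, patience):
    """Return how many epochs run before a simplified patience-based stop."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch  # training stops at the end of this epoch
    return len(val_losses)  # never stopped early

# Made-up validation losses with a temporary bump at epochs 3 and 4
val_losses = [1.00, 0.80, 0.90, 0.85, 0.70, 0.60]

print(epochs_run_with_patience(val_losses, patience=1))  # 3
print(epochs_run_with_patience(val_losses, patience=2))  # 4
print(epochs_run_with_patience(val_losses, patience=3))  # 6
```

With too little patience, training stops during the temporary bump and never reaches the lower losses that follow. After an early stop, the number of epochs actually run can be read from the history, for example `len(history.history['loss'])`, and passed to the plotting function in place of `EPOCHS`.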