# Supervised RNN pipeline with TensorFlow

This notebook trains a supervised RNN model with TensorFlow, using each window of sensor data to predict the activity performed within that window.

Specifically,
* We use the raw sensor data, not the feature data
* We use `tensorflow.keras` to set up and train the model
* We train a classification model with a single hidden LSTM layer (multi-class classification)

We finally test the model on the test data set, as defined by the authors.

Note: some improvements could be made with respect to the training process.

In [1]:
%cd ..

/project


In [2]:
import numpy as np
import pandas as pd
from tensorflow import keras
from src.data import *

## Prepare the data

Load the raw data and then prepare the train / test objects.

Note that
1. We need to arrange the 9 sensor data series into windows of shape $(128, 9)$
1. We need to format the integer activity labels as one-hot-encoded vectors, as in the DNN pipeline

The data is loaded here as a conventionally structured time series data set (one row per time step), as in the data description. We do this because we want to scale the data set before passing it to `keras`.

In [3]:
raw_train_df = load_raw_data() \
    .sort_values(['subject_id', 'time_exp']) \
    .reset_index(drop=True)
raw_train_df.shape

(941056, 14)

Now create the training array by dropping the subject, time, and activity columns, which leaves the 9 sensor series. Also, create the one-hot-encoded label array, with one label per window.

In [4]:
X_train = raw_train_df.drop(['subject_id', 'time_exp', 'time_window_s', 'time_counter', 'activity_id'], axis=1)
labels_train = raw_train_df.loc[:, ['subject_id', 'time_window_s', 'activity_id']].drop_duplicates().activity_id
y_train = keras.utils.to_categorical(labels_train - 1)
X_train.shape, y_train.shape

((941056, 9), (7352, 6))
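
As a minimal sketch of what the one-hot encoding step produces, the snippet below reproduces it with plain NumPy on a few made-up labels. The `toy_labels` array and the `one_hot` helper are illustrative names, not part of the notebook. Subtracting 1 shifts the labels 1 to 6 onto column indices 0 to 5, as in the `labels_train - 1` expression above.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float matrix with a single 1 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical activity labels in the 1..6 range.
toy_labels = np.array([1, 6, 3])
toy_y = one_hot(toy_labels - 1, num_classes=6)
print(toy_y.shape)  # (3, 6)
```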

Repeat the above process for the test data.

In [5]:
raw_test_df = load_raw_data('test') \
    .sort_values(['subject_id', 'time_exp', 'time_counter']) \
    .reset_index(drop=True)
raw_test_df.shape

(377216, 14)

In [6]:
X_test = raw_test_df.drop(['subject_id', 'time_exp', 'time_window_s', 'time_counter', 'activity_id'], axis=1)
labels_test = raw_test_df.loc[:, ['subject_id', 'time_window_s', 'activity_id']].drop_duplicates().activity_id
y_test = keras.utils.to_categorical(labels_test - 1)
X_test.shape, y_test.shape

((377216, 9), (2947, 6))

We now standardize the data, fitting the scaler on the training set only and applying it to both sets. The scaling is kept outside the model pipeline in order to keep the `keras` model simpler.

In [7]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
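
To make the scaling step concrete, here is a small NumPy sketch of the per-column standardization that `StandardScaler` performs: the mean and standard deviation are estimated on the training rows only and then reused for the test rows. The toy arrays are made up for illustration.

```python
import numpy as np

# Made-up data: 4 training rows and 2 test rows with 2 sensor columns.
toy_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
toy_test = np.array([[2.5, 25.0], [5.0, 50.0]])

# Fit: per-column statistics from the training data only.
mean = toy_train.mean(axis=0)
std = toy_train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# Transform: the same statistics are applied to both sets.
toy_train_scaled = (toy_train - mean) / std
toy_test_scaled = (toy_test - mean) / std

print(toy_train_scaled.mean(axis=0))  # approximately [0, 0]
print(toy_test_scaled[0])             # [0, 0]: 2.5 and 25.0 equal the training means
```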

In [8]:
train_window_qt, test_window_qt = y_train.shape[0], y_test.shape[0]

In [9]:
X_train_input = X_train.reshape((train_window_qt, 128, 9))
X_test_input = X_test.reshape((test_window_qt, 128, 9))
X_train_input.shape, X_test_input.shape

((7352, 128, 9), (2947, 128, 9))
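
The reshape relies on the rows being ordered so that each consecutive block of 128 rows belongs to one window. A toy sketch with made-up sizes (2 windows, 4 time steps, 3 channels instead of 128 steps and 9 channels) shows how a flat row-major array maps onto windows:

```python
import numpy as np

n_windows, n_steps, n_channels = 2, 4, 3  # stand-ins for (7352, 128, 9)

# Flat array of shape (n_windows * n_steps, n_channels); row i holds the value i.
flat = np.repeat(np.arange(n_windows * n_steps), n_channels).reshape(-1, n_channels)

windows = flat.reshape((n_windows, n_steps, n_channels))

print(windows.shape)     # (2, 4, 3)
print(windows[1, :, 0])  # [4 5 6 7]: rows 4..7 of the flat array form window 1
```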

## Model training

### Prepare the model

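We create a single hidden LSTM layer with 3 units. Using the `Sequential` API, we create the model object and `.add` layers to it. The last layer must match the model task: a 6-category prediction, so it has 6 output units. The cell below uses a `sigmoid` activation on that layer; `softmax` is the more conventional choice for mutually exclusive classes trained with `categorical_crossentropy`, since it makes the 6 outputs sum to 1.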

In [10]:
model = keras.models.Sequential()
model.add(keras.layers.LSTM(3, input_shape=(128, 9)))
model.add(keras.layers.Dense(6, activation='sigmoid'))

We need to add a loss function, optimizer, and metrics for training.
* `categorical_crossentropy` is the standard loss for multi-class classification with one-hot-encoded labels
* The `rmsprop` optimizer performs well here
* `accuracy` is a common metric to choose; a prediction counts as correct only if the highest-scoring class matches the true label

In [11]:
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer='rmsprop',
              metrics=['accuracy'])
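
As a rough illustration of what the categorical cross-entropy loss measures, the NumPy sketch below computes it for one made-up prediction against a one-hot target. The probabilities are invented for the example, the `cross_entropy` helper is an illustrative name, and the sketch assumes the predicted scores already sum to 1 (as a softmax output would).

```python
import numpy as np

def cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_prob))."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

y_true = np.array([[0, 0, 1, 0, 0, 0]], dtype=float)  # true class: index 2
confident = np.array([[0.02, 0.02, 0.90, 0.02, 0.02, 0.02]])
uniform = np.full((1, 6), 1 / 6)

loss_confident = cross_entropy(y_true, confident)
loss_uniform = cross_entropy(y_true, uniform)
print(loss_confident)  # about 0.105 (= -ln 0.9)
print(loss_uniform)    # about 1.792 (= ln 6)
```

The uniform case gives a useful baseline: a 6-class model whose loss stays near ln 6 is doing no better than guessing.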

We can print a summary of the model to check the layer output shapes and parameter counts.

In [12]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm (LSTM)                  (None, 3)                 156       
_________________________________________________________________
dense (Dense)                (None, 6)                 24        
Total params: 180
Trainable params: 180
Non-trainable params: 0
_________________________________________________________________
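
The parameter counts in the summary can be checked by hand. An LSTM layer has four gates, each with an input kernel, a recurrent kernel, and a bias; the dense layer has one weight per input-output pair plus one bias per output. A short sketch (the helper names are illustrative):

```python
def lstm_param_count(units, input_dim):
    # 4 gates x (input weights + recurrent weights + bias)
    return 4 * (units * input_dim + units * units + units)

def dense_param_count(units, input_dim):
    # weight matrix plus one bias per output unit
    return input_dim * units + units

lstm_params = lstm_param_count(units=3, input_dim=9)
dense_params = dense_param_count(units=6, input_dim=3)
print(lstm_params, dense_params, lstm_params + dense_params)  # 156 24 180
```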


### Train

In this example, we fit the model with a single held-out validation set rather than cross-validation. This is common in deep learning, where models are expensive to train and the data sets are often large enough that cross-validation is not necessary. Note that the validation data passed to `fit` below is the test set itself.

Some notes:
* We use a batch size of 32, since it's on par with the way the data is recorded
* We train for 3 epochs, which keeps the run short; this could be increased

In [13]:
model.fit(X_train_input, y_train,
          batch_size=32,
          epochs=3,
          validation_data=(X_test_input, y_test))

Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x7f02e4611040>

The `keras` fitting process already reports the accuracy, but for illustration we compute it as we did in the other notebooks. To do so, we first need to convert each predicted score vector back to an integer label.

In [14]:
y_train_hat = model.predict(X_train_input)
# Pick the highest-scoring class per window, then shift indices 0-5 back to labels 1-6
y_train_hat = y_train_hat.argmax(axis=1) + 1

In [15]:
from sklearn.metrics import accuracy_score, classification_report

In [16]:
accuracy_score(labels_train, y_train_hat)

0.5447497279651795
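
The decoding step can be sketched without a trained model: each row of predicted scores is mapped to the index of its largest entry, and adding 1 restores the original 1 to 6 labels. The scores and labels below are made up.

```python
import numpy as np

# Made-up model outputs for 3 windows over 6 classes.
toy_scores = np.array([
    [0.1, 0.7, 0.1, 0.05, 0.03, 0.02],
    [0.2, 0.1, 0.1, 0.1, 0.1, 0.4],
    [0.3, 0.3, 0.25, 0.05, 0.05, 0.05],  # tie: argmax returns the first maximum
])
toy_true = np.array([2, 6, 2])

toy_pred = toy_scores.argmax(axis=1) + 1
toy_accuracy = float(np.mean(toy_pred == toy_true))

print(toy_pred)      # [2 6 1]
print(toy_accuracy)  # 0.6666666666666666
```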

### Evaluate on the test data

We evaluate the model on the test data that was defined by the authors.

In [17]:
y_test_hat = model.predict(X_test_input)
# Pick the highest-scoring class per window, then shift indices 0-5 back to labels 1-6
y_test_hat = y_test_hat.argmax(axis=1) + 1

In [18]:
accuracy_score(labels_test, y_test_hat)

0.5083135391923991

In [19]:
# scikit-learn expects (y_true, y_pred); swapping them exchanges precision and recall
print(classification_report(labels_test, y_test_hat))

              precision    recall  f1-score   support

           1       0.40      0.08      0.13       496
           2       0.32      0.05      0.08       471
           3       0.35      0.37      0.36       420
           4       0.49      0.76      0.59       491
           5       0.47      0.76      0.58       532
           6       0.72      0.94      0.81       537

    accuracy                           0.51      2947
   macro avg       0.46      0.49      0.43      2947
weighted avg       0.46      0.51      0.44      2947



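Cross-tabulating the actual labels against the classified ones shows weak agreement, particularly for labels 1 to 3, so there is considerable room for improvement.

Here we will want to check whether the errors made are acceptable. For example, confusing walking downstairs with walking or walking upstairs may be tolerable, whereas confusing it with a stationary activity is a more serious error. In the table below, the errors for labels 1 to 3 are spread across nearly every class, while labels 4 to 6 are mostly confused only with each other. It may be important to continue tuning parameters such that walking downstairs is never (or less commonly) misclassified as walking upstairs.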

In [20]:
pd.crosstab(labels_test,
            y_test_hat,
            rownames=['True'],
            colnames=['Classified'])

| True \ Classified | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 39 | 23 | 135 | 133 | 128 | 38 |
| 2 | 33 | 23 | 144 | 79 | 135 | 57 |
| 3 | 21 | 25 | 155 | 61 | 66 | 92 |
| 4 | 1 | 2 | 5 | 372 | 98 | 13 |
| 5 | 3 | 0 | 7 | 118 | 402 | 2 |
| 6 | 0 | 0 | 3 | 0 | 27 | 507 |
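
Per-class recall and precision can be read directly off this cross-tabulation: recall divides each diagonal count by its row total (windows that truly have that label), and precision divides it by its column total (windows classified with that label). A minimal NumPy sketch, with the counts copied from the table above:

```python
import numpy as np

# Rows: true labels 1..6; columns: classified labels 1..6.
confusion = np.array([
    [39, 23, 135, 133, 128, 38],
    [33, 23, 144, 79, 135, 57],
    [21, 25, 155, 61, 66, 92],
    [1, 2, 5, 372, 98, 13],
    [3, 0, 7, 118, 402, 2],
    [0, 0, 3, 0, 27, 507],
])

correct = np.diag(confusion)
recall = correct / confusion.sum(axis=1)     # per true class
precision = correct / confusion.sum(axis=0)  # per classified class
accuracy = float(correct.sum() / confusion.sum())

print(np.round(recall, 2))     # [0.08 0.05 0.37 0.76 0.76 0.94]
print(np.round(precision, 2))  # [0.4  0.32 0.35 0.49 0.47 0.72]
print(round(accuracy, 4))      # 0.5083
```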
