# Using MI-RNN on the HAR dataset

This is an example of how the existing MI-RNN code base can be used to train a model with a shorter input sequence length on the HAR dataset. We are actively working on releasing a better implementation of both `MI-RNN` and `EMI-RNN`. This notebook illustrates only some of the available features and methods. For instance, embeddings, dropout layers, and alternative RNN cells are not illustrated here.

Please note that in the preprint of our work we use the terms *bag* and *instance* to refer to the LSTM input sequence of original length and to the shorter sequences we want to learn to predict on, respectively. In the code, though, *bag* is replaced with *instance* and *instance* is replaced with *sub-instance*. To avoid ambiguity, this notebook uses only the terms *bag* and *sub-instance*.

The network used here is a simple LSTM + Linear classifier network. 

The data comes from the UCI [Human Activity Recognition](https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones) dataset.

In [1]:
import numpy as np
import os
import tensorflow as tf
import time
import sys
sys.path.insert(0, '../')
# TODO: Explain these methods
from edgeml.graph.emi_rnn import analysisModelMultiClass
from edgeml.graph.emi_rnn import NetworkV2
from edgeml.graph.emi_rnn import updateYPolicy4
from edgeml.graph.emi_rnn import getUpdateIndexList



# Loading Data
Please download the UCI dataset from the above link and use your favorite data loading methods to set up the (`x_train`, `y_train`) and (`x_val`, `y_val`) numpy arrays.

### Data Preparation

[Typical RNN models](https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/recurrent_network.ipynb) by convention use a 3-dimensional input tensor of shape `[number of examples, number of time steps, number of features]`. To incorporate the notion of *bags* and *sub-instances*, we extend this with a fourth dimension, making our input data shape `[number of bags, number of sub-instances, number of time steps, number of features]`. Similarly, the typical shape of the one-hot encoded label tensor, `[number of examples, number of outputs]`, is extended to hold sub-instance level labels, making it `[number of bags, number of sub-instances, number of output classes]`.
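As a minimal sketch of these shapes, using made-up sizes and NumPy only (none of the variable names below come from the code base):

```python
import numpy as np

# Hypothetical sizes chosen for illustration only.
num_bags, num_subinstances, num_steps, num_feats, num_classes = 4, 6, 48, 9, 6

# Input: one extra axis for the sub-instances inside each bag.
x = np.zeros((num_bags, num_subinstances, num_steps, num_feats))

# Labels: one one-hot vector per sub-instance, copied from the bag label.
bag_classes = np.array([0, 3, 5, 2])          # one class id per bag
one_hot = np.eye(num_classes)[bag_classes]    # [num_bags, num_classes]
y = np.repeat(one_hot[:, np.newaxis, :], num_subinstances, axis=1)

print(x.shape)  # (4, 6, 48, 9)
print(y.shape)  # (4, 6, 6)
```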

For the HAR dataset specifically, the data creation algorithm looks something like this.

```
def createData(X, Y, subinstanceWidth, subinstanceStride):
    '''
    Here X and Y are typical time series inputs and their labels. This method
    chops each sequence into a temporally ordered set of sub-instances. All
    sub-instances are given the same label as the bag.
    '''
    assert len(X) == len(Y)
    assert len(X.shape) == 3
    assert len(Y.shape) == 2

    X_out = []
    Y_out = []

    for i in range(len(X)):
        bag = X[i]
        bagLabel = Y[i]

        # breakBagIntoInstances is a helper that is not shown here.
        instances = breakBagIntoInstances(bag, subinstanceWidth, subinstanceStride)
        instanceLabels = [bagLabel] * len(instances)
        X_out.append(instances)
        Y_out.append(instanceLabels)

    # Shapes: [bags, sub-instances, time steps, features] and
    # [bags, sub-instances, output classes]
    return np.array(X_out), np.array(Y_out)
```
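The helper `breakBagIntoInstances` is not defined in this notebook. The following is a hypothetical sliding-window sketch of what such a helper could look like, not the code base's implementation. It assumes each bag is a `[time steps, features]` array and that trailing steps which do not fill a whole window are dropped. The 128-step, 9-feature toy bag is also an assumption, chosen because it yields the `(6, 48, 9)` per-bag shape printed further down in this notebook.

```python
import numpy as np

def breakBagIntoInstances(bag, subinstanceWidth, subinstanceStride):
    '''
    Hypothetical helper: slide a window of `subinstanceWidth` time steps over
    `bag` (shape [time steps, features]) in hops of `subinstanceStride`.
    Trailing steps that do not fill a whole window are dropped.
    '''
    numSteps = bag.shape[0]
    assert numSteps >= subinstanceWidth
    starts = range(0, numSteps - subinstanceWidth + 1, subinstanceStride)
    return np.stack([bag[s:s + subinstanceWidth] for s in starts])

# Toy bag: 128 time steps, 9 features (assumed sizes).
bag = np.arange(128 * 9, dtype=float).reshape(128, 9)
instances = breakBagIntoInstances(bag, 48, 16)
print(instances.shape)  # (6, 48, 9)
```

With a width of 48 and a stride of 16, consecutive sub-instances overlap by 32 time steps.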

In [8]:
x_train, y_train = np.load('./HAR/48_16/x_train.npy'), np.load('./HAR/48_16/y_train.npy')
x_test, y_test = np.load('./HAR/48_16/x_test.npy'), np.load('./HAR/48_16/y_test.npy')
x_val, y_val = np.load('./HAR/48_16/x_val.npy'), np.load('./HAR/48_16/y_val.npy')

# BAG_TEST, BAG_TRAIN, BAG_VAL are used as part of some of the analysis methods
# These are BAG level labels.
BAG_TEST = np.argmax(y_test[:, 0, :], axis=1)
BAG_TRAIN = np.argmax(y_train[:, 0, :], axis=1)
BAG_VAL = np.argmax(y_val[:, 0, :], axis=1)

print("x_train shape is:", x_train.shape)
print("y_train shape is:", y_train.shape)
print("x_val shape is:", x_val.shape)
print("y_val shape is:", y_val.shape)

x_train shape is: (6220, 6, 48, 9)
y_train shape is: (6220, 6, 6)
x_val shape is: (1132, 6, 48, 9)
y_val shape is: (1132, 6, 6)
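
The bag-level labels computed in the cell above rely on every sub-instance initially sharing its bag's label, so reading the first sub-instance is enough. A toy illustration with made-up sizes:

```python
import numpy as np

# Toy label tensor: 2 bags, 3 sub-instances, 4 classes (sizes are made up).
y = np.zeros((2, 3, 4))
y[0, :, 1] = 1   # every sub-instance of bag 0 carries class 1
y[1, :, 3] = 1   # every sub-instance of bag 1 carries class 3

# Read the bag label off the first sub-instance of each bag.
bag_labels = np.argmax(y[:, 0, :], axis=1)
print(bag_labels)  # [1 3]
```

This shortcut holds only for the original labels. Once sub-instance labels have been updated during training, the first sub-instance may no longer carry the bag label, which is presumably why the bag labels are computed once, up front.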


In [9]:
# These are the parameters that are required to create the training graph
SUBINSTANCE_WIDTH = 48
SUBINSTANCE_STRIDE = 16
NUM_SUBINSTANCE = x_val.shape[1]
NUM_TIME_STEPS = x_val.shape[2]
NUM_FEATS = x_val.shape[3]
NUM_HIDDEN = 16
# TODO: Explain this. In the mean time, set it to NUM_HIDDEN
NUM_FC = NUM_HIDDEN
NUM_OUTPUT = 6
NUM_ITER = 3
NUM_ROUNDS = 5
MODELDIR = '/tmp/model_dump/'
MIN_SUBSEQUENCE_LEN = 3

# Training parameters. To make sure the Dataset API is used efficiently,
# these parameters need to be known a priori.
trainingParams = {
    'batch_size': 256,
    'max_epochs': 50,
    'learning_rate_start': 0.001,
    'keep_prob':0.85
}

print('Num subinstance', NUM_SUBINSTANCE)
print('Num time steps', NUM_TIME_STEPS)
print('Num feats', NUM_FEATS)

Num subinstance 6
Num time steps 48
Num feats 9


# Training

Both *MI-RNN* and *EMI-RNN* training happen in multiple *rounds*. Each round consists of two phases: a training phase, in which we learn the best possible model given the current sub-instance labels, followed by a label update phase, in which we use that best model to update the sub-instance labels.
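
To make the label update phase concrete, here is a hypothetical sketch of one possible update rule. It is not the library's `updateYPolicy4`, whose exact behaviour is not described in this notebook. The sketch assumes a dedicated noise class (`noise_class`, an invented parameter) and keeps the bag label only on the `k` consecutive sub-instances that the model scores highest for that label.

```python
import numpy as np

def update_labels_sketch(softmax_out, bag_labels, k, noise_class=0):
    '''
    Hypothetical label update, NOT the library's updateYPolicy4.
    softmax_out: [num_bags, num_subinstances, num_classes] model scores.
    bag_labels:  [num_bags] integer class of each bag.
    For every bag, keep the bag label on the k consecutive sub-instances
    whose summed score for that label is highest; mark the rest as noise.
    '''
    num_bags, num_sub, num_classes = softmax_out.shape
    assert 1 <= k <= num_sub
    new_y = np.zeros((num_bags, num_sub, num_classes))
    new_y[:, :, noise_class] = 1.0
    for b in range(num_bags):
        scores = softmax_out[b, :, bag_labels[b]]
        # Sliding sum over every window of k consecutive sub-instances.
        window_sums = np.convolve(scores, np.ones(k), mode='valid')
        start = int(np.argmax(window_sums))
        new_y[b, start:start + k, :] = 0.0
        new_y[b, start:start + k, bag_labels[b]] = 1.0
    return new_y

# One bag, four sub-instances, two classes (class 0 assumed to be noise).
softmax = np.array([[[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.8, 0.2]]])
new_y = update_labels_sketch(softmax, np.array([1]), k=2)
print(np.argmax(new_y, axis=2))  # [[0 1 1 0]]
```

In this toy case the two middle sub-instances score highest for class 1, so only they keep the bag label.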

In [None]:
currentRound = 0
reuse = False
currY = np.array(y_train)
while currentRound < NUM_ROUNDS:
    print("%s" %( '-' * 10))
    print("Round %d" % (currentRound))
    print("%s" %( '-' * 10))
    # Some random numbers as global steps so that models are not overwritten
    globalStepBase = 20000 + currentRound * 100
    accList = []
    modelList = []
    # Start training. We save a model after each iteration and
    # reload the one with the best validation set performance later
    for i in range(NUM_ITER):
        print("Iteration %d " % (i))
        if reuse is False:
            print("Generating graph %d" % i)
            tf.reset_default_graph()
            network = NetworkV2(NUM_SUBINSTANCE, NUM_FEATS,
                                NUM_TIME_STEPS, NUM_HIDDEN,
                                NUM_FC, NUM_OUTPUT)
            network.createGraph(stepSize=trainingParams['learning_rate_start'])
        
        # Train on the current sub-instance labels (equal to y_train in round 0)
        network.trainModel(x_train, currY, x_val, y_val, trainingParams, reuse=reuse)
        network.checkpointModel(MODELDIR, max_to_keep=1000,
                                global_step = globalStepBase + i)
        rawOut, softmaxOut, labelOut = network.inference(x_val, 50000)
        trueLabels = np.argmax(y_val, axis=2)
        f = open(os.devnull, 'w')
        df = analysisModelMultiClass(labelOut, trueLabels, BAG_VAL,
                                     NUM_SUBINSTANCE, numClass=NUM_OUTPUT,
                                     redirFile=f)
        f.close()
        acc = np.max(df.acc.values)
        # ssl: subsequence length
        print("Val Accuracy: %f @ssl %d" % (acc, np.argmax(df.acc.values) + 1))
        accList.append(acc)
        modelList.append((MODELDIR, globalStepBase + i))
        reuse=True

    # Load the model with the best validation accuracy from this round
    idx = np.argmax(accList)
    modelname, global_step = modelList[idx]
    tf.reset_default_graph()
    network = NetworkV2(NUM_SUBINSTANCE, NUM_FEATS, NUM_TIME_STEPS,
                        NUM_HIDDEN, NUM_FC, NUM_OUTPUT, useCudnn=False)
    graph = network.importModelTF(modelname, global_step)
    rawOut, softmaxOut, labelOut = network.inference(x_val, 50000)
    trueLabels = np.argmax(y_val, axis=2)
    f = open(os.devnull, 'w')
    df = analysisModelMultiClass(labelOut, trueLabels, BAG_VAL,
                                NUM_SUBINSTANCE, numClass=NUM_OUTPUT, redirFile=f)
    f.close()
    print("\nVal Accuracy: %f @ssl %d" % (np.max(df.acc.values), np.argmax(df.acc.values) + 1))
  
    # Update label information
    _, softmaxOut, _ = network.inference(x_train, 50000)
    newY = updateYPolicy4(currY, softmaxOut, BAG_TRAIN,
                          numClasses=NUM_OUTPUT, k=MIN_SUBSEQUENCE_LEN)
    updateIndexBags, updateIndexTotal = getUpdateIndexList(currY, newY,
                                        NUM_SUBINSTANCE, NUM_OUTPUT)
    count = len(updateIndexBags)
    print("Number of bag updates: %d (%f)" % (count, count / len(newY)))
    print("Number of total updates: %d (%f)" % (updateIndexTotal, updateIndexTotal / (len(newY) * NUM_SUBINSTANCE)))
    currY = newY
    currentRound += 1

----------
Round 0
----------
Iteration 0 
Generating graph 0
Using softmax loss
GPU Fraction: 1.0
Executing 50 epochs
Epoch  48 Batch     0 ( 1200) Loss 0.11787 Accuracy 0.94792 
Model saved to /tmp/model_dump/, global_step 20000
Val Accuracy: 0.931095 @ssl 2
Iteration 1 
Reusing previous session
Reusing previous init
Executing 50 epochs
Epoch  16 Batch    20 (  420) Loss 0.08526 Accuracy 0.95573 

In [5]:
network.checkpointModel('/tmp/model00_', 1000)

Model saved to /tmp/model00_, global_step 1000


In [6]:
tf.reset_default_graph()
network = NetworkV2(NUM_SUBINSTANCE, NUM_FEATS, NUM_TIME_STEPS, NUM_HIDDEN, NUM_FC, NUM_OUTPUT, useCudnn=False)
_ = network.importModelTF('/tmp/model00_', 1000)

INFO:tensorflow:Restoring parameters from /tmp/model00_-1000
Restoring /tmp/model00_-1000


## Test Stats

In [7]:
_, softmaxOut, predictions = network.inference(x_test, 1000)
trueLabels = np.argmax(y_test, axis=2)
bagTest = np.argmax(y_test, axis=2)[:, 0]
df = analysisModelMultiClass(predictions, trueLabels,
                        bagTest, NUM_SUBINSTANCE,
                        numClass=NUM_OUTPUT)

   len       acc  macro-fsc  macro-pre  macro-rec  micro-fsc  micro-pre  \
0    1  0.876824   0.878007   0.879787   0.879677   0.876824   0.876824   
1    2  0.888022   0.889345   0.888977   0.890494   0.888022   0.888022   
2    3  0.880556   0.881668   0.883384   0.882141   0.880556   0.880556   
3    4  0.862233   0.863936   0.873560   0.862808   0.862233   0.862233   
4    5  0.842891   0.846170   0.868189   0.842166   0.842891   0.842891   
5    6  0.825585   0.830608   0.863335   0.824227   0.825585   0.825585   

   micro-rec  
0   0.876824  
1   0.888022  
2   0.880556  
3   0.862233  
4   0.842891  
5   0.825585  
Max accuracy 0.888022 at subsequencelength 2
Max micro-f 0.888022 at subsequencelength 2
Micro-precision 0.888022 at subsequencelength 2
Micro-recall 0.888022 at subsequencelength 2
Max macro-f 0.889345 at subsequencelength 2
macro-precision 0.888977 at subsequencelength 2
macro-recall 0.890494 at subsequencelength 2
Fraction false alarm 0.038642 (115/2976) 
