# Using EMI-RNN on the HAR Dataset

This is a simple example of how the existing EMI-RNN implementation can be used on the HAR dataset. We illustrate how to train a model that predicts on 48-step sequences in place of the 128-step baseline while also attempting to predict early. For more advanced use cases that involve more sophisticated computation graphs or loss functions, please refer to the docstrings provided with the released code.

In the preprint of our work, we use the terms *bag* and *instance* to refer to the LSTM input sequence of original length and the shorter ones we want to learn to predict on, respectively. In the code, however, *bag* is replaced with *instance* and *instance* is replaced with *sub-instance*. We use the paper's terms and the code's terms interchangeably below, so the meaning of *instance* depends on context.

The network used here is a simple LSTM + Linear classifier network. 

We use the UCI [Human Activity Recognition](https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones) dataset.

In [1]:
import os
import sys
import tensorflow as tf
import numpy as np
# Making sure edgeml is part of python path
sys.path.insert(0, '../../')
os.environ['CUDA_VISIBLE_DEVICES'] ='0'

# MI-RNN and EMI-RNN imports
from edgeml.graph.rnn import EMI_DataPipeline
from edgeml.graph.rnn import EMI_BasicLSTM
from edgeml.trainer.emirnnTrainer import EMI_Trainer, EMI_Driver
import edgeml.utils

Let us set up some network parameters for the computation graph.

In [2]:
# Network parameters for our LSTM + FC Layer
NUM_HIDDEN = 32
NUM_TIMESTEPS = 48
NUM_FEATS = 9
FORGET_BIAS = 1.0
NUM_OUTPUT = 6
USE_DROPOUT = True
KEEP_PROB = 0.75

# For dataset API
PREFETCH_NUM = 5
BATCH_SIZE = 32

# Number of epochs in *one iteration*
NUM_EPOCHS = 3
# Number of iterations in *one round*. After each iteration,
# the model is dumped to disk. At the end of the current
# round, the best model among all the dumped models in the
# current round is picked up.
NUM_ITER = 2
# A round consists of multiple training iterations and a belief
# update step using the best model from all of these iterations
NUM_ROUNDS = 5

# A staging directory to store models
MODEL_PREFIX = '/tmp/model-lstm'
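As a quick sanity check on the schedule above, the sketch below computes how many epochs and checkpoints these settings imply. It assumes, as the comments state, that exactly one model is dumped per iteration. `training_schedule` is a hypothetical helper written for illustration and is not part of `edgeml`.

```python
def training_schedule(num_rounds, num_iter, num_epochs):
    """Return (total_epochs, total_checkpoints) for the nested schedule.

    Assumes one checkpoint is written at the end of every iteration.
    """
    total_iterations = num_rounds * num_iter
    total_epochs = total_iterations * num_epochs
    return total_epochs, total_iterations


total_epochs, total_checkpoints = training_schedule(
    num_rounds=5, num_iter=2, num_epochs=3)
print("total epochs:", total_epochs)
print("checkpoints written:", total_checkpoints)
```

With the values used in this example, the ten checkpoints are consistent with the global steps 1000 through 1009 that appear in the restore messages later in this document.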

# Loading Data

Please make sure the data is preprocessed to a format that is compatible with EMI-RNN. `tf/examples/EMI-RNN/fetch_har.py` can be used to download and set up the HAR dataset.


In [3]:
# Loading the data
x_train, y_train = np.load('./HAR/48_16/x_train.npy'), np.load('./HAR/48_16/y_train.npy')
x_test, y_test = np.load('./HAR/48_16/x_test.npy'), np.load('./HAR/48_16/y_test.npy')
x_val, y_val = np.load('./HAR/48_16/x_val.npy'), np.load('./HAR/48_16/y_val.npy')

# BAG_TEST, BAG_TRAIN, BAG_VAL represent bag_level labels. These are used for the label update
# step of EMI/MI RNN
BAG_TEST = np.argmax(y_test[:, 0, :], axis=1)
BAG_TRAIN = np.argmax(y_train[:, 0, :], axis=1)
BAG_VAL = np.argmax(y_val[:, 0, :], axis=1)
NUM_SUBINSTANCE = x_train.shape[1]
print("x_train shape is:", x_train.shape)
print("y_train shape is:", y_train.shape)
print("x_val shape is:", x_val.shape)
print("y_val shape is:", y_val.shape)

x_train shape is: (6220, 6, 48, 9)
y_train shape is: (6220, 6, 6)
x_val shape is: (1132, 6, 48, 9)
y_val shape is: (1132, 6, 6)
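The four-dimensional shapes printed above are `[bags, sub-instances, time steps, features]`. The following sketch shows one way such a layout could arise from 128-step bags. It assumes a window of 48 steps and a stride of 16, which is only inferred from the `48_16` directory name, giving (128 - 48) / 16 + 1 = 6 sub-instances per bag. `make_subinstances` is a hypothetical helper; the actual preprocessing is done by `fetch_har.py` and may differ.

```python
import numpy as np


def make_subinstances(bags, window, stride):
    """Slice [num_bags, steps, feats] into [num_bags, num_sub, window, feats]."""
    num_steps = bags.shape[1]
    starts = range(0, num_steps - window + 1, stride)
    return np.stack([bags[:, s:s + window, :] for s in starts], axis=1)


# Toy stand-in for raw HAR bags: 4 bags, 128 steps, 9 features.
bags = np.random.rand(4, 128, 9).astype('float32')
sub = make_subinstances(bags, window=48, stride=16)
print(sub.shape)  # (4, 6, 48, 9)

# Initially every sub-instance carries its bag's one-hot label, so the bag
# label can be read back from the first sub-instance.
bag_labels = np.array([0, 3, 5, 1])
y = np.tile(np.eye(6)[bag_labels][:, None, :], (1, sub.shape[1], 1))
recovered = np.argmax(y[:, 0, :], axis=1)
print(recovered)  # [0 3 5 1]
```

Reading the label from sub-instance 0 mirrors the `np.argmax(y_train[:, 0, :], axis=1)` expression used to build `BAG_TRAIN`.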


# Computation Graph

![Three-part computation graph illustration](img/3PartsGraph.png)

The *EMI-RNN* computation graph is constructed out of the following three mutually disjoint parts:

1. `EMI_DataPipeline`: An efficient data input pipeline that uses the TensorFlow Dataset API. This module ingests data compatible with EMI-RNN and provides two iterators for a batch of input data, $x$, and labels, $y$.
2. `EMI_RNN`: The 'abstract' `EMI_RNN` class defines the methods and attributes required for the forward computation graph. An LSTM-based implementation, `EMI_BasicLSTM`, is used in this document, though users are free to implement their own computation graphs compatible with `EMI_RNN`. This module expects two Dataset API iterators for $x$-batch and $y$-batch as inputs and constructs the forward computation graph based on them. Every implementation of this class defines an `output` operation, which is the output of the forward computation graph.
3. `EMI_Trainer`: An instance of the `EMI_Trainer` class, which defines the loss functions and the training routine. It expects an `output` operator from an `EMI_RNN` implementation and attaches loss functions and training routines to it.

To build the computation graph, we create an instance of all the above and then connect them together.

Note that the `EMI_BasicLSTM` class is an implementation that uses an LSTM cell and pushes the LSTM output at each step to a secondary classifier for classification. This secondary classifier is not implemented as part of `EMI_BasicLSTM`; it is left to the user to define by overriding the `_createExtendedGraph` and `_restoreExtendedGraph` methods.

For the purpose of this example, we will be using a simple linear layer as a secondary classifier.

In [4]:
# Define the linear secondary classifier
def createExtendedGraph(self, baseOutput, *args, **kwargs):
    W1 = tf.Variable(np.random.normal(size=[NUM_HIDDEN, NUM_OUTPUT]).astype('float32'), name='W1')
    B1 = tf.Variable(np.random.normal(size=[NUM_OUTPUT]).astype('float32'), name='B1')
    y_cap = tf.add(tf.tensordot(baseOutput, W1, axes=1), B1, name='y_cap_tata')
    self.output = y_cap
    self.graphCreated = True

def restoreExtendedGraph(self, graph, *args, **kwargs):
    y_cap = graph.get_tensor_by_name('y_cap_tata:0')
    self.output = y_cap
    self.graphCreated = True
    
def feedDictFunc(self, keep_prob=None, inference=False, **kwargs):
    if inference is False:
        feedDict = {self._emiGraph.keep_prob: keep_prob}
    else:
        feedDict = {self._emiGraph.keep_prob: 1.0}
    return feedDict
    
EMI_BasicLSTM._createExtendedGraph = createExtendedGraph
EMI_BasicLSTM._restoreExtendedGraph = restoreExtendedGraph

if USE_DROPOUT is True:
    EMI_Driver.feedDictFunc = feedDictFunc
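To make the shapes in the linear classifier concrete, here is a NumPy-only sketch of the same `tensordot` contraction. It assumes the LSTM output is laid out as `[batch, sub-instances, time steps, hidden]`; that layout is an assumption made for illustration and is not confirmed by the text above. `np.tensordot` with `axes=1` contracts the last axis of the first argument with the first axis of the second, just as `tf.tensordot` does.

```python
import numpy as np

NUM_HIDDEN, NUM_OUTPUT = 32, 6

# Assumed layout of the LSTM output: [batch, sub-instances, time steps, hidden]
base_output = np.random.normal(size=(2, 6, 48, NUM_HIDDEN)).astype('float32')
W1 = np.random.normal(size=(NUM_HIDDEN, NUM_OUTPUT)).astype('float32')
B1 = np.random.normal(size=(NUM_OUTPUT,)).astype('float32')

# axes=1 contracts the hidden axis of base_output with the first axis of W1;
# the bias is then broadcast over every leading axis.
y_cap = np.tensordot(base_output, W1, axes=1) + B1
print(y_cap.shape)  # (2, 6, 48, 6)
```

Under this assumption, the classifier produces one score vector of length `NUM_OUTPUT` for every time step of every sub-instance, which is what an early prediction policy needs to inspect.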

In [5]:
inputPipeline = EMI_DataPipeline(NUM_SUBINSTANCE, NUM_TIMESTEPS, NUM_FEATS, NUM_OUTPUT)
emiLSTM = EMI_BasicLSTM(NUM_SUBINSTANCE, NUM_HIDDEN, NUM_TIMESTEPS, NUM_FEATS,
                        forgetBias=FORGET_BIAS, useDropout=USE_DROPOUT)
emiTrainer = EMI_Trainer(NUM_TIMESTEPS, NUM_OUTPUT, lossType='xentropy')

Now that we have all the elementary parts of the computation graph set up, we connect them together to form the forward graph.

In [6]:
tf.reset_default_graph()
g1 = tf.Graph()    
with g1.as_default():
    # Obtain the iterators to each batch of the data
    x_batch, y_batch = inputPipeline()
    # Create the forward computation graph based on the iterators
    y_cap = emiLSTM(x_batch)
    # Create loss graphs and training routines
    emiTrainer(y_cap, y_batch)

# EMI Driver

The `EMI_Driver` implements the EMI-RNN algorithm. For more information on how the driver works, please refer to `tf/docs/EMI-RNN.md`. Note that during training, the accuracy printed is the instance-level accuracy, computed with the current label information as the target. The bag-level accuracy, which is what we actually care about, is calculated after training ends.

In [7]:
with g1.as_default():
    emiDriver = EMI_Driver(inputPipeline, emiLSTM, emiTrainer)

emiDriver.initializeSession(g1)
y_updated, modelStats = emiDriver.run(numClasses=NUM_OUTPUT, x_train=x_train,
                                      y_train=y_train, bag_train=BAG_TRAIN,
                                      x_val=x_val, y_val=y_val, bag_val=BAG_VAL,
                                      numIter=NUM_ITER, keep_prob=KEEP_PROB,
                                      numRounds=NUM_ROUNDS, batchSize=BATCH_SIZE,
                                      numEpochs=NUM_EPOCHS, modelPrefix=MODEL_PREFIX,
                                      fracEMI=0.5, updatePolicy='top-k', k=1)

Update policy: top-k
Training with MI-RNN loss for 3 rounds
Round: 0
Epoch   2 Batch   180 (  570) Loss 0.00533 Acc 0.89062 | Val acc 0.93707 | Model saved to /tmp/model-lstm, global_step 1000
Epoch   2 Batch   180 (  570) Loss 0.00352 Acc 0.89583 | Val acc 0.93301 | Model saved to /tmp/model-lstm, global_step 1001
INFO:tensorflow:Restoring parameters from /tmp/model-lstm-1000
Round: 1
Epoch   2 Batch   180 (  570) Loss 0.00357 Acc 0.94271 | Val acc 0.94141 | Model saved to /tmp/model-lstm, global_step 1002
Epoch   2 Batch   180 (  570) Loss 0.00302 Acc 0.93750 | Val acc 0.94271 | Model saved to /tmp/model-lstm, global_step 1003
INFO:tensorflow:Restoring parameters from /tmp/model-lstm-1003
Round: 2
Epoch   2 Batch   180 (  570) Loss 0.00361 Acc 0.89583 | Val acc 0.95153 | Model saved to /tmp/model-lstm, global_step 1004
Epoch   2 Batch   180 (  570) Loss 0.00306 Acc 0.92188 | Val acc 0.95341 | Model saved to /tmp/model-lstm, global_step 1005
INFO:tensorflow:Restoring parameters from /

# Evaluating the trained model

![MIML Formulation illustration](img/MIML_illustration.png)

## Accuracy

Since the trained model predicts on a shorter 48-step input while our test data has labels for 128-step inputs (i.e. bag-level labels), evaluating the accuracy of the trained model is not straightforward. We perform the evaluation as follows:

1. Divide the test data also into sub-instances; similar to what was done for the train data.
2. Obtain sub-instance level predictions for each bag in the test data.
3. Obtain bag-level predictions from the sub-instance-level predictions. For this, we use our estimate of the signature length to estimate the expected number of non-negative sub-instances, shown as $k$ in the figure. If a bag has $k$ consecutive sub-instances with the same label, that label becomes the label of the bag. All other bags are labeled negative.
4. Compare the predicted bag level labels with the known bag level labels in test data.
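Step 3 above can be sketched in plain Python as follows. This is an illustration of the rule as described, not the `getBagPredictions` implementation. It assumes that the negative class has index 0 and that, when several classes have a run of at least $k$ sub-instances, the longest run wins; both are assumptions. `bag_label_from_subinstances` is a hypothetical name.

```python
import numpy as np


def bag_label_from_subinstances(sub_preds, k, negative_class=0):
    """Label a bag from its sub-instance predictions.

    Returns the class of the longest run of at least k identical,
    non-negative predictions, or negative_class if no such run exists.
    """
    best_label, best_len = negative_class, 0
    run_label, run_len = None, 0
    for label in map(int, sub_preds):
        if label == run_label:
            run_len += 1
        else:
            run_label, run_len = label, 1
        if label != negative_class and run_len >= k and run_len > best_len:
            best_label, best_len = label, run_len
    return best_label


print(bag_label_from_subinstances(np.array([0, 2, 2, 2, 0, 0]), k=2))  # 2
print(bag_label_from_subinstances(np.array([0, 2, 0, 2, 0, 3]), k=2))  # 0
```

In the second call, class 2 is predicted twice but never on consecutive sub-instances, so the bag falls back to the negative label.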

## Early Savings

Early prediction is accomplished by defining an early prediction policy method. This method receives the prediction at each step of the learned LSTM for a sub-instance as input and is expected to return a predicted class and the 0-indexed step at which it made this prediction. This is illustrated below in code. 

In [8]:
# Early Prediction Policy: We make an early prediction based on the predicted
#     class's probability. If the predicted class probability >= minProb at some
#     step, we make a prediction at that step. Otherwise we predict at the last step.
def earlyPolicy_minProb(instanceOut, minProb, **kwargs):
    assert instanceOut.ndim == 2
    classes = np.argmax(instanceOut, axis=1)
    prob = np.max(instanceOut, axis=1)
    index = np.where(prob >= minProb)[0]
    if len(index) == 0:
        assert (len(instanceOut) - 1) == (len(classes) - 1)
        return classes[-1], len(instanceOut) - 1
    index = index[0]
    return classes[index], index

def getEarlySaving(predictionStep, numTimeSteps, returnTotal=False):
    predictionStep = predictionStep + 1
    predictionStep = np.reshape(predictionStep, -1)
    totalSteps = np.sum(predictionStep)
    maxSteps = len(predictionStep) * numTimeSteps
    savings = 1.0 - (totalSteps / maxSteps)
    if returnTotal:
        return savings, totalSteps
    return savings
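The toy example below walks through both ideas on hand-written numbers. It uses a compact, self-contained copy of the thresholding rule (under the hypothetical name `early_policy_min_prob`) and assumes the per-step outputs are class probabilities. The savings then follow the same arithmetic as `getEarlySaving`: with $t_i$ the 0-indexed step at which sub-instance $i$ stops, $N$ sub-instances, and $T$ time steps, savings $= 1 - \sum_i (t_i + 1) / (N T)$.

```python
import numpy as np


def early_policy_min_prob(instance_out, min_prob):
    """Return (class, step) at the first step whose top probability is
    >= min_prob, falling back to the final step if none qualifies."""
    classes = np.argmax(instance_out, axis=1)
    hits = np.where(np.max(instance_out, axis=1) >= min_prob)[0]
    step = int(hits[0]) if len(hits) else len(instance_out) - 1
    return int(classes[step]), step


# Toy per-step class probabilities for one sub-instance: 4 steps, 3 classes.
probs = np.array([[0.40, 0.35, 0.25],
                  [0.30, 0.60, 0.10],
                  [0.05, 0.92, 0.03],
                  [0.01, 0.98, 0.01]])
label, step = early_policy_min_prob(probs, min_prob=0.9)
print(label, step)  # 1 2

# Three sub-instances that stop at steps 2, 3 and 0 of a 4-step sequence
# consume 3 + 4 + 1 = 8 of the 12 available steps.
steps = np.array([2, 3, 0])
savings = 1.0 - np.sum(steps + 1) / (len(steps) * 4)
print(savings)
```

A threshold that is never reached costs nothing in accuracy terms, since the policy simply falls back to the prediction at the final step, but it also yields no savings for that sub-instance.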

In [9]:
k = 2
predictions, predictionStep = emiDriver.getInstancePredictions(x_test, y_test, earlyPolicy_minProb,
                                                               minProb=0.99, keep_prob=1.0)
bagPredictions = emiDriver.getBagPredictions(predictions, minSubsequenceLen=k, numClass=NUM_OUTPUT)
print('Accuracy at k = %d: %f' % (k,  np.mean((bagPredictions == BAG_TEST).astype(int))))
print('Additional savings: %f' % getEarlySaving(predictionStep, NUM_TIMESTEPS))

Accuracy at k = 2: 0.930777
Additional savings: 0.601676


In [10]:
# A slightly more detailed analysis method is provided. 
df = emiDriver.analyseModel(predictions, BAG_TEST, NUM_SUBINSTANCE, NUM_OUTPUT)

   len       acc  macro-fsc  macro-pre  macro-rec  micro-fsc  micro-pre  micro-rec
0    1  0.920937   0.920994   0.922158   0.923090   0.920937   0.920937   0.920937
1    2  0.930777   0.931287   0.931330   0.932738   0.930777   0.930777   0.930777
2    3  0.932474   0.933065   0.933142   0.933938   0.932474   0.932474   0.932474
3    4  0.919919   0.920495   0.923989   0.920161   0.919919   0.919919   0.919919
4    5  0.907363   0.908220   0.916799   0.906566   0.907363   0.907363   0.907363
5    6  0.894130   0.895735   0.911114   0.892380   0.894130   0.894130   0.894130
Max accuracy 0.932474 at subsequencelength 3
Max micro-f 0.932474 at subsequencelength 3
Micro-precision 0.932474 at subsequencelength 3
Micro-recall 0.932474 at subsequencelength 3
Max macro-f 0.933065 at subsequencelength 3
macro-precision 0.933142 at subsequencelength 3
macro-recall 0.933938 at subsequencelength 3


## Picking the best model

The `EMI_Driver.run()` method, upon finishing, returns a list containing information about the best models after each EMI-RNN round. This can be used to identify the best model (based on validation accuracy) at the end of each round - illustrated below.

In [11]:
# Open the null device for writing so that it can absorb redirected output
devnull = open(os.devnull, 'w')
for val in modelStats:
    round_, acc, modelPrefix, globalStep = val
    emiDriver.loadSavedGraphToNewSession(modelPrefix, globalStep, redirFile=devnull)
    predictions, predictionStep = emiDriver.getInstancePredictions(x_test, y_test, earlyPolicy_minProb,
                                                               minProb=0.99, keep_prob=1.0)

    bagPredictions = emiDriver.getBagPredictions(predictions, minSubsequenceLen=k, numClass=NUM_OUTPUT)
    print("Round: %2d, Validation accuracy: %.4f" % (round_, acc), end='')
    print(', Test Accuracy (k = %d): %f, ' % (k,  np.mean((bagPredictions == BAG_TEST).astype(int))), end='')
    print('Additional savings: %f' % getEarlySaving(predictionStep, NUM_TIMESTEPS)) 

INFO:tensorflow:Restoring parameters from /tmp/model-lstm-1000
Round:  0, Validation accuracy: 0.9371, Test Accuracy (k = 2): 0.891415, Additional savings: 0.337203
INFO:tensorflow:Restoring parameters from /tmp/model-lstm-1003
Round:  1, Validation accuracy: 0.9427, Test Accuracy (k = 2): 0.903291, Additional savings: 0.435999
INFO:tensorflow:Restoring parameters from /tmp/model-lstm-1005
Round:  2, Validation accuracy: 0.9534, Test Accuracy (k = 2): 0.900916, Additional savings: 0.457807
INFO:tensorflow:Restoring parameters from /tmp/model-lstm-1007
Round:  3, Validation accuracy: 0.9604, Test Accuracy (k = 2): 0.933492, Additional savings: 0.532466
INFO:tensorflow:Restoring parameters from /tmp/model-lstm-1009
Round:  4, Validation accuracy: 0.9599, Test Accuracy (k = 2): 0.930777, Additional savings: 0.601676
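Selecting a single best model from these statistics can be sketched as below. The tuples follow the `(round, validation accuracy, model prefix, global step)` order unpacked in the loop above, and the values are copied from the printed output as a stand-in for the real `modelStats`. Choosing by validation accuracy rather than test accuracy is the selection criterion the text describes.

```python
# (round, validation accuracy, model prefix, global step)
# Values copied from the printed output; they stand in for the real modelStats.
model_stats = [
    (0, 0.9371, '/tmp/model-lstm', 1000),
    (1, 0.9427, '/tmp/model-lstm', 1003),
    (2, 0.9534, '/tmp/model-lstm', 1005),
    (3, 0.9604, '/tmp/model-lstm', 1007),
    (4, 0.9599, '/tmp/model-lstm', 1009),
]

# Pick the entry with the highest validation accuracy (index 1 of each tuple).
best_round, best_acc, best_prefix, best_step = max(
    model_stats, key=lambda stats: stats[1])
print(best_round, best_acc, best_prefix, best_step)
```

The resulting prefix and global step are the two arguments that `emiDriver.loadSavedGraphToNewSession` takes in the loop above, so the selected model can be restored the same way.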
