# Sequence Classification using Recurrent Neural Networks(RNN)
In this homework, you will learn how to train a recurrent neural network for human action classification. RNN is designed handle sequential data. The network can incorporate both past history and current input. [This](http://colah.github.io/posts/2015-08-Understanding-LSTMs/) is a very good tutorial. You should read it before you start.

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive


## Setup
Please make sure you have h5py and torchnet installed
> pip install h5py

> pip install git+https://github.com/pytorch/tnt.git@master


In [8]:
!pip install h5py
!pip install git+https://github.com/pytorch/tnt.git@master

Collecting git+https://github.com/pytorch/tnt.git@master
  Cloning https://github.com/pytorch/tnt.git (to revision master) to /tmp/pip-req-build-pvy42imn
  Running command git clone -q https://github.com/pytorch/tnt.git /tmp/pip-req-build-pvy42imn
Building wheels for collected packages: torchnet
  Building wheel for torchnet (setup.py) ... [?25l[?25hdone
  Created wheel for torchnet: filename=torchnet-0.0.5.1-cp36-none-any.whl size=30917 sha256=8d73f02fc1bda61b74064b5c3399654172d75484b4137f5f3ecf276e32739a2c
  Stored in directory: /tmp/pip-ephem-wheel-cache-v_s83rts/wheels/17/05/ec/d05d051a225871af52bf504f5e8daf57704811b3c1850d0012
Successfully built torchnet


In [83]:
import os
import numpy as np
import h5py

import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.utils.data as DD
import torchnet as tnt

use_cuda = torch.cuda.is_available()
print('use cuda: %s'%(use_cuda))
FloatTensor = torch.cuda.FloatTensor if use_cuda else torch.FloatTensor
LongTensor = torch.cuda.LongTensor if use_cuda else torch.LongTensor
ByteTensor = torch.cuda.ByteTensor if use_cuda else torch.ByteTensor


use cuda: False


## Dataset
The data we are using is skeleton data, which indicates the 3D locations of body joints. In total, there are 25 body joints. It is collected by Kinect v2. To make it easier, each sequence have same number of frames. You need to classify 10 different actions. There are 2000 training sequences, 400 validation sequences, and 500 test sequences. Each sequence has 15 frames, each frame is a 75-dimension vector (3*25). 

For your convenience, we provide the dataloader for you.


In [0]:
class Dataset(DD.Dataset):
    # subset can be: 'train', 'val', 'test'
    def __init__(self, data_path, subset='train'):
        super(Dataset, self).__init__()
        self.data_path = os.path.join(data_path, '%s_data.h5'%subset)
        self.subset = subset

        with h5py.File(self.data_path) as f:
            self.data = np.array(f['data'])

        if subset != 'test':
            self.label_path = os.path.join(data_path, '%s_label.h5'%subset)
            with h5py.File(self.label_path) as f:
                self.label = np.array(f['label'])

        self.num_sequences = self.data.shape[0]
        self.seq_len = self.data.shape[1]
        self.n_dim = self.data.shape[2]

    def __getitem__(self, index):
        seq = self.data[index]
        if self.subset != 'test':
            label = int(self.label[index])
            sample = {'seq': seq, 'label': label}
        else:
            sample = {'seq': seq}
        return sample

    def __len__(self):
        return self.num_sequences

trSet = Dataset('/content/drive/My Drive/Colab Notebooks/Question3/data', subset='train')
valSet = Dataset('/content/drive/My Drive/Colab Notebooks/Question3/data', subset='val')
tstSet = Dataset('/content/drive/My Drive/Colab Notebooks/Question3/data', subset='test')

batch_size = 50
trLD = DD.DataLoader(trSet, batch_size=batch_size,
       sampler=DD.sampler.RandomSampler(trSet),
       num_workers=2, pin_memory=False)
valLD = DD.DataLoader(valSet, batch_size=batch_size,
       sampler=DD.sampler.SequentialSampler(valSet),
       num_workers=1, pin_memory=False)
tstLD = DD.DataLoader(tstSet, batch_size=batch_size,
       sampler=DD.sampler.SequentialSampler(tstSet),
       num_workers=1, pin_memory=False)

input_dim = trSet.n_dim
num_class = 10

## Model
Pytorch has implemented different types of recurrent layers for you. For this homework, you can use any type of RNNs as you want:
> torch.nn.RNN()

> torch.nn.LSTM()

> torch.nn.GRU()

You can check details for different types of recurrent layers here: [RNN](http://pytorch.org/docs/master/nn.html#torch.nn.RNN), [LSTM]( http://pytorch.org/docs/master/nn.html#torch.nn.LSTM), [GRU](http://pytorch.org/docs/master/nn.html#torch.nn.GRU)


### Implement a specific model
In this section, you need to implement a model for sequence classification. The model has following layers:
* A linear layer that can map features of 75-dimension to 100-dimension.
* 1 Layer LSTM layer with hidden size of 100
* A linear layer that goes from 100 to num_class (10). 

An LSTM layer takes an input of size of (batch_size, seq_len, fea_dim) and outputs a variable of shape (batch_size, seq_len, hidden_size). In this homework, the classification score for a sequence is the classification score for the last step of rnn_outputs.



In [0]:
# sequence classification model
class sqClassify(nn.Module):
    def __init__(self):
        super(sqClassify, self).__init__()
        
        ############## 1st To Do (10 points) ##############
        ###################################################
        self.project_layer = nn.Linear(75, 100)
        self.recurrent_layer = nn.LSTM(100, 100, 1)
        self.classify_layer = nn.Linear(100, 10)
        ###################################################
    
    # the size of input is [batch_size, seq_len(15), input_dim(75)]
    # the size of logits is [batch_size, num_class]
    def forward(self, input, h_t_1=None, c_t_1=None):
        # the size of rnn_outputs is [batch_size, seq_len, rnn_size]
        rnnOpt, (hn, cn) = self.recurrent_layer(self.project_layer(input))
        # classify the last step of rnn_outpus
        # the size of logits is [batch_size, num_class]
        lgts = self.classify_layer(rnnOpt[:,-1])
        return lgts

model = sqClassify()

## Train the model
After you have the dataloader and model, you can start training the model. Define a SGD optimizer with learning rate of 1e-3, and a cross-entropy loss function:

In [0]:
################ 2nd To Do  (5 points)##################
from torch import optim
dtype = torch.FloatTensor
optimizer = optim.SGD(model.parameters(), lr = 1e-3)
criterion = nn.CrossEntropyLoss().type(dtype)

In [154]:
# one epoch train or validation model
def run_epoch(data_loader, model, criterion, epoch, is_training, optimizer=None):
    if is_training:
        model.train()
        lgp = 'train'
    else:
        model.eval()
        lgp = 'val'

    confusion_matrix = tnt.meter.ConfusionMeter(num_class)
    acc = tnt.meter.ClassErrorMeter(accuracy=True)
    mtrL = tnt.meter.AverageValueMeter()

    for batch_idx, sample in enumerate(data_loader):
        sequence = sample['seq']
        label = sample['label']
        ipSeqVar = Variable(sequence).type(FloatTensor)
        ipLabVar = Variable(label).type(LongTensor)

        # compute output
        # output_logits: [batch_size, num_class]
        oplgts = model(ipSeqVar)
        loss = criterion(oplgts, ipLabVar)

        if is_training:
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        mtrL.add(loss.data)
        acc.add(oplgts.data, ipLabVar.data)
        confusion_matrix.add(oplgts.data, ipLabVar.data)


    print('%s epoch: %d  , loss: %.4f,  acc: %.2f'%(lgp, epoch+1, mtrL.value()[0], acc.value()[0]))
    return acc.value()[0]

epochs = 10
for i in range(epochs):
    run_epoch(trLD, model, criterion, e, True, optimizer)
    run_epoch(valLD, model, criterion, e, False, None)


train Epoch: 10  , Loss: 2.3030,  Accuracy: 10.90
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00
train Epoch: 10  , Loss: 2.3030,  Accuracy: 11.10
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00
train Epoch: 10  , Loss: 2.3030,  Accuracy: 11.55
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00
train Epoch: 10  , Loss: 2.3035,  Accuracy: 11.30
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00
train Epoch: 10  , Loss: 2.3028,  Accuracy: 11.30
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00
train Epoch: 10  , Loss: 2.3027,  Accuracy: 11.55
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00
train Epoch: 10  , Loss: 2.3032,  Accuracy: 11.35
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00
train Epoch: 10  , Loss: 2.3032,  Accuracy: 11.45
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00
train Epoch: 10  , Loss: 2.3030,  Accuracy: 11.45
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00
train Epoch: 10  , Loss: 2.3033,  Accuracy: 11.20
val Epoch: 10  , Loss: 2.3037,  Accuracy: 10.00


## Submit your results 

### Train a better model for action recognition!
Now it's your job to experiment with architectures, hyperparameters, loss functions, and optimizers to train a model that achieves better accuracy on the action recognition validation set. 


### Testing the model and reporting the results
Test the model on the testing set and save the results as a .csv file. 
submit the results.csv file generated by predict_on_test(). Also mention the best performance on the Validation set, and submit the corresponding results csv file which results in the best performance. 
################ 3rd To Do  (15 points) ###############


In [0]:
class Flatten(nn.Module):
    def forward(self, x):
        N, C, H, W = x.size()
        return x.view(N, -1)
    
# sequence classification model
class sqClassifyfinal(nn.Module):
    def __init__(self):
        super(sqClassifyfinal, self).__init__()
        
        self.cnn_layer = nn.Sequential( 
            
              nn.Conv2d(1, 128, kernel_size=(3, 3), stride=1),
              nn.LeakyReLU(0.02),
              nn.BatchNorm2d(128),
              nn.MaxPool2d(kernel_size=2, stride=2, padding=0),

              nn.Conv2d(128, 256, kernel_size=(2,2), stride=1),
              nn.LeakyReLU(0.02),
              nn.BatchNorm2d(256),
              nn.MaxPool2d(kernel_size=2, stride=2, padding=0),

              Flatten(), 
              nn.ReLU(inplace=True),
              nn.Linear(8704, 100)
        )
        
        #self.project_layer = nn.Linear(75, 100)
        self.recurrent_layer = nn.LSTM(100, 100, 1)
        self.classify_layer = nn.Linear(100, 10)
        ###################################################
    
    # the size of input is [batch_size, seq_len(15), input_dim(75)]
    # the size of logits is [batch_size, num_class]
    def forward(self, input, h_t_1=None, c_t_1=None):
        # the size of rnn_outputs is [batch_size, seq_len, rnn_size]
        
        rnnOps = self.cnn_layer(input.view(50, 1, 15, 75))
        rnnOps, (hn, cn) = self.recurrent_layer(rnnOps.view(50, 1, 100))
        
        # classify the last step of rnn_outpus
        # the size of logits is [batch_size, num_class]
        lgts = self.classify_layer(rnnOps[:,-1])
        return lgts

finalmodel = sqClassifyfinal()

In [156]:
optimizer = optim.Adam(finalmodel.parameters(), lr = 1e-4)
criterion = nn.CrossEntropyLoss()

epochs = 50
valLst = []
trainLst = []
for i in range(epochs):
    acc = run_epoch(trLD, finalmodel, criterion, i, True, optimizer)
    trainLst.append(acc)
    acc = run_epoch(valLD, finalmodel, criterion, i, False, None)
    valLst.append(acc)

train Epoch: 1  , Loss: 2.2355,  Accuracy: 19.45
val Epoch: 1  , Loss: 2.2008,  Accuracy: 19.25
train Epoch: 2  , Loss: 1.9105,  Accuracy: 46.65
val Epoch: 2  , Loss: 1.7071,  Accuracy: 60.00
train Epoch: 3  , Loss: 1.5340,  Accuracy: 63.95
val Epoch: 3  , Loss: 1.3768,  Accuracy: 69.00
train Epoch: 4  , Loss: 1.2657,  Accuracy: 69.90
val Epoch: 4  , Loss: 1.1533,  Accuracy: 75.50
train Epoch: 5  , Loss: 1.0929,  Accuracy: 73.95
val Epoch: 5  , Loss: 1.0404,  Accuracy: 76.50
train Epoch: 6  , Loss: 0.9758,  Accuracy: 77.70
val Epoch: 6  , Loss: 0.9561,  Accuracy: 77.50
train Epoch: 7  , Loss: 0.8705,  Accuracy: 79.80
val Epoch: 7  , Loss: 0.8551,  Accuracy: 79.75
train Epoch: 8  , Loss: 0.8034,  Accuracy: 81.05
val Epoch: 8  , Loss: 0.8637,  Accuracy: 77.75
train Epoch: 9  , Loss: 0.7369,  Accuracy: 83.35
val Epoch: 9  , Loss: 0.7521,  Accuracy: 81.00
train Epoch: 10  , Loss: 0.6820,  Accuracy: 84.40
val Epoch: 10  , Loss: 0.7278,  Accuracy: 80.25
train Epoch: 11  , Loss: 0.6323,  Accu

In [160]:
# Use your best model to generate results on test set and validation set.

# generate csv file for test set
def predict_on_test(model, data_loader):
    model.eval() # Put the model in test mode (the opposite of model.train(), essentially)
    results=open('results.csv','w')
    count=0
    results.write('Id'+','+'Class'+'\n')
    for batch_idx, sample in enumerate(data_loader):
        seq = sample['seq']
        ipSeqVar = Variable(seq).type(FloatTensor)
        scores = model(ipSeqVar)
        _, preds = scores.data.max(1)
        for i in range(len(preds)):
            results.write(str(count)+','+str(preds[i])+'\n')
            count+=1
    results.close()
    return count

count=predict_on_test(model, tstLD)
print(count)

500


## Report the performance
################ 4th To Do  (5 points)##################

In this cell, you should write an explanation of what you did (network architecture, optimiziter, learning rate, epoches) and any visualizations or graphs that you make in the process of training and evaluating your network.



Network Architecture:

I used Convolutional Long Short Memory, Fully Connected Deep Neural Network. I have tried various kernel and filter sizes in convolutions to improve performance.
The implemented model has 2 layers of Conv2d, LeakyReLU, BatchNorm2d, Maxpool2d. The final layer is a fully connected LSTM layer.

Optimizer:

I tested the model with SGD and Adam optimizer. SGD optimizer gave 76% accuracy on validation set but did not converge in 50 epochs. I then used Adam Optimizer, this made the model stable, gave 86.5% accuracy on validation set and converged in 50 epochs.

Learning Rate:

The learning rate of 1e-4 is used in this model. 




In [0]:
from google.colab import files
files.download('results.csv')