# Sequence Classification using Recurrent Neural Networks (RNN)
In this homework, you will learn how to train a recurrent neural network (RNN) for human action classification. An RNN is designed to handle sequential data: at each step, the network combines its past history with the current input. [This](http://colah.github.io/posts/2015-08-Understanding-LSTMs/) is a very good tutorial; read it before you start.
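
To make "past history plus current input" concrete, here is a minimal NumPy sketch of a vanilla (Elman) RNN run over one sequence. It is only an illustration, not the PyTorch implementation used later in this homework: the function name `rnn_forward`, the weight names, and the hidden size of 8 are assumptions chosen for the example.

```python
import numpy as np

def rnn_forward(x_seq, W_x, W_h, b):
    """Run a vanilla RNN over one sequence.

    x_seq: [seq_len, input_dim]; returns hidden states [seq_len, hidden_dim].
    """
    h = np.zeros(W_h.shape[0])  # initial hidden state: no history yet
    states = []
    for x_t in x_seq:
        # the new state mixes the current input with the previous hidden state
        h = np.tanh(np.dot(W_x, x_t) + np.dot(W_h, h) + b)
        states.append(h)
    return np.stack(states)

rng = np.random.RandomState(0)
seq_len, input_dim, hidden_dim = 15, 75, 8  # hidden_dim is an arbitrary choice
x_seq = rng.randn(seq_len, input_dim)       # random stand-in for one sequence
W_x = rng.randn(hidden_dim, input_dim) * 0.1
W_h = rng.randn(hidden_dim, hidden_dim) * 0.1
b = np.zeros(hidden_dim)

states = rnn_forward(x_seq, W_x, W_h, b)
print(states.shape)  # one hidden vector per frame
```

An LSTM follows the same loop but replaces the single `tanh` update with gated updates, as described in the tutorial linked above.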

## Setup
**Please make sure you have h5py and torchnet installed**
> pip install h5py

> pip install git+https://github.com/pytorch/tnt.git@master

## Known Windows Issues:
### If you get the following error ([read more](https://discuss.pytorch.org/t/brokenpipeerror-errno-32-broken-pipe-when-i-run-cifar10-tutorial-py/6224)):
```python
BrokenPipeError: [Errno 32] Broken pipe
```

>In the dataloader block, set num_workers=0 on lines 39, 42, and 45.

### If you get the following error (this should be a CUDA error; [read more](https://discuss.pytorch.org/t/asserterror-in-lstm-layer-on-gpu/8698)):

```python
--> 186             assert param_from.type() == param_to.type()
AssertionError: 
```

**Replace the following lines:**
```python
def run_epoch(data_loader, model, criterion, epoch, is_training, optimizer=None):
    ...
    input_sequence_var = Variable(sequence).type(FloatTensor)
    input_label_var = Variable(label).type(LongTensor)
    ...
```
```python
def predict_on_test(model, data_loader):
    ...
        input_sequence_var = Variable(sequence).type(FloatTensor)
    ...
```
**With:**
```python
def run_epoch(data_loader, model, criterion, epoch, is_training, optimizer=None):
    ...
    input_sequence_var = Variable(sequence)
    input_label_var = Variable(label)
    ...
```
```python
def predict_on_test(model, data_loader):
    ...
        input_sequence_var = Variable(sequence)
    ...
```

In [0]:
!pip install git+https://github.com/pytorch/tnt.git@master

In [17]:
import os
import numpy as np
import h5py

import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.utils.data as DD
import torchnet as tnt


use_cuda = torch.cuda.is_available()
print('use cuda: %s'%(use_cuda))
FloatTensor = torch.cuda.FloatTensor if use_cuda else torch.FloatTensor
LongTensor = torch.cuda.LongTensor if use_cuda else torch.LongTensor
ByteTensor = torch.cuda.ByteTensor if use_cuda else torch.ByteTensor



use cuda: True


In [0]:
!pip install -U -q PyDrive

In [0]:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

In [6]:
fileId = drive.CreateFile({'id': '14Jx-VPNPx1_o8J3qBRSKWDSNN2QAmBlB'})  # 'id' is the Drive file ID, e.g. 1iytA1n2z4go3uVCwE_vIKouTKyIDjEq
print(fileId['title'])
fileId.GetContentFile('hw5.zip')

hw5.zip


In [7]:
!unzip hw5.zip -d ./

Archive:  hw5.zip
   creating: ./hw5/
  inflating: ./hw5/.DS_Store         
   creating: ./__MACOSX/
   creating: ./__MACOSX/hw5/
  inflating: ./__MACOSX/hw5/._.DS_Store  
  inflating: ./hw5/RNN_ActionClassify.ipynb  
  inflating: ./__MACOSX/hw5/._RNN_ActionClassify.ipynb  
   creating: ./hw5/data/
  inflating: ./hw5/data/val_label.h5  
   creating: ./__MACOSX/hw5/data/
  inflating: ./__MACOSX/hw5/data/._val_label.h5  
  inflating: ./hw5/data/.DS_Store    
  inflating: ./__MACOSX/hw5/data/._.DS_Store  
  inflating: ./hw5/data/train_label.h5  
  inflating: ./__MACOSX/hw5/data/._train_label.h5  
  inflating: ./hw5/data/val_data.h5  
  inflating: ./__MACOSX/hw5/data/._val_data.h5  
  inflating: ./hw5/data/test_data.h5  
  inflating: ./__MACOSX/hw5/data/._test_data.h5  
  inflating: ./hw5/data/train_data.h5  
  inflating: ./__MACOSX/hw5/data/._train_data.h5  
  inflating: ./__MACOSX/hw5/._data   
  inflating: ./__MACOSX/._hw5        


## Dataset
The data we are using is skeleton data, which gives the 3D locations of body joints. There are 25 body joints in total, and the data was collected with a Kinect v2. To make things easier, every sequence has the same number of frames. You need to classify 10 different actions. There are 4000 training sequences, 800 validation sequences, and 1000 test sequences. Each sequence has 15 frames, and each frame is a 75-dimensional vector (3*25).
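
As a quick illustration of these shapes, the sketch below builds a random stand-in for one sequence and views each 75-dimensional frame as 25 joints with 3 coordinates. The values are random, not real Kinect data, and the joint-major ordering (x, y, z of joint 0, then joint 1, and so on) is an assumption; the homework text does not specify how the 75 values are laid out.

```python
import numpy as np

num_joints, coords, seq_len = 25, 3, 15

# Random stand-in for one sequence (not real data)
sequence = np.random.RandomState(0).randn(seq_len, num_joints * coords).astype(np.float32)

# ASSUMPTION: the 3 coordinates of each joint are stored contiguously
joints = sequence.reshape(seq_len, num_joints, coords)

print(sequence.shape)  # (15, 75)
print(joints.shape)    # (15, 25, 3)
```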

For your convenience, we provide the dataloader for you.


In [0]:
class Dataset(DD.Dataset):
    # subset can be: 'train', 'val', 'test'
    def __init__(self, data_path, subset='train'):
        super(Dataset, self).__init__()
        self.data_path = os.path.join(data_path, '%s_data.h5'%subset)
        self.subset = subset

        # open read-only; the files are never modified
        with h5py.File(self.data_path, 'r') as f:
            self.data = np.array(f['data'])

        if subset != 'test':
            self.label_path = os.path.join(data_path, '%s_label.h5'%subset)
            with h5py.File(self.label_path, 'r') as f:
                self.label = np.array(f['label'])

        self.num_sequences = self.data.shape[0]
        self.seq_len = self.data.shape[1]
        self.n_dim = self.data.shape[2]

    def __getitem__(self, index):
        seq = self.data[index]
        if self.subset != 'test':
            label = int(self.label[index])
            sample = {'seq': seq, 'label': label}
        else:
            sample = {'seq': seq}
        return sample

    def __len__(self):
        return self.num_sequences

trSet = Dataset('./hw5/data', subset='train')
valSet = Dataset('./hw5/data', subset='val')
tstSet = Dataset('./hw5/data', subset='test')

batch_size = 50
trLD = DD.DataLoader(trSet, batch_size=batch_size,
       sampler=DD.sampler.RandomSampler(trSet),
       num_workers=2, pin_memory=False)
valLD = DD.DataLoader(valSet, batch_size=batch_size,
       sampler=DD.sampler.SequentialSampler(valSet),
       num_workers=1, pin_memory=False)
tstLD = DD.DataLoader(tstSet, batch_size=batch_size,
       sampler=DD.sampler.SequentialSampler(tstSet),
       num_workers=1, pin_memory=False)

input_dim = trSet.n_dim
num_class = 10

In [0]:
from collections import OrderedDict

## Model
PyTorch has implemented different types of recurrent layers for you. For this homework, you can use any type of RNN you want:
> torch.nn.RNN()

> torch.nn.LSTM()

> torch.nn.GRU()

You can check the details of the different types of recurrent layers here: [RNN](http://pytorch.org/docs/master/nn.html#torch.nn.RNN), [LSTM](http://pytorch.org/docs/master/nn.html#torch.nn.LSTM), [GRU](http://pytorch.org/docs/master/nn.html#torch.nn.GRU)


### Implement a specific model
In this section, you need to implement a model for sequence classification. The model has the following layers:
* A 1-layer LSTM with an input size of 75 and a hidden size of 100
* A linear layer that goes from 100 to num_class (10).

With `batch_first=True`, an LSTM layer takes an input of shape (batch_size, seq_len, fea_dim) and outputs a tensor of shape (batch_size, seq_len, hidden_size). In this homework, the classification score for a sequence is the classification score at the last step of rnn_outputs.
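
The "last step" rule can be sketched with plain NumPy. Here `rnn_outputs` is a random stand-in for what the LSTM would return, and `W` and `b` are hypothetical weights for the 100-to-10 linear layer; only the indexing and the resulting shapes are the point of the example.

```python
import numpy as np

batch_size, seq_len, hidden_size, num_class = 4, 15, 100, 10
rng = np.random.RandomState(0)

# Random stand-in for the LSTM output: one hidden vector per time step
rnn_outputs = rng.randn(batch_size, seq_len, hidden_size)

# Hypothetical linear-layer parameters (hidden_size -> num_class)
W = rng.randn(num_class, hidden_size) * 0.01
b = np.zeros(num_class)

last_step = rnn_outputs[:, -1]        # [batch_size, hidden_size]
logits = np.dot(last_step, W.T) + b   # [batch_size, num_class]
predictions = logits.argmax(axis=1)   # one predicted class per sequence

print(last_step.shape, logits.shape, predictions.shape)
```

In the PyTorch model, the same `[:, -1]` indexing is applied to the tensor returned by the recurrent layer before it is passed to the linear layer.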



In [0]:
# sequence classification model
class SequenceClassify(nn.Module):
    def __init__(self):
        super(SequenceClassify, self).__init__()
        
        ############## 1st To Do (20 points) ##############
        ###################################################
             
        self.recurrent_layer = nn.LSTM(75,120,2,batch_first=True,dropout=0.3)
        self.Relu=nn.ReLU()
        self.project_layer = nn.Linear(120,10)
        ###################################################
    
    # the size of input is [batch_size, seq_len(15), input_dim(75)]
    # the size of logits is [batch_size, num_class]
    def forward(self, input, h_t_1=None, c_t_1=None):
        # the size of rnn_outputs is [batch_size, seq_len, rnn_size]
        rnn_outputs, (hn, cn) = self.recurrent_layer(input)
        # classify the last step of rnn_outputs
        # the size of logits is [batch_size, num_class]
        rnn_outputs = self.Relu(rnn_outputs)
        logits = self.project_layer(rnn_outputs[:, -1])
        return logits

model = SequenceClassify()

## Train the model
After you have the dataloader and model, you can start training the model. Define an SGD optimizer with a learning rate of 1e-2, and a cross-entropy loss function:

In [0]:
################ 2nd To Do  (5 points)##################
optimizer = torch.optim.SGD(model.parameters(),lr = 1e-1)
criterion = nn.CrossEntropyLoss()
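
For intuition about what the loss and the optimizer compute, the following NumPy sketch evaluates a softmax cross-entropy loss and applies one plain SGD update to a single linear layer. It is a simplified illustration under stated assumptions, not PyTorch's implementation: the features and labels are random, the helper names are made up for this example, and the learning rate of 1e-2 is the value suggested in the text above.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels given unnormalized logits."""
    probs = softmax(logits)
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

def cross_entropy_grad(logits, labels):
    """Gradient of the mean cross-entropy with respect to the logits."""
    grad = softmax(logits)
    grad[np.arange(len(labels)), labels] -= 1.0
    return grad / len(labels)

rng = np.random.RandomState(0)
features = rng.randn(8, 100)         # stand-in for last-step LSTM outputs
labels = rng.randint(0, 10, size=8)  # random class indices in [0, 10)
W = np.zeros((100, 10))              # hypothetical linear-layer weights
lr = 1e-2

loss_before = cross_entropy(np.dot(features, W), labels)
grad_W = np.dot(features.T, cross_entropy_grad(np.dot(features, W), labels))
W -= lr * grad_W                     # one SGD step: move against the gradient
loss_after = cross_entropy(np.dot(features, W), labels)

print(loss_before, loss_after)
```

With all-zero weights every class gets probability 1/10, so the starting loss is log(10); the update then lowers the loss on this batch.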

In [62]:
# run the model for one epoch
# can be used for both training and validation
def run_epoch(data_loader, model, criterion, epoch, is_training, optimizer=None):
    if is_training:
        model.train()
        logger_prefix = 'train'
    else:
        model.eval()
        logger_prefix = 'val'

    confusion_matrix = tnt.meter.ConfusionMeter(num_class)
    acc = tnt.meter.ClassErrorMeter(accuracy=True)
    meter_loss = tnt.meter.AverageValueMeter()

    for batch_idx, sample in enumerate(data_loader):
        sequence = sample['seq']
        label = sample['label']
        input_sequence_var = Variable(sequence)  
        input_label_var = Variable(label)
        

        # compute output
        # output_logits: [batch_size, num_class]
        output_logits = model(input_sequence_var)
        loss = criterion(output_logits, input_label_var)

        if is_training:
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        meter_loss.add(loss.item())  # .item() extracts the Python float from the 0-dim loss tensor
        acc.add(output_logits.data, input_label_var.data)
        confusion_matrix.add(output_logits.data, input_label_var.data)


    print('%s Epoch: %d  , Loss: %.4f,  Accuracy: %.2f'%(logger_prefix, epoch, meter_loss.value()[0], acc.value()[0]))
    return meter_loss.value()[0]

num_epochs = 301
evaluate_every_epoch = 5
train_loss=[]
validation_loss=[]
for e in range(num_epochs):
    loss=run_epoch(trLD, model, criterion, e, True, optimizer)
    train_loss.append(loss)
    if e % evaluate_every_epoch == 0:
        val_loss=run_epoch(valLD, model, criterion, e, False, None)  
        validation_loss.append(val_loss)




train Epoch: 0  , Loss: 0.0521,  Accuracy: 98.35
val Epoch: 0  , Loss: 1.0808,  Accuracy: 78.75
train Epoch: 1  , Loss: 0.0836,  Accuracy: 97.10
train Epoch: 2  , Loss: 0.0548,  Accuracy: 98.17
train Epoch: 3  , Loss: 0.0665,  Accuracy: 97.82
train Epoch: 4  , Loss: 0.0370,  Accuracy: 98.88
train Epoch: 5  , Loss: 0.0523,  Accuracy: 98.58
val Epoch: 5  , Loss: 1.1062,  Accuracy: 79.88
train Epoch: 6  , Loss: 0.0627,  Accuracy: 98.20
train Epoch: 7  , Loss: 0.0485,  Accuracy: 98.72
train Epoch: 8  , Loss: 0.0451,  Accuracy: 98.58
train Epoch: 9  , Loss: 0.0241,  Accuracy: 99.58
train Epoch: 10  , Loss: 0.0199,  Accuracy: 99.72
val Epoch: 10  , Loss: 1.0870,  Accuracy: 80.88
train Epoch: 11  , Loss: 0.0177,  Accuracy: 99.85
train Epoch: 12  , Loss: 0.0691,  Accuracy: 97.75
train Epoch: 13  , Loss: 0.0401,  Accuracy: 98.80
train Epoch: 14  , Loss: 0.2154,  Accuracy: 93.97
train Epoch: 15  , Loss: 0.0542,  Accuracy: 98.50
val Epoch: 15  , Loss: 1.1089,  Accuracy: 79.12
...

train Epoch: 52  , Loss: 0.0516,  Accuracy: 98.40
train Epoch: 53  , Loss: 0.0858,  Accuracy: 97.28
train Epoch: 54  , Loss: 0.2269,  Accuracy: 92.58
train Epoch: 55  , Loss: 0.0903,  Accuracy: 97.17
val Epoch: 55  , Loss: 1.1485,  Accuracy: 78.38
train Epoch: 56  , Loss: 0.0517,  Accuracy: 98.50
train Epoch: 57  , Loss: 0.0420,  Accuracy: 99.10
train Epoch: 58  , Loss: 0.1815,  Accuracy: 94.80
train Epoch: 59  , Loss: 0.2017,  Accuracy: 94.27
train Epoch: 60  , Loss: 0.1034,  Accuracy: 96.85
val Epoch: 60  , Loss: 1.1344,  Accuracy: 76.88
train Epoch: 61  , Loss: 0.1340,  Accuracy: 95.30
train Epoch: 62  , Loss: 0.0420,  Accuracy: 98.85
train Epoch: 63  , Loss: 0.0930,  Accuracy: 97.20
train Epoch: 64  , Loss: 0.0347,  Accuracy: 99.02
train Epoch: 65  , Loss: 0.0510,  Accuracy: 98.55
val Epoch: 65  , Loss: 1.1007,  Accuracy: 78.00
train Epoch: 66  , Loss: 0.0209,  Accuracy: 99.62
train Epoch: 67  , Loss: 0.0270,  Accuracy: 99.28
train Epoch: 68  , Loss: 0.0255,  Accuracy: 99.33
...

train Epoch: 104  , Loss: 0.0154,  Accuracy: 99.70
train Epoch: 105  , Loss: 0.0102,  Accuracy: 99.85
val Epoch: 105  , Loss: 1.1451,  Accuracy: 79.62
train Epoch: 106  , Loss: 0.0432,  Accuracy: 98.85
train Epoch: 107  , Loss: 0.0185,  Accuracy: 99.55
train Epoch: 108  , Loss: 0.0201,  Accuracy: 99.48
train Epoch: 109  , Loss: 0.0151,  Accuracy: 99.62
train Epoch: 110  , Loss: 0.0208,  Accuracy: 99.33
val Epoch: 110  , Loss: 1.3168,  Accuracy: 77.12
train Epoch: 111  , Loss: 0.0116,  Accuracy: 99.85
train Epoch: 112  , Loss: 0.0134,  Accuracy: 99.75
train Epoch: 113  , Loss: 0.0155,  Accuracy: 99.55
train Epoch: 114  , Loss: 0.0324,  Accuracy: 99.00
train Epoch: 115  , Loss: 0.0177,  Accuracy: 99.52
val Epoch: 115  , Loss: 1.1610,  Accuracy: 79.12
train Epoch: 116  , Loss: 0.0091,  Accuracy: 99.90
train Epoch: 117  , Loss: 0.0096,  Accuracy: 99.88
train Epoch: 118  , Loss: 0.0113,  Accuracy: 99.78
train Epoch: 119  , Loss: 0.0181,  Accuracy: 99.58
train Epoch: 120  , Loss: 0.0140
...

val Epoch: 155  , Loss: 1.2623,  Accuracy: 78.75
train Epoch: 156  , Loss: 0.0165,  Accuracy: 99.45
train Epoch: 157  , Loss: 0.0176,  Accuracy: 99.58
train Epoch: 158  , Loss: 0.0102,  Accuracy: 99.83
train Epoch: 159  , Loss: 0.0139,  Accuracy: 99.67
train Epoch: 160  , Loss: 0.0418,  Accuracy: 98.90
val Epoch: 160  , Loss: 1.2122,  Accuracy: 79.12
train Epoch: 161  , Loss: 0.0113,  Accuracy: 99.75
train Epoch: 162  , Loss: 0.0109,  Accuracy: 99.80
train Epoch: 163  , Loss: 0.0575,  Accuracy: 98.58
train Epoch: 164  , Loss: 0.0693,  Accuracy: 97.38
train Epoch: 165  , Loss: 0.0146,  Accuracy: 99.60
val Epoch: 165  , Loss: 1.1878,  Accuracy: 79.12
train Epoch: 166  , Loss: 0.0246,  Accuracy: 99.25
train Epoch: 167  , Loss: 0.0362,  Accuracy: 98.98
train Epoch: 168  , Loss: 0.0160,  Accuracy: 99.65
train Epoch: 169  , Loss: 0.0365,  Accuracy: 98.95
train Epoch: 170  , Loss: 0.0168,  Accuracy: 99.62
val Epoch: 170  , Loss: 1.1993,  Accuracy: 78.62
train Epoch: 171  , Loss: 0.0358
...

train Epoch: 207  , Loss: 0.0244,  Accuracy: 99.40
train Epoch: 208  , Loss: 0.0131,  Accuracy: 99.70
train Epoch: 209  , Loss: 0.0213,  Accuracy: 99.42
train Epoch: 210  , Loss: 0.0175,  Accuracy: 99.50
val Epoch: 210  , Loss: 1.3078,  Accuracy: 78.50
train Epoch: 211  , Loss: 0.0135,  Accuracy: 99.67
train Epoch: 212  , Loss: 0.0240,  Accuracy: 99.28
train Epoch: 213  , Loss: 0.0176,  Accuracy: 99.62
train Epoch: 214  , Loss: 0.0200,  Accuracy: 99.30
train Epoch: 215  , Loss: 0.0083,  Accuracy: 99.88
val Epoch: 215  , Loss: 1.2824,  Accuracy: 78.75
train Epoch: 216  , Loss: 0.0063,  Accuracy: 99.90
train Epoch: 217  , Loss: 0.0138,  Accuracy: 99.60
train Epoch: 218  , Loss: 0.0360,  Accuracy: 99.10
train Epoch: 219  , Loss: 0.0077,  Accuracy: 99.90
train Epoch: 220  , Loss: 0.0080,  Accuracy: 99.92
val Epoch: 220  , Loss: 1.2953,  Accuracy: 79.38
train Epoch: 221  , Loss: 0.0086,  Accuracy: 99.88
train Epoch: 222  , Loss: 0.0090,  Accuracy: 99.80
train Epoch: 223  , Loss: 0.0056
...

train Epoch: 259  , Loss: 0.0134,  Accuracy: 99.67
train Epoch: 260  , Loss: 0.0259,  Accuracy: 99.35
val Epoch: 260  , Loss: 1.2178,  Accuracy: 80.75
train Epoch: 261  , Loss: 0.0118,  Accuracy: 99.75
train Epoch: 262  , Loss: 0.0130,  Accuracy: 99.72
train Epoch: 263  , Loss: 0.0105,  Accuracy: 99.78
train Epoch: 264  , Loss: 0.0131,  Accuracy: 99.80
train Epoch: 265  , Loss: 0.0064,  Accuracy: 99.92
val Epoch: 265  , Loss: 1.2354,  Accuracy: 79.50
train Epoch: 266  , Loss: 0.0059,  Accuracy: 99.90
train Epoch: 267  , Loss: 0.0167,  Accuracy: 99.58
train Epoch: 268  , Loss: 0.0088,  Accuracy: 99.90
train Epoch: 269  , Loss: 0.0084,  Accuracy: 99.80
train Epoch: 270  , Loss: 0.0063,  Accuracy: 99.90
val Epoch: 270  , Loss: 1.2645,  Accuracy: 79.25
train Epoch: 271  , Loss: 0.0072,  Accuracy: 99.88
train Epoch: 272  , Loss: 0.0046,  Accuracy: 99.95
train Epoch: 273  , Loss: 0.0054,  Accuracy: 99.98
train Epoch: 274  , Loss: 0.0125,  Accuracy: 99.55
train Epoch: 275  , Loss: 0.0137
...

In [0]:
import matplotlib.pyplot as plt

# one training loss value is recorded per epoch
train_epochs = np.arange(len(train_loss))
plt.xlabel('Epoch')
plt.ylabel('Train Loss')
plt.title('Training Loss')
plt.plot(train_epochs, train_loss)
plt.show()

# validation loss is recorded once every evaluate_every_epoch epochs
val_epochs = np.arange(len(validation_loss)) * evaluate_every_epoch
plt.xlabel('Epoch')
plt.ylabel('Validation Loss')
plt.title('Validation Loss')
plt.plot(val_epochs, validation_loss)
plt.show()

## Submit your results on Kaggle

### Train a better model for action recognition!
Now it's your job to experiment with architectures, hyperparameters, loss functions, and optimizers to train a model that achieves better accuracy on the action recognition validation set.


### Test the model and submit on Kaggle
Run the model on the test set and save the results as a .csv file. 
Please submit the results.csv file generated by predict_on_test() to [Kaggle](https://www.kaggle.com/c/cse512springhw5) to see how well your network performs on the test set. 
################ 3rd To Do  (30 points, the highest 3 entries get extra 10 points) ###############
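
The submission file used in this notebook has an `Id,Class` header followed by one row per test sequence, with Ids counted from 0. The sketch below writes that layout with the standard `csv` module. The helper name `write_submission` and the three predictions are made up for illustration, and the output goes to an in-memory buffer rather than a real file.

```python
import csv
import io

def write_submission(predictions, out_file):
    """Write an 'Id,Class' header and one row per prediction."""
    writer = csv.writer(out_file, lineterminator='\n')
    writer.writerow(['Id', 'Class'])
    for idx, cls in enumerate(predictions):
        writer.writerow([idx, int(cls)])

# Made-up predictions for three sequences
buffer = io.StringIO()
write_submission([3, 0, 7], buffer)
print(buffer.getvalue())
```

To write an actual file, pass a file object opened for writing in place of `buffer`.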


In [73]:
# Use your best model to generate results on test set.

# generate csv file for test set
def predict_on_test(model, data_loader):
    model.eval() # Put the model in test mode (the opposite of model.train(), essentially)
    count = 0
    # the with-block closes the file even if prediction fails midway
    with open('results_tejas_new_6.csv', 'w') as results:
        results.write('Id,Class\n')
        for batch_idx, sample in enumerate(data_loader):
            sequence = sample['seq']
            input_sequence_var = Variable(sequence)
            scores = model(input_sequence_var)
            # index of the highest-scoring class for each sequence in the batch
            _, preds = scores.data.max(1)
            for i in range(len(preds)):
                # int() converts the 0-dim tensor to a plain Python integer
                results.write('%d,%d\n' % (count, int(preds[i])))
                count += 1
    return count

count=predict_on_test(model, tstLD)
print(count)

1000


In [0]:
from google.colab import files

files.download('results_tejas_new_6.csv')

## Report the performance
################ 4th To Do  (15 points)##################

### Documentation of what you did
In this cell, you should write an explanation of what you did (network architecture, optimizer, learning rate, epochs) and include visualizations or graphs of the loss/accuracy curves from training and evaluation.

I tried different architectures and tuned their parameters to get the best accuracy.
My best architecture is as follows:

1) Single LSTM layer with input size of 75, hidden size 120 and with dropout of 0.3
2) ReLU Activation Layer
3) Linear Layer
4) ReLU Activation Layer
5) Linear Layer

I tried several optimizers, including Adam and SGD, and found SGD to be the best. I experimented with increasing and decreasing the learning rate and found 1e-1 to be the best learning rate.

I experimented with 100, 200, 300, 400, and 500 epochs. 300 epochs gave me the best accuracy; beyond 300 epochs the model starts overfitting.
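
One way to make the choice of epoch count less manual is to pick the epoch by its validation metric. The sketch below is only an illustration of that idea, not the procedure used for the results above: the tuples are copied from the first four validation lines of the training log earlier in this notebook, and the variable names are made up for the example.

```python
# (epoch, validation loss, validation accuracy) from the training log above
val_log = [
    (0, 1.0808, 78.75),
    (5, 1.1062, 79.88),
    (10, 1.0870, 80.88),
    (15, 1.1089, 79.12),
]

best_by_acc = max(val_log, key=lambda row: row[2])   # highest accuracy
best_by_loss = min(val_log, key=lambda row: row[1])  # lowest loss

print('best accuracy: epoch %d (%.2f)' % (best_by_acc[0], best_by_acc[2]))
print('lowest loss:   epoch %d (%.4f)' % (best_by_loss[0], best_by_loss[1]))
```

The two criteria can disagree, as they do on these four entries, so it helps to decide up front which metric selects the final model.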



### Performance on Kaggle
You should also report your Kaggle performance here: 

0.794