# Action Recognition @ UCF101  
**Due date: 11:59 pm on Nov. 19, 2019 (Tuesday)**

## Description
---
In this homework, you will be doing action recognition using Recurrent Neural Network (RNN), (Long-Short Term Memory) LSTM in particular. You will be given a dataset called UCF101, which consists of 101 different actions/classes and for each action, there will be 145 samples. We tagged each sample into either training or testing. Each sample is supposed to be a short video, but we sampled 25 frames from each videos to reduce the amount of data. Consequently, a training sample is an image tuple that forms a 3D volume with one dimension encoding *temporal correlation* between frames and a label indicating what action it is.

To tackle this problem, we aim to build a neural network that can not only capture spatial information of each frame but also temporal information between frames. Fortunately, you don't have to do this on your own. RNN — a type of neural network designed to deal with time-series data — is right here for you to use. In particular, you will be using LSTM for this task.

Instead of training an end-to-end neural network from scratch whose computation is prohibitively expensive, we divide this into two steps: feature extraction and modelling. Below are the things you need to implement for this homework:
- **{35 pts} Feature extraction**. Use any of the [pre-trained models](https://pytorch.org/docs/stable/torchvision/models.html) to extract features from each frame. Specifically, we recommend not to use the activations of the last layer as the features tend to be task specific towards the end of the network. 
    **hints**: 
    - A good starting point would be to use a pre-trained VGG16 network, we suggest first fully connected layer `torchvision.models.vgg16` (4096 dim) as features of each video frame. This will result into a 4096x25 matrix for each video. 
    - Normalize your images using `torchvision.transforms` 
    ```
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    prep = transforms.Compose([ transforms.ToTensor(), normalize ])
    prep(img)
    The mean and std. mentioned above is specific to Imagenet data
    
    ```
    More details of image preprocessing in PyTorch can be found at http://pytorch.org/tutorials/beginner/data_loading_tutorial.html
    
- **{35 pts} Modelling**. With the extracted features, build an LSTM network which takes a **dx25** sample as input (where **d** is the dimension of the extracted feature for each frame), and outputs the action label of that sample.
- **{20 pts} Evaluation**. After training your network, you need to evaluate your model with the testing data by computing the prediction accuracy **(5 points)**. The baseline test accuracy for this data is 75%, and **10 points** out of 20 is for achieving test accuracy greater than the baseline. Moreover, you need to compare **(5 points)** the result of your network with that of support vector machine (SVM) (stacking the **dx25** feature matrix to a long vector and train a SVM).
- **{10 pts} Report**. Details regarding the report can be found in the submission section below.

Notice that the size of the raw images is 256x340, whereas your pre-trained model might take **nxn** images as inputs. To solve this problem, instead of resizing the images which unfavorably changes the spatial ratio, we take a better solution: Cropping five **nxn** images, one at the image center and four at the corners and compute the **d**-dim features for each of them, and average these five **d**-dim feature to get a final feature representation for the raw image.
For example, VGG takes 224x224 images as inputs, so we take the five 224x224 croppings of the image, compute 4096-dim VGG features for each of them, and then take the mean of these five 4096-dim vectors to be the representation of the image.

In order to save you computational time, you need to do the classification task only for **the first 25** classes of the whole dataset. The same applies to those who have access to GPUs. **Bonus 10 points for running and reporting on the entire 101 classes.**


## Dataset
Download **dataset** at [UCF101](http://vision.cs.stonybrook.edu/~yangwang/public/UCF101_images.tar)(Image data for each video) and the **annos folder** which has the video labels and the label to class name mapping is included in the assignment folder uploaded. 


UCF101 dataset contains 101 actions and 13,320 videos in total.  

+ `annos/actions.txt`  
  + lists all the actions (`ApplyEyeMakeup`, .., `YoYo`)   
  
+ `annots/videos_labels_subsets.txt`  
  + lists all the videos (`v_000001`, .., `v_013320`)  
  + labels (`1`, .., `101`)  
  + subsets (`1` for train, `2` for test)  

+ `images/`  
  + each folder represents a video
  + the video/folder name to class mapping can be found using `annots/videos_labels_subsets.txt`, for e.g. `v_000001` belongs to class 1 i.e. `ApplyEyeMakeup`
  + each video folder contains 25 frames  



## Some Tutorials
- Good materials for understanding RNN and LSTM
    - http://blog.echen.me
    - http://karpathy.github.io/2015/05/21/rnn-effectiveness/
    - http://colah.github.io/posts/2015-08-Understanding-LSTMs/
- Implementing RNN and LSTM with PyTorch
    - [LSTM with PyTorch](http://pytorch.org/tutorials/beginner/nlp/sequence_models_tutorial.html#sphx-glr-beginner-nlp-sequence-models-tutorial-py)
    - [RNN with PyTorch](http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html)

In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/gdrive


In [0]:
cd '/content/gdrive/My Drive/Chintapalli_Saketh_112686022_hw5'

/content/gdrive/My Drive/Chintapalli_Saketh_112686022_hw5


In [0]:
!wget 'http://vision.cs.stonybrook.edu/~yangwang/public/UCF101_images.tar'

--2019-11-18 01:16:04--  http://vision.cs.stonybrook.edu/~yangwang/public/UCF101_images.tar
Resolving vision.cs.stonybrook.edu (vision.cs.stonybrook.edu)... 130.245.4.232
Connecting to vision.cs.stonybrook.edu (vision.cs.stonybrook.edu)|130.245.4.232|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8658247680 (8.1G) [application/x-tar]
Saving to: ‘UCF101_images.tar’


2019-11-18 01:20:18 (32.6 MB/s) - ‘UCF101_images.tar’ saved [8658247680/8658247680]



In [0]:
!tar -xkf './UCF101_images.tar' 2>/dev/null

---
---
## **Problem 1.** Feature extraction

In [0]:
import pandas as pd
import glob
import cv2
import torch
import torchvision
import gc
import torchvision.models as models
import torchvision.transforms as transforms
import torch.nn.functional as F
import torch.nn as nn
import time
import numpy as np
import torch.optim as optim

In [0]:
df = pd.read_csv('/content/gdrive/My Drive/Chintapalli_Saketh_112686022_hw5/annos/videos_labels_subsets.txt', sep = '\t', header=None)

In [0]:
df.columns = ['folder', 'label', 'istrain']
df.head(5)

Unnamed: 0,folder,label,istrain
0,v_000001,1,2
1,v_000002,1,2
2,v_000003,1,2
3,v_000004,1,2
4,v_000005,1,2


In [0]:
train_feat = []
test_feat = []
train_data = []
test_data = []
vgg16 = models.vgg16(pretrained=True)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
vgg16 = vgg16.to(device)
count = 0
for row in df.iterrows():
  if count <=976:
    count +=1
    continue
  train_data = []
  test_data = []
  if row[1]['label'] > 25:
    break
  print("folder number: " + str(count+1))
  for f in glob.glob("./images/" + str(row[1]['folder']) + "/*.jpg"):
    im = cv2.imread(f)
    im = cv2.resize(im,(224,224))
    if row[1]['istrain'] == 2:
      test_data.append(im)
    elif row[1]['istrain'] ==1:
      train_data.append(im)
  if row[1]['istrain'] ==2:
    t = test_data[0:25]
  elif row[1]['istrain'] ==1:
    t = train_data[0:25]
  t = np.reshape(t, (25, 3, 224, 224))
  normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
  prep = transforms.Compose([ transforms.ToTensor(), normalize])
  for i in range(0,25):
    t[i] = np.reshape(prep(np.reshape(t[i],(224,224,3))), (3,224,224))
  t = torch.from_numpy(t)
  t = t.float()
  t = t.to(device)
  avg = F.adaptive_avg_pool2d(vgg16.features[:31](t), (7,7))
  if row[1]['istrain'] ==2:
    torch.save(vgg16.classifier[0](avg.view(avg.size(0), -1)), str('./test_tensors/' + str(count +1) + '.pt'))
    #test_feat.append(vgg16.classifier[0](avg.view(avg.size(0), -1)))
  elif row[1]['istrain'] ==1:
    torch.save(vgg16.classifier[0](avg.view(avg.size(0), -1)).cpu(), str('./train_tensors/' + str(count +1) + '.pt'))
    #train_feat.append(vgg16.classifier[0](avg.view(avg.size(0), -1))) 
  del(train_data)
  del(test_data)
  del(t)
  del(avg)
  count +=1

folder number: 978
folder number: 979
folder number: 980
folder number: 981
folder number: 982
folder number: 983
folder number: 984
folder number: 985
folder number: 986
folder number: 987
folder number: 988
folder number: 989
folder number: 990
folder number: 991
folder number: 992
folder number: 993
folder number: 994
folder number: 995
folder number: 996
folder number: 997
folder number: 998
folder number: 999
folder number: 1000
folder number: 1001
folder number: 1002
folder number: 1003
folder number: 1004
folder number: 1005
folder number: 1006
folder number: 1007
folder number: 1008
folder number: 1009
folder number: 1010
folder number: 1011
folder number: 1012
folder number: 1013
folder number: 1014
folder number: 1015
folder number: 1016
folder number: 1017
folder number: 1018
folder number: 1019
folder number: 1020
folder number: 1021
folder number: 1022
folder number: 1023
folder number: 1024
folder number: 1025
folder number: 1026
folder number: 1027
folder number: 1028
fo

In [0]:
# test_data = []
# test_data.append(im[16:240, 58:282])
# test_data.append(im[0:224, 0:224])
# test_data.append(im[0:224, 116:340])
# test_data.append(im[32:256, 0:224])
# test_data.append(im[32:256, 116:340])
# vgg16 = models.vgg16(pretrained=True)
# t = test_data[0:5]
# t = np.reshape(t, (5, 3, 224, 224))
# normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
# prep = transforms.Compose([ transforms.ToTensor(), normalize])
# for i in range(0,5):
#     t[i] = np.reshape(prep(np.reshape(t[i],(224,224,3))), (3,224,224))
# t = torch.from_numpy(t)
# t = t.float()
# bob = vgg16.features[:31](t)
# avg = F.adaptive_avg_pool2d(bob, (7,7))
# avg = avg.view(avg.size(0), -1)
# final = vgg16.classifier[0](avg)
# mean = torch.mean(final,0)

I was not able to include the cropped images in my training data because it kept crashing my RAM and environment(tried both in Colab and local machine)

In [0]:
train_labels = []
test_labels = []
for row in df.iterrows():
  if row[1]['label'] > 25:
    break
  if row[1]['istrain'] == 2:
    test_labels.append(int(row[1]['label']))
  elif row[1]['istrain'] == 1:
    train_labels.append(int(row[1]['label']))

***
***
## **Problem 2.** Modelling

* ##### **Print the size of your training and test data**

In [0]:
# Don't hardcode the shape of train and test data
train_feat = torch.tensor([])
test_feat = torch.tensor([])
for f in glob.glob('./train_tensors/*.pt'):
  t = torch.load(f)
  train_feat = torch.cat((train_feat,t))
for f in glob.glob('./test_tensors/*.pt'):
  t = torch.load(f)
  test_feat = torch.cat((test_feat,t.cpu()))
test_feat = test_feat.reshape((len(test_labels),25,4096))
train_feat = train_feat.reshape((len(train_labels),25,4096))
print('Shape of training data is :', train_feat.shape)
print('Shape of test/validation data is :', test_feat.shape)

Shape of training data is : torch.Size([2409, 25, 4096])
Shape of test/validation data is : torch.Size([951, 25, 4096])


In [0]:
# train_labels = np.asarray(train_labels)
# temp = train_feat.data.numpy()
# assert len(train_labels) == len(temp)
# p = np.random.permutation(len(train_labels))
# temp  = temp[p]
# train_labels_shuffled = train_labels[p]

# test_labels = np.asarray(test_labels)
# temp2 = test_feat.data.numpy()
# assert len(test_labels) == len(temp2)
# p = np.random.permutation(len(test_labels))
# temp2 = temp2[p]
# test_labels_shuffled = test_labels[p]

# train_feat_shuffled = torch.from_numpy(temp)
# test_feat_shuffled = torch.from_numpy(temp2)
# train_labels_shuffled = torch.from_numpy(train_labels_shuffled)
# test_labels_shuffled = torch.from_numpy(test_labels_shuffled)


In [0]:
train_labels = torch.from_numpy(np.asarray(train_labels))
test_labels = torch.from_numpy(np.asarray(test_labels))

**Method 1: Simple LSTM Network**

In [0]:
# \*write your codes for modelling using the extracted feature (You can use multiple cells, this is just a place holder)
class lstm(nn.Module):
  def __init__(self, hidden_dim, batch_size, target_size):
    super(lstm, self).__init__()
    self.hidden_dim = hidden_dim
    self.batch_size = batch_size
    self.lstm = nn.LSTM(25*4096, hidden_dim)
    self.fc1 = nn.Linear(hidden_dim, target_size)
    self.hidden = self.init_hidden()
    
  def init_hidden(self):
    return (torch.zeros((1,self.batch_size,self.hidden_dim)),
            torch.zeros((1,self.batch_size,self.hidden_dim)))
  def forward(self,x):
    x = x.view(self.batch_size,-1)
    # x = x.reshape((1,1,102400))
    x,self.hidden = self.lstm(x.view(1,self.batch_size,-1),self.hidden)
    x = self.fc1(x.view(self.batch_size,-1))
    x = F.log_softmax(x,dim = -1)
    return x
  

In [0]:
batch_size = 20
model = lstm(256,batch_size,2)
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.1)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
#model = model.to(device)
loss = 0
epochs =25

for epoch in range(epochs):
  count = 0
  epoch_loss = 0.0
  for i in range(9):
    #print(i)
    x = train_feat[count:count+batch_size]
    #x = x.to(device)
    y = train_labels[count:count+batch_size]
    y = y.view(-1)
    y = y - 1
    # y = y.to(device)
    model.zero_grad()
    model.hidden = model.init_hidden()
    out = model(x)
    loss = criterion(out,y)
    loss.backward()
    optimizer.step()
    epoch_loss += loss.item()
    gc.collect()
    count += batch_size
  print('avg epoch loss: ', epoch_loss)


avg epoch loss:  61.688423432166566
avg epoch loss:  22.920334711670876
avg epoch loss:  7.272368974983692
avg epoch loss:  32.20755386352539
avg epoch loss:  4.211898870766163
avg epoch loss:  11.427648827433586
avg epoch loss:  8.653375387191772
avg epoch loss:  6.32935793697834
avg epoch loss:  9.079805493354797
avg epoch loss:  8.11868131160736
avg epoch loss:  7.265342622995377
avg epoch loss:  8.250297009944916
avg epoch loss:  7.969094693660736
avg epoch loss:  7.3766858875751495
avg epoch loss:  7.497505336999893
avg epoch loss:  7.569215834140778
avg epoch loss:  7.371934950351715
avg epoch loss:  7.292917251586914
avg epoch loss:  7.753476828336716
avg epoch loss:  7.866509705781937
avg epoch loss:  7.3453089594841
avg epoch loss:  7.300910472869873
avg epoch loss:  6.866195380687714
avg epoch loss:  10.515653371810913
avg epoch loss:  10.302231013774872


I had RAM issues with loading the entire data for 25 classes and the 4096 features using the pretrained VGG. Above is my code to do the same but because of the RAM issues, I was not able to get the desirable output for all 3360 folders of training/testing features. Hence, I decided to move forward with a subset of these folders for my training and testing. However, I have implemented everything else as described by the assignment.

**Method 2: Multiple Linear Layers**

In [0]:
#Method 2
class lstm(nn.Module):
  def __init__(self, hidden_dim, batch_size, target_size):
    super(lstm, self).__init__()
    self.hidden_dim = hidden_dim
    self.batch_size = batch_size
    self.lstm = nn.LSTM(25*4096, hidden_dim)
    self.fc1 = nn.Linear(hidden_dim, 256)
    self.fc2 = nn.Linear(256,target_size)
    self.hidden = self.init_hidden()
    
  def init_hidden(self):
    return (torch.zeros((1,self.batch_size,self.hidden_dim)),
            torch.zeros((1,self.batch_size,self.hidden_dim)))
  def forward(self,x):
    x = x.view(self.batch_size,-1)
    # x = x.reshape((1,1,102400))
    x,self.hidden = self.lstm(x.view(1,self.batch_size,-1),self.hidden)
    x = self.fc1(x.view(self.batch_size,-1))
    x = self.fc2(x.view(self.batch_size,-1))
    x = F.log_softmax(x,dim = -1)
    return x
  

In [0]:
#Method 2
batch_size = 20
model = lstm(512,batch_size,2)
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.0001)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
#model = model.to(device)
loss = 0
epochs =10

for epoch in range(epochs):
  count = 0
  epoch_loss = 0.0
  for i in range(9):
    #print(i)
    x = train_feat[count:count+batch_size]
    #x = x.to(device)
    y = train_labels[count:count+batch_size]
    y = y.view(-1)
    y = y - 1
    # y = y.to(device)
    model.zero_grad()
    model.hidden = model.init_hidden()
    out = model(x)
    loss = criterion(out,y)
    loss.backward()
    optimizer.step()
    epoch_loss += loss.item()
    gc.collect()
    count += batch_size
  print('avg epoch loss: ', epoch_loss)


avg epoch loss:  8.348071217536926
avg epoch loss:  6.6649483144283295
avg epoch loss:  6.188506007194519
avg epoch loss:  6.096600979566574
avg epoch loss:  6.098347544670105
avg epoch loss:  6.1367082595825195
avg epoch loss:  6.153148293495178
avg epoch loss:  6.152450799942017
avg epoch loss:  6.147309482097626
avg epoch loss:  6.162711679935455


**Method 3: Two LSTM network**

In [0]:
#Method 3
class lstm2(nn.Module):
  def __init__(self, hidden_dim, batch_size, target_size):
    super(lstm2, self).__init__()
    self.hidden_dim = hidden_dim
    self.batch_size = batch_size
    self.lstm = nn.LSTM(25*4096, 1024)
    self.lstm2 = nn.LSTM(1024,hidden_dim)
    self.fc1 = nn.Linear(hidden_dim, 256)
    self.fc2 = nn.Linear(256,target_size)
    self.hidden = self.init_hidden()
    self.hidden_lstm2 = self.init_hidden_lstm2()
    
  def init_hidden(self):
    return (torch.zeros((1,self.batch_size,1024)),
            torch.zeros((1,self.batch_size,1024)))
  def init_hidden_lstm2(self):
    return (torch.zeros((1,self.batch_size,self.hidden_dim)),
            torch.zeros((1,self.batch_size,self.hidden_dim)))
  def forward(self,x):
    x = x.view(self.batclh_size,-1)
    # x = x.reshape((1,1,102400))
    x,self.hidden = self.lstm(x.view(1,self.batch_size,-1),self.hidden)
    x,self.hidden_lstm2 = self.lstm2(x.view(1,self.batch_size,-1),self.hidden_lstm2)
    x = self.fc1(x.view(self.batch_size,-1))
    x = self.fc2(x.view(self.batch_size,-1))
    x = F.log_softmax(x,dim = -1)
    return x
  

In [0]:
#Method 3
batch_size = 20
model = lstm2(512,batch_size,2)
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.0001)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
#model = model.to(device)
loss = 0
epochs =10

for epoch in range(epochs):
  count = 0
  epoch_loss = 0.0
  for i in range(9):
    #print(i)
    x = train_feat[count:count+batch_size]
    #x = x.to(device)
    y = train_labels[count:count+batch_size]
    y = y.view(-1)
    y = y - 1
    # y = y.to(device)
    model.zero_grad()
    model.hidden = model.init_hidden()
    model.hidden_lstm2 = model.init_hidden_lstm2()
    out = model(x)
    loss = criterion(out,y)
    loss.backward()
    optimizer.step()
    epoch_loss += loss.item()
    gc.collect()
    count += batch_size
  print('avg epoch loss: ', epoch_loss)


avg epoch loss:  6.941666841506958
avg epoch loss:  6.289087474346161
avg epoch loss:  6.164422810077667
avg epoch loss:  6.153954088687897
avg epoch loss:  6.15695321559906
avg epoch loss:  6.158075571060181
avg epoch loss:  6.156584620475769
avg epoch loss:  6.1533233523368835
avg epoch loss:  6.149765133857727
avg epoch loss:  6.140604496002197


**SVM Modelling**

In [0]:
svm_feat = []
svm_labels = []
count = 0
for i in range(180):
  t = train_feat[i].data.numpy()
  svm_feat.append(t.reshape((25*4096,1)))
  svm_labels.append(train_labels[i].data.numpy().reshape(1))


In [0]:
svm_feat = np.asarray(svm_feat)
svm_feat = svm_feat.reshape(180,102400)
svm_labels = np.asarray(svm_labels)
svm_labels = svm_labels.reshape((180,))


In [0]:
test_svm = []
test_svm_labels = []
for i in range(60):
  feat = test_feat[i].data.numpy()
  test_svm.append(feat.reshape((25*4096,1)))
  test_svm_labels.append(test_labels[i].data.numpy().reshape(1))
test_svm = np.asarray(test_svm)
test_svm = test_svm.reshape((60,102400))
test_svm_labels = np.asarray(test_svm_labels)
test_svm_labels = test_svm_labels.reshape((60,))

In [0]:
from sklearn.svm import SVC
clf = SVC(C = 0.1)
clf.fit(svm_feat, svm_labels)



SVC(C=0.1, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
    kernel='rbf', max_iter=-1, probability=False, random_state=None,
    shrinking=True, tol=0.001, verbose=False)

---
---
## **Problem 3.** Evaluation

* ##### **Print the train and test accuracy of your model** 

In [0]:
PATH = './trained.pth'
torch.save(model.state_dict(), PATH)
net = model
net.load_state_dict(torch.load(PATH))
count = 0
with torch.no_grad():
  total = 0
  correct = 0
  for i in range(3):
    x = test_feat[count:count+batch_size]
    y = test_labels[count:count+batch_size]
    #y = torch.from_numpy(np.asarray(y))
    y = y.view(-1)
    #print(y)
    output = net(x)
    _, predicted = torch.max(output.data,1)
    total = total + y.size(0)
    predicted +=1
    correct = correct  + (predicted == y).sum().item()
    count += batch_size
print("accuracy: ", (100 * (correct/total)))


accuracy:  48.333333333333336


In [0]:
#Method 2
PATH = './trained.pth'
torch.save(model.state_dict(), PATH)
net = model
net.load_state_dict(torch.load(PATH))
count = 0
with torch.no_grad():
  total = 0
  correct = 0
  for i in range(3):
    x = test_feat[count:count+batch_size]
    y = test_labels[count:count+batch_size]
    #y = torch.from_numpy(np.asarray(y))
    y = y.view(-1)
    #print(y)
    output = net(x)
    _, predicted = torch.max(output.data,1)
    total = total + y.size(0)
    predicted +=1
    correct = correct  + (predicted == y).sum().item()
    count += batch_size
print("accuracy: ", (100 * (correct/total)))


accuracy:  73.33333333333333


In [0]:
#Method 3
PATH = './trained.pth'
torch.save(model.state_dict(), PATH)
net = model
net.load_state_dict(torch.load(PATH))
count = 0
with torch.no_grad():
  total = 0
  correct = 0
  for i in range(3):
    x = test_feat[count:count+batch_size]
    y = test_labels[count:count+batch_size]
    #y = torch.from_numpy(np.asarray(y))
    y = y.view(-1)
    #print(y)
    output = net(x)
    _, predicted = torch.max(output.data,1)
    total = total + y.size(0)
    predicted +=1
    correct = correct  + (predicted == y).sum().item()
    count += batch_size
print("accuracy: ", (100 * (correct/total)))


accuracy:  73.33333333333333


* ##### **Print the train and test and test accuracy of SVM** 

In [0]:
from sklearn.metrics import accuracy_score
preds = clf.predict(test_svm)
acc = accuracy_score(test_svm_labels,preds)
train_preds = clf.predict(svm_feat)
acc1 = accuracy_score(svm_labels, train_preds)
print("test accuracy: ",(acc*100))


test accuracy:  73.33333333333333


## **Problem 4.** Report

## **Bonus**


* ##### **Print the size of your training and test data**

In [0]:
# Don't hardcode the shape of train and test data
print('Shape of training data is :', )
print('Shape of test/validation data is :', )

* ##### **Modelling and evaluation**

In [0]:
#Write your code for modelling and evaluation

## Submission
---
**Runnable source code in ipynb file and a pdf report are required**.

The report should be of 3 to 4 pages describing what you have done and learned in this homework and report performance of your model. If you have tried multiple methods, please compare your results. If you are using any external code, please cite it in your report. Note that this homework is designed to help you explore and get familiar with the techniques. The final grading will be largely based on your prediction accuracy and the different methods you tried (different architectures and parameters).

Please indicate clearly in your report what model you have tried, what techniques you applied to improve the performance and report their accuracies. The report should be concise and include the highlights of your efforts.
The naming convention for report is **Surname_Givenname_SBUID_report*.pdf**

When submitting your .zip file through blackboard, please
-- name your .zip file as **Surname_Givenname_SBUID_hw*.zip**.

This zip file should include:
```
Surname_Givenname_SBUID_hw*
        |---Surname_Givenname_SBUID_hw*.ipynb
        |---Surname_Givenname_SBUID_hw*.pdf
        |---Surname_Givenname_SBUID_report*.pdf
```

For instance, student Michael Jordan should submit a zip file named "Jordan_Michael_111134567_hw5.zip" for homework5 in this structure:
```
Jordan_Michael_111134567_hw5
        |---Jordan_Michael_111134567_hw5.ipynb
        |---Jordan_Michael_111134567_hw5.pdf
        |---Jordan_Michael_111134567_report*.pdf
```

The **Surname_Givenname_SBUID_hw*.pdf** should include a **google shared link**. To generate the **google shared link**, first create a folder named **Surname_Givenname_SBUID_hw*** in your Google Drive with your Stony Brook account. 

Then right click this folder, click ***Get shareable link***, in the People textfield, enter two TA's emails: ***bo.cao.1@stonybrook.edu*** and ***sayontan.ghosh@stonybrook.edu***. Make sure that TAs who have the link **can edit**, ***not just*** **can view**, and also **uncheck** the **Notify people** box.

Colab has a good feature of version control, you should take advantage of this to save your work properly. However, the timestamp of the submission made in blackboard is the only one that we consider for grading. To be more specific, we will only grade the version of your code right before the timestamp of the submission made in blackboard. 

You are encouraged to post and answer questions on Piazza. Based on the amount of email that we have received in past years, there might be dealys in replying to personal emails. Please ask questions on Piazza and send emails only for personal issues.

Be aware that your code will undergo plagiarism check both vertically and horizontally. Please do your own work.