**Note to grader:** Each question consists of parts, e.g. Q1(i), Q1(ii), etc. Each part must first be graded on a 0-4 scale, following the standard NJIT convention (A: 4, B+: 3.5, B: 3, C+: 2.5, C: 2, D: 1, F: 0). However, any given item may be worth 4 or 8 points; if an item is worth 8 points, scale the 0-4 grade accordingly.


The total score must be re-scaled to 100. That should apply to all future assignments so that Canvas assigns the same weight on all assignments.
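
The grading arithmetic described above can be sketched as follows. This is only an illustration: the helper names `item_score` and `rescale_to_100` and the example grades are hypothetical, and the item weights (4, 8 and 8 points) are taken from the grader cells of this assignment.

```python
# Letter grades on the 0-4 scale quoted in the note to the grader.
GRADE_POINTS = {"A": 4.0, "B+": 3.5, "B": 3.0, "C+": 2.5, "C": 2.0, "D": 1.0, "F": 0.0}


def item_score(letter, item_worth):
    """Scale a 0-4 letter grade to an item worth `item_worth` points (4 or 8)."""
    return GRADE_POINTS[letter] * item_worth / 4.0


def rescale_to_100(item_scores, max_score):
    """Rescale a raw total so that every assignment is out of 100."""
    return sum(item_scores) * 100.0 / max_score


# Hypothetical grades for three items worth 4, 8 and 8 points.
scores = [item_score("A", 4), item_score("B+", 8), item_score("B", 8)]
total = rescale_to_100(scores, max_score=20)
print(scores, total)
```

With these made-up grades the raw total is 17 out of 20, which rescales to 85 out of 100.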



# Assignment 2



### Preparation Steps




We will work with this [mystery dataset](https://drive.google.com/open?id=1WLnWBThCYZ25pReI5DCwk2bgDaCrJxI_&authuser=ikoutis%40njit.edu&usp=drive_fs), which you can download and place anywhere on your Google Drive. You can then bring it into your Colab by following the steps in the following cell.



In [None]:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Mounted at /content/gdrive


The file contains

* Two matrices $X$ and $X_1$ of numerical features. These datasets have the same dimensions (169343x80) but they are different.
* An array $y$ of labels, ranging from 0-39.
* The indices $otrain$ of a training set. These indices tell you which rows of the arrays $X,X_1,y$ correspond to the training points. You can use these to make two different training sets $(X[otrain], y[otrain])$ and $(X_1[otrain], y[otrain])$
* Similarly, it contains the indices for a validation and a test set, $ovalid$ and $otest$ respectively.

The following cell shows how to access these arrays and assign them to local numpy objects.

In [4]:
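import scipy.io  # import the io submodule explicitly; older SciPy versions do not load it with a bare `import scipy`

mat = scipy.io.loadmat('mysteryDataset.mat')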
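
As a rough sketch of how the stored indices could be used to form the three splits, the snippet below works on a small synthetic dictionary instead of the real file. The key names (`X`, `y`, `otrain`, `ovalid`, `otest`), the 2-D layout of the index arrays, and the helper `split_by_index` are assumptions for illustration. The assignment does not state whether the indices are 0-based or 1-based, so that should be checked on the real data.

```python
import numpy as np

# Synthetic stand-in for the dictionary returned by scipy.io.loadmat.
# Key names and the 2-D shapes of the index arrays are assumptions.
rng = np.random.default_rng(0)
mat_demo = {
    "X": rng.normal(size=(10, 80)),
    "y": rng.integers(0, 40, size=(10, 1)),
    "otrain": np.array([[0, 1, 2, 3, 4, 5]]),
    "ovalid": np.array([[6, 7]]),
    "otest": np.array([[8, 9]]),
}


def split_by_index(mat, feature_key="X"):
    """Return {split_name: (features, labels)} using the stored index arrays."""
    X = mat[feature_key]
    y = mat["y"].ravel()  # (N, 1) -> (N,)
    splits = {}
    for name in ("otrain", "ovalid", "otest"):
        idx = mat[name].ravel().astype(np.int64)  # flatten the 2-D index array
        # If the indices turn out to be 1-based (MATLAB style), subtract 1 here.
        splits[name] = (X[idx], y[idx])
    return splits


splits = split_by_index(mat_demo)
for name, (Xs, ys) in splits.items():
    print(name, Xs.shape, ys.shape)
```

Calling `split_by_index(mat_demo, feature_key="X1")` would build the corresponding splits for the second feature matrix, provided that key exists in the dictionary.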

## <font color = 'blue'> Question 1. Import the dataset and convert to torch tensors </font>

Your task for this question is to adapt the above preparation steps, import all mentioned variables into numpy arrays, and then transform them to PyTorch tensors.


In [5]:
type(mat.get('X'))
X_feature = mat.get('X')
X1_feature = mat.get('X1')
y_labels = mat.get('y')

#### Already numpy arrays. 

In [3]:
type(y_labels)

numpy.ndarray

### Cast to PyTorch tensors. 

In [6]:
import torch
X_feature = torch.tensor(X_feature)
X1_feature = torch.tensor(X1_feature)
y_labels = torch.tensor(y_labels)

In [5]:
print(f"Tensor Shapes:\nX: {X_feature.shape}\nX1: {X1_feature.shape}\ny: {y_labels.shape}")

Tensor Shapes:
X: torch.Size([169343, 80])
X1: torch.Size([169343, 80])
y: torch.Size([169343, 1])
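
The label tensor above has shape `[169343, 1]`. `nn.CrossEntropyLoss` with class-index targets expects a 1-D tensor of integer (`long`) labels, and `nn.Linear` uses `float32` weights by default. The sketch below shows one way to prepare tensors accordingly, using small synthetic arrays; the names `X_np`, `y_np`, `X_t` and `y_t` are illustrative, and the `float64` input dtype is an assumption about how the data is loaded.

```python
import numpy as np
import torch

# Small synthetic arrays with the same layout as the assignment data.
X_np = np.random.rand(5, 80)                 # float64 features (assumed dtype)
y_np = np.array([[0], [3], [39], [7], [1]])  # labels stored as a column

X_t = torch.tensor(X_np, dtype=torch.float32)         # match nn.Linear's default dtype
y_t = torch.tensor(y_np, dtype=torch.long).flatten()  # (N, 1) -> (N,), integer class ids

logits = torch.zeros(5, 40)                  # placeholder model output for 40 classes
loss = torch.nn.CrossEntropyLoss()(logits, y_t)
print(X_t.dtype, y_t.shape, loss.item())
```

With all-zero logits every class gets probability 1/40, so the loss equals ln(40), which is a quick sanity check that the shapes and dtypes are accepted.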


In [None]:
# for grader use only

# insert grade here  (out of 4)

# G[1] =
#
# please justify point subtractions when needed

## <font color = 'blue'> Question 2. Write a functioning classifier in PyTorch </font>

Write code that defines a classification model for the above dataset, and all other functions that are needed for its training. Apply your model on the two datasets $X,X_1$ and report the accuracy. The classifier should operate on the GPU.

**Hint:** Re-use code we discussed for the Softmax Regression module.

In [7]:
from torch import nn

class SoftMaxRegression(nn.Module):
    def __init__(self, in_features=80, n_classes=40):
        super().__init__()
        self.flatten = nn.Flatten()
        # The dataset has 80 features per row and labels 0-39 (40 classes)
        self.linear = nn.Linear(in_features, n_classes)

    def forward(self, x):
        y = self.flatten(x)
        y = self.linear(y)
        return y

In [8]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

model = SoftMaxRegression().to(device)

loss_fn = nn.CrossEntropyLoss()

optimizer = torch.optim.Adam(model.parameters())

In [9]:
def init_weights(m):
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)

model.apply(init_weights)

SoftMaxRegression(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear): Linear(in_features=80, out_features=40, bias=True)
)

#### Split both X and X1 Datasets into Training, Validation, and Test sets. 

In [10]:
from sklearn.model_selection import train_test_split

# X Split (60% train, 20% validation, 20% test)
train_data, temp_data, train_labels, temp_labels = train_test_split(
    X_feature, y_labels, test_size=0.4, random_state=42)

validation_data, test_data, validation_labels, test_labels = train_test_split(
    temp_data, temp_labels, test_size=0.5, random_state=42)

# X1 Split: same proportions and seed, stored under separate names
# so that the X splits above are not overwritten
train_data1, temp_data1, train_labels1, temp_labels1 = train_test_split(
    X1_feature, y_labels, test_size=0.4, random_state=42)

validation_data1, test_data1, validation_labels1, test_labels1 = train_test_split(
    temp_data1, temp_labels1, test_size=0.5, random_state=42)
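
The cells that follow iterate over `train_iter` and `test_iter`, which are never defined in this notebook. A minimal sketch of how such iterators could be built from feature and label tensors is given below. The helper `make_iter`, the batch size of 256, and the synthetic `demo_*` tensors are assumptions for illustration; in the notebook the split tensors would take their place.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_iter(features, labels, batch_size=256, shuffle=False):
    """Wrap feature/label tensors in a DataLoader that yields mini-batches."""
    # float32 features and flat long labels are what nn.Linear and
    # nn.CrossEntropyLoss expect by default.
    dataset = TensorDataset(features.float(), labels.flatten().long())
    return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)


# Synthetic stand-ins for the split tensors (80 features, labels 0-39).
demo_train_x, demo_train_y = torch.randn(1000, 80), torch.randint(0, 40, (1000, 1))
demo_test_x, demo_test_y = torch.randn(200, 80), torch.randint(0, 40, (200, 1))

train_iter = make_iter(demo_train_x, demo_train_y, shuffle=True)
test_iter = make_iter(demo_test_x, demo_test_y)

x_batch, y_batch = next(iter(train_iter))
print(x_batch.shape, y_batch.shape)
```

Shuffling is enabled only for the training iterator, so that evaluation passes over the test data in a fixed order.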



In [39]:
for x_batch, y_batch in train_iter:

    print(x_batch.shape)
    print(y_batch)

    x_batch = x_batch.to(device)
    y_batch = y_batch.to(device)

    y_hat = model(x_batch)

    ll = loss_fn(y_hat, y_batch)
    print(ll)

    break

NameError: name 'train_iter' is not defined

In [None]:
def make_train_step(model, loss_fn, optimizer):
    # Builds function that performs a step in the train loop
    def train_step(x, y):
        # Sets model to TRAIN mode
        model.train()
        # Makes predictions
        yhat = model(x)
        # Computes loss
        loss = loss_fn(yhat, y)
        # Computes gradients
        loss.backward()
        # Updates parameters and zeroes gradients
        optimizer.step()
        optimizer.zero_grad()
        # Returns the loss
        return loss.item()
    
    # Returns the function that will be called inside the train loop
    return train_step

# Creates the train_step function for our model, loss function and optimizer
train_step = make_train_step(model, loss_fn, optimizer)
losses = []
n_epochs = 10

In [None]:
model.apply(init_weights)   #always good to initialize in the beginning
n_epochs = 10
losses = []
test_losses = []
train_step = make_train_step(model, loss_fn, optimizer)

for epoch in range(n_epochs):
    for x_batch, y_batch in train_iter:
        x_batch = x_batch.to(device)
        y_batch = y_batch.to(device)

        loss = train_step(x_batch, y_batch)
        losses.append(loss)

    # torch no_grad makes sure that the nested-below computations happen without gradients, 
    # since these are not needed for evaluation
    with torch.no_grad():
        for x_test, y_test in test_iter:
            x_test = x_test.to(device)
            y_test = y_test.to(device)
            
            model.eval()
    
            yhat = model(x_test)
            test_loss = loss_fn(yhat, y_test)
            test_losses.append(test_loss.item())

#print(model.state_dict())

In [None]:
def predict_ch3(net, test_iter, n=6):
    """Print true and predicted labels for the first n points of the first test batch."""
    for X, y in test_iter:
        X = X.to(device)
        y = y.to(device)
        break
    net.eval()
    with torch.no_grad():
        preds = net(X).argmax(axis=1)
    # The features are plain 80-dimensional vectors, not images,
    # so the labels are printed instead of plotted
    for true, pred in zip(y.flatten()[:n].tolist(), preds[:n].tolist()):
        print(f"true: {true}  predicted: {pred}")


predict_ch3(model, test_iter)

In [None]:
def accuracy(net, test_iter):
    """Fraction of correctly classified points over all batches of test_iter."""
    n_samples = 0
    n_correct = 0
    net.eval()
    # Gradients are not needed for evaluation
    with torch.no_grad():
        for X, y in test_iter:
            X = X.to(device)
            y = y.to(device)

            preds = net(X).argmax(axis=1)

            n_samples += y.shape[0]
            n_correct += (y == preds).sum().item()

    return n_correct / n_samples

accuracy(model, test_iter)

In [None]:
# for grader use only

# insert grade here  (out of 8)

# G[2] =
#
# please justify point subtractions when needed

## <font color = 'blue'> Question 3. Maximize the accuracy on the two datasets </font>

Augment your classifier from Question-2 with any number and type of layers you want, with the goal to maximize the **validation** accuracy you achieve on the two datasets. Feel free to use any stopping criterion you want for the training process. The networks for $X$ and $X_1$ do not have to be of the same architecture.

Show your code, and add a text cell summarizing your idea and findings. Finally apply your models to the **test** set, and report the accuracy. Feel free to discuss your validation accuracy on Canvas. Also please avoid looking at the test set, until the very end.

**Rubric**: All complete answers get 8 points, and the **top 5** test accuracies reported get an extra 10\% in the final quiz.
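
One possible reading of "any number and type of layers" together with "any stopping criterion" is a small multi-layer perceptron trained with early stopping on validation accuracy. The sketch below is only an illustration on synthetic data: the layer sizes, dropout rate, patience, and the helpers `make_mlp`, `eval_accuracy` and `fit_with_early_stopping` are all assumptions, not a recommended solution. It runs on the CPU for simplicity; the model and tensors would be moved to `device` to satisfy the GPU requirement.

```python
import copy

import torch
from torch import nn


def make_mlp(in_features=80, hidden=256, n_classes=40, p_drop=0.2):
    """Two-hidden-layer MLP; all sizes here are illustrative assumptions."""
    return nn.Sequential(
        nn.Linear(in_features, hidden), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(hidden, n_classes),
    )


@torch.no_grad()
def eval_accuracy(net, X, y):
    """Accuracy of net on (X, y), computed in eval mode without gradients."""
    net.eval()
    return (net(X).argmax(dim=1) == y).float().mean().item()


def fit_with_early_stopping(net, X_tr, y_tr, X_va, y_va,
                            max_epochs=50, patience=5, batch_size=256):
    """Train with Adam; stop when validation accuracy has not improved for `patience` epochs."""
    optimizer = torch.optim.Adam(net.parameters())
    loss_fn = nn.CrossEntropyLoss()
    best_acc, best_state, stale = -1.0, None, 0
    for epoch in range(max_epochs):
        net.train()
        perm = torch.randperm(X_tr.shape[0])  # reshuffle every epoch
        for start in range(0, len(perm), batch_size):
            idx = perm[start:start + batch_size]
            optimizer.zero_grad()
            loss_fn(net(X_tr[idx]), y_tr[idx]).backward()
            optimizer.step()
        acc = eval_accuracy(net, X_va, y_va)
        if acc > best_acc:
            # Keep a copy of the best weights seen so far
            best_acc, best_state, stale = acc, copy.deepcopy(net.state_dict()), 0
        else:
            stale += 1
            if stale >= patience:
                break
    net.load_state_dict(best_state)  # restore the best validation checkpoint
    return best_acc


# Synthetic data with the same layout as the assignment (80 features, 40 classes).
torch.manual_seed(0)
X_tr, y_tr = torch.randn(512, 80), torch.randint(0, 40, (512,))
X_va, y_va = torch.randn(128, 80), torch.randint(0, 40, (128,))
net = make_mlp()
best_val_acc = fit_with_early_stopping(net, X_tr, y_tr, X_va, y_va, max_epochs=5)
print(f"best validation accuracy: {best_val_acc:.3f}")
```

Only the validation split drives the stopping decision here, which is in line with the request to avoid looking at the test set until the very end.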

In [None]:
## your answer goes here

In [None]:
# for grader use only

# insert grade here  (out of 8)

# G[3] =
#
# please justify point subtractions when needed

In [None]:
# total score
max_score = 20
final_score = sum(G)*(100/max_score)