# **Problem 2: Convolutional Neural Networks (CNN)**

#### You will develop a neural network with convolution and pooling layers to perform image classification and test it on the [STL-10](https://cs.stanford.edu/~acoates/stl10/) dataset. Run the following cells in order to execute your code.


## Import necessary libraries

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import numpy as np

## Download the dataset

Without data augmentation

In [None]:
torch.manual_seed(0)
#############################################################################
# PLACE YOUR CODE HERE (OPTION: Data augmentation)                          #
#############################################################################

train_dataset = datasets.STL10('data', split='train', download=True, transform=transforms.ToTensor())
test_dataset = datasets.STL10('data', split='test', download=True, transform=transforms.ToTensor()) # we use the test set as a validation set







# END OF YOUR CODE                                                          #
#############################################################################

torch.manual_seed(1)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=128, shuffle=True, drop_last=True)
val_loader  = torch.utils.data.DataLoader(test_dataset, batch_size=128, shuffle=True)

## 2.1. A CNN with MaxPooling layers



### Design the model 
You will implement a CNN model. PyTorch provides built-in modules and functions that carry out the convolution steps for you. The key ones required for the design are listed below.


*   nn.**[Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html)**(in_channels, out_channels, kernel_size, stride=1, padding=0, bias=True): Convolution layer.
*   nn.**[MaxPool2d](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html)**(kernel_size, stride=None, padding=0): Max pooling layer. 
*   nn.**[Dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html)**(p=0.5, inplace=False): randomly zeroes elements of the input tensor with probability p during training.
*   F.**[relu](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html)**(Z1): computes the elementwise ReLU of Z1 (which can be any shape).
*   x.**[view](https://pytorch.org/docs/stable/generated/torch.Tensor.view.html)**(new_shape): returns a new tensor that shares the same data but has a different shape. It is similar to the NumPy function reshape, which gives a new shape to an array without changing its data.
*   nn.**[Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)**(in_features, out_features): applies a linear transformation to the incoming data. It is also called a fully connected layer. 
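
To choose `in_features` for the fully connected layer, it helps to track how each layer changes the spatial size of a 96 by 96 STL-10 image. The sketch below uses the standard output-size formula for `nn.Conv2d` and `nn.MaxPool2d` with dilation 1. The helper name `layer_out_size`, the layer order (convolution, pooling, convolution, pooling), and the channel count of 16 are assumptions for illustration only; your design may differ.

```python
def layer_out_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a conv or max-pool layer (dilation 1, floor mode)."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 96                                             # STL-10 images are 96x96
size = layer_out_size(size, kernel_size=3)            # 3x3 conv, no padding -> 94
size = layer_out_size(size, kernel_size=2, stride=2)  # 2x2 max pool         -> 47
size = layer_out_size(size, kernel_size=3)            # 3x3 conv, no padding -> 45
size = layer_out_size(size, kernel_size=2, stride=2)  # 2x2 max pool         -> 22

hypothetical_channels = 16  # assumed out_channels of the last conv layer
flat_features = hypothetical_channels * size * size
print(size, flat_features)  # 22 7744
```

Under these assumptions, the tensor passed to `x.view(...)` before the linear layer would have 7744 features per image.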

In [None]:
# Problem 2: Implementing your own CNN
# a. Convolution and MaxPooling layers
class CNN_Max(nn.Module):
  """
  A convolutional neural network (CNN). This CNN uses the following
  dimensions:

  input_size: the dimension d of the input data.
  hidden_size: the number of neurons h in the hidden layer.
  output_size: the number of classes c, which is 10 in our task.
  """
  def __init__(self):
    """
    An initialization function. This network is a simple feed-forward
    network: it passes an input through multiple layers and returns the
    output. The layers are initialized after they are created.

    In this problem, we use the following components to build the CNN:

    conv: convolutional kernel size, which is 3 by 3 with bias                         
    pool: pooling kernel-size, which is 2 by 2    
    dropout: random zeroing layer with probability 0.4                            
    fc: fully-connected layer which uses affine operation y=Wx+b              

    Parameters
    ----------
    N/A
    """
    super(CNN_Max, self).__init__()

    #############################################################################
    # PLACE YOUR CODE HERE                                                      #
    ############################################################################# 
    
    #self.conv1 =  
    #self.conv2 = 

    #self.pool = 
    #self.fc   = 

    #torch.nn.init.xavier_uniform_(self.fc.weight)
    # END OF YOUR CODE                                                          #
    #############################################################################
   

  def forward(self, x):
    """
    A forward pass function. It chains the modules defined in __init__ to
    define the structure of the model.

    Parameters
    ----------
    x: tensor
      a batch of input images of shape (N, 3, d, d), where N is the batch size
      and d is the height and width of each image.

    Returns
    ----------
    out: tensor
      the output of the network for x, with one score per class for each image.

    """
    #############################################################################
    # PLACE YOUR CODE HERE                                                      #
    ############################################################################# 



      
    out = 
    # END OF YOUR CODE                                                          #
    #############################################################################

    return out

# create a CNN object
net = CNN_Max()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print('Device:', device)
net.to(device)

num_params = sum(p.numel() for p in net.parameters() if p.requires_grad)
print("Number of trainable parameters:", num_params)

from torchsummary import summary
summary(net,(3,96,96))

### Train the designed model:
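
The training cell for this section follows the usual PyTorch update pattern: zero the gradients, run the forward pass, compute the loss, backpropagate, and step the optimizer. Here is a minimal sketch of that pattern on a toy linear model with random data. Every name prefixed with `toy_`, the learning rate, and the tensor sizes are assumptions for illustration; this is not the CNN from this problem.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical stand-in model and data (not the CNN or STL-10).
toy_model = nn.Linear(4, 3)
toy_inputs = torch.randn(8, 4)
toy_labels = torch.randint(0, 3, (8,))

toy_criterion = nn.CrossEntropyLoss()                        # cross-entropy loss
toy_optimizer = optim.Adam(toy_model.parameters(), lr=1e-2)  # Adam optimizer

toy_losses = []
for step in range(20):
    toy_optimizer.zero_grad()                          # reset accumulated gradients
    toy_outputs = toy_model(toy_inputs)                # forward pass
    toy_loss = toy_criterion(toy_outputs, toy_labels)  # compare predictions to labels
    toy_loss.backward()                                # gradients w.r.t. the parameters
    toy_optimizer.step()                               # parameter update
    toy_losses.append(toy_loss.item())

print(toy_losses[0], toy_losses[-1])
```

Because the same batch is reused at every step, the loss in this toy run should fall steadily; on real data the per-batch loss is noisier.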

In [None]:
import torch.optim as optim

"""

 Define the loss function; use the cross-entropy loss.
 Also define the optimizer; use Adam.

"""

#############################################################################
# PLACE YOUR CODE HERE                                                      #
############################################################################# 
criterion = 
optimizer = 
#############################################################################
#                              END OF YOUR CODE                             #
#############################################################################

scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', 
                                                 factor=0.05, 
                                                 patience=10, 
                                                 min_lr=0.0,
                                                 verbose=True, 
                                                 )

loss_hist, acc_hist = [], []
loss_hist_val, acc_hist_val = [], []

for epoch in range(30):
  running_loss = 0.0
  correct = 0
  count = 0
  for batch, labels in train_loader:
    batch, labels = batch.to(device), labels.to(device)
    count += len(batch)
    """
    First, set the gradients to zero. Then obtain predictions from your CNN
    model. After that, pass the predictions and labels to the loss function to
    measure the difference between them. Finally, backpropagate to compute the
    gradients of the loss with respect to the model parameters.

    """
    #############################################################################
    # PLACE YOUR CODE HERE                                                      #
    ############################################################################# 
    
    outputs = 
    loss = 
    
    #############################################################################
    #                              END OF YOUR CODE                             #
    #############################################################################
    optimizer.step()

    # compute training statistics
    _, predicted = torch.max(outputs, 1)
    correct += (predicted == labels).sum().item()
    # loss is a batch mean (assuming the default reduction='mean'), so weight it
    # by the batch size before dividing by the sample count below
    running_loss += loss.item() * len(batch)
  
  avg_loss = running_loss / count
  avg_acc = correct / count
  loss_hist.append(avg_loss)
  acc_hist.append(avg_acc)

  # validation statistics
  net.eval()
  with torch.no_grad():
    loss_val = 0.0
    correct_val = 0
    count = 0
    for batch, labels in val_loader:
      batch, labels = batch.to(device), labels.to(device)
      count += len(batch)
      outputs = net(batch)
      loss = criterion(outputs, labels)
      _, predicted = torch.max(outputs, 1)
      correct_val += (predicted == labels).sum().item()
      # weight the batch-mean loss by the batch size (the last batch may be smaller)
      loss_val += loss.item() * len(batch)
    avg_loss_val = loss_val / count
    avg_acc_val = correct_val / count
    loss_hist_val.append(avg_loss_val)
    acc_hist_val.append(avg_acc_val)
  net.train()

  scheduler.step(avg_loss_val)
  print('[epoch %d] loss: %.5f accuracy: %.4f val loss: %.5f val accuracy: %.4f' % (epoch + 1, avg_loss, avg_acc, avg_loss_val, avg_acc_val))

### Visualize the classification accuracies
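
As a rough sketch of the layout requested in this section, the snippet below draws one panel for loss and one for accuracy, each with a training and a validation curve. The `demo_` lists hold made-up placeholder numbers, not results from any model, and the figure size is an arbitrary choice.

```python
import matplotlib.pyplot as plt

# Placeholder histories for illustration only; these are not real results.
demo_loss_train = [2.0, 1.6, 1.3]
demo_loss_val = [2.1, 1.8, 1.6]
demo_acc_train = [0.25, 0.40, 0.52]
demo_acc_val = [0.22, 0.35, 0.43]
demo_epochs = range(1, len(demo_loss_train) + 1)
demo_legend = ['Train', 'Validation']

fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: loss per epoch
ax_loss.plot(demo_epochs, demo_loss_train)
ax_loss.plot(demo_epochs, demo_loss_val)
ax_loss.set_xlabel('Epoch')
ax_loss.set_ylabel('Loss')
ax_loss.legend(demo_legend)

# Right panel: accuracy per epoch
ax_acc.plot(demo_epochs, demo_acc_train)
ax_acc.plot(demo_epochs, demo_acc_val)
ax_acc.set_xlabel('Epoch')
ax_acc.set_ylabel('Accuracy')
ax_acc.legend(demo_legend)

fig.tight_layout()
```

In the assignment itself, the histories recorded during training (`loss_hist`, `loss_hist_val`, `acc_hist`, `acc_hist_val`) would take the place of the placeholder lists.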

In [None]:
import matplotlib.pyplot as plt

"""

  Plot two graphs: one for the loss on the training and validation data,
  and one for the accuracy on the training and validation data.
  Set the x-axis to the number of epochs and the y-axis to the loss or
  accuracy. Use the legend to label the training and validation curves.

"""
#############################################################################
# PLACE YOUR CODE HERE                                                      #
############################################################################# 

legend = ['Train', 'Validation']





plt.show()
#############################################################################
#                              END OF YOUR CODE                             #
#############################################################################