<center><h1 style="font-size:40px;">Image Classification using CNNs</h1></center>

Welcome to the *fourth* lab for Deep Learning!

In this lab, you will build a CNN to classify RGB images. Image classification means predicting a class label for each image. The *dataset* in this lab consists of multiple images, each with a target label for classification.

Good luck!


In [None]:
import torch
from torch import nn
import os
import imageio
import torchvision
import math
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader
from collections import OrderedDict

First, let's define the ```path``` to the dataset. Make sure you explore the directories of the dataset and get familiar with it!

In [2]:
training_img_dir = "../data/FlyingObjectDataset_10K/training"
validation_img_dir = "../data/FlyingObjectDataset_10K/validation"
testing_img_dir = "../data/FlyingObjectDataset_10K/testing"

We are also going to start using ```tensorboard```:

In [213]:
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter('runs/classification_1') #make sure to adapt this to your needs

Please make sure to read the [doc](https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html) to understand how to correctly plot your ```losses``` and ```metrics``` to tensorboard.
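
As a minimal sketch of scalar logging, the snippet below writes a training and a validation loss per epoch with `add_scalar`. The log directory, the tag names and the loss values are made up for illustration; in the lab you would log the values computed by your own training loop.

```python
import os
import tempfile
from torch.utils.tensorboard import SummaryWriter

# Hypothetical run directory and loss values, used only for illustration.
log_dir = tempfile.mkdtemp(prefix="tb_sketch_")
sketch_writer = SummaryWriter(log_dir)

fake_train_losses = [1.2, 0.9, 0.7]
fake_val_losses = [1.3, 1.0, 0.9]

for epoch, (train_loss, val_loss) in enumerate(zip(fake_train_losses, fake_val_losses)):
    # Tags sharing a prefix ("Loss/") are grouped together in the TensorBoard UI.
    sketch_writer.add_scalar("Loss/train", train_loss, epoch)
    sketch_writer.add_scalar("Loss/validation", val_loss, epoch)

sketch_writer.close()  # flush pending events to disk
event_files = os.listdir(log_dir)
```

Running `tensorboard --logdir <your log directory>` in a terminal should then show both curves.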

Ok, now that we have the paths to the three different splits, let's start by defining our ```Dataset``` class!
The two main methods we need to define when subclassing ```Dataset``` are ```__getitem__``` and ```__len__```:

In [179]:
class FlyingObjects(torch.utils.data.Dataset):
    """Flying Object Dataset for the classification task.
       The label is encoded in the file name; __extract_label extracts it at the chosen granularity (see fine_grained).
    
    """
    def __init__(self, root,fine_grained=False,transform=None):
        super(FlyingObjects,self).__init__()
        self.root = root
        self.transform = transform
        self.fine_grained = fine_grained

        self.images = [os.path.join(dp,f) for dp, dn, fn in os.walk(os.path.expanduser(self.root+'/image')) for f in fn if f.endswith(".png")]
        self.images.sort()
        
        self.classes = [
            'square_red',
            'square_green',
            'square_blue',
            'square_yellow',
            'triangle_red',
            'triangle_green',
            'triangle_blue',
            'triangle_yellow',
            'circular_red',
            'circular_green',
            'circular_blue',
            'circular_yellow'
        ] if self.fine_grained else [
            'square',
            'triangle',
            'circular',
            'background']
        self.labels = [self.__extract_label(f) for f in self.images]
    

    def get_classes(self):
        return self.classes
    
    
    def __extract_label(self, image_file):
        """Extract label from image_file name"""
        path, img_name = os.path.split(image_file)
        names = img_name.split(".")[0].split("_")

        currLabel = names[1] + "_" + names[2] if self.fine_grained else names[1]

        if currLabel in self.classes:
            label = self.classes.index(currLabel)
        else:
            raise ValueError("ERROR: Label " + str(currLabel) + " is not defined!")
        return label
    
    def __getitem__(self, index):
        # get data
        x = imageio.imread(self.images[index])
        if self.transform:
            x = self.transform(x)
        else:
            x = torch.from_numpy(x)
        x = x.float()
   
        # get label
        y = self.labels[index]
        #y = np.eye(len(self.get_classes()))[y]
        #y = torch.tensor(y)
        return x, y

    def __len__(self):
        return len(self.images)
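
To make the label extraction easier to follow, here is a standalone sketch of the same parsing logic. The file name is hypothetical: it assumes a pattern like `<id>_<shape>_<color>.png`, which is what the index-based parsing suggests, so check the actual files in the dataset. `parse_label` is an illustrative helper, not part of the lab code.

```python
import os

def parse_label(image_file, classes, fine_grained=False):
    """Standalone sketch of the label parsing done in FlyingObjects."""
    img_name = os.path.basename(image_file)
    parts = img_name.split(".")[0].split("_")
    # Coarse label: shape only. Fine-grained label: shape and color.
    name = parts[1] + "_" + parts[2] if fine_grained else parts[1]
    return classes.index(name)  # raises ValueError for unknown labels

coarse_classes = ["square", "triangle", "circular", "background"]
fine_classes = ["triangle_red", "triangle_green", "triangle_blue"]

# Hypothetical file name, not taken from the dataset.
example_file = "training/image/0001_triangle_blue.png"
coarse_label = parse_label(example_file, coarse_classes)
fine_label = parse_label(example_file, fine_classes, fine_grained=True)
```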

We can define our transformations. Note that not all transformations are considered ```Data Augmentation```.
The following transformations are used to convert our data to ```Tensor``` and resize our images to the desired size!

In [180]:
train_transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Resize((64, 64)), 
])
test_transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Resize((64, 64))
])

# Question 1

Define the three dataloaders for the three splits: ```train```, ```validation``` and ```test```

Do not forget to visualize the data!
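
If you have not used `DataLoader` before, the following sketch shows the basic pattern on a stand-in `TensorDataset` of random tensors. The batch size and the shuffling choice are arbitrary assumptions; in the lab the dataset would be a `FlyingObjects` instance for each split.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset: 10 random "images" (3x64x64) with labels in {0, 1, 2, 3}.
fake_images = torch.rand(10, 3, 64, 64)
fake_labels = torch.randint(0, 4, (10,))
fake_dataset = TensorDataset(fake_images, fake_labels)

# batch_size=4 is arbitrary; shuffling is typically enabled only for training data.
sketch_loader = DataLoader(fake_dataset, batch_size=4, shuffle=True)

batch_x, batch_y = next(iter(sketch_loader))
print(batch_x.shape, batch_y.shape)
print(len(sketch_loader))  # number of batches per epoch
```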

In [181]:
def image_with_labels(data, labels, title:str=None, nimages:int=10, nrows:int=2, fig_dimension=1,title_size=10, label_prefix="Label: "):
    """Creates a grid of images with/without labels.

    :param data: tensor of shape (B, H, W, C) with values in [0, 255]
    :param labels: sequence of labels, or None to plot without labels
    :param title: str:  (Default value = None)
    :param nimages: int:  (Default value = 10)
    :param nrows: int:  (Default value = 2)
    :param fig_dimension: size factor of each subplot (Default value = 1)
    :param title_size: font size of the title when labels is None (Default value = 10)
    :param label_prefix: str: text shown before each label (Default value = "Label: ")

    """
    image_ratio = data[0].shape[0] /data[0].shape[1]
    if len(data)< nimages:
        nimages = len(data)
 
    columns = math.ceil(nimages/nrows)
    
    if nrows*columns > nimages:
        nrows = math.ceil(nimages/columns)
    
    fig = plt.figure(figsize=(fig_dimension*columns,1.4*fig_dimension*nrows*image_ratio))
    for i in range(0, nimages):
        ax = fig.add_subplot(nrows, columns, i+1)
        ax.imshow(data[i]/255)
        ax.set_xlabel(f"{label_prefix}{labels[i]}") if labels is not None else None
        ax.set_xticks([])
        ax.set_yticks([])
        ax.grid(False)

    if labels is None:
        fig.suptitle(title,x=0.5, y=.95, size=title_size) 
        
        fig.subplots_adjust(
            left=0,
            right=0.9,
            top=0.9,
            bottom=0,
            wspace=0.1,
            hspace=-0.45
        )
    else:
        fig.suptitle(title) #,x=0.45, y=.95
        
        fig.subplots_adjust(
            #left=0,
            #right=1,
            top=0.9,#+((nrows-1)*0.045),
            #bottom=0,
            wspace=0,
            #hspace=0
        )
        
    #plt.tight_layout(h_pad=0,w_pad=0)
    fig.tight_layout(pad=0, h_pad=0,w_pad=0)
    plt.show()
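
`image_with_labels` divides each image by 255 and passes it to `imshow`, so it appears to expect channels-last images with values in `[0, 255]`. Assuming your batches come from a `DataLoader` over `ToTensor`-transformed images (channels-first, values in `[0, 1]`), a conversion along these lines would be needed before plotting. The batch below is random stand-in data.

```python
import torch

# Stand-in batch: shape (B, C, H, W), float values in [0, 1].
batch = torch.rand(8, 3, 64, 64)

# Move channels last and rescale to [0, 255] so that data[i] / 255 is displayable.
batch_for_plot = batch.permute(0, 2, 3, 1) * 255
print(batch_for_plot.shape)
```

You could then call `image_with_labels(batch_for_plot, labels)` with the labels of the same batch.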

Let's start with a very simple network:

In [182]:
class SimpleModel(torch.nn.Module):
    def __init__(self,num_channels, num_classes, input_shape=(64,64)):
        super(SimpleModel,self).__init__()
        self.conv_layer1 = self._conv_layer_set(num_channels, 32)
        self.conv_layer2 = self._conv_layer_set(32, 64)
        self.fc1 = nn.Linear(64*(input_shape[0]//4)*(input_shape[1]//4), 64) # 64 channels x (H//4) x (W//4) after flattening. Why //4?
        self.fc2 = nn.Linear(64, num_classes)
        self.drop = nn.Dropout(0.5)
        
    def _conv_layer_set(self, in_c, out_c):
        conv_layer = nn.Sequential(OrderedDict([
            ('conv',nn.Conv2d(in_c, out_c, kernel_size=3, padding=1)),
            ('leakyrelu',nn.LeakyReLU()),
            ('maxpool',nn.MaxPool2d(2)),
        ]))
        return conv_layer
    

    def forward(self, x):
        out = self.conv_layer1(x)
        out = self.conv_layer2(out)
       
        out = out.view(out.size(0), -1)

        out = self.fc1(out)
        out = self.drop(out)
        out = self.fc2(out)
        return out
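
One way to check the input size of `fc1` is to push a dummy tensor through the convolutional blocks and print the shapes. The sketch below rebuilds the same layer pattern outside the class and assumes 3 input channels and a 64x64 image.

```python
import torch
from torch import nn

# Same layer pattern as _conv_layer_set: a 3x3 conv with padding=1 keeps H and W,
# and each MaxPool2d(2) halves them.
block1 = nn.Sequential(nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.LeakyReLU(), nn.MaxPool2d(2))
block2 = nn.Sequential(nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.LeakyReLU(), nn.MaxPool2d(2))

dummy = torch.zeros(1, 3, 64, 64)  # one RGB image of size 64x64
after1 = block1(dummy)
after2 = block2(after1)
flat = after2.view(after2.size(0), -1)  # same flattening as in forward()
print(after1.shape, after2.shape, flat.shape)
```

The last dimension of `flat` is the number of input features the first linear layer has to accept.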

# Question 2

Take inspiration from the code you wrote in the previous lab and create your ```train function```. It might be useful to have a ```predict``` function too. Check the code of the previous lab if you need ideas!

Do not forget: to train, you need an ```optimizer```, a ```loss function``` and an instance of your ```model```! If you need more inspiration, check [here](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html)!
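
As a reminder of how these three pieces interact, here is a minimal sketch of repeated optimization steps on a single random batch. The toy model, the learning rate and the data are assumptions chosen for illustration; this is not a complete train function, which would also loop over a `DataLoader`, evaluate on the validation split and log to tensorboard.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Tiny stand-in model and random data; in the lab these would be your model and a training batch.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4))
loss_fn = nn.CrossEntropyLoss()  # expects raw logits and integer class labels
optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.01)

inputs = torch.rand(16, 3, 8, 8)
targets = torch.randint(0, 4, (16,))

losses = []
toy_model.train()
for step in range(20):
    optimizer.zero_grad()            # clear gradients from the previous step
    logits = toy_model(inputs)       # forward pass
    loss = loss_fn(logits, targets)
    loss.backward()                  # compute gradients
    optimizer.step()                 # update the weights
    losses.append(loss.item())

print(losses[0], losses[-1])
```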

# Question 3

Now that you have your train function, train the network until it overfits. Which ```hyperparameters``` allowed you to overfit?

To help you visualize the results we provide a ```confusion matrix function```. 

In [199]:
from sklearn.metrics import confusion_matrix,ConfusionMatrixDisplay
import seaborn as sn
import pandas as pd
import numpy as np

In [207]:
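def matrix(loader,net,classes):
    """Plot the confusion matrix of the predictions of net over all batches of loader."""
    y_pred = []
    y_true = []
    net.eval() # disable dropout; call net.train() again before resuming training
    with torch.no_grad(): # gradients are not needed for evaluation
        # iterate over the whole loader
        for inputs, labels in loader:
            output = net(inputs) # Feed Network

            # the index of the largest logit is the predicted class
            output = torch.argmax(output, dim=1).cpu().numpy()
            y_pred.extend(output) # Save Prediction

            labels = labels.cpu().numpy()
            y_true.extend(labels) # Save Truth

    # Build confusion matrix; passing labels keeps one row/column per class
    # even if a class never appears in the data
    cf_matrix = confusion_matrix(y_true, y_pred, labels=list(range(len(classes))))
    disp = ConfusionMatrixDisplay(confusion_matrix=cf_matrix,display_labels=classes)
    disp.plot()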
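
To read the resulting plot, it helps to know how scikit-learn lays out the matrix. The toy example below uses made-up labels for three classes.

```python
from sklearn.metrics import confusion_matrix

# Toy labels for 3 classes (made-up values).
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]

# Row i, column j counts the samples whose true class is i and predicted class is j.
toy_cm = confusion_matrix(toy_true, toy_pred, labels=[0, 1, 2])
print(toy_cm)
# [[1 1 0]
#  [0 2 0]
#  [1 0 1]]
```

Correct predictions lie on the diagonal; off-diagonal entries show which classes get confused with each other.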

# Question 4
Go through the [doc](https://pytorch.org/vision/stable/transforms.html) about data augmentation transformations and use some in your pipeline. Did the ones you tried improve your model? Why?

# Question 5

Redo the previous questions with an image size of ```128x128```. Make sure to note what changed and why. Compare both versions using ```Tensorboard``` plots.

# Question 6

Once you have a good model for ```128x128``` without fine-grained labels (```fine_grained=False```), redo the experiments with fine-grained labels (```fine_grained=True```). How did the performance change? And why?