## Task 1


The notebook is divided into 3 parts:
- In the first part we import the images and create the dataframe
- In the second part we define our embedding network and train it on the given images
- In the last part we use the embedding network to create a classifier and use it to classify the given images

## Part 1: Data import and dataframe creation

Importing important libraries for the section

In [1]:
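import numpy as np
import pandas as pd
import os
from PIL import Image
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
# PyTorch utilities used by the dataset, samplers, and loaders later in this part
import torch
from torch.utils.data import Dataset, SubsetRandomSampler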

Rather than loading all the images into memory at once, we create a dataframe that holds each image path and its label as a string, and then encode the labels as integers. Images are read from disk only when they are needed, which keeps memory use low and avoids a long up-front load.

In [4]:
imagePath = []
labels = []
for folder in os.listdir('./train'):
    for images in os.listdir(os.path.join('./train/',folder)):
        image = os.path.join('/train/',folder,images)
        imagePath.append(image)
        labels.append(folder)
data = {'Images':imagePath, 'Labels':labels}
data = pd.DataFrame(data)
data.head()

| | Images | Labels |
|---|---|---|
| 0 | /train/Sample029/img029-019.png | Sample029 |
| 1 | /train/Sample029/img029-028.png | Sample029 |
| 2 | /train/Sample029/img029-051.png | Sample029 |
| 3 | /train/Sample029/img029-023.png | Sample029 |
| 4 | /train/Sample029/img029-040.png | Sample029 |


Now that we have the image paths and their labels, we encode the labels as integers so that the deep learning model can use them.

In [6]:
labelEncoder = LabelEncoder()
data['Encoded Labels'] = labelEncoder.fit_transform(data['Labels'])
data.head()

| | Images | Labels | Encoded Labels |
|---|---|---|---|
| 0 | /train/Sample029/img029-019.png | Sample029 | 28 |
| 1 | /train/Sample029/img029-028.png | Sample029 | 28 |
| 2 | /train/Sample029/img029-051.png | Sample029 | 28 |
| 3 | /train/Sample029/img029-023.png | Sample029 | 28 |
| 4 | /train/Sample029/img029-040.png | Sample029 | 28 |
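
`LabelEncoder` assigns integer codes according to the sorted order of the distinct label strings, which would explain why `Sample029` maps to 28 if the folder names run from `Sample001` upward. The sketch below reproduces that behaviour with plain NumPy as an illustration only; the four label strings are hypothetical and are not taken from the dataset.

```python
import numpy as np

# Hypothetical folder names, in the arbitrary order os.listdir might return them.
labels = np.array(["Sample029", "Sample001", "Sample029", "Sample010"])

# np.unique returns the sorted distinct labels and, for each input label,
# its position in that sorted list. That position is the integer code.
classes, encoded = np.unique(labels, return_inverse=True)

# Decoding is a lookup back into the sorted class list.
decoded = classes[encoded]

print(classes)
print(encoded)
print(decoded)
```

With scikit-learn itself, the same lookup back to the string labels is available through `labelEncoder.inverse_transform`.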


We now create the training and validation splits of the images.

In [9]:
batchSize = 128
validationSplit = 0.1
shuffleDataset = True
randomSeed = 17

In [10]:
datasetSize = len(data)
indices = list(range(datasetSize))
split = int(np.floor(validationSplit*datasetSize))
if shuffleDataset:
    np.random.seed(randomSeed)
    np.random.shuffle(indices)
trainIndices, validationIndices = indices[split:], indices[:split]
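
As a quick sanity check of the split logic above, the following sketch wraps the same steps in a function and runs it on a made-up dataset size. The function name `split_indices` and the size of 1000 are assumptions for illustration; a local `RandomState` is used instead of the global seed so the example has no side effects.

```python
import numpy as np

def split_indices(dataset_size, validation_split=0.1, seed=17, shuffle=True):
    """Return (train_indices, validation_indices) for a dataset of the given size."""
    indices = list(range(dataset_size))
    split = int(np.floor(validation_split * dataset_size))
    if shuffle:
        # A local generator keeps the shuffle reproducible without touching global state.
        rng = np.random.RandomState(seed)
        rng.shuffle(indices)
    # The first `split` shuffled indices go to validation, the rest to training.
    return indices[split:], indices[:split]

train_idx, val_idx = split_indices(1000)
print(len(train_idx), len(val_idx))

# The two sets never overlap and together cover every sample exactly once.
print(set(train_idx).isdisjoint(val_idx))
print(sorted(train_idx + val_idx) == list(range(1000)))
```

Because the seed is fixed, repeated calls return the same split, so validation results stay comparable between runs.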

In [None]:
trainSampler = SubsetRandomSampler(trainIndices)
validationSampler = SubsetRandomSampler(validationIndices)

In [None]:
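class CustomDataset(Dataset):
    """Loads a single image from disk per lookup, using the paths stored in the dataframe."""
    def __init__(self, imageData, imagePath, transform=None):
        self.imagePath = imagePath
        self.imageData = imageData
        self.transform = transform
    def __len__(self):
        return len(self.imageData)
    def __getitem__(self, index):
        # The 'Images' column holds a full path, so only its file name is joined onto the root and class folder
        imageName = os.path.join(self.imagePath, self.imageData.loc[index, 'Labels'], os.path.basename(self.imageData.loc[index, 'Images']))
        # Convert to RGB so every image has the 3 channels that the first convolution expects
        image = Image.open(imageName).convert('RGB')
        image = image.resize((28,28))
        label = torch.tensor(self.imageData.loc[index, 'Encoded Labels'])
        if self.transform is not None:
            image = self.transform(image)
        return image,label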

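We now create the custom dataset from all the images. The transform converts each image to a tensor and normalizes every channel with mean 0.5 and standard deviation 0.5; these values are assumed here because they are the scaling that the display function later in this part reverses.

In [None]:
from torchvision import transforms
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
dataset = CustomDataset(data,'./train/',transform)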

We create train and validation loaders that will load batches of images for training and validation using the indices we created above

In [None]:
trainLoader = torch.utils.data.DataLoader(dataset, batch_size = batchSize, sampler = trainSampler)
validationLoader = torch.utils.data.DataLoader(dataset, batch_size = batchSize, sampler = validationSampler)

To visualize the images, we first need to un-normalize them. The function below does this and also moves the channel axis last, which is the layout matplotlib expects.

In [None]:
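def displayImage(image):
    # Undo normalization with mean 0.5 and std 0.5: map values from [-1, 1] back to [0, 1]
    image = image/2 + 0.5
    image = image.numpy()
    # Reorder from (channels, height, width) to (height, width, channels) for matplotlib
    image = np.transpose(image, (1,2,0))
    return image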
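
The arithmetic behind un-normalizing can be checked without PyTorch. The sketch below is a NumPy-only illustration that assumes the images were scaled to [0, 1] and then normalized with mean 0.5 and standard deviation 0.5 per channel; the helper names `normalize` and `unnormalize` and the random test image are hypothetical.

```python
import numpy as np

def normalize(image_hwc_uint8):
    """(H, W, C) uint8 in [0, 255] -> (C, H, W) float in [-1, 1]."""
    scaled = image_hwc_uint8.astype(np.float32) / 255.0  # scale to [0, 1]
    chw = np.transpose(scaled, (2, 0, 1))                # channels first
    return (chw - 0.5) / 0.5                             # mean 0.5, std 0.5

def unnormalize(image_chw):
    """(C, H, W) float in [-1, 1] -> (H, W, C) float in [0, 1]."""
    return np.transpose(image_chw / 2 + 0.5, (1, 2, 0))

# A random stand-in for a 28x28 RGB image.
rng = np.random.RandomState(0)
original = rng.randint(0, 256, size=(28, 28, 3), dtype=np.uint8)

normalized = normalize(original)
restored = unnormalize(normalized)

print(normalized.shape, restored.shape)
print(np.allclose(restored, original / 255.0, atol=1e-6))
```

Since `x / 2 + 0.5` is the exact inverse of `(x - 0.5) / 0.5`, the restored image matches the original up to floating-point rounding.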

We now visualize some of our training images

In [None]:
dataIterator = iter(trainLoader)
# Python 3 iterators are advanced with the built-in next(), not a .next() method
images, labels = next(dataIterator)
figure, axis = plt.subplots(3,5, figsize=(16,16))
for i, ax in enumerate(axis.flat):
    with torch.no_grad():
        image, label = images[i], labels[i]
        ax.imshow(displayImage(image))
        ax.set(title=f"{label.item()}")

## Part 2: Embedding Network and its training

Importing important libraries for the section

In [None]:
# The pytorch-metric-learning library provides built-in triplet mining and triplet losses over anchor, positive, and negative samples
from pytorch_metric_learning import losses, miners, distances, reducers, testers

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

We now define our embedding network

In [None]:
class EmbeddingNetwork(nn.Module):
    def __init__(self):
        super(EmbeddingNetwork, self).__init__()
        self.conv1 = nn.Conv2d(3,96, kernel_size = (3,3), stride = 1, padding = 1)
        self.conv2 = nn.Conv2d(96, 96, kernel_size = (4,4), stride = 2, padding = 1 )
        self.conv3 = nn.Conv2d(96, 256, kernel_size = (1,1), stride = 1)
        self.maxpooling1 = nn.MaxPool2d(kernel_size = (3,3), stride = 2, padding = 1)
        # For 28x28 inputs the feature map is 7x7 after conv2 (stride 2) and the pooling layer
        self.fullyConnected1 = nn.Linear(256*7*7,256)
        self.fullyConnected2 = nn.Linear(256,128)
        # One batch-norm layer per convolution, so each keeps its own statistics
        self.batchNorm1 = nn.BatchNorm2d(96)
        self.batchNorm2 = nn.BatchNorm2d(96)
        self.batchNorm3 = nn.BatchNorm2d(256)
        self.dropout = nn.Dropout(0.2)
    def forward(self,x):
        x = self.batchNorm1(F.relu(self.conv1(x)))
        x = self.batchNorm2(F.relu(self.conv2(x)))
        x = self.batchNorm3(F.relu(self.conv3(x)))
        # Pooling is applied once after the last convolution (assumed placement of the otherwise unused layer)
        x = self.maxpooling1(x)
        x = x.view(x.size(0), -1)
        x = self.dropout(self.fullyConnected1(x))
        x = self.dropout(self.fullyConnected2(x))
        # L2-normalize so that every embedding lies on the unit sphere
        x = F.normalize(x, p=2, dim=-1)
        return x
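
The input size of the first fully connected layer has to equal channels × height × width of the last feature map, so it is worth tracing the spatial size through the layers. The helper below is a small sketch that assumes the standard floor formula used by `nn.Conv2d` and `nn.MaxPool2d` (dilation 1) and assumes the pooling layer is applied once after the last convolution; the function names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_feature_map(input_size):
    """Follow one spatial dimension through conv1, conv2, conv3, and max pooling."""
    size = conv_output_size(input_size, kernel=3, stride=1, padding=1)  # conv1
    size = conv_output_size(size, kernel=4, stride=2, padding=1)        # conv2
    size = conv_output_size(size, kernel=1, stride=1, padding=0)        # conv3
    size = conv_output_size(size, kernel=3, stride=2, padding=1)        # max pooling
    return size

for input_size in (28, 32):
    side = trace_feature_map(input_size)
    # 256 is the number of output channels of conv3.
    print(input_size, side, 256 * side * side)
```

Under these assumptions a 28×28 input ends at a 7×7 feature map (12544 flattened features), while a 32×32 input ends at 8×8 (16384 features), so the flattened size and the resize step in the dataset have to agree.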

We shall now create the embedding model and print its layers

In [None]:
embeddingNetwork = EmbeddingNetwork()
print(embeddingNetwork)

The train function below takes the embedding model along with the training loader, the triplet miner, and the triplet loss function, and trains the embedding network for a single epoch.

In [None]:
def train(model, lossFunction, miningFunction, device, trainLoader, optimizer, epoch):
    print("Training started for Epoch:", epoch)
    model.train()
    for batchIndex, (data, labels) in enumerate(trainLoader):
        data, labels = data.to(device), labels.to(device)
        optimizer.zero_grad()
        embeddings = model(data)
        indicesTuple = miningFunction(embeddings, labels)
        loss = lossFunction(embeddings, labels, indicesTuple)
        loss.backward()
        optimizer.step()
        if batchIndex%5==0:
            print("Training stats for Epoch {} Iteration {}: Loss = {}, Number of mined triplets = {}".format(epoch, batchIndex, loss.item(), miningFunction.num_triplets))

We now define the loss function, the triplet miner, the optimizer, and the other hyperparameters we will use to train the model.

In [None]:
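# The distance object tells the miner and the loss how to compare the generated embeddings
distance = distances.CosineSimilarity()
# Average only over triplets whose loss is greater than 0
reducer = reducers.ThresholdReducer(low=0)
lossFunction = losses.TripletMarginLoss(margin=0.2, distance=distance, reducer=reducer)
# The type of triplets to mine is an option of the miner, not of the loss
miningFunction = miners.TripletMarginMiner(margin=0.2, distance=distance, type_of_triplets="semihard")
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Move the model to the device before building the optimizer over its parameters
embeddingNetwork = embeddingNetwork.to(device)
optimizer = torch.optim.Adam(embeddingNetwork.parameters(), lr=0.01)
print("Training the model on: ",device)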
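
To make the roles of the margin and of semihard mining concrete, here is a hand-written NumPy illustration for a single triplet. It is a sketch of the idea under cosine similarity, where larger means closer, and is not the library's implementation; the function names and the 2-D example vectors are made up.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Zero once the positive is more similar than the negative by at least `margin`."""
    s_ap = cosine_similarity(anchor, positive)
    s_an = cosine_similarity(anchor, negative)
    return max(0.0, s_an - s_ap + margin)

def is_semihard(anchor, positive, negative, margin=0.2):
    """Negative is less similar than the positive, but by no more than the margin."""
    gap = cosine_similarity(anchor, positive) - cosine_similarity(anchor, negative)
    return 0.0 < gap <= margin

anchor = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])
negatives = {
    "easy": np.array([0.0, 1.0]),       # far from the anchor
    "semihard": np.array([0.9, 0.3]),   # slightly farther than the positive
    "hard": np.array([1.0, 0.01]),      # closer to the anchor than the positive
}

for name, negative in negatives.items():
    loss = triplet_margin_loss(anchor, positive, negative)
    print(name, round(loss, 4), is_semihard(anchor, positive, negative))
```

Easy triplets contribute no loss, which is why a threshold reducer with `low=0` leaves them out of the average, while semihard triplets give a loss between 0 and the margin.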

The loop below uses the train function above to train the embedding model for 10 epochs. We do not save the model at this point, as we are not yet classifying the images.

In [None]:
for epoch in range(1, 11):
    train(embeddingNetwork, lossFunction, miningFunction, device, trainLoader, optimizer, epoch)