# Introduction

In this notebook, we shall be using Siamese Network in order to build a model to perform the task of face verification for a given character.

The paper referred to for performing this experiment is [linked here](https://proceedings.neurips.cc/paper/1993/file/288cc0ff022877bd3df94bc9360b9c5d-Paper.pdf).

In [1]:
from siameseDataset import *
from loss_func import *
from siameseModel import *
import torch
from torch import nn as nn
import pandas as pd
torch.autograd.set_detect_anomaly(True)
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
data_path = "/home/vinayak/cleaned_anime_faces"
model_save_path = "/home/vinayak/anime_face_recognition/enet_model.pth"
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

In [3]:
partition = {}
split_info = pd.read_csv(f"/home/vinayak/anime_face_recognition/data.csv")
partition["train"] = list(split_info[split_info.label == "train"].images)
random.shuffle(partition["train"])
partition["validation"] = list(split_info[split_info.label == "valid"].images)

In [4]:
# https://omoindrot.github.io/triplet-loss#strategies-in-online-mining

In [5]:
# Create a training dataset and use it to create a training_generator
training_set = siameseDataset(partition['train'])
training_generator = torch.utils.data.DataLoader(training_set, batch_size = 1)
# training_set.show_sample()

In [6]:
# Create a training dataset and use it to create a validation_generator
validation_set = siameseDataset(partition['validation'], dtype = "validation")
validation_generator = torch.utils.data.DataLoader(validation_set, batch_size = 1)
# validation_set.show_sample()

In [7]:
# Create the model and move it to appropriate device (i.e. cuda if gpu is available)
model = enet_model().to(DEVICE)

In [8]:
# Define the loss function to be used for training
loss_func = batchHardTripletLoss().to(DEVICE)

In [9]:
# Define a learning rate and create an optimizer for training the model 
# (Adam with default momentum should be good)
learning_rate = 1e-3
optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate)

# Define a learning rate scheduler so that you reduce the learning rate
# As you progress across multiple epochs
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience = 3, factor = 0.2, threshold = 1e-6)

In [10]:
# Training Loop
train_losses = []
valid_losses = []

n_epochs = 25
n_train_batches = len(training_generator)
n_valid_batches = len(validation_generator)
PRINT_PROGRESS = 30

round_off = lambda x: round(x, 5)

# Loop over number of epochs
for epch in range(n_epochs):
    
    # Initialize the loss values to zero at the beginning of the epoch
    train_loss = 0.
    valid_loss = 0.

    # Train for an epoch
    for idx, (images, labels) in enumerate(training_generator, start = 1):
        images, labels = images[0].to(DEVICE), labels.to(DEVICE)
        feature_vectors = model(images)
        loss = loss_func(feature_vectors, labels)
        loss.backward()
        optimizer.step()
        
        batch_loss = round_off(loss.item())
        train_loss += batch_loss
        
        if (idx % PRINT_PROGRESS == 0) or (idx == 1) or (idx == n_train_batches):
            print(f"Epoch: {(epch + 1):<4}| Batch Number: {idx:<4}| Current Batch Loss: {batch_loss:<7}| Average Train Loss: {round_off(train_loss / idx):<7}")
    
    # Validate after the trained epoch
    for images, labels in validation_generator:
        images, labels = images[0].to(DEVICE), labels.to(DEVICE)
        with torch.no_grad():
            feature_vectors = model(images)
            loss = loss_func(feature_vectors, labels)
            valid_loss += round_off(loss.item())
    
    # Average the train and valid losses across all batches and save it to our array
    train_loss = round_off(train_loss / n_train_batches)
    valid_loss = round_off(valid_loss / n_valid_batches)
    
    print(f"_____________________________ End of Epoch {epch + 1} _____________________________")
    print(f"Epoch: {(epch + 1):<4}| Train Loss: {train_loss:<7}| Valid Loss: {valid_loss:<7}")
    print()
    
    train_losses.append(train_loss)
    valid_losses.append(valid_loss)
    
    # Check the valid loss and reduce learning rate as per the need
    scheduler.step(valid_loss)

Epoch: 1   | Batch Number: 1   | Current Batch Loss: 2.20982| Average Train Loss: 2.20982
Epoch: 1   | Batch Number: 30  | Current Batch Loss: 1.68284| Average Train Loss: 2.04862
Epoch: 1   | Batch Number: 55  | Current Batch Loss: 1.27986| Average Train Loss: 1.79386
_____________________________ End of Epoch 1 _____________________________
Epoch: 1   | Train Loss: 1.79386| Valid Loss: 1.31989

Epoch: 2   | Batch Number: 1   | Current Batch Loss: 1.34464| Average Train Loss: 1.34464
Epoch: 2   | Batch Number: 30  | Current Batch Loss: 1.12947| Average Train Loss: 1.21264
Epoch: 2   | Batch Number: 55  | Current Batch Loss: 1.06667| Average Train Loss: 1.15808
_____________________________ End of Epoch 2 _____________________________
Epoch: 2   | Train Loss: 1.15808| Valid Loss: 1.07646

Epoch: 3   | Batch Number: 1   | Current Batch Loss: 1.0751 | Average Train Loss: 1.0751 
Epoch: 3   | Batch Number: 30  | Current Batch Loss: 1.05491| Average Train Loss: 1.05316
Epoch: 3   | Batch N

_____________________________ End of Epoch 21 _____________________________
Epoch: 21  | Train Loss: 1.00262| Valid Loss: 1.00286

Epoch: 22  | Batch Number: 1   | Current Batch Loss: 1.00251| Average Train Loss: 1.00251
Epoch: 22  | Batch Number: 30  | Current Batch Loss: 1.00259| Average Train Loss: 1.00267
Epoch: 22  | Batch Number: 55  | Current Batch Loss: 1.00225| Average Train Loss: 1.00258
_____________________________ End of Epoch 22 _____________________________
Epoch: 22  | Train Loss: 1.00258| Valid Loss: 1.00235

Epoch: 23  | Batch Number: 1   | Current Batch Loss: 1.00272| Average Train Loss: 1.00272
Epoch: 23  | Batch Number: 30  | Current Batch Loss: 1.00325| Average Train Loss: 1.00227
Epoch: 23  | Batch Number: 55  | Current Batch Loss: 1.00154| Average Train Loss: 1.00232
_____________________________ End of Epoch 23 _____________________________
Epoch: 23  | Train Loss: 1.00232| Valid Loss: 1.0026 

Epoch: 24  | Batch Number: 1   | Current Batch Loss: 1.00254| Avera

In [11]:
# Save the losses to a loss_history.csv file on the disk
history = pd.DataFrame({"train_loss": train_losses, "valid_loss":valid_losses})
history.to_csv("loss_history.csv", index = False)

In [12]:
# Save the trained model to our disk
torch.save(model.state_dict(), model_save_path)