<a href="https://colab.research.google.com/github/FilipeChagasDev/Facial-Recognition-ResNet50-SNN/blob/main/training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Siamese Neural Network (SNN) training with ResNet50 encoder

This notebook trains an SNN with a ResNet50 encoder for facial recognition applications.

## What is a Siamese Neural Network

A Siamese neural network (SNN) is an architecture proposed by Bromley et al. in 1994. It was originally designed to build signature recognizers, but it applies much more broadly. In general, it suits pattern recognition problems where there is no closed, predefined set of classes.

The structure of an SNN can be defined as follows.

$$y = d(f(a), f(b))$$

Where:
* $a$ and $b$ are input tensors. RGB images are typically third-order tensors.
* $f$ is the encoder function. Its purpose is to transform the input tensors into feature vectors. This function is non-linear.
* $d$ is the distance function. Its purpose is to calculate the distance between the feature vectors. Herein, the Euclidean distance is used as $d$.
* $y$ is the SNN output.

The SNN is trained with pairs of tensors $(a,b)$ as input, each labeled as **1** (genuine) or **0** (impostor). The pair $(a,b)$ is genuine only if $a$ and $b$ belong to the same class. During training, $d(f(a), f(b))$ is conditioned to give low distances to genuine pairs and high distances to impostor pairs.
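
As a minimal sketch of the structure $y = d(f(a), f(b))$, the snippet below uses a toy encoder in place of ResNet50. The names `toy_encoder`, `euclidean_distance`, and `snn_output`, the 4x4 input size, and the random weights are all hypothetical and chosen only for illustration; the point is that both inputs pass through the same encoder with the same weights.

```python
import numpy as np

def toy_encoder(x, weights):
    """Hypothetical stand-in for f: flatten, project linearly, apply tanh (non-linear)."""
    return np.tanh(x.reshape(-1) @ weights)

def euclidean_distance(u, v):
    """The distance function d between two feature vectors."""
    return float(np.linalg.norm(u - v))

def snn_output(a, b, weights):
    """y = d(f(a), f(b)); both inputs share the same encoder weights."""
    return euclidean_distance(toy_encoder(a, weights), toy_encoder(b, weights))

rng = np.random.default_rng(0)
weights = rng.normal(size=(4 * 4 * 3, 8)) * 0.1  # 4x4 RGB "images" -> 8 features
a = rng.random((4, 4, 3))
b = rng.random((4, 4, 3))

print(snn_output(a, a, weights))  # identical inputs give 0.0
print(snn_output(a, b, weights))  # different inputs give a positive distance
```

Because the encoder is shared, the output is symmetric in its arguments: swapping $a$ and $b$ does not change $y$.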

## What is a perfect SNN

A trained SNN can be considered perfect if it satisfies the following condition.

* $d(f(a_1), f(b_1)) < d(f(a_2), f(b_2))$ for any genuine pair $(a_1, b_1)$ and any impostor pair $(a_2, b_2)$.

Training rarely produces a perfect SNN. In practice, we only want $d(f(a_1), f(b_1)) < d(f(a_2), f(b_2))$ to hold for most genuine pairs $(a_1, b_1)$ and most impostor pairs $(a_2, b_2)$.
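
The condition above can be checked numerically once distances have been computed. The sketch below assumes two arrays of already-computed distances (the values are made up) and uses two hypothetical helpers: `is_perfect` tests the strict condition, and `ordered_fraction` measures how often the relaxed "most pairs" version holds.

```python
import numpy as np

def is_perfect(genuine_d, impostor_d):
    """True if every genuine distance is below every impostor distance."""
    return float(np.max(genuine_d)) < float(np.min(impostor_d))

def ordered_fraction(genuine_d, impostor_d):
    """Fraction of (genuine, impostor) combinations with genuine < impostor."""
    g = np.asarray(genuine_d)[:, None]   # shape (G, 1)
    i = np.asarray(impostor_d)[None, :]  # shape (1, I)
    return float(np.mean(g < i))         # broadcast to a (G, I) comparison grid

genuine_d = np.array([0.2, 0.4, 0.9])   # made-up distances for genuine pairs
impostor_d = np.array([0.8, 1.1, 1.5])  # made-up distances for impostor pairs

print(is_perfect(genuine_d, impostor_d))        # False: 0.9 is not below 0.8
print(ordered_fraction(genuine_d, impostor_d))  # 8 of the 9 combinations are ordered
```

A perfect SNN corresponds to an ordered fraction of exactly 1; values close to 1 indicate that the relaxed condition holds for most pairs.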

## SNN for facial recognition

In this work, we train the SNN with pairs of cropped face photos from the CelebA dataset. Pairs of photos of the same person are genuine, and pairs of photos of different people are impostors.

## Code

To run this notebook on Google Colab, you will need to upload the following files:

* helper.py
* pairing.py
* partitioning.py
* snn.py 

First, the necessary modules will be included.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications.resnet50 import ResNet50
import helper
from snn import SNNGenerator, SNN

Now, CelebA will be downloaded.

In [None]:
helper.download_celeba()

Now, **training** and **validation** partitions will be created. The **build_celeba_partitions** function generates the **celeba_partitions/partitions.json** file, which separates the **validation** people from the **training** people. The validation and training sets are made with photos of different people, so no person appears in both.

In [None]:
from partitioning import build_celeba_partitions
build_celeba_partitions()

The next code cell generates the paired image metadata. It might take a while.

The *build_celeba_pairs* function generates four CSV files: *eval_genuine_pairs.csv*, *eval_impostor_pairs.csv*, *training_genuine_pairs.csv* and *training_impostor_pairs.csv*. These files have the following format:

|file_a|person_a|file_b|person_b|
|:----:|:------:|:----:|:------:|
| ...  | ...    | ...  | ...    |

* The **file_a** column has photo filenames $a$.
* The **person_a** column has person identifiers $a$. Each person is identified by an integer greater than 0.
* The **file_b** column has photo filenames $b$.
* The **person_b** column has person identifiers $b$.

In files *eval_impostor_pairs.csv* and *training_impostor_pairs.csv*, all the rows have different values for *person_a* and *person_b*. In files *eval_genuine_pairs.csv* and *training_genuine_pairs.csv*, all the rows have equal values for *person_a* and *person_b*.
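
The invariant described above can be verified with a few lines of code. The sketch below uses only the standard library and two tiny in-memory CSV strings with made-up filenames and identifiers, so it runs without the generated files; `check_pairs` is a hypothetical helper, not part of *pairing.py*.

```python
import csv
import io

def check_pairs(csv_text, genuine):
    """Return True if every row is consistent with the file's pair type.

    genuine=True expects person_a == person_b in all rows;
    genuine=False expects person_a != person_b in all rows.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    return all((row["person_a"] == row["person_b"]) == genuine for row in rows)

# Made-up example rows in the four-column format described above.
genuine_csv = (
    "file_a,person_a,file_b,person_b\n"
    "000001.jpg,7,000042.jpg,7\n"
)
impostor_csv = (
    "file_a,person_a,file_b,person_b\n"
    "000001.jpg,7,000099.jpg,12\n"
)

print(check_pairs(genuine_csv, genuine=True))    # True
print(check_pairs(impostor_csv, genuine=False))  # True
print(check_pairs(impostor_csv, genuine=True))   # False
```

With the real files loaded into pandas, the same check reduces to comparing the `person_a` and `person_b` columns of each DataFrame.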

In [None]:
from pairing import build_celeba_pairs
build_celeba_pairs()

The following function loads a batch of images and formats them to the encoder.

* Width: 80px
* Height: 80px
* Format: RGB (3 channels)
* Data type: Floats normalized to the range [0, 1]

In [None]:
def load_images(paths):
    # np.float was removed in NumPy 1.24; np.float64 is the equivalent explicit dtype.
    images = [helper.get_image(path, 80, 80, 'RGB').astype(np.float64)/255 for path in paths]
    return np.array(images)
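
To illustrate the format produced by this function, the sketch below applies the same scaling to a synthetic image. It assumes that `helper.get_image` returns 8-bit pixel values (0 to 255), which the division by 255 suggests; the random array is only a stand-in for a decoded photo.

```python
import numpy as np

# Synthetic stand-in for a decoded 80x80 RGB photo with 8-bit pixel values.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(80, 80, 3), dtype=np.uint8)

# Convert to floating point and divide by 255, as load_images does.
scaled = raw.astype(np.float64) / 255

# Stacking several images yields a fourth-order tensor: (batch, height, width, channels).
batch = np.array([scaled, scaled])

print(batch.shape)               # (2, 80, 80, 3)
print(batch.min(), batch.max())  # both values lie within [0, 1]
```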

The next cell loads the CSVs generated by *build_celeba_pairs*.

In [None]:
training_genuine_pairs = pd.read_csv(os.path.join('celeba_pairs', 'training_genuine_pairs.csv'))
training_impostor_pairs = pd.read_csv(os.path.join('celeba_pairs', 'training_impostor_pairs.csv'))
eval_genuine_pairs = pd.read_csv(os.path.join('celeba_pairs', 'eval_genuine_pairs.csv'))
eval_impostor_pairs = pd.read_csv(os.path.join('celeba_pairs', 'eval_impostor_pairs.csv'))

CelebA is a big dataset. To reduce the time needed to train the SNN, we randomly sample a fraction of each pair set (one tenth, with `divider = 10`). A fixed `random_state` makes the sample reproducible.

In [None]:
divider = 10
training_genuine_pairs = training_genuine_pairs.sample(n=training_genuine_pairs.shape[0]//divider, random_state=1)
training_impostor_pairs = training_impostor_pairs.sample(n=training_impostor_pairs.shape[0]//divider, random_state=1)
eval_genuine_pairs = eval_genuine_pairs.sample(n=eval_genuine_pairs.shape[0]//divider, random_state=1)
eval_impostor_pairs = eval_impostor_pairs.sample(n=eval_impostor_pairs.shape[0]//divider, random_state=1)



The next cell implements the data generator needed to train the SNN. The generator is a class that loads and formats the images of each training batch.

In [None]:
import h5py as h5
from tqdm import tqdm

class CelebAGenerator(SNNGenerator):
    def __init__(self, name: str, genuine_pairs_df: pd.DataFrame, impostor_pairs_df: pd.DataFrame, batch_size: int):
        self.__genuine_pairs_df__ = genuine_pairs_df
        self.__impostor_pairs_df__ = impostor_pairs_df
        self.__batch_size__ = batch_size
        self.__name__ = name
        self.__n_batches__ = (self.__genuine_pairs_df__.shape[0]+self.__impostor_pairs_df__.shape[0])//self.__batch_size__
        
        if not os.path.exists(name):
            os.mkdir(name)
            print('Generating batches for', name)
            self.__batches__ = [self.get_batch(i) for i in tqdm(range(self.__n_batches__))]
        else:
            print('Loading batches for', name)
            self.__batches__ = [h5.File(os.path.join(self.__name__, f'b{i}.h5'), 'r') for i in tqdm(range(self.__n_batches__))]

    def __len__(self):
        return self.__n_batches__

    def get_batch(self, index):
        def get_genuines():
            my_rows = self.__genuine_pairs_df__.sample(n=self.__batch_size__//2, replace=False)
            images_a = load_images([os.path.join('celeba', 'img_align_celeba', fn) for fn in list(my_rows['file_a'])])
            images_b = load_images([os.path.join('celeba', 'img_align_celeba', fn) for fn in list(my_rows['file_b'])])
            images_y = np.ones(shape=(self.__batch_size__//2,1))
            return images_a, images_b, images_y

        def get_impostors():
            my_rows = self.__impostor_pairs_df__.sample(n=self.__batch_size__//2, replace=False)
            images_a = load_images([os.path.join('celeba', 'img_align_celeba', fn) for fn in list(my_rows['file_a'])])
            images_b = load_images([os.path.join('celeba', 'img_align_celeba', fn) for fn in list(my_rows['file_b'])])
            images_y = np.zeros(shape=(self.__batch_size__//2,1))
            return images_a, images_b, images_y

        genuines_a, genuines_b, genuines_y = get_genuines()
        impostors_a, impostors_b, impostors_y = get_impostors()
        a = np.append(genuines_a, impostors_a, axis=0)
        b = np.append(genuines_b, impostors_b, axis=0)
        y = np.append(genuines_y, impostors_y, axis=0)
        
        batch_h5 = h5.File(os.path.join(self.__name__, f'b{index}.h5'), 'a')
        batch_h5.create_dataset('a', data=a)
        batch_h5.create_dataset('b', data=b)
        batch_h5.create_dataset('y', data=y)
        
        return ([batch_h5['a'], batch_h5['b']], batch_h5['y'])

    def __getitem__(self, index: int):
        return self.__batches__[index]   
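
The batch arithmetic used by the generator can be sketched in isolation. The helper names `n_batches` and `batch_labels` and the pair counts below are made up for illustration; the formulas mirror the ones in `CelebAGenerator`.

```python
import numpy as np

def n_batches(n_genuine, n_impostor, batch_size):
    """Number of batches, computed as in CelebAGenerator.__init__."""
    return (n_genuine + n_impostor) // batch_size

def batch_labels(batch_size):
    """Label column of one batch: genuine half (1) followed by impostor half (0)."""
    half = batch_size // 2
    return np.append(np.ones((half, 1)), np.zeros((half, 1)), axis=0)

print(n_batches(1050, 1020, 100))  # 20 (floor division drops the remainder)

y = batch_labels(100)
print(y.shape, y.sum())            # (100, 1) 50.0
```

Every batch is balanced: half genuine pairs and half impostor pairs, in that order. Note that `get_batch` draws a fresh random sample from the DataFrames on each call, so the same pair can appear in more than one batch.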
      

We use ResNet50 as the encoder, trained from scratch (`weights=None`) and followed by global average pooling, dropout, and an extra linear layer. Feature vectors have 100 entries (features).
The following cell defines the function that creates the encoder.

In [None]:
def build_resnet50_encoder(n_features=100):
    base_model = ResNet50(weights=None, include_top=False, input_shape=(80,80,3))
    x = base_model.output
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.7)(x)
    x = layers.Dense(n_features, activation='linear')(x)
    model = keras.Model(inputs = base_model.input, outputs = x)
    return model

The next cell creates the SNN.

In [None]:
my_snn = SNN((80,80,3),build_resnet50_encoder())

The next cell creates the generators and trains the SNN.

The loss function used is the **contrastive loss** as described by Lian et al. in 2018; the loss itself was originally introduced by Hadsell, Chopra, and LeCun in 2006.

$$L(y,d)= yd^2 + (1-y)\max\{m-d,0\}^2$$

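Where:
* $y$ is the label (**1** for genuine pairs and **0** for impostor pairs).
* $d$ is the Euclidean distance between the feature vectors $f(a)$ and $f(b)$.
* $m$ is the margin, a positive constant. Impostor pairs whose distance already exceeds $m$ contribute nothing to the loss.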
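
As a rough illustration of the formula, the following NumPy sketch evaluates the loss for a few hand-picked distances. The function name `contrastive_loss`, the default margin of 1.0, and the averaging over the batch are assumptions made for this example; they are not taken from *snn.py*.

```python
import numpy as np

def contrastive_loss(y, d, margin=1.0):
    """Mean of y*d^2 + (1-y)*max(margin-d, 0)^2 over a batch.

    margin=1.0 is an illustrative default, not a value taken from snn.py.
    """
    y = np.asarray(y, dtype=np.float64)
    d = np.asarray(d, dtype=np.float64)
    per_pair = y * d**2 + (1 - y) * np.maximum(margin - d, 0) ** 2
    return float(per_pair.mean())

print(contrastive_loss([1], [0.5]))  # 0.25: genuine pair, penalized by d^2
print(contrastive_loss([0], [0.5]))  # 0.25: impostor pair inside the margin
print(contrastive_loss([0], [2.0]))  # 0.0: impostor pair beyond the margin
```

The first term pulls genuine pairs together, while the second pushes impostor pairs apart only until their distance reaches the margin.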

In [None]:
training_generator = CelebAGenerator('training_batches', training_genuine_pairs, training_impostor_pairs, 100)
eval_generator = CelebAGenerator('eval_batches', eval_genuine_pairs, eval_impostor_pairs, 100)
my_snn.fit(training_generator, eval_generator, epochs=50)

The next cell plots the evolution of the training loss and the validation loss over the course of training.

In [None]:
plt.plot([i+1 for i in range(len(my_snn.training_loss_history))], my_snn.training_loss_history, label='Training')
plt.plot([i+1 for i in range(len(my_snn.validation_loss_history))], my_snn.validation_loss_history, label='Validation')
plt.yscale('log')
plt.xlabel('Epochs')
plt.ylabel('Loss')  
plt.legend()
plt.grid()
plt.show()

The next cell saves the encoder's weights to a file. We will use this encoder later as part of a prediction framework.

In [None]:
my_snn.save_encoder('resnet50_encoder_weights.h5')

**If you're on Google Colab, don't forget to download the file *resnet50_encoder_weights.h5* as it will be needed to run the other notebooks.**