# **Introduction**
---

**About the Dataset**
---
---

This dataset contains over **100,000 facial images** of **1,063 different celebrities** and public figures, including actors, musicians, politicians, athletes, and social media influencers. The images are in **JPEG format** and vary in quality, size, and setting. The dataset also includes metadata such as each individual's name, gender, and number of available images. It is designed for research in computer vision and machine learning, for tasks such as face recognition and facial expression analysis.

**Problem Statement**
---

---

The facial recognition problem in the context of this dataset is to develop a machine learning model that can accurately identify individuals from their facial images. Given a new image of a person's face, the goal of the model would be to correctly identify the person in the image by matching it with the images of known individuals in the dataset.

The problem of facial recognition has many potential applications, including security systems, access control, personal identification, and more. However, it is important to ensure that these systems are designed and used in an ethical manner, with appropriate safeguards to protect privacy and prevent misuse.

In this notebook, we will explore the PINS Face Recognition Dataset and propose a solution using the FaceNet model for facial recognition. We will discuss preprocessing the data, building a database of face embeddings with a pre-trained FaceNet model, recognizing faces, and evaluating the model's performance. We will also highlight the benefits of using a pre-trained FaceNet model for facial recognition tasks.


# **Setup**

---

Before proceeding further, let's ensure that we have imported all the necessary modules required for our notebook. This will help us work more efficiently.

In [None]:
# Common
import os
import cv2 as cv
import numpy as np
from IPython.display import clear_output as cls

# Data 
from tqdm import tqdm
from glob import glob

# Data Visualization
import plotly.express as px
import matplotlib.pyplot as plt

# Model
from tensorflow.keras.models import load_model

In [None]:
# Set the random seed for reproducibility
np.random.seed(42)

# Define the image dimensions
IMG_W, IMG_H, IMG_C = (160, 160, 3)

# **Data Loading**
---

Loading the data into memory is a crucial step in face recognition. We typically work with two types of datasets:
1. The test collection: images used to test our model's inference capabilities.
2. The database: images used to compute the face embeddings that new faces are compared against.

The database has a significant impact on the performance of our model in two ways. First, the quality of the images in the database can affect our model's ability to recognize faces. Second, the total number of images per individual in the database also affects model performance.


In this dataset, we have a large number of images, and the images vary significantly in terms of lighting, pose, and expression. To account for this diversity, we need a database that is large enough to include embeddings of various diverse images. Therefore, we will use 10 images per individual to ensure that our database is comprehensive and diverse enough to improve model performance.
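One way to keep the database images and the test images separate is to split each person's file list before any embeddings are computed. The sketch below is an illustration under assumptions, not part of the pipeline used in this notebook: the helper name `split_paths`, its parameters, and the toy file names are all hypothetical.

```python
import numpy as np

def split_paths(paths_by_person, n_database=10, seed=42):
    """Split each person's image paths into database and test lists.

    paths_by_person: dict mapping a person's name to a list of file paths.
    Returns two dicts with the same keys: (database_paths, test_paths).
    """
    rng = np.random.default_rng(seed)
    database_paths, test_paths = {}, {}
    for name, paths in paths_by_person.items():
        # Shuffle a copy so the caller's list is left untouched
        shuffled = list(paths)
        rng.shuffle(shuffled)
        # Never take more images than this person actually has
        k = min(n_database, len(shuffled))
        database_paths[name] = shuffled[:k]
        test_paths[name] = shuffled[k:]
    return database_paths, test_paths

# Toy example with made-up file names
toy = {
    "Person A": [f"a_{i}.jpg" for i in range(15)],
    "Person B": [f"b_{i}.jpg" for i in range(12)],
}
db_paths, held_out = split_paths(toy, n_database=10)
print(len(db_paths["Person A"]), len(held_out["Person A"]))
```

With a split like this, accuracy measured on the held-out paths is not inflated by images that also contributed to the database embeddings.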


Let's start by collecting all the images present in the data.

In [None]:
# Specify the root directory path
root_path = '/kaggle/input/pins-face-recognition/105_classes_pins_dataset/'

# Collect all the person names
dir_names = os.listdir(root_path)
person_names = [name.split("_")[-1].title() for name in dir_names]
n_individuals = len(person_names)

print(f"Total number of individuals: {n_individuals}\n")
print(f"Name of the individuals : \n\t{person_names}")

This list of names will be used as **labels** to identify specific individuals when we build the face database and evaluate recognition.

In [None]:
# Number of images available per person
n_images_per_person = [len(os.listdir(root_path + name)) for name in dir_names]
n_images = sum(n_images_per_person)

# Show
print(f"Total Number of Images : {n_images}.")

In [None]:
# Plot the Distribution of number of images per person.
fig = px.bar(x=person_names, y=n_images_per_person, color=person_names)
fig.update_layout({'title':{'text':"Distribution of number of images per person"}})
fig.show()

In [None]:
# Select all the file paths
filepaths = [path  for name in dir_names for path in glob(root_path + name + '/*')]
np.random.shuffle(filepaths)
print(f"Total number of images to be loaded : {len(filepaths)}")

# Create space for the images
all_images = np.empty(shape=(len(filepaths), IMG_W, IMG_H, IMG_C), dtype = np.float32)
all_labels = np.empty(shape=(len(filepaths), 1), dtype = np.int32)

# For each path, load the image and apply some preprocessing.
for index, path in tqdm(enumerate(filepaths), total=len(filepaths), desc="Loading Data"):
    
    # Extract the label from the parent directory name.
    # person_names is aligned with dir_names, so the index is the same.
    label = dir_names.index(os.path.basename(os.path.dirname(path)))
    
    # Load the Image
    image = plt.imread(path)
    
    # Resize the image
    image = cv.resize(image, dsize = (IMG_W, IMG_H))
    
    # Convert the image dtype and scale pixel values to [0, 1]
    image = image.astype(np.float32)/255.0
    
    # Store the image and the label
    all_images[index] = image
    all_labels[index] = label

# **Data Visualization**
---

Now that we have our data set loaded, we can move on to visualizing it. Visualization is an important step in data analysis, as it helps us gain insight into the data and identify any patterns or anomalies. In the context of face recognition, visualization allows us to assess the quality of the images and how well they represent each individual.

In [None]:
def show_data(
    images: np.ndarray, 
    labels: np.ndarray,
    GRID: tuple=(15,6),
    FIGSIZE: tuple=(25,50), 
    recog_fn = None,
    database = None
) -> None:
    
    """
    Function to plot a grid of images with their corresponding labels.

    Args:
        images (numpy.ndarray): Array of images to plot.
        labels (numpy.ndarray): Array of corresponding labels for each image.
        GRID (tuple, optional): Tuple with the number of rows and columns of the plot grid. Defaults to (15,6).
        FIGSIZE (tuple, optional): Tuple with the size of the plot figure. Defaults to (25,50).
        recog_fn (function, optional): Function to perform face recognition. Defaults to None.
        database (dictionary, optional): Dictionary with the encoding of the images for face recognition. Defaults to None.

    Returns:
        None
    """
    
    # Plotting Configuration
    plt.figure(figsize=FIGSIZE)
    n_rows, n_cols = GRID
    n_images = n_rows * n_cols
    
    # loop over the images and labels
    for index in range(n_images):
        
        # Select a random image and its label
        image_index = np.random.randint(len(images))
        image, label = images[image_index], person_names[int(labels[image_index])]
        
        # Create a Subplot
        plt.subplot(n_rows, n_cols, index+1)
        
        # Plot Image
        plt.imshow(image)
        plt.axis('off')
        
        if recog_fn is None:
            # Plot title
            plt.title(label)
        else:
            recognized = recog_fn(image, database)
            plt.title(f"True:{label}\nPred:{recognized}")
    
    # Show final Plot
    plt.tight_layout()
    plt.show()

In [None]:
show_data(images = all_images, labels = all_labels)

This dataset contains a **wide variety of photographs**. The images span a **range of lighting conditions, poses, and facial expressions**, and show these well-known figures in both candid and posed shots. This diversity is what makes the recognition task challenging, and it is worth keeping in mind when we build the face database.

# **Face Database**
---

In order to create an effective face recognition model, it is essential to have a diverse and comprehensive database of images. However, manually selecting images for such a large dataset can be a daunting task. To streamline the process, we have opted to randomly choose 10 images per person from our extensive collection. This approach not only saves time and effort, but also ensures that we have a representative sample of each individual's facial features, expressions, and poses. Once the images are selected, we will use the average encoding produced by them for comparison, which will help us improve the accuracy of our face recognition model. Please note that for computational efficiency, we will only save the encodings of these images, not the images themselves.
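The averaging step described above can be sketched with plain NumPy. Note one detail: the mean of several unit-length embeddings is generally shorter than unit length. The `renormalize` option below is an optional variation and an assumption of this sketch; the `generate_avg_embedding` function in this notebook keeps the raw mean. The helper name `mean_embedding` is hypothetical, and random vectors stand in for real FaceNet embeddings.

```python
import numpy as np

def mean_embedding(embeddings, renormalize=True):
    """Average a stack of embeddings of shape (n, d) into one vector of shape (d,)."""
    embeddings = np.asarray(embeddings, dtype=np.float64)
    avg = embeddings.mean(axis=0)
    if renormalize:
        norm = np.linalg.norm(avg)
        # Guard against a zero vector, which cannot be normalized
        if norm > 0:
            avg = avg / norm
    return avg

# Random unit vectors stand in for real FaceNet embeddings
rng = np.random.default_rng(0)
fake = rng.normal(size=(10, 128))
fake /= np.linalg.norm(fake, axis=1, keepdims=True)

raw_avg = mean_embedding(fake, renormalize=False)
unit_avg = mean_embedding(fake, renormalize=True)
print(np.linalg.norm(raw_avg), np.linalg.norm(unit_avg))
```

Whether to re-normalize is a design choice: it puts the database entries on the same scale as the query embeddings, which makes a single distance threshold easier to interpret.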

In [None]:
def load_image(image_path: str, IMG_W: int = IMG_W, IMG_H: int = IMG_H) -> np.ndarray:
    """Load and preprocess image.
    
    Args:
        image_path (str): Path to image file.
        IMG_W (int, optional): Width of image. Defaults to 160.
        IMG_H (int, optional): Height of image. Defaults to 160.
    
    Returns:
        np.ndarray: Preprocessed image.
    """
    
    # Load the image
    image = plt.imread(image_path)
    
    # Resize the image
    image = cv.resize(image, dsize=(IMG_W, IMG_H))
    
    # Convert image type and normalize pixel values
    image = image.astype(np.float32) / 255.0
    
    return image

def image_to_embedding(image: np.ndarray, model) -> np.ndarray:
    """Generate face embedding for image.
    
    Args:
        image (np.ndarray): Image to generate encoding for.
        model : Pretrained face recognition model.
    
    Returns:
        np.ndarray: Face embedding for image.
    """
    
    # Obtain image encoding
    embedding = model.predict(image[np.newaxis,...])
    
    # Normalize the embedding using the L2 norm.
    embedding /= np.linalg.norm(embedding, ord=2)
    
    # Return embedding
    return embedding
    
def generate_avg_embedding(image_paths: list, model) -> np.ndarray:
    """Generate average face embedding for list of images.
    
    Args:
        image_paths (list): List of paths to image files.
        model : Pretrained face recognition model.
    
    Returns:
        np.ndarray: Average face embedding for images.
    """
    
    # Collect embeddings
    embeddings = np.empty(shape=(len(image_paths), 128))
    
    # Loop over images
    for index, image_path in enumerate(image_paths):
        
        # Load the image
        image = load_image(image_path)
        
        # Generate the embedding
        embedding = image_to_embedding(image, model)
        
        # Store the embedding
        embeddings[index] = embedding
        
    # Compute average embedding
    avg_embedding = np.mean(embeddings, axis=0)
    
    # Clear Output
    cls()
    
    # Return average embedding
    return avg_embedding

Loading the **FaceNet** model:

In [None]:
# Load model
model = load_model('/kaggle/input/facenet-keras/facenet_keras.h5')

In [None]:
# Randomly select 10 image paths per person (np.random.choice samples with replacement by default).
filepaths = [np.random.choice(glob(root_path + name + '/*'), size=10) for name in dir_names]

# Create data base
database = {name:generate_avg_embedding(paths, model=model) for paths, name in tqdm(zip(filepaths, person_names), desc="Generating Embeddings")}

# **Face Recognition**
---

Facial recognition technology is an area of rapid development that employs machine learning algorithms to authenticate and validate individuals based on their facial characteristics. This technology is widely used in various applications, including unlocking smartphones and surveillance systems.

There are two primary types of facial recognition problems: face verification and face recognition. Face verification involves confirming a person's identity, such as when passing through customs or using facial recognition to unlock a phone. On the other hand, face recognition is a multi-class problem that aims to identify an individual from a large pool of people.
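The difference between the two problems can be shown with a small sketch. It is illustrative only: the function names `verify` and `identify`, the names in the toy database, and the 2-D vectors standing in for real embeddings are all assumptions of this example.

```python
import numpy as np

def verify(embedding, claimed_embedding, threshold=1.0):
    """Face verification (1:1): does the face match one claimed identity?"""
    return float(np.linalg.norm(embedding - claimed_embedding)) < threshold

def identify(embedding, known, threshold=1.0):
    """Face recognition (1:N): which known identity, if any, is closest?"""
    best_name, best_dist = None, float("inf")
    for name, reference in known.items():
        dist = float(np.linalg.norm(embedding - reference))
        if dist < best_dist:
            best_name, best_dist = name, dist
    # Reject the best candidate if it is still too far away
    return best_name if best_dist < threshold else None

# Tiny 2-D vectors stand in for real 128-D embeddings
known = {"alice": np.array([1.0, 0.0]), "bob": np.array([0.0, 1.0])}
query = np.array([0.9, 0.1])

print(verify(query, known["alice"]))  # compares against one identity
print(identify(query, known))         # searches all identities
```

Verification needs a single comparison, while recognition compares the query against every identity and must also decide when no one matches.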

The success of facial recognition technology is attributed to the use of deep neural networks, such as the Siamese Network. This type of neural network architecture is commonly used for tasks involving similarity or distance measurement, as it consists of two or more identical subnetworks that share the same weights and architecture.

In the context of facial recognition, a Siamese Network takes two facial images as input and learns to output a similarity score indicating how similar the two images are in terms of facial features. This network has also been applied to other tasks, such as text similarity and signature verification, making it a valuable tool for various similarity and distance measurement tasks.

The use of a Siamese network for facial recognition involves generating 128-dimensional embeddings for all images in a database. When a new image is inputted, an embedding is produced and compared with the rest of the embeddings in the database to perform facial recognition with a high degree of accuracy. This comparison is made possible by the rich information contained in the embeddings about facial features and their relationships.
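To build intuition for what a distance threshold means, it helps to relate the L2 distance between two unit-length embeddings to their cosine similarity: for unit vectors, the squared distance equals 2 minus twice the cosine. The sketch below checks this numerically on random vectors, which stand in for real embeddings. The helper `distance_to_cosine` is hypothetical, and the relation is exact only when both vectors have unit length, so it holds only approximately for averaged database embeddings that are not re-normalized.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two random unit vectors stand in for L2-normalized embeddings
a = rng.normal(size=128)
b = rng.normal(size=128)
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

distance = np.linalg.norm(a - b)
cosine = float(a @ b)

# For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
print(np.isclose(distance ** 2, 2.0 - 2.0 * cosine))

def distance_to_cosine(threshold):
    """Cosine similarity equivalent to an L2 distance threshold (unit vectors only)."""
    return 1.0 - threshold ** 2 / 2.0

# A distance threshold of 1.0 corresponds to a cosine similarity of 0.5
print(distance_to_cosine(1.0))
```

Read this way, a tighter distance threshold demands a higher cosine similarity before two faces count as a match.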

In [None]:
def compare_embeddings(embedding_1: np.ndarray, embedding_2: np.ndarray, threshold: float = 0.8) -> float:
    """
    Compares two embeddings and returns the distance between them if it is less than the threshold, else 0.

    Args:
    - embedding_1: A 128-dimensional embedding vector.
    - embedding_2: A 128-dimensional embedding vector.
    - threshold: A float value representing the maximum allowed distance between embeddings for them to be considered a match.

    Returns:
    - The L2 distance between the embeddings if it is less than the threshold, else 0.
      Note that an exact distance of 0 is indistinguishable from "no match".
    """

    # Calculate the difference between the embeddings
    embedding_distance = embedding_1 - embedding_2

    # Calculate the L2 norm of the difference vector
    embedding_distance_norm = np.linalg.norm(embedding_distance)

    # Return the distance if it is less than the threshold, else 0
    return embedding_distance_norm if embedding_distance_norm < threshold else 0

In [None]:
def recognize_face(image: np.ndarray, database: dict, threshold: float = 1.0, model = model) -> str:
    """
    Given an image, recognize the person in the image using a pre-trained model and a database of known faces.
    
    Args:
        image (np.ndarray): The input image as a numpy array.
        database (dict): A dictionary containing the embeddings of known faces.
        threshold (float): The distance threshold below which two embeddings are considered a match.
        model (keras.Model): A pre-trained Keras model for extracting image embeddings.
        
    Returns:
        str: The name of the recognized person, or "No Match Found" if no match is found.
    """
    
    # Generate embedding for the new image
    image_emb = image_to_embedding(image, model)
    
    # Clear output
    cls()
    
    # Store distances
    distances = []
    names = []
    
    # Loop over database
    for name, embed in database.items():
        
        # Compare the embeddings
        dist = compare_embeddings(embed, image_emb, threshold=threshold)
        
        if dist > 0:
            # Append the score
            distances.append(dist)
            names.append(name)
    
    # Select the min distance
    if distances:
        min_dist = min(distances)
    
        return names[distances.index(min_dist)].title().strip()
    
    return "No Match Found"

Let's quickly check how the function behaves on a single image.

In [None]:
# Randomly select an index
index = np.random.randint(len(all_images))

# Obtain an image and its corresponding label
image_ = all_images[index]
label_ = person_names[int(all_labels[index])]

# Recognize the face in the image
title = recognize_face(image_, database)

# Plot the image along with its true and predicted labels
plt.imshow(image_)
plt.title(f"True:{label_}\nPred:{title}")
plt.axis('off')
plt.show()

In [None]:
show_data(all_images, all_labels, recog_fn = recognize_face, database = database)

# **Analysis**
---
In the sample above, the model correctly identifies most individuals. There are a few mistakes, but the overall performance is strong. This is likely due to the mean embeddings computed for each person: averaging over several images smooths out variations in lighting, pose, and expression, which makes the comparison more robust.

Recognition itself reduces to comparing distances between embeddings. This is fast, because each query needs only one forward pass through the network and one distance computation per person in the database.

# **Accuracy** 
Let's measure the accuracy of the model. The accuracy is calculated on a randomly selected **subset of the dataset**. Averaging over several such subsets gives approximately the same figure, so below we report the accuracy of a single subset.

In [None]:
# Number of images to evaluate
n_images = 50

# Initialize the number of correct predictions
n_correct = 0

# Randomly select images from the whole dataset, without repeats
indices = np.random.choice(len(all_images), size=n_images, replace=False)
temp_images = all_images[indices]
temp_labels = all_labels[indices]

# Iterate over each image and its corresponding label
for (image, label) in zip(temp_images, temp_labels):
    
    # Extract the true label of the person in the image
    true_label = person_names[int(label)]

    # Use the recognize_face function to predict the label of the person in the image
    pred_label = recognize_face(image, database)

    # If the true label and the predicted label match, increment the number of correct predictions
    if true_label == pred_label:
        n_correct += 1

# Calculate the accuracy of the model
acc = (n_correct / n_images) * 100.0

# Print the accuracy of the model
print(f"Model Accuracy: {acc}%!!!")
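The accuracy loop above can also be wrapped in a reusable helper that repeats the measurement over several random subsets and reports the mean and spread. This is a minimal sketch under assumptions: the name `estimate_accuracy` and its parameters are hypothetical, and a toy parity "model" replaces the real recognizer so that the example runs on its own.

```python
import numpy as np

def estimate_accuracy(predict_fn, images, labels, n_samples=50, n_repeats=5, seed=42):
    """Estimate accuracy on several random subsets.

    predict_fn: callable mapping one image to a predicted label.
    Returns (mean_accuracy, std_accuracy) as fractions in [0, 1].
    """
    rng = np.random.default_rng(seed)
    n_samples = min(n_samples, len(images))
    scores = []
    for _ in range(n_repeats):
        # Sample indices from the whole array, without repeats
        idx = rng.choice(len(images), size=n_samples, replace=False)
        correct = sum(predict_fn(images[i]) == labels[i] for i in idx)
        scores.append(correct / n_samples)
    return float(np.mean(scores)), float(np.std(scores))

# Toy check: the "images" are numbers and the "model" labels them by parity
toy_images = np.arange(200)
toy_labels = toy_images % 2
mean_acc, std_acc = estimate_accuracy(lambda x: x % 2, toy_images, toy_labels)
print(mean_acc, std_acc)
```

In this notebook, `predict_fn` could wrap `recognize_face` with a fixed database, with the integer labels first mapped to names through `person_names`.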

# **Working With Large Database**
---

As we discussed earlier, the **size of the database** plays a crucial role in **facial recognition accuracy**. To further analyze the performance of our model, we can **increase the number of images per person** used to build the database and observe the impact on accuracy. Averaging over more images gives a more representative embedding for each person, which can improve recognition.

In [None]:
# Select all the file paths : 50 images per person.
filepaths = [np.random.choice(glob(root_path + name + '/*'), size=50) for name in dir_names]

# Create data base
large_database = {name:generate_avg_embedding(paths, model=model) for paths, name in tqdm(zip(filepaths, person_names), desc="Generating Embeddings")}

In [None]:
show_data(all_images, all_labels, recog_fn = recognize_face, database = large_database)

Having a **sufficient database** is crucial for accurate face recognition. To improve our model's performance, we **computed 50 embeddings per person and averaged them into a single embedding per person**. Let's measure the accuracy with this larger database.

In [None]:
# Number of images to evaluate
n_images = 100

# Initialize the number of correct predictions
n_correct = 0

# Randomly select images from the whole dataset, without repeats
indices = np.random.choice(len(all_images), size=n_images, replace=False)
temp_images = all_images[indices]
temp_labels = all_labels[indices]

# Iterate over each image and its corresponding label
for (image, label) in zip(temp_images, temp_labels):
    
    # Extract the true label of the person in the image
    true_label = person_names[int(label)]

    # Use the recognize_face function to predict the label of the person in the image
    pred_label = recognize_face(image, large_database)

    # If the true label and the predicted label match, increment the number of correct predictions
    if true_label == pred_label:
        n_correct += 1

# Calculate the accuracy of the model
acc = (n_correct / n_images) * 100.0

# Print the accuracy of the model
print(f"Model Accuracy: {acc}%!!!")

With the larger database of **50 averaged embeddings per person**, accuracy rose from **90% to 92%**. The gain is modest, and both figures come from small random samples (50 and 100 images), so it should be read as an indication rather than a precise measurement. Even so, it suggests that a **large and diverse database** helps face recognition models achieve higher accuracy.
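To judge how much weight a two-point difference can carry, it helps to attach an approximate uncertainty to each accuracy figure. The sketch below uses the normal approximation to the binomial distribution, a standard rough estimate rather than anything computed in the original notebook. The helper name `accuracy_interval` is hypothetical; the inputs are the accuracies and sample sizes quoted above.

```python
import math

def accuracy_interval(accuracy, n, z=1.96):
    """Approximate 95% interval for an accuracy measured on n samples.

    Uses the normal approximation to the binomial distribution.
    """
    se = math.sqrt(accuracy * (1.0 - accuracy) / n)
    low = max(0.0, accuracy - z * se)
    high = min(1.0, accuracy + z * se)
    return low, high

# Figures quoted in the text: 90% on 50 images, 92% on 100 images
small_low, small_high = accuracy_interval(0.90, 50)
large_low, large_high = accuracy_interval(0.92, 100)
print(f"10 images/person: {small_low:.3f} to {small_high:.3f}")
print(f"50 images/person: {large_low:.3f} to {large_high:.3f}")
```

The two intervals overlap substantially, so a larger evaluation sample would be needed before concluding how much the bigger database helps.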

In [None]:
# Randomly select 25 image paths per person.
filepaths = [np.random.choice(glob(root_path + name + '/*'), size=25) for name in dir_names]

# Create data base
med_database = {name:generate_avg_embedding(paths, model=model) for paths, name in tqdm(zip(filepaths, person_names), desc="Generating Embeddings")}

In [None]:
# show_data(all_images, all_labels, recog_fn = recognize_face, database = med_database)

In [None]:
# Number of images to evaluate
n_images = 100

# Initialize the number of correct predictions
n_correct = 0

# Randomly select images from the whole dataset, without repeats
indices = np.random.choice(len(all_images), size=n_images, replace=False)
temp_images = all_images[indices]
temp_labels = all_labels[indices]

# Iterate over each image and its corresponding label
for (image, label) in zip(temp_images, temp_labels):
    
    # Extract the true label of the person in the image
    true_label = person_names[int(label)]

    # Use the recognize_face function to predict the label of the person in the image
    pred_label = recognize_face(image, med_database)

    # If the true label and the predicted label match, increment the number of correct predictions
    if true_label == pred_label:
        n_correct += 1

# Calculate the accuracy of the model
acc = (n_correct / n_images) * 100.0

# Print the accuracy of the model
print(f"Model Accuracy: {acc}%!!!")

While the comparison shows the benefits of a **larger database**, I still prefer the **medium-sized database** for this project. In my opinion, averaging over very many images can introduce **some noise** into the embeddings, which can negatively impact the accuracy of the model. For this reason, I found that averaging **around 20 embeddings per person** (25 in the medium-sized database above) is sufficient for achieving high accuracy.

---
Note: To obtain a **significant improvement** in the performance of the face recognition model, it is crucial to **retrain it on the new dataset**. In our case, we only used a **pre-trained network** and achieved **92% accuracy**. To take the model's accuracy further, it would need to be trained on more extensive and diverse datasets.

**Retraining the model** on new data will help it learn facial features that were **not present in the original dataset**. It can also help **reduce bias** that may have resulted from a lack of diversity in the original dataset, which can lead to inaccurate or incomplete facial recognition.