# Let's recognise faces

## Naive Face Verification



### Packages needed



You should already have created a virtual environment as described in the readme.MD file. It needs to be activated for this notebook!

You can install the needed packages in the next cell.

In [None]:
%pip install -r requirements.txt

Some Jupyter housekeeping

In [None]:
%matplotlib inline
%load_ext autoreload
%autoreload 2

## Let's get an Inception ResNet model trained on VGGFace2



Inception Model

https://arxiv.org/pdf/1409.4842v1

Inception is the codename of a neural network architecture that set a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

![Why](meme.png)


Two different kinds of inception blocks

![module](module.png)

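Within a block, each branch pads its input so that the spatial output size stays the same regardless of kernel size, which lets the branch outputs be concatenated along the channel axis.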
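
A quick way to check the claim above is the standard convolution output-size formula. The sketch below assumes stride 1 and odd kernel sizes padded with `(k - 1) // 2` (the usual "same" padding); the helper `conv_output_size` and the 28-pixel input are made up for illustration and are not part of any library.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


input_size = 28  # assumed feature-map width/height, for illustration only

# Branches of an inception block use different kernel sizes (1x1, 3x3, 5x5).
# Padding each branch with (k - 1) // 2 keeps the spatial size unchanged,
# so the branch outputs can be concatenated along the channel axis.
same_padded = {
    k: conv_output_size(input_size, k, padding=(k - 1) // 2) for k in (1, 3, 5)
}
unpadded = {k: conv_output_size(input_size, k) for k in (1, 3, 5)}

print(same_padded)  # every kernel size maps back to the input size
print(unpadded)     # without padding, larger kernels shrink the output
```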

![Inception3D](inception3d.jpg)

These blocks combine into GoogLeNet:

![Summary](summary.png)

![Layers](layers.png)

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from facenet_pytorch import InceptionResnetV1, MTCNN, extract_face
from torchview import draw_graph
from sklearn.manifold import TSNE
from PIL import Image
import matplotlib.pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox

import os
import math
import numpy as np

from IPython.display import display

Getting an Inception ResNet pretrained on VGGFace2:

In [None]:
resnet = InceptionResnetV1(pretrained="vggface2", classify=True).cpu().eval()

Visualise...

In [None]:
model_graph = draw_graph(
    resnet, input_size=(1, 3, 256, 256), expand_nested=True
)
model_graph.visual_graph

The way PyTorch describes it

In [None]:
resnet

We use a guillotine: the `tweaked_resnet` class replaces the final logits layer with an identity, so `resnet_vector` outputs an embedding vector instead of class scores:

In [None]:
class tweaked_resnet(nn.Module):
    """
    A decapitated resnet version

    The logits layer is removed from the resnet model
    """

    def __init__(self):
        super(tweaked_resnet, self).__init__()
        self.resnet = InceptionResnetV1(
            pretrained="vggface2", classify=True
        ).cpu()
        self.resnet.logits = nn.Identity()

    def forward(self, x):
        return self.resnet(x)

In [None]:
resnet_vector = tweaked_resnet()
resnet_vector.eval()

Let's see what it looks like:

In [None]:
model_graph = draw_graph(
    resnet_vector,
    input_size=(1, 3, 256, 256),
    expand_nested=True,
)
model_graph.visual_graph

## Encoding images

We need a helper for the transform step: MTCNN (Multi-task Cascaded Convolutional Networks), a model for face detection and alignment.

MTCNN uses a cascade of three networks to detect faces and facial landmarks:

* *PNet (Proposal Network)*: Scans the image and proposes candidate face regions.
* *RNet (Refine Network)*: Refines the face proposals from PNet.
* *ONet (Output Network)*: Detects facial landmarks (eyes, nose, mouth) and provides a final refinement of the bounding boxes.
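
Between stages, cascaded detectors such as MTCNN prune overlapping candidate boxes with non-maximum suppression (NMS). The pure-Python sketch below illustrates only that pruning idea; it is not the `facenet_pytorch` implementation, and the `iou` and `nms` helpers, the boxes, the scores and the 0.5 threshold are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(candidates, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it."""
    remaining = sorted(candidates, key=lambda c: c[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [
            c for c in remaining if iou(best[0], c[0]) <= iou_threshold
        ]
    return kept


# Made-up proposals as (box, score): two overlap on one face, one is elsewhere.
proposals = [
    ((10, 10, 60, 60), 0.90),
    ((12, 12, 62, 62), 0.80),
    ((100, 100, 150, 150), 0.70),
]
survivors = nms(proposals)
print(survivors)  # the lower-scoring duplicate of the first face is dropped
```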

In [None]:
mtcnn = MTCNN(select_largest=False)
mtcnn

MTCNN is only the helper that detects and crops the face; the encoding itself comes from the decapitated Inception ResNet (`resnet_vector`):

In [None]:
def img_to_encoding(image_path, model, transform):
    """
    Convert any image into a 512-dimensional vector using the given model.
    """
    img = Image.open(image_path)
    img_t = transform(img)
    batch_t = torch.unsqueeze(img_t, 0)
    with torch.no_grad():
        output = model(batch_t)  # here we get the 512-dimensional vector
    return (output, image_path)

We have a set of images in a folder:

In [None]:
FOLDER = "images"

We will start by encoding a picture of Jason Chan:

<img src="images/chan1.png" style="width:250px;height:250px;">

In [None]:
CHAN1 = FOLDER + "/chan1.png"

jason_chan = img_to_encoding(CHAN1, resnet_vector, mtcnn)

Let's visualise what happened:

In [None]:
to_pil = torchvision.transforms.ToPILImage()
invert = torchvision.transforms.functional.invert

mtcnn2 = MTCNN(select_largest=False, post_process=False)
tensor = mtcnn2(Image.open(CHAN1))
img = invert(to_pil(tensor))
display(img)

The image above is after the transform (MTCNN), and below we get the vector:

In [None]:
print(jason_chan[0].shape)
print(jason_chan[0])

Now let's encode a group of people!

In [None]:
database = {}
database["Sammy Sum"] = img_to_encoding(
    FOLDER + "/sum1.png", resnet_vector, mtcnn
)
database["Jason Chan"] = img_to_encoding(
    FOLDER + "/chan1.png", resnet_vector, mtcnn
)
database["Alex Fong"] = img_to_encoding(
    FOLDER + "/fong1.png", resnet_vector, mtcnn
)
database["Dada Chan"] = img_to_encoding(
    FOLDER + "/dada1.png", resnet_vector, mtcnn
)
database["Niki Chow"] = img_to_encoding(
    FOLDER + "/niki1.png", resnet_vector, mtcnn
)
database["Shiga Lin"] = img_to_encoding(
    FOLDER + "/shiga1.png", resnet_vector, mtcnn
)
database["Gillian Chung"] = img_to_encoding(
    FOLDER + "/gillian1.png", resnet_vector, mtcnn
)
database["Charlene Choi"] = img_to_encoding(
    image_path=FOLDER + "/choi1.png", model=resnet_vector, transform=mtcnn
)

A small helper here to visualise images in a folder:

In [None]:
def show_images_in_folder(
    folder, transformed=True, sigourney=True, last_number=""
):
    """
    Show all images in a folder.
    Can filter Sigourney Weaver images.
    Returns a list of vectors and image filenames.
    """
    image_files = [
        f
        for f in sorted(os.listdir(folder))
        if f.endswith((last_number + ".png"))
    ]
    if not sigourney:
        image_files = [f for f in image_files if not f.startswith("sig")]

    images_per_row = 6
    _, axes = plt.subplots(
        math.ceil(len(image_files) / images_per_row),
        images_per_row,
        figsize=(7, 4),
    )
    axes = axes.flatten()

    files_list, vectors_list = [], []
    for idx, image_file in enumerate(image_files):
        img_path = os.path.join(folder, image_file)
        if transformed:
            img = invert(to_pil(mtcnn2(Image.open(img_path))))
        else:
            img = Image.open(img_path)

        axes[idx].imshow(img)
        axes[idx].set_title(image_file)
        axes[idx].axis("off")
        vec, _ = img_to_encoding(img_path, resnet_vector, mtcnn)
        files_list.append(image_file)
        vectors_list.append(vec[0])

    for i in range(len(image_files), len(axes)):
        axes[i].axis("off")
    plt.tight_layout()
    plt.show()
    return np.array(vectors_list), files_list

The selection contained:

In [None]:
_, _ = show_images_in_folder(
    FOLDER, transformed=False, sigourney=False, last_number="1"
)

To quickly see what we have in the database now:

In [None]:
_, _ = show_images_in_folder(FOLDER, sigourney=False, last_number="1")

It seems we are ready to start recognising and verifying faces!

## The HK boys: identity verification

For verification, we compare the vector stored in the database for the claimed identity with the vector of the input image.

In [None]:
def verify(image_path, identity, database, model, transform):
    """
    Function that verifies if the person on the "image_path" image is "identity".
    """
    encoding = img_to_encoding(image_path, model, transform)[0]

    # This is where we get the cosine similarity between the two vectors
    # Cosine similarity is working with angles, not magnitudes, and we get a value between -1 and 1
    # We want to get a dissimilarity value, so we take 1 - cosine similarity
    dist = 1 - F.cosine_similarity(database[identity][0], encoding)

    if dist < 0.5:
        print("It's " + str(identity) + ", welcome in!")
        door_open = True
    else:
        print("It's not " + str(identity) + ", please go away")
        door_open = False

    return dist, door_open
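
To make the dissimilarity used in `verify` concrete, here is a small worked example in plain Python. It is only a sketch: the `cosine_distance` helper and the toy vectors are made up for illustration, whereas the notebook itself relies on `F.cosine_similarity` applied to the model's embeddings.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: depends on the angle, not on vector length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


reference = [1.0, 2.0, 3.0]  # toy stand-in for a stored face vector

# A scaled copy points in the same direction, so the distance is about 0.
same_direction = cosine_distance(reference, [2.0, 4.0, 6.0])
# Perpendicular vectors have cosine similarity 0, so the distance is 1.
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
# Opposite vectors have cosine similarity -1, so the distance is about 2.
opposite = cosine_distance(reference, [-1.0, -2.0, -3.0])

print(same_direction, orthogonal, opposite)
```

The distance therefore ranges from 0 to 2, and the 0.5 threshold in `verify` sits in the lower part of that range.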

In [None]:
def side_by_side(database_img, input_img):
    """
    Shows image and recognised image
    """
    to_pil = torchvision.transforms.ToPILImage()
    invert = torchvision.transforms.functional.invert
    mtcnn2 = MTCNN(select_largest=False, post_process=False)

    img1 = invert(to_pil(mtcnn2(Image.open(database_img))))
    img2 = invert(to_pil(mtcnn2(Image.open(input_img))))

    fig, axes = plt.subplots(1, 2, figsize=(4, 2))
    axes = axes.flatten()

    axes[0].axis("off")
    axes[0].imshow(img1)
    axes[0].set_title("Database")
    axes[1].axis("off")
    axes[1].imshow(img2)
    axes[1].set_title("Input")

    plt.tight_layout()
    plt.show()

Jason Chan is now trying to enter the office; this is what he looks like today:

<img src="images/chan2.png" style="width:250px;height:250px;">

In [None]:
CHAN2 = FOLDER + "/chan2.png"

distance, door_open_flag = verify(
    CHAN2, "Jason Chan", database, resnet_vector, mtcnn
)
print("(", distance, ",", door_open_flag, ")")

What did the system compare?

In [None]:
side_by_side(database["Jason Chan"][1], CHAN2)

Alex Fong is now trying to enter the office, but with Jason Chan's card... he's coming from the swimming pool today:

<img src="images/fong2.png" style="width:250px;height:250px;">

In [None]:
FONG2 = FOLDER + "/fong2.png"

distance, door_open_flag = verify(
    FONG2, "Jason Chan", database, resnet_vector, mtcnn
)
print("(", distance, ",", door_open_flag, ")")

What did the system compare?

In [None]:
side_by_side(database["Jason Chan"][1], FONG2)

Then he decides to use his real identity, still fresh out of the pool!

In [None]:
distance, door_open_flag = verify(
    FONG2, "Alex Fong", database, resnet_vector, mtcnn
)
print("(", distance, ",", door_open_flag, ")")

What did the system compare?

In [None]:
side_by_side(database["Alex Fong"][1], FONG2)

Then Sammy Sum shows up, happily:

<img src="images/sum2.png" style="width:250px;height:250px;">

In [None]:
SUM2 = FOLDER + "/sum2.png"

distance, door_open_flag = verify(
    SUM2, "Sammy Sum", database, resnet_vector, mtcnn
)
print("(", distance, ",", door_open_flag, ")")

What did the system compare?

In [None]:
side_by_side(database["Sammy Sum"][1], SUM2)


## The HK girls: recognising a face

For recognition, we look for the closest entry in the vector database instead of checking a claimed identity.


In [None]:
def who_is_it(image_path, database, model, transform):
    """
    Implements face recognition
    """
    encoding = img_to_encoding(image_path, model, transform)[0]
    min_dist = 100
    identity = None  # stays None if the database is empty
    for name, (db_enc, _) in database.items():
        dist = 1 - F.cosine_similarity(db_enc, encoding)
        if dist < min_dist:
            min_dist = dist
            identity = name

    if min_dist > 0.7:
        print("Not in the database.")
    else:
        print("It's " + str(identity) + ", the distance is " + str(min_dist))
    return min_dist, identity
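
The recognition step is a nearest-neighbour search with a rejection threshold. The plain-Python sketch below shows that pattern on toy two-dimensional vectors; `closest_identity`, `cosine_distance` and the `toy_database` entries are hypothetical names and data, and unlike `who_is_it` this variant returns `None` as the name when the best match is too far away.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two plain Python vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


def closest_identity(query, known, threshold=0.7):
    """Return (name, distance) of the nearest entry, or (None, distance)."""
    name = min(known, key=lambda n: cosine_distance(known[n], query))
    dist = cosine_distance(known[name], query)
    # Reject the match when even the nearest entry is beyond the threshold.
    return (name if dist <= threshold else None), dist


# Toy two-dimensional "embeddings", made up for illustration.
toy_database = {"alice": [1.0, 0.1], "bob": [0.1, 1.0]}

match, match_dist = closest_identity([0.9, 0.2], toy_database)
stranger, stranger_dist = closest_identity([-1.0, -1.0], toy_database)

print(match, match_dist)
print(stranger, stranger_dist)
```

Returning `None` for an unknown face makes it harder for a caller to use a name that was rejected by the threshold.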

Dada Chan is now trying to enter the office:

<img src="images/dada2.png" style="width:250px;height:250px;">

In [None]:
DADA2 = FOLDER + "/dada2.png"

_, who = who_is_it(DADA2, database, resnet_vector, mtcnn)

In [None]:
side_by_side(database[who][1], DADA2)

Gillian Chung is now trying to enter the office:

<img src="images/gillian2.png" style="width:250px;height:250px;">

In [None]:
GILLIAN2 = FOLDER + "/gillian2.png"

_, who = who_is_it(GILLIAN2, database, resnet_vector, mtcnn)

In [None]:
side_by_side(database[who][1], GILLIAN2)

And finally Shiga Lin:

<img src="images/shiga2.png" style="width:250px;height:250px;">

In [None]:
SHIGA2 = FOLDER + "/shiga2.png"

_, who = who_is_it(SHIGA2, database, resnet_vector, mtcnn)

In [None]:
side_by_side(database[who][1], SHIGA2)

# Clustering the images

Naughty Sigourney Weaver has added herself to the HK team... maybe not?

The full folder:

In [None]:
vectors, files = show_images_in_folder(FOLDER, transformed=False)

The full database:

In [None]:
vectors, files = show_images_in_folder(FOLDER)

## Principal Component Analysis

PCA finds the directions of maximum variance in the data and projects the data onto those directions.

This way we can see which vectors are close to each other and which are far apart.

Note that closeness in this plot is Euclidean distance in the projected space, which is NOT cosine similarity, although the two are correlated.
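
The link between the two measures is exact for unit-length vectors, where the squared Euclidean distance equals `2 * (1 - cosine similarity)`. The check below is a sketch under that assumption; the notebook's raw embeddings are not guaranteed to be unit length and the 2D projection discards information, so in the plot the relationship is only approximate. All helper names and vectors here are made up for illustration.

```python
import math


def normalise(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two arbitrary toy vectors, normalised to unit length.
a = normalise([3.0, 1.0, 2.0])
b = normalise([1.0, 4.0, 0.5])

squared_distance = euclidean_distance(a, b) ** 2
from_cosine = 2 * (1 - cosine_similarity(a, b))

print(squared_distance, from_cosine)  # the two values agree up to rounding
```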

In [None]:
def plot_2d(vectors_2d, files_list):
    """
    2D scatter plot of the images based on their position in a compact space
    """
    _, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(vectors_2d[:, 0], vectors_2d[:, 1])

    for vec, image_file in zip(vectors_2d, files_list):
        img = invert(
            to_pil(mtcnn2(Image.open(os.path.join(FOLDER, image_file))))
        )
        ab = AnnotationBbox(
            offsetbox=OffsetImage(img, zoom=0.2),
            xy=vec,
            frameon=False,
            box_alignment=(0, 0),
        )
        ax.add_artist(ab)
    plt.show()

In [None]:
tensor = torch.tensor(vectors)
_, _, V = torch.pca_lowrank(tensor, q=2)
tensor_pca = tensor @ V[:, :2]

vectors.shape, tensor_pca.numpy().shape

In [None]:
plot_2d(tensor_pca, files)

## t-SNE Dimensionality Reduction

t-SNE stands for t-distributed Stochastic Neighbor Embedding. In short, it:

* Converts Euclidean distances between points into probabilities using a Gaussian distribution in high-dimensional space.
* In low-dimensional space, it uses a Student's t-distribution to compute similarities, which helps mitigate the "crowding problem" by allowing more flexibility.
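
The two kinds of similarity described above can be sketched in plain Python. This is a simplified illustration, not what `sklearn.manifold.TSNE` does internally: it assumes one fixed Gaussian bandwidth `sigma` (real t-SNE tunes a bandwidth per point to match the perplexity and then symmetrises the probabilities), and it leaves out the optimisation that moves the low-dimensional points. The helper names and toy points are made up for illustration.

```python
import math


def squared_distances(points):
    """Matrix of squared Euclidean distances between all pairs of points."""
    return [
        [sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]


def gaussian_conditionals(points, sigma=1.0):
    """p(j|i): Gaussian similarities in the original space, one row per i."""
    rows = []
    for i, row in enumerate(squared_distances(points)):
        weights = [
            0.0 if i == j else math.exp(-d / (2 * sigma**2))
            for j, d in enumerate(row)
        ]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def student_t_joint(points):
    """q(i,j): Student-t similarities (1 degree of freedom), summing to 1."""
    weights = [
        [0.0 if i == j else 1.0 / (1.0 + d) for j, d in enumerate(row)]
        for i, row in enumerate(squared_distances(points))
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


high_dim = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 5.0, 5.0]]  # toy data
low_dim = [[0.0, 0.0], [1.0, 0.0], [4.0, 4.0]]  # toy 2D embedding

p = gaussian_conditionals(high_dim)
q = student_t_joint(low_dim)

print(p[0])  # point 0 puts almost all of its probability on its neighbour 1
print(q[0])  # the heavy-tailed kernel still gives the far point some weight
```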

In [None]:
vectors_embedded = TSNE(
    n_components=2, learning_rate="auto", init="random", perplexity=3
).fit_transform(vectors)

vectors_embedded.shape

In [None]:
plot_2d(vectors_embedded, files)

And that's all folks!