# Face detection and recognition inference pipeline

The following example illustrates how to use the `facenet_pytorch` Python package to perform face detection and recognition on an image dataset using an Inception Resnet V1 pretrained on the VGGFace2 dataset.

The following PyTorch features are used:
* Datasets
* Dataloaders
* GPU/CPU processing

In [1]:
from facenet_pytorch import MTCNN, InceptionResnetV1
import torch
from torch.utils.data import DataLoader
from torchvision import datasets
import numpy as np
import pandas as pd
import os

workers = 0 if os.name == 'nt' else 4


#### Determine if an NVIDIA GPU is available

In [2]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print('Running on device: {}'.format(device))

Running on device: cuda:0


#### Define MTCNN module

The default parameters are shown for illustration and can be omitted. Because MTCNN is a collection of neural nets and other code, the device must be passed as a constructor argument, as below, so that objects can be copied to it internally when needed.

See `help(MTCNN)` for more details.

In [3]:
mtcnn = MTCNN(
    image_size=160, margin=0, min_face_size=20,
    thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True,
    device=device
)

#### Define Inception Resnet V1 module

Set `classify=True` to use the pretrained classifier. In this example, the model is instead used to output embeddings (CNN features). For inference, it is important to set the model to `eval` mode.

See `help(InceptionResnetV1)` for more details.

In [4]:
resnet = InceptionResnetV1(pretrained='vggface2').eval().to(device)

#### Define a dataset and data loader

We add the `idx_to_class` attribute to the dataset to enable easy recoding of label indices to identity names later on.

In [22]:
def collate_fn(x):
    return x[0]

dataset = datasets.ImageFolder('../images')
dataset.idx_to_class = {i:c for c, i in dataset.class_to_idx.items()}
loader = DataLoader(dataset, collate_fn=collate_fn, num_workers=workers)

(<PIL.Image.Image image mode=RGB size=2133x3200>, 0)

In [23]:
dataset.idx_to_class

{0: 'KanyeWest',
 1: 'MargotRobbie',
 2: 'PrinceHarry',
 3: 'TomHolland',
 4: 'angelina_jolie',
 5: 'bradley_cooper',
 6: 'kate_siegel',
 7: 'paul_rudd',
 8: 'shea_whigham'}
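To make the two helpers in the cell above concrete: with the default `batch_size=1`, `DataLoader` hands `collate_fn` a list containing a single `(image, label)` pair, so returning `x[0]` unwraps it and the loop receives a PIL image and an integer label directly. The sketch below mimics this with stand-in values; the string `'img'` and the two-entry `class_to_idx` mapping are placeholders, not real data.

```python
def collate_fn(x):
    # x is the list of samples in one batch; with batch_size=1 it has one element
    return x[0]

# Stand-in for what DataLoader would pass with batch_size=1
batch = [('img', 0)]
image, label = collate_fn(batch)

# Toy mapping in the same shape as ImageFolder's class_to_idx
class_to_idx = {'KanyeWest': 0, 'MargotRobbie': 1}

# Invert it so integer labels can be turned back into identity names
idx_to_class = {i: c for c, i in class_to_idx.items()}

print(image, label, idx_to_class[label])
```

Because the batch is unwrapped rather than stacked, this pattern only makes sense with a batch size of one, which suits images of differing sizes.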

#### Perform MTCNN facial detection

Iterate through the DataLoader object and detect faces and associated detection probabilities for each. The `MTCNN` forward method returns images cropped to the detected face, if a face was detected. By default, only a single detected face is returned; to have `MTCNN` return all detected faces, set `keep_all=True` when creating the MTCNN object above.

To obtain bounding boxes rather than cropped face images, you can instead call the lower-level `mtcnn.detect()` function. See `help(mtcnn.detect)` for details.
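As a rough sketch of working with the lower-level call: for a single image, `mtcnn.detect(img)` returns an array of boxes (one `[x1, y1, x2, y2]` row per face) together with the matching probabilities, and `None` in place of the boxes when no face is found. The helper below, `filter_detections`, is a hypothetical name introduced here, and the 0.9 cutoff is an arbitrary illustration. It is exercised on hand-written values rather than real detector output.

```python
def filter_detections(boxes, probs, min_prob=0.9):
    """Keep (box, prob) pairs whose probability is at least min_prob."""
    if boxes is None:  # no face detected in the image
        return []
    kept = []
    for box, prob in zip(boxes, probs):
        if prob is not None and prob >= min_prob:
            kept.append(([float(v) for v in box], float(prob)))
    return kept

# With a real image one would call: boxes, probs = mtcnn.detect(img)
# Hand-written stand-ins in the same layout:
boxes = [[10.0, 20.0, 110.0, 140.0], [200.0, 50.0, 230.0, 90.0]]
probs = [0.999, 0.42]

faces = filter_detections(boxes, probs)
no_faces = filter_detections(None, [None])
```

The helper only iterates and compares, so it accepts NumPy arrays as well as plain lists.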

In [24]:
aligned = []
names = []
# only get one image per person
people = set()
for x, y in loader:
    person = dataset.idx_to_class[y]
    if person in people:
        continue
    people.add(person)
    x_aligned, prob = mtcnn(x, return_prob=True)
    if x_aligned is not None:
        print('Face detected with probability: {:8f}'.format(prob))
        aligned.append(x_aligned)
        names.append(person)

Face detected with probability: 1.000000
Face detected with probability: 0.999670
Face detected with probability: 0.999314
Face detected with probability: 0.999936
Face detected with probability: 0.999983
Face detected with probability: 0.999934
Face detected with probability: 0.999733
Face detected with probability: 0.999876
Face detected with probability: 0.999992


In [2]:
# Check every file under the images folder (including subdirectories)
# for images that PIL cannot open.
import PIL
from PIL import Image
image_folder_path = '../images'
corrupted_files = []
for root, dirs, files in os.walk(image_folder_path):
    for file in files:
        path = os.path.join(root, file)
        try:
            # Use a context manager so the file handle is closed promptly
            with Image.open(path):
                pass
        except PIL.UnidentifiedImageError:
            corrupted_files.append(path)
print(corrupted_files)

[]


#### Calculate image embeddings

MTCNN returns face images that are all the same size, enabling easy batch processing with the Resnet recognition module. Here, since we only have a few images, we build a single batch and perform inference on it.

For real datasets, code should be modified to control batch sizes being passed to the Resnet, particularly if being processed on a GPU. For repeated testing, it is best to separate face detection (using MTCNN) from embedding or classification (using InceptionResnetV1), as calculation of cropped faces or bounding boxes can then be performed a single time and detected faces saved for future use.
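One way to control the batch size, sketched under the assumption that the cropped faces are held in a Python list of equally sized tensors: stack and embed them a chunk at a time, then concatenate the results on the CPU. The name `embed_in_batches` and the default of 32 are illustrative choices, and a stand-in model is used so the sketch runs without pretrained weights.

```python
import torch

def embed_in_batches(model, faces, batch_size=32, device='cpu'):
    """Run `model` over `faces` (a list of equally sized tensors) in chunks."""
    outputs = []
    with torch.no_grad():  # inference only, so skip gradient tracking
        for start in range(0, len(faces), batch_size):
            batch = torch.stack(faces[start:start + batch_size]).to(device)
            # Move each result back to the CPU so device memory is freed per chunk
            outputs.append(model(batch).cpu())
    return torch.cat(outputs)

# Stand-in model and inputs (not the real network or real face crops)
toy_model = torch.nn.Flatten().eval()
toy_faces = [torch.zeros(3, 4, 4) for _ in range(5)]
toy_embeddings = embed_in_batches(toy_model, toy_faces, batch_size=2)
```

In this notebook, the equivalent call would pass `resnet`, the list of cropped faces (before it is stacked into a single tensor), and `device=device`.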

In [25]:
aligned = torch.stack(aligned).to(device)
embeddings = resnet(aligned).detach().cpu()

#### Print distance matrix for classes

In [26]:
names

['KanyeWest',
 'MargotRobbie',
 'PrinceHarry',
 'TomHolland',
 'angelina_jolie',
 'bradley_cooper',
 'kate_siegel',
 'paul_rudd',
 'shea_whigham']
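The distance matrix in this section is built with a nested list comprehension, which is fine for nine faces. For larger sets, the same Euclidean distances could be computed in one vectorized step. The sketch below uses NumPy broadcasting on stand-in vectors; `pairwise_distances` is a hypothetical helper name, and the toy array is not real model output.

```python
import numpy as np

def pairwise_distances(vectors):
    """Euclidean distance between every pair of rows, via broadcasting."""
    diffs = vectors[:, None, :] - vectors[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

# Stand-in unit vectors; real embeddings would come from the model
toy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
toy_dists = pairwise_distances(toy)

# Closest *other* row for each row: mask the zero diagonal before argmin
masked = toy_dists + np.diag(np.full(len(toy), np.inf))
nearest = masked.argmin(axis=1)
```

For the `embeddings` tensor in this notebook, which is already detached and on the CPU, `embeddings.numpy()` could be passed in. Note that the intermediate `diffs` array grows with the square of the number of rows, so very large sets would need chunking.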

In [27]:
dists = [[(e1 - e2).norm().item() for e2 in embeddings] for e1 in embeddings]
print(pd.DataFrame(dists, columns=names, index=names))

                KanyeWest  MargotRobbie  PrinceHarry  TomHolland  \
KanyeWest        0.000000      1.464733     1.477928    1.648029   
MargotRobbie     1.464733      0.000000     1.477447    1.494487   
PrinceHarry      1.477928      1.477447     0.000000    1.467718   
TomHolland       1.648029      1.494487     1.467718    0.000000   
angelina_jolie   1.526968      1.013688     1.326358    1.454472   
bradley_cooper   1.584339      1.365534     1.240440    1.428000   
kate_siegel      1.511385      1.086190     1.342529    1.460030   
paul_rudd        1.482957      1.547630     1.387526    1.448142   
shea_whigham     1.374122      1.324691     1.362363    1.501615   

                angelina_jolie  bradley_cooper  kate_siegel  paul_rudd  \
KanyeWest             1.526968        1.584339     1.511385   1.482957   
MargotRobbie          1.013688        1.365534     1.086190   1.547630   
PrinceHarry           1.326358        1.240440     1.342529   1.387526   
TomHolland            1