Face Verification: "Is this the claimed person?" Given two images, you have to determine whether they show the same person.
- The simplest approach is to compare the two images pixel by pixel. If the distance between the raw images is below a chosen threshold, they may show the same person.
- This algorithm performs poorly, because pixel values change dramatically with variations in lighting, the orientation of the person's face, minor changes in head position, and so on.
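
A minimal sketch of why raw-pixel distances are brittle. It uses a synthetic NumPy array in place of a real photo; the image, the brightness shift of 30, and the helper name `pixel_distance` are illustrative assumptions. The same content under brighter lighting already ends up far from a distance of zero.

```python
import numpy as np

def pixel_distance(img_a, img_b):
    """L2 distance between two images of identical shape, computed on raw pixels."""
    diff = img_a.astype(np.float64) - img_b.astype(np.float64)
    return float(np.linalg.norm(diff))

# Synthetic stand-in for a face photo: 96x96 RGB with values in [0, 200].
rng = np.random.default_rng(0)
face = rng.integers(0, 201, size=(96, 96, 3), dtype=np.uint8)

# The same "face" under brighter lighting: every pixel raised by 30.
# The maximum value becomes 230, so the cast back to uint8 cannot overflow.
brighter = (face.astype(np.int32) + 30).astype(np.uint8)

same_image = pixel_distance(face, face)
lighting_change = pixel_distance(face, brighter)
print(same_image, lighting_change)
```

Every one of the 96 * 96 * 3 pixel values differs by 30, so the distance grows with the image size even though nothing about the face itself has changed.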

#### Rather than using the raw image, you can learn an encoding, 𝑓(𝑖𝑚𝑔).
#### Comparing the encodings of two images element-wise gives a more accurate judgement of whether they show the same person.

- Example: a mobile phone that unlocks using your face performs face verification. This is a 1:1 matching problem.

Face Recognition: "Who is this person?" The system returns the person's details, such as their name. This is a 1:K matching problem.

## FaceNet

FaceNet learns a neural network that encodes a face image into a vector of 128 numbers. By comparing two such vectors, you can then determine if two pictures are of the same person.

### Tech Used

- One-shot learning to solve a face recognition problem
- Triplet loss function to learn a network's parameters in the context of face recognition
- Face recognition framed as a binary classification problem
- Mapping face images into 128-dimensional encodings using a pretrained model

- Images will be of shape  (𝑚,𝑛𝐻,𝑛𝑊,𝑛𝐶): batch size, height, width, and number of channels.

In [1]:
# Sequential - a plain stack of layers where each layer has exactly one input tensor and one output tensor; lets you build a model layer by layer
from tensorflow.keras.models import Sequential 

# Conv2D - 2D convolution layer; creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs
# ZeroPadding2D - adds rows and columns of zeros at the top, bottom, left and right side of an image tensor
# Activation - applies an activation function to an output; the purpose of the activation function is to introduce non-linearity into the output of a neuron
# Input - instantiates a symbolic input tensor for a model
# concatenate - joins the input tensors along the given axis
from tensorflow.keras.layers import Conv2D, ZeroPadding2D, Activation, Input, concatenate

# Model - groups layers into an object with training and inference features.
from tensorflow.keras.models import Model

# BatchNormalization - normalizes its inputs, keeping the mean output close to 0 and the output standard deviation close to 1
from tensorflow.keras.layers import BatchNormalization

# MaxPooling2D - Downsamples the input along its spatial dimensions (height and width) by taking the maximum value over an input window (of size defined by pool_size ) for each channel of the input.
# AveragePooling2D - Downsamples the input along its spatial dimensions (height and width) by taking the Average value over an input window (of size defined by pool_size ) for each channel of the input.
from tensorflow.keras.layers import MaxPooling2D, AveragePooling2D

from tensorflow.keras.layers import Concatenate

# Lambda - wraps an arbitrary expression as a Layer object
# Flatten - flattens the input into a vector (the batch dimension is kept)
# Dense - a fully connected layer
from tensorflow.keras.layers import Lambda, Flatten, Dense

# glorot_uniform - Glorot (Xavier) uniform initializer; draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(6 / (fan_in + fan_out))
from tensorflow.keras.initializers import glorot_uniform

# model_from_json - parses a JSON model configuration string and returns a model instance
from tensorflow.keras.models import model_from_json

# Layer - object that takes as input one or more tensors and that outputs one or more tensors
from tensorflow.keras.layers import Layer
from tensorflow.keras import backend as K

# Use the channels-last image data format: (batch, height, width, channels)
K.set_image_data_format('channels_last')
import os # for accessing files on the machine
import numpy as np # for manipulating arrays
from numpy import genfromtxt # Used to load data from a text file, with missing values handled as specified
import tensorflow as tf # tensorflow for handling tensor values
import PIL # Python Imaging Library - Used for opening, manipulating, and saving many different image file formats

%matplotlib inline
%load_ext autoreload
%autoreload 2

## Encoding Face Images into a 128-Dimensional Vector
### Using a ConvNet to Compute Encodings

- The FaceNet model takes a lot of data and a long time to train, so we load weights that have already been trained by others.

- The FaceNet network takes 160x160 RGB images as its input. Specifically, a face image (or a batch of  𝑚  face images) is passed as a tensor of shape  (𝑚,𝑛𝐻,𝑛𝑊,𝑛𝐶)=(𝑚,160,160,3).

- The input images are originally 96x96, so they need to be scaled to 160x160. This is done in the img_to_encoding() function.

- The output is a matrix of shape  (𝑚,128)  that encodes each input face image as a 128-dimensional vector.
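
The shape bookkeeping above can be sketched in NumPy. The nearest-neighbour resize here is only an illustrative stand-in: in this notebook the resizing is handled by `load_img(..., target_size=(160, 160))`, and `resize_nearest` is a hypothetical helper, not part of Keras.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) image to (out_h, out_w, C)."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows[:, None], cols]

m = 4  # batch size
small = np.zeros((m, 96, 96, 3), dtype=np.float32)  # m images of 96x96 RGB
batch = np.stack([resize_nearest(img, 160, 160) for img in small])
print(batch.shape)  # (4, 160, 160, 3)
```

A batch of this shape is what the network expects; the corresponding output would have shape `(m, 128)`.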

In [None]:
# Read the model architecture, which is stored in JSON format
with open('keras-facenet-h5/model.json', 'r') as json_file:
    loaded_model_json = json_file.read()
model = model_from_json(loaded_model_json) # Rebuild the model from the JSON configuration
model.load_weights('keras-facenet-h5/model.h5') # Load the pretrained weights of the model

# printing the model input shape and output shape
print(model.inputs)
print(model.outputs)



The triplet loss function tries to "push" the encodings of two images of the same person (Anchor and Positive) closer together, while "pulling" the encodings of two images of different persons (Anchor and Negative) further apart.

For an image  𝑥 , its encoding is denoted as  𝑓(𝑥) , where  𝑓  is the function computed by the neural network.

Training will use triplets of images  (𝐴,𝑃,𝑁) :

- A is an "Anchor" image--a picture of a person.
- P is a "Positive" image--a picture of the same person as the Anchor image.
- N is a "Negative" image--a picture of a different person than the Anchor image.

You would thus like to minimize the following "triplet cost":

$$\mathcal{J} = \sum^{m}_{i=1} \left[ \underbrace{\lVert f(A^{(i)}) - f(P^{(i)}) \rVert_2^2}_\text{(1)} - \underbrace{\lVert f(A^{(i)}) - f(N^{(i)}) \rVert_2^2}_\text{(2)} + \alpha \right]_+ \tag{3}$$
Here, the notation "$[z]_+$" is used to denote $\max(z, 0)$.


Notes:

- The term (1) is the squared distance between the anchor "A" and the positive "P" for a given triplet; you want this to be small.
- The term (2) is the squared distance between the anchor "A" and the negative "N" for a given triplet; you want this to be relatively large. It has a minus sign preceding it because minimizing the negative of the term is the same as maximizing that term.
- 𝛼  is called the margin. It's a hyperparameter that you pick manually. You'll use  𝛼=0.2 .
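
As a small worked example of formula (3), the sketch below evaluates the loss in NumPy on hand-picked 2-dimensional encodings. Real encodings are 128-dimensional; the numbers and the name `triplet_loss_np` are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def triplet_loss_np(anchor, positive, negative, alpha=0.2):
    """Triplet loss of formula (3) for arrays of shape (m, d)."""
    pos_dist = np.sum((anchor - positive) ** 2, axis=-1)  # term (1), one value per triplet
    neg_dist = np.sum((anchor - negative) ** 2, axis=-1)  # term (2), one value per triplet
    return float(np.sum(np.maximum(pos_dist - neg_dist + alpha, 0.0)))

anchor = np.array([[0.0, 0.0], [0.0, 0.0]])
positive = np.array([[0.1, 0.0], [0.3, 0.0]])
negative = np.array([[1.0, 0.0], [0.4, 0.0]])

# Triplet 1: 0.01 - 1.00 + 0.2 = -0.79, clipped to 0 (the negative is already far enough away)
# Triplet 2: 0.09 - 0.16 + 0.2 =  0.13 (the negative is closer than the margin allows)
loss_value = triplet_loss_np(anchor, positive, negative)
print(loss_value)
```

Only the second triplet contributes: the margin 𝛼 makes a triplet count as "solved" only when the negative is further from the anchor than the positive by at least 𝛼 in squared distance.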

Because we use a pretrained model, we don't need the triplet loss to train the network here. The implementation below shows how it would be computed.

In [None]:
def triplet_loss(y_true, y_pred, alpha = 0.2):
    """
    Implementation of the triplet loss as defined by formula (3)
    
    Arguments:
    y_true -- true labels, required when you define a loss in Keras, you don't need it in this function.
    y_pred -- python list containing three objects:
            anchor -- the encodings for the anchor images, of shape (None, 128)
            positive -- the encodings for the positive images, of shape (None, 128)
            negative -- the encodings for the negative images, of shape (None, 128)
    
    Returns:
    loss -- real number, value of the loss
    """
    
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    
    
    # Step 1: Compute the (encoding) distance between the anchor and the positive
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), axis=-1)
    
    # Step 2: Compute the (encoding) distance between the anchor and the negative
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), axis=-1)
    
    # Step 3: subtract the two previous distances and add alpha.
    basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), alpha)
    
    # Step 4: Take the maximum of basic_loss and 0.0. Sum over the training examples.
    loss = tf.reduce_sum(tf.maximum(basic_loss, 0))
    
    return loss



In [None]:
tf.random.set_seed(1)
y_true = (None, None, None) # It is not used
y_pred = (tf.keras.backend.random_normal([3, 128], mean=6, stddev=0.1, seed = 1),
          tf.keras.backend.random_normal([3, 128], mean=1, stddev=1, seed = 1),
          tf.keras.backend.random_normal([3, 128], mean=3, stddev=4, seed = 1))
loss = triplet_loss(y_true, y_pred)
print("loss = " + str(loss)) # display the computed loss



In [None]:
FRmodel = model # Use the pretrained model loaded above as the face recognition model

### Face Verification

We build a database (represented as a Python dictionary) containing one encoding vector for each person. To generate each encoding, we use img_to_encoding(image_path, model), which runs the forward propagation of the model on the specified image.

This database maps each person's name to a 128-dimensional encoding of their face.
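
The `img_to_encoding` function in this section divides each embedding by its L2 norm before it is stored. A minimal NumPy sketch of what that normalization implies, using random 1-D vectors as stand-ins for real encodings (`l2_normalize` is a hypothetical helper name): for unit-length vectors the L2 distance always lies between 0 and 2, and its square equals 2 - 2 * cosine similarity.

```python
import numpy as np

def l2_normalize(v):
    """Scale a 1-D vector to unit L2 length."""
    return v / np.linalg.norm(v)

# Random stand-ins for two 128-dimensional encodings.
rng = np.random.default_rng(1)
enc_a = l2_normalize(rng.normal(size=128))
enc_b = l2_normalize(rng.normal(size=128))

dist = float(np.linalg.norm(enc_a - enc_b))  # L2 distance between the unit vectors
cosine = float(np.dot(enc_a, enc_b))         # cosine similarity, since both have length 1
print(dist, cosine)
```

This is why a fixed threshold such as 0.7 is meaningful: every distance is measured on the same bounded scale.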

In [None]:
def img_to_encoding(image_path, model):
    
    # Load the image and resize it to the 160x160 input size the model expects
    img = tf.keras.preprocessing.image.load_img(image_path, target_size=(160, 160))
    # Scale the pixel values to [0, 1] and round them to 12 decimal places
    img = np.around(np.array(img) / 255.0, decimals=12)
    # Add a batch dimension so the image fits the model input: (160, 160, 3) -> (1, 160, 160, 3)
    x_train = np.expand_dims(img, axis=0)
    # Run the forward pass to get the face embedding
    embedding = model.predict_on_batch(x_train)
    
    # L2-normalize the embedding so that it has unit length
    return embedding / np.linalg.norm(embedding, ord=2)

In [None]:
database = {}
database["danielle"] = img_to_encoding("images/danielle.png", FRmodel)
database["younes"] = img_to_encoding("images/younes.jpg", FRmodel)
database["tian"] = img_to_encoding("images/tian.jpg", FRmodel)
database["andrew"] = img_to_encoding("images/andrew.jpg", FRmodel)
database["kian"] = img_to_encoding("images/kian.jpg", FRmodel)
database["dan"] = img_to_encoding("images/dan.jpg", FRmodel)
database["sebastiano"] = img_to_encoding("images/sebastiano.jpg", FRmodel)
database["bertrand"] = img_to_encoding("images/bertrand.jpg", FRmodel)
database["kevin"] = img_to_encoding("images/kevin.jpg", FRmodel)
database["felix"] = img_to_encoding("images/felix.jpg", FRmodel)
database["benoit"] = img_to_encoding("images/benoit.jpg", FRmodel)
database["arnaud"] = img_to_encoding("images/arnaud.jpg", FRmodel)

In [None]:
danielle = tf.keras.preprocessing.image.load_img("images/danielle.png", target_size=(160, 160))
kian = tf.keras.preprocessing.image.load_img("images/kian.jpg", target_size=(160, 160))

The verify() function checks whether the picture at image_path actually shows the person called "identity". It goes through the following steps:

- Compute the encoding of the image from image_path.
- Compute the distance between this encoding and the encoding of the identity image stored in the database.
- Open the door if the distance is less than 0.7; otherwise, do not open it.

As presented above, you should use the L2 distance np.linalg.norm.

Note: In this implementation, compare the L2 distance, not the square of the L2 distance, to the threshold 0.7.
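
A sketch of the decision rule on toy data, assuming precomputed encodings so that it runs without the model. The 3-dimensional unit vectors, the name "alice", and the helper `verify_encoding` are hypothetical. The second query shows why the note above matters: its L2 distance is 0.8 (rejected), while its squared distance is 0.64, which would wrongly pass a threshold of 0.7.

```python
import numpy as np

def verify_encoding(encoding, identity, database, threshold=0.7):
    """Return (dist, door_open) for an already-computed encoding."""
    dist = float(np.linalg.norm(encoding - database[identity]))
    return dist, dist < threshold

# Toy database with a single 3-dimensional unit-length encoding.
toy_db = {"alice": np.array([1.0, 0.0, 0.0])}

close = np.array([0.8, 0.6, 0.0])                       # distance sqrt(0.4), about 0.632
borderline = np.array([0.68, np.sqrt(1 - 0.68 ** 2), 0.0])  # distance 0.8

dist_close, open_close = verify_encoding(close, "alice", toy_db)
dist_border, open_border = verify_encoding(borderline, "alice", toy_db)
print(dist_close, open_close)
print(dist_border, open_border)
```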

In [None]:
def verify(image_path, identity, database, model):
    """
    Function that verifies if the person on the "image_path" image is "identity".
    
    Arguments:
        image_path -- path to an image
        identity -- string, name of the person whose identity you'd like to verify. Has to be an employee who works in the office.
        database -- python dictionary mapping allowed people's names (strings) to their encodings (vectors).
        model -- your Inception model instance in Keras
    
    Returns:
        dist -- distance between the image_path and the image of "identity" in the database.
        door_open -- True, if the door should open. False otherwise.
    """
    # Step 1: Compute the encoding for the image using img_to_encoding()
    encoding = img_to_encoding(image_path, model)
    # Step 2: Compute the L2 distance to the stored encoding of "identity"
    dist = np.linalg.norm(encoding - database[identity])
    # Step 3: Open the door if dist < 0.7, else don't open
    if dist < 0.7:
        print("It's " + str(identity) + ", welcome in!")
        door_open = True
    else:
        print("It's not " + str(identity) + ", please go away")
        door_open = False
        
        
    return dist, door_open

## Face Recognition

- Compute the target encoding of the image from image_path.
- Find the encoding from the database that has the smallest distance to the target encoding:
    - Initialize the min_dist variable to a large enough number (100). This helps you keep track of the closest encoding to the input's encoding.
    - Loop over the database dictionary's names and encodings. To loop, use for (name, db_enc) in database.items().
    - Compute the L2 distance between the target "encoding" and the current "encoding" from the database. If this distance is less than min_dist, then set min_dist to dist, and identity to name.
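
The steps above can also be sketched as a vectorized search over precomputed encodings. This is an illustrative variant, not the notebook's implementation: `closest_identity`, the toy 3-dimensional encodings, and the names "alice" and "bob" are assumptions, and unlike the loop described above it returns `None` as the identity when nobody is within the threshold.

```python
import numpy as np

def closest_identity(encoding, database, threshold=0.7):
    """Return (min_dist, identity); identity is None if nobody is within the threshold."""
    if not database:
        return float("inf"), None
    names = list(database)
    # Stack the stored encodings into a (K, d) matrix, one row per person.
    encs = np.stack([np.ravel(database[name]) for name in names])
    # L2 distance from the query encoding to every row at once.
    dists = np.linalg.norm(encs - np.ravel(encoding), axis=1)
    best = int(np.argmin(dists))
    min_dist = float(dists[best])
    identity = names[best] if min_dist <= threshold else None
    return min_dist, identity

toy_db = {
    "alice": np.array([1.0, 0.0, 0.0]),
    "bob": np.array([0.0, 1.0, 0.0]),
}
known = closest_identity(np.array([0.8, 0.6, 0.0]), toy_db)
stranger = closest_identity(np.array([0.0, 0.0, 1.0]), toy_db)
print(known)
print(stranger)
```

Stacking the K encodings once and computing all distances in a single call avoids the Python-level loop, which starts to matter as the database grows.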

In [None]:
def who_is_it(image_path, database, model):
    """
    Implements face recognition for the office by finding who is the person on the image_path image.
    
    Arguments:
        image_path -- path to an image
        database -- database containing image encodings along with the name of the person on the image
        model -- your Inception model instance in Keras
    
    Returns:
        min_dist -- the minimum distance between image_path encoding and the encodings from the database
        identity -- string, the name prediction for the person on image_path
    """
    # Step 1: Compute the target encoding for the image using img_to_encoding()
    encoding = img_to_encoding(image_path, model)
    
    # Step 2: Find the closest encoding
    
    # Initialize "min_dist" to a large value (100) and "identity" to None,
    # so that "identity" is defined even if the database is empty
    min_dist = 100
    identity = None
        
    # Loop over the database dictionary's names and encodings.
    for (name, db_enc) in database.items():
        
        # Compute the L2 distance between the target "encoding" and the current "db_enc" from the database
        dist = np.linalg.norm(encoding - db_enc)

        # If this distance is less than min_dist, set min_dist to dist and identity to name
        if dist < min_dist:
            min_dist = dist
            identity = name
    
    if min_dist > 0.7:
        print("Not in the database.")
    else:
        print("It's " + str(identity) + ", the distance is " + str(min_dist))
        
    return min_dist, identity

In [None]:
# Test 1 with Younes pictures 
who_is_it("images/camera_0.jpg", database, FRmodel)


Ways to improve your facial recognition model:

Although you won't implement these here, here are some ways to further improve the algorithm:

Put more images of each person (under different lighting conditions, taken on different days, etc.) into the database. Then, given a new image, compare the new face to multiple pictures of the person. This would increase accuracy.
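
One possible way to compare a new face against several stored pictures of the same person is to keep a list of encodings per name and use the smallest distance. This is only a sketch under stated assumptions: the source does not say how the distances should be combined (averaging them is another option), and `min_distance_to_person`, the toy 3-dimensional encodings, and the name "alice" are hypothetical.

```python
import numpy as np

def min_distance_to_person(encoding, person_encodings):
    """Smallest L2 distance between one encoding and several reference encodings."""
    refs = np.stack([np.ravel(e) for e in person_encodings])
    return float(np.min(np.linalg.norm(refs - np.ravel(encoding), axis=1)))

# Toy database: two reference encodings for the same person.
multi_db = {
    "alice": [np.array([1.0, 0.0, 0.0]), np.array([0.8, 0.6, 0.0])],
}
query = np.array([0.6, 0.8, 0.0])

single_ref = min_distance_to_person(query, multi_db["alice"][:1])  # first picture only
all_refs = min_distance_to_person(query, multi_db["alice"])        # both pictures
print(single_ref, all_refs)
```

In this toy case the query is about 0.894 away from the first reference alone, above the 0.7 threshold, but about 0.283 away from the second, so the extra picture changes the decision.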

Crop the images to contain just the face, and less of the "border" region around the face. This preprocessing removes some of the irrelevant pixels around the face, and also makes the algorithm more robust.

What you should remember:

- Face verification solves an easier 1:1 matching problem; face recognition addresses a harder 1:K matching problem.

- Triplet loss is an effective loss function for training a neural network to learn an encoding of a face image.

- The same encoding can be used for verification and recognition. Measuring distances between two images' encodings allows you to determine whether they are pictures of the same person.

<a name='6'></a>
## References
1. Florian Schroff, Dmitry Kalenichenko, James Philbin (2015). [FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/pdf/1503.03832.pdf)

2. Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf (2014). [DeepFace: Closing the gap to human-level performance in face verification](https://research.fb.com/wp-content/uploads/2016/11/deepface-closing-the-gap-to-human-level-performance-in-face-verification.pdf)

3. This implementation also took a lot of inspiration from the official FaceNet github repository: https://github.com/davidsandberg/facenet

4. Further inspiration was found here: https://machinelearningmastery.com/how-to-develop-a-face-recognition-system-using-facenet-in-keras-and-an-svm-classifier/

5. And here: https://github.com/nyoki-mtl/keras-facenet/blob/master/notebook/tf_to_keras.ipynb