# Face Recognition 

Here you will build a face recognition system. Many of the ideas presented here are from [FaceNet](https://arxiv.org/pdf/1503.03832.pdf). In lecture, we also talked about [DeepFace](https://research.fb.com/wp-content/uploads/2016/11/deepface-closing-the-gap-to-human-level-performance-in-face-verification.pdf). 

Face recognition problems commonly fall into two categories: 

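- **Face Verification** - "Is this the claimed person?" This is a 1:1 matching problem.
- **Face Recognition** - "Who is this person?" This is a 1:K matching problem.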

FaceNet learns a neural network that encodes a face image into a vector of 128 numbers. By comparing two such vectors, you can then determine if two pictures are of the same person.
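As a minimal sketch of what "comparing two such vectors" can mean, the snippet below measures the Euclidean (L2) distance between two encodings and applies a threshold. The helper name `same_person`, the toy 128-dimensional vectors, and the threshold value are illustrative assumptions, not part of FaceNet.

```python
import numpy as np

def same_person(enc_a, enc_b, threshold=0.7):
    """Hypothetical helper: decide whether two encodings belong to one person.

    The threshold is an assumed value for illustration; in practice it
    would be tuned on validation data.
    """
    dist = float(np.linalg.norm(enc_a - enc_b))  # Euclidean (L2) distance
    return dist < threshold, dist

# Toy stand-ins for real encodings (random unit vectors, not face data).
rng = np.random.default_rng(0)
enc_1 = rng.normal(size=128)
enc_1 /= np.linalg.norm(enc_1)
enc_2 = enc_1 + 0.01 * rng.normal(size=128)  # small perturbation of enc_1
enc_2 /= np.linalg.norm(enc_2)
enc_3 = -enc_1                               # opposite direction, far away

match_close, dist_close = same_person(enc_1, enc_2)
match_far, dist_far = same_person(enc_1, enc_3)
print(match_close, match_far)
```

A small distance suggests the two images show the same person; a large one suggests different people.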
    

In [2]:
from keras.models import Sequential
from keras.layers import Conv2D, ZeroPadding2D, Activation, Input, concatenate
from keras.models import Model
from keras.layers.normalization import BatchNormalization
from keras.layers.pooling import MaxPooling2D, AveragePooling2D
from keras.layers.merge import Concatenate
from keras.layers.core import Lambda, Flatten, Dense
from keras.initializers import glorot_uniform
from keras.engine.topology import Layer
from keras import backend as K
K.set_image_data_format('channels_first')
import cv2
import os
import numpy as np
from numpy import genfromtxt
import pandas as pd
import tensorflow as tf
from fr_utils import *
from inception_blocks_v2 import *

%matplotlib inline
%load_ext autoreload
%autoreload 2

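import sys
# threshold=np.nan is rejected by newer NumPy versions; sys.maxsize prints arrays in full
np.set_printoptions(threshold=sys.maxsize)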

## 1 - Encoding face images into a 128-dimensional vector 

- This network uses 96x96 RGB images as its input. Specifically, it takes a face image (or a batch of $m$ face images) as a tensor of shape $(m, n_C, n_H, n_W) = (m, 3, 96, 96)$.
- It outputs a matrix of shape $(m, 128)$ that encodes each input face image into a 128-dimensional vector
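
To make the shape convention concrete, here is a small NumPy-only sketch that turns one height-width-channel image into the channels-first batch layout described above. The random array stands in for a real 96x96 RGB face crop, and the scaling to [0, 1] is an assumption about preprocessing, not something confirmed by this notebook.

```python
import numpy as np

# Stand-in for a 96x96 RGB image in (height, width, channels) layout.
rng = np.random.default_rng(0)
image_hwc = rng.integers(0, 256, size=(96, 96, 3), dtype=np.uint8)

# Move channels to the front: (n_H, n_W, n_C) -> (n_C, n_H, n_W).
image_chw = np.transpose(image_hwc, (2, 0, 1))

# Assumed preprocessing: scale pixel values to [0, 1].
image_chw = image_chw.astype(np.float32) / 255.0

# Add a batch axis so a single image becomes a batch of m = 1.
batch = np.expand_dims(image_chw, axis=0)

print(batch.shape)  # (1, 3, 96, 96)
```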

In [3]:
FRmodel = faceRecoModel(input_shape=(3, 96, 96))

Instructions for updating:
dim is deprecated, use axis instead


In [4]:
print("Total Params:", FRmodel.count_params())

Total Params: 3743280




### 1.2 - The Triplet Loss


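Training uses triplets of images $(A, P, N)$: $A$ is an "Anchor" image of a person, $P$ is a "Positive" image of the same person, and $N$ is a "Negative" image of a different person.

These triplets are picked from our training dataset. We will write $(A^{(i)}, P^{(i)}, N^{(i)})$ to denote the $i$-th training example. 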

You'd like to make sure that an image $A^{(i)}$ of an individual is closer to the Positive $P^{(i)}$ than to the Negative $N^{(i)}$ by at least a margin $\alpha$:

$$\lVert f(A^{(i)}) - f(P^{(i)}) \rVert_2^2 + \alpha < \lVert f(A^{(i)}) - f(N^{(i)}) \rVert_2^2$$

You would thus like to minimize the following "triplet cost":

$$\mathcal{J} = \sum^{N}_{i=1} \Big[ \underbrace{\lVert f(A^{(i)}) - f(P^{(i)}) \rVert_2^2}_\text{(1)} - \underbrace{\lVert f(A^{(i)}) - f(N^{(i)}) \rVert_2^2}_\text{(2)} + \alpha \Big]_+ \tag{3}$$

Here, the notation "$[z]_+$" denotes $\max(z, 0)$, and the upper limit $N$ of the sum is the number of training triplets (not the Negative image).

Notes:
- The term (1) is the squared distance between the anchor "A" and the positive "P" for a given triplet; you want this to be small. 
- The term (2) is the squared distance between the anchor "A" and the negative "N" for a given triplet; you want this to be relatively large, which is why it is preceded by a minus sign. 
- $\alpha$ is called the margin. It is a hyperparameter that you should pick manually. We will use $\alpha = 0.9$. 
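
The cost in equation (3) can be checked by hand on tiny inputs. The NumPy sketch below is an illustrative reference, not the graded TensorFlow implementation; the function name `triplet_loss_np` and the 2-dimensional toy encodings are assumptions chosen so the arithmetic is easy to follow.

```python
import numpy as np

def triplet_loss_np(anchor, positive, negative, alpha=0.9):
    """Illustrative NumPy version of equation (3).

    Each argument has shape (num_triplets, embedding_dim).
    """
    pos_dist = np.sum((anchor - positive) ** 2, axis=-1)         # term (1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=-1)         # term (2)
    per_triplet = np.maximum(pos_dist - neg_dist + alpha, 0.0)   # [z]_+
    return float(np.sum(per_triplet))

# Two toy triplets with 2-dimensional encodings.
anchor   = np.array([[0.0, 0.0], [0.0, 0.0]])
positive = np.array([[0.1, 0.0], [0.5, 0.0]])
negative = np.array([[2.0, 0.0], [0.6, 0.0]])

# Triplet 1: 0.01 - 4.00 + 0.9 = -3.09, clipped to 0 (margin satisfied).
# Triplet 2: 0.25 - 0.36 + 0.9 =  0.79, contributes 0.79 (margin violated).
loss = triplet_loss_np(anchor, positive, negative)
print(loss)
```

The clipping is applied per triplet before summing. Summing first would let an easy triplet cancel out a hard one.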




In [1]:
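# GRADED FUNCTION: triplet_loss

def triplet_loss(y_true, y_pred, alpha = 0.9):
    """
    Triplet loss as defined by formula (3).

    y_true -- unused; required by the Keras loss signature
    y_pred -- list of three tensors (anchor, positive, negative), each of shape (None, 128)
    alpha -- margin
    """
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]

    # Term (1): squared anchor-positive distance, summed over the encoding axis
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    # Term (2): squared anchor-negative distance, summed over the encoding axis
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    # Per-triplet value inside [.]_+ : the margin is added to every example
    basic_loss = pos_dist - neg_dist + alpha
    # Clip each example at zero, then sum over the training examples
    loss = tf.reduce_sum(tf.maximum(basic_loss, 0.0))

    return loss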

## 2 - Loading the trained model

FaceNet is trained by minimizing the triplet loss. 

In [6]:
FRmodel.compile(optimizer = 'adam', loss = triplet_loss, metrics = ['accuracy'])
load_weights_from_FaceNet(FRmodel)

## 3 - Applying the model

### 3.1 - Face Verification

Let's build a database containing one encoding vector for each authentic person.

In [9]:
import glob
from os.path import basename

database = {}

for file_path in glob.glob("./database/*.jpg"):
    # Use the file name without its extension as the person's name
    name = os.path.splitext(basename(file_path))[0]
    database[name] = img_to_encoding(file_path, FRmodel)
    


In [10]:
# GRADED FUNCTION: verify

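def verify(image_path, identity, database, model):
    """
    Verify that the person in the image at image_path is identity.

    Returns:
    dist -- distance between the image's encoding and the stored encoding of identity
    door_open -- 1 if the door should open, 0 otherwise
    """
    # Compute the encoding with the model passed in, not the global FRmodel
    encoding = img_to_encoding(image_path, model)

    # L2 distance to the stored encoding of the claimed identity
    dist = np.linalg.norm(encoding - database[identity])

    # Open the door only if the distance is below the threshold
    if dist < 0.5:
        print("It's " + str(identity) + ", welcome home!")
        door_open = 1
    else:
        print("It's not " + str(identity) + ", please go away")
        door_open = 0

    return dist, door_open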

In [12]:
verify("Camera1.jpg", "Mandip", database, FRmodel)

It's Mandip, welcome home!


(0.27289486, 1)

In [13]:
verify("Camera2.jpg", "Sandeep", database, FRmodel)

It's not Sandeep, please go away


(0.7324429, 0)

### 3.2 - Face Recognition

Your face verification system is mostly working well. But since Kian got his ID card stolen, when he came back to the house that evening he couldn't get in! 

To reduce such shenanigans, you'd like to change your face verification system to a face recognition system. This way, no one has to carry an ID card anymore. An authorized person can just walk up to the house, and the front door will unlock for them! 

You'll implement a face recognition system that takes as input an image, and figures out if it is one of the authorized persons (and if so, who). Unlike the previous face verification system, we will no longer get a person's name as another input. 

**Exercise**: Implement `who_is_it()`. You will have to go through the following steps:
1. Compute the target encoding of the image from image_path
2. Find the encoding in the database that has the smallest distance to the target encoding. 
    - Initialize the `min_dist` variable to a large enough number (100). It will help you keep track of what is the closest encoding to the input's encoding.
    - Loop over the database dictionary's names and encodings. To loop use `for (name, db_enc) in database.items()`.
        - Compute L2 distance between the target "encoding" and the current "encoding" from the database.
        - If this distance is less than the min_dist, then set min_dist to dist, and identity to name.

In [18]:
# GRADED FUNCTION: who_is_it

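def who_is_it(image_path, database, model):
    """
    Find which person in the database is closest to the image at image_path.

    Returns:
    min_dist -- smallest distance between the image's encoding and the database encodings
    identity -- name of the closest database entry (None if the database is empty)
    """
    # Step 1: compute the target encoding of the image
    encoding = img_to_encoding(image_path, model)

    # Step 2: find the closest encoding in the database
    min_dist = 100
    identity = None  # stays None if no entry is closer than the initial min_dist

    for (name, db_enc) in database.items():
        # L2 distance between the target encoding and the current database encoding
        dist = np.linalg.norm(encoding - db_enc)

        if dist < min_dist:
            min_dist = dist
            identity = name

    if min_dist > 0.6:
        print("Not in the database.")
    else:
        print("it's " + str(identity) + ", the distance is " + str(min_dist))

    return min_dist, identity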

In [21]:
who_is_it("jd2.jpg", database, FRmodel)

it's Jigar, the distance is 0.42339987


(0.42339987, 'Jigar')
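
The loop in `who_is_it` is a linear nearest-neighbor search. For larger databases, the same search can be sketched in vectorized form with NumPy. The function name `nearest_identity`, the toy 4-dimensional encodings, and the example names are hypothetical; the sketch assumes a non-empty database whose entries all have the same number of elements.

```python
import numpy as np

def nearest_identity(encoding, database, threshold=0.6):
    """Hypothetical vectorized variant of the nearest-encoding search.

    Assumes `database` is non-empty. Returns (min_dist, name), with name
    set to None when the closest entry is farther than `threshold`.
    """
    names = list(database.keys())
    # Stack all stored encodings into one (K, d) matrix.
    matrix = np.stack([np.ravel(database[n]) for n in names])
    # L2 distance from the query to every stored encoding at once.
    dists = np.linalg.norm(matrix - np.ravel(encoding), axis=1)
    best = int(np.argmin(dists))
    min_dist = float(dists[best])
    return min_dist, (names[best] if min_dist <= threshold else None)

# Toy database with made-up 4-dimensional encodings.
toy_db = {
    "alice": np.array([[1.0, 0.0, 0.0, 0.0]]),
    "bob":   np.array([[0.0, 1.0, 0.0, 0.0]]),
}
query_known = np.array([[0.9, 0.1, 0.0, 0.0]])
query_unknown = np.array([[0.0, 0.0, 0.0, 1.0]])

dist_known, who_known = nearest_identity(query_known, toy_db)
dist_unknown, who_unknown = nearest_identity(query_unknown, toy_db)
print(who_known, who_unknown)
```

Unlike `who_is_it`, this variant returns `None` as the name when no entry is within the threshold, so the caller cannot mistake a rejected match for an accepted one.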

**What you should remember**:
- Face verification solves an easier 1:1 matching problem; face recognition addresses a harder 1:K matching problem. 
- The triplet loss is an effective loss function for training a neural network to learn an encoding of a face image.
- The same encoding can be used for verification and recognition. Measuring distances between two images' encodings allows you to determine whether they are pictures of the same person. 

### References:

- Florian Schroff, Dmitry Kalenichenko, James Philbin (2015). [FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/pdf/1503.03832.pdf)
- Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf (2014). [DeepFace: Closing the gap to human-level performance in face verification](https://research.fb.com/wp-content/uploads/2016/11/deepface-closing-the-gap-to-human-level-performance-in-face-verification.pdf) 
- The pretrained model we use is inspired by Victor Sy Wang's implementation and was loaded using his code: https://github.com/iwantooxxoox/Keras-OpenFace.
- Our implementation also took a lot of inspiration from the official FaceNet github repository: https://github.com/davidsandberg/facenet 
