## Real Time Face Recognition using ConvNet Project Report

##### Architecture Used:

* CNN
* Inception model
* FaceNet

The cell below imports all the libraries required to run the program.

In [1]:
import os
import re
import PIL.Image  # binds the name PIL and makes sure the PIL.Image submodule is loaded
import cv2
import mtcnn
import numpy as np
from numpy import genfromtxt
import pandas as pd
import tensorflow as tf
import tensorflow.keras as ks
from keras.models import Sequential, Model
from keras.layers import Conv2D, ZeroPadding2D, Activation, Input, concatenate
from keras.layers.normalization import BatchNormalization
from keras.layers.pooling import MaxPooling2D, AveragePooling2D
from keras.layers.merge import Concatenate
from keras.layers.core import Lambda, Flatten, Dense
from keras.initializers import glorot_uniform
from keras.engine.topology import Layer
from keras import backend as K
# The FaceNet model used below takes channels-first input: (m, 3, 96, 96)
K.set_image_data_format('channels_first')
from fr_utils import *
from inception_blocks_v2 import *
import matplotlib.pyplot as plt
%matplotlib inline


Using TensorFlow backend.


In [2]:
def _get_available_gpus():
    """Get a list of available gpu devices (formatted as strings).

    # Returns
        A list of available GPU devices.
    """
    #global _LOCAL_DEVICES
    if tfback._LOCAL_DEVICES is None:
        devices = tf.config.list_logical_devices()
        tfback._LOCAL_DEVICES = [x.name for x in devices]
    return [x for x in tfback._LOCAL_DEVICES if 'device:gpu' in x.lower()]

In [3]:
import keras.backend.tensorflow_backend as tfback
tfback._get_available_gpus = _get_available_gpus

1. After studying the FaceNet architecture and the OpenFace GitHub repository, I built a similar architecture based on the Inception model.

2. Training a deep CNN requires a large dataset and a lot of computational power, so I loaded the pretrained FaceNet weights from OpenFace into my model.

3. The cell below creates the model and initializes it with the pretrained weights.


In [4]:
FRmodel = faceRecoModel(input_shape=(3, 96, 96))
load_weights_from_FaceNet(FRmodel)

Running the cell below shows the model's input and output shapes, where m is the batch size:
* input = (m, 3, 96, 96)
* output = (m, 128)

In [12]:
print(FRmodel.inputs)
print(FRmodel.outputs)

[<tf.Tensor 'input_1:0' shape=(None, 3, 96, 96) dtype=float32>]
[<tf.Tensor 'lambda_1/l2_normalize:0' shape=(None, 128) dtype=float32>]
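
The output tensor name `lambda_1/l2_normalize` suggests that the last layer L2-normalizes each embedding. If that is the case, every embedding has unit length, so the Euclidean distance between any two embeddings lies between 0 and 2. The sketch below illustrates that bound; it uses random vectors as stand-ins for real model outputs, and `l2_normalize` is a helper written for this example only.

```python
import numpy as np

def l2_normalize(v, eps=1e-10):
    """Scale each row of v to unit Euclidean length."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / np.maximum(norm, eps)

rng = np.random.default_rng(0)
# Random stand-ins for raw 128-d network outputs (not real FaceNet embeddings).
raw = rng.normal(size=(5, 128))
emb = l2_normalize(raw)

# Pairwise Euclidean distances between the unit-length embeddings.
diff = emb[:, None, :] - emb[None, :, :]
dist = np.linalg.norm(diff, axis=-1)

print(np.linalg.norm(emb, axis=1))  # each norm is 1 up to rounding
print(dist.max() <= 2.0)            # distances between unit vectors cannot exceed 2
```

Seen this way, a recognition threshold close to 2 sits near the largest distance two embeddings can have.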


##### In the preprocessing step I tried different ways of preparing the input for the pretrained model:

1. Cropped the faces of different people with some extra pixels around each face. This gave results, but the threshold on the minimum distance for recognition had to be set to around 2, which is rather high.

2. Cropped the faces more tightly, with few extra pixels. This time the results were acceptable with the threshold on the minimum distance set to around 1.

3. Explored several face detectors. MTCNN performed much better than the others, so I used MTCNN for face detection in the preprocessing stage of the pipeline.
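
The difference between the looser and tighter crops in steps 1 and 2 can be sketched as a crop with an adjustable margin. This is an illustration only: `crop_with_margin` and its `margin` parameter are hypothetical names, the frame is a blank array, and the box is made up. The only thing taken from the report is the (x, y, width, height) box layout that MTCNN returns under `'box'`.

```python
import numpy as np

def crop_with_margin(image, box, margin=0.0):
    """Crop a face from an H x W x C image.

    box is (x, y, width, height). margin is a hypothetical parameter:
    the fraction of the box size added on every side (0.0 is the tight crop).
    """
    x, y, w, h = box
    dx, dy = int(round(w * margin)), int(round(h * margin))
    img_h, img_w = image.shape[:2]
    # Clamp to the image so negative or oversized boxes still give a valid slice.
    x1, y1 = max(0, x - dx), max(0, y - dy)
    x2, y2 = min(img_w, x + w + dx), min(img_h, y + h + dy)
    return image[y1:y2, x1:x2]

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # blank stand-in for a webcam frame
box = (300, 200, 100, 120)                       # made-up detection

tight = crop_with_margin(frame, box)               # no extra pixels
loose = crop_with_margin(frame, box, margin=0.25)  # 25% extra on each side
print(tight.shape, loose.shape)
```

Both crops would still be resized to 96x96 before being passed to the model, so a looser crop leaves fewer pixels for the face itself.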

###### Running the 2 cells below will:

1. Open the laptop's webcam.
2. Take around 10 images of the person in front of the webcam, detect the faces in them using MTCNN, and store the face crops in the images folder of the current working directory. This folder serves as the database from which the 128-dimensional embedding vectors are computed later.

In [39]:
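def get_faces_to_database_for_embedding():
    os.makedirs('images', exist_ok=True) # make sure the output folder exists
    # MTCNN works on channels-last images; build the detector once, not per frame
    K.set_image_data_format('channels_last')
    detect = mtcnn.MTCNN()
    vc = cv2.VideoCapture(0)
    if vc.isOpened(): # try to get the first frame
        rval, frame = vc.read()
    else:
        rval = False
    count = 0
    while rval:
        cv2.imshow("Face Recognition", frame)
        rval, frame = vc.read()
        if not rval: # stop if the camera returns no frame
            break
        pix = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) # OpenCV frames are BGR
        result_faces = detect.detect_faces(pix)
        for fa in result_faces:
            (x, y, width, height) = fa['box']
            # abs() keeps the slice indices non-negative in case MTCNN reports a negative coordinate
            x1 = abs(x)
            y1 = abs(y)
            x2 = x1 + width
            y2 = y1 + height
            face = pix[y1:y2, x1:x2]
            image = PIL.Image.fromarray(face)
            image = image.resize((96, 96), PIL.Image.LANCZOS) # LANCZOS is the filter formerly named ANTIALIAS
            image.save('images/janardhan' + str(count+1) + '.jpg')
            count += 1
        key = cv2.waitKey(20)
        if key == 27 or count >= 10: # exit on ESC or once about 10 faces are saved
            break
    cv2.destroyWindow("Face Recognition")
    vc.release()
    K.set_image_data_format('channels_first') # restore the format used by the FaceNet model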

In [40]:
get_faces_to_database_for_embedding()

##### Running the 2 cells below will:

1. Create a dictionary holding a 128-dimensional embedding vector for each image file in the database, using our model with its pretrained weights.

2. Provide the embedding vectors that are later used to measure similarity.

In [5]:
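def create_image_embedding():
    embeddings = {} # avoid shadowing the built-in dict
    for f in os.listdir('images'):
        impath = os.path.join('images', f) # portable path instead of a hard-coded backslash
        # key is the file name without its extension, e.g. 'janardhan1'
        embeddings[f.split('.')[0]] = img_to_encoding(impath, FRmodel)
    return embeddings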

In [6]:
DB_faces=create_image_embedding()
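
`img_to_encoding` comes from `fr_utils` and is not shown in this report. As a rough sketch of the preprocessing a channels-first model with input shape (m, 3, 96, 96) needs, the hypothetical helper `to_model_input` below turns a 96x96 RGB image array into a batch of one. The scaling to [0, 1] is an assumption made for this example, not something the report confirms.

```python
import numpy as np

def to_model_input(rgb_image):
    """Convert an H x W x 3 uint8 image to a (1, 3, H, W) float32 batch."""
    x = rgb_image.astype(np.float32) / 255.0  # assumed scaling to [0, 1]
    x = np.transpose(x, (2, 0, 1))            # H x W x C -> C x H x W
    return x[np.newaxis, ...]                 # add the batch dimension (m = 1)

# Random stand-in for a 96x96 face crop.
rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(96, 96, 3), dtype=np.uint8)

batch = to_model_input(face)
print(batch.shape, batch.dtype)
```

An array of this shape matches the (None, 3, 96, 96) input printed for FRmodel earlier.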

###### The function below measures the similarity between images:

1. It takes the path of the image to be recognized and converts the image into a 128-dimensional embedding vector.
2. Using a for loop, it compares this vector with the embedding vector of each database image and keeps the smallest distance.
3. If the minimum distance is greater than 0.7, the face is considered not similar to any database image and is not identified (the returned name is None).
4. If the minimum distance is at most 0.7, it returns the distance and the name of the person.

In [7]:
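def who_is_it(image_path, database, model):
    """Return (min_dist, identity) for the face stored at image_path.

    identity is None when no database face is within the 0.7 threshold.
    """
    encoding = img_to_encoding(image_path, model)

    min_dist = 100
    identity = None
    for (name, db_enc) in database.items():
        # Euclidean (L2) distance between the two embeddings
        dist = np.linalg.norm(db_enc - encoding)
        if dist < min_dist:
            min_dist = dist
            identity = name

    if min_dist > 0.7:
        #print("Not in the database.")
        identity = None
    #else:
        #print ("it's " + str(identity) + ", the distance is " + str(min_dist))

    return min_dist, identity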
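
The matching rule can be tried without a model or a webcam. The sketch below is a vectorized variant of the same nearest-match idea, not the function used in this report: `nearest_identity`, `unit`, the 3-dimensional toy embeddings, and the names `alice1` and `bob1` are all made up for illustration.

```python
import numpy as np

def nearest_identity(encoding, database, threshold=0.7):
    """Return (min_dist, name), with name set to None above the threshold."""
    if not database:
        return float("inf"), None
    names = list(database)
    # Stack the stored embeddings into one (n, d) array.
    stored = np.stack([np.ravel(database[n]) for n in names])
    dists = np.linalg.norm(stored - np.ravel(encoding), axis=1)
    best = int(np.argmin(dists))
    min_dist = float(dists[best])
    return min_dist, (names[best] if min_dist <= threshold else None)

def unit(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=np.float64)
    return v / np.linalg.norm(v)

# Toy unit-length embeddings; real ones would come from the model.
db = {"alice1": unit([1, 0, 0]), "bob1": unit([0, 1, 0])}

print(nearest_identity(unit([0.9, 0.1, 0]), db))  # close to alice1, so it is matched
print(nearest_identity(unit([0, 0, 1]), db))      # far from both, so the name is None
```

Computing all distances in one call avoids the Python loop, which may matter once the database holds many images.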

###### Running the 2 cells below will:

1. Open the webcam to capture the input image for face recognition.
2. Run the MTCNN algorithm on the image to detect the face.
3. Obtain the corner coordinates (x1, y1) and (x2, y2) of the face from the detection.
4. Run who_is_it to check whether the detected face is similar to any face in the database.
5. Return the person's name if a match is found, otherwise None.
6. Draw a box around the face using these coordinates and write the matched name, or None, above it.

In [8]:
def recognize_input_face(database):
    # MTCNN works on channels-last images; build the detector once, not per frame
    K.set_image_data_format('channels_last')
    detect = mtcnn.MTCNN()
    K.set_image_data_format('channels_first')
    vc = cv2.VideoCapture(0)
    if vc.isOpened(): # try to get the first frame
        rval, frame = vc.read()
    else:
        rval = False
    while rval:
        cv2.imshow("Face Recognition", frame)
        rval, frame = vc.read()
        if not rval: # stop if the camera returns no frame
            break
        pix = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) # OpenCV frames are BGR
        K.set_image_data_format('channels_last')
        result_faces = detect.detect_faces(pix)
        K.set_image_data_format('channels_first') # FRmodel takes channels-first input
        for fa in result_faces:
            (x, y, width, height) = fa['box']
            # abs() keeps the slice indices non-negative in case MTCNN reports a negative coordinate
            x1 = abs(x)
            y1 = abs(y)
            x2 = x1 + width
            y2 = y1 + height
            face = pix[y1:y2, x1:x2]
            image = PIL.Image.fromarray(face)
            image = image.resize((96, 96), PIL.Image.LANCZOS) # LANCZOS is the filter formerly named ANTIALIAS
            image.save('temp.jpg')
            _, name = who_is_it('temp.jpg', database, FRmodel)
            cv2.rectangle(frame, (x1, y1), (x2, y2), (255, 255, 255), 2)
            # drop the image counter from the key, e.g. 'janardhan3' -> 'janardhan'
            label = re.split(r'(\d+)', name)[0] if name is not None else str(name)
            cv2.putText(frame, label, (x1+5, y1-5), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
        # always show the BGR frame, annotated if any face was found
        cv2.imshow("Face Recognition", frame)
        key = cv2.waitKey(100)
        if key == 27: # exit on ESC
            break
    cv2.destroyWindow("Face Recognition")
    vc.release()

In [9]:
recognize_input_face(DB_faces)
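
The label drawn on the frame is derived from the database key by cutting it at the first run of digits. The small sketch below isolates that step; `display_name` is a hypothetical helper name, while the regular expression is the one used in `recognize_input_face`.

```python
import re

def display_name(key):
    """Strip the image counter from a database key."""
    # re.split with a capturing group keeps the digits as separate items,
    # so the first item is the text before the first run of digits.
    return re.split(r"(\d+)", key)[0]

print(display_name("janardhan3"))
print(display_name("janardhan10"))
```

Both calls give `janardhan`. A key that starts with a digit would produce an empty label, so this convention assumes that file names begin with the person's name.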

### References:

1. FaceNet: A Unified Embedding for Face Recognition and Clustering (https://arxiv.org/pdf/1503.03832.pdf)

2. DeepFace: Closing the Gap to Human-Level Performance in Face Verification (https://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf)

3. Pre-trained OpenFace model from Keras-OpenFace, including the pretrained weights (https://github.com/iwantooxxoox/Keras-OpenFace)

4. facenet repository by davidsandberg (https://github.com/davidsandberg/facenet)
