`For more powerful face recognition and verification, refer to your course on Deep Learning`

`For an OpenCV implementation of powerful face recognition, refer to https://www.pyimagesearch.com/2018/06/18/face-recognition-with-opencv-python-and-deep-learning/`

# Introduction

We are working on **Face Recognition** with OpenCV (Open Source Computer Vision).

To create a complete project on Face Recognition, we must work on 3 very distinct phases:

  - Face Detection and Data Gathering
  - Train the Recognizer
  - Face Recognition

The block diagram below summarizes those phases:

![picture](https://miro.medium.com/max/1020/0*oJIRaoERCUHoyylG.)

## Testing camera

This will display the camera feed in both color (BGR) and grayscale.

`Note` --> Run it on a local machine with a webcam.

In [2]:
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
cap.set(3,640) # set Width
cap.set(4,480) # set Height
while True:
    ret, frame = cap.read()
    if not ret: # stop if the camera returned no frame
        break
    #frame = cv2.flip(frame, -1) # Flip both axes (rotate 180 degrees)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    cv2.imshow('frame', frame)
    cv2.imshow('gray', gray)
    
    k = cv2.waitKey(30) & 0xff
    if k == 27: # press 'ESC' to quit
        break
cap.release()
cv2.destroyAllWindows()
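
The grayscale window above comes from `cv2.COLOR_BGR2GRAY`. As a rough illustration, the OpenCV documentation describes this conversion as a weighted sum of the channels, Y = 0.299 R + 0.587 G + 0.114 B. The sketch below applies that formula to single pixels in plain Python. The helper name `bgr_to_gray` is hypothetical, and OpenCV's own fixed-point arithmetic may differ from this by one intensity level.

```python
def bgr_to_gray(pixel):
    """Convert one (B, G, R) pixel to a gray intensity using the weighted-sum formula."""
    b, g, r = pixel  # OpenCV stores channels in B, G, R order
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


# Green contributes most to perceived brightness, blue the least.
for name, pixel in [("blue", (255, 0, 0)), ("green", (0, 255, 0)), ("red", (0, 0, 255))]:
    print(name, bgr_to_gray(pixel))
```

Because the weights differ per channel, passing an RGB-ordered image to a BGR conversion would swap the red and blue weights and produce a slightly different grayscale image.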

## Face Detection

![picture](https://s3-us-west-2.amazonaws.com/static.pyimagesearch.com/face-recognition-opencv/face_recognition_opencv_animation_01.gif)

How does it work?

**The secret is a technique called deep metric learning.**

If you have any prior experience with deep learning you know that we typically train a network to:

  - Accept a single input image
  - And output a classification/label for that image

However, deep metric learning is different.

Instead of trying to output a single label (or even the coordinates/bounding box of objects in an image), we output a real-valued feature vector.

For the dlib facial recognition network, the output feature vector is 128-d (i.e., a list of 128 real-valued numbers) that is used to quantify the face. Training the network is done using triplets:

![picture](https://www.pyimagesearch.com/wp-content/uploads/2018/06/face_recognition_opencv_triplet.jpg)

Here we provide three images to the network:

  - Two of these images are example faces of the same person.
  - The third image is a random face from our dataset and is not the same person as the other two images.

As an example, let’s again consider the figure above, where we provided three images: one of Chad Smith and two of Will Ferrell.

Our network quantifies the faces, constructing the 128-d embedding (quantification) for each.

From there, the general idea is that we’ll tweak the weights of our neural network so that the 128-d measurements of the two Will Ferrell images will be closer to each other and farther from the measurements for Chad Smith.
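
The "closer together, farther apart" objective can be written as a triplet loss. The sketch below is a minimal illustration in plain Python (it needs Python 3.8+ for `math.dist`). The function name, the toy 3-d vectors, and the margin value of 0.2 are assumptions chosen for illustration; this is not necessarily the exact loss used to train the dlib network.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on squared Euclidean distances.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is; otherwise it is positive.
    """
    d_pos = math.dist(anchor, positive) ** 2
    d_neg = math.dist(anchor, negative) ** 2
    return max(0.0, d_pos - d_neg + margin)


# Toy 3-d "embeddings" standing in for the real 128-d vectors.
anchor = [0.0, 0.0, 0.0]      # first image of the person
positive = [0.1, 0.0, 0.0]    # second image of the same person
negative = [0.1, 0.1, 0.0]    # image of a different person, still too close

print(triplet_loss(anchor, positive, negative))          # positive: weights need adjusting
print(triplet_loss(anchor, positive, [1.0, 0.0, 0.0]))   # zero: already well separated
```

During training, the gradient of this loss is what pushes the two same-person embeddings together and the different-person embedding away.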

Our network architecture for face recognition is based on ResNet-34 from the [Deep Residual Learning for Image Recognition paper by He et al.](https://arxiv.org/abs/1512.03385), but with fewer layers and the number of filters reduced by half.

The network itself was trained by [Davis King](https://pyimagesearch.com/2017/03/13/an-interview-with-davis-king-creator-of-the-dlib-toolkit/) on a dataset of ~3 million images. On the [Labeled Faces in the Wild (LFW) dataset](http://vis-www.cs.umass.edu/lfw/) the network is comparable to other state-of-the-art methods, reaching `99.38% accuracy`.

Both Davis King (the creator of dlib) and Adam Geitgey (the author of the face_recognition module we’ll be using shortly) have written detailed articles on how deep learning-based facial recognition works:

  - [High Quality Face Recognition with Deep Metric Learning ](http://blog.dlib.net/2017/02/high-quality-face-recognition-with-deep.html) (Davis)

  - [Modern Face Recognition with Deep Learning](https://medium.com/@ageitgey/machine-learning-is-fun-part-4-modern-face-recognition-with-deep-learning-c3cffc121d78) (Adam)
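
Once every face is reduced to an embedding, recognition becomes a nearest-neighbor search: a new face is assigned to the closest known embedding, provided the distance is small enough. The sketch below illustrates this in plain Python (3.8+ for `math.dist`). The function name `best_match`, the toy 2-d vectors, and the names in `known` are hypothetical; the 0.6 tolerance follows the threshold suggested for dlib's face descriptors and is best treated as a starting point to tune.

```python
import math


def best_match(query, known, tolerance=0.6):
    """Return the name whose embedding is closest to `query`,
    or None if even the closest one is farther than `tolerance`."""
    best_name, best_dist = None, float("inf")
    for name, embedding in known.items():
        dist = math.dist(query, embedding)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= tolerance else None


# Toy 2-d embeddings standing in for the real 128-d vectors.
known = {"will": [0.0, 0.0], "chad": [1.0, 1.0]}
print(best_match([0.1, 0.0], known))   # close to "will"
print(best_match([5.0, 5.0], known))   # too far from everyone
```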

## Let's Create the Scene

## A Little Background

The most basic task in face recognition is, of course, face detection. Before anything else, you must “capture” a face (Phase 1) so that it can be recognized when compared with a new face captured in the future (Phase 3).

A common way to detect a face (or any object) is to use the [“Haar Cascade classifier”](https://docs.opencv.org/3.3.0/d7/d8b/tutorial_py_face_detection.html).

Object Detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their paper, “Rapid Object Detection using a Boosted Cascade of Simple Features” in 2001. It is a machine learning based approach where a cascade function is trained from a lot of positive and negative images. It is then used to detect objects in other images.
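
To make "Haar features" more concrete: a Haar-like feature is a difference between pixel sums of adjacent rectangles, and the Viola-Jones method computes those sums quickly with an integral image (summed-area table). The sketch below shows both ideas in plain Python on a tiny image stored as nested lists. The function names are hypothetical, and the left-minus-right sign convention is an assumption for illustration.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column:
    ii[y][x] holds the sum of all pixels above and to the left of (x, y)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the rectangle with top-left (x, y), width w and height h,
    using only four table lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# A 4x4 patch that is bright on the left and dark on the right.
patch = [[9, 9, 1, 1] for _ in range(4)]
ii = integral_image(patch)
print(two_rect_feature(ii, 0, 0, 4, 4))  # large value: strong vertical edge
```

A cascade combines many such features, each with a learned threshold, and rejects most non-face windows after evaluating only the first few.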

Here we will work with face detection. Initially, the algorithm needs a lot of positive images (images of faces) and negative images (images without faces) to train the classifier. Then we need to extract features from them. The good news is that OpenCV comes with a trainer as well as a detector. If you want to train your own classifier for any object, such as cars or planes, you can use OpenCV to create one. Full details are given here: [Cascade Classifier Training](https://docs.opencv.org/3.3.0/dc/d88/tutorial_traincascade.html).


If you do not want to create your own classifier, OpenCV already contains many pre-trained classifiers for face, eyes, smile, etc. Those XML files can be downloaded from the [haarcascades](https://github.com/Itseez/opencv/tree/master/data/haarcascades) directory.

Enough theory, let’s create a face detector with OpenCV!

In [4]:
# installing some libraries
!pip install face_recognition
!pip install imutils # A series of convenience functions to make basic image processing functions such as 
                     # translation, rotation, resizing, skeletonization, 
                     # and displaying Matplotlib images easier with OpenCV




In [3]:
import numpy as np
import cv2

faceCascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml') # cascade classifier front face only
cap = cv2.VideoCapture(0)
cap.set(3, 640) # set Width
cap.set(4, 480) # set Height

while True:
    ret, img = cap.read()
    if not ret: # stop if the camera returned no frame
        break
    #img = cv2.flip(img, -1) # Flip both axes (rotate 180 degrees)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # detectMultiScale returns one rectangle (x, y, w, h) per detected face:
    # (x, y) is the top-left corner, w the width and h the height.
    faces = faceCascade.detectMultiScale(
        gray,            # input grayscale image
        scaleFactor=1.2, # how much the image size is reduced at each image scale;
                         # used to create the scale pyramid
        minNeighbors=5,  # how many neighbors each candidate rectangle needs to be kept;
                         # a higher number gives fewer false positives (and may miss faces)
        minSize=(20, 20) # smallest face size, in pixels, to report
    )

    # Mark each detected face in the image with a blue rectangle (BGR color order).
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
        roi_gray = gray[y:y+h, x:x+w]  # face region in grayscale, for later processing
        roi_color = img[y:y+h, x:x+w]  # face region in color

    cv2.imshow('video', img)

    k = cv2.waitKey(30) & 0xff
    if k == 27: # press 'ESC' to quit
        break

cap.release()
cv2.destroyAllWindows()
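
The `scaleFactor` parameter controls how many sizes the detector tries, which is the main trade-off between speed and the chance of missing a face. The sketch below is a conceptual approximation in plain Python, not OpenCV's internal implementation: it lists the window sizes a multi-scale scan would visit. The function name is hypothetical, and the 24x24 base window is an assumption about the cascade's training size.

```python
def pyramid_scales(image_size, window_size=(24, 24), scale_factor=1.2, min_size=(20, 20)):
    """List the detection-window sizes a multi-scale scan would try.

    The window grows by `scale_factor` at each step until it no longer
    fits inside the image; sizes smaller than `min_size` are skipped.
    """
    if scale_factor <= 1:
        raise ValueError("scale_factor must be greater than 1")
    img_w, img_h = image_size
    win_w, win_h = window_size
    sizes = []
    scale = 1.0
    while True:
        w, h = round(win_w * scale), round(win_h * scale)
        if w > img_w or h > img_h:
            break
        if w >= min_size[0] and h >= min_size[1]:
            sizes.append((w, h))
        scale *= scale_factor
    return sizes


# Smaller factors mean more scales to scan: slower, but finer size coverage.
for factor in (1.05, 1.2, 1.5):
    print(factor, len(pyramid_scales((640, 480), scale_factor=factor)))
```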

# Labeling the Detected Faces

## Data Gathering

Let’s start the first phase of our project. Starting from the last step (face detection), we will simply create a dataset that stores, for each id, a group of grayscale photos cropped to the region where the face was detected.

![picture](https://miro.medium.com/max/960/0*Nuf1sgV1y5DaH6wF.)

In [4]:
# create directory to store our facial samples
!mkdir dataset

In [2]:
# This code will create dataset for training
import cv2
import os
cam = cv2.VideoCapture(0)
cam.set(3, 640) # set video width
cam.set(4, 480) # set video height
face_detector = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# For each person, enter one numeric face id
face_id = input('\nenter user id and press ==> ')
print('\n [INFO] Initializing face capture. Look the camera and wait ... ')
# Initialize individual sampling face count
count = 0
while True:
    ret, img = cam.read()
    if not ret: # stop if the camera returned no frame
        break
    #img = cv2.flip(img, -1) # flip both axes (rotate 180 degrees)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, 1.3, 5) # scaleFactor=1.3, minNeighbors=5
    for (x,y,w,h) in faces:
        cv2.rectangle(img, (x,y), (x+w,y+h), (255,0,0), 2)
        count += 1
        # Save the cropped grayscale face into the dataset folder
        cv2.imwrite("dataset/User." + str(face_id) + '.' +
                    str(count) + ".jpg", gray[y:y+h,x:x+w])
    cv2.imshow('image', img) # show every frame, even when no face is found
    k = cv2.waitKey(100) & 0xff # Press 'ESC' to exit video
    if k == 27:
        break
    elif count >= 30: # Take 30 face samples and stop video
        break
# Do a bit of cleanup
print("\n [INFO] Exiting Program and cleanup stuff")
cam.release()
cv2.destroyAllWindows()


enter user id and press ==> 2

 [INFO] Initializing face capture. Look the camera and wait ... 

 [INFO] Exiting Program and cleanup stuff
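
The capture loop above saves each sample as `dataset/User.<id>.<count>.jpg`, and the id is later recovered from the file name alone. The sketch below shows that round trip in plain Python. The helper names `sample_path` and `parse_face_id` are hypothetical; the stricter validation is an addition for illustration, so that stray files in the folder raise a clear error instead of being misread.

```python
from pathlib import Path


def sample_path(dataset_dir, face_id, count):
    """Build the path for one captured sample: <dataset_dir>/User.<id>.<count>.jpg"""
    return Path(dataset_dir) / f"User.{face_id}.{count}.jpg"


def parse_face_id(path):
    """Recover the numeric id from a sample file name.

    Raises ValueError if the name does not follow User.<id>.<count>.jpg.
    """
    parts = Path(path).name.split(".")
    if len(parts) != 4 or parts[0] != "User" or parts[3] != "jpg":
        raise ValueError(f"unexpected sample name: {path}")
    return int(parts[1])


path = sample_path("dataset", 2, 7)
print(path.name, "->", parse_face_id(path))
```

Keeping the id inside the file name means the dataset folder needs no separate label file, at the cost of requiring numeric ids.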


## Training the Recognizer

Train the recognizer on the dataset created above.

![picture](https://miro.medium.com/max/1020/0*N4IcbE8v2nwgj6Xg.)

In [1]:
# creating subdirectory to store the trained data
!mkdir trainer

In [2]:
# This will train the dataset created above for training
import numpy as np
from PIL import Image
import os
import cv2
# Path for face image database
path = 'dataset'
# The LBPH (Local Binary Patterns Histograms) face recognizer; cv2.face requires opencv-contrib-python
recognizer = cv2.face.LBPHFaceRecognizer_create()
detector = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
# Function to get the images and label data.
# Returns 2 lists, "faceSamples" and "ids", used to train the recognizer.
def getImagesAndLabels(path):
    imagePaths = [os.path.join(path,f) for f in os.listdir(path)]
    faceSamples=[]
    ids = []
    for imagePath in imagePaths:
        PIL_img = Image.open(imagePath).convert('L') # grayscale
        img_numpy = np.array(PIL_img,'uint8')
        # File names look like User.<id>.<count>.jpg, so the id is the second field
        face_id = int(os.path.split(imagePath)[-1].split(".")[1])
        faces = detector.detectMultiScale(img_numpy)
        for (x,y,w,h) in faces:
            faceSamples.append(img_numpy[y:y+h,x:x+w])
            ids.append(face_id)
    return faceSamples,ids
print("\n [INFO] Training faces. It will take a few seconds. Wait ...")
faces,ids = getImagesAndLabels(path)
recognizer.train(faces, np.array(ids))
# Save the model into trainer/trainer.yml
recognizer.write('trainer/trainer.yml')
# Print the number of distinct face ids trained and end program
print("\n [INFO] {0} faces trained. Exiting Program".format(len(np.unique(ids))))


 [INFO] Training faces. It will take a few seconds. Wait ...

 [INFO] 1 faces trained. Exiting Program
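
The recognizer trained above is based on Local Binary Patterns Histograms. As a rough sketch of the underlying idea, the code below computes the basic 3x3 LBP code for each pixel and collects the codes into a 256-bin histogram, in plain Python on nested lists. The function names are hypothetical, and this is the simplest variant only: OpenCV's recognizer uses a configurable circular neighborhood and splits the face into a grid of cells with one histogram each.

```python
def lbp_code(patch):
    """LBP code of the center pixel of a 3x3 patch (a list of 3 rows).

    Neighbors are visited clockwise from the top-left; each neighbor that
    is >= the center contributes a 1 bit, giving a value in 0..255.
    """
    center = patch[1][1]
    neighbors = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                 patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for value in neighbors:
        code = (code << 1) | (1 if value >= center else 0)
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of a 2-D list."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            patch = [row[x - 1:x + 2] for row in img[y - 1:y + 2]]
            hist[lbp_code(patch)] += 1
    return hist


# A bright pixel surrounded by darker ones yields code 0 (no neighbor >= center).
print(lbp_code([[1, 1, 1], [1, 9, 1], [1, 1, 1]]))
```

Because each code depends only on whether neighbors are brighter or darker than the center, the histogram changes little under uniform lighting changes, which is one reason the method suits webcam images.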


## Recognizer

We have now reached the final phase of our project. Here we capture a fresh face with the camera; if this person's face was captured and trained before, the recognizer makes a “prediction”, returning the person's id and a value indicating how confident the recognizer is in the match. For the LBPH recognizer this value is a distance, so a lower value means a closer match.

![picture](https://miro.medium.com/max/947/0*kkZMQyWtR5NOFr3q.)

In [1]:
import cv2
import numpy as np
import os 
recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read('trainer/trainer.yml')
cascadePath = "haarcascade_frontalface_default.xml"
faceCascade = cv2.CascadeClassifier(cascadePath)
font = cv2.FONT_HERSHEY_SIMPLEX
# initialize id counter
id = 0
# names indexed by id: example ==> names[1] is the person captured with id=1, etc
names = ['None', 'Kunal Verma', 'Paula', 'Ilza', 'Z', 'W'] 
# Initialize and start realtime video capture
cam = cv2.VideoCapture(0)
cam.set(3, 640) # set video width
cam.set(4, 480) # set video height
# Define min window size to be recognized as a face
minW = 0.1*cam.get(3)
minH = 0.1*cam.get(4)
while True:
    ret, img = cam.read()
    if not ret: # stop if the camera returned no frame
        break
    #img = cv2.flip(img, -1) # Flip both axes (rotate 180 degrees)
    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    
    faces = faceCascade.detectMultiScale( 
        gray,
        scaleFactor = 1.2,
        minNeighbors = 5,
        minSize = (int(minW), int(minH)),
       )
    for(x,y,w,h) in faces:
        cv2.rectangle(img, (x,y), (x+w,y+h), (0,255,0), 2)
        id, confidence = recognizer.predict(gray[y:y+h,x:x+w])
        
        # "confidence" is a distance: lower is better and 0 is a perfect match.
        # Accept the match only if it is less than 100.
        if (confidence < 100):
            id = names[id]
            confidence = "  {0}%".format(round(100 - confidence))
        else:
            id = "unknown"
            confidence = "  {0}%".format(round(100 - confidence))
        
        cv2.putText(
                    img, 
                    str(id), 
                    (x+5,y-5), 
                    font, 
                    1, 
                    (255,255,255), 
                    2
                   )
        cv2.putText(
                    img, 
                    str(confidence), 
                    (x+5,y+h-5), 
                    font, 
                    1, 
                    (255,255,0), 
                    1
                   )  
    
    cv2.imshow('camera',img) 
    k = cv2.waitKey(10) & 0xff # Press 'ESC' for exiting video
    if k == 27:
        break
# Do a bit of cleanup
print("\n [INFO] Exiting Program and cleanup stuff")
cam.release()
cv2.destroyAllWindows()


 [INFO] Exiting Program and cleanup stuff
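
The labeling logic inside the recognition loop can be isolated into a small pure function, which makes it easier to test without a camera. The sketch below is one possible version in plain Python. The function name `describe_prediction` and the `max_distance` parameter are hypothetical; the bounds check on the id is an addition not present in the loop above, intended to avoid an `IndexError` when a trained id has no entry in `names`.

```python
def describe_prediction(label, distance, names, max_distance=100.0):
    """Turn a predicted (id, distance) pair into display text.

    A lower distance means a closer match, so 0 is a perfect match.
    Returns (name, score) where score is the '100 - distance' percentage
    string used in the loop above.
    """
    score = f"{round(100 - distance)}%"
    known = distance < max_distance and 0 <= label < len(names)
    return (names[label] if known else "unknown"), score


names = ['None', 'Kunal Verma', 'Paula', 'Ilza', 'Z', 'W']
print(describe_prediction(1, 35.0, names))    # accepted match
print(describe_prediction(1, 120.0, names))   # distance too large
print(describe_prediction(9, 10.0, names))    # id with no name entry
```

Note that the percentage is only a display convenience: since the distance is not bounded by 100, the score becomes negative for poor matches.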
