# Facial Landmark Detection

Notebook by [Prashant Brahmbhatt](https://www.github.com/hashbanger)

___

Facial landmarks are used to localize and represent the salient regions of a face, namely:  
- Eyes 
- Eyebrows
- Nose
- Mouth
- Jawline  

Detecting facial landmarks is a subset of the **shape prediction problem**. A shape predictor localizes the key points of interest along a shape. The goal here is to detect important facial features using shape prediction methods.  
It involves two steps:  
- Localizing the face in the image  
- Detecting the features in the face ROI (Region Of Interest)  


 


#### Localizing the Face

We can use a traditional Haar cascade to localize the face in the image, and a pretrained model is sufficient for this purpose. The particular detection method is not the focus here; all we need is a bounding box for the face.
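
As a rough sketch of this step, the snippet below uses the frontal-face Haar cascade bundled with the `opencv-python` package to obtain bounding boxes, then keeps the largest one. The helper names `detect_faces_haar` and `largest_box`, the `scaleFactor` and `minNeighbors` values, and the choice to keep the largest box are illustrative assumptions, not something this notebook prescribes. `cv2` is imported inside the function so that the box-selection helper can be tried without OpenCV installed.

```python
def detect_faces_haar(image_path):
    """Return face boxes as (x, y, w, h) using OpenCV's bundled Haar cascade."""
    import cv2  # imported lazily; requires the opencv-python package

    # cv2.data.haarcascades is the folder of cascade XML files shipped with opencv-python
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    cascade = cv2.CascadeClassifier(cascade_path)

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # scaleFactor and minNeighbors are common starting values (an assumption here)
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)


def largest_box(boxes):
    """Pick the box with the largest area, or None when nothing was detected."""
    if len(boxes) == 0:
        return None
    best = max(boxes, key=lambda b: b[2] * b[3])
    return tuple(int(v) for v in best)
```

Calling `largest_box(detect_faces_haar("face.jpg"))` (with a hypothetical image path) would give a single (x, y, w, h) box to hand to the landmark step.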

#### Detecting facial features

There are several facial feature detectors, but most of them try to localize the following features:  
- Left eye
- Right eye
- Left eyebrow
- Right eyebrow
- Nose
- Mouth
- Jaw  
  
The *dlib* library includes a facial landmark detector based on the paper *One Millisecond Face Alignment with an Ensemble of Regression Trees* by Kazemi and Sullivan, found [here](http://www.nada.kth.se/~sullivan/Papers/Kazemi_cvpr14.pdf).  

The method relies on:  
- A training set of labeled facial landmarks on images. These images are manually labeled, specifying the (x, y)-coordinates of regions surrounding each facial structure.
- Priors, or more specifically, the probability on distance between pairs of input pixels.  

Given this training data, an ensemble of regression trees is trained to estimate the facial landmark positions directly from the pixel intensities themselves; no separate feature extraction step is required.  
The end result is a facial landmark detector that can run in real time with high-quality predictions.   
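
To make the idea of regression trees working directly on pixel intensities more concrete, here is a toy sketch of the additive update: each tree compares the intensities of two pixels and, depending on the result, adds a small correction to the current landmark estimate. Everything in it (the `Stump` class, `run_cascade`, and all numeric values) is hypothetical and hand-picked; nothing is trained. It is also simplified, since the method in the paper learns the trees with gradient boosting and indexes pixels relative to the current shape estimate rather than at fixed positions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Pixel = Tuple[int, int]  # (row, column)


@dataclass
class Stump:
    """Depth-1 regression tree that splits on the difference of two pixel intensities."""
    pixel_a: Pixel
    pixel_b: Pixel
    threshold: float
    update_if_above: List[Point]   # per-landmark (dx, dy) when the difference exceeds the threshold
    update_otherwise: List[Point]  # per-landmark (dx, dy) in the other case

    def predict(self, image: Sequence[Sequence[float]]) -> List[Point]:
        (ra, ca), (rb, cb) = self.pixel_a, self.pixel_b
        difference = image[ra][ca] - image[rb][cb]
        return self.update_if_above if difference > self.threshold else self.update_otherwise


def run_cascade(image, initial_shape: List[Point], stages: List[List[Stump]]) -> List[Point]:
    """Refine a shape estimate by adding each tree's predicted update in turn."""
    shape = list(initial_shape)
    for trees in stages:
        for tree in trees:
            update = tree.predict(image)
            shape = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(shape, update)]
    return shape


# Hand-picked toy values: a 2x2 "image", one landmark, two single-tree stages
image = [[10, 200], [50, 50]]
stages = [
    [Stump((0, 1), (0, 0), 100, [(1.0, -1.0)], [(-1.0, 1.0)])],
    [Stump((1, 0), (1, 1), 0, [(2.0, 2.0)], [(0.5, 0.5)])],
]
refined = run_cascade(image, [(5.0, 5.0)], stages)
print(refined)  # [(6.5, 4.5)]
```

The first stump sees a difference of 190 and moves the landmark by (1, -1); the second sees a difference of 0, which does not exceed its threshold, and moves it by (0.5, 0.5).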

### dlib's facial landmark detector

dlib provides a pretrained facial landmark detector that estimates the location of 68 (x, y)-coordinates that map to the facial structure.  

![img1](img01.jpg)

These annotations are part of the 68-point iBUG 300-W dataset, on which the dlib facial landmark predictor was trained.  
Other models also exist, such as one trained on the well-known HELEN dataset.    

The dlib framework can also be used to train shape predictors for your own custom shape detection tasks.
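
The pretrained 68-point predictor can be used roughly as sketched below. Note that this sketch swaps in dlib's own frontal face detector for the localization step instead of a Haar cascade. The helper names `load_dlib_models` and `detect_landmarks` are hypothetical, and the model file name is an assumption: the `.dat` file has to be downloaded separately and the path adjusted. `dlib` is imported inside the loader so the second helper can be exercised with stand-in objects.

```python
def load_dlib_models(predictor_path="shape_predictor_68_face_landmarks.dat"):
    """Load dlib's frontal face detector and a pretrained shape predictor."""
    import dlib  # imported lazily; requires the dlib package

    detector = dlib.get_frontal_face_detector()
    # predictor_path is an assumed location for the separately downloaded model file
    predictor = dlib.shape_predictor(predictor_path)
    return detector, predictor


def detect_landmarks(gray, detector, predictor, upsample=1):
    """Return one list of (x, y) landmark tuples per detected face."""
    faces = []
    # Step 1: localize faces; the second argument upsamples the image before detection
    for rect in detector(gray, upsample):
        # Step 2: predict the landmarks inside the face rectangle
        shape = predictor(gray, rect)
        faces.append([(shape.part(i).x, shape.part(i).y)
                      for i in range(shape.num_parts)])
    return faces
```

With dlib installed and the model file on disk, `detector, predictor = load_dlib_models()` followed by `detect_landmarks(gray, detector, predictor)` should return 68 points per face, where `gray` is a grayscale image array.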

## Detecting facial landmarks using Dlib and OpenCV

We are going to define a few convenience functions, the same ones found in `face_utils.py` of the imutils library.  

In [2]:
from scipy.spatial import distance as dist
from imutils.video import VideoStream
from imutils import face_utils
from threading import Thread
import numpy as np
import playsound
import argparse
import imutils
import time
import dlib
import cv2

The first function is *rect_to_bb*, short for "rectangle to bounding box". We normally think of a bounding box in the format (x, y, w, h), so the function converts the rectangle object returned by the dlib detector into that tuple.

In [5]:
def rect_to_bb(rect):
    """Convert a rectangle produced by the dlib detector into an (x, y, w, h) tuple."""
    x = rect.left()
    y = rect.top()
    w = rect.right() - x
    h = rect.bottom() - y

    # return the bounding box as a tuple of (x, y, w, h)
    return (x, y, w, h)
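
A quick worked example of the conversion, using a hypothetical `FakeRect` stand-in that exposes the same four accessor methods the function calls on a dlib rectangle. `rect_to_bb` is repeated in compact form only so the snippet runs on its own.

```python
class FakeRect:
    """Hypothetical stand-in with the accessors rect_to_bb uses on a dlib rectangle."""

    def __init__(self, left, top, right, bottom):
        self._left, self._top, self._right, self._bottom = left, top, right, bottom

    def left(self):
        return self._left

    def top(self):
        return self._top

    def right(self):
        return self._right

    def bottom(self):
        return self._bottom


def rect_to_bb(rect):
    # same logic as the notebook's function, condensed
    x, y = rect.left(), rect.top()
    return (x, y, rect.right() - x, rect.bottom() - y)


box = rect_to_bb(FakeRect(left=40, top=60, right=140, bottom=200))
print(box)  # (40, 60, 100, 140)
```

The resulting tuple can be unpacked as `x, y, w, h = box` and used for drawing, for example with `cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)`.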

The second function is *shape_to_np*, which converts the 68 (x, y)-coordinates returned by the dlib predictor into a NumPy array that is easier to work with.


In [6]:
def shape_to_np(shape, dtype='int'):
    """Convert a dlib shape with 68 landmarks into a (68, 2) NumPy array."""
    # initialize the array of (x, y)-coordinates
    coords = np.zeros((68, 2), dtype=dtype)

    # loop over the 68 landmarks and store each one as an (x, y) row
    for i in range(0, 68):
        coords[i] = (shape.part(i).x, shape.part(i).y)

    # return the array of (x, y)-coordinates
    return coords
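
Once the landmarks are in a (68, 2) array, individual facial regions can be pulled out by slicing. The sketch below assumes the index ranges that, to my understanding, imutils' `face_utils` uses for the 68-point layout; the `LANDMARK_RANGES` and `region_points` names are hypothetical, and the ranges should be checked against the annotated image above before being relied on. The `coords` array is a stand-in filled with dummy values rather than real detector output.

```python
import numpy as np

# Half-open (start, end) index ranges for the 68-point layout (assumed, see lead-in)
LANDMARK_RANGES = {
    "jaw": (0, 17),
    "right_eyebrow": (17, 22),
    "left_eyebrow": (22, 27),
    "nose": (27, 36),
    "right_eye": (36, 42),
    "left_eye": (42, 48),
    "mouth": (48, 68),
}


def region_points(coords, name):
    """Return the rows of a (68, 2) landmark array that belong to one facial region."""
    start, end = LANDMARK_RANGES[name]
    return coords[start:end]


# Stand-in for the output of shape_to_np: row i holds the dummy point (2i, 2i + 1)
coords = np.arange(68 * 2).reshape(68, 2)

left_eye = region_points(coords, "left_eye")
print(left_eye.shape)  # (6, 2)
```

Keeping the ranges in one dictionary makes it easy to loop over all regions, for instance to draw each one in a different color.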
