# <font style = "color:rgb(50,120,229)">How to Train a Custom Facial Landmark Detector</font>

# <font style = "color:rgb(50,120,229)">Introduction</font>

To train a facial landmark detector, we need three pieces of information:

1. A few thousand images containing faces.

2. Bounding boxes corresponding to faces in those images.

3. Accurately placed landmarks for each face in the image. 

Most facial landmark datasets are annotated for a relatively small number of points. 

Fortunately, Imperial College London’s research group, iBUG, annotated many famous datasets (300-W, XM2VTS, FRGC, LFPW, HELEN, AFW and IBUG) for the same set of 68 facial landmarks.

Dlib’s facial landmark model is trained on a subset of this data which consists of the AFW, HELEN, iBUG and LFPW datasets.

# <font style = "color:rgb(50,120,229)">Train 70-points Facial Landmarks</font>

The 68 points do not include the centers of the irises. So, our data collection team spent hours annotating these two additional eye points. We are making this 70-points dataset available to everybody in this course.

<center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w3-m6-68points.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w3-m6-68points.jpg" width=500/></a></center>

## <font style = "color:rgb(50,120,229)">Data</font>
You can download the data from the download link given at the top of the module. You can also download it from Dropbox **[link](https://www.dropbox.com/s/q5w1wolzhtnuy5q/facial_landmark_data.zip?dl=1)**. The listings in the Dataset Structure section give a quick overview of how the data is organized in the facial_landmark_data directory.

### <font style = "color:rgb(8,133,37)"><a href="https://www.dropbox.com/s/q5w1wolzhtnuy5q/facial_landmark_data.zip?dl=1">Dropbox Link for Data</a></font>

### <font style = "color:rgb(8,133,37)">Dataset Structure</font>

`facial_landmark_data` directory has:

3 folders

+ datasets - contains images and annotation files for face rectangles and facial landmarks (for both 70-points and 33-points)
+ 33_points - contains XML files for training & testing and 33-points shape predictor model
+ 70_points - contains XML files for training & testing and 70-points shape predictor model

and

6 files

+ image_names.txt - image paths to images in datasets directory, relative to this folder
+ 33_point_landmarks.jpg - 33 facial landmarks marked on face
+ 70_point_landmarks.jpg - 70 facial landmarks marked on face
+ shape_predictor_70_face_landmarks.dat - Dlib shape predictor model file for 70 facial landmarks
+ training_with_face_landmarks.xml - input file to training code, created from annotation files for 70-points to train model
+ testing_with_face_landmarks.xml - input file to training code, created from annotation files for 70-points to test model

```
facial_landmark_data
├── image_names.txt
│
├── 33_point_landmarks.jpg
├── 70_point_landmarks.jpg
│
├── 33_points
│   ├── shape_predictor_33_face_landmarks.dat
│   ├── testing_with_face_landmarks.xml
│   └── training_with_face_landmarks.xml
│
├── 70_points
│   ├── shape_predictor_70_face_landmarks.dat
│   ├── testing_with_face_landmarks.xml
│   └── training_with_face_landmarks.xml
│
```

`datasets` directory has 4 sub directories:

+ afw
+ helen
+ ibug
+ lfpw

Within each of these 4 sub directories, we have:

+ an image file e.g. afw/111076519_1.jpg
+ annotation file containing rectangle corresponding to face in image e.g. afw/111076519_1_rect.txt
+ 2 annotation files (33-points & 70-points) containing facial landmarks corresponding to same face of image which is specified in _rect.txt e.g. afw/111076519_1_bv33.txt and afw/111076519_1_bv70.txt
+ mirrored image of original image e.g. afw/111076519_1_mirror.jpg
+ annotation file containing rectangle corresponding to face in mirrored image e.g. afw/111076519_1_mirror_rect.txt
+ 2 annotation files for facial landmarks corresponding to the same face in the mirrored image, which is specified in _mirror_rect.txt e.g. afw/111076519_1_mirror_bv33.txt & afw/111076519_1_mirror_bv70.txt

An important point to note here is that each _rect.txt, _bv33.txt or _bv70.txt file contains the annotation for only 1 face. If an image has more than one face, we create separate image and annotation files and suffix them with a number. For example, 111076519 has 2 faces, so we have 111076519_1.jpg and 111076519_2.jpg along with the other files (annotation files, mirrored image and annotation files corresponding to the mirrored image). If you look at 111076519_1.jpg and 111076519_2.jpg, both images are the same.

```
│
├── datasets
│   ├── afw
│   │   ├── 111076519_1.jpg
│   │   ├── 111076519_1_rect.txt
│   │   ├── 111076519_1_bv33.txt
│   │   ├── 111076519_1_bv70.txt
│   │   ├── 111076519_1_mirror.jpg
│   │   ├── 111076519_1_mirror_rect.txt
│   │   ├── 111076519_1_mirror_bv33.txt
│   │   ├── 111076519_1_mirror_bv70.txt
│   │   ├── 111076519_2.jpg
│   │   ├── 111076519_2_rect.txt
│   │   ├── 111076519_2_bv33.txt
│   │   ├── 111076519_2_bv70.txt
│   │   ├── 111076519_2_mirror.jpg
│   │   ├── 111076519_2_mirror_rect.txt
│   │   ├── 111076519_2_mirror_bv33.txt
│   │   ├── 111076519_2_mirror_bv70.txt
│   ├── helen
│   │   ├── testset
│   │   │   ├── 2978322154_1.jpg
│   │   │   ├── 2978322154_1_rect.txt
│   │   │   ├── 2978322154_1_bv33.txt
│   │   │   ├── 2978322154_1_bv70.txt
│   │   │   ├── 2978322154_1_mirror.jpg
│   │   │   ├── 2978322154_1_mirror_rect.txt
│   │   │   ├── 2978322154_1_mirror_bv33.txt
│   │   │   ├── 2978322154_1_mirror_bv70.txt
│   │   └── trainset
│   │       ├── 100040721_1.jpg
│   │       ├── 100040721_1_rect.txt
│   │       ├── 100040721_1_bv33.txt
│   │       ├── 100040721_1_bv70.txt
│   │       ├── 100040721_1_mirror.jpg
│   │       ├── 100040721_1_mirror_rect.txt
│   │       ├── 100040721_1_mirror_bv33.txt
│   │       ├── 100040721_1_mirror_bv70.txt
│   ├── ibug
│   │   ├── image_014_01.jpg
│   │   ├── image_014_01_rect.txt
│   │   ├── image_014_01_bv33.txt
│   │   ├── image_014_01_bv70.txt
│   │   ├── image_014_01_mirror.jpg
│   │   ├── image_014_01_mirror_rect.txt
│   │   ├── image_014_01_mirror_bv33.txt
│   │   ├── image_014_01_mirror_bv70.txt
│   └── lfpw
│       ├── testset
│       │   ├── image_0001.png
│       │   ├── image_0001_rect.txt
│       │   ├── image_0001_bv33.txt
│       │   ├── image_0001_bv70.txt
│       │   ├── image_0001_mirror.jpg
│       │   ├── image_0001_mirror_rect.txt
│       │   ├── image_0001_mirror_bv33.txt
│       │   ├── image_0001_mirror_bv70.txt
│       └── trainset
│           ├── image_0001.png
│           ├── image_0001_rect.txt
│           ├── image_0001_bv33.txt
│           ├── image_0001_bv70.txt
│           ├── image_0001_mirror.jpg
│           ├── image_0001_mirror_rect.txt
│           ├── image_0001_mirror_bv33.txt
│           ├── image_0001_mirror_bv70.txt

11 directories, 30708 files
```
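The naming convention above can be sketched in a few lines of Python. This is only an illustration: `annotation_paths` is a hypothetical helper that is not part of the course scripts, and it assumes image paths look like the entries in image_names.txt.

```python
import os

def annotation_paths(image_path, num_points):
    """Return (rect_path, points_path) for an image path.

    Hypothetical helper: it only builds file names following the
    <image>_rect.txt and <image>_bv<N>.txt convention described above.
    """
    # drop the image extension (.jpg or .png)
    stem = os.path.splitext(image_path)[0]
    rect_path = stem + '_rect.txt'
    points_path = '{}_bv{}.txt'.format(stem, num_points)
    return rect_path, points_path

# Example with a mirrored image from the afw folder
rect_path, points_path = annotation_paths(
    'datasets/afw/111076519_1_mirror.jpg', 70)
print(rect_path)
print(points_path)
```

Because the extension is stripped with `os.path.splitext`, the same helper works for both the .jpg and the .png images in the dataset.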

## <font style = "color:rgb(50,120,229)">Visualize Annotations</font>

There are a few occasions when you would want to visualize annotations as a sanity check before starting training. For example, if you add new data to this dataset, you should make sure the ordering of points in the new data is the same as in the old dataset. You should also visualize the annotations if you decide to build a smaller model with fewer points for a mobile application. 

To visualize annotations, we will run script **drawRectLandmarks.py**

This script randomly picks 50 files from datasets directory, draws rectangle around face using **file _rect.txt** and prints facial landmark number using **file _bv70.txt**.

### <font style = "color:rgb(8,133,37)">Python [Visual Annotations] [drawRectLandmarks.py]</font>

Run this script by providing 2 arguments:

+ path to facial_landmark_data directory
+ number of facial landmark points (70 or 33)

`python drawRectLandmarks.py path_to_facial_landmark_data 70`

In [None]:
import os
import sys
import cv2
import random

Functions to create directory, draw rectangle for face bounding box and circle + part number for facial landmarks.

In [None]:
# create a directory if it doesn't exist
def create_dir(folder):
  try:
    os.makedirs(folder)
  except FileExistsError:
    print('{} already exists.'.format(folder))

# draw rectangle on image
def drawRectangle(im, bbox):
  x1, y1, x2, y2 = bbox
  cv2.rectangle(im, (x1, y1), (x2, y2), (0, 255, 255), 
                  thickness=5, lineType=cv2.LINE_8)

# draw landmarks on image
def drawLandmarks(im, parts):
  for i, part in enumerate(parts):
    px, py = part
    # draw circle at each landmark
    cv2.circle(im, (px, py), 1, (0, 0, 255), thickness=2, 
                lineType=cv2.LINE_AA)
    # write landmark number at each landmark
    cv2.putText(im, str(i+1), (px, py), cv2.FONT_HERSHEY_SIMPLEX, 
                    1, (0, 200, 100), 4)

Because OpenCV's font scale is fairly large, we will scale up the image so that the numbers for facial landmarks do not overlap each other. Since drawing face rectangles and landmarks over the full dataset would take a lot of time, we will visualize a randomly selected sample of the data.

Read path to facial_landmark_data directory and number of facial landmark points from arguments.

In [None]:
# define scale so that points can be printed well
scale = 4
# we will draw facial rectangles and landmarks on
# a randomly sampled small subset of all images
numSamples = 50

# facial landmark data directory
fldDir = sys.argv[1]
# number of facial landmarks; pass 70 or 33
numPoints = sys.argv[2]

Prepare directories to store output images

In [None]:
# Prepare output dirs
# As we know, we have a mirrored image corresponding
# to each image. We will store results for mirrored and original
# images in separate directories. This is not
# important for you because we already have annotation
# files for mirrored images.
#
# This step was crucial, when we got eye-centers annotated
# by data team. Data team only annotated eye-centers for
# original images. Then we generated annotation files
# for mirrored images from annotation files of original images
outputDir = os.path.join(fldDir, 'output')
outputMirrorDir = os.path.join(outputDir, 'mirror')
outputOriginalDir = os.path.join(outputDir, 'original')
create_dir(outputMirrorDir)
create_dir(outputOriginalDir)

Read image names from image_names.txt file and sample numSamples items from full image names list. We will visualize annotations on this sampled data.

In [None]:
# Path to image_names file
imageNamesFilepath = os.path.join(fldDir, 'image_names.txt')

# Check whether path to image_names file exists
# within facial_landmark_data directory
if os.path.exists(imageNamesFilepath):
  # If image_names.txt exists, read it
  with open(imageNamesFilepath) as d:
    imageNames = [x.strip() for x in d.readlines()]
else:
  print('Pass path to facial_landmark_data as argument to this script')
  # stop here: the remaining steps need the list of image names
  sys.exit(1)

# Randomly shuffle list containing image names
random.shuffle(imageNames)
# select numSamples image names from list
imageNamesSampled = imageNames[:numSamples]

Iterate over all images. Read and upscale image.

In [None]:
# Iterate over image names
for k, imageName in enumerate(imageNamesSampled):
  print("Processing file: {}".format(imageName))

  # create image path
  imagePath = os.path.join(fldDir, imageName)
  # read image
  im = cv2.imread(imagePath, cv2.IMREAD_COLOR)
  # scale up image
  im = cv2.resize(im, (0, 0), fx=scale, fy=scale)

Read rectangle file corresponding to image

In [None]:
  # create path to face rectangle file
  rectPath = os.path.splitext(imagePath)[0] + '_rect.txt'

  # open rectangle file and read
  with open(rectPath) as f:
    line = f.readline()
    # read annotations
    left, top, width, height = [float(n) for n in line.strip().split()]
    # calculate coordinates of bottom right corner of rectangle
    right = left + width
    bottom = top + height
    # scale up face rectangle coordinates
    x1, y1 = int(scale*left), int(scale*top)
    x2, y2 = int(scale*right), int(scale*bottom)
    # save coordinates to a list. this is also called bounding box
    # it is a term to denote coordinates of an object
    bbox = [x1, y1, x2, y2]

Read facial landmark points file corresponding to image

In [None]:
  # open facial landmarks file and read coordinates
  pointsPath = (os.path.splitext(imagePath)[0] + '_bv' +
                  str(numPoints) + '.txt')
  parts = []
  # open points file and read
  with open(pointsPath) as g:
    # read lines. each line has coordinates of a landmark point
    lines = [x.strip() for x in g.readlines()]
    # iterate over all lines
    for line in lines:
      # each line has two numbers (x, y of each landmark)
      x, y = [float(n) for n in line.split()]
      # scale up landmark coordinates
      px, py = int(scale*x), int(scale*y)
      parts.append([px, py])

Finally draw rectangle and facial landmarks points on image and write it to disk.

In [None]:
  # draw face rectangle and landmarks
  drawRectangle(im, bbox)
  drawLandmarks(im, parts)

  # basename is filename in a filepath
  imageBasename = os.path.basename(imagePath)
  # if basename has mirror, output image will be stored
  # in mirror output directory
  # else in original output directory
  if 'mirror' in imageBasename:
    outputImagePath = os.path.join(outputMirrorDir, imageBasename)
  else:
    outputImagePath = os.path.join(outputOriginalDir, imageBasename)

  # save output image
  cv2.imwrite(outputImagePath, im)
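The parsing and scaling steps of the script can also be written as small functions that are easy to check without OpenCV. The sketch below assumes the same file layout the script reads (one `left top width height` line in _rect.txt and one `x y` pair per line in the points file). `rect_to_bbox` and `parse_points` are hypothetical names, and the sample numbers are made up or borrowed from the XML example in this module.

```python
def rect_to_bbox(line, scale=1):
    """Convert a 'left top width height' line to [x1, y1, x2, y2]."""
    left, top, width, height = [float(n) for n in line.strip().split()]
    # bottom right corner of the rectangle
    right = left + width
    bottom = top + height
    return [int(scale * left), int(scale * top),
            int(scale * right), int(scale * bottom)]

def parse_points(lines, scale=1):
    """Convert 'x y' lines to a list of scaled integer [px, py] pairs."""
    points = []
    for line in lines:
        # skip empty lines, e.g. a trailing newline at the end of the file
        if not line.strip():
            continue
        x, y = [float(n) for n in line.split()]
        points.append([int(scale * x), int(scale * y)])
    return points

# Illustrative values only
bbox = rect_to_bbox('567 481 771 772', scale=4)
points = parse_points(['10.5 20.25\n', '30 40\n'], scale=4)
print(bbox)
print(points)
```

Keeping the parsing separate from the drawing makes it simple to verify that the rectangle and the landmarks are scaled by the same factor.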

## <font style = "color:rgb(50,120,229)">Train-Test Data Preparation</font>

Dlib’s shape predictor takes XML files in a specific format as input to train and test the landmark detector. To generate these train and test xml files from our data, we will run script **createTrainTestXml.py**

This script will generate two xml files named "<font style="color:rgb(8,133,37)">training_with_face_landmarks.xml</font>" and “<font style="color:rgb(8,133,37)">testing_with_face_landmarks.xml</font>” in facial_landmark_data folder.

### <font style = "color:rgb(8,133,37)">Python [Create train and test XML files] [createTrainTestXml.py]</font>

Run this script by providing 2 arguments,

+ path to facial_landmark_data directory
+ number of facial landmark points (70 or 33)

`python createTrainTestXml.py path_to_facial_landmark_data 70`

In [None]:
import sys
import os
import random
try:
  from lxml import etree as ET
except ImportError:
  print('install lxml using pip')
  print('pip install lxml')
  # the rest of the script cannot run without lxml
  sys.exit(1)

We will create an XML document and store annotations for face rectangle and facial landmarks in this document. Structure of this XML is as follows:

```
<dataset>
  <name>Training faces</name>
  <images>
    <image file="datasets/helen/testset/3048914345_1.jpg">
      <box height="772" left="567" top="481" width="771">
        <part name="00" x="744" y="809"/>
        <part name="01" x="739" y="870"/>
        <part name="02" x="735" y="941"/>
        <part name="03" x="746" y="1018"/>
        <part name="04" x="763" y="1092"/>
        ...
        Similarly for all 70 landmark points
      </box>
    </image>
    <image file="datasets/afw/281972218_1.jpg">
      <box height="259" left="878" top="247" width="259">
    ...
    Similarly for rest of the images
  </images>
</dataset>
```
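As a rough illustration of this structure, the sketch below builds the same kind of document for a single image with only two landmarks. It uses Python's built-in `xml.etree.ElementTree` instead of lxml so that it runs without extra packages. `build_dataset` is a hypothetical helper, and the sample values are taken from the listing above.

```python
import xml.etree.ElementTree as ET

def build_dataset(entries):
    """Build a <dataset> element.

    entries: list of (image_file, (left, top, width, height), [(x, y), ...])
    """
    dataset = ET.Element('dataset')
    ET.SubElement(dataset, 'name').text = 'Training faces'
    images = ET.SubElement(dataset, 'images')
    for image_file, rect, points in entries:
        image = ET.SubElement(images, 'image', file=image_file)
        # attribute values must be strings
        left, top, width, height = [str(int(v)) for v in rect]
        box = ET.SubElement(image, 'box', top=top, left=left,
                            width=width, height=height)
        for i, (x, y) in enumerate(points):
            # part names are zero-padded landmark numbers starting from 00
            ET.SubElement(box, 'part', name=str(i).zfill(2),
                          x=str(int(x)), y=str(int(y)))
    return dataset

# One image with two landmarks, for brevity
sample = [('datasets/helen/testset/3048914345_1.jpg',
           (567, 481, 771, 772),
           [(744, 809), (739, 870)])]
root = build_dataset(sample)
print(ET.tostring(root, encoding='unicode'))
```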
Create root node "dataset" and child nodes "name" & "images"

In [None]:
# create XML from annotations
def createXml(imageNames, xmlName, numPoints):
  # create a root node named dataset
  dataset = ET.Element('dataset')
  # create a child node "name" within root node "dataset"
  ET.SubElement(dataset, "name").text = "Training Faces"
  # create another child node "images" within root node "dataset"
  images = ET.SubElement(dataset, "images")

  # print information about xml filename and total files
  numFiles = len(imageNames)
  print('{0} : {1} files'.format(xmlName, numFiles))
  # iterate over all files
  for k, imageName in enumerate(imageNames):
    # print progress about files being read
    print('{}:{} - {}'.format(k+1, numFiles, imageName))

Read rectangle coordinates and create "image" and "box" nodes.

In [None]:
    # read rectangle file corresponding to image
    rect_name = os.path.splitext(imageName)[0] + '_rect.txt'
    with open(os.path.join(fldDatadir, rect_name), 'r') as file:
      rect = file.readline()
    rect = rect.split()
    left, top, width, height = rect[0:4]

    # create a child node "image" within node "images"
    # this node will have annotation data for an image
    image = ET.SubElement(images, "image", file=imageName)
    # create a child node "box" within node "image"
    # this node has values for bounding box or rectangle of face
    box = ET.SubElement(image, 'box', top=top, left=left, 
                        width=width, height=height)

Read facial landmark points, create "part" nodes and store the landmark points.

In [None]:
    # read points file corresponding to image
    points_name = (os.path.splitext(imageName)[0] + '_bv' +
                   numPoints + '.txt')
    with open(os.path.join(fldDatadir, points_name), 'r') as file:
      for i, point in enumerate(file):
        x, y = point.split()
        # points annotation file has coordinates in float
        # but we want them to be in int format
        x = str(int(float(x)))
        y = str(int(float(y)))
        # name is the facial landmark or point number, starting from 0
        name = str(i).zfill(2)
        # create a child node "part" within node "box"
        # this node has the coordinates of one facial landmark
        ET.SubElement(box, 'part', name=name, x=x, y=y)

Create XML tree and write it to disk

In [None]:
  # finally create an XML tree
  tree = ET.ElementTree(dataset)

  print('writing on disk: {}'.format(xmlName))
  # write XML file to disk. pretty_print=True indents the XML 
  # to enhance readability
  tree.write(xmlName, pretty_print=True, xml_declaration=True, 
              encoding="UTF-8")

Read image paths (which are relative to facial landmark data directory)

In [None]:
if __name__ == '__main__':

  # read value to facial_landmark_data directory
  # and number of facial landmarks
  fldDatadir = sys.argv[1]
  numPoints = sys.argv[2]

  # Read names of all images
  with open(os.path.join(fldDatadir, 'image_names.txt')) as d:
    imageNames = [x.strip() for x in d.readlines()]

This part is relevant if you don’t have sufficient RAM on your machine to train model on whole training data.

In [None]:
  ################# trick to use less data #################
  # If you are unable to train all images on your machine,
  # you can reduce training data by randomly sampling n 
  # images from the total list.
  # Keep decreasing the value of n from len(imageNames) to
  # a value which works on your machine.
  # Uncomment the next two lines to decrease training data
  # n = 1000
  # imageNames = random.sample(imageNames, n)
  ##########################################################

We will split train and test data in a 95:5 ratio. Generate and save training and test XML files. These XML files will be stored in facial_landmark_data folder


In [None]:
  totalNumFiles = len(imageNames)
  # We will split data into 95:5 for train and test
  numTestFiles = int(0.05 * totalNumFiles)

  # randomly sample 5% items from list of image names
  testFiles = random.sample(imageNames, numTestFiles)
  # assign rest of image names as train
  trainFiles = list(set(imageNames) - set(testFiles))

  # generate XML files for train and test data
  createXml(trainFiles, os.path.join(fldDatadir,
              'training_with_face_landmarks.xml'), numPoints)
  createXml(testFiles, os.path.join(fldDatadir, 
              'testing_with_face_landmarks.xml'), numPoints)
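If you want the split to be repeatable between runs, one option is to seed the random number generator and keep the original order of the image names. The sketch below is an alternative to the set difference used above, not the course script itself. `split_train_test` and its `seed` argument are hypothetical, and it assumes the image names are unique.

```python
import random

def split_train_test(names, test_fraction=0.05, seed=0):
    """Split names into (train, test) lists, reproducibly for a given seed."""
    # a private generator avoids changing the global random state
    rng = random.Random(seed)
    num_test = int(test_fraction * len(names))
    test_set = set(rng.sample(names, num_test))
    # list comprehensions preserve the order of the input list
    train = [n for n in names if n not in test_set]
    test = [n for n in names if n in test_set]
    return train, test

# Illustrative file names only
names = ['img_{}.jpg'.format(i) for i in range(100)]
train, test = split_train_test(names)
print(len(train), len(test))
```

With the same seed and the same input list, the function returns the same split every time, which makes training and test errors comparable across experiments.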

## <font style = "color:rgb(50,120,229)">Training</font>

We will train shape predictor on landmark data either using <font style="color:rgb(8,133,37)">C++ code **trainFLD.cpp**</font> or <font style="color:rgb(8,133,37)">Python code **trainFLD.py.**</font>

Both programs look for <font style="color:rgb(8,133,37)">training_with_face_landmarks.xml</font> and <font style="color:rgb(8,133,37)">testing_with_face_landmarks.xml</font> files in the facial_landmark_data directory. Once you run the training code, it will tell you the approximate time required to train the landmark detector. 

After the training is complete, the facial landmark model is saved to disk and the model is tested on the training and the test data to calculate training and test errors.

We discussed in the theory section that this landmark detection algorithm is based on V. Kazemi’s paper "One Millisecond Face Alignment with an Ensemble of Regression Trees". This algorithm has a few parameters that affect model size and accuracy. 

Let’s go through those parameters. The value specified in brackets is the default used in Dlib’s shape_predictor_trainer class. Some of the text below is copied from Dlib’s documentation.

>**cascade_depth** (10): &nbsp; Number of cascades created when you train a model. This parameter corresponds to the parameter T in the Kazemi paper.
>
> **num_trees_per_cascade_level** (500): &nbsp; Number of trees created for each cascade. This means that the total number of trees in the learned model is equal to cascade_depth x num_trees_per_cascade_level. This parameter corresponds to the parameter K in the Kazemi paper. 
>
> **tree_depth** (4): &nbsp; Depth of the trees used in the cascade. In particular, there are pow(2,get_tree_depth()) leaves in each tree. This parameter corresponds to the parameter F in the Kazemi paper. 
> 
>**nu** (0.1): &nbsp;nu is the regularization parameter. Larger values of this parameter will cause the algorithm to fit the training data better but may also cause overfitting. This parameter corresponds to the parameter ν in the Kazemi paper. 
>
> **oversampling_amount** (20): &nbsp; We can reduce the capacity of the model by explicitly increasing the regularization (making nu smaller) and by using trees with smaller depths. We can effectively increase the amount of training data by adding in each training example multiple times but with a randomly selected deformation applied to it. That is what this parameter controls. i.e. if you supply n training samples to train then the algorithm runs internally with n x oversampling_amount training samples. The bigger this parameter the better (excepting that larger values make training take longer). In terms of the Kazemi paper, this parameter is the number of randomly selected initial starting points sampled for each training example. This parameter corresponds to the parameter R in the Kazemi paper. 
>
> **feature_pool_size** (400): &nbsp; At each level of the cascade we randomly sample feature_pool_size pixels from the image.  These pixels are used to generate features for the random trees.  So in general larger settings of this parameter give better accuracy but make the algorithm run slower. This parameter corresponds to the parameter P in the Kazemi paper. 
>
> **feature_pool_region_padding** (0): &nbsp;When we randomly sample the pixels for the feature pool we do so in a box fit around the provided training landmarks. By default, feature_pool_region_padding=0 and the box is the tightest box that contains the landmarks. However, we can expand or shrink the size of the pixel sampling region by setting a different value of feature_pool_region_padding. To explain this precisely, for a padding of 0 we say that the pixels are sampled from a box of size 1x1.  The padding value is added to each side of the box.  So a padding of 0.5 would cause the algorithm to sample pixels from a box that was 2x2, effectively multiplying the area pixels are sampled from by 4.  Similarly, setting the padding to -0.2 would cause it to sample from a box 0.6x0.6 in size.
> 
>
> **lambda_param** (0.1): &nbsp;To decide how to split nodes in the regression trees the algorithm looks at the intensity difference between pairs of pixels in the image. These pixel pairs are randomly sampled but with a preference for selecting pixels that are closer to each other. lambda_param controls this "nearness" preference.  In particular, smaller values of lambda_param will make the algorithm prefer pixels close together and larger values of lambda_param will make it care less about picking nearby pixel pairs. lambda_param should be a number between 0 and 1. 
> 
>
> **num_test_splits** (20): &nbsp;When generating the random trees we randomly sample num_test_splits possible split features at each node and pick the one that gives the best split. Larger values of this parameter will usually give more accurate outputs but take longer to train. This parameter corresponds to the parameter S in the Kazemi paper. 
>  
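The arithmetic in the parameter descriptions can be checked with a few lines of Python. This is only a sketch: `model_summary` and `sampling_box_side` are hypothetical helper names and not part of Dlib, and the 1000 training samples are an assumed figure chosen for illustration.

```python
def model_summary(cascade_depth=10, num_trees_per_cascade_level=500,
                  tree_depth=4, oversampling_amount=20,
                  num_training_samples=1000):
    """Derive the quantities mentioned in the parameter descriptions."""
    # total trees = cascade_depth x num_trees_per_cascade_level
    total_trees = cascade_depth * num_trees_per_cascade_level
    # each tree has pow(2, tree_depth) leaves
    leaves_per_tree = 2 ** tree_depth
    return {
        'total_trees': total_trees,
        'leaves_per_tree': leaves_per_tree,
        'total_leaves': total_trees * leaves_per_tree,
        # n samples are used internally as n x oversampling_amount samples
        'internal_training_samples':
            num_training_samples * oversampling_amount,
    }

def sampling_box_side(feature_pool_region_padding=0.0):
    """Side of the pixel sampling box, relative to a nominal 1x1 box."""
    # the padding is added to each side of the box
    return 1.0 + 2.0 * feature_pool_region_padding

summary = model_summary()
print(summary)
print(sampling_box_side(0.5), sampling_box_side(-0.2))
```

With the default values this gives 5000 trees of 16 leaves each, and a padding of 0.5 doubles the side of the sampling box, which matches the 2x2 example in the description of feature_pool_region_padding.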

### <font style = "color:rgb(8,133,37)">Python [Train Facial Landmark Detector] [trainFLD.py]</font>

We can start training using following command:

`python trainFLD.py path_to_facial_landmark_data 70`

In [None]:
import os
import sys
import dlib

Read facial_landmark_data and number of facial landmark points from command arguments

In [None]:
fldDatadir = sys.argv[1]
numPoints = sys.argv[2]
modelName = 'shape_predictor_' + numPoints + '_face_landmarks.dat'

In [None]:
# Set parameters of shape_predictor_trainer
options = dlib.shape_predictor_training_options()
options.cascade_depth = 10
options.num_trees_per_cascade_level = 500
options.tree_depth = 4
options.nu = 0.1
options.oversampling_amount = 20
options.feature_pool_size = 400
options.feature_pool_region_padding = 0
options.lambda_param = 0.1
options.num_test_splits = 20

# Tell the trainer to print status messages to the console so we can
# see training options and how long the training will take.
options.be_verbose = True

In [None]:
# Check if train and test XML files are present in facial_landmark_data folder
trainingXmlPath = os.path.join(fldDatadir, 
                                "training_with_face_landmarks.xml")
testingXmlPath = os.path.join(fldDatadir, 
                                "testing_with_face_landmarks.xml")
outputModelPath = os.path.join(fldDatadir, modelName)

# check whether path to XML files is correct
if os.path.exists(trainingXmlPath) and os.path.exists(testingXmlPath):
  # Train and test the model
  # dlib.train_shape_predictor() does the actual training. 
  # It will save the final predictor to outputModelPath.
  # The input is an XML file that lists the images in 
  # the training dataset and 
  # also contains the positions of the face parts.
  dlib.train_shape_predictor(trainingXmlPath, outputModelPath, options)

  # Now that we have a model we can test it.  
  # dlib.test_shape_predictor() measures the average distance 
  # between a face landmark output by the shape_predictor and 
  # ground truth data.

  print("\nTraining error: {}".format(
    dlib.test_shape_predictor(trainingXmlPath, outputModelPath)))

  # The real test is to see how well it does 
  # on data it wasn't trained on.
  print("Testing error: {}".format(
    dlib.test_shape_predictor(testingXmlPath, outputModelPath)))
# Print an error message if XML files are not 
# present in facial_landmark_data folder
else:
  print('training and test XML files not found.')
  print('Please check paths:')
  print('train: {}'.format(trainingXmlPath))
  print('test: {}'.format(testingXmlPath))  

# <font style = "color:rgb(50,120,229)">Train 33-points Facial Landmarks</font>

The 70-points facial landmark model is large and not appropriate for use in mobile applications, so we will create a smaller model. There is a tradeoff between model size and accuracy, and it is important to pick the right subset of points out of the 70. After multiple experiments, we settled on a subset of 33 points that gives a smaller model without compromising much accuracy.

<center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w3-m6-33points.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w3-m6-33points.jpg" width=500/></a></center>

## <font style = "color:rgb(50,120,229)">Data</font>

We will read _bv70.txt files as input, pick these 33 landmarks and save them as _bv33.txt for each image. Then use createTrainTestXml.py script to generate training_with_face_landmarks.xml and testing_with_face_landmarks.xml files.

### <font style = "color:rgb(8,133,37)">Python [Generate Landmark Annotation files for 33 points] [generate33Points.py]</font>

This script should be run as:

`python generate33Points.py path_to_facial_landmark_data`

facial_landmark_data folder should have files image_names.txt and a folder named "datasets"

In [None]:
import os
import sys

Read file image_names.txt. This file has path to all image files in "datasets" directory. We will use this to find/read/write annotation files corresponding to each image.

In [None]:
# facial landmark data directory
fldDir = sys.argv[1]

# Path to image_names file
imageNamesFilepath = os.path.join(fldDir, 'image_names.txt')

# Check whether path to image_names file exists within 
# facial_landmark_data directory
if os.path.exists(imageNamesFilepath):
  # If image_names.txt exists, read it
  with open(imageNamesFilepath) as d:
    imageNames = [x.strip() for x in d.readlines()]
else:
  print('Pass path to facial_landmark_data as argument to this script')

In [None]:
# Write down indices of all 33 points w.r.t. 70 points
# Out of 70 points, we have to pick 33 points.
# Here we are writing indices of those 33 points.
# IMPORTANT: Numbers shown on image are natural numbers. They
# start from 1 whereas indices in Python start from 0.
# So to match these indices with facial landmark numbers on
# sample image, you should add 1.
points33Indices = [
                   1, 3, 5, 8, 11, 13, 15,     # Jaw line
                   17, 19, 21,                 # Left eyebrow
                   22, 24, 26,                 # Right eyebrow
                   30, 31,                     # Nose bridge
                   33, 35,                     # Lower nose
                   36, 37, 38, 39, 40, 41,     # Left eye
                   42, 43, 44, 45, 46, 47,     # Right Eye
                   48, 51, 54, 57              # Outer lip
                  ]

Iterate over all images. For each image, create path to 70-points annotation file(for reading) and 33-points annotation file(for writing)

In [None]:
numImages = len(imageNames)

# Iterate over all image names
for n, imageName in enumerate(imageNames):
  # Just pretty printing the progress
  print('{}/{} - {}'.format(n+1, numImages, imageName))

  # path to image
  imagePath = os.path.join(fldDir, imageName)
  # The points annotation file ends with the suffix _bv70.txt
  # whereas the image ends with an extension such as .jpg. So we create
  # the path to the 70-points annotation file by replacing the image
  # extension with _bv70.txt.
  points70Path = os.path.splitext(imagePath)[0] + '_bv70.txt'
  # Similarly for 33-points annotation files.
  points33Path = os.path.splitext(imagePath)[0] + '_bv33.txt'

Read 70-points annotation file, select 33 points using indices that we picked earlier and save these points to a 33-points annotation file.


In [None]:
  # Check if path to annotation file exists
  if os.path.exists(points70Path):
    # open file
    with open(points70Path, 'r') as f:
      # read all lines
      points70 = f.readlines()
      # select lines whose indices are in our points33Indices list
      points33 = [points70[i] for i in points33Indices]
      # open points33 file
      with open(points33Path, 'w') as g:
        # write 33 points to file
        g.writelines(points33)

  else:
    print('Unable to find path:{}'.format(points70Path))
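The selection step can be tried on dummy data before running it over the whole dataset. The sketch below reuses the index list from the script. `select_points` is a hypothetical helper, and the 70 dummy lines stand in for the contents of a _bv70.txt file.

```python
# 0-based indices of the 33 points within the 70 points (from the script)
points33Indices = [
    1, 3, 5, 8, 11, 13, 15,     # Jaw line
    17, 19, 21,                 # Left eyebrow
    22, 24, 26,                 # Right eyebrow
    30, 31,                     # Nose bridge
    33, 35,                     # Lower nose
    36, 37, 38, 39, 40, 41,     # Left eye
    42, 43, 44, 45, 46, 47,     # Right eye
    48, 51, 54, 57              # Outer lip
]

def select_points(lines, indices):
    """Return the lines at the given 0-based indices, in that order."""
    if indices and max(indices) >= len(lines):
        raise ValueError('annotation has only {} lines'.format(len(lines)))
    return [lines[i] for i in indices]

# Dummy annotation: line i holds the coordinates "i i"
points70 = ['{0} {0}\n'.format(i) for i in range(70)]
points33 = select_points(points70, points33Indices)
print(len(points33))
```

The explicit length check turns a truncated annotation file into a clear error message instead of an IndexError in the middle of the loop.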

## <font style = "color:rgb(50,120,229)">Visualize Annotations</font>

We have generated 33-points annotation files from 70-points files. It is really important to check whether these new annotation files are correct. We will visualize 33-points facial landmarks using script drawRectLandmarks.py. When we pass 33 as argument, script reads _bv33.txt files to load facial landmarks annotations.

### <font style = "color:rgb(8,133,37)">Python [Visual Annotations] [drawRectLandmarks.py]</font>

Run drawRectLandmarks.py for 33-points:

`python drawRectLandmarks.py path_to_facial_landmark_data 33`

## <font style = "color:rgb(50,120,229)">Train-Test Data Preparation</font>

To generate training and test XML files, we will run script createTrainTestXml.py. When we pass 33 as argument, the script reads _bv33.txt files to load facial landmarks annotations. This script will write <font style="color:rgb(8,133,37)">training_with_face_landmarks.xml</font> and <font style="color:rgb(8,133,37)">testing_with_face_landmarks.xml</font> to facial_landmark_data folder.

### <font style = "color:rgb(8,133,37)">Python [Create train and test XML files] [createTrainTestXml.py]</font>

Run createTrainTestXml.py for 33-points,

`python createTrainTestXml.py path_to_facial_landmark_data 33`

## <font style = "color:rgb(50,120,229)">Training</font>

Training code looks for files <font style="color:rgb(8,133,37)">training_with_face_landmarks.xml</font> and <font style="color:rgb(8,133,37)">testing_with_face_landmarks.xml</font> in facial_landmark_data folder. These files don’t have distinct names, i.e. the names are the same for both 70-points and 33-points. By default facial_landmark_data folder has XML files for 70-points. Before you start training, make sure that you are using the correct files. The last argument (33 or 70) is only used to name the model file.
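One way to confirm which XML file is in place is to count the "part" nodes in each "box". The sketch below is an assumption-laden illustration rather than part of the course code: `parts_per_box` is a hypothetical helper, it uses Python's built-in `xml.etree.ElementTree`, and the tiny sample document has only two landmarks.

```python
import xml.etree.ElementTree as ET

def parts_per_box(root):
    """Return the sorted distinct numbers of <part> nodes found per <box>."""
    return sorted({len(box.findall('part')) for box in root.iter('box')})

# Tiny sample in the same layout as the training XML (two landmarks only)
sample = """<dataset>
  <name>Training faces</name>
  <images>
    <image file="datasets/afw/281972218_1.jpg">
      <box height="259" left="878" top="247" width="259">
        <part name="00" x="744" y="809"/>
        <part name="01" x="739" y="870"/>
      </box>
    </image>
  </images>
</dataset>"""

counts = parts_per_box(ET.fromstring(sample))
print(counts)

# For a file on disk, parse it first:
# counts = parts_per_box(ET.parse(path_to_xml).getroot())
```

For a correctly generated 33-points file the result should be `[33]`. Any other value, or more than one value, suggests that the XML files on disk belong to a different run.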

We can train for 33-points landmark model using Python code.

### <font style = "color:rgb(8,133,37)">Python [Train Facial Landmark Detector] [trainFLD.py]</font>

`python trainFLD.py path_to_facial_landmark_data 33`

# <font style = "color:rgb(50,120,229)">References and Further reading</font>

1. [http://www.csc.kth.se/~vahidk/face_ert.html](http://www.csc.kth.se/~vahidk/face_ert.html)

2. [https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/](https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/)

3. [http://dlib.net/ml.html#shape_predictor_trainer](http://dlib.net/ml.html#shape_predictor_trainer)