## End-to-End Procedure

### Procedure Outline
1. Normalize the dataset
    - Detect faces among all the images. Reject images that have more than one face.
    - Crop the face from the image.
    - Align it.
2. Generate Train-Test Splits
    - Create folds.
3. Evaluate 
    - Generate embeddings from the splits
    - Train classifier on the embeddings
    - Test classifier on the embeddings
4. Tune classifier
    - Tune the classifier 
5. Save the model

### Imports

In [1]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

In [3]:
import os
import cv2
import pprint
import logging
import tqdm

import face_trigger

In [4]:
from face_trigger.model.deep.FaceRecognizer import FaceRecognizer
from face_trigger.process.post_process import FaceDetector, LandmarkDetector, FaceAlign
from face_trigger.utils.common import RepeatedTimer, clamp_rectangle
from face_trigger.utils.data import normalize_dataset, Dataset

In [5]:
unnormalized_dataset_path = "/media/ankurrc/new_volume/softura/facerec/datasets/standard_att"
dataset_path = "/media/ankurrc/new_volume/softura/facerec/att_norm"
split_path = "/media/ankurrc/new_volume/softura/facerec/att_split_path"

In [6]:
logging.basicConfig(level=logging.DEBUG)

### Normalize dataset
When normalizing the dataset, we assume that the original dataset has the following structure:
1. At the root level there is one directory per person (subject). Directory names may or may not be numeric.
2. Each directory contains the images of that person's face. File names may or may not be numeric.
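
To make the expected layout concrete, here is a minimal sketch that walks a dataset root and lists the image files found under each subject directory. It uses only the standard library; the helper name `list_subject_images` and the throwaway demo directory are illustrative assumptions and are not part of `face_trigger`.

```python
import os
import tempfile


def list_subject_images(root):
    """Map each subject directory under `root` to its sorted file names.

    Hypothetical helper for illustration; not a face_trigger function.
    """
    layout = {}
    for subject in sorted(os.listdir(root)):
        subject_dir = os.path.join(root, subject)
        if not os.path.isdir(subject_dir):
            continue  # stray files at the root level are ignored
        layout[subject] = sorted(
            name for name in os.listdir(subject_dir)
            if os.path.isfile(os.path.join(subject_dir, name))
        )
    return layout


# Build a tiny throwaway dataset with one numeric and one non-numeric subject.
demo_root = tempfile.mkdtemp()
for subject, files in {"1": ["1.png", "2.png"], "alice": ["a.png"]}.items():
    os.makedirs(os.path.join(demo_root, subject))
    for name in files:
        open(os.path.join(demo_root, subject, name), "w").close()

demo_layout = list_subject_images(demo_root)
print(demo_layout)
```

Running the same helper on the real `unnormalized_dataset_path` before normalization is a cheap way to spot empty subject directories or files sitting at the root level.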
 
The normalized images are assumed to be 256x256, since that is the size the DNN ingests.
During alignment, each eye is placed about 0.35 of the image width in from the nearest side edge.
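
As a worked example of the alignment numbers above, the sketch below computes the horizontal eye targets for a 256-pixel-wide output. It assumes the 0.35 fraction is measured from the left edge for one eye and from the right edge for the other; the function name `eye_x_targets` is hypothetical, and the vertical placement is left out because the text does not specify it.

```python
def eye_x_targets(width=256, eye_fraction=0.35):
    """Return the target x-coordinates (in pixels) of the two eyes.

    Assumption: one eye sits `eye_fraction` of the width in from the left
    edge and the other the same fraction in from the right edge.
    """
    first_eye_x = eye_fraction * width
    second_eye_x = (1.0 - eye_fraction) * width
    return first_eye_x, second_eye_x


first_eye_x, second_eye_x = eye_x_targets()
inter_eye_distance = second_eye_x - first_eye_x

print("eye targets (px): {:.1f}, {:.1f}".format(first_eye_x, second_eye_x))
print("inter-eye distance (px): {:.1f}".format(inter_eye_distance))
```

Under this assumption the eyes end up roughly 30% of the image width apart, which fixes the scale of every aligned face regardless of the original image size.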

In [7]:
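def normalize_the_dataset(unnormalized_dataset_path=None, dataset_path=None):
    # Thin wrapper: read the raw images from unnormalized_dataset_path and
    # write the normalized faces to dataset_path.
    return normalize_dataset(dataset_path=unnormalized_dataset_path, output_path=dataset_path)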

In [9]:
rejected_dirs = normalize_the_dataset(unnormalized_dataset_path=unnormalized_dataset_path, dataset_path=dataset_path)

100%|██████████| 40/40 [00:02<00:00, 16.35it/s]
INFO:face_trigger.utils.data:Normalized dataset created at /media/ankurrc/new_volume/softura/facerec/att_norm


Rejected directories:
{'33': ['4.png'], '35': ['2.png'], '37': ['4.png', '5.png']}
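
The rejected-images mapping printed above can be summarized with a few lines of plain Python. This is only a sketch: the dictionary is copied from the output shown here, and the variable names are illustrative.

```python
# Copied from the "Rejected directories" output above:
# subject directory -> list of image files that were rejected.
rejected = {'33': ['4.png'], '35': ['2.png'], '37': ['4.png', '5.png']}

# Number of rejected images per subject, and overall.
rejected_per_subject = {subject: len(files) for subject, files in rejected.items()}
total_rejected = sum(rejected_per_subject.values())

for subject in sorted(rejected_per_subject):
    print("subject {}: {} rejected".format(subject, rejected_per_subject[subject]))
print("total rejected images: {}".format(total_rejected))
```

Subjects that lose images this way have fewer samples available when the splits are generated, so it is worth checking these counts against the largest training-set size you plan to use.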


### Generate Splits

In [10]:
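def generate_splits(dataset_path=None, split_path=None):
    dataset = Dataset(dataset_path=dataset_path,
                      split_path=split_path)
    folds = 3  # number of folds generated for each training-set size
    training_samples = [2, 5, 8]  # training images per subject; folds are created for each value

    dataset.split(num_train_list=training_samples, folds=folds)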

In [11]:
generate_splits(dataset_path=dataset_path, split_path=split_path)

INFO:face_trigger.utils.data:Generating for 2 training samples per subject.
INFO:face_trigger.utils.data:Generating: Fold 1
INFO:face_trigger.utils.data:Creating directory: /media/ankurrc/new_volume/softura/facerec/att_split_path/2/1
INFO:face_trigger.utils.data:done.
INFO:face_trigger.utils.data:/media/ankurrc/new_volume/softura/facerec/att_split_path/2/1/train.csv
INFO:face_trigger.utils.data:Generating: Fold 2
INFO:face_trigger.utils.data:Creating directory: /media/ankurrc/new_volume/softura/facerec/att_split_path/2/2
INFO:face_trigger.utils.data:done.
INFO:face_trigger.utils.data:/media/ankurrc/new_volume/softura/facerec/att_split_path/2/2/train.csv
INFO:face_trigger.utils.data:Generating: Fold 3
INFO:face_trigger.utils.data:Creating directory: /media/ankurrc/new_volume/softura/facerec/att_split_path/2/3
INFO:face_trigger.utils.data:done.
INFO:face_trigger.utils.data:/media/ankurrc/new_volume/softura/facerec/att_split_path/2/3/train.csv
INFO:face_trigger.utils.data:We have 40 subje
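
To illustrate the idea behind the splits, here is a minimal sketch of per-subject train/test folds written with the standard library only. It is not the `face_trigger` implementation: `make_folds` and `split_csv_path` are hypothetical helpers, random shuffling is an assumed selection strategy, and the path layout `<split_path>/<num_train>/<fold>/train.csv` is inferred from the log lines above.

```python
import os
import random


def make_folds(images_by_subject, num_train, folds, seed=0):
    """Return a list of (train, test) pairs, one per fold.

    In each fold, `num_train` randomly chosen images per subject go to the
    training set and the remaining images go to the test set.
    """
    rng = random.Random(seed)
    result = []
    for _ in range(folds):
        train, test = [], []
        for subject in sorted(images_by_subject):
            images = list(images_by_subject[subject])
            rng.shuffle(images)
            train += [(subject, name) for name in images[:num_train]]
            test += [(subject, name) for name in images[num_train:]]
        result.append((train, test))
    return result


def split_csv_path(split_root, num_train, fold, name="train.csv"):
    """Build a path following the layout seen in the logs (assumed)."""
    return os.path.join(split_root, str(num_train), str(fold), name)


# Toy data: two subjects with four images each.
demo_images = {
    "s1": ["1.png", "2.png", "3.png", "4.png"],
    "s2": ["1.png", "2.png", "3.png", "4.png"],
}
demo_folds = make_folds(demo_images, num_train=2, folds=3)

for fold_number, (train, test) in enumerate(demo_folds, start=1):
    print(split_csv_path("split_root", 2, fold_number), len(train), len(test))
```

Seeding the random generator keeps the folds reproducible between runs, which matters when comparing classifiers trained on the same splits.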

### Train Classifier