All in one papaer implementation
Switch branches/tags
Nothing to show
Clone or download
Pull request Compare This branch is 12 commits behind samimideksa:master.
Latest commit 1156aed Aug 8, 2018

README.md

All-In-One

All in one paper implementation

Getting started

All in one convolutional network for face analysis presents a multipurpose algorithm for simultaneous face detection, face alignment, pose estimation, gender recognition, smile detection, age estimation and face recognition using a single convolutional neural network(CNN).

Prerequisites

The project can be run by installing conda virtual environment with python=3.6 and installing dependencies using pip. Inside the projects directory run the following commands.

  • conda create -n <environment_name> python=3.6.
    After creating the environment use the following commands to install dependacies.
  • pip install keras
  • pip install tensorflow
  • pip install sklearn
  • pip install pandas
  • pip install opencv-python
  • pip install dlib

How to train the model

These datasets are used for training the network.

  • AFLW dataset provides a large-scale collection of annotated face images gathered from the web, exhibiting a large variety in appearance (e.g., pose, expression, ethnicity, age, gender) as well as general imaging and environmental conditions.
  • IMDB-WIKI dataset is the largest publicly available dataset of face images with gender and age labels for training.
  • CelebA dataset is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations.
  • Adience dataset attempts to capture all the variations in appearance, noise, pose, lighting and more, that can be expected of images taken without careful preparation or posing.
  • Extended Cohn-Kanade dataset (CK+), which is a public benchmark dataset for action unit and emotion recognition. The CK+ comprises a total of 593 sequences across 123 subjects. Sequences range from neutral to peak expression.
  • The Yale Face Database (size 6.4MB) contains 165 grayscale images in GIF format of 15 individuals. There are 11 images per subject, one per different facial expression or configuration: center-light, w/glasses, happy, left-light, w/no glasses, normal, right-light, sad, sleepy, surprised, and wink.

The network architecture can be found in this paper. The model is built with deep convolutional layers in keras and is found in nets/model.py.

Training

The model can be trained with age, gender,detection, visibility,pose, landmarks,identity, smile, and eye_glasses labels by using the following commands inside the project's directory.
The following code snippet is bash command to train the network in aflw dataset for face detection

python -m train --dataset aflw --images_path /path-to-dataset-images/ \
    --label detection --batch_size 100 --steps 500  --ol output-of-large-model --os output-of-small-model --epochs 10;

Options

  • --images_path - Path to dataset images
  • --dataset - Type of dataset to train the model. This could be imdb, wiki, celeba,yale,ck+,aflw. The layers that are going to be trained also depends on this choice.
  • --label - This option specifies which for which type of classification/prediction to train the model. The choices are age, gender,detection, visibility,pose, landmarks,identity, smile, and eye_glasses.
  • --epochs.
  • --batch_size.
  • --resume - To start training from previous checkpoint if available.
  • --steps - Steps per epoch.
  • --ol - Output filename to save large model(model with all layers)
  • --os - Output filename to save small model(model with layers trained with current training)
  • --load_model -
  • --freeze - If true freezes shared layers of the model

How to run demo

To do lists

Previous results recorded after training the model.

  • Gender estimation(~89% accuracy)
  • Face detection(~90% accuracy)
  • Smile detection(~91% accuracy)
  • Age prediction(~4% accuaracy)

Tasks remaining

  • Use CASIA and MORPH dataset for further training the model on age, detection and gender labels.
  • Implement pose estimation, Landmark detection and Face recognition.