Automatic Group Photography Enhancement: helps you get a perfect group photo much more easily

Automatic Group Photography Enhancement

Have you ever experienced this? After a group photograph is taken, we often find, to our disappointment, that some people are looking away, some have their eyes closed, and some are wearing a sad expression. Inspired by this paper, our project aims to synthesize a perfect group photograph automatically from a given set of group photos.

The system pipeline is as follows:

We built our system on top of Faster R-CNN, using a TensorFlow implementation.
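
As a rough illustration of the idea, the sketch below picks, for each person, the photo in which their face looks best. It is not the project's actual pipeline: the per-face scores, the `pick_best_faces` helper, and the equal weighting of the eye and smile scores are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical input: one dict per photo, mapping a person id to
# (probability that the eyes are open, probability of a smile).
PhotoScores = Dict[str, Tuple[float, float]]


def pick_best_faces(photos: List[PhotoScores]) -> Dict[str, int]:
    """Return, for each person, the index of the photo with the highest combined score."""
    best_index: Dict[str, int] = {}
    best_score: Dict[str, float] = {}
    for photo_idx, faces in enumerate(photos):
        for person, (eyes_open, smile) in faces.items():
            combined = eyes_open + smile  # assumed equal weighting
            if person not in best_score or combined > best_score[person]:
                best_score[person] = combined
                best_index[person] = photo_idx
    return best_index


photos = [
    {"alice": (0.9, 0.2), "bob": (0.9, 0.8)},
    {"alice": (0.8, 0.9), "bob": (0.2, 0.7)},
]
print(pick_best_faces(photos))
```

A real system would then blend the chosen faces into one base photo; that step is outside the scope of this sketch.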

Note: You can see a better-formatted report here


  1. TensorFlow (see: TensorFlow). Please select the appropriate version (GPU or CPU-only) for your machine.

  2. Libraries you might not have: dlib

  3. Python packages you might not have: cython, python-opencv, easydict, ipython
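
A quick way to see which of the packages above are still missing is to ask Python whether each one can be located. This is a minimal sketch; the import names in `REQUIRED_MODULES` are assumptions about how each package is imported (for example `cv2` for OpenCV and `Cython` for cython) and may need adjusting for your setup.

```python
import importlib.util

# Assumed import names for the requirements listed above.
REQUIRED_MODULES = ["tensorflow", "dlib", "Cython", "cv2", "easydict", "IPython"]


def find_missing(modules):
    """Return the top-level module names that cannot be found on the Python path."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```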

Installation (for Faster R-CNN)

  1. Clone the repository
# Make sure to clone with --recursive
git clone --recursive
  2. Build the Cython modules
    cd $ROOT/lib


Pretrained ImageNet model npy, or this

Faster R-CNN model trained on VOC2007 ckpt, npy

Face Detection model ckpt, npy

Eye-closure and smile model ckpt

NOTE: Use the npy files for initialization and the ckpt files for testing and running specific tasks. ckpt files can be converted into npy files; see the code in $ROOT/lib/networks/

Facial landmark model of dlib dat


In order to run the automatic enhancement code, you need to:

  1. Complete the Requires and Installation steps above.

  2. Create a directory named model under the repository root, and download the eye-closure and smile model into it.

  3. Download the facial landmark model and put it under the repository root.

Then you can run:

python tools/ --model model/VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt

You will find the synthesized output under the repository root.
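
Before running the command, it can help to confirm that the files from the steps above are in place. The sketch below is only a convenience check: `missing_files` is a helper written for this example, and `LANDMARK_MODEL` is a placeholder name, since the actual file name of the dlib landmark model is not given here. If your TensorFlow version stores a checkpoint as several files sharing one prefix, adapt the check accordingly.

```python
from pathlib import Path

# Checkpoint path taken from the command above.
EYE_SMILE_CKPT = "model/VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt"
# Placeholder (assumption): replace with the real name of the downloaded .dat file.
LANDMARK_MODEL = "landmarks.dat"


def missing_files(root, relative_paths):
    """Return the relative paths that do not exist under the given root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).exists()]


# Run from the repository root.
missing = missing_files(".", [EYE_SMILE_CKPT, LANDMARK_MODEL])
print("Missing files:", missing if missing else "none")
```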


In this project, we used WIDER to train the face detector and FDDB to train the eye-closure and smile classifier (fine-tuning the face detector at the same time).

To reuse the VOC2007 data iterator, we provide annotations for both datasets:

WIDER: [Google Drive]

FDDB(face detection only): [Google Drive]

FDDB(with eye-closure and smile labels): [Google Drive]

**NOTE:** Some images in FDDB contain too many faces to annotate with eye-closure and smile labels, so we simply ignore them.

How to use the dataset

  1. Download the training, validation, test data and VOCdevkit
  2. Extract all of these tars into one directory named VOCdevkit2007
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure
  $VOCdevkit2007/                           # development kit
  $VOCdevkit2007/VOCcode/                   # VOC utility code
  $VOCdevkit2007/VOC2007                    # image sets, annotations, etc.
  # ... and several other directories ...
  4. Create symlinks for the PASCAL VOC dataset
  cd $FRCN_ROOT/data
  ln -s $VOCdevkit2007 VOCdevkit2007
  5. Create folders for WIDER and FDDB
  • Move the images/ folder of WIDER to VOCdevkit2007/VOC2007/JPEGImages/, and rename it to WIDER/

  • Move the downloaded annotations of WIDER to VOCdevkit2007/VOC2007/Annotations (the folder should be named WIDER/)

  • Move the two folders (2002/ and 2003/) of FDDB to VOCdevkit2007/VOC2007/JPEGImages/

  • Move the downloaded annotations of FDDB to VOCdevkit2007/VOC2007/Annotations (the old and new annotations cannot be used at the same time)

  • Don't forget to set up the training/val/test sets in VOCdevkit2007/VOC2007/ImageSets/Main/. (We provide examples, which you can download along with the annotation files.)
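
The folder layout described above can be checked with a short script before training. This is a minimal sketch: `EXPECTED_DIRS` only lists the directories named in the steps above, and `missing_dirs` is a helper written for this example, not part of the repository.

```python
from pathlib import Path

# Directories named in the dataset steps, relative to VOCdevkit2007/.
EXPECTED_DIRS = [
    "VOC2007/JPEGImages/WIDER",
    "VOC2007/JPEGImages/2002",
    "VOC2007/JPEGImages/2003",
    "VOC2007/Annotations/WIDER",
    "VOC2007/ImageSets/Main",
]


def missing_dirs(devkit_root):
    """Return the expected directories that are absent under devkit_root."""
    root = Path(devkit_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


# Assumes the script is run from the directory that contains VOCdevkit2007/.
print("Missing directories:", missing_dirs("VOCdevkit2007") or "none")
```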

Training and Testing

  1. If you want to train and test the face detector, you can clone the repository of the TensorFlow version of Faster R-CNN and modify some functions in $ROOT/lib/ to do this.

  2. If you want to train and test the eye-closure and smile utilities, you can run the following commands:

python tools/ --weights model/VGGnet_fast_rcnn_wider_iter_70000.npy --imdb voc_2007_trainval --iters 100000 --cfg experiments/cfgs/faster_rcnn_end2end.yml --network VGGnet_train


python tools/  --model model/VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt --net VGGnet_test

Experiment results

  1. Face detector

The AP on the WIDER training set is 0.328. The AP on the whole FDDB dataset is 0.902.

Some examples:

(Green box: ground truth, red box: prediction)

  2. Eye-closure and smile classification

Some examples:

  3. Other results will be updated later...
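
For readers unfamiliar with the AP figures reported for the face detector, the sketch below shows one common way to compute average precision: the 11-point interpolation used by the PASCAL VOC2007 benchmark. Which AP variant produced the numbers above is not stated, so treat this as an assumption; the `average_precision_11pt` function and the toy detections are made up for illustration.

```python
def average_precision_11pt(scored_hits, num_ground_truth):
    """Compute 11-point interpolated AP.

    scored_hits: list of (confidence, is_true_positive) pairs, one per detection.
    num_ground_truth: total number of ground-truth boxes.
    """
    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda item: item[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, hit) in enumerate(ranked, start=1):
        if hit:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Average the best precision reachable at recall >= 0.0, 0.1, ..., 1.0.
    total = 0.0
    for i in range(11):
        threshold = i / 10
        reachable = [p for p, r in zip(precisions, recalls) if r >= threshold]
        total += max(reachable) if reachable else 0.0
    return total / 11


# Toy example: three detections, two ground-truth faces.
detections = [(0.9, True), (0.8, False), (0.7, True)]
print(round(average_precision_11pt(detections, num_ground_truth=2), 3))
```

Whether a detection counts as a true positive is usually decided by its overlap with a ground-truth box; that matching step is left out of this sketch.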