# Automatic Group Photography Enhancement
Have you ever experienced this? After a group photograph is taken, we often find, disappointingly, that some people are looking away, some have their eyes closed, and some are wearing an unhappy expression. Inspired by this paper, our project aims to synthesize a perfect group photograph automatically from a given set of group photos.
The system pipeline is as follows:
Note: You can see a better-formatted report here
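As a rough illustration of the core selection step in the pipeline (the function names and score weighting below are assumptions, not the project's actual code): each person's face is scored in every candidate photo using the eye-closure and smile predictions, and the face is taken from the best-scoring source photo.

```python
# Illustrative sketch only: combine per-face predictions into a quality
# score and pick, for each person, the photo where that face looks best.
# All names and the weights here are hypothetical, not the repo's API.

def face_score(eye_open_prob, smile_prob, w_eye=0.6, w_smile=0.4):
    """Combine eye-openness and smile probabilities into one score."""
    return w_eye * eye_open_prob + w_smile * smile_prob

def pick_best_sources(scores):
    """scores[person] is a list of quality scores, one per photo.
    Returns {person: index of the photo to copy that face from}."""
    return {person: max(range(len(s)), key=s.__getitem__)
            for person, s in scores.items()}
```

For example, with two photos where Alice looks better in the second and Bob in the first, the selector picks photo 1 for Alice and photo 0 for Bob; the chosen faces are then blended into one base image.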
## Requires

TensorFlow (see: TensorFlow). Please select the appropriate version (GPU or CPU-only) according to your machine.
Libraries you might not have:
Python packages you might not have:
## Installation (for Faster R-CNN)
- Clone the repository

```
# Make sure to clone with --recursive
git clone --recursive git@github.com:Yuliang-Zou/Automatic_Group_Photography_Enhancement.git
```
- Build the Cython modules

```
cd $ROOT/lib
make
```
- Eye-closure and smile model: ckpt
NOTE: You can use the `npy` files as initialization, while the `ckpt` files are used to test and perform certain tasks. `ckpt` files can be transformed into `npy`; please check the code in
- Facial landmark model of dlib: dat
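For reference, a hedged sketch of the dict-of-arrays layout commonly used by VGGnet-style `.npy` weight files (the layer names and shapes here are made up for illustration; check the actual files for the real keys):

```python
import os
import tempfile

import numpy as np

# Hypothetical layer names/shapes showing the dict-of-arrays layout often
# used for VGGnet .npy weight files; the repo's exact keys may differ.
weights = {
    "conv1_1": {"weights": np.zeros((3, 3, 3, 64), np.float32),
                "biases": np.zeros((64,), np.float32)},
}

# A .npy file holding such a dict is simply a pickled object,
# so loading it back requires allow_pickle=True.
path = os.path.join(tempfile.mkdtemp(), "vgg_weights.npy")
np.save(path, weights)
restored = np.load(path, allow_pickle=True).item()
```

Checkpoint (`.ckpt`) files, in contrast, store the full TensorFlow variable state and are what the test and enhancement scripts consume.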
In order to run the automatic enhancement code, you need to:
1. Finish the Requires and Installation sections above.
2. Create a directory named `model` under the repository root, and download the eye-closure and smile model into it.
3. Download the facial landmark model and put it under the root directory.
Then you can run:

```
python tools/enhance.py --model model/VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt
```
And you will find the synthesized output under the root directory.
In order to use the data iterator of VOC2007, we provide annotations of both datasets:
- WIDER: [Google Drive]
- FDDB (face detection only): [Google Drive]
- FDDB (with eye-closure and smile labels): [Google Drive]
**NOTE:** Some images in FDDB contain too many faces to annotate with eye-closure and smile labels; we simply ignore them.
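The annotations follow the PASCAL VOC XML layout, extended with the extra labels. A hedged sketch of reading such a file (the `eye_closed`/`smile` tag names are assumptions; check the provided annotation files for the real field names):

```python
import xml.etree.ElementTree as ET

# A VOC-style annotation with assumed <eye_closed>/<smile> extensions.
xml_text = """
<annotation>
  <object>
    <name>face</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>60</ymax></bndbox>
    <eye_closed>0</eye_closed>
    <smile>1</smile>
  </object>
</annotation>
"""

def parse_faces(text):
    """Return a list of face dicts: bounding box plus the extra labels."""
    faces = []
    for obj in ET.fromstring(text).iter("object"):
        box = obj.find("bndbox")
        faces.append({
            "bbox": tuple(int(box.find(t).text)
                          for t in ("xmin", "ymin", "xmax", "ymax")),
            "eye_closed": int(obj.findtext("eye_closed", "0")),
            "smile": int(obj.findtext("smile", "0")),
        })
    return faces
```

Keeping the VOC layout is what lets the stock VOC2007 data iterator consume WIDER and FDDB without code changes.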
## How to use the dataset
- Download the training, validation, and test data, and the VOCdevkit

```
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
```
- Extract all of these tars into one directory named `VOCdevkit`

```
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
```
- It should have this basic structure

```
$VOCdevkit2007/                   # development kit
$VOCdevkit2007/VOCcode/           # VOC utility code
$VOCdevkit2007/VOC2007            # image sets, annotations, etc.
# ... and several other directories ...
```
- Create symlinks for the PASCAL VOC dataset

```
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
```
- Create folders for WIDER and FDDB
- Move the `images/` folder of WIDER to `VOCdevkit2007/VOC2007/JPEGImages/`, and rename it as …
- Move the downloaded annotations of WIDER to `VOCdevkit2007/VOC2007/Annotations` (the folder should be named as …)
- Move the two folders (…, `2003/`) of FDDB to …
- Move the downloaded annotations of FDDB to `VOCdevkit2007/VOC2007/Annotations` (you can't use the old annotations and the new annotations at the same time)
- Don't forget to set the training/val/test splits in `VOCdevkit2007/VOC2007/ImageSets/Main/`. (We provide examples here, which you can download along with the annotation files.)
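The split files under `ImageSets/Main/` are plain text with one image ID per line. A minimal sketch of generating them (the image IDs and the `trainval.txt` file name below follow the VOC convention but are illustrative):

```python
import os
import tempfile

def write_split(image_ids, path):
    """Write one image ID per line -- the VOC ImageSets/Main text format."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write("\n".join(image_ids) + "\n")

# Example with made-up image IDs under a temporary VOCdevkit2007 tree.
root = tempfile.mkdtemp()
split_path = os.path.join(root, "VOCdevkit2007/VOC2007/ImageSets/Main/trainval.txt")
write_split(["000001", "000002"], split_path)
```

The data iterator reads these files to decide which images belong to the train, validation, and test sets.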
## Training and Testing
If you want to train and test the face detector, you can clone the repository of the TensorFlow version of Faster R-CNN, and modify some functions in `$ROOT/lib/` to do this.
If you want to train and test the eye-closure and smile utilities, you can run the following commands.

Training:

```
cd $FRCN_ROOT
python tools/train_net.py --weights model/VGGnet_fast_rcnn_wider_iter_70000.npy --imdb voc_2007_trainval --iters 100000 --cfg experiments/cfgs/faster_rcnn_end2end.yml --network VGGnet_train
```

Testing:

```
cd $FRCN_ROOT
python tools/test_yl.py --model model/VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt --net VGGnet_test
```
- Face detector

The AP on the WIDER training set is `0.328`. The AP on the whole FDDB dataset is …
(Green box: ground truth, red box: prediction)
- Eye-closure and smile classification
- Other results will be updated later...
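The AP figures above follow the standard PASCAL VOC protocol. A simplified sketch of the 11-point interpolated AP used by VOC 2007 (not the repo's actual evaluation code):

```python
import numpy as np

def voc_ap_11point(recall, precision):
    """11-point interpolated average precision (PASCAL VOC 2007 style).

    recall/precision: arrays over ranked detections, recall non-decreasing.
    At each of the 11 recall thresholds, take the maximum precision
    achieved at that recall or higher, then average.
    """
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recall >= t
        p = float(precision[mask].max()) if mask.any() else 0.0
        ap += p / 11.0
    return ap
```

A perfect detector (precision 1.0 at every recall level up to 1.0) scores an AP of 1.0 under this scheme.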