# Automatic Group Photography Enhancement
Have you ever experienced this? After a group photograph is taken, we often find, disappointingly, that some people are looking away, some have their eyes closed, and some are wearing an unhappy expression. Inspired by this paper, our project aims to synthesize a perfect group photograph automatically from a given set of group photos.
The system pipeline is as follows:
Note: You can see a better-formatted report here
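As a rough illustration of the core selection step in the pipeline (the function names and score weighting below are assumptions, not the project's actual code): each person's face is scored in every candidate photo using the eye-closure and smile predictions, and the face is taken from the best-scoring source photo.

```python
# Illustrative sketch only: combine per-face predictions into a quality
# score and pick, for each person, the photo where that face looks best.
# All names and the weights here are hypothetical, not the repo's API.

def face_score(eye_open_prob, smile_prob, w_eye=0.6, w_smile=0.4):
    """Combine eye-openness and smile probabilities into one score."""
    return w_eye * eye_open_prob + w_smile * smile_prob

def pick_best_sources(scores):
    """scores[person] is a list of quality scores, one per photo.
    Returns {person: index of the photo to copy that face from}."""
    return {person: max(range(len(s)), key=s.__getitem__)
            for person, s in scores.items()}
```

For example, with two photos where Alice looks better in the second and Bob in the first, the selector picks photo 1 for Alice and photo 0 for Bob; the chosen faces are then blended into one base image.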
## Requires

TensorFlow (see: TensorFlow). Please select the appropriate version (GPU or CPU-only) according to your machine.
Libraries you might not have:
Python packages you might not have:
## Installation (for Faster R-CNN)
- Clone the repository

```
# Make sure to clone with --recursive
git clone --recursive git@github.com:Yuliang-Zou/Automatic_Group_Photography_Enhancement.git
```
- Build the Cython modules

```
cd $ROOT/lib
make
```
- Eye-closure and smile model: ckpt
NOTE: You can use the `npy` files as initialization, while the `ckpt` files are used to test and perform certain tasks. `ckpt` files can be transformed into `npy`; please check the code in
- Facial landmark model of dlib: dat
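For reference, a hedged sketch of the dict-of-arrays layout commonly used by VGGnet-style `.npy` weight files (the layer names and shapes here are made up for illustration; check the actual files for the real keys):

```python
import os
import tempfile

import numpy as np

# Hypothetical layer names/shapes showing the dict-of-arrays layout often
# used for VGGnet .npy weight files; the repo's exact keys may differ.
weights = {
    "conv1_1": {"weights": np.zeros((3, 3, 3, 64), np.float32),
                "biases": np.zeros((64,), np.float32)},
}

# A .npy file holding such a dict is simply a pickled object,
# so loading it back requires allow_pickle=True.
path = os.path.join(tempfile.mkdtemp(), "vgg_weights.npy")
np.save(path, weights)
restored = np.load(path, allow_pickle=True).item()
```

Checkpoint (`.ckpt`) files, in contrast, store the full TensorFlow variable state and are what the test and enhancement scripts consume.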
In order to run the automatic enhancement code, you need to:
1. Finish the Requires and Installation sections above.
2. Create a directory named `model` under the repository root, and download the eye-closure and smile model into it.
3. Download the facial landmark model and put it under the root directory.
Then you can run:

```
python tools/enhance.py --model model/VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt
```
And you will find the synthesized output under the root directory.
In order to use the data iterator of VOC2007, we provide annotations of both datasets:
- WIDER: [Google Drive]
- FDDB (face detection only): [Google Drive]
- FDDB (with eye-closure and smile labels): [Google Drive]
**NOTE:** Some images in FDDB contain too many faces to annotate with eye-closure and smile labels; we simply ignore them.
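The annotations follow the PASCAL VOC XML layout, extended with the extra labels. A hedged sketch of reading such a file (the `eye_closed`/`smile` tag names are assumptions; check the provided annotation files for the real field names):

```python
import xml.etree.ElementTree as ET

# A VOC-style annotation with assumed <eye_closed>/<smile> extensions.
xml_text = """
<annotation>
  <object>
    <name>face</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>60</ymax></bndbox>
    <eye_closed>0</eye_closed>
    <smile>1</smile>
  </object>
</annotation>
"""

def parse_faces(text):
    """Return a list of face dicts: bounding box plus the extra labels."""
    faces = []
    for obj in ET.fromstring(text).iter("object"):
        box = obj.find("bndbox")
        faces.append({
            "bbox": tuple(int(box.find(t).text)
                          for t in ("xmin", "ymin", "xmax", "ymax")),
            "eye_closed": int(obj.findtext("eye_closed", "0")),
            "smile": int(obj.findtext("smile", "0")),
        })
    return faces
```

Keeping the VOC layout is what lets the stock VOC2007 data iterator consume WIDER and FDDB without code changes.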
## How to use the dataset
- Download the training, validation, and test data, and the VOCdevkit

```
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
```
- Extract all of these tars into one directory named `VOCdevkit`

```
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
```
- It should have this basic structure

```
$VOCdevkit2007/                   # development kit
$VOCdevkit2007/VOCcode/           # VOC utility code
$VOCdevkit2007/VOC2007            # image sets, annotations, etc.
# ... and several other directories ...
```
- Create symlinks for the PASCAL VOC dataset

```
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
```
- Create folders for WIDER and FDDB
- Move the `images/` folder of WIDER to `VOCdevkit2007/VOC2007/JPEGImages/`, and rename it as …
- Move the downloaded annotations of WIDER to `VOCdevkit2007/VOC2007/Annotations` (the folder should be named as …)
- Move the two folders (…, `2003/`) of FDDB to …
- Move the downloaded annotations of FDDB to `VOCdevkit2007/VOC2007/Annotations` (you can't use the old annotations and the new annotations at the same time)
- Don't forget to set the training/val/test splits in `VOCdevkit2007/VOC2007/ImageSets/Main/`. (We provide examples here, which you can download along with the annotation files.)
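The split files under `ImageSets/Main/` are plain text with one image ID per line. A minimal sketch of generating them (the image IDs and the `trainval.txt` file name below follow the VOC convention but are illustrative):

```python
import os
import tempfile

def write_split(image_ids, path):
    """Write one image ID per line -- the VOC ImageSets/Main text format."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write("\n".join(image_ids) + "\n")

# Example with made-up image IDs under a temporary VOCdevkit2007 tree.
root = tempfile.mkdtemp()
split_path = os.path.join(root, "VOCdevkit2007/VOC2007/ImageSets/Main/trainval.txt")
write_split(["000001", "000002"], split_path)
```

The data iterator reads these files to decide which images belong to the train, validation, and test sets.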
## Training and Testing
If you want to train and test the face detector, you can clone the repository of the TensorFlow version of Faster R-CNN, and modify some functions in `$ROOT/lib/` to do this.
If you want to train and test the eye-closure and smile utilities, you can run the following commands.

Training:

```
cd $FRCN_ROOT
python tools/train_net.py --weights model/VGGnet_fast_rcnn_wider_iter_70000.npy --imdb voc_2007_trainval --iters 100000 --cfg experiments/cfgs/faster_rcnn_end2end.yml --network VGGnet_train
```

Testing:

```
cd $FRCN_ROOT
python tools/test_yl.py --model model/VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt --net VGGnet_test
```
- Face detector

The AP on the WIDER training set is `0.328`. The AP on the whole FDDB dataset is …
(Green box: ground truth, red box: prediction)
- Eye-closure and smile classification
- Other results will be updated later...
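The AP figures above follow the standard PASCAL VOC protocol. A simplified sketch of the 11-point interpolated AP used by VOC 2007 (not the repo's actual evaluation code):

```python
import numpy as np

def voc_ap_11point(recall, precision):
    """11-point interpolated average precision (PASCAL VOC 2007 style).

    recall/precision: arrays over ranked detections, recall non-decreasing.
    At each of the 11 recall thresholds, take the maximum precision
    achieved at that recall or higher, then average.
    """
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recall >= t
        p = float(precision[mask].max()) if mask.any() else 0.0
        ap += p / 11.0
    return ap
```

A perfect detector (precision 1.0 at every recall level up to 1.0) scores an AP of 1.0 under this scheme.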