Yuliang-Zou/Automatic_Group_Photography_Enhancement

Automatic Group Photography Enhancement

Have you ever experienced this? After taking a group photograph, we often find, to our disappointment, that some people are looking away, some have their eyes closed, and some are wearing a sad expression. Inspired by this paper, our project aims to synthesize a perfect group photograph automatically from a given set of group photos.

The system pipeline is as follows:

We built our system on top of Faster R-CNN, using a TensorFlow implementation.

Note: A better-formatted report is available here

Requirements

  1. TensorFlow (see: TensorFlow). Please select the appropriate version (GPU or CPU-only) for your machine.

  2. Libraries you might not have: dlib

  3. Python packages you might not have: cython, python-opencv, easydict, ipython
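
Before installing anything, it can help to see which of the packages above are already importable. The following is a minimal sketch, not part of the repository. The import names (for example `cv2` for python-opencv and `Cython` for cython) are assumptions about how these packages are exposed on a typical install.

```python
import importlib.util

# Assumed import names for the requirements listed above
# (not taken from the repository).
REQUIRED_MODULES = {
    "tensorflow": "tensorflow",
    "dlib": "dlib",
    "cython": "Cython",
    "python-opencv": "cv2",
    "easydict": "easydict",
    "ipython": "IPython",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the requirement names whose import name cannot be located."""
    missing = []
    for requirement, import_name in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(import_name) is None:
            missing.append(requirement)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing requirements: " + ", ".join(missing))
    else:
        print("All listed requirements are importable.")
```

This only checks that a module can be found, not that its version is compatible with the code in this repository.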

Installation (for Faster R-CNN)

  1. Clone the repository
# Make sure to clone with --recursive
git clone --recursive git@github.com:Yuliang-Zou/Automatic_Group_Photography_Enhancement.git
  2. Build the Cython modules
    cd $ROOT/lib
    make

Model

Pretrained ImageNet model npy, or this

Faster R-CNN model trained on VOC2007 ckpt, npy

Face Detection model ckpt, npy

Eye-closure and smile model ckpt

NOTE: Use the npy files for initialization and the ckpt files for testing and running specific tasks. ckpt files can be converted to npy; see the code in $ROOT/lib/networks/network.py

Facial landmark model of dlib dat

Demo

In order to run the automatic enhancement code, you need to:

  1. Complete the Requirements and Installation sections.

  2. Create a directory named model under the repository root, and download the eye-closure and smile model into it.

  3. Download the facial landmark model and put it in the repository root.

Then you can run:

python tools/enhance.py --model model/VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt

The synthesized output will be written to the repository root.
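
A quick pre-flight check can catch a misplaced model file before the demo starts. The sketch below is not part of the repository. The checkpoint path is copied from the command above, while the landmark file name is passed in as a parameter because this README does not state it (`landmarks.dat` in the usage lines is a placeholder).

```python
import os

# Checkpoint path taken from the demo command above.
CHECKPOINT = os.path.join(
    "model", "VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt"
)


def missing_demo_files(root, landmark_name, checkpoint=CHECKPOINT):
    """Return the files the demo expects that are absent under `root`."""
    expected = [
        os.path.join(root, "tools", "enhance.py"),
        os.path.join(root, checkpoint),
        os.path.join(root, landmark_name),
    ]
    return [path for path in expected if not os.path.isfile(path)]


if __name__ == "__main__":
    # "landmarks.dat" is a placeholder; use the actual name of the
    # downloaded dlib facial landmark file.
    for path in missing_demo_files(".", "landmarks.dat"):
        print("Missing: " + path)
```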

Dataset

In this project, we used WIDER to train the face detector and FDDB to train the eye-closure and smile classifier (while fine-tuning the face detector at the same time).

To reuse the VOC2007 data iterator, we provide annotations for both datasets:

WIDER: [Google Drive]

FDDB(face detection only): [Google Drive]

FDDB(with eye-closure and smile labels): [Google Drive]

NOTE: Some images in FDDB contain too many faces to annotate with eye-closure and smile labels, so we ignore them.

How to use the dataset

  1. Download the training, validation, test data and VOCdevkit
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit2007
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure
  $VOCdevkit2007/                           # development kit
  $VOCdevkit2007/VOCcode/                   # VOC utility code
  $VOCdevkit2007/VOC2007                    # image sets, annotations, etc.
  # ... and several other directories ...
  4. Create symlinks for the PASCAL VOC dataset
  cd $FRCN_ROOT/data
  ln -s $VOCdevkit VOCdevkit2007
  5. Create folders for WIDER and FDDB
  • Move the images/ folder of WIDER to VOCdevkit2007/VOC2007/JPEGImages/ and rename it WIDER/

  • Move the downloaded annotations of WIDER to VOCdevkit2007/VOC2007/Annotations (the folder should be named WIDER/)

  • Move the two folders (2002/ and 2003/) of FDDB to VOCdevkit2007/VOC2007/JPEGImages/

  • Move the downloaded annotations of FDDB to VOCdevkit2007/VOC2007/Annotations (you can't use the old and new annotations at the same time)

  • Don't forget to set the training/val/test sets in VOCdevkit2007/VOC2007/ImageSets/Main/. (We provide examples, which you can download along with the annotation files.)
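
Once the folders are in place, the layout described in the steps above can be verified with a short script. This is a minimal sketch, not part of the repository. The directory list is inferred from the steps above, and it does not check the FDDB annotation files because their arrangement inside Annotations is not specified here.

```python
import os

# Sub-directories of VOCdevkit2007 implied by the steps above.
EXPECTED_DIRS = [
    os.path.join("VOC2007", "JPEGImages", "WIDER"),
    os.path.join("VOC2007", "JPEGImages", "2002"),
    os.path.join("VOC2007", "JPEGImages", "2003"),
    os.path.join("VOC2007", "Annotations", "WIDER"),
    os.path.join("VOC2007", "ImageSets", "Main"),
]


def missing_dirs(devkit_root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist yet."""
    return [
        rel for rel in expected
        if not os.path.isdir(os.path.join(devkit_root, rel))
    ]


if __name__ == "__main__":
    # Assumes the script is run from the directory that holds the
    # VOCdevkit2007 symlink (the data directory in the steps above).
    for rel in missing_dirs("VOCdevkit2007"):
        print("Missing directory: " + rel)
```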

Training and Testing

  1. If you want to train and test the face detector, you can clone the TensorFlow version of the Faster R-CNN repository and modify some functions in $ROOT/lib/ to do this.

  2. If you want to train and test the eye-closure and smile utilities, you can run the following commands:

cd $FRCN_ROOT
python tools/train_net.py --weights model/VGGnet_fast_rcnn_wider_iter_70000.npy --imdb voc_2007_trainval --iters 100000 --cfg experiments/cfgs/faster_rcnn_end2end.yml --network VGGnet_train

or

cd $FRCN_ROOT
python tools/test_yl.py --model model/VGGnet_fast_rcnn_full_eye_smile_1e-4_iter_70000.ckpt --net VGGnet_test
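
When running several training configurations, it may be convenient to assemble the training command from Python. The sketch below is an illustration only. The flags and default values are copied from the training command above, while the helper names `build_train_command` and `run_in_repo` are hypothetical and do not exist in the repository.

```python
import shlex
import subprocess


def build_train_command(weights, iters=100000, imdb="voc_2007_trainval",
                        cfg="experiments/cfgs/faster_rcnn_end2end.yml",
                        network="VGGnet_train"):
    """Build the argument list for tools/train_net.py (flags as shown above)."""
    return [
        "python", "tools/train_net.py",
        "--weights", weights,
        "--imdb", imdb,
        "--iters", str(iters),
        "--cfg", cfg,
        "--network", network,
    ]


def run_in_repo(command, frcn_root):
    """Run the command from the repository root; raises if it exits non-zero."""
    subprocess.run(command, cwd=frcn_root, check=True)


if __name__ == "__main__":
    cmd = build_train_command("model/VGGnet_fast_rcnn_wider_iter_70000.npy")
    # Print a shell-safe version of the command instead of launching training.
    print(" ".join(shlex.quote(part) for part in cmd))
```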

Experiment results

  1. Face detector

The AP on the WIDER training set is 0.328. The AP on the whole FDDB dataset is 0.902.

Some examples:

(Green box: ground truth, red box: prediction)

  2. Eye-closure and smile classification

Some examples:

  3. Other results will be added later...
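
This README does not say how the AP values reported for the face detector were computed. Since the project reuses the VOC2007 tooling, one plausible definition is the 11-point interpolated AP of the PASCAL VOC 2007 challenge. The sketch below illustrates that metric under this assumption. It is not the authors' evaluation code, and it expects each detection to have already been marked as a true or false positive.

```python
def average_precision_11pt(scores, is_true_positive, num_ground_truth):
    """11-point interpolated average precision for a single class.

    scores: confidence of each detection.
    is_true_positive: parallel list of booleans, True when the detection
        was matched to a ground-truth box.
    num_ground_truth: total number of ground-truth boxes.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, index in enumerate(order, start=1):
        if is_true_positive[index]:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Average the best precision reached at recall >= 0.0, 0.1, ..., 1.0.
    total = 0.0
    for step in range(11):
        threshold = step / 10
        reachable = [p for p, r in zip(precisions, recalls) if r >= threshold]
        total += max(reachable) if reachable else 0.0
    return total / 11
```

With this definition, a detector whose every detection is correct and which finds every ground-truth box scores 1.0, and a detector with no detections scores 0.0.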
