
FP-GAN TensorFlow

This repository contains a re-implementation of the model proposed by Lee et al. (2018) in their paper "Simulated+unsupervised learning with adaptive data generation and bidirectional mappings", which builds on CycleGAN ("Unpaired image-to-image translation using cycle-consistent adversarial networks", Zhu et al. 2017).

As an extension, we added a feature-preserving cost function that preserves eye gaze direction as well as regional landmarks when training the image translation networks.

Please have a look at the report for a detailed description of the experiments and results.

Repository-URL: https://github.com/mcbuehler/FP-GAN

Overall Architecture

Overall Architecture of FP-GAN

There are four domains: two of them contain the original images from MPIIGaze (R) and UnityEyes (S), and the other two contain the images produced by the generators G and F. We calculate the standard GAN losses from the discriminators (D_S and D_R). For the eye gaze consistency loss (L_eg) and the landmark loss (L_lm), we trained two additional convolutional neural networks. The identity-transform loss (L_id) can be calculated directly. The cycle-consistency loss (L_cyc) is omitted from the figure to reduce clutter; for an illustration, please refer to Fig. 3 of the original CycleGAN paper [Zhu et al. 2017].
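
For orientation, the individual loss terms combine into a single training objective, roughly as follows. This is a sketch in the notation above; the exact weighting, in particular of the GAN and cycle terms, is defined in the report and may differ in detail.

    L_total = L_GAN(D_S) + L_GAN(D_R)
              + λ_cyc · L_cyc + λ_id · L_id + λ_eg · L_eg + λ_lm · L_lm

Here λ_id, λ_eg and λ_lm are the weights referred to in the qualitative results below, and λ_cyc weights the cycle-consistency term as in CycleGAN.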

Qualitative Results

Qualitative comparison of the translations. The leftmost columns correspond to the original images. Columns a) to d) correspond to the following settings: a) λ_id = 2, λ_eg = λ_lm = 0; b) λ_eg = 30, λ_id = λ_lm = 0; c) λ_lm = 15, λ_id = λ_eg = 0; d) λ_id = 5, λ_eg = 15, λ_lm = 0 (failed). We observe that in column d) the weights of the feature losses were too high, so the generator did not produce images that are visually different from the input images.

Real to Synthetic

Qualitative results when translating from the real to the synthetic domain

Synthetic to Real

Qualitative results when translating from the synthetic to the real domain

Quick Start

For more detailed step-by-step instructions, see below.

  1. Install requirements
  2. Download the datasets: MPIIFaceGaze and UnityEyes
  3. Create a config file for the FP-GAN
  4. Download the pre-trained models for feature consistency
  5. Train the FP-GAN model
  6. Translate images
  7. (optional) Create a config file for a gaze estimator. Then, train a gaze estimation network and run inference.

Folder Structure

The cfg folder contains the config files (.ini) for the FP-GAN and the gaze estimator networks.

The documentation folder contains a detailed report and some illustrations.

The src folder contains the source code. These are the subfolders:

  • input: dataset classes and preprocessing scripts
  • models: eye gaze estimation models, GAN models (Generator, Discriminator, Feature Models) and scripts for model export / inference
  • run: scripts for running model training, inference, testing and visualisations
  • util: various helper scripts
  • visualisations: various visualisation scripts

Setup

  1. Install the requirements: pip install -r requirements.txt

  2. Download and unzip the datasets

    Download and unzip the ready-to-use datasets for MPIIFaceGaze and UnityEyes from here.

    As an alternative, you can pre-process the data yourself. Download it from the MPIIFaceGaze website and convert it to an h5 file with one group per person (e.g. 'p01'). For each person, add a sub-group for "image" (the eye image), "gaze" (the 2D gaze direction) and "head" (the 2D head pose). You can find the pre-processing script that we used in our experiments on Bitbucket. A sketch of the expected file layout is shown below.
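
    The file can be created with h5py, for example. The following is a minimal, hypothetical sketch of the layout described above; the array shapes and dtypes are placeholders, and "image", "gaze" and "head" are sketched here as stacked datasets. Please consult the Bitbucket pre-processing script for the exact format.

      import h5py
      import numpy as np

      n = 10  # number of samples for this person (placeholder)
      with h5py.File('MPIIFaceGaze.h5', 'w') as f:
          p01 = f.create_group('p01')  # one group per person
          p01.create_dataset('image', data=np.zeros((n, 36, 60), dtype=np.uint8))  # eye images (illustrative size)
          p01.create_dataset('gaze', data=np.zeros((n, 2), dtype=np.float32))      # 2D gaze direction
          p01.create_dataset('head', data=np.zeros((n, 2), dtype=np.float32))      # 2D head pose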

    If you want to generate your own UnityEyes dataset, download UnityEyes and follow their instructions. We recommend generating at least 100,000 images.

  3. Update the sample config file for the FP-GAN to your needs.

    Make sure that the dataset paths (S, R) and feature model paths (ege_path, lm_path) are correct; a sketch of such a config section is shown below.
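
    A hypothetical DEFAULT section might look as follows. The key names S, R, ege_path and lm_path are taken from the note above, all paths are placeholders, and any further keys required by the sample config are omitted.

      [DEFAULT]
      ; datasets: S = synthetic (UnityEyes), R = real (MPIIGaze)
      S = ../data/UnityEyes
      R = ../data/MPIIFaceGaze.h5
      ; pre-trained feature models for eye gaze and landmark consistency
      ege_path = ../pretrained/gazenet
      lm_path = ../pretrained/landmarks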

  4. Download and unzip the pre-trained models for feature consistency.

    Optionally, re-train the models for eye gaze and/or landmark consistency.

  5. Train an FP-GAN model

    Example command (run it from inside the src folder):

    python run/train_fpgan.py --config ../config/fpgan_example.ini --section DEFAULT
    
  6. Translate images

    • Update the config file

      Before running the image translation, you need to update the config file with the newly trained model.

      We recommend copying the DEFAULT section and giving it a new name, e.g. MYFPGAN. Then, set the checkpoint_folder variable to the newly trained model. For example: checkpoint_folder=../checkpoints/20190113-1455
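
      Assuming the config is read with Python's configparser, a named section inherits every key from DEFAULT, so only the overridden value has to be repeated. A minimal sketch:

        [MYFPGAN]
        ; all other keys are inherited from DEFAULT
        checkpoint_folder = ../checkpoints/20190113-1455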

    • Run the translations

      This will create subfolders in the FP-GAN checkpoint folder. Those subfolders will contain the refined images. Run this from inside the src folder.

      python run/run_fpgan_translations.py
          --config ../config/fpgan_example.ini
          --section MYFPGAN
          --direction both
      
  7. (optional) Train your own gaze estimator or use the pre-trained one from above in order to evaluate the eye gaze estimation performance. For this, you need to set path_test and dataset_class_test in the config file and run the test script. Again, we recommend copying the DEFAULT section for this.

    path_test =  ../checkpoints/20190113-1455/refined_MPII2Unity
    dataset_class_test = refined
    

    Then, run the script (inside the src folder): python run/run_test_gazenet.py --config ../config/gazenet_example.ini --section MYGAZENET

Feedback

I am happy to receive constructive feedback. Please don't hesitate to contact me if you have comments or questions. Thank you.

References

  • Lee, K., Kim, H., and Suh, C. 2018. Simulated+unsupervised learning with adaptive data generation and bidirectional mappings. In International Conference on Learning Representations.
  • Wood, E., Baltrušaitis, T., Morency, L.-P., Robinson, P., and Bulling, A. 2016. Learning an appearance-based gaze estimator from one million synthesised images. In Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications, 131–138.
  • Zhang, X., Sugano, Y., Fritz, M., and Bulling, A. 2017. It’s written all over your face: Full-face appearance-based gaze estimation. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2017 IEEE Conference on, IEEE, 2299–2308.
  • Zhu, J., Park, T., Isola, P., and Efros, A. A. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. CoRR abs/1703.10593.
