

This is the official PyTorch implementation of the paper "Facial Image-to-Video Translation by a Hidden Affine Transformation", published at ACM Multimedia 2019.

[Example results (animated GIFs): Happy, Anger, Surprise, Contempt, Eye Up, Close Eye, Drum Cheeks]


  • Python 3.6

  • Clone this repo:

git clone
cd AffineGAN
  • Install PyTorch 0.4+ (1.1 has been tested) and torchvision, along with the other dependencies (e.g., visdom and dominate).
    • For pip users, please type the command pip install -r requirements.txt.
    • For Conda users, we provide an installation script in ./scripts/


Datasets preparation

  • Because not all volunteers agreed to make their expressions public, we cannot release the full CK-Mixed and Cheeks&Eyes datasets. Instead, we provide some examples in ./datasets/.
  • You can create your own dataset by following the examples. The first frame of each training video must be the initial expression. You can download the CK+ dataset for grayscale expression training.
  • Note that for Cheeks&Eyes, we asked fewer than 100 volunteers to record the data with their cellphone cameras, so collecting such a dataset is easy. See the paper for details.
  • Some tips for dataset collection. You can still train the model without following them, but performance may suffer to some degree.
    • Use a proper aspect ratio (ideally close to square), as images are resized to 256*256 during training.
    • Make the head centered in the image.
    • Keep the head fixed. Only the facial expressions should change.
    • Attributes of the volunteers, such as race and age, influence the model's results. A model trained on Asian faces may not perform satisfactorily on European faces.
  • You can create new expression categories, such as opening the eyes. Note, however, that large head motions (e.g., nodding or shaking the head) are not currently captured well by AffineGAN.
  • Please keep the dataset folder structure as follows:
│  ├──img/
│  │  ├──video_name
│  │  │  ├──0001.jpg
│  │  │  ├──0002.jpg
│  │  │  ├──0003.jpg
│  │  │  └──...
│  └──patch/(if necessary)
│     └──video_name
│        ├──0001.jpg
│        ├──0002.jpg
│        ├──0003.jpg
│        └──...
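
The folder layout above can be checked before training with a short script. The following is only a minimal sketch, not part of the repository: the function name `check_dataset_split` is hypothetical, and it assumes that frames are named with four digits (0001.jpg, 0002.jpg, ...) and that each video starts at 0001.jpg, as in the tree shown above.

```python
import re
from pathlib import Path

# Assumption: frames follow the four-digit naming shown in the tree (0001.jpg, ...).
FRAME_RE = re.compile(r"^\d{4}\.jpg$")


def check_dataset_split(split_dir, use_patch=False):
    """Return a list of problems found under split_dir.

    split_dir is the folder that directly contains img/ (and patch/ if used).
    An empty list means the layout matches the expected structure.
    """
    split_dir = Path(split_dir)
    problems = []
    subdirs = ["img"] + (["patch"] if use_patch else [])
    for sub in subdirs:
        root = split_dir / sub
        if not root.is_dir():
            problems.append(f"missing folder: {root}")
            continue
        videos = sorted(p for p in root.iterdir() if p.is_dir())
        if not videos:
            problems.append(f"no video folders in {root}")
        for video in videos:
            frames = sorted(f.name for f in video.iterdir() if FRAME_RE.match(f.name))
            if not frames:
                problems.append(f"no frames in {video}")
            elif frames[0] != "0001.jpg":
                # The first frame is expected to be the initial expression.
                problems.append(f"{video} does not start at 0001.jpg")
    return problems
```

Calling `check_dataset_split("/path/to/split", use_patch=True)` also checks the patch/ folder. Printing the returned list gives a quick report of what needs fixing.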


We provide sample scripts for training and generation in ./scripts/. Note that these scripts run on the CPU. To run on a GPU instead, modify the gpu_ids option in the scripts.


  • To train a model (see options/*_options for the detailed options):
python --dataroot /path/to/dataset --name your_exp_name --checkpoints_dir /path/to/checkpoints
  • If you do not use the local patch for the mouth, add the option --no_patch.
  • To continue training, add --continue_train; training then resumes from the latest saved model.
  • The current version only supports training on a single GPU, with batch_size set to 1.
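
As a rough illustration of how the training options above combine, the sketch below assembles the option list in Python. The helper `build_train_args` is hypothetical and not part of the repository. It only covers the flags named above and deliberately leaves out the script name, which you must supply yourself.

```python
import shlex


def build_train_args(dataroot, name, checkpoints_dir,
                     use_patch=True, continue_train=False):
    """Assemble the training options described above (script name not included)."""
    args = [
        "--dataroot", dataroot,
        "--name", name,
        "--checkpoints_dir", checkpoints_dir,
    ]
    if not use_patch:
        # Skip the local mouth patch.
        args.append("--no_patch")
    if continue_train:
        # Resume from the latest saved model.
        args.append("--continue_train")
    return args


if __name__ == "__main__":
    options = build_train_args("/path/to/dataset", "your_exp_name",
                               "/path/to/checkpoints", use_patch=False)
    # Quote each option so the line can be pasted into a shell.
    print(" ".join(shlex.quote(a) for a in options))
```

The printed options can be appended to the training command, or the list can be passed to `subprocess.run` together with the interpreter and script path.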

Pretrained Models

We provide pretrained models for all the expressions. Note that these may not be the best models we have, and they may perform poorly on online images that differ substantially from our training samples. Please place each model in /path/to/checkpoints, for example /path/to/checkpoints/happy/latest_net_G.pth, where happy is the experiment name specified by the --name option.
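
The expected checkpoint location can be verified before running generation. This is a minimal sketch based only on the example path above. The helper names are hypothetical, and the `epoch` parameter is an assumption that generalizes the `latest` prefix seen in `latest_net_G.pth`.

```python
from pathlib import Path


def generator_checkpoint(checkpoints_dir, name, epoch="latest"):
    """Expected generator weights path: <checkpoints_dir>/<name>/<epoch>_net_G.pth."""
    return Path(checkpoints_dir) / name / f"{epoch}_net_G.pth"


def missing_checkpoints(checkpoints_dir, names):
    """Return the experiment names whose generator weights file is absent."""
    return [n for n in names
            if not generator_checkpoint(checkpoints_dir, n).is_file()]
```

For example, `missing_checkpoints("/path/to/checkpoints", ["happy"])` returns an empty list once the happy model has been placed correctly.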


  • Download our pretrained models or use your own models.
  • To generate frames for the given test images:
python --dataroot /path/to/dataset --name your_exp_name --checkpoints_dir /path/to/checkpoints --results_dir /path/to/result --eval
  • To make GIFs from the generated frames:
python --exp_names exp_name1,exp_name2,.. --dataroot /path/to/dataset
  • You will see the results in the specified results_dir.


If you use this code for your research, please cite our paper.

  title={Facial Image-to-Video Translation by a Hidden Affine Transformation},
  author={Shen, Guangyao and Huang, Wenbing and Gan, Chuang and Tan, Mingkui and Huang, Junzhou and Zhu, Wenwu and Gong, Boqing},
  booktitle={Proceedings of the 27th ACM international conference on Multimedia},


This project borrows heavily from the project Image-to-Image Translation in PyTorch. Special thanks to Lijie Fan and to the volunteers who provided their expressions.








