Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation.
Hao Tang1, Dan Xu2, Wei Wang3, Yan Yan4 and Nicu Sebe1.
1University of Trento, Italy, 2University of Oxford, UK, 3EPFL, Switzerland, 4Texas State University, USA.
In ACCV 2018 (Oral).
The repository offers the official implementation of our paper in PyTorch.
Copyright (C) 2019 University of Trento, Italy.
All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)
The code is released for academic research use only. For commercial use, please contact bjdxtanghao@gmail.com.
Clone this repo.
git clone https://github.com/Ha0Tang/AsymmetricGAN
cd AsymmetricGAN/
This code requires PyTorch 0.4.1 and Python 3.6+. Please install the dependencies with
pip install -r requirements.txt (for pip users)
or
./scripts/conda_deps.sh (for Conda users)
To reproduce the results reported in the paper, you will need two NVIDIA GeForce GTX 1080 Ti GPUs or two NVIDIA TITAN Xp GPUs.
For the hand gesture-to-gesture translation task, we use the NTU Hand Digit and Creative Senz3D datasets. Both datasets must be downloaded beforehand from their respective webpages. In addition, follow GestureGAN to prepare both datasets. Please cite their papers if you use the data.
Preparing NTU Hand Digit Dataset. The dataset can be downloaded in this paper. After downloading it, we use OpenPose to generate hand skeletons and use them as training and testing data in our experiments. Note that we filter out failure cases in hand gesture estimation for training and testing. Please cite their papers if you use this dataset. Train/Test splits for NTU Hand Digit dataset can be downloaded from here.
Preparing Creative Senz3D Dataset. The dataset can be downloaded here. After downloading it, we use OpenPose to generate hand skeletons and use them as training and testing data in our experiments. Note that we filter out failure cases in hand gesture estimation for training and testing. Please cite their papers if you use this dataset. Train/Test splits for Creative Senz3D dataset can be downloaded from here.
Preparing Your Own Datasets. Each training sample in the dataset will contain {Ix,Iy,Cx,Cy}, where Ix=image x, Iy=image y, Cx=Controllable structure of image x, and Cy=Controllable structure of image y. Of course, you can use AsymmetricGAN for your own datasets and tasks.
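The sample layout {Ix,Iy,Cx,Cy} can be sketched as a small Python container. This is only an illustration: the class name `GestureSample`, the helper `build_sample`, and the `images/` and `skeletons/` directory names are hypothetical and are not taken from this repository, whose data loader may organize files differently.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class GestureSample:
    """One training sample {Ix, Iy, Cx, Cy}, stored as file paths."""
    image_x: str       # Ix: image x
    image_y: str       # Iy: image y
    structure_x: str   # Cx: controllable structure (e.g. hand skeleton) of image x
    structure_y: str   # Cy: controllable structure of image y


def build_sample(root: str, name_x: str, name_y: str) -> GestureSample:
    """Pair two image names with their structure maps.

    Assumes a hypothetical layout: <root>/images/<name> and <root>/skeletons/<name>.
    """
    base = PurePosixPath(root)
    return GestureSample(
        image_x=str(base / "images" / name_x),
        image_y=str(base / "images" / name_y),
        structure_x=str(base / "skeletons" / name_x),
        structure_y=str(base / "skeletons" / name_y),
    )
```

Keeping the four paths together in one record makes it harder to pair an image with the wrong structure map when the file lists are shuffled.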
Once the dataset is ready, the result images can be generated using pretrained models.
- You can download a pretrained model (e.g. ntu_asymmetricgan) with the following script:
bash ./scripts/download_asymmetricgan_model.sh ntu_asymmetricgan
The pretrained model is saved at ./checkpoints/[type]_pretrained. Check here for all the available AsymmetricGAN models.
- Generate images using the pretrained model.
For NTU Dataset:
python test.py --dataroot [path_to_NTU_dataset] \
--name ntu_asymmetricgan_pretrained \
--model asymmetricgan \
--which_model_netG resnet_9blocks \
--which_direction AtoB \
--dataset_mode aligned \
--norm instance \
--gpu_ids 0 \
--ngf_t 64 \
--ngf_r 4 \
--batchSize 4 \
--loadSize 286 \
--fineSize 256 \
--no_flip
For Senz3D Dataset:
python test.py --dataroot [path_to_Senz3D_dataset] \
--name senz3d_asymmetricgan_pretrained \
--model asymmetricgan \
--which_model_netG resnet_9blocks \
--which_direction AtoB \
--dataset_mode aligned \
--norm instance \
--gpu_ids 0 \
--ngf_t 64 \
--ngf_r 4 \
--batchSize 4 \
--loadSize 286 \
--fineSize 256 \
--no_flip
If you are running in CPU mode, change --gpu_ids 0 to --gpu_ids -1.
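The --gpu_ids convention (a comma-separated list of GPU ids, with -1 meaning CPU) can be illustrated with a short Python sketch. The function names `parse_gpu_ids` and `select_device` are hypothetical; the actual option parsing in this repository may differ in detail.

```python
from typing import List


def parse_gpu_ids(gpu_ids: str) -> List[int]:
    """Turn a string such as "0,1" or "-1" into a list of non-negative GPU ids."""
    ids = [int(part) for part in gpu_ids.split(",") if part.strip()]
    # Negative ids are dropped, so "-1" yields an empty list (CPU mode).
    return [i for i in ids if i >= 0]


def select_device(gpu_ids: str) -> str:
    """Return a device name: the first listed GPU, or "cpu" if none is given."""
    ids = parse_gpu_ids(gpu_ids)
    return f"cuda:{ids[0]}" if ids else "cpu"
```

For example, `select_device("-1")` gives `"cpu"` and `select_device("0,1")` gives `"cuda:0"`.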
Note that testing takes a long time and requires a large amount of disk space. If you don't have enough space, append --saveDisk to the command line.
- The output images are stored at ./results/[type]_pretrained/ by default. You can view them using the autogenerated HTML file in that directory.
New models can be trained with the following commands.
- Prepare dataset.
- Train.
For NTU dataset:
export CUDA_VISIBLE_DEVICES=3,4;
python train.py --dataroot ./datasets/ntu \
--name ntu_asymmetricgan \
--model asymmetricgan \
--which_model_netG resnet_9blocks \
--which_direction AtoB \
--dataset_mode aligned \
--norm instance \
--gpu_ids 0,1 \
--ngf_t 64 \
--ngf_r 4 \
--batchSize 4 \
--loadSize 286 \
--fineSize 256 \
--no_flip \
--lambda_L1 800 \
--cyc_L1 0.1 \
--lambda_identity 0.01 \
--lambda_feat 1000 \
--display_id 0 \
--niter 10 \
--niter_decay 10
For Senz3D dataset:
export CUDA_VISIBLE_DEVICES=5,7;
python train.py --dataroot ./datasets/senz3d \
--name senz3d_asymmetricgan \
--model asymmetricgan \
--which_model_netG resnet_9blocks \
--which_direction AtoB \
--dataset_mode aligned \
--norm instance \
--gpu_ids 0,1 \
--ngf_t 64 \
--ngf_r 4 \
--batchSize 4 \
--loadSize 286 \
--fineSize 256 \
--no_flip \
--lambda_L1 800 \
--cyc_L1 0.1 \
--lambda_identity 0.01 \
--lambda_feat 1000 \
--display_id 0 \
--niter 10 \
--niter_decay 10
There are many options you can specify; run python train.py --help to list them. The specified options are printed to the console. To select which GPUs to use, run export CUDA_VISIBLE_DEVICES=[GPU_ID].
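The training commands combine CUDA_VISIBLE_DEVICES=3,4 with --gpu_ids 0,1. This works because CUDA renumbers the visible devices from 0 inside the process, so --gpu_ids refers to logical ids and not to physical ones. A minimal sketch of that mapping follows; the function `logical_to_physical` is hypothetical, and the sketch assumes the variable lists plain integer indices.

```python
from typing import Dict, List


def logical_to_physical(visible_devices: str, gpu_ids: List[int]) -> Dict[int, int]:
    """Map each logical id (as seen by the process) to its physical GPU index."""
    physical = [int(d) for d in visible_devices.split(",") if d.strip()]
    # Logical id i is the i-th entry of CUDA_VISIBLE_DEVICES.
    return {i: physical[i] for i in gpu_ids}
```

With `visible_devices="3,4"` and `gpu_ids=[0, 1]`, logical GPU 0 is physical GPU 3 and logical GPU 1 is physical GPU 4.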
To view training results and loss plots on a local computer, set --display_id to a non-zero value, run python -m visdom.server in a new terminal, and open the URL http://localhost:8097. On a remote server, replace localhost with your server's name, such as http://server.trento.cs.edu:8097.
To fine-tune a pretrained model or resume a previous training run, use the --continue_train --which_epoch <int> --epoch_count <int+1> flags. The program will then load the model from the epoch <int> you set in --which_epoch <int>. Set --epoch_count <int+1> to specify a different starting epoch count.
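The relationship between the two epoch flags can be sketched in Python. The helper `resume_flags` is hypothetical and only assembles the flags named above; it does not call into this repository.

```python
from typing import List


def resume_flags(last_saved_epoch: int) -> List[str]:
    """Build the resume flags for a checkpoint saved at `last_saved_epoch`."""
    if last_saved_epoch < 1:
        raise ValueError("last_saved_epoch must be a positive integer")
    return [
        "--continue_train",
        "--which_epoch", str(last_saved_epoch),      # checkpoint to load
        "--epoch_count", str(last_saved_epoch + 1),  # epoch at which counting restarts
    ]
```

The returned list can be appended to the arguments of the original training command, for instance when launching train.py through `subprocess.run`.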
Testing is similar to testing pretrained models.
For NTU dataset:
python test.py --dataroot [path_to_NTU_dataset] \
--name ntu_asymmetricgan \
--model asymmetricgan \
--which_model_netG resnet_9blocks \
--which_direction AtoB \
--dataset_mode aligned \
--norm instance \
--gpu_ids 0 \
--ngf_t 64 \
--ngf_r 4 \
--batchSize 4 \
--loadSize 286 \
--fineSize 256 \
--no_flip
For Senz3D dataset:
python test.py --dataroot [path_to_Senz3D_dataset] \
--name senz3d_asymmetricgan \
--model asymmetricgan \
--which_model_netG resnet_9blocks \
--which_direction AtoB \
--dataset_mode aligned \
--norm instance \
--gpu_ids 0 \
--ngf_t 64 \
--ngf_r 4 \
--batchSize 4 \
--loadSize 286 \
--fineSize 256 \
--no_flip
Use --how_many to specify the maximum number of images to generate. By default, it loads the latest checkpoint. It can be changed using --which_epoch.
- train.py, test.py: the entry points for training and testing.
- models/asymmetricgan_model.py: creates the networks and computes the losses.
- models/networks/: defines the architecture of all models for AsymmetricGAN.
- options/: creates option lists using the argparse package.
- data/: defines the class for loading images and controllable structures.
We use several metrics to evaluate the quality of the generated images:
- Fréchet Inception Distance (FID)
- PSNR (requires Lua)
- Fréchet ResNet Distance (FRD) (requires MATLAB 2016+)
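For reference, PSNR is defined as 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the two images. The pure-Python sketch below illustrates the formula on flattened pixel sequences; it is not the Lua script used in this repository, and the function name `psnr` and its signature are assumptions.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], generated: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the generated image is closer to the reference; identical inputs give an infinite PSNR.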
- Upload supervised AsymmetricGAN code for hand gesture-to-gesture translation
- Upload unsupervised AsymmetricGAN code for multi-domain image-to-image translation: code
If you use this code for your research, please cite our papers.
@article{tang2019asymmetric,
title={Asymmetric Generative Adversarial Networks for Image-to-Image Translation},
author={Hao Tang and Dan Xu and Hong Liu and Nicu Sebe},
journal={arXiv preprint arXiv:1912.06931},
year={2019}
}
@inproceedings{tang2018dual,
title={Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation},
author={Tang, Hao and Xu, Dan and Wang, Wei and Yan, Yan and Sebe, Nicu},
booktitle={ACCV},
year={2018}
}
This source code is inspired by Pix2pix and GestureGAN.
If you have any questions, comments, or bug reports, feel free to open a GitHub issue, submit a pull request, or e-mail the author Hao Tang (bjdxtanghao@gmail.com).