Implementation of the paper "Toward Multimodal Image-to-Image Translation"




Result

The first column shows the input and the second column the ground truth. The third column is the image generated by cLR-GAN, and the last column is the image generated by cVAE-GAN. Results were obtained on the validation dataset.


Model Architecture Visualization

  • Network

Fig 1: Structure of BicycleGAN. (Image taken from the paper)

  • Tensorboard visualization of the entire network

cVAE-GAN Network


  • tensorflow (1.4.0)
  • numpy (1.13.3)
  • scikit-image (0.13.1)
  • scipy (1.0.0)

To install the above dependencies, run:

$ sudo pip install -r requirements.txt




  • Download the datasets from the following links

  • To generate numpy files for the datasets,

    $ python --create <dataset_name>

    This creates train.npy and val.npy in the corresponding dataset directory. These files can be very large. As an alternative, the next step reads the images at run time during training.

  • As an alternative to the above step, you can read the images at run time during training. To do this, create files containing the paths to the images by running the following script in the root of this repo.

    $ bash
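The path-file approach described above can be sketched as follows. This is a minimal illustration, not the repo's script: the function names `write_path_list` and `iter_batches`, the list of image extensions, and the `load_image` callable are all assumptions introduced here. It writes one image path per line, then loads images batch by batch so the full dataset never has to sit in memory.

```python
import os

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def write_path_list(image_dir, list_file):
    """Write one image path per line and return the number of paths written."""
    paths = sorted(
        os.path.join(root, name)
        for root, _, files in os.walk(image_dir)
        for name in files
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    with open(list_file, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return len(paths)


def iter_batches(list_file, batch_size, load_image):
    """Yield lists of loaded images, reading each file only when its batch is due.

    `load_image` is a hypothetical callable that maps a path to an image array.
    """
    with open(list_file) as handle:
        paths = [line.strip() for line in handle if line.strip()]
    for start in range(0, len(paths), batch_size):
        yield [load_image(path) for path in paths[start:start + batch_size]]
```

Only the text file of paths is stored on disk in addition to the images, which avoids the large train.npy and val.npy files at the cost of decoding images during training.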


  • Generating graph:

    To visualize the connections between the graph nodes, generate the graph with the --archi flag. This is useful for checking that the connections are correct. The following command generates the graph for BicycleGAN:

    $ python --archi

    To generate the model graph for cvae-gan,

    $ python --model cvae-gan --archi

    Possible models are: cvae-gan, clr-gan, bicycle (default)

    To visualize the graph on tensorboard, run the following command:

    $ tensorboard --logdir=logs/summary/Run_1 --host=

    Replace Run_1 with the name of the latest run directory.

  • Complete list of options:

    $ python --help

  • Training the network

    To train a model (say, cvae-gan) on a dataset (say, facades) from scratch,

    $ python --train --model cvae-gan --dataset facades

    By default, the above command trains the model to generate images from domain B conditioned on images from domain A. To switch the direction,

    $ python --train --model cvae-gan --dataset facades --direction b2a

    To resume training from a checkpoint,

    $ python --resume <path_to_checkpoint> --model cvae-gan

  • Testing the network

    • Download the checkpoint files from here and place them in the ckpt directory

    To test one of the provided trained models on an image, run the command below. By default, the model generates 5 different images by sampling 5 different noise vectors.

    $ ./ <dataset_name> <test_image_path>

    To generate a different number of output samples, pass the count as a third argument:

    $ ./ <dataset_name> <test_image_path> < # of samples>

    Try it with some of the test samples present in the directory imgs/test
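The command-line flags used in the steps above can be summarized with a small `argparse` sketch. This is a hypothetical reconstruction for illustration only, not the repo's actual parser: the flag names and the model choices come from the commands shown above, while the help strings, the `a2b` name for the default direction, and the function name `build_parser` are assumptions.

```python
import argparse

# Model names listed in this README; "bicycle" is the documented default.
MODELS = ("cvae-gan", "clr-gan", "bicycle")


def build_parser():
    """Build a parser that mirrors the flags used in this README (a sketch)."""
    parser = argparse.ArgumentParser(
        description="Hypothetical CLI mirroring the README commands")
    parser.add_argument("--create", metavar="DATASET",
                        help="generate train.npy and val.npy for a dataset")
    parser.add_argument("--archi", action="store_true",
                        help="only build the graph for TensorBoard")
    parser.add_argument("--train", action="store_true",
                        help="train the selected model from scratch")
    parser.add_argument("--resume", metavar="CHECKPOINT",
                        help="resume training from a checkpoint path")
    parser.add_argument("--model", choices=MODELS, default="bicycle",
                        help="which model to build")
    parser.add_argument("--dataset", help="dataset name, e.g. facades")
    # "a2b" is an assumed name for the default direction (A conditions B).
    parser.add_argument("--direction", choices=("a2b", "b2a"), default="a2b",
                        help="translation direction")
    return parser


if __name__ == "__main__":
    example = ["--train", "--model", "cvae-gan",
               "--dataset", "facades", "--direction", "b2a"]
    print(build_parser().parse_args(example))
```

Restricting `--model` and `--direction` with `choices` makes a mistyped value fail at parse time instead of partway through graph construction.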


Discriminator and generator losses as a function of training iterations on the edges2shoes dataset.


  • Residual Encoder
  • Multiple discriminators for cVAE-GAN and cLR-GAN
  • Inducing noise to all the layers of the generator
  • Train the model on the rest of the datasets


Released under the MIT license