Channel-Recurrent Autoencoding for Image Modeling

Prerequisites

  • luarocks install cudnn
  • Install the batchDisc branch of the git repo stnbhwd, as we need its batch discrimination layer.
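
A typical way to install that branch, assuming the canonical qassemoquab/stnbhwd repository (adjust the URL if you use a different fork):

git clone -b batchDisc https://github.com/qassemoquab/stnbhwd.git
cd stnbhwd
luarocks make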

Dataset

  • We provide code to train on the Birds dataset. The processed t7 files can be downloaded from here.
  • We also provide code to conduct ablation studies on MNIST. The MNIST files (both binary and dynamic) can be downloaded from here.
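
Once downloaded, the t7 files can be inspected directly in Torch before training; a minimal sketch (the filename below is hypothetical, use whichever files the download contains):

require 'torch'
-- hypothetical filename; substitute the actual file from the download
local data = torch.load('/path/to/Birds/birds_train.t7')
print(torch.type(data))  -- inspect the stored structure before training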

Training

  • To train Birds with baseline VAE-GAN,
th main.lua -data /path/to/Birds/ -save /path/to/checkpoints/ -alpha 0.0002 -beta 0.05 -LR 0.0003 -eps 1e-6 -mom 0.9 -step 60 -manualSeed 1196
  • To train Birds with channel-recurrent VAE-GAN (a short sketch of the channel-recurrent latent follows this list),
th main.lua -data /path/to/Birds/ -save /path/to/checkpoints/ -alpha1 0.0003 -alpha2 0.0002 -beta 0.0125 -LR 0.0003 -kappa 0.02 -latentType lstm -eps 1e-6 -mom 0.9 -step 60 -manualSeed 96
  • To train MNIST with VAE,
th main_mnist.lua -LR 0.0003 -alpha 0.001 -latentType baseline -dataset mnist_28x28 -baseChannels 32 -nEpochs 200 -eps 1e-5 -mom 0.1 -step 50 -save /path/to/save/ -dynamicMNIST /path/to/dynamics/mnist/ -binaryMNIST /path/to/binary/mnist/
  • To train MNIST with convolutional VAE,
th main_mnist.lua -LR 0.0003 -alpha 0.001 -latentType conv -dataset mnist_28x28 -baseChannels 32 -nEpochs 200 -eps 1e-5 -mom 0.1 -step 50 -save /path/to/save/ -dynamicMNIST /path/to/dynamics/mnist/ -binaryMNIST /path/to/binary/mnist/
  • To train MNIST with channel-recurrent VAE,
th main_mnist.lua -LR 0.003 -timeStep 8 -alpha 0.001 -latentType lstm -dataset mnist_28x28 -baseChannels 32 -nEpochs 200 -eps 1e-5 -mom 0.1 -step 50 -save /path/to/save/ -dynamicMNIST /path/to/dynamics/mnist/ -binaryMNIST /path/to/binary/mnist/
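
For reference, here is a minimal sketch of the channel-recurrent idea behind -latentType lstm: the latent feature map is split along the channel axis into -timeStep chunks, and each chunk becomes one step of a recurrent sequence. This is an illustrative reconstruction with made-up sizes, not the repo's exact code; see models/ for the real implementation.

require 'torch'

local B, C, H, W = 16, 32, 8, 8   -- illustrative batch, channel, and spatial sizes
local timeStep = 8                 -- matches the -timeStep flag above
local feat = torch.randn(B, C, H, W)

local chunkC = C / timeStep        -- channels handled per recurrent step
local sequence = {}
for t = 1, timeStep do
  -- take the t-th block of channels and flatten it into one step's input
  local chunk = feat:narrow(2, (t - 1) * chunkC + 1, chunkC)
  sequence[t] = chunk:contiguous():view(B, chunkC * H * W)
end
-- an LSTM would now consume this table step by step to produce the latent code
print(#sequence, sequence[1]:size())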

Pretrained Models

We provide pretrained models for Birds, CelebA, and LSUN bedrooms, which can be downloaded from here, so that others can perform quantitative evaluations such as inception scores and human preference studies. We also provide a script, generate_images.lua, to generate individual images from the pretrained models. This script contains many components hardcoded for our models; for the same reason, please keep the model filenames unchanged, since they are parsed inside the script to extract model information. Below we present selected samples from the pretrained models.

  • The Stage1 models are named in the format DATASET_Stage1_MODELTYPE.t7; for example, birds_Stage1_crvae.t7 refers to the crVAE-GAN model trained on 64x64 Birds images. To generate (nSamples sets the number of images to generate),
nSamples=500 modelFile=birds_Stage1_crvae.t7 modelDir=/path/to/pretrained/ saveDir=/path/to/save/ th generate_images.lua
  • To generate Stage2 images (128x128 for Birds and CelebA, 224x224 for LSUN bedrooms), the corresponding Stage1 models must also be available. In addition, Birds and CelebA require 2 GPUs, and LSUN bedrooms requires 3. Stage2 models trained with perceptual loss are named in the format DATASET_Stage2_MODELTYPE_perc.t7, as opposed to DATASET_Stage2_MODELTYPE.t7 for those trained without it. For example, birds_Stage2_crvae_perc.t7 refers to the Stage2 model trained with perceptual loss on top of the crVAE-GAN Stage1 model for Birds. To generate,
nSamples=500 modelFile=birds_Stage2_crvae_perc.t7 modelDir=/path/to/pretrained/ saveDir=/path/to/save/ th generate_images.lua
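
At its core, such a generation script amounts to sampling from the prior and decoding. A hypothetical sketch follows; the checkpoint filename, layout, and latent size below are assumptions for illustration, and generate_images.lua handles the real model-specific parsing:

require 'torch'
require 'nn'
require 'image'

-- hypothetical standalone decoder checkpoint; the shipped .t7 files differ
local decoder = torch.load('/path/to/pretrained/decoder.t7')
local zDim = 512                        -- illustrative latent dimensionality
local z = torch.randn(1, zDim)          -- sample from the N(0, I) prior
local x = decoder:forward(z)            -- decode the sample into an image batch
-- values may need rescaling to [0, 1] depending on the decoder's output range
image.save('sample.png', x[1])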

Citation

If you find our code useful, please cite our paper [pdf][supplementary materials]:

@inproceedings{shang2017channel,
  title={Channel-Recurrent Autoencoding for Image Modeling},
  author={Shang, Wenling and Sohn, Kihyuk and Tian, Yuandong},
  booktitle={WACV},
  year={2018}
}

If you use the Birds data, please also cite the following papers:

@article{wah2011caltech,
  title={The caltech-ucsd birds-200-2011 dataset},
  author={Wah, Catherine and Branson, Steve and Welinder, Peter and Perona, Pietro and Belongie, Serge},
  year={2011},
  publisher={California Institute of Technology}
}
@inproceedings{van2015building,
  title={Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection},
  author={Van Horn, Grant and Branson, Steve and Farrell, Ryan and Haber, Scott and Barry, Jessie and Ipeirotis, Panos and Perona, Pietro and Belongie, Serge},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={595--604},
  year={2015}
}
@inproceedings{berg2014birdsnap,
  title={Birdsnap: Large-scale fine-grained visual categorization of birds},
  author={Berg, Thomas and Liu, Jiongxin and Woo Lee, Seung and Alexander, Michelle L and Jacobs, David W and Belhumeur, Peter N},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={2011--2018},
  year={2014}
}

If you use the dynamic MNIST dataset, please also cite

@article{lecun1998mnist,
  title={The MNIST database of handwritten digits},
  author={LeCun, Yann},
  journal={http://yann.lecun.com/exdb/mnist/},
  year={1998}
}

If you use the static MNIST dataset, please also cite

@article{uria2016neural,
  title={Neural autoregressive distribution estimation},
  author={Uria, Benigno and C{\^o}t{\'e}, Marc-Alexandre and Gregor, Karol and Murray, Iain and Larochelle, Hugo},
  journal={Journal of Machine Learning Research},
  volume={17},
  number={205},
  pages={1--37},
  year={2016}
}

Acknowledgments

Torch is a fantastic framework for deep learning research, allowing fast prototyping and easy manipulation of gradient propagation. We would like to thank the amazing Torch developers and the community. Our implementation benefited greatly from several excellent open-source repositories.
