PyTorch AutoEncoder

This repository contains PyTorch autoencoder examples.

Most of the code is copied & pasted from pytorch-generative.


Table of contents

  1. Tutorial
  2. Experiment Result
  3. Generated Image
  4. Reference

🌱Tutorial

  1. Clone this repo.

    git clone https://github.com/hankyul2/pytorch-ae.git
  2. Train your model.

    python3 train.py -m nade
  3. Use the trained model in your own way. The code snippet below shows how to sample from the model.

    import torch
    from torchvision.utils import save_image
    from pae.model import NADE
    
    model = NADE()
    model.load_state_dict(torch.load('your_checkpoint.pth', map_location='cpu'))
    generated_img = model.sample(16, 'cpu').reshape(16, 1, 28, 28)
    save_image(generated_img, 'generated_by_NADE.jpg')
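
The models are trained on binarized MNIST, so any image you feed a trained model should be binarized the same way. Below is a minimal sketch of one common choice, dynamic binarization by Bernoulli sampling (an assumption; the exact transform in this repo's data pipeline may differ):

    import torch
    from torchvision import datasets, transforms

    # Binarize each pixel by sampling a Bernoulli with p = pixel intensity.
    binarize = transforms.Compose([
        transforms.ToTensor(),                            # scale to [0, 1]
        transforms.Lambda(lambda x: torch.bernoulli(x)),  # sample {0., 1.}
    ])

    test_set = datasets.MNIST('data', train=False, download=True, transform=binarize)
    x, _ = test_set[0]  # x: (1, 28, 28) tensor with values in {0., 1.}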

🍀Experiment Result

Negative log-likelihood (NLL) on the binarized MNIST dataset (lower is better).

| Method | Command | NLL | Pretrained model |
|--------|---------|-----|------------------|
| NADE [1] | `python3 train.py -m NADE` | 84.0 | [code] [weight] [log] |
| MADE [2] | `python3 train.py -m MADE` | 83.8 | [code] [weight] [log] |
| PixelCNN [3] | `python3 train.py -m PixelCNN` | 81.7 | [code] [weight] [log] |
| Gated PixelCNN [4] | `python3 train.py -m GatedPixelCNN` | 81.7 | [code] [weight] [log] |
| PixelCNN++ [5] | `python3 train.py -m PixelCNN++ -b 128 --lr 2.5e-4` | 78.2 | [code] [weight] [log] |
| PixelSnail [6] | `python3 train.py -m PixelSnail -b 64 --lr 1.25e-4` | | |
| PixelSnail++ [5][6] | `python3 train.py -m PixelSnail++ -b 64 --lr 1.25e-4` | | |
| AE [7] | | | |
| VAE [8] | | | |
| Categorical-VAE [9] | | | |
| VQ-VAE [10] | | | |
| VQ-VAE-v2 [11] | | | |
| dVAE [12] | | | |
| DDPM [13] | | | |
| CDM [14] | | | |
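
For reference, here is a minimal sketch of how a per-image NLL on binarized MNIST can be computed, assuming the model outputs per-pixel Bernoulli logits and the reported numbers are in nats (the actual loss code in train.py may differ):

    import torch
    import torch.nn.functional as F

    def bernoulli_nll(logits, targets):
        # logits, targets: (batch, 1, 28, 28); targets binarized to {0, 1}.
        # Sum the Bernoulli cross-entropy over all 784 pixels, then average
        # over the batch to get the NLL in nats per image.
        nll = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
        return nll.flatten(1).sum(dim=1).mean()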

Issues

  1. MADE: we do not use the order-agnostic or connectivity-agnostic training tricks proposed in the original paper.
  2. PixelCNN: we do not implement the PixelRNN model, which is the main focus of the original paper; only the PixelCNN variant is provided.
  3. Changing batch size & learning rate: some models do not fit in GPU memory with the default -b 512, so we reduce the batch size and scale the learning rate down linearly (see the sketch below).
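
A minimal sketch of the linear scaling rule implied by the commands above; the base learning rate of 1e-3 at batch size 512 is inferred from the PixelCNN++ row, not read from the repo's defaults:

    def scale_lr(batch_size, base_lr=1e-3, base_batch=512):
        # Linear scaling: halving the batch size halves the learning rate.
        return base_lr * batch_size / base_batch

    print(scale_lr(128))  # 2.5e-4  (PixelCNN++ row)
    print(scale_lr(64))   # 1.25e-4 (PixelSnail row)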

🖼️Generated Image

| Method | Reconstructed Image | Randomly Sampled Image |
|--------|---------------------|------------------------|
| NADE [1] | val_49 (image) | sample_49 (image) |
| MADE [2] | val (image) | sample (image) |
| PixelCNN [3] | val (image) | sample (image) |
| Gated PixelCNN [4] | | |
| PixelCNN++ [5] | | |
| PixelSnail [6] | | |
| PixelSnail++ [5][6] | | |
| AE [7] | | |
| VAE [8] | | |
| Categorical-VAE [9] | | |
| VQ-VAE [10] | | |
| VQ-VAE-v2 [11] | | |
| dVAE [12] | | |
| DDPM [13] | | |
| CDM [14] | | |

🍁Reference

Footnotes

  1. NADE: "Neural Autoregressive Distribution Estimation", JMLR, 2016 [paper]
  2. MADE: "Masked Autoencoder for Distribution Estimation", PMLR, 2015 [paper]
  3. PixelCNN: "Pixel Recurrent Neural Networks", PMLR, 2016 [paper]
  4. Gated PixelCNN: "Conditional Image Generation with PixelCNN Decoders", NIPS, 2016 [paper]
  5. PixelCNN++: "Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications", ICLR, 2017 [paper]
  6. PixelSnail: "PixelSNAIL: An Improved Autoregressive Generative Model", PMLR, 2018 [paper]
  7. AE: "Autoencoders, Unsupervised Learning, and Deep Architectures", JMLR, 2012 [paper]
  8. VAE: "Auto-Encoding Variational Bayes", arXiv, 2013 [paper]
  9. Categorical-VAE: "Categorical Reparameterization with Gumbel-Softmax", ICLR, 2017 [paper]
  10. VQ-VAE: "Neural Discrete Representation Learning", NIPS, 2017 [paper]
  11. VQ-VAE-v2: "Generating Diverse High-Fidelity Images with VQ-VAE-2", NeurIPS, 2019 [paper]
  12. dVAE: "Zero-Shot Text-to-Image Generation", PMLR, 2021 [paper]
  13. DDPM: "Denoising Diffusion Probabilistic Models", NeurIPS, 2020 [paper]
  14. CDM: "Cascaded Diffusion Models for High Fidelity Image Generation", JMLR, 2022 [paper]