This repository contains PyTorch autoencoder examples. Most of the code is adapted from pytorch-generative.
- Clone this repo.

  ```bash
  git clone https://github.com/hankyul2/pytorch-ae.git
  ```
- Train your model. The `-m` flag selects the model; some models also need a smaller batch size (`-b`) and learning rate (`--lr`), as shown in the results table below.

  ```bash
  python3 train.py -m nade
  ```
- Use the trained model in your own way. The snippet below shows how to sample from a trained model.

  ```python
  import torch
  from torchvision.utils import save_image

  from pae.model import NADE

  model = NADE()
  model.load_state_dict(torch.load('your_checkpoint.pth'))
  generated_img = model.sample(16, 'cpu').reshape(16, 1, 28, 28)
  save_image(generated_img, 'generated_by_NADE.jpg')
  ```
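If you want to sample on a GPU instead, the same snippet needs only minor changes. A minimal sketch, assuming the `sample(n, device)` signature used above:

```python
import torch
from torchvision.utils import save_image

from pae.model import NADE

model = NADE().to('cuda')
model.load_state_dict(torch.load('your_checkpoint.pth', map_location='cuda'))
with torch.no_grad():  # no gradients needed for sampling
    generated_img = model.sample(16, 'cuda').reshape(16, 1, 28, 28)
# nrow=4 arranges the 16 samples in a 4x4 grid
save_image(generated_img.cpu(), 'generated_by_NADE.jpg', nrow=4)
```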
Negative Log Likelihood (NLL) loss on the Binarized MNIST dataset.

| Method | Command | NLL | Pretrained model |
| --- | --- | --- | --- |
| NADE[^1] | `python3 train.py -m NADE` | 84.0 | [code] [weight] [log] |
| MADE[^2] | `python3 train.py -m MADE` | 83.8 | [code] [weight] [log] |
| PixelCNN[^3] | `python3 train.py -m PixelCNN` | 81.7 | [code] [weight] [log] |
| Gated PixelCNN[^4] | `python3 train.py -m GatedPixelCNN` | 81.7 | [code] [weight] [log] |
| PixelCNN++[^5] | `python3 train.py -m PixelCNN++ -b 128 --lr 2.5e-4` | 78.2 | [code] [weight] [log] |
| PixelSnail[^6] | `python3 train.py -m PixelSnail -b 64 --lr 1.25e-4` | | |
| PixelSnail++[^5][^6] | `python3 train.py -m PixelSnail++ -b 64 --lr 1.25e-4` | | |
| AE[^7] | | | |
| VAE[^8] | | | |
| Categorical-VAE[^9] | | | |
| VQ-VAE[^10] | | | |
| VQ-VAE-v2[^11] | | | |
| dVAE[^12] | | | |
| DDPM[^13] | | | |
| CDM[^14] | | | |
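For reference, these NLL values are the standard Bernoulli negative log-likelihood in nats per image on 28x28 binarized MNIST. A minimal sketch of the metric (the helper name is ours, and we assume the models output per-pixel Bernoulli logits):

```python
import torch.nn.functional as F

def nll_per_image(logits, x):
    """Bernoulli negative log-likelihood in nats, summed over the 784
    pixels of each binarized 28x28 image and averaged over the batch.
    logits: per-pixel Bernoulli logits, shape (B, 1, 28, 28)
    x: binary targets, same shape."""
    return F.binary_cross_entropy_with_logits(logits, x, reduction='sum') / x.size(0)
```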
## Issues

- MADE: we could not utilize the order- and connectivity-agnostic training tricks proposed in the original paper.
- PixelCNN: we could not understand and implement the PixelRNN model, which is the main focus of the original paper.
- Changing batch size & learning rate: we could not train some models with the default `-b 512`, so we reduce `batch_size` to fit our GPU memory and scale `lr` down linearly with it (see the sketch after this list).
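A minimal sketch of that linear scaling rule, assuming the default learning rate is `1e-3` at `-b 512` (consistent with the PixelCNN++ and PixelSnail commands in the table above):

```python
def scale_lr(batch_size, base_lr=1e-3, base_batch_size=512):
    """Scale the learning rate linearly with the batch size.
    e.g. batch_size=128 -> 2.5e-4 and batch_size=64 -> 1.25e-4,
    matching the commands in the results table."""
    return base_lr * batch_size / base_batch_size
```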
Reconstructed and randomly sampled images for each method:

| Method | Reconstructed Image | Randomly Sampled Image |
| --- | --- | --- |
| NADE[^1] | | |
| MADE[^2] | | |
| PixelCNN[^3] | | |
| Gated PixelCNN[^4] | | |
| PixelCNN++[^5] | | |
| PixelSnail[^6] | | |
| PixelSnail++[^5][^6] | | |
| AE[^7] | | |
| VAE[^8] | | |
| Categorical-VAE[^9] | | |
| VQ-VAE[^10] | | |
| VQ-VAE-v2[^11] | | |
| dVAE[^12] | | |
| DDPM[^13] | | |
| CDM[^14] | | |
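For the reconstruction column, a hypothetical sketch of how reconstructions could be produced. The `AE` import and the assumption that its forward pass returns the reconstructed batch are ours; the actual interface in `pae.model` may differ.

```python
import torch
from torchvision import datasets, transforms
from torchvision.utils import save_image

from pae.model import AE  # hypothetical import; the interface is our assumption

# Take 16 test images as a (16, 1, 28, 28) batch
test_set = datasets.MNIST('data', train=False, download=True,
                          transform=transforms.ToTensor())
x = torch.stack([test_set[i][0] for i in range(16)])

model = AE()
model.load_state_dict(torch.load('your_checkpoint.pth'))
with torch.no_grad():
    recon = model(x)  # assumed to return the reconstructed batch
# Top row: originals; bottom row: reconstructions
save_image(torch.cat([x, recon]), 'reconstructed_by_AE.jpg', nrow=16)
```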
## Footnotes

[^1]: NADE: "Neural Autoregressive Distribution Estimation", JMLR, 2016 [paper]
[^2]: MADE: "Masked Autoencoder for Distribution Estimation", PMLR, 2015 [paper]
[^3]: PixelCNN: "Pixel Recurrent Neural Networks", PMLR, 2016 [paper]
[^4]: Gated PixelCNN: "Conditional Image Generation with PixelCNN Decoders", NIPS, 2016 [paper]
[^5]: PixelCNN++: "Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications", ICLR, 2017 [paper]
[^6]: PixelSnail: "PixelSNAIL: An Improved Autoregressive Generative Model", PMLR, 2018 [paper]
[^7]: AE: "Autoencoders, Unsupervised Learning, and Deep Architectures", JMLR, 2012 [paper]
[^8]: VAE: "Auto-Encoding Variational Bayes", ArXiv, 2013 [paper]
[^9]: Categorical-VAE: "Categorical Reparameterization with Gumbel-Softmax", ICLR, 2017 [paper]
[^10]: VQ-VAE: "Neural Discrete Representation Learning", NIPS, 2017 [paper]
[^11]: VQ-VAE-v2: "Generating Diverse High-Fidelity Images with VQ-VAE-2", NIPS, 2019 [paper]
[^12]: dVAE: "Zero-Shot Text-to-Image Generation", PMLR, 2021 [paper]
[^13]: DDPM: "Denoising Diffusion Probabilistic Models", NIPS, 2020 [paper]
[^14]: CDM: "Cascaded Diffusion Models for High Fidelity Image Generation", JMLR, 2022 [paper]