- Train the Vector Quantised Variational AutoEncoder (VQ-VAE) for discrete representation learning and reconstruction (see the quantiser sketch after this list).
- Use a PixelCNN to learn a prior over the discrete latents for image sampling (see the masked-convolution sketch after this list).
- VQ-VAE was originally proposed in the paper Neural Discrete Representation Learning.
- PixelCNN was proposed in the papers Pixel Recurrent Neural Networks and Conditional Image Generation with PixelCNN Decoders.
- The implementation of VQ-VAE (without priors) is based on the official code from Google DeepMind. Note: unlike the official code, the implementation here does not rely on the Sonnet library.
- The implementation of PixelCNN is based on this repo, with minor modifications.
- We provide slides that may help readers gain a better understanding of PixelCNN and VQ-VAE. Some images used in the slides are borrowed from papers and websites, so the slides may only be used for learning purposes.
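The core of the first step is the vector-quantisation layer: encoder outputs are snapped to their nearest codebook vectors, and gradients flow back through the straight-through estimator. Below is a minimal PyTorch sketch of that layer, written for illustration only; class and hyper-parameter names (`VectorQuantizer`, `num_embeddings`, `commitment_cost`) are assumptions and may differ from the code in this repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with a straight-through estimator,
    in the spirit of the VQ-VAE paper. Illustrative sketch, not this repo's code."""
    def __init__(self, num_embeddings=512, embedding_dim=64, commitment_cost=0.25):
        super().__init__()
        self.embedding = nn.Embedding(num_embeddings, embedding_dim)
        self.embedding.weight.data.uniform_(-1.0 / num_embeddings, 1.0 / num_embeddings)
        self.commitment_cost = commitment_cost

    def forward(self, z_e):  # z_e: (B, C, H, W) encoder output
        B, C, H, W = z_e.shape
        flat = z_e.permute(0, 2, 3, 1).reshape(-1, C)            # (B*H*W, C)
        # Squared distance from each encoder vector to every codebook vector
        dist = (flat.pow(2).sum(1, keepdim=True)
                - 2 * flat @ self.embedding.weight.t()
                + self.embedding.weight.pow(2).sum(1))
        indices = dist.argmin(1)                                  # discrete latents
        z_q = self.embedding(indices).view(B, H, W, C).permute(0, 3, 1, 2)
        # Codebook loss + commitment loss
        loss = (F.mse_loss(z_q, z_e.detach())
                + self.commitment_cost * F.mse_loss(z_e, z_q.detach()))
        # Straight-through estimator: copy decoder gradients to the encoder
        z_q = z_e + (z_q - z_e).detach()
        return z_q, loss, indices.view(B, H, W)
```

The returned index grid is what the PixelCNN prior is later trained on; the reconstruction loss from the decoder is added to `loss` during VQ-VAE training.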
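For the second step, PixelCNN models the discrete latent grid autoregressively using masked convolutions: each position may only see positions above it and to its left. The sketch below shows the masked convolution described in the PixelCNN papers, again as a hedged PyTorch illustration; the layer name and argument layout are assumptions, not this repo's API.

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """2-D convolution with the autoregressive mask from the PixelCNN papers.
    Mask type 'A' (first layer) also hides the centre position; type 'B' keeps it.
    Illustrative sketch, not this repo's code."""
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ("A", "B")
        self.register_buffer("mask", torch.ones_like(self.weight))
        _, _, kh, kw = self.weight.shape
        self.mask[:, :, kh // 2, kw // 2 + (mask_type == "B"):] = 0  # right of centre
        self.mask[:, :, kh // 2 + 1:, :] = 0                          # rows below centre

    def forward(self, x):
        self.weight.data *= self.mask   # zero out "future" positions before convolving
        return super().forward(x)
```

A typical prior network stacks one type-'A' layer followed by several type-'B' layers over the (embedded or one-hot) index grid, trains with cross-entropy over the codebook indices, then samples the grid position by position and feeds it through the VQ-VAE decoder to produce images.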
- Run MNIST: `python vqvae1_withPixelCNNprior_mnist.py`
- Run CIFAR-10: `python vqvae1_withPixelCNNprior_cifar10.py`
| | Testing data | Reconstruction | Random samples | Samples based on PixelCNN prior |
| --- | --- | --- | --- | --- |
| MNIST | | | | |
| CIFAR-10 | | | | |