This repository contains code for training autoregressive generative models, specifically the PixelCNN and GatedPixelCNN architectures.
Vanilla PixelCNN is implemented in TensorFlow, while PyTorch is used to train GatedPixelCNN and a PixelCNN with a color dependency masking mechanism.
Instead of MNIST, on which this model performs exceptionally well, I used the Stanford Dogs dataset, which you can download from here: https://www.kaggle.com/c/generative-dog-images/data
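For readers new to the masking mechanism mentioned above, here is a minimal NumPy sketch of how a PixelCNN convolution mask with color dependencies can be built. It is not the code from this repository: the function name `pixelcnn_mask`, the `n_colors` parameter, and the interleaved channel-to-color assignment (`channel % n_colors`) are assumptions made for illustration. A type "A" mask (first layer) hides the center pixel's own color, while a type "B" mask (later layers) lets each color see itself and the colors before it.

```python
import numpy as np


def pixelcnn_mask(out_channels, in_channels, kernel_size, mask_type, n_colors=3):
    """Return a (out, in, k, k) binary mask for a masked convolution.

    Hypothetical helper: channels are assigned to colors R, G, B in an
    interleaved way (channel index modulo n_colors), which is an assumption.
    """
    assert mask_type in ("A", "B")
    assert kernel_size % 2 == 1, "kernel size must be odd"
    k = kernel_size
    c = k // 2  # index of the center row and column

    mask = np.zeros((out_channels, in_channels, k, k), dtype=np.float32)
    mask[:, :, :c, :] = 1.0  # all rows above the center pixel
    mask[:, :, c, :c] = 1.0  # pixels to the left in the center row

    # Color group of every output and input channel.
    out_color = np.arange(out_channels) % n_colors
    in_color = np.arange(in_channels) % n_colors

    # At the center pixel, a color may only depend on earlier colors
    # (type A) or on earlier colors and itself (type B).
    if mask_type == "A":
        center = out_color[:, None] > in_color[None, :]
    else:
        center = out_color[:, None] >= in_color[None, :]
    mask[:, :, c, c] = center.astype(np.float32)
    return mask


mask_a = pixelcnn_mask(3, 3, 3, "A")
mask_b = pixelcnn_mask(3, 3, 3, "B")
```

In a masked convolution layer, a mask like this would typically be multiplied element-wise with the kernel weights before every forward pass, so that each pixel is predicted only from pixels (and colors) that come earlier in the raster-scan order.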
Thoughts on results:
- I was disappointed by the generated images, as I was expecting photorealistic results like those from GANs, which I am going to try next.
The model did not generate anything intelligible, but the output is still better than random noise: the model learned some statistics of the shape and boundaries of a dog's body, which you can see in GatedPixelCNN's output.
- I tried increasing the number of epochs and the number of residual block layers, but it didn't help. The loss fluctuates around 3.6.
- I tried to generate 64 * 64 images, but I trained on Kaggle notebooks and got a CUDA resource exhausted error. Reducing the batch size didn't help either, so I generated 48 * 48 images instead. You can try generating at the original size if you have a more powerful machine.
- Another way to improve the results is preprocessing and augmenting the dataset. It has only around 22k images, of which I used only 16k, which is small compared to massive datasets such as CIFAR-10 and CelebA. The training data could be enlarged using data augmentation methods.
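As a rough illustration of the data augmentation idea in the last point, the sketch below doubles a batch of images by adding horizontally mirrored copies. It is only an example and not part of this repository: the function name `augment_with_flips` and the `(N, H, W, C)` array layout are assumptions.

```python
import numpy as np


def augment_with_flips(images):
    """Return the originals followed by their horizontally mirrored copies.

    Hypothetical helper; `images` is assumed to have shape (N, H, W, C).
    """
    flipped = images[:, :, ::-1, :]  # reverse the width axis
    return np.concatenate([images, flipped], axis=0)


# Tiny dummy batch: 2 images, 2 x 3 pixels, 1 channel.
batch = np.arange(2 * 2 * 3 * 1).reshape(2, 2, 3, 1)
augmented = augment_with_flips(batch)
```

Mirroring is a reasonable first choice for dog photos because a flipped dog still looks like a dog. Other transforms such as random crops or small rotations could be added in the same way.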
TODO: Try slightly more advanced, state-of-the-art architectures such as PixelSNAIL and PixelCNN++.
Feel free to reach out to me at jaydeept126@gmail.com with any feedback.
Resources:
Apart from the papers, these blog posts helped me understand the masking and gating mechanisms clearly.
http://sergeiturukin.com/2017/02/24/gated-pixelcnn.html
http://sergeiturukin.com/2017/02/22/pixelcnn.html