This package contains the accompanying code for the following paper:
Tapani Raiko, Li Yao, KyungHyun Cho, Yoshua Bengio
Iterative Neural Autoregressive Distribution Estimator (NADE-k).
Advances in Neural Information Processing Systems 2014 (NIPS14).
Download Theano and make sure it's working properly.
All the information you need can be found by following this link:
Make sure Theano is on your PYTHONPATH.
Very detailed information can be found below:
Make sure jobman is on your PYTHONPATH.
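A quick way to confirm that both packages are visible to the interpreter is sketched below. This is not part of the package: `check_importable` is a hypothetical helper, and the snippet assumes Python 3 (the package itself may target a different Python version).

```python
import importlib.util


def check_importable(names):
    """Return {name: bool} telling whether each top-level module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "theano" and "jobman" are the two dependencies named above.
    for name, ok in check_importable(["theano", "jobman"]).items():
        print(name, "found" if ok else "NOT found; check your PYTHONPATH")
```

If either module is reported as not found, add the directory that contains it to PYTHONPATH and run the check again.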
Prepare the MNIST dataset
You can download the dataset from the links below.
After the dataset has been downloaded, make sure to change the
Reproducing the Results
Train the model
config.py. This is the directory where all training outputs will be placed. For each experiment, specify 'save_model_path' in the same config file.
- To run NADE-5 1HL in Table 1 of the paper, make sure
- To run NADE-5 2HL in Table 1 of the paper, make sure
- To start training,
Running the code on a GPU is highly recommended. For setup instructions, see http://deeplearning.net/software/theano/tutorial/using_gpu.html.
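As a minimal sketch of what the tutorial linked above describes, Theano reads its device settings from the THEANO_FLAGS environment variable, which has to be set before Theano is imported. The helper name `build_theano_flags` is hypothetical, and the device string is an assumption: older Theano releases use `device=gpu`, while newer ones use `device=cuda`.

```python
import os


def build_theano_flags(device="gpu", floatX="float32"):
    """Build a THEANO_FLAGS string selecting the device and float precision."""
    return "device={},floatX={}".format(device, floatX)


# Must run before `import theano`; setdefault keeps a value that is already
# present in the environment instead of overwriting it.
os.environ.setdefault("THEANO_FLAGS", build_theano_flags())
```

The same effect can be had by setting the variable in the shell when launching the training script.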
During training, a lot of information is printed to the screen, and many files are written to
save_model_path. You will be able to see a plot of the decreasing training cost, samples generated from the model, and the log-likelihood on the validation and test sets every
If you use the default setup, the model will be pretrained for 1000 epochs and then finetuned for another 3000 epochs. To get a good generative model, one needs to be patient :)
In addition, we have provided some training logs against which you can compare your own experiments. See the directory
After training is done, it is time to get all those SOTA numbers in Table 1 of the paper.
In config.py, change the option 'action' to 1. Also make sure 'from_path' points to the directory that contains model_configs.pkl. The option 'epoch' specifies which saved model in that directory you would like to use.
- If all goes well, the evaluation script should be able to produce numbers that match those in the paper.
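The evaluation settings described above can be sanity-checked before launching the script. The sketch below is an illustration only: it assumes the options are available as a plain dictionary, which may not match how config.py actually stores them, and `check_eval_config` is a hypothetical helper.

```python
import os


def check_eval_config(cfg):
    """Return a list of problems found in a dict of evaluation options."""
    problems = []
    # 'action' must be 1 to run evaluation, as described above.
    if cfg.get("action") != 1:
        problems.append("'action' should be 1 for evaluation")
    # 'from_path' must point to the directory holding model_configs.pkl.
    from_path = cfg.get("from_path", "")
    if not os.path.isfile(os.path.join(from_path, "model_configs.pkl")):
        problems.append("'from_path' does not contain model_configs.pkl")
    # 'epoch' selects which saved model to load.
    if "epoch" not in cfg:
        problems.append("'epoch' is not set")
    return problems
```

An empty list means the three options look consistent; otherwise each entry names the option to fix.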
IMPORTANT: You may be surprised to see better numbers than those reported in our paper. This is expected: the longer you train the model, the more likely you are to get better numbers. Do share your joy with us when this happens.
Benchmarks with this package
NADE-5 1HL model:
testset LL over 10 orderings = -89.43
testset LL over 128 ensembles = -85.77
These numbers are better than those reported in the paper because the model was trained much longer here.
NADE-5 2HL model:
testset LL over 10 orderings = -87.13
testset LL over 128 ensembles = -84.65
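The two kinds of numbers above differ in how the orderings are combined. The sketch below illustrates one common reading, which is an assumption and not taken from this package's evaluation script: the "orderings" figure averages the log-likelihoods of the individual orderings, while the "ensemble" figure averages their probabilities for each test example before taking the log. The function names and the nested-list layout `ll[k][n]` (ordering k, example n) are hypothetical.

```python
import math


def mean_ll(ll):
    """Average log-likelihood over all orderings and examples."""
    total = sum(sum(row) for row in ll)
    count = sum(len(row) for row in ll)
    return total / count


def ensemble_ll(ll):
    """Average over examples of log(mean over orderings of p_k(x))."""
    n_orderings = len(ll)
    n_examples = len(ll[0])
    total = 0.0
    for n in range(n_examples):
        column = [ll[k][n] for k in range(n_orderings)]
        # Stable log-sum-exp: subtract the max before exponentiating.
        m = max(column)
        lse = m + math.log(sum(math.exp(v - m) for v in column))
        total += lse - math.log(n_orderings)
    return total / n_examples
```

Under this reading, the ensemble value can never be lower than the plain average (Jensen's inequality), which is consistent with the ensemble numbers above being the better ones.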
Need a trained model?
Contact us: firstname.lastname@example.org