This package contains the accompanying code for the following paper:

Tapani Raiko, Li Yao, KyungHyun Cho, Yoshua Bengio
Iterative Neural Autoregressive Distribution Estimator (NADE-k).
Advances in Neural Information Processing Systems 2014 (NIPS14).


Install Theano

Download Theano and make sure it's working properly.
All the information you need can be found by following this link:
Make sure Theano is on your PYTHONPATH.

Install Jobman

Very detailed information can be found below:
Make sure Jobman is on your PYTHONPATH.
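
A quick way to confirm that both dependencies are visible to Python is to try importing them. The sketch below is not part of this package; the function name check_dependencies is made up for illustration, and it assumes the modules are importable under the names theano and jobman.

```python
import importlib

def check_dependencies(names=("theano", "jobman")):
    """Return {module name: True/False} depending on whether the import succeeds."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # Usually means the package is missing or not on PYTHONPATH.
            results[name] = False
    return results

if __name__ == "__main__":
    for name, found in check_dependencies().items():
        status = "found" if found else "NOT found (check your PYTHONPATH)"
        print("{}: {}".format(name, status))
```

A successful import only shows that the module can be located; it does not verify that Theano is configured correctly for your hardware.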

Prepare the MNIST dataset

You can download the dataset from the links below.
[trainset] (

After the dataset has been downloaded, make sure to change the data_path in

Reproducing the Results

Train the model

  1. Change exp_path in This is the directory where all the training outputs are going to be placed. For different experiments, one needs to specify 'save_model_path' in the same config file.
  2. To run NADE-5 1HL in Table 1 of the paper, make sure
    'n_layers': 1, and 'l2': 0.0.
  3. To run NADE-5 2HL in Table 1 of the paper, make sure
    'n_layers': 2, and 'l2': 0.0012279827881.
  4. To start training, python
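
The two Table 1 settings from steps 2 and 3 can be written down as overrides on a config dictionary. This is only an illustrative sketch: the keys 'n_layers' and 'l2' and their values are quoted from the steps above, while TABLE1_OVERRIDES, apply_overrides, and base_config are hypothetical names that do not come from the package.

```python
# Values quoted from steps 2 and 3 above; the dictionary itself is illustrative.
TABLE1_OVERRIDES = {
    "NADE-5 1HL": {"n_layers": 1, "l2": 0.0},
    "NADE-5 2HL": {"n_layers": 2, "l2": 0.0012279827881},
}

def apply_overrides(config, variant):
    """Return a copy of `config` with the Table 1 settings for `variant` applied."""
    updated = dict(config)  # leave the caller's dictionary untouched
    updated.update(TABLE1_OVERRIDES[variant])
    return updated

# Hypothetical stand-in for the package's real config, which has more options.
base_config = {"n_layers": 1, "l2": 0.0}
print(apply_overrides(base_config, "NADE-5 2HL"))
```

Keeping both variants side by side makes it harder to change 'n_layers' and forget the matching 'l2' value.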

Running the code on a GPU is highly recommended. For setup instructions, see:

Training outputs

During training, a lot of information is printed to the screen, and many files are written to save_model_path. Every valid_freq epochs you will see a plot of the decreasing training cost, samples generated from the model, and the log-likelihood on the validation and test sets.

If you use the default setup, the model will be pretrained for 1000 epochs and finetuned for another 3000 epochs. To get a good generative model, one needs to be patient :)

In addition, we have provided some training logs against which you can compare your own experiments. See the directory results.


After training is done, it is time to get all those SOTA numbers in Table 1 of the paper.

  1. In, change the option 'action' to 1. Also make sure 'from_path' points to the directory that contains model_params_e*.pkl and model_configs.pkl. The option 'epoch' specifies which of the saved models to use.
  2. Then python
  3. If all goes well, the evaluation script should be able to produce numbers that match those in the paper.
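
Before running the evaluation, it can help to check that 'from_path' really contains the expected files and to see which values of 'epoch' are available. The sketch below is not part of the package. It assumes the wildcard in model_params_e*.pkl is an integer epoch number, and the helper names available_epochs and check_from_path are hypothetical.

```python
import glob
import os
import re

def available_epochs(from_path):
    """Return the sorted epoch numbers found in model_params_e*.pkl file names."""
    pattern = os.path.join(from_path, "model_params_e*.pkl")
    epochs = []
    for path in glob.glob(pattern):
        match = re.match(r"model_params_e(\d+)\.pkl$", os.path.basename(path))
        if match:  # skip files whose wildcard part is not a plain integer
            epochs.append(int(match.group(1)))
    return sorted(epochs)

def check_from_path(from_path):
    """Return (has_configs, epochs) for a candidate 'from_path' directory."""
    has_configs = os.path.isfile(os.path.join(from_path, "model_configs.pkl"))
    return has_configs, available_epochs(from_path)
```

If has_configs is False or the epoch list is empty, 'from_path' probably points at the wrong directory.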

IMPORTANT: You may be surprised to see better numbers than those reported in our paper. This is expected: the longer you train the model, the more likely you are to get better numbers. And do share your joy with us when this happens.

Benchmarks with this package

NADE-5 1H model:
testset LL over 10 orderings = -89.43
testset LL over 128 ensembles = -85.77
These numbers are better than those reported in the paper because the model was trained much longer here.

NADE-5 2H model:
testset LL over 10 orderings = -87.13
testset LL over 128 ensembles = -84.65
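
The two kinds of numbers above differ in how the per-ordering results are combined. The sketch below shows one common reading, stated as an assumption rather than a description of this package's evaluation script: "LL over orderings" is taken as the plain average of per-ordering log-likelihoods, and "LL over ensembles" as the log of the averaged probabilities (a log-mean-exp). The function names and input values are made up for illustration.

```python
import math

def mean_ll(lls):
    """Plain average of per-ordering log-likelihoods."""
    return sum(lls) / len(lls)

def ensemble_ll(lls):
    """log((1/K) * sum_k exp(ll_k)), computed stably by factoring out the max."""
    m = max(lls)
    return m + math.log(sum(math.exp(ll - m) for ll in lls) / len(lls))

# Illustrative per-ordering log-likelihoods for one test example (not real results).
example = [-90.0, -88.0, -86.5]
print(mean_ll(example), ensemble_ll(example))
```

Under this reading the ensemble value is never lower than the plain average (Jensen's inequality), which would be consistent with the ensemble numbers above being better than the per-ordering ones.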


Need a trained model?
Contact us:


An iterative neural autoregressive distribution estimator (NADE-K)


