Official implementation of "On GANs and GMMs"

On GANs and GMMs

Implementation of NDB and MFA from the NeurIPS 2018 paper "On GANs and GMMs" by Eitan Richardson and Yair Weiss.

NDB: An evaluation method for high-dimensional generative models

MFA: Mixture of Factor Analyzers for modeling high-dimensional data (e.g. full images)


  • NDB code cleanup
  • MNIST demo for NDB
  • MFA model cleanup
  • Inference code
  • Sharpness measure code


  • Python 3.x, NumPy, SciPy, scikit-learn, TensorFlow

Demo Scripts

  • A small stand-alone demo for NDB evaluation using MNIST (compare train, val, test and simulated biased model)
  • mfa_train_mnist/ Training an MFA model for MNIST/CelebA
  • mfa_eval_mnist/ Evaluating the trained MFA model

Running the Standalone NDB MNIST Demo

The resulting binning histogram, and NDB (Number of statistically Different Bins) and JS (Jensen-Shannon Divergence) values:

NDB evaluation on this toy example reveals that:

  • A random validation split from the train data is statistically similar to the train data (NDB = 0, JS divergence = 0.0014)
  • The MNIST test set does not come from exactly the same distribution (different writers), but it is close (NDB = 4, JS divergence = 0.0041)
  • NDB detects the distribution distortion in a deliberately biased set created by removing all digit 9 samples from the validation set (NDB = 11, JS divergence = 0.016)

The plot shows the NDB bin centers and the statistically different bins (significance threshold of 3 standard errors) for the simulated Val0-8 evaluated set. All bins corresponding to digit 9 are detected as statistically different (marked in red).
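To make the NDB and JS numbers above more concrete, here is a minimal NumPy sketch of the idea, not the repository's code. It assumes the bin centers are already given (in the paper they come from K-means clustering of the training samples), assigns every sample to its nearest center, runs a pooled two-proportion z-test per bin, and computes the JS divergence between the two bin histograms. The function names and the `z_threshold` parameter are hypothetical.

```python
import numpy as np

def assign_to_bins(samples, centers):
    """Index of the nearest bin center (Euclidean) for every sample."""
    # Squared distances via ||x||^2 - 2 x.c + ||c||^2, shape (N, K)
    d2 = (
        (samples ** 2).sum(axis=1, keepdims=True)
        - 2.0 * samples @ centers.T
        + (centers ** 2).sum(axis=1)
    )
    return d2.argmin(axis=1)

def bin_proportions(samples, centers):
    """Fraction of the samples that falls into each bin."""
    counts = np.bincount(assign_to_bins(samples, centers), minlength=len(centers))
    return counts / len(samples)

def ndb_and_js(train, evaluated, centers, z_threshold=3.0):
    """Return (NDB, JS divergence, per-bin 'different' flags)."""
    n_p, n_q = len(train), len(evaluated)
    p = bin_proportions(train, centers)
    q = bin_proportions(evaluated, centers)

    # Two-proportion z-test per bin, using the pooled proportion
    pooled = (p * n_p + q * n_q) / (n_p + n_q)
    se = np.sqrt(pooled * (1.0 - pooled) * (1.0 / n_p + 1.0 / n_q))
    z = np.divide(np.abs(p - q), se, out=np.zeros_like(se), where=se > 0)
    different = z > z_threshold

    # Jensen-Shannon divergence between the two bin histograms (natural log)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))

    js = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return int(different.sum()), float(js), different
```

With flattened images as rows of `train` and `evaluated`, a call such as `ndb_and_js(train, evaluated, centers)` returns the count of statistically different bins, the JS divergence, and a boolean mask that could be used to mark bins in red as in the plot described above.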

The NDB test can be used to evaluate different generative models (e.g. GANs). In our paper, we demonstrate its performance on three datasets: CelebA, SVHN and MNIST.

Training MFA on CelebA

First, download the CelebA dataset into some folder (the default location is ../../Datasets/CelebA/). The CelebA folder should contain the subfolder img_align_celeba (containing all aligned and cropped images) and the text file list_eval_partition.txt.

Run python3 ...

By default, the script learns a Mixture of Factor Analyzers with 200 components and then splits each component into additional sub-components. While running, the code generates some plots that are also saved to the output folder (sorry, no TensorBoard...):

The first few components, showing each component mean, noise std and column-vectors of the scale matrix:

Random samples drawn from each component separately:

Random samples from the mixture:

The test log-likelihood progress:
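The component parameters listed above (mean, noise std and scale matrix) define how samples are drawn: a sample from component k is the mean plus the scale matrix times a low-dimensional standard normal vector, plus per-pixel Gaussian noise. The following NumPy sketch illustrates this standard MFA generative process. It is not the repository's sampling code, and the function name, argument names and array layout are assumptions made for the example.

```python
import numpy as np

def sample_mfa(pi, mu, A, D, num_samples, rng):
    """Draw samples from a Mixture of Factor Analyzers.

    pi: (K,)      mixing weights (sum to 1)
    mu: (K, d)    component means
    A:  (K, d, l) scale (factor loading) matrices
    D:  (K, d)    per-dimension noise std
    """
    K, d, l = A.shape
    # Pick a component for every sample according to the mixing weights
    comps = rng.choice(K, size=num_samples, p=pi)
    # Low-dimensional latent vectors and full-dimensional noise
    z = rng.standard_normal((num_samples, l))
    eps = rng.standard_normal((num_samples, d))
    # x = mu_k + A_k z + D_k * eps
    x = mu[comps] + np.einsum("ndl,nl->nd", A[comps], z) + D[comps] * eps
    return x, comps
```

Sampling from a single component, as in the per-component plots, corresponds to passing a one-hot `pi`.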

Sub-components training

After training the main (root) components, the script continues to train sub-components (training is done hierarchically due to GPU memory limitations).
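One way to organize this kind of hierarchical training is to give each root component its own subset of the training data and fit the sub-components on that subset only, so that only one small mixture is in memory at a time. The sketch below assumes a hard assignment of every sample to its most likely root component; the README does not state how the repository splits the data, so treat the function and its input as hypothetical.

```python
import numpy as np

def split_by_root_component(component_log_likelihoods):
    """Group sample indices by their most likely root component.

    component_log_likelihoods: (N, K) array holding, for each sample n and
    root component k, log(pi_k) + log p(x_n | component k).
    Returns a list of K index arrays, one per root component.
    """
    assignment = component_log_likelihoods.argmax(axis=1)
    num_components = component_log_likelihoods.shape[1]
    return [np.flatnonzero(assignment == k) for k in range(num_components)]
```

Each returned index array could then be used to select the samples on which the sub-components of that root component are trained.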

Below are examples of images generated for the sub-components of one of the root components:

Component directions:

Random samples drawn from each sub-component separately:

Random samples from the mixture:

The test log-likelihood progress:

The MFA Implementation is a script that runs the training process.

The utils directory contains the interesting stuff:

  • is a python class implementing the CPU version of MFA (inference-only - no training)
  • contains TensorFlow implementation of relevant MFA functionality, for example:
    • get_log_likelihood(): calculates the log-likelihood of a batch of samples given the MFA model parameters.
  • The actual training code in TensorFlow is in The two most important lines are:
    G_loss = -1.0 * mfa_tf.get_log_likelihood(X, *theta_G)
    G_solver = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(G_loss, var_list=theta_G)
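The loss in the two lines above is the negative log-likelihood of the batch under the MFA model. As a reference for what such a function has to compute, here is a NumPy sketch (not the repository's TensorFlow code) of the per-sample MFA log-likelihood. Each component has covariance A Aᵀ + diag(D²), so the sketch uses the Woodbury identity and the matrix determinant lemma to avoid forming the full d × d covariance. Function names, argument names and array shapes are assumptions for the example.

```python
import numpy as np

def component_log_likelihood(X, mu, A, D):
    """log N(x; mu, A A^T + diag(D^2)) for every row x of X.

    X: (N, d), mu: (d,), A: (d, l) scale matrix, D: (d,) noise std.
    """
    d, l = A.shape
    inv_var = 1.0 / D ** 2                      # diagonal of D^-2
    Xc = X - mu
    L = np.eye(l) + (A.T * inv_var) @ A         # I + A^T D^-2 A, shape (l, l)
    proj = (Xc * inv_var) @ A                   # Xc D^-2 A, shape (N, l)
    # Woodbury: Sigma^-1 = D^-2 - D^-2 A L^-1 A^T D^-2
    maha = np.sum(Xc ** 2 * inv_var, axis=1) - np.sum(
        proj * np.linalg.solve(L, proj.T).T, axis=1
    )
    # Determinant lemma: log|Sigma| = log|L| + sum(log D^2)
    logdet = np.linalg.slogdet(L)[1] + np.sum(np.log(D ** 2))
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def mfa_log_likelihood(X, pi, mu, A, D):
    """Per-sample log-likelihood under the mixture (log-sum-exp over components).

    pi: (K,), mu: (K, d), A: (K, d, l), D: (K, d).
    """
    comp = np.stack(
        [np.log(pi[k]) + component_log_likelihood(X, mu[k], A[k], D[k])
         for k in range(len(pi))],
        axis=1,
    )
    m = comp.max(axis=1, keepdims=True)         # stabilize the exponentials
    return m[:, 0] + np.log(np.exp(comp - m).sum(axis=1))
```

Summing or averaging the returned values over a batch and negating the result gives a loss of the same form as `G_loss`.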

Training MFA on your own dataset

Most of the MFA code should be generic enough to support different datasets. Each data sample is flattened and represented as a (row) vector of floats. The interface to specific datasets is The current implementation supports CelebA, MNIST and SVHN. There are two options to support other datasets:

  1. Modify A small dataset can be preloaded like MNIST. Large datasets should be treated like CelebA, where only an image list is preloaded.
  2. Implement your own version of a data provider, exposing a similar API to (i.e. m_train_images, num_test_images, get_next_minibatch_samples(), get_test_samples(test_size))
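For the second option, a custom provider for a small dataset that fits in memory might look like the sketch below. The class name, constructor, the `batch_size` argument and the exact attribute names are assumptions modeled on the list above, not the repository's actual interface, so check them against the existing provider before plugging this in. Samples are flattened to float row vectors, as described earlier in this section.

```python
import numpy as np

class ArrayDataProvider:
    """Hypothetical in-memory data provider for a small, preloaded dataset."""

    def __init__(self, train_images, test_images, seed=0):
        # Flatten every sample into a float32 row vector
        self.train_images = np.asarray(train_images, dtype=np.float32).reshape(
            len(train_images), -1
        )
        self.test_images = np.asarray(test_images, dtype=np.float32).reshape(
            len(test_images), -1
        )
        self.num_train_images = len(self.train_images)
        self.num_test_images = len(self.test_images)
        self._rng = np.random.default_rng(seed)

    def get_next_minibatch_samples(self, batch_size=100):
        """Return a random minibatch of training samples, shape (batch_size, d)."""
        size = min(batch_size, self.num_train_images)
        idx = self._rng.choice(self.num_train_images, size=size, replace=False)
        return self.train_images[idx]

    def get_test_samples(self, test_size):
        """Return the first test_size test samples, shape (test_size, d)."""
        return self.test_images[:test_size]
```

A large dataset would instead keep only a list of image paths in memory and load the images for each minibatch on demand, similar to how CelebA is handled.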