Deep Neural Networks Entropy from Replicas

dnner - DNNs Entropy from Replicas

The code in this package computes the entropy, mutual information and minimum mean-squared error (MMSE) of multi-layer generalized linear models (GLMs) given orthogonally-invariant matrices of arbitrary spectrum. More details are available in arXiv:1805.09785.

Instructions

Install package

First make sure you have all the requirements installed

  • Python 3.x
  • Cython
  • Numpy
  • Matplotlib
  • Scipy
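
A quick way to check that the packages listed above can be imported is sketched below. It uses only the standard library; the helper name `missing_requirements` and the list `REQUIRED` are illustrative and not part of dnner.

```python
from importlib.util import find_spec

# Import names of the requirements listed above.
REQUIRED = ["Cython", "numpy", "matplotlib", "scipy"]


def missing_requirements(names=REQUIRED):
    """Return the entries of `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_requirements()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

An empty result means all four packages are visible to the current Python interpreter.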

Then type

python setup.py install --user

You can then try the Demo.ipynb notebook and the scripts in the examples folder.

Adding new priors/activations/ensembles

To add new priors, activations or ensembles, write new classes and add them to the dnner/{priors, activations, ensembles} folders. The files already present in these folders serve as examples; the methods in them (iter_a, iter_v, iter_theta, eval_i, ...) should be reimplemented.

For priors, one must implement iter_v (eq. 55 in the paper), eval_i (eq. 34a), and eval_rho (defined between eqs. 32 and 33). For outputs: iter_a (eq. 52) and eval_i (eq. 34b). For interfaces: iter_a (eq. 51), iter_v (eq. 54), eval_i (eq. 33), and eval_rho (see the footnote on page 19). Finally, for ensembles: iter_theta (eq. 50a), iter_llmse (eq. 60), and eval_f (eq. 6).
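
As an illustration of the three prior methods named above, here is a minimal sketch of a zero-mean Gaussian prior. It is not the package's own Normal class: the class name, the constructor and the argument lists are assumptions, and the base class and exact signatures should be copied from the existing files in dnner/priors. The formulas are the standard ones for a scalar Gaussian channel with precision `a`; the package may use a different normalization for eval_i, so check against eq. 34a before relying on it.

```python
import math


class ToyGaussianPrior:
    """Hypothetical zero-mean Gaussian prior (illustration only).

    Method names follow the README; signatures are assumptions.
    """

    def __init__(self, var=1.0):
        self.var = var

    def iter_v(self, a):
        # Posterior variance of x ~ N(0, var) observed through a
        # Gaussian channel of precision a.
        return self.var / (1.0 + a * self.var)

    def eval_i(self, a):
        # Mutual information (in nats) of the same scalar channel.
        return 0.5 * math.log(1.0 + a * self.var)

    def eval_rho(self):
        # Second moment of the prior (equal to var for zero mean).
        return self.var


if __name__ == "__main__":
    prior = ToyGaussianPrior(var=2.0)
    print(prior.iter_v(1.0), prior.eval_i(1.0), prior.eval_rho())
```

At zero precision the posterior variance reduces to the prior variance and the mutual information vanishes, which is a cheap sanity check for any new prior.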

After implementing the new class, you can add it to the __init__.py inside priors/activations/ensembles so that it can be more easily imported.

Currently the following priors are available

  • Normal
  • Bimodal
  • SpikeSlab

as well as the following activations

  • Linear
  • Probit
  • ReLU
  • LeakyReLU
  • HardTanh

and the following ensembles

  • Gaussian
  • Empirical

Troubleshooting

I keep getting warnings throughout the iteration; should I be worried?

Numerical integrations performed for the (leaky) ReLU and the hard tanh are a bit tricky: the integrator might occasionally complain about lack of precision. Despite that, the final result seems in our experience to be consistent. In any case, a deeper study of the integration procedure should be performed at some point.
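
If you would rather collect these warnings than have them printed during a run, one option is Python's standard `warnings` module. The sketch below is generic: `call_and_collect_warnings` is an illustrative helper, and `noisy_step` is a hypothetical stand-in for whatever dnner call emits the warnings, not a function from the package.

```python
import warnings


def call_and_collect_warnings(fn, *args, **kwargs):
    """Call fn and return (result, list of warning messages raised)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        result = fn(*args, **kwargs)
    return result, [str(w.message) for w in caught]


def noisy_step(x):
    # Hypothetical stand-in for one iteration step that warns.
    warnings.warn("integration precision not reached", RuntimeWarning)
    return 0.5 * x


if __name__ == "__main__":
    value, messages = call_and_collect_warnings(noisy_step, 2.0)
    print(value, len(messages))
```

Counting the collected messages per run makes it easier to compare results obtained with and without warnings.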

Do I need to have noise in my activations?

It is essential to have noise in the outermost activation so that quantities do not diverge. Usually one can get by with zero noise in the inner layers; however, if the variables in the model are discrete, noise should also be added there.

Can it happen that the iteration does not converge?

As described in the Supplementary Material of our paper, the fixed-point iteration we use depends on a solution to a particular equation being found at each step, and occasionally this might not happen. In that case, using the ML-VAMP state evolution instead of the fixed-point iteration should lead to better results.

The ML-VAMP state evolution is in general more stable, but on rare occasions variances/precisions might become negative. In our experience, however, one of the two schemes will always work.

References

  • M. Gabrié, A. Manoel, C. Luneau, J. Barbier, N. Macris, F. Krzakala and L. Zdeborová, Entropy and mutual information in models of deep neural networks, arXiv:1805.09785.