
Gelato

Bayesian dessert for Lasagne

About

Recent results in Bayesian statistics for constructing robust neural networks have shown that this is one of the best ways to deal with uncertainty and overfitting while still achieving good performance. Gelato helps you apply Bayesian methods to neural networks. The library relies heavily on Theano, Lasagne and PyMC3.

Installation

git clone https://github.com/ferrine/gelato
cd gelato
pip install -r requirements.txt
pip install .

Usage

I use a generic approach for decorating all of Lasagne at once. Thus, to use Gelato you only need to replace the import statements for layers. A network must be constructed inside a pm.Model context.
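
As a minimal sketch of that idea (layer arguments chosen for illustration, using whatever default weight priors Gelato provides), the only change compared to plain Lasagne is the import source and the surrounding model context:

import pymc3 as pm
from gelato.layers import InputLayer, DenseLayer  # instead of lasagne.layers

with pm.Model() as model:
    # layers are declared exactly as in Lasagne; inside the model context
    # their weights become PyMC3 random variables
    inp = InputLayer((None, 10))
    out = DenseLayer(inp, 1)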

Warning

  • The gelato.layers.helper module is not equivalent to lasagne.layers.helper; it declares only the get_output function.
  • lasagne.layers.noise and lasagne.layers.normalization are not supported yet.

import theano
import pymc3 as pm
import numpy as np
import lasagne.nonlinearities as nonlinearities
from lasagne import updates
from gelato.layers import DenseLayer, InputLayer
from gelato.variational.elbo import sample_elbo
from gelato.layers.helper import get_output
from gelato.spec import NormalSpec, LognormalSpec
from gelato.random import get_rng


def generate_data(intercept, slope, sd=.2, size=700):
    x = np.linspace(-10, 10, size)
    y = intercept + x * slope
    return x, y + get_rng().normal(size=size, scale=sd)

intercept = 1
slope = 3
sd = .1
x, y = generate_data(intercept, slope, sd=sd)
x = np.matrix(x).T
y = np.matrix(y).T
input_var = theano.shared(x)

with pm.Model() as model:
    inp = InputLayer(x.shape, input_var=input_var)
    # hierarchical prior on W
    out = DenseLayer(inp, 1, W=NormalSpec(sd=LognormalSpec()), nonlinearity=nonlinearities.identity)
    pm.Normal('y', mu=get_output(out),
              sd=sd,
              observed=y)

elbos, upd_rng, vp = sample_elbo(model, samples=1)

# elbos - array of sampled elbos
# upd_rng - updates for random streams used for sampling elbo
# vp - tuple with the variational replacements dict and the shared parameters used for the approximation
## vp.mapping - variational replacements
## vp.shared - tuple with dicts (means, rhos)
### vp.shared.means {varname: shared mean}
### vp.shared.rhos {varname: shared rho}
# vp.params - handy property to get both means and rhos

upd_adam = updates.adam(-elbos.mean(), vp.params)
upd_rng.update(upd_adam)
step = theano.function([], elbos.mean(), updates=upd_rng)
try:
    while True:
        step()
except KeyboardInterrupt:
    pass

stochastic_preds = get_output(out, vp=vp)
deterministic_preds = get_output(out, vp=vp, deterministic=True)
# if you don't pass `vp` to `get_output`, you get the output without variational replacements in the graph

Life Hack

Any spec class can be used standalone, so feel free to use it anywhere (e.g. in Keras).
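
For instance, a hypothetical sketch of standalone use; the spec(shape, name) call signature and the (100, 10) shape are assumptions for illustration, not documented API:

import pymc3 as pm
from gelato.spec import NormalSpec, LognormalSpec

# hierarchical prior specification, independent of any layer
spec = NormalSpec(sd=LognormalSpec())

with pm.Model() as model:
    # assumed usage: calling the spec with a shape and a name
    # creates the corresponding PyMC3 random variable in the model
    w = spec((100, 10), 'w')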

References

Charles Blundell et al: "Weight Uncertainty in Neural Networks" (arXiv preprint arXiv:1505.05424)