Bridge PyMC3 and Lasagne #693

Open
twiecki opened this Issue Jun 6, 2016 · 23 comments

5 participants

twiecki commented Jun 6, 2016

I've been working a bit on Bayesian neural networks in PyMC3 (here is the blog post: http://twiecki.github.io/blog/2016/06/01/bayesian-deep-learning/). I wonder if it would be possible to use Lasagne to construct the layers. The only requirement would be to place priors (Theano expressions) on the weight parameters and feed the output into a likelihood function. I'm not sure whether, or how easily, Lasagne could be adapted that way.

f0k (Member) commented Jun 6, 2016

Judging from the first code snippet in that blog post, as you just pass weights_in_1 to T.dot, you can just as well pass it for the weights when constructing a Lasagne layer (DenseLayer(..., W=weights_in_1)). To obtain the output expression to pass to pm.Bernoulli, you'd then call lasagne.layers.get_output(...) for your output layer. As far as I see, there's nothing you'd need to change!
We'd welcome a PR to Lasagne/Recipes if you have a working example.
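
A rough end-to-end sketch of this suggestion, reusing the placeholder names from the blog post (X_train, ann_input, ann_output, n_hidden) and assuming a Lasagne version that accepts Theano expressions as parameters:

    import numpy as np
    import pymc3 as pm
    import lasagne

    with pm.Model() as neural_network:
        # Priors on the weights are plain Theano expressions...
        weights_in_1 = pm.Normal('w_in_1', mu=0, sd=1,
                                 shape=(X_train.shape[1], n_hidden),
                                 testval=np.random.randn(X_train.shape[1], n_hidden))
        weights_1_out = pm.Normal('w_1_out', mu=0, sd=1,
                                  shape=(n_hidden, 1),
                                  testval=np.random.randn(n_hidden, 1))

        # ...so they can be passed directly as W when building the layers
        l_in = lasagne.layers.InputLayer(X_train.shape, input_var=ann_input)
        l_hid = lasagne.layers.DenseLayer(l_in, n_hidden, W=weights_in_1, b=None,
                                          nonlinearity=lasagne.nonlinearities.tanh)
        l_out = lasagne.layers.DenseLayer(l_hid, 1, W=weights_1_out, b=None,
                                          nonlinearity=lasagne.nonlinearities.sigmoid)

        # The symbolic output expression feeds straight into the likelihood
        act_out = lasagne.layers.get_output(l_out).flatten()
        out = pm.Bernoulli('out', act_out, observed=ann_output)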

twiecki (Author) commented Jun 6, 2016

@f0k Perfect, thanks for the quick response. I'll give that a try and report back.

twiecki (Author) commented Jun 6, 2016

Lasagne does not seem to like Theano expressions:

    act_in = lasagne.layers.InputLayer(X_train.shape, input_var=ann_input)
    act_1 = lasagne.layers.DenseLayer(act_in, n_hidden, W=weights_in_1, nonlinearity=lasagne.nonlinearities.tanh)
    act_2 = lasagne.layers.DenseLayer(act_1, n_hidden, W=weights_1_2, nonlinearity=lasagne.nonlinearities.tanh)
    act_out = lasagne.layers.DenseLayer(act_2, n_hidden, W=weights_2_out, nonlinearity=lasagne.nonlinearities.sigmoid)
    net_out = lasagne.layers.get_output(act_out)

RuntimeError                              Traceback (most recent call last)
<ipython-input-17-519be30e2063> in <module>()
     32     # Build neural-network using tanh activation function
     33     act_in = lasagne.layers.InputLayer(X_train.shape, input_var=ann_input)
---> 34     act_1 = lasagne.layers.DenseLayer(act_in, n_hidden, W=weights_in_1, nonlinearity=lasagne.nonlinearities.tanh)
     35     act_2 = lasagne.layers.DenseLayer(act_1, n_hidden, W=weights_1_2, nonlinearity=lasagne.nonlinearities.tanh)
     36     act_out = lasagne.layers.DenseLayer(act_2, n_hidden, W=weights_2_out, nonlinearity=lasagne.nonlinearities.sigmoid)

/home/wiecki/miniconda3/lib/python3.5/site-packages/lasagne/layers/dense.py in __init__(self, incoming, num_units, W, b, nonlinearity, **kwargs)
     70         num_inputs = int(np.prod(self.input_shape[1:]))
     71 
---> 72         self.W = self.add_param(W, (num_inputs, num_units), name="W")
     73         if b is None:
     74             self.b = None

/home/wiecki/miniconda3/lib/python3.5/site-packages/lasagne/layers/base.py in add_param(self, spec, shape, name, **tags)
    211                 name = "%s.%s" % (self.name, name)
    212 
--> 213         param = utils.create_param(spec, shape, name)
    214         # parameters should be trainable and regularizable by default
    215         tags['trainable'] = tags.get('trainable', True)

/home/wiecki/miniconda3/lib/python3.5/site-packages/lasagne/utils.py in create_param(spec, shape, name)
    310 
    311     else:
--> 312         raise RuntimeError("cannot initialize parameters: 'spec' is not "
    313                            "a numpy array, a Theano shared variable, or a "
    314                            "callable")

RuntimeError: cannot initialize parameters: 'spec' is not a numpy array, a Theano shared variable, or a callable

Any idea?

benanne (Member) commented Jun 6, 2016

Are you sure you have the latest version of Lasagne from git? Support for arbitrary Theano expressions as layer parameters was added a while back.
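
For reference, the Lasagne docs give this command for installing the bleeding-edge version from git:

    pip install --upgrade https://github.com/Lasagne/Lasagne/archive/master.zip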

twiecki (Author) commented Jun 7, 2016

benanne (Member) commented Jun 7, 2016

Very nice! We should link this from the Recipes in some way, I think it's a great example of how our design philosophy of transparency enables easy interoperability with other libraries.

twiecki (Author) commented Jun 7, 2016

@benanne Definitely, I didn't expect it to be this simple. I can do a PR with the current notebook, but I'm also experimenting with some extensions (e.g. the MNIST model) that might be a bit more interesting.

f0k (Member) commented Jun 7, 2016

You don't even have to create the weights manually if you define a suitable weight initialization function. Try this:

    with pm.Model() as neural_network:
        gauss_weights = lambda shape: pm.Normal('w', 0, sd=1, shape=shape, testval=np.random.randn(*shape))
        ... = DenseLayer(..., W=gauss_weights, ...)

This will give the same name to all pm.Normal instances, though; not sure if that's a problem?

/edit: Also take care with the biases: the layers will still have their ordinary (non-Bayesian) biases, which may not be intended. Pass b=None to disable them, or define another custom initialization function.

twiecki (Author) commented Jun 7, 2016

Great, @f0k, I was wondering about that. The name is an issue, but maybe it can be worked around somehow.

f0k (Member) commented Jun 7, 2016

If you just need different names, you can also make it a class:

    class GaussWeights(object):
        def __init__(self):
            self.count = 0
        def __call__(self, shape):
            self.count += 1
            return pm.Normal('w%d' % self.count, ...)

Then instantiate it once and pass it to all your layers:

    init = GaussWeights()
    ..., W=init, ...

twiecki (Author) commented Jun 8, 2016

Thanks, I tried that but get an error. Model:

    class GaussWeights(object):
        def __init__(self):
            self.count = 0
        def __call__(self, shape):
            self.count += 1
            return pm.Normal('w%d' % self.count, mu=0, sd=1, shape=shape)

    init = GaussWeights()

    with pm.Model() as neural_network:
        l_in = lasagne.layers.InputLayer(shape=(None, 1, 28, 28),
                                         input_var=input_var)

        # Apply 20% dropout to the input data:
        l_in_drop = lasagne.layers.DropoutLayer(l_in, p=0.2)

        # Add a fully-connected layer of 800 units with a tanh nonlinearity,
        # drawing the weights from the Gaussian prior (no biases):
        n_hid1 = 800
        l_hid1 = lasagne.layers.DenseLayer(
            l_in_drop, num_units=n_hid1,
            nonlinearity=lasagne.nonlinearities.tanh,
            b=None,
            W=init
        )

        # We'll now add dropout of 50%:
        l_hid1_drop = lasagne.layers.DropoutLayer(l_hid1, p=0.5)

        n_hid2 = 800
        # Another 800-unit layer:
        l_hid2 = lasagne.layers.DenseLayer(
            l_hid1_drop, num_units=n_hid2,
            nonlinearity=lasagne.nonlinearities.tanh,
            b=None,
            W=init
        )

        # 50% dropout again:
        l_hid2_drop = lasagne.layers.DropoutLayer(l_hid2, p=0.5)

        # Finally, add the fully-connected output layer of 10 softmax units:
        l_out = lasagne.layers.DenseLayer(
            l_hid2_drop, num_units=10,
            nonlinearity=lasagne.nonlinearities.softmax,
            b=None,
            W=init
        )

        prediction = lasagne.layers.get_output(l_out)

        out = pm.Categorical('out',
                             prediction,
                             observed=target_var)

Error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/wiecki/miniconda3/lib/python3.5/site-packages/lasagne/utils.py in create_param(spec, shape, name)
    352         try:
--> 353             arr = floatX(arr)
    354         except Exception:

/home/wiecki/miniconda3/lib/python3.5/site-packages/lasagne/utils.py in floatX(arr)
     20     """
---> 21     return np.asarray(arr, dtype=theano.config.floatX)
     22 

/home/wiecki/miniconda3/lib/python3.5/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    473     """
--> 474     return array(a, dtype, copy=False, order=order)
    475 

ValueError: setting an array element with a sequence.

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
<ipython-input-53-0c2e93ec8bcc> in <module>()
     22         nonlinearity=lasagne.nonlinearities.tanh,
     23         b=None,
---> 24         W=init
     25     )
     26 

/home/wiecki/miniconda3/lib/python3.5/site-packages/lasagne/layers/dense.py in __init__(self, incoming, num_units, W, b, nonlinearity, **kwargs)
     69         num_inputs = int(np.prod(self.input_shape[1:]))
     70 
---> 71         self.W = self.add_param(W, (num_inputs, num_units), name="W")
     72         if b is None:
     73             self.b = None

/home/wiecki/miniconda3/lib/python3.5/site-packages/lasagne/layers/base.py in add_param(self, spec, shape, name, **tags)
    232                 name = "%s.%s" % (self.name, name)
    233         # create shared variable, or pass through given variable/expression
--> 234         param = utils.create_param(spec, shape, name)
    235         # parameters should be trainable and regularizable by default
    236         tags['trainable'] = tags.get('trainable', True)

/home/wiecki/miniconda3/lib/python3.5/site-packages/lasagne/utils.py in create_param(spec, shape, name)
    353             arr = floatX(arr)
    354         except Exception:
--> 355             raise RuntimeError("cannot initialize parameters: the "
    356                                "provided callable did not return an "
    357                                "array-like value")

RuntimeError: cannot initialize parameters: the provided callable did not return an array-like value

The object seems to return the correct random variable when called.

benanne (Member) commented Jun 8, 2016

Lasagne assumes that when a callable is passed as an initializer, it will return an array of the right shape (i.e. numerical), not a symbolic variable or expression. If you want to use an existing variable / expression, you have to pass it directly.

Perhaps this is an artificial limitation though. We should look into whether we can support having the callable return anything that constitutes a valid initializer.
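
To illustrate the distinction, a minimal sketch (inside the pm.Model context, reusing l_in and init from the model above; DenseLayer flattens the 28 * 28 input to 784):

    # Works: W receives an existing Theano expression directly
    w1 = pm.Normal('w1', mu=0, sd=1, shape=(28 * 28, 800))
    l_hid1 = lasagne.layers.DenseLayer(l_in, num_units=800, W=w1, b=None,
                                       nonlinearity=lasagne.nonlinearities.tanh)

    # Fails: Lasagne calls init((28 * 28, 800)) and expects a numerical
    # array back, but the callable returns a symbolic pm.Normal instead
    l_hid1 = lasagne.layers.DenseLayer(l_in, num_units=800, W=init, b=None,
                                       nonlinearity=lasagne.nonlinearities.tanh)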

twiecki (Author) commented Jun 8, 2016

From a user perspective, that difference is certainly a bit surprising. Do let me know if you plan to change it.

benanne (Member) commented Jun 8, 2016

https://github.com/Lasagne/Lasagne/blob/master/lasagne/utils.py#L350-L362 is the part that would need to change. It seems fairly straightforward, actually :) Maybe someone is willing to submit a PR!
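
For concreteness, a rough sketch of the kind of change meant here (a simplified stand-in for lasagne.utils.create_param, not the actual code; the real function also handles shared variables and additional shape checks):

    import numpy as np
    import theano

    def create_param(spec, shape, name=None):
        # If a callable was given, call it with the requested shape first
        if callable(spec):
            spec = spec(shape)
        # New behaviour: let a callable return a Theano expression
        # instead of only an array-like value
        if isinstance(spec, theano.Variable):
            return spec
        # Otherwise coerce to an array and wrap it in a shared variable,
        # as before
        arr = np.asarray(spec, dtype=theano.config.floatX)
        if arr.shape != tuple(shape):
            raise ValueError("parameter shape %r does not match %r"
                             % (arr.shape, tuple(shape)))
        return theano.shared(arr, name=name)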

erfannoury (Contributor) commented Jun 9, 2016

Would something like this work? create_param.py

benanne (Member) commented Jun 9, 2016

Probably :) If you're okay with writing some tests for it and updating the docs, feel free to submit it as a PR.

erfannoury (Contributor) commented Jun 9, 2016

Yes, OK, I'm working on it.

erfannoury (Contributor) commented Jun 9, 2016

See PR #695.

f0k (Member) commented Jun 14, 2016

> From a user perspective, that difference is certainly a bit surprising.

So much so that it even surprised me!

twiecki (Author) commented Jul 5, 2016

I wrote a blog post about this: http://twiecki.github.io/blog/2016/07/05/bayesian-deep-learning/

Not sure whether it could somehow be used as an example for Lasagne?

benanne (Member) commented Jul 5, 2016

Very cool, thanks! We should link it somewhere at the very least.

TSHTUM007 commented Oct 8, 2018

Hey, sorry if I'm late to the party, but I've been trying some tutorials on PyMC3 ANNs bridged with Lasagne, and I get the same error: "RuntimeError: cannot initialize parameters: the provided callable did not return an array-like value". How do I get around this? Thank you in advance.

twiecki (Author) commented Oct 10, 2018

@TSHTUM007 Are all your packages up to date? Are you running the notebook I wrote?
