Sampling from user provided target densities given as Python functions #804

ingmarschuster opened this Issue Aug 19, 2015 · 5 comments

ingmarschuster commented Aug 19, 2015

Given black-box functions logposterior(theta) and grad_logposterior(theta), I want to be able to sample from the distribution whose unnormalized log density is logposterior(theta), instead of specifying a model in PyMC3's modeling language (follow-up to this Stack Overflow question and the comment by @twiecki).

I've already written code to wrap logposterior and grad_logposterior in a Theano Op which provides the grad operation, but I'm not sure how you want to integrate that, so I'm leaving it to you (also, thesis writing currently makes it impossible for me to dig deeper into Theano's/PyMC3's software architecture).

So here is the code for the Theano Op (I hope my limited knowledge of Theano did not lead to serious errors):

import numpy as np
import theano
import pymc3


class GradLogpOp(theano.Op):
    """Op wrapping a black-box gradient of the log density."""
    __props__ = ("input_shape", "grad_logp")

    def __init__(self, input_shape, grad_logp):
        self.input_shape = input_shape
        self.grad_logp = grad_logp

    def infer_shape(self, node, shapes):
        # The gradient has the same shape as the input.
        return [self.input_shape]

    def make_node(self, x, logp_x, output_grad):
        # logp_x and output_grad are accepted so this Op matches the call
        # signature used in LogpOp.grad, but only x is needed here.
        x = theano.tensor.as_tensor_variable(x)
        return theano.Apply(self, [x], [x.type()])

    def perform(self, node, inputs, outputs):
        outputs[0][0] = np.asarray(self.grad_logp(inputs[0]))


class LogpOp(theano.Op):
    """Op wrapping a black-box log density with a black-box gradient."""
    __props__ = ("logp", "grad_logp_op")

    def __init__(self, input_shape, logp, grad_logp):
        self.logp = logp
        self.grad_logp_op = GradLogpOp(input_shape, grad_logp)

    def infer_shape(self, node, shapes):
        # Scalar output.
        return [()]

    def make_node(self, x):
        x = theano.tensor.as_tensor_variable(x)
        # Scalar output of the same dtype as the input.
        return theano.Apply(self, [x], [theano.tensor.TensorType(x.dtype, ())()])

    def perform(self, node, inputs, outputs):
        outputs[0][0] = np.asarray(self.logp(inputs[0]))

    def grad(self, inputs, gradients):
        # Note: the upstream gradient gradients[0] is passed along but not
        # applied inside GradLogpOp; for a top-level logp it is 1.0 anyway.
        return [self.grad_logp_op(inputs[0], self(inputs[0]), gradients[0])]


jsalvatier commented Aug 19, 2015

Yup, that looks roughly right to me. There are probably things you can simplify, but I think this should work. If you're having trouble, let us know.

ingmarschuster commented Aug 20, 2015

Ah, I'm glad. As said, it would be great if this could be integrated into the codebase, maybe via DensityDist.
Also, I'm not sure how one would sample from it, as the observed=... parameter doesn't make sense for this distribution. But I guess there is a way around using observed=... that's just not in the docs, or that I didn't see?

jsalvatier commented Aug 20, 2015

I think you actually want to use Potential here rather than DensityDist.
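
Roughly, a minimal sketch of what that could look like with the LogpOp above (untested; the density and names are just stand-ins): give the free variable a flat prior and add the black-box term with Potential.

import numpy as np
import pymc3

# Placeholder black-box density and gradient (standard normal up to a constant).
def logposterior(theta):
    return -0.5 * np.sum(theta ** 2)

def grad_logposterior(theta):
    return -theta

logp_op = LogpOp((2,), logposterior, grad_logposterior)

with pymc3.Model() as model:
    theta = pymc3.Flat('theta', shape=2)           # free variable with a flat prior
    pymc3.Potential('target', logp_op(theta))      # adds the black-box term to the model logp
    step = pymc3.Metropolis()                      # any step method; Metropolis keeps the sketch gradient-free
    trace = pymc3.sample(1000, step)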

ingmarschuster commented Jan 14, 2016

So, I had another go at it and came up with the following:

import numpy as np
import pymc3
import theano
import theano.tensor as T


class GradLogpOp(theano.Op):
    __props__ = ("input_shape", "grad_logp")

    def __init__(self, input_shape, grad_logp):
        self.input_shape = input_shape
        self.grad_logp = grad_logp

    def infer_shape(self, node, shapes):
        return [self.input_shape]

    def make_node(self, x, logp_x, output_grad):
        # logp_x and output_grad match the call in LogpOp.grad; only x is used.
        x = theano.tensor.as_tensor_variable(x)
        return theano.Apply(self, [x], [x.type()])

    def perform(self, node, inputs, outputs):
        outputs[0][0] = np.asarray(self.grad_logp(inputs[0]))


class LogpOp(theano.Op):
    __props__ = ("logp", "grad_logp_op")

    def __init__(self, input_shape, logp, grad_logp):
        self.logp = logp
        self.grad_logp_op = GradLogpOp(input_shape, grad_logp)

    def infer_shape(self, node, shapes):
        return [()]

    def make_node(self, x):
        x = theano.tensor.as_tensor_variable(x)
        # Scalar output of the same dtype as the input.
        return theano.Apply(self, [x], [theano.tensor.TensorType(x.dtype, ())()])

    def perform(self, node, inputs, outputs):
        outputs[0][0] = np.asarray(self.logp(inputs[0]))

    def grad(self, inputs, gradients):
        return [self.grad_logp_op(inputs[0], self(inputs[0]), gradients[0])]



class BlackBoxContinuous(pymc3.Continuous):
    """
    Black-box continuous distribution which only calls an underlying log density function.

    default_value  -  default value (and initial mean/mode) for this distribution
    f              -  function returning the log density of a parameter value
    gradf          -  function returning the gradient of the log density
    """
    def __init__(self, default_value, f, gradf, *args, **kwargs):
        kwargs['shape'] = default_value.shape
        super(BlackBoxContinuous, self).__init__(*args, **kwargs)
        self.mean = self.mode = default_value
        self.logp_op = LogpOp(default_value.shape, f, gradf)

    def logp(self, value):
        return self.logp_op(value)


class TheanoExprContinuous(pymc3.Continuous):
    """
    Continuous distribution given by a function which returns a Theano expression.

    default_value  -  default value (and initial mean/mode) for this distribution
    logpt          -  function of one parameter returning a Theano expression giving the log density of that parameter
    """
    def __init__(self, default_value, logpt, *args, **kwargs):
        kwargs['shape'] = default_value.shape
        super(TheanoExprContinuous, self).__init__(*args, **kwargs)
        self.mean = self.mode = default_value
        self.internal_logpt = logpt

    def logp(self, value):
        return self.internal_logpt(value)


def test_LogpOp_GradLogpOp():
    x = theano.tensor.dvector("x")
    x.tag.test_value = np.random.rand(3)
    # logp(x) = sum(x**2), so its gradient is the vector 2*x.
    fo = LogpOp((3,), lambda x: np.sum(x ** 2), lambda x: 2 * x)
    a = fo(x).eval({x: np.ones(3) * 1.5})
    ga = theano.tensor.grad(fo(x), x)
    print(a.shape, a, ga.eval({x: np.ones(3) * 1.5}))
    for test_val in np.random.rand(30).reshape((10, 3)):
        assert np.allclose(fo(x).eval({x: test_val}), fo.logp(test_val))
        assert np.allclose(ga.eval({x: test_val}), fo.grad_logp_op.grad_logp(test_val))


def test_BlackBoxContinuous():
    model = pymc3.Model()

    with model:
        # Target: logp(x) = -x.x with gradient -2*x.
        res = BlackBoxContinuous('res', np.zeros(2), lambda x: -x.dot(x), lambda x: -2 * x, shape=(2,))
        trace = pymc3.sample(100, pymc3.NUTS(), {'res': np.zeros(2)})

def test_TheanoExprContinuous():
    model = pymc3.Model()

    with model:
        res = TheanoExprContinuous('res', np.zeros(2), lambda x: -x.dot(x), shape=(2,))
        trace = pymc3.sample(10000, pymc3.NUTS(), {'res': np.zeros(2)})

Now you can sample from any Theano expression that allows a second derivative, for sure. However, test_BlackBoxContinuous() throws AttributeError: 'GradLogpOp' object has no attribute 'grad', because there is currently no mechanism for computing the Hessian. The same is true for TheanoExprContinuous if the provided expression does not allow a second derivative (for example, the stochastic volatility model from the NUTS paper does not allow one, or at least Theano doesn't know how to compute it!).

Is there no way to tell PyMC3 not to use a second derivative? For instance, in the NUTS paper they rely on estimating the global target covariance from the burn-in and do not use Hessians.
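
The only workaround I can think of is an untested sketch, and I'm not sure the API supports it: pass an explicit scaling array to NUTS so it never attempts the Hessian-based scaling guess at the test point.

import numpy as np
import pymc3

with pymc3.Model() as model:
    res = BlackBoxContinuous('res', np.zeros(2), lambda x: -x.dot(x), lambda x: -2 * x, shape=(2,))
    # Explicit diagonal scaling; ideally this skips the automatic scaling
    # guess that would require second derivatives.
    step = pymc3.NUTS(scaling=np.ones(2))
    trace = pymc3.sample(1000, step, {'res': np.zeros(2)})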

twiecki commented Jan 25, 2016

FYI Theano/Theano#3747

You can now create Theano Ops without having to specify make_node.

Moreover, we can compute the gradient automatically from a numpy function using autograd: https://github.com/HIPS/autograd/

Or, we could directly convert a numpy function to Theano using https://github.com/LowinData/pyautodiff/
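
For example, something roughly like this should now be possible (untested sketch; it assumes a Theano version with the itypes/otypes class attributes from that PR and autograd installed, and the logposterior here is just a placeholder):

import autograd
import autograd.numpy as anp
import numpy as np
import theano
import theano.tensor as tt

def logposterior(theta):
    # Example black-box log density written with autograd's numpy wrapper.
    return -0.5 * anp.sum(theta ** 2)

grad_logposterior = autograd.grad(logposterior)  # gradient obtained automatically

class GradLogpOp(theano.Op):
    # itypes/otypes replace make_node.
    itypes = [tt.dvector]
    otypes = [tt.dvector]

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        outputs[0][0] = np.asarray(grad_logposterior(theta))

class LogpOp(theano.Op):
    itypes = [tt.dvector]
    otypes = [tt.dscalar]

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        outputs[0][0] = np.asarray(logposterior(theta))

    def grad(self, inputs, output_grads):
        # Chain rule: scale the black-box gradient by the upstream gradient.
        return [output_grads[0] * GradLogpOp()(inputs[0])]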
