Add Interpolated distribution class #2163

Merged: 4 commits merged into pymc-devs:master on May 11, 2017

Conversation

@ghost commented May 9, 2017

This PR, as a follow-up to #2146, adds an Interpolated distribution class that takes an arbitrary one-dimensional probability density function, given as an array of values evaluated at some set of points, and makes it possible to use this distribution as a prior.

The PDF values between the points present in the input array are calculated by linear interpolation; the gradient and random sampling are defined accordingly.

The time complexity of evaluation is O(log n), where n is the number of points in the input array. The step size doesn't have to be the same at all points, so it can be made smaller in regions where the probability mass is concentrated and larger elsewhere.
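
A minimal usage sketch of such a non-uniform grid (the model name, grid, and density below are purely illustrative, not from this PR):

import numpy as np
import pymc3 as pm

# grid that is dense near the peak of the density and sparse in the tails
x_points = np.concatenate([np.linspace(-10, -1, 20, endpoint=False),
                           np.linspace(-1, 1, 200, endpoint=False),
                           np.linspace(1, 10, 20)])
pdf_points = np.exp(-0.5 * x_points ** 2 / 0.1)  # narrow peak around zero

with pm.Model():
    prior = pm.Interpolated('prior', x_points=x_points, pdf_points=pdf_points)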

I implemented some tests for the random method, but unfortunately I was not able to do the same for the logp method (I added a test template to pymc3/tests/test_distributions.py instead). The reason is that this distribution takes plain NumPy arrays as inputs rather than variables, because the input vectors are expected to be high-dimensional and it would not be feasible to sample them, while the test suite assumes distributions that take variables as inputs.

@twiecki (Member) commented May 10, 2017

I love the API, but I'm not crazy about the name: from_posterior('beta1', trace['beta1']). Brainstorming: Empirical() (not wise because of the overlap with the VI stuff), Interpolated(), FromSamples(), Histogram(), other ideas?

@ghost (Author) commented May 10, 2017

@twiecki from_posterior is not part of the API; it is a user-defined function in the notebook.

The API is

dist = Interpolated('interpolated_dist', x_points=x_points, pdf_points=pdf_points)
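
For context, a helper like the notebook's from_posterior could be built on top of this API; the sketch below is illustrative (the KDE smoothing and the zero-padded grid are assumptions for the example, not necessarily what the notebook does):

import numpy as np
from scipy import stats
import pymc3 as pm

def from_posterior(param, samples):
    # estimate the posterior density of `samples` on a grid
    smin, smax = np.min(samples), np.max(samples)
    width = smax - smin
    x = np.linspace(smin, smax, 100)
    y = stats.gaussian_kde(samples)(x)
    # pad the grid so the resulting prior has zero density
    # far outside the range of the observed samples
    x = np.concatenate([[x[0] - 3 * width], x, [x[-1] + 3 * width]])
    y = np.concatenate([[0], y, [0]])
    return pm.Interpolated(param, x_points=x, pdf_points=y)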

@twiecki (Member) commented May 10, 2017

Oh, I see. Could we make the Interpolated API similarly convenient by providing good defaults?

@ghost (Author) commented May 10, 2017

I doubt that it is possible to come up with good defaults for x_points and pdf_points here. We could put in something like a uniform distribution on [0, 1], but I don't find that choice obvious. It might be useful for illustrative purposes, though.

One thing I am considering is whether it would be better to name the parameters simply x and y instead of x_points and pdf_points. I chose the first form over the second to avoid confusing users into thinking that the distribution is two-dimensional, but I'm not sure whether it is consistent with the rest of the API; maybe x and y would be better.

By the way, it is not necessary to write the parameter names explicitly; the distribution can just as well be used as

dist = Interpolated('interpolated_dist', x_points, pdf_points)

with, for example,

import numpy as np

x_points = np.linspace(-10, 10, 100)
pdf_points = np.exp(-x_points * x_points / 2)

@junpenglao (Member) commented:

Looks good. Could you run the whole notebook? The output of the last cell is not shown.

@ghost (Author) commented May 10, 2017

@junpenglao Sure! I've updated the notebook.

@ghost (Author) commented May 10, 2017

I managed to solve the testing problem by creating temporary subclasses of the Interpolated class that pass fixed x_points and pdf_points parameters to the parent constructor, which makes it possible to use the same pymc3_random and pymc3_matches_scipy test helpers as for the other distributions.
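
A sketch of this pattern (the normal reference density and the grid are illustrative, not the exact test code):

import numpy as np
import scipy.stats as st
import pymc3 as pm

class TestedInterpolated(pm.Interpolated):
    # temporary subclass with fixed x_points/pdf_points, so generic helpers
    # like pymc3_random and pymc3_matches_scipy can construct it the same
    # way as any other distribution
    def __init__(self, **kwargs):
        x_points = np.linspace(-5, 5, 1000)
        pdf_points = st.norm.pdf(x_points)
        super(TestedInterpolated, self).__init__(x_points=x_points,
                                                 pdf_points=pdf_points,
                                                 **kwargs)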

@twiecki (Member) commented May 11, 2017

This essentially uses the marginal of the posterior as the prior, right? I.e. we're not taking any possible correlations into account.

@ghost (Author) commented May 11, 2017

Yes, it is one-dimensional, so it's impossible to take any correlations into account with it.

However, it is useful when the prior comes not from samples but, for example, from some numerical integration, so that the prior is one-dimensional yet too complex to be expressed in terms of standard distributions.
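
For instance, a one-dimensional prior could be obtained by numerically marginalizing a two-dimensional density. This is an illustrative sketch (the joint density is made up); the result does not need to be pre-normalized, since logp divides by the normalization constant self.Z internally (see the logp definition later in this thread):

import numpy as np
import pymc3 as pm

# hypothetical joint density p(x, z), known only on a grid
x = np.linspace(-5, 5, 200)
z = np.linspace(-5, 5, 200)
xx, zz = np.meshgrid(x, z, indexing='ij')
joint = np.exp(-0.5 * (xx ** 2 + xx * zz + zz ** 2))

# marginalize over z by numerical integration
pdf_points = np.trapz(joint, z, axis=1)

with pm.Model():
    prior = pm.Interpolated('marginal_prior', x_points=x, pdf_points=pdf_points)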

super(Interpolated, self).__init__(transform=transform,
                                   *args, **kwargs)

interp = InterpolatedUnivariateSpline(x_points, pdf_points, k=1, ext='zeros')
A member commented:

Should we allow the user to define k here?

@ghost (Author) commented May 11, 2017

There are two major problems with non-linear interpolation:

  1. The PDF can become negative at some points for higher-order spline interpolation, which is nonsensical and would break the NUTS sampler. It is also not possible to simply replace the negative values with zeros, because that would make it impossible to use the integral() method of the SciPy spline classes.
  2. The random() method can be implemented efficiently only for linear interpolation, because the inverse CDF can be expressed in closed form only for a piecewise-quadratic CDF (one could try to do the same for a piecewise-cubic CDF using Cardano's formula, but I'm not signing up for that :). For higher-order polynomial interpolation the inverses would have to be found numerically, with an iterative process like Newton's method. (See the sketch of the linear case below.)
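
To make the closed-form inversion concrete, here is a sketch of sampling from a piecewise-linear PDF (the function name is hypothetical; it assumes strictly increasing x and no flat zero-density segments):

import numpy as np

def rvs_piecewise_linear(x, y, size):
    # sample from the density given by linear interpolation of (x, y)
    # by inverting the piecewise-quadratic CDF in closed form
    areas = 0.5 * (y[1:] + y[:-1]) * np.diff(x)     # trapezoidal mass per segment
    cdf = np.concatenate([[0.0], np.cumsum(areas)])
    u = np.random.uniform(0.0, cdf[-1], size=size)  # target (unnormalized) mass
    i = np.clip(np.searchsorted(cdf, u, side='right') - 1, 0, len(areas) - 1)
    r = u - cdf[i]                                  # residual mass within segment i
    s = (y[i + 1] - y[i]) / (x[i + 1] - x[i])       # slope of the PDF on the segment
    # solve 0.5*s*t**2 + y[i]*t = r for t; use the linear solution
    # where the slope is (numerically) zero
    flat = np.abs(s) < 1e-12
    safe_s = np.where(flat, 1.0, s)
    t = np.where(flat, r / y[i],
                 (np.sqrt(y[i] ** 2 + 2.0 * s * r) - y[i]) / safe_s)
    return x[i] + t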

A member commented:

Yep, the first point is a valid concern. I think we can merge this now and add higher-order polynomial support in the future.

size=size)

def logp(self, value):
    return tt.log(self.interp_op(value) / self.Z)
A member commented:

When I ran your example from #2146 locally, I actually found your first implementation faster.

@ghost (Author) commented May 11, 2017

That was because I had initially defined the gradient incorrectly in #2146, so HMC didn't work well for the second version (see this comment). In this implementation the calculation of the gradient is fixed, so it is even faster to put the division by the normalization constant here.
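
(Note: since log(f(x)/Z) = log f(x) - log Z and Z is a constant, the normalization does not change the gradient d/dx log p(x) = f'(x)/f(x) that HMC uses; with linear interpolation, f' is piecewise constant and cheap to evaluate.)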

A member commented:

You are right, it should be faster then. I will double-check on my side.

@junpenglao merged commit cce9dfe into pymc-devs:master on May 11, 2017
@junpenglao (Member) commented:

Great work @a-rodin! Thank you for your contribution!
