Particle samplers and traces for Emcee #1689

philastrophist · 2017-01-20T19:51:16Z

One of my major gripes with pymc3 is that I couldn't easily use @dfm's wonderful emcee sampler with the flexible model specification that pymc3 has. So I've added a generic particle sampler and an emcee sampler, along with a trace to be used for particles.

This is somewhat hacky, but it works. There are obvious improvements to be made and a modified emcee sampler could be built in to avoid those pesky reshapes!

However, I have limited knowledge of the inner workings of pymc3 despite writing this, so I'm not sure how to initialise the walkers correctly. I want to either run ADVI and randomise its result across the walkers, or draw start positions from the specified priors (which is what it currently does).
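
The two initialisation options described above can be sketched with plain NumPy. This is a minimal illustration, not the code in this PR: `jitter_around_point` and `draw_from_priors` are hypothetical helpers, and the prior draws assume a three-parameter model with Normal(0, 10) priors on an intercept and gradient and a HalfCauchy(beta=10) prior on a noise scale.

```python
import numpy as np


def jitter_around_point(point, nwalkers, scale=1e-2, rng=None):
    """Scatter walkers in a small Gaussian ball around one start point.

    `point` is a 1-D array, e.g. a flattened ADVI mean or MAP estimate.
    """
    rng = np.random.default_rng() if rng is None else rng
    point = np.asarray(point, dtype=float)
    noise = rng.normal(0.0, scale, size=(nwalkers, point.size))
    return point + noise  # shape (nwalkers, ndim)


def draw_from_priors(nwalkers, rng=None):
    """Draw one start position per walker directly from the assumed priors."""
    rng = np.random.default_rng() if rng is None else rng
    intercept = rng.normal(0.0, 10.0, size=nwalkers)
    gradient = rng.normal(0.0, 10.0, size=nwalkers)
    # |Cauchy(0, beta)| is a half-Cauchy draw with scale beta.
    eps = np.abs(10.0 * rng.standard_cauchy(size=nwalkers))
    return np.column_stack([intercept, gradient, eps])


rng = np.random.default_rng(0)
start_ball = jitter_around_point([3.0, 2.0, 0.5], nwalkers=8, rng=rng)
start_prior = draw_from_priors(nwalkers=8, rng=rng)
```

The Gaussian-ball start keeps every walker close to a point that is already plausible, while the prior draws spread walkers widely and may place some in regions of very low posterior probability.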

It'll be great if someone else could look at this!

Thanks

EDIT: Implemented some suggestions and improvements. Still a bit hacky, but I have some ideas

philastrophist · 2017-01-20T19:55:34Z

An example:

import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
from pymc3.external.emcee_samplers import EmceeEnsemble

x = np.linspace(0, 1, 100)
y = 3 + (x*2)
y += np.random.normal(0, 0.5, size=len(x))
yerr = np.ones_like(y) * 0.1

with pm.Model() as model:
    eps = pm.HalfCauchy('eps', beta=10)

    intercept = pm.Normal('intercept', mu=0, sd=10)
    gradient = pm.Normal('gradient', mu=0, sd=10)
    obs = pm.Normal('obs', mu=y, sd=yerr, shape=len(y))
    line = pm.Deterministic('line', intercept + (x * gradient))
    like = pm.Normal('like', mu=line, sd=eps, observed=obs)
    chain = pm.sample(5000, init='advi', step=EmceeEnsemble())

i = np.percentile(chain['intercept'], [16, 50, 84])
g = np.percentile(chain['gradient'], [16, 50, 84])
e = np.percentile(chain['eps'], [16, 50, 84])

print(i)
print(g)
print(e)

pm.traceplot(chain[::10], varnames=['eps', 'intercept', 'gradient'])
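
The 16th, 50th and 84th percentiles used above bracket roughly the central 68% of the posterior samples, so they are often reported as a median with asymmetric errors. A small sketch of that conversion, using synthetic draws in place of a real chain (`summarise` is a hypothetical helper, not part of pymc3):

```python
import numpy as np


def summarise(samples):
    """Return (median, minus_err, plus_err) from the 16/50/84 percentiles."""
    lo, med, hi = np.percentile(samples, [16, 50, 84])
    return med, med - lo, hi - med


# Synthetic stand-in for chain['intercept'].
rng = np.random.default_rng(1)
fake_intercept = rng.normal(3.0, 0.2, size=20000)

median, minus, plus = summarise(fake_intercept)
print("intercept = {:.2f} -{:.2f} +{:.2f}".format(median, minus, plus))
```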

philastrophist · 2017-01-20T20:31:02Z

There was a previous proposal about this: #659

fonnesbeck · 2017-01-20T21:52:35Z

pymc3/sampling.py

@@ -83,7 +85,7 @@ def assign_step_methods(model, step=None, methods=(NUTS, HamiltonianMC, Metropol

 def sample(draws, step=None, init='advi', n_init=200000, start=None,
           trace=None, chain=0, njobs=1, tune=None, progressbar=True,
-           model=None, random_seed=-1):
+           model=None, random_seed=-1, ):


Trailing comma

fonnesbeck · 2017-01-20T21:56:01Z

pymc3/step_methods/particle.py

+        return bij.rmap(apoint)
+
+
+class EmceeSamplerStep(ParticleStep):


Scratch the last comment (deleted), it is using emcee directly. In that case, I think this should be in pymc3.external, just as the Edward extension is.

fonnesbeck · 2017-01-20T22:07:35Z

Thanks for the PR; I had a quick peek at the code and will look at it in depth later. I'm delighted to see the addition of particle samplers, in general. As I commented above, my own preference would be to put the EmceeSampler class in the external submodule, which is where we keep the Edward extension, and have emcee as a soft dependency.
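
One common way to treat a package as a soft dependency is to defer the import until the sampler is actually requested and fail with a clear message. A sketch of that pattern, with hypothetical `require_emcee` and `emcee_available` helpers (illustrative names, not code from this PR):

```python
import importlib


def require_emcee():
    """Import emcee on demand, with a clear message if it is missing."""
    try:
        return importlib.import_module("emcee")
    except ImportError as err:
        raise ImportError(
            "The emcee sampler is optional and needs the `emcee` package; "
            "install it with `pip install emcee`."
        ) from err


def emcee_available():
    """Return True if emcee can be imported, without raising."""
    try:
        require_emcee()
    except ImportError:
        return False
    return True
```

With this layout, importing the rest of the library never touches emcee; only code paths that call `require_emcee()` need it installed.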

fonnesbeck · 2017-01-20T22:09:15Z

pymc3/step_methods/particle.py

+        return bij.rmap(apoint)
+
+
+class EmceeSamplerStep(ParticleStep):


Our convention is not to include Step in the name of user-facing sampler classes, so I suggest renaming to simply Emcee.

ferrine · 2017-01-20T22:11:49Z

pymc3/step_methods/particle.py

+
+    f = theano.function([inarray0], logp0)
+    f.trust_input = True
+    return f


philastrophist · 2017-01-20T22:13:10Z

I'll get onto those suggestions listed above, but I'm thinking now that there could be a factory function to build a step around an external sampler. This would remove the dependency on emcee and allow easy addition of other samplers by user, no?
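
A rough sketch of what such a factory could look like. Everything here is hypothetical: `make_external_step`, the `propose(positions, logp, rng)` callback signature, and the toy random-walk proposal are stand-ins for illustration, not pymc3 or emcee APIs.

```python
import numpy as np


def make_external_step(propose, logp):
    """Build a step function around any sampler exposing `propose`.

    `propose(positions, logp, rng)` must return the next positions for all
    particles; `logp` maps a 1-D parameter vector to a log-probability.
    """
    def step(positions, rng):
        positions = np.asarray(positions, dtype=float)
        new_positions = propose(positions, logp, rng)
        if new_positions.shape != positions.shape:
            raise ValueError("sampler changed the shape of the particle array")
        return new_positions
    return step


def random_walk_metropolis(positions, logp, rng, scale=0.5):
    """Toy external sampler: independent Metropolis update per particle."""
    proposal = positions + rng.normal(0.0, scale, size=positions.shape)
    old = np.array([logp(p) for p in positions])
    new = np.array([logp(p) for p in proposal])
    accept = np.log(rng.uniform(size=len(positions))) < new - old
    return np.where(accept[:, None], proposal, positions)


def std_normal_logp(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * np.sum(x ** 2)


rng = np.random.default_rng(2)
step = make_external_step(random_walk_metropolis, std_normal_logp)
positions = rng.normal(size=(10, 2))  # 10 particles, 2 parameters
for _ in range(100):
    positions = step(positions, rng)
```

The point of the design is that the step wrapper only knows about an array of particle positions and a log-probability callable, so swapping in a different external sampler means supplying a different `propose` function.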

twiecki · 2017-01-21T10:47:29Z

@philastrophist Very cool. Did you look at #1569?

springcoil · 2017-01-21T13:00:21Z

Very cool stuff @philastrophist. A PhD student I mentored at a previous job was very upset that particle samplers weren't in PyMC3 - I'll let him know.

twiecki · 2017-01-25T08:34:15Z

@philastrophist also needs tests and an example NB for the docs.

philastrophist · 2017-01-25T17:23:40Z

Now the particle samplers can use the advi/map etc easily in the call to pm.sample!

Not read #1569 yet but plan to see how it works.

Tests and the NB are underway. For now I'll update the example above.

There are currently somewhat more lines than strictly necessary in sample, but I'll prune those in the near future. I currently have it so that you can run multiple multiprocessing jobs along with multiple particles. I don't know why you would want to do that, but the functionality is there. I might remove it, though, to clean up some confusing sections.
I'm almost there with integrating other samplers into a ParticleCompoundStep for those scenarios that require something like discrete choice (emcee doesn't do that).

All in all it's basically done and works. What I'm currently doing is making sure it can do all the things current samplers can whilst incorporating some flexibility for people to add other ParticleSteps.

P.S. This started out as something I did for myself and I'm doing it when I have time. So it may take time for me to finish everything!

twiecki · 2017-01-27T13:58:31Z

@philastrophist Seems like you need to rebase.

fonnesbeck · 2017-01-27T21:44:49Z

Great, just remove the WIP from the tag and title when you want a pre-merge review.

twiecki · 2017-02-14T15:06:02Z

@philastrophist Any progress?

philastrophist · 2017-02-15T19:29:10Z

Currently the way this method works is quite restrictive on how the sampler works. So it only works for 1 particle sampler at a time. I think this is fine for now since any further development will need the multindtrace.
I need to rebase, but I'll push the latest version very soon.
I've made some progress on generalising particle samplers in a separate branch which I'd like some feedback on once I've finished it.

philastrophist · 2017-02-18T19:00:03Z

I have made some subtle pm.sample api changes with regard to initialisation and start points to make it more general (and so it can be applied to particlesteps easily). However, this means it breaks in a small subset of cases (4 tests fail) where start is specified. The error thrown relates to variable type.

Since this is a minor problem with a change in api, I think now is the time to remove WIP (since it works for any of my cases) and start some discussion...

PS. A notebook for an intro into how to use EmceeEnsemble is included in the last commit

twiecki · 2017-02-22T09:46:26Z

pymc3/sampling.py

-    return start, step
+def transform_start_particles(start, nparticles, model=None):
+    """
+    :param start:


These doc-strings need to be made numpy-style.

That's my editor's default. Will change
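
For reference, a numpy-style docstring uses `Parameters` and `Returns` sections underlined with dashes. The sketch below shows the layout on a hypothetical helper, `tile_start_for_particles`; its name, parameter descriptions and body are assumptions made for illustration and are not the function under review.

```python
import numpy as np


def tile_start_for_particles(start, nparticles):
    """Repeat a single start point once per particle.

    Parameters
    ----------
    start : dict
        Mapping from variable name to its starting value.
    nparticles : int
        Number of particles (walkers) to create start values for.

    Returns
    -------
    dict
        Mapping from variable name to an array whose first axis has
        length ``nparticles``.
    """
    return {
        name: np.repeat(np.asarray(value)[np.newaxis, ...], nparticles, axis=0)
        for name, value in start.items()
    }


tiled = tile_start_for_particles({"intercept": 0.0, "line": np.zeros(3)}, 4)
```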

twiecki · 2017-02-22T09:47:13Z

pymc3/sampling.py

-            start = start[0]
-    else:
-        raise NotImplemented('Initializer {} is not supported.'.format(init))
+def get_random_starters(nwalkers, model=None):


Can this only be used with a particle sampler? Seems like the function is more general than that.

yes, I'll move it outside the particlestep block

twiecki · 2017-02-22T09:47:43Z

pymc3/sampling.py


-    return start, step
+def transform_start_particles(start, nparticles, model=None):


should this be moved to external/emcee?

I feel like this should stay here since it's applicable to specifying jobs and other particle steps not just emcee

twiecki · 2017-02-22T09:50:19Z

I think this looks good overall. I'm a bit concerned about the added code complexity, but the particle stuff should be usable by other samplers besides emcee too.

The changes to the API make sense.

It does need tests however.

philastrophist · 2017-02-23T17:09:42Z

Thanks for the feedback. I've made some comments, but what in particular are you concerned about with added code complexity? The additions to .sample? Or the duplication of NDArray?

With respect to NDArray, I could merge MultiNDArray and NDArray by making particles=1 the default instead of having two separate classes for particles=None and particles!=None.
I guess since I have implemented nparticles in all steps it would make sense to have nparticles=1 as the default and merge the different trace styles.
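
The merge proposed here could be sketched as a single in-memory trace that always stores a particle axis, with one particle as the default. `SimpleTrace` below is a hypothetical toy, not the `NDArray` backend; it only illustrates how `nparticles=1` can reproduce the ordinary single-chain shape.

```python
import numpy as np


class SimpleTrace:
    """Toy trace storing samples with shape (draws, nparticles, *var_shape)."""

    def __init__(self, var_shapes, draws, nparticles=1):
        self.nparticles = nparticles
        self.draw_idx = 0
        self.samples = {
            name: np.zeros((draws, nparticles) + tuple(shape))
            for name, shape in var_shapes.items()
        }

    def record(self, point):
        """Store one draw; each value has a leading particle axis."""
        for name, value in point.items():
            self.samples[name][self.draw_idx] = value
        self.draw_idx += 1

    def get_values(self, name):
        """Return recorded draws, dropping the particle axis if it is 1."""
        values = self.samples[name][:self.draw_idx]
        return values[:, 0] if self.nparticles == 1 else values


single = SimpleTrace({"eps": ()}, draws=5)
multi = SimpleTrace({"eps": ()}, draws=5, nparticles=3)
for i in range(5):
    single.record({"eps": np.full(1, i)})
    multi.record({"eps": np.full(3, i)})
```

Here `single.get_values("eps")` has shape `(5,)`, like an ordinary trace, while `multi.get_values("eps")` keeps the particle axis and has shape `(5, 3)`.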

Tests will be implemented of course, I just wanted to get the api out of the way first.

twiecki · 2017-02-24T08:20:49Z

pymc3/backends/sqlite.py

+        ----------
+        varname : str
+        burn : int
+        thin : int


missing particles doc


philastrophist · 2017-02-24T15:01:32Z

There is a test: test_step.TestStepMethods.test_sample_exact(NUTS) that fails specifically for NUTS.
The arrays don't match where x is the master "known sample" and y is the one generated.

x: array([  1.203436e+00,   1.203436e+00,   3.284390e-01,  -5.632208e-01,
         1.374194e+00,   1.729373e+00,   4.398738e-01,   4.398738e-01,
        -4.584570e-01,  -3.028040e-01,   9.709908e-01,   1.390899e+00,...
 y: array([ 1.118324,  1.118324,  0.428752, -1.063373, -1.175014, -0.277752,
       -0.277752,  0.003286,  0.23796 ,  1.655296,  1.790141, -0.623905,
        0.60124 ,  0.60124 ,  0.684407,  0.704353,  0.602374,  0.037173,...

I'm not quite sure what to make of this.

Other than that, I think I have fixed, or can easily fix, the errors raised in testing for this branch. Now I'll just put in a MultiNDArray test and we'll be set.

twiecki · 2017-03-06T21:14:12Z

@philastrophist Any progress?

philastrophist · 2017-03-09T11:37:58Z

Still going... Rectifying errors

`transform_start_positions` now accounts for njobs

philastrophist · 2017-03-09T11:50:22Z

Tests just underway

reac2 · 2017-05-19T14:46:52Z

Just wondering - what's the status with the Emcee sampler? Excited to test this out on our data.

twiecki · 2017-05-19T15:45:29Z

Good question, would need to ask @philastrophist.

@reac2 is your model non-differentiable?

reac2 · 2017-05-19T16:10:57Z

Thanks - will do!
Sadly yes, and it currently looks like it has converged after relatively lengthy chains, but away from some of the correct values. I was hoping the many-walker approach would explore parameter space more efficiently. Will try implementing something similar to your 2013 hack too http://twiecki.github.io/blog/2013/09/23/emcee-pymc/

twiecki · 2017-05-19T16:25:05Z

Have you tried the new ATMCMC sampler?

twiecki · 2017-05-19T16:27:38Z

https://github.com/pymc-devs/pymc3/blob/7fb2b55bdd1a11123b541e5b1a7825b864dacc79/pymc3/step_methods/smc.py#L76

reac2 · 2017-05-19T16:36:28Z

Ah, no I was looking for ATMCMC and thought it had been removed. That's great - I will try this one out! Thanks very much.

reac2 · 2017-05-19T18:13:27Z

Is there an example for the ATMCMC code you can point me to? Our model is similar in form to the arbitrary deterministic disasters example, so I'm starting there but running into trouble with redefining the likelihood_name parameter in smc.SMC.

ColCarroll · 2017-05-19T18:17:06Z

I'd check out the test case in test_smc.py. There's another super basic one in test_step.py. I'd be very interested for more examples of using it, and how to make the API better. Note -- still working on checking it thoroughly for accuracy, too!

reac2 · 2017-05-19T18:27:45Z

Great, found it! Thanks. I'll see if I can get this going with the arbitrary deterministic disasters model.

junpenglao · 2017-05-19T19:29:56Z

@reac2 you can also find an example here fitting a mixed-effect model https://github.com/junpenglao/GLMM-in-Python/blob/master/pymc3_different_sampling.py#L86-L113

philastrophist · 2017-06-01T16:44:15Z

Please see #2253 for the up-to-date refactored version. I consider this pull request outdated and will close it.

fonnesbeck reviewed Jan 20, 2017

ferrine reviewed Jan 20, 2017

pymc3/step_methods/particle.py Outdated

f = theano.function([inarray0], logp0)

f.trust_input = True

return f

new line

twiecki mentioned this pull request Jan 21, 2017

ATMCMC2 #1569

Closed

philastrophist changed the title ~~Particle samplers and traces~~ [WIP] Particle samplers and traces Jan 22, 2017

fonnesbeck added enhancements WIP labels Jan 27, 2017

twiecki mentioned this pull request Feb 14, 2017

idioms for mixture models and inference convergence/speed? #1776

Closed

philastrophist force-pushed the particle_samplers branch from b6007e6 to 6d18ecf Compare February 18, 2017 17:22

philastrophist changed the title ~~[WIP] Particle samplers and traces~~ Particle samplers and traces for Emcee Feb 18, 2017

twiecki reviewed Feb 22, 2017

twiecki reviewed Feb 24, 2017

philastrophist force-pushed the particle_samplers branch from c5294ac to a03a6f8 Compare February 24, 2017 13:51

philastrophist added 5 commits February 24, 2017 13:55

fixed assertion message

bdbab24

adapted test. all backends would be called with nparticles=>something<

5672a96

fixed wrong __len__ specification

bcd5fe6

pm.sample is inside model context now. Changed test init to adapt to change in pm.sample(init='advi')

fd45581

removed confusing stuff

7a7cbd4

philastrophist added 2 commits February 25, 2017 16:16

alterations to make tests pass. working on better init default value

ee51f5b

better default init value (auto)

b115f8d

refactored to separate normal steps and particle step.

66f8dd7

`transform_start_positions` now accounts for njobs

philastrophist added 2 commits March 9, 2017 15:30

init must be None for non-continuous

8819d40

brackets

c1cd021

twiecki mentioned this pull request Mar 15, 2017

GSoC: contributing to pymc3 #1899

Closed

twiecki mentioned this pull request Apr 17, 2017

Combine powers with PYMC 3? dfm/emcee#140

Closed

philastrophist mentioned this pull request Jun 1, 2017

External emcee sampler support #2253

Closed

philastrophist closed this Jun 1, 2017
