
Change SMC metropolis kernel to independent metropolis kernel #4115

Merged: 6 commits merged into pymc-devs:master from smc_imh on Sep 21, 2020

Conversation

aloctavodia (Member):

This seems to provide equal or better sampling for all the models I tested (including ABC). The main advantages are for models with more than ~20 dimensions and for hierarchical models.

aloctavodia changed the title from "Change metropolis kernel to independent metropolis kernel" to "Change SMC metropolis kernel to independent metropolis kernel" on Sep 18, 2020
pymc3/smc/smc.py Outdated
@@ -341,6 +337,7 @@ def __init__(
         self.observations = self.sum_stat(observations)

     def posterior_to_function(self, posterior):
+        model = self.model
Contributor:

Sorry, why is this added here if it's unused?

Member Author:

Good catch! This is a mistake; it is part of other ideas I was trying that did not make it into the PR.

Contributor:

@aloctavodia while we're here, just wanted to say that I'm studying your book and that it's really good!

Member Author:

Glad you like it, @MarcoGorelli. Hopefully next year we will publish a new one together with Junpeng, Ravin, and Ari.

Contributor:

Awesome @aloctavodia - feel free to reach out to me if you want any help proof-reading


codecov bot commented Sep 18, 2020

Codecov Report

Merging #4115 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #4115   +/-   ##
=======================================
  Coverage   88.74%   88.75%           
=======================================
  Files          89       89           
  Lines       14040    14037    -3     
=======================================
- Hits        12460    12458    -2     
+ Misses       1580     1579    -1     
| Impacted Files | Coverage Δ |
|---|---|
| pymc3/smc/sample_smc.py | 87.95% <100.00%> (ø) |
| pymc3/smc/smc.py | 99.48% <100.00%> (-0.02%) ⬇️ |
| pymc3/vartypes.py | 100.00% <0.00%> (ø) |
| pymc3/variational/opvi.py | 85.32% <0.00%> (ø) |
| pymc3/variational/flows.py | 93.60% <0.00%> (ø) |
| pymc3/variational/stein.py | 100.00% <0.00%> (ø) |
| pymc3/variational/updates.py | 92.11% <0.00%> (ø) |
| pymc3/variational/inference.py | 78.65% <0.00%> (ø) |
| pymc3/variational/approximations.py | 80.25% <0.00%> (ø) |
| pymc3/variational/test_functions.py | 100.00% <0.00%> (ø) |

... and 2 more

pymc3/smc/smc.py Outdated
@@ -181,7 +182,6 @@ def resample(self):
         self.likelihood_logp = self.likelihood_logp[resampling_indexes]
         self.posterior_logp = self.prior_logp + self.likelihood_logp * self.beta
         self.acc_per_chain = self.acc_per_chain[resampling_indexes]
Member:

remove.

pymc3/smc/smc.py Outdated
@@ -64,8 +65,6 @@ def __init__(
         self.acc_rate = 1
         self.acc_per_chain = np.ones(self.draws)
Member:

remove

pymc3/smc/smc.py Outdated
Comment on lines 217 to 223
+        dist = multivariate_normal(self.posterior.mean(axis=0), self.cov)
+
         for n_step in range(self.n_steps):
-            proposal = floatX(self.posterior + proposals[n_step])
+            proposal = floatX(dist.rvs(size=self.draws))
+            proposal = proposal.reshape(len(proposal), -1)
+            forward = dist.logpdf(proposal)
+            backward = multivariate_normal(proposal.mean(axis=0), self.cov).logpdf(self.posterior)
Member:

I want to understand a bit better what the changes are doing here, so my summary is:

1. It removes the tuning related to self.scalings, which originally tuned a scaling of the proposal so that each chain approaches a target acceptance rate of 0.234.

2. The new transition kernel uses global information from the current posterior, mu = self.posterior.mean(axis=0), to build a MvNormal proposal, and to account for the transition it evaluates the acceptance ratio

   $A(x', x) = \min\left(1, \frac{P(x')\, g(x \mid x')}{P(x)\, g(x' \mid x)}\right)$

   (notation from Wikipedia). For that, the "backward" term $g(x \mid x')$ uses a new MvNormal with mu = mean(x'). It is independent MH as we have $g(x' \mid x) = g(x')$.
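For readers following the thread, here is a minimal NumPy/SciPy sketch of the independent-MH step summarized above. It is an illustration, not the PR's exact code: `target_logp` is a hypothetical callable for the (tempered) posterior log-density, and each row of `posterior` is one chain.

```python
import numpy as np
from scipy.stats import multivariate_normal


def imh_step(posterior, posterior_logp, dist, cov, target_logp, rng):
    """One independent-MH step applied to all chains (rows of `posterior`)."""
    draws = len(posterior)
    # The proposal ignores the current state of each chain.
    proposal = dist.rvs(size=draws, random_state=rng).reshape(draws, -1)
    forward = dist.logpdf(proposal)  # log g(x' | x) = log g(x')
    # "Backward" term g(x | x'), built from the proposed population's mean.
    backward = multivariate_normal(proposal.mean(axis=0), cov).logpdf(posterior)
    proposal_logp = target_logp(proposal)
    # log of the MH ratio  P(x') g(x | x') / (P(x) g(x' | x))
    log_ratio = (proposal_logp + backward) - (posterior_logp + forward)
    accepted = np.log(rng.uniform(size=draws)) < log_ratio
    posterior[accepted] = proposal[accepted]
    posterior_logp[accepted] = proposal_logp[accepted]
    return posterior, posterior_logp, accepted.mean()
```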

Member Author:

Right!

Member:

Cool - could you add some of these as inline comments?

Member:

Any reason why dist = multivariate_normal(self.posterior.mean(axis=0), self.cov) is not in the for n_step ... loop?

Member Author:

Intuitively, I thought about fixing the proposal before doing a productive run. It will also make it easier to use more complex proposal distributions in the future, as you compute the mean(s) and covariance(s) once per stage.
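A sketch of the stage structure being described, reusing `imh_step` from the sketch above (all names illustrative, not the PR's code): the proposal is fitted once at the start of the stage and held fixed for all `n_steps` moves.

```python
import numpy as np
from scipy.stats import multivariate_normal


def mutate_stage(posterior, posterior_logp, target_logp, n_steps=25, rng=None):
    """One SMC mutation stage: fit the proposal once, then take n_steps IMH moves."""
    rng = rng or np.random.default_rng()
    # Mean and covariance are computed once per stage, not once per step; a
    # richer proposal (e.g. a Gaussian mixture) would be fitted here instead.
    cov = np.atleast_2d(np.cov(posterior, rowvar=False))
    dist = multivariate_normal(posterior.mean(axis=0), cov)
    for _ in range(n_steps):
        posterior, posterior_logp, _ = imh_step(
            posterior, posterior_logp, dist, cov, target_logp, rng
        )
    return posterior, posterior_logp
```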

junpenglao (Member) left a comment:

Please add some comments on the changes, and also add a line to the release notes.

junpenglao (Member):

LGTM overall, although it is not obvious to me why this will work better.
Do the changes use more stages or fewer now? What about steps within each stage?

aloctavodia (Member Author):

I think this is just offering a generally better proposal distribution, as it learns the geometry of the posterior better. With the previous proposal, every point close to the borders generated a lot of rejections (which pushed the scaling to lower values), and I guess as you move to higher dimensions the chance that you are at a border increases. I still think a better proposal could be learned from the previous stage (suggestions are welcome!); I am not sure at this point which option is best. A mixture of Gaussians could help, maybe even one with just a few components.

I was mostly checking three things: the trace (with kind="rank_vlines"), the ESS, and R-hat. For example, with the previous sampler you start to see problems for dimensions larger than ~15; now it still works up to dimension ~60. I additionally checked the time, and it is a little slower, but not by much, considering you now get reasonable answers :-)
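For concreteness, the kind of check described can be run with ArviZ along these lines (the model here is a throwaway placeholder; API as of the pymc3 3.9 era):

```python
import arviz as az
import pymc3 as pm

with pm.Model() as model:  # placeholder high-dimensional model
    x = pm.Normal("x", mu=0.0, sigma=1.0, shape=60)
    trace = pm.sample_smc(2000)
    idata = az.from_pymc3(trace)

print(az.ess(idata))                      # effective sample size per variable
print(az.rhat(idata))                     # R-hat convergence diagnostic
az.plot_trace(idata, kind="rank_vlines")  # the rank plot mentioned above
```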

junpenglao (Member) commented Sep 19, 2020:

Wrote an implementation in TFP: https://colab.research.google.com/gist/junpenglao/637647b975e3616f5bd362a07859d1d8/smc-demo.ipynb. It seems better in general (except for a mixture distribution with modes far apart, which also makes sense).

aloctavodia merged commit ba77d85 into pymc-devs:master on Sep 21, 2020
aloctavodia deleted the smc_imh branch on September 21, 2020 at 14:37
@@ -35,7 +35,7 @@ def sample_smc(
     n_steps=25,
     start=None,
     tune_steps=True,
-    p_acc_rate=0.99,
+    p_acc_rate=0.85,
Member:

Just realized my simulation still uses 0.99, and it seems better than 0.85. How confident are you in this value, @aloctavodia?

Member Author:

Not sure at all. This is something to test and most likely change. Eyeballing it, 0.9 could be a good compromise. But for you, chains/steps are cheaper, right? So maybe you can keep 0.99.
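For context on what p_acc_rate controls: with tune_steps=True, SMC picks n_steps from the observed acceptance rate so that each chain has roughly probability p_acc_rate of at least one accepted move per stage. A sketch of that rule follows; it matches pymc3's SMC tuning at the time to the best of my knowledge, but treat the exact formula and clipping as an assumption.

```python
import numpy as np


def tune_n_steps(acc_rate, p_acc_rate, max_steps=25):
    """Pick n_steps so that P(at least one acceptance) ~ p_acc_rate,
    given the acceptance rate observed in the previous stage."""
    acc_rate = max(acc_rate, 1e-5)  # guard against log(1 - 0) = 0 in the divisor
    n = int(np.log(1 - p_acc_rate) / np.log(1 - acc_rate))
    return min(max_steps, max(2, n))


# With a typical acceptance rate of 0.234:
print(tune_n_steps(0.234, 0.85))  # 7 steps
print(tune_n_steps(0.234, 0.99))  # 17 steps, so 0.99 costs more per stage
```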
