Adaptive Metropolis algorithm #57

kaandocal · 2021-05-26T10:45:10Z

I've implemented a simple version of the standard AM algorithm as a small extension to #39 (this PR is independent of #39). The pull request features a multivariate Gaussian proposal which uses a rescaled version of the posterior covariance (Haario et al.'s famous 2.38 rule) + a small epsilon term for positive-definiteness and initial exploration. In order to update the proposal I had to create a new sampler class, AMSampler, which is basically MetropolisHastings but updates the proposal at each sampling step via the trackstep function. The default implementation of trackstep does nothing.

The update mechanism is slightly different from that in #39: this one performs adaptation in AbstractMCMC.step whereas #39 does it in propose. I think the new AMSampler class should be able to handle both, and since it consists of minimal modifications to the original MetropolisHastings it could provide a general interface for adaptive MH algorithms.

Notes:

Proposals are shared between threads - how can this be changed?
I haven't implemented all overloads of propose for AMSampler but this should not be an issue.
This PR would benefit from Return accept/reject stat #38 to make the trackstep interface more uniform.
The test example is minimal, I would welcome any suggestions!

devmotion · 2021-05-26T11:13:11Z

Is it possible to just add a new proposal and extend the existing sampler, if needed, but not add a new sampler? I think this would simplify the code and also be easier if one wants to combine different proposals.

I am also wondering if trackstep is needed at all - maybe one could just update the proposal right before proposing a new sample in the propose step?

devmotion · 2021-05-26T11:19:07Z

src/adaptivemetropolis.jl

+mutable struct AMProposal <: Proposal{MvNormal}
+    epsilon::Symmetric
+    scalefactor::Float64
+    proposal::MvNormal
+    samplemean::AbstractArray
+    samplesqmean::AbstractMatrix
+    N::Int
+end


It would be good to add type parameters such that the fields are concretely typed: https://docs.julialang.org/en/v1/manual/performance-tips/index.html#Avoid-fields-with-abstract-type

This should be fixed now.

devmotion · 2021-05-26T11:19:39Z

src/adaptivemetropolis.jl

+    N::Int
+end
+
+function AMProposal(epsilon::AbstractMatrix, scalefactor=2.38^2 / size(epsilon, 1))


What's the motivation for this specific scale factor? Is there a reference for it?

The motivation is from this classic:

https://projecteuclid.org/journals/bernoulli/volume-7/issue-2/An-adaptive-Metropolis-algorithm/bj/1080222083.full

I will add a reference!

kaandocal · 2021-05-26T11:25:15Z

Is it possible to just add a new proposal and extend the existing sampler, if needed, but not add a new sampler? I think this would simplify the code and also be easier if one wants to combine different proposals.

I didn't want to modify existing code in order to not break something but the changes could be trivially implemented in MetropolisHastings! I can move things there if that'd be better.

I am also wondering if trackstep is needed at all - maybe one could just update the proposal right before proposing a new sample in the propose step?

I wanted to leave users the option of querying the sampler freely for new proposals without changing the state, hence why I thought updating the proposal in step would be a good idea. Furthermore the accept/reject decision is taken in step, so the proposed samples might not end up in the chain - the AM algorithm only uses samples in the chain to do its adaptation.

devmotion · 2021-05-26T11:31:59Z

Furthermore the accept/reject decision is taken in step, so the proposed samples might not end up in the chain - the AM algorithm only uses samples in the chain to do its adaptation.

My suggestion was not to update the proposal based on the proposed sample but on the previous sample - this should always be the accepted sample. And since the previous sample is available in the propose step, I thought this could be sufficient.

kaandocal · 2021-05-26T11:43:08Z

I see, so like the approach in #39 ! I personally think having a separate callback interface for adaptation would be cleaner, and being able to track acceptances/rejections might be useful for some algorithms. And compared to the accepted! callback in #39 this should be a bit more consistent/flexible.

devmotion · 2021-05-26T12:18:00Z

I'm not strictly against adding such a function but there is one major problem: transitions contain all parameters but only for a subset of them (e.g. just one parameter) one might want to use the adaptive proposal. Thus in general it is not correct to call trackstep with the AMProposal and a Transition, it should be called only with the corresponding sample.

One way to achieve this is to handle it in propose. Another one would be to iterate through samples and proposals.

In particular due to these mixed cases, I think it would be better to only add a new proposal but not a new sampler.

devmotion · 2021-05-26T12:20:22Z

Another advantage of handling it in the propose step is that you don't have to construct the MvNormal proposal, which probably is not very efficient.

devmotion · 2021-05-26T12:25:07Z

Slightly unrelated thought:

being able to track acceptances/rejections might be useful

I think probably one would not want to keep track of these statistics in a separate trackstep function - otherwise every proposal would have to reimplement the tracking functionality. Probably it would be better to hardcode it in the Transition object and the implementation of step, and then possibly forward it to the proposals if needed.

devmotion · 2021-05-26T12:26:45Z

src/adaptivemetropolis.jl

+# When the proposal is initialised the empirical posterior covariance is zero
+function trackstep(proposal::AMProposal, trans::Transition)
+    proposal.samplemean .= trans.params
+    proposal.samplesqmean .= trans.params * trans.params'


This still computes trans.params * trans.params' first and then copies it to proposal.samplesqmean. You can avoid this by writing

Suggested change

proposal.samplesqmean .= trans.params * trans.params'

mul!(proposal.samplesqmean, trans.params, trans.params')

devmotion · 2021-05-26T12:29:10Z

src/adaptivemetropolis.jl

+
+# Recompute the empirical posterior covariance matrix
+function trackstep(proposal::AMProposal, trans::Transition, ::Union{Val{true}, Val{false}})
+    proposal.samplemean .= (proposal.samplemean .* proposal.N .+ trans.params) ./ (proposal.N + 1)


IIRC it is numerically more stable to write

Suggested change

proposal.samplemean .= (proposal.samplemean .* proposal.N .+ trans.params) ./ (proposal.N + 1)

proposal.N += 1

proposal.samplemean .+= (trans.params .- proposal.samplemean) ./ proposal.N

devmotion · 2021-05-26T12:35:15Z

src/adaptivemetropolis.jl

+    proposal.samplesqmean .= (proposal.samplesqmean .* proposal.N + trans.params * trans.params') ./ 
+                             (proposal.N + 1)


This would fuse all operations (might require Compat.jl on older Julia versions) but is not as stable as the version for the mean above:

Suggested change

proposal.samplesqmean .= (proposal.samplesqmean .* proposal.N + trans.params * trans.params') ./

(proposal.N + 1)

mul!(proposal.samplesqmean, trans.params, trans.params', true, N)

ldiv!(N + 1, proposal.samplesqmean)

devmotion · 2021-05-26T12:39:37Z

src/adaptivemetropolis.jl

+    proposal.samplesqmean .= (proposal.samplesqmean .* proposal.N + trans.params * trans.params') ./ 
+                             (proposal.N + 1)
+
+    samplecov = proposal.samplesqmean .- proposal.samplemean * proposal.samplemean'


One should be aware that this is the "naive" algorithm for computing the variance which can lead to catastrophic cancellation (see e.g. https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Na%C3%AFve_algorithm). Maybe it would be better to use Welford's algorithm (https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm)?

I will have a look at this later. In combination with creating a new MvNormal distribution every round, which is not very efficient, I think we could make this whole step more efficient by using the rank-one Cholesky update algorithm, although I am not sure how best to do this given that MvNormals are immutable

I think the Welford implementation in AdvancedHMC is pretty good. I think it's an adaptation of the Stan version.

kaandocal · 2021-05-26T12:51:33Z

Slightly unrelated thought:

being able to track acceptances/rejections might be useful

I think probably one would not want to keep track of these statistics in a separate trackstep function - otherwise every proposal would have to reimplement the tracking functionality. Probably it would be better to hardcode it in the Transition object and the implementation of step, and then possibly forward it to the proposals if needed.

I agree, which is why I think implementing #38 would be useful! It shouldn't be a big change but probably belongs in another pull request

kaandocal · 2021-05-26T13:13:31Z

I'm not strictly against adding such a function but there is one major problem: transitions contain all parameters but only for a subset of them (e.g. just one parameter) one might want to use the adaptive proposal. Thus in general it is not correct to call trackstep with the AMProposal and a Transition, it should be called only with the corresponding sample.

One way to achieve this is to handle it in propose. Another one would be to iterate through samples and proposals.

I have just merged AMSampler into MetropolisHastings. I haven't seen multiple independent adaptive proposals in the wild, but this implementation just calls trackstep for every proposal in case of AbstractArrays or NamedTuples. Note that acceptances/rejections are not determined by a single proposal, but by a combination of all of them so having more than one adaptive proposal might not work very well.

cpfiffer

Awesome stuff -- I'm happy with this for the moment but will defer to David's judgement if he has further suggestions.

src/adaptivemetropolis.jl

cpfiffer · 2021-05-26T15:00:03Z

src/adaptivemetropolis.jl

+    proposal.samplesqmean .= (proposal.samplesqmean .* proposal.N + trans.params * trans.params') ./ 
+                             (proposal.N + 1)
+
+    samplecov = proposal.samplesqmean .- proposal.samplemean * proposal.samplemean'


I think the Welford implementation in AdvancedHMC is pretty good. I think it's an adaptation of the Stan version.

src/AdvancedMH.jl

kaandocal · 2021-05-26T16:58:51Z

Any way we could reuse the Welford implementation from AdvancedHMC?

kaandocal · 2021-05-26T16:59:57Z

By the way, it would be easy to combine #39 with this pull request, but we should maybe fix nomenclature for the methods as the names almost overlap - any suggestions?

devmotion · 2021-05-26T17:01:07Z

Any way we could reuse the Welford implementation from AdvancedHMC?

Only if it's moved to a separate package I assume.

kaandocal · 2021-05-26T17:14:04Z

By the way - and this is embarassing - #39 already implemented pretty much the same adaptive algorithm. I didn't realise this until now, so this PR is a bit of a duplicate really. Apologies to @arzwa.

kaandocal · 2021-05-26T17:22:22Z

There's a subpackage called Adaptation in AdvancedHMC - for now we could probably borrow code from there (this should be permitted by the BSD-3 and MIT Licences). So we could merge that computation of the sample covariance matrix with this and #39.

One idea would be basically copying the interface in AdvancedHMC, ie. having an abstract Adaptor struct with initialize!(adaptor, proposal, initial_sample, model) and update!(adaptor, proposal, sample_prev, sample, model, accepted). That way we could avoid creating new Proposal structs and cleanly separate the adaptation logic.

yebai · 2021-05-26T17:27:58Z

How about moving the AdvancedHMC mass matrix adaption code into AbstractMCMC.Adaption submodule?

kaandocal · 2021-05-26T17:33:30Z

How about moving the AdvancedHMC mass matrix adaption code into AbstractMCMC.Adaption submodule?

It might make sense to define an abstract Adaptation interface in AbstractMCMC. As far as I know AdvancedHMC does not use AbstractMCMC though.

yebai · 2021-05-26T17:37:00Z

There is a PR in progress - see TuringLang/AdvancedHMC.jl#259

arzwa · 2021-05-26T17:38:39Z

By the way - and this is embarassing - #39 already implemented pretty much the same adaptive algorithm. I didn't realise this until now, so this PR is a bit of a duplicate really. Apologies to @arzwa.

No problem at all, that PR got stalled due to myself not really finding the time to finish it off decently (it worked for what I needed, and I got sidetracked).

kaandocal · 2021-05-26T20:02:49Z

I created a new branch adaptive (available on my GitHub) which implements a more generic interface (I don't want to spam PRs).

Main changes:

MetropolisHastings has a second field named adaptor
AbstractAdaptor{ProposalType} as an abstract type for adaptors with methods initialize! and update!
NoAdaptor{PT} <: AbstractAdaptor{PT} as a default no-op adaptor
Adaptors can be combined for arrays/named tuples of proposals (yielding an array/named tuple of adaptors)
AMAdaptor for symmetric MvNormal random walk proposals

Not sure if the type interface is the cleanest, I would appreciate any feedback!

Note: I'm not sure if we really want to support multiple adaptives for arrays/tuples of proposals, this feature could be removed...

devmotion · 2021-05-26T20:07:30Z

Intuitively I would have thought adaptor should be part of an AdaptiveProposal but not a general property of MetropolisHastings. That seems a bit cleaner but maybe you noticed some advantages if it is part of the sampler?

kaandocal · 2021-05-26T20:13:24Z

Not really to be honest! I tried to see if it would be cleaner but at least my new "solution" is more complicated than what we have already. Unless you or @cpfiffer see any advantages in separating out the adaptor logic (I don't really) we can just keep it as it is.

cpfiffer · 2021-05-26T20:21:10Z

Unless you or @cpfiffer see any advantages in separating out the adaptor logic (I don't really) we can just keep it as it is.

For right now, I think it's fine to just go as-is. We can add in more general adaptation interfaces later on.

kaandocal · 2021-05-26T22:27:05Z

I have uploaded a new version that uses Welford's algorithm. The code now more or less covers the AdaptiveMvNormal features in #39 . The current adaptation interface is via trackstep! (added a bang).

cpfiffer · 2021-05-27T00:40:39Z

Cool. I like it. I've turned on the test suite so we can see how the tests run.

kaandocal · 2021-05-27T08:37:58Z

Cool. I like it. I've turned on the test suite so we can see how the tests run.

There was an error due to my using a different version of LinearAlgebra.jl, the AdaptiveMH test should now run correctly (hopefully)

cpfiffer · 2021-05-27T14:59:34Z

I'll approve for now. Could you add a little bit of usage info to the README file?

devmotion · 2021-05-27T16:36:52Z

Project.toml

@@ -5,6 +5,8 @@ version = "0.6.0"
 [deps]
 AbstractMCMC = "80f14c24-f653-4e6a-9b94-39d6b0f70001"
 Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
+LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
+PDMats = "90014a1f-27ba-587c-ab20-58faa44d9150"


Can you add a [compat] entry for PDMats?

src/adaptivemetropolis.jl

devmotion · 2021-05-27T16:40:16Z

src/mh-core.jl

+# Called to update proposal when the first sample is drawn
+trackstep!(proposal::Proposal, params) = nothing
+trackstep!(proposal::AbstractArray, params) = foreach(trackstep!, proposal, params)
+trackstep!(proposal::NamedTuple, params) = foreach(trackstep!, proposal, params)


Parameters in the NamedTuple params are not necessarily in the same order as the proposal.

devmotion · 2021-05-27T16:43:10Z

src/mh-core.jl

+trackstep!(proposal::Proposal, params, 
+           ::Union{Val{true}, Val{false}}) = nothing
+
+trackstep!(proposal::AbstractArray, params, accept::Union{Val{true},Val{false}}) = 
+           foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)
+
+trackstep!(proposal::NamedTuple, params, accept::Union{Val{true},Val{false}}) = 
+           foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)


Can we remove the type annotations on accept? They are not needed.

Suggested change

trackstep!(proposal::Proposal, params,

::Union{Val{true}, Val{false}}) = nothing

trackstep!(proposal::AbstractArray, params, accept::Union{Val{true},Val{false}}) =

foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)

trackstep!(proposal::NamedTuple, params, accept::Union{Val{true},Val{false}}) =

foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)

trackstep!(proposal::Proposal, params, accept) = nothing

function trackstep!(proposal::AbstractArray, params, accept)

return foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)

end

function trackstep!(proposal::NamedTuple, params, accept)

return foreach((prop, par) -> trackstep!(prop, par, accept), proposal, params)

end

Additionally, again the NamedTuple implementation is not completely correct.

devmotion · 2021-05-27T16:44:23Z

src/mh-core.jl

@@ -157,6 +173,7 @@ function AbstractMCMC.step(
        transition = AdvancedMH.transition(spl, model, init_params)
    end

+    trackstep!(spl.proposal, transition.params)


The step implementations are written for generic AbstractTransitions and MHSampler, so neither spl.proposal nor transition.params might exist.

test/runtests.jl

Co-authored-by: Cameron Pfiffer <cpfiffer@gmail.com> Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

kaandocal · 2021-05-28T10:40:48Z

@arzwa We should also incorporate your univariate adaptive proposal. Any suggestions for a name? We can also change AMProposal.

ParadaCarleton · 2022-12-28T22:03:32Z

@kaandocal Are you still interested in getting this PR merged?

@devmotion

[Diff since v0.6.7](TuringLang/AdvancedMH.jl@v0.6.7...v0.6.8) **Merged pull requests:** - Remove type restriction of log density function (TuringLang#68) (@devmotion)

kaandocal · 2023-01-02T08:25:35Z

We should probably get this PR out of the way. I did not pursue this much further because I had trouble getting it to work on my sampling problems - this approach needs a lot of samples to get a decent estimate of the covariance matrix and seems quite fickle in practice. Due to the nature of MH it just takes a while to explore the target distribution even if you have a decent covariance matrix... AdvancedHMC does a fantastic job at adaptation in my experience, and this is nowhere near that. While it is quite different @ParadaCarleton I do recommend having a look at Emcee in the meanwhile.

ParadaCarleton · 2023-01-08T22:35:26Z

We should probably get this PR out of the way. I did not pursue this much further because I had trouble getting it to work on my sampling problems - this approach needs a lot of samples to get a decent estimate of the covariance matrix and seems quite fickle in practice. Due to the nature of MH it just takes a while to explore the target distribution even if you have a decent covariance matrix... AdvancedHMC does a fantastic job at adaptation in my experience, and this is nowhere near that. While it is quite different @ParadaCarleton I do recommend having a look at Emcee in the meanwhile.

Yep, I totally get that.

The reason an AMH sampler would be useful is less to do with me wanting to use AMH itself, and more to do with AMH being a useful building block in other algorithms. I think @fipelle mentioned an interest in this earlier?

kaandocal added 2 commits May 26, 2021 11:32

Added Adaptive Metropolis algorithm

71f1072

More documentation

3a23129

devmotion reviewed May 26, 2021

View reviewed changes

kaandocal added 2 commits May 26, 2021 13:07

Concrete types for AMProposal

c40d4db

Added reference for the AM algorithm

5afb87e

devmotion reviewed May 26, 2021

View reviewed changes

Merged AMSampler into MetropolisHastings

8cb275d

cpfiffer reviewed May 26, 2021

View reviewed changes

kaandocal added 2 commits May 26, 2021 23:14

Added Welford's algorithm for adaptive covariance calculations

46f77ca

trackstep -> trackstep!

970b0f5

Fixed diagm issue in AdaptiveMH test

c800aaf

devmotion reviewed May 27, 2021

View reviewed changes

Apply suggestions from code review

eb91b36

Co-authored-by: Cameron Pfiffer <cpfiffer@gmail.com> Co-authored-by: David Widmann <devmotion@users.noreply.github.com>

cpfiffer mentioned this pull request Oct 25, 2022

Adaptive proposals in Turing composite Gibbs sampler #44

Closed

Merge tag 'v0.6.8'

77bd3e5

[Diff since v0.6.7](TuringLang/AdvancedMH.jl@v0.6.7...v0.6.8) **Merged pull requests:** - Remove type restriction of log density function (TuringLang#68) (@devmotion)

	proposal.samplesqmean .= trans.params * trans.params'
	mul!(proposal.samplesqmean, trans.params, trans.params')

	proposal.samplemean .= (proposal.samplemean .* proposal.N .+ trans.params) ./ (proposal.N + 1)
	proposal.N += 1
	proposal.samplemean .+= (trans.params .- proposal.samplemean) ./ proposal.N

		proposal.samplesqmean .= (proposal.samplesqmean .* proposal.N + trans.params * trans.params') ./
		(proposal.N + 1)

Adaptive Metropolis algorithm #57

Are you sure you want to change the base?

Adaptive Metropolis algorithm #57

Conversation

kaandocal commented May 26, 2021

devmotion commented May 26, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kaandocal commented May 26, 2021

devmotion commented May 26, 2021

kaandocal commented May 26, 2021

devmotion commented May 26, 2021

devmotion commented May 26, 2021

devmotion commented May 26, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kaandocal commented May 26, 2021

kaandocal commented May 26, 2021 • edited

cpfiffer left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kaandocal commented May 26, 2021

kaandocal commented May 26, 2021

devmotion commented May 26, 2021

kaandocal commented May 26, 2021 • edited

kaandocal commented May 26, 2021

yebai commented May 26, 2021

kaandocal commented May 26, 2021

yebai commented May 26, 2021 • edited

arzwa commented May 26, 2021

kaandocal commented May 26, 2021 • edited

devmotion commented May 26, 2021

kaandocal commented May 26, 2021

cpfiffer commented May 26, 2021

kaandocal commented May 26, 2021

cpfiffer commented May 27, 2021

kaandocal commented May 27, 2021 • edited

cpfiffer commented May 27, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kaandocal commented May 28, 2021

ParadaCarleton commented Dec 28, 2022

kaandocal commented Jan 2, 2023

ParadaCarleton commented Jan 8, 2023

kaandocal commented May 26, 2021 •

edited

kaandocal commented May 26, 2021 •

edited

yebai commented May 26, 2021 •

edited

kaandocal commented May 26, 2021 •

edited

kaandocal commented May 27, 2021 •

edited