Merge de0e5b2 into 56431bd

TuringLang · Nov 17, 2021 · 4e73964 · 4e73964
2 parents 56431bd + de0e5b2
commit 4e73964

Showing 2 changed files with 193 additions and 0 deletions.
diff --git a/docs/src/api.md b/docs/src/api.md
@@ -76,3 +76,168 @@ For chains of this type, AbstractMCMC defines the following two methods.
 AbstractMCMC.chainscat
 AbstractMCMC.chainsstack
 ```
+
+## Interacting with states of samplers
+
+To make it a bit easier to interact with some arbitrary sampler state, we encourage implementations of `AbstractSampler` to implement the following methods:
+```@docs
+AbstractMCMC.parameters
+AbstractMCMC.setparameters!!
+```
+and optionally
+```@docs
+AbstractMCMC.updatestate!!(state, transition, state_prev)
+```
+These methods can also be useful for implementing samplers that wrap other samplers, e.g. a mixture of samplers.
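+
+As a rough sketch of what implementing these methods might look like, the snippet below defines local stand-ins for `parameters` and `setparameters!!` (so that it runs without AbstractMCMC) and implements them for a hypothetical transition type and two hypothetical state types. `ToyTransition`, `ImmutableState` and `MutableState` are made up for illustration; a real sampler would extend `AbstractMCMC.parameters` and `AbstractMCMC.setparameters!!` for its own types.
+
+```julia
+# Local stand-ins for the generic functions (assumption: a real
+# implementation would extend the AbstractMCMC functions instead).
+function parameters end
+function setparameters!! end
+
+# Hypothetical transition that simply stores the realizations.
+struct ToyTransition{T}
+    θ::T
+end
+parameters(t::ToyTransition) = t.θ
+
+# Immutable state: `setparameters!!` returns a new object.
+struct ImmutableState{T}
+    θ::T
+    iteration::Int
+end
+setparameters!!(s::ImmutableState, θ) = ImmutableState(θ, s.iteration)
+
+# Mutable state: `setparameters!!` updates in place and returns the same object.
+mutable struct MutableState{T}
+    θ::T
+    iteration::Int
+end
+function setparameters!!(s::MutableState, θ)
+    s.θ = θ
+    return s
+end
+
+t = ToyTransition([1.0, 2.0])
+s1 = setparameters!!(ImmutableState([0.0, 0.0], 3), parameters(t))
+m = MutableState([0.0, 0.0], 3)
+s2 = setparameters!!(m, parameters(t))
+```
+
+In both cases the caller should use the returned value, which is what the `!!` suffix is meant to signal.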
+
+### Example: `MixtureSampler`
+
+In a `MixtureSampler` we need two things:
+- `components`: collection of samplers.
+- `weights`: collection of weights representing the probability of choosing the corresponding sampler.
+
+```julia
+struct MixtureSampler{W,C} <: AbstractMCMC.AbstractSampler
+    components::C
+    weights::W
+end
+```
+
+To implement the state, we need to keep track of a couple of things:
+- `index`: the index of the sampler used in this `step`.
+- `transition`: the transition resulting from this `step`.
+- `states`: the current states of _all_ the components.
+
+Two aspects of this might seem a bit strange:
+1. We need to keep track of the states of _all_ components rather than just the state for the sampler we used previously.
+2. We need to put the `transition` from the `step` into the state.
+
+The reason for (1) is that lots of samplers keep track of more than just the previous realizations of the variables, e.g. in `AdvancedHMC.jl` we keep track of the momentum used, the metric used, etc.
+
+For (2) the reason is similar: some samplers might keep track of the variables _in the state_ differently, e.g. you might have a sampler which is _independent_ of the current realizations and the state is simply `nothing`. 
+
+Hence, we need the `transition`, which should always contain the realizations, to make sure we can resume from the same point in the space in the next `step`.
+```julia
+struct MixtureState{T,S}
+    index::Int
+    transition::T
+    states::S
+end
+```
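+
+To make point (2) concrete, here is a minimal sketch of handing over from a sampler whose state is `nothing` to a sampler that stores the realizations in its state. The generic functions are local stand-ins mirroring the definitions in `src/AbstractMCMC.jl`; `ToyTransition` and `RandomWalkState` are hypothetical types used only for illustration.
+
+```julia
+# Local stand-ins mirroring the generic functions and default methods.
+function parameters end
+function setparameters!! end
+updatestate!!(state, transition_prev, state_prev) = updatestate!!(state, transition_prev)
+updatestate!!(state, transition) = setparameters!!(state, parameters(transition))
+
+# Hypothetical transition type shared by both toy samplers.
+struct ToyTransition{T}
+    θ::T
+end
+parameters(t::ToyTransition) = t.θ
+
+# Hypothetical random-walk state: the position plus a tuning parameter.
+struct RandomWalkState{T}
+    θ::T
+    stepsize::Float64
+end
+setparameters!!(s::RandomWalkState, θ) = RandomWalkState(θ, s.stepsize)
+
+# The previous step came from an "independent" sampler whose state is `nothing`,
+# so the realizations are only available through the transition.
+state_prev = nothing
+transition_prev = ToyTransition([0.3, -1.2])
+
+# Resume the random-walk sampler from where the previous sampler left off.
+state_rw = RandomWalkState([0.0, 0.0], 0.5)
+state_rw = updatestate!!(state_rw, transition_prev, state_prev)
+```
+
+The tuning parameter `stepsize` survives the update while the position is taken from the transition, which is why the `MixtureState` stores both the per-component states and the latest transition.
+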
+The `step` for a `MixtureSampler` is defined by the following generative process
+```math
+\begin{aligned}
+i &\sim \mathrm{Categorical}(w_1, \dots, w_k) \\
+X_t &\sim \mathcal{K}_i(\cdot \mid X_{t - 1})
+\end{aligned}
+```
+where ``\mathcal{K}_i`` denotes the i-th kernel/sampler, and ``w_i`` denotes the weight/probability of choosing the i-th sampler.
+[`AbstractMCMC.updatestate!!`](@ref) comes into play in defining/computing ``\mathcal{K}_i(\cdot \mid X_{t - 1})`` since ``X_{t - 1}`` could be coming from a different sampler. 
+
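+The generative process can be sketched without any sampler machinery. The snippet below is only an illustration under simplifying assumptions: the two kernels are hypothetical Gaussian random walks on a scalar, and `sample_index`, `mixture_step` and `run_chain` are helper names introduced here, not part of AbstractMCMC.
+
+```julia
+using Random
+
+# Draw index `i` with probability `weights[i] / sum(weights)` via inverse-CDF sampling.
+function sample_index(rng, weights)
+    u = rand(rng) * sum(weights)
+    acc = zero(u)
+    for (i, w) in enumerate(weights)
+        acc += w
+        u <= acc && return i
+    end
+    return length(weights)  # guard against floating-point round-off
+end
+
+# Two hypothetical kernels: random walks with a small and a large step.
+kernels = (
+    (rng, x) -> x + 0.1 * randn(rng),
+    (rng, x) -> x + 2.0 * randn(rng),
+)
+weights = (0.1, 0.9)
+
+# One draw from the mixture: first pick `i`, then apply the i-th kernel to `x_prev`.
+function mixture_step(rng, kernels, weights, x_prev)
+    i = sample_index(rng, weights)
+    return i, kernels[i](rng, x_prev)
+end
+
+# Iterate the mixture kernel `n` times, recording the chosen indices and the draws.
+function run_chain(rng, kernels, weights, x0, n)
+    idxs = Vector{Int}(undef, n)
+    xs = Vector{Float64}(undef, n)
+    x = x0
+    for t in 1:n
+        idxs[t], x = mixture_step(rng, kernels, weights, x)
+        xs[t] = x
+    end
+    return idxs, xs
+end
+
+idxs, xs = run_chain(Random.MersenneTwister(1), kernels, weights, 0.0, 1_000)
+```
+
+Here the "state" is just the scalar `x`, so no state conversion is needed; with real samplers that conversion is the role of `updatestate!!`.
+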
+If we let `state` be the current `MixtureState`, `i` the current component, and `i_prev` the previous component we sampled from, then this translates into the following piece of code:
+
+```julia
+# Update the corresponding state, i.e. `state.states[i]`, using
+# the transition and state from the previous iteration.
+# Argument order follows `updatestate!!(state, transition_prev, state_prev)`.
+state_current = AbstractMCMC.updatestate!!(
+    state.states[i], state.transition, state.states[i_prev]
+)
+
+# Take a `step` for this sampler using the updated state.
+transition, state_current = AbstractMCMC.step(
+    rng, model, sampler_current, state_current;
+    kwargs...
+)
+```
+
+The full [`AbstractMCMC.step`](@ref) implementation would then be something like:
+
+```julia
+function AbstractMCMC.step(rng, model::AbstractMCMC.AbstractModel, sampler::MixtureSampler, state; kwargs...)
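+    # Sample the component to use in this `step`, drawing from `rng` for reproducibility.
+    # NOTE: `Categorical` is assumed to come from Distributions.jl.
+    i = rand(rng, Categorical(sampler.weights))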
+    sampler_current = sampler.components[i]
+
+    # Update the corresponding state, i.e. `state.states[i]`, using
+    # the state and transition from the previous iteration.
+    i_prev = state.index
+    state_current = AbstractMCMC.updatestate!!(
+        state.states[i], state.transition, state.states[i_prev]
+    )
+
+    # Take a `step` for this sampler using the updated state.
+    transition, state_current = AbstractMCMC.step(
+        rng, model, sampler_current, state_current;
+        kwargs...
+    )
+
+    # Create the new states.
+    # NOTE: A better approach would be to use `Setfield.@set state.states[i] = ...`
+    # but to keep this demo self-contained, we don't.
+    states_new = ntuple(length(state.states)) do j
+        if j != i
+            state.states[j]
+        else
+            state_current
+        end
+    end
+
+    # Create the new `MixtureState`.
+    state_new = MixtureState(i, transition, states_new)
+
+    return transition, state_new
+end
+```
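+
+If the component states are stored in a `Tuple`, one dependency-free way to replace a single entry is the non-exported `Base.setindex`, which returns a new tuple rather than mutating. The following is only a sketch with placeholder symbols standing in for real sampler states:
+
+```julia
+# Placeholder component states, for illustration only.
+states = (:state_a, :state_b, :state_c)
+i = 2
+state_current = :state_b_updated
+
+# Returns a new tuple with entry `i` replaced; `states` itself is left untouched.
+states_new = Base.setindex(states, state_current, i)
+```
+
+This assumes `state.states` is a tuple; for other containers a different approach (e.g. the `Setfield.@set` mentioned in the comment above) may be more appropriate.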
+
+And for the initial [`AbstractMCMC.step`](@ref) we have:
+
+```julia
+function AbstractMCMC.step(rng, model::AbstractMCMC.AbstractModel, sampler::MixtureSampler; kwargs...)
+    # Initialize every state.
+    transitions_and_states = map(sampler.components) do spl
+        AbstractMCMC.step(rng, model, spl; kwargs...)
+    end
+
+    # Sample the component to use in this `step`, drawing from `rng` for reproducibility.
+    i = rand(rng, Categorical(sampler.weights))
+    # Extract the corresponding transition.
+    transition = first(transitions_and_states[i])
+    # Extract states.
+    states = map(last, transitions_and_states)
+    # Create new `MixtureState`.
+    state = MixtureState(i, transition, states)
+
+    return transition, state
+end
+```
+
+To use `MixtureSampler` with two samplers `sampler1` and `sampler2` as components, we'd simply do
+
+```julia
+# Arguments follow the field order of `MixtureSampler`: components first, then weights.
+sampler = MixtureSampler((sampler1, sampler2), [0.1, 0.9])
+transition, state = AbstractMCMC.step(rng, model, sampler)
+while ...
+    transition, state = AbstractMCMC.step(rng, model, sampler, state)
+end
+```
+
+As a final note, there is one potential issue the above implementation does not address: many samplers come with their own implementation of `AbstractMCMC.AbstractModel`, so all the component samplers would have to be compatible with the same model. An easy workaround is to add a struct called `ManyModels` that supports `getindex`, so that `models[i]` returns the i-th `model`:
+
+```julia
+struct ManyModels{M} <: AbstractMCMC.AbstractModel
+    models::M
+end
+
+Base.getindex(model::ManyModels, I...) = model.models[I...]
+```
+
+Then the above `step` would just extract the `model` corresponding to the current sampler:
+
+```julia
+# Take a `step` for this sampler using the updated state.
+transition, state_current = AbstractMCMC.step(
+    rng, model[i], sampler_current, state_current;
+    kwargs...
+)
+```
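+
+As a quick illustration of the indexing behaviour, the sketch below uses a local stand-in for `AbstractMCMC.AbstractModel` and two hypothetical model types, `ToyModelA` and `ToyModelB`, so that it runs on its own:
+
+```julia
+# Local stand-in for `AbstractMCMC.AbstractModel` (assumption, to keep the sketch self-contained).
+abstract type AbstractModel end
+
+struct ManyModels{M} <: AbstractModel
+    models::M
+end
+Base.getindex(model::ManyModels, I...) = model.models[I...]
+
+# Hypothetical per-sampler model types.
+struct ToyModelA <: AbstractModel end
+struct ToyModelB <: AbstractModel end
+
+models = ManyModels((ToyModelA(), ToyModelB()))
+
+# The i-th sampler would receive the i-th model.
+model_for_first = models[1]
+model_for_second = models[2]
+```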
+
+This issue should eventually disappear as the community moves towards a unified approach to implement `AbstractMCMC.AbstractModel`.
diff --git a/src/AbstractMCMC.jl b/src/AbstractMCMC.jl
@@ -79,6 +79,34 @@ The `MCMCSerial` algorithm allows users to sample serially, with no thread or pr
 """
 struct MCMCSerial <: AbstractMCMCEnsemble end
 
+"""
+    updatestate!!(state, transition_prev[, state_prev])
+
+Return a new instance of `state` using information from `transition_prev` and, optionally, `state_prev`.
+
+Defaults to `setparameters!!(state, parameters(transition_prev))`.
+"""
+updatestate!!(state, transition_prev, state_prev) = updatestate!!(state, transition_prev)
+updatestate!!(state, transition) = setparameters!!(state, parameters(transition))
+
+"""
+    setparameters!!(state, parameters)
+
+Update the parameters of the `state` with `parameters` and return it.
+
+If `state` can be updated in-place, it is expected that this function returns `state` with updated
+parameters. Otherwise a new `state` object with the new `parameters` is returned.
+"""
+function setparameters!! end
+
+"""
+    parameters(transition)
+
+Return parameters in `transition`.
+"""
+function parameters end
+
+
 include("samplingstats.jl")
 include("logging.jl")
 include("interface.jl")