A Julia package that defines a container format for Markov Chain Monte Carlo samples.
The purpose of this package is to facilitate cooperation between Markov Chain Monte Carlo statistics packages in the Julia ecosystem by providing common data structures.
Packages which produce MCMC chains can make them available as an object of type `MCMCChain`, which is a collection of `AbstractVector`s, indexed by `Symbol`s, together with some extra information: draw indices (e.g. `1001:10:2000` means that the first `1000` draws are not included and the rest are thinned by a factor of `10`), information on what part of the chain should end up in the pooled sample for inference, and arbitrary metadata, e.g. information on adaptation, sampling, or even continuing the chain.
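
The structure described above can be sketched roughly as follows. This is only an illustration: the type name `SketchChain` and its field names are hypothetical and are not taken from this package's actual `MCMCChain` definition.

```julia
# Hypothetical illustration only; not the package's actual MCMCChain type.
struct SketchChain{R<:AbstractRange{Int}}
    draws::Dict{Symbol,AbstractVector}   # one vector of draws per variable
    indices::R                           # which sampler iterations were kept
    metadata::Dict{Symbol,Any}           # e.g. adaptation or sampler settings
end

indices = 1001:10:2000   # skip the first 1000 draws, then keep every 10th
n = length(indices)      # number of retained draws

chain = SketchChain(
    Dict{Symbol,AbstractVector}(:mu => randn(n), :sigma => abs.(randn(n))),
    indices,
    Dict{Symbol,Any}(:sampler => "hypothetical sampler"),
)

# Every variable must hold exactly one draw per retained index.
@assert all(length(v) == length(chain.indices) for v in values(chain.draws))
```

Storing the indices as a range rather than a vector keeps the burn-in and thinning information explicit without storing one integer per draw.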
Other packages can implement convergence diagnostics on single `MCMCChain`s or collections of them.
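
As an example of the kind of diagnostic another package might compute on a collection of chains, here is a minimal sketch of the classic (non-split) Gelman-Rubin Rhat for one scalar variable. The function name `sketch_rhat` is hypothetical, and the choice of this particular Rhat variant is an assumption, since the text does not say which one is intended.

```julia
using Statistics

# Hypothetical helper, not part of this package: classic Gelman-Rubin Rhat
# for one scalar variable, given equal-length draw vectors from m >= 2 chains.
function sketch_rhat(chains::AbstractVector{<:AbstractVector{<:Real}})
    m = length(chains)                   # number of chains
    n = length(first(chains))            # draws per chain
    @assert m >= 2 && all(length(c) == n for c in chains)
    chain_means = mean.(chains)
    B = n * var(chain_means)             # between-chain variance
    W = mean(var.(chains))               # mean within-chain variance
    var_plus = (n - 1) / n * W + B / n   # pooled estimate of posterior variance
    return sqrt(var_plus / W)
end

chains = [randn(500) for _ in 1:4]       # synthetic, well-mixed chains
rhat = sketch_rhat(chains)               # should be close to 1 here
```

Values well above 1 indicate that the chains disagree with each other, for example when their means are far apart relative to the within-chain spread.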
Once the user is satisfied with convergence and wants to perform inference or posterior predictive checks, `pool` can merge the chains, discarding the draws as indicated. This results in an `MCMCDraws` object, which is just a wrapper around a `Dict` that enforces uniform length and a consistent order of keys.
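
The pooling step can be sketched as below. The function `sketch_pool`, the plain `Dict` representation of a chain, and the `keep` ranges are all illustrative assumptions; this is not the package's `pool` implementation, and the result is a plain `Dict` rather than an `MCMCDraws` object.

```julia
# Hypothetical sketch: each chain is a Dict of draw vectors, and keep[i] says
# which positions of chain i should enter the pooled sample.
function sketch_pool(chains, keep)
    vars = sort!(collect(keys(first(chains))))   # fix one consistent key order
    @assert all(sort!(collect(keys(c))) == vars for c in chains)
    pooled = Dict(v => reduce(vcat, [c[v][k] for (c, k) in zip(chains, keep)])
                  for v in vars)
    n = length(pooled[first(vars)])
    @assert all(length(x) == n for x in values(pooled))   # uniform length
    return pooled
end

chain1 = Dict(:mu => collect(1.0:10.0),  :sigma => fill(0.5, 10))
chain2 = Dict(:mu => collect(11.0:20.0), :sigma => fill(0.7, 10))

# Keep only the second half of each chain for inference.
pooled = sketch_pool([chain1, chain2], [6:10, 6:10])
```

Here the discarded first half of each chain plays the role of warm-up draws, and the pooled vectors contain five draws from each chain.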

    +-------+                              +--------------------------+
    | Stan  |----+                         | convergence diagnostics: |
    +-------+    |                         |  - Rhat                  |
                 |     +-------------+     |  - plots                 |
    +-------+    |     |             |==>==|  - effective sample size |
    | Klara |----+     | MCMCChain 1 |     |  - ...                   |
    +-------+    |     | MCMCChain 2 |     +--------------------------+
                 +==>==| MCMCChain 3 |
    +-------+    |     | ...         |     +-------------------+
    | Mamba |----+     |             |==>==| discard and pool: |
    +-------+    |     +-------------+     | MCMCDraws         |
                 |                         +-------------------+
    +-------+    |                                  ||
    | ...   |----+                                  \/
    +-------+                         +-----------------------------+
                                      | posterior inference         |
                                      |  - plots                    |
                                      |  - quantiles/HPD intervals  |
                                      |                             |
                                      | posterior predictive checks |
                                      |                             |
                                      | ...                         |
                                      +-----------------------------+

- Draws of variables are "reconstituted" from a tabular format. For example, a matrix-valued variable is not stored elementwise as separate vectors of its scalar elements, but as a vector of `Matrix`es.
- The chain of a single variable is a vector. This is not the most compact storage format, since a `Vector{Array}` could be repacked as a single `Array`. For compact storage, a `Vector` of static arrays is recommended, which can be packed and stored compactly. This package should work fine with any `AbstractVector`.
- Posterior predictive checks and other simulations can be implemented very simply using `broadcast`.
- This package should accommodate various storage formats; JLD is recommended.
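
To illustrate the points about reconstituted draws and `broadcast`, here is a minimal sketch using synthetic stand-ins for pooled posterior draws. The variable names, the `simulate` function, and the observed value `y_obs` are hypothetical and only serve the example.

```julia
using LinearAlgebra, Statistics

n = 200
mu    = randn(n)                  # scalar variable: one Float64 per draw
sigma = abs.(randn(n)) .+ 0.1     # positive scale: one Float64 per draw

# Matrix-valued variable stored as a vector of matrices, one matrix per draw.
Sigma = [[s^2 0.0; 0.0 1.0] for s in sigma]

# Each element is a complete draw, so per-draw summaries broadcast directly.
total_var = tr.(Sigma)

# Posterior predictive simulation: one replicated observation per draw.
simulate(m, s) = m + s * randn()
y_rep = broadcast(simulate, mu, sigma)

y_obs = 0.3                       # hypothetical observed value
p = mean(y_rep .>= y_obs)         # fraction of replicates at least as large
```

Because every variable is a vector with one element per draw, the same pattern works whether the elements are scalars, matrices, or other structures.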