Doc updates, full tests re-enabled, osx disabled in travis
awllee committed Dec 8, 2017
1 parent a399adf commit 441770c
Showing 12 changed files with 43 additions and 32 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -2,7 +2,7 @@
language: julia
os:
- linux
- osx
# - osx
julia:
- 0.6
- nightly
11 changes: 4 additions & 7 deletions README.md
@@ -1,11 +1,8 @@
# SequentialMonteCarlo
# SequentialMonteCarlo.jl

[![Build Status](https://travis-ci.org/awllee/SequentialMonteCarlo.jl.svg?branch=master)](https://travis-ci.org/awllee/SequentialMonteCarlo.jl)

[![Coverage Status](https://coveralls.io/repos/awllee/SequentialMonteCarlo.jl/badge.svg?branch=master&service=github)](https://coveralls.io/github/awllee/SequentialMonteCarlo.jl?branch=master)

[![codecov.io](http://codecov.io/github/awllee/SequentialMonteCarlo.jl/coverage.svg?branch=master)](http://codecov.io/github/awllee/SequentialMonteCarlo.jl?branch=master)

[![](https://img.shields.io/badge/docs-latest-blue.svg)](https://awllee.github.io/SequentialMonteCarlo.jl/latest)

This package provides a light interface to a serial and multi-threaded implementation of the Sequential Monte Carlo (SMC) algorithm. SMC is a randomized algorithm for approximate numerical integration and/or sampling.
@@ -85,9 +82,9 @@ println(SequentialMonteCarlo.allEtas(smcio, p->p.x*p.x, false))
println(SequentialMonteCarlo.allEtas(smcio, p->p.x, true))
println(SequentialMonteCarlo.allEtas(smcio, p->p.x*p.x, true))

## Now try with 4 threads instead of 1 and time the algorithm. There are
## compilation overheads the first time for the parallel parts of the SMC
## implementation. There are still a number of small allocations the second
## Now try with Threads.nthreads() threads instead of 1 and time the algorithm.
## There are compilation overheads the first time for the parallel parts of the
## SMC implementation. There are still a number of small allocations the second
## time; there is an allocation for each parallel region in the algorithm, and
## there are a few such regions at each step of the algorithm. This is due to
## Julia's multi-threading interface.
1 change: 1 addition & 0 deletions docs/make.jl
@@ -17,6 +17,7 @@ makedocs(
"Implementation notes" => "impl.md"
"SMC interface" => "smcinterface.md"
"Types and functions" => "guide.md"
"Hidden Markov models" => "hmm.md"
"References" => "refs.md"
]
)
2 changes: 1 addition & 1 deletion docs/src/csmc.md
@@ -20,7 +20,7 @@ Input: reference path $(x_{1}^{{\rm ref}},\ldots,x_{n}^{{\rm ref}})$
1. Output the path $(\zeta_{1}^{K_{1}},\ldots,\zeta_{n}^{K_{n}})$.
---

cSMC is of special interest as it defines a Markov kernel that is ergodic, for $N \geq 2$, with invariant probability measure given by \\[ \hat{\pi}(A)=\hat{Z}_{n}^{-1}\int_{A}G_{n}(x_{n})M_{1}({\rm d}x_{1})\prod_{p=2}^{n}G_{p-1}(x_{p-1})M_{p}(x_{p-1},{\rm d}x_{p}), \\] for $A \in \mathcal{X}^{\otimes{n}}$.
cSMC is of special interest as it defines a Markov kernel that is ergodic, for $N \geq 2$, with invariant probability measure given by \\[ \hat{\pi}(A)=\hat{Z}_{n}^{-1}\int_{A}G_{n}(x_{n})M_{1}({\rm d}x_{1})\prod_{p=2}^{n}G_{p-1}(x_{p-1})M_{p}(x_{p-1},{\rm d}x_{p}), \qquad A \in \mathcal{X}^{\otimes{n}}. \\]

That is, the Markov kernel $P_{N}(x^{{\rm ref}}, \cdot)$ defined by calling the cSMC algorithm has invariant probability measure $\hat{\pi}$. Theoretical results, in particular concerning the effect of $N$ on the rate of convergence of $P_{N}$, have been developed by Chopin & Singh (2015), Andrieu et al. (2018) and Lindsten et al. (2015).

9 changes: 9 additions & 0 deletions docs/src/hmm.md
@@ -0,0 +1,9 @@
# Connection to hidden Markov models

Probably the most common use for SMC is in statistical applications involving hidden Markov models (HMMs), where the methodology originates. Indeed, there is a very simple correspondence between an HMM and an SMC model. The purpose of this part of the documentation is only to clarify how different SMC models can be associated with the same HMM.

A simple HMM can be described as a bivariate Markov chain $(X_1,Y_1), \ldots, (X_n, Y_n)$ as follows. $X_1, \ldots, X_n$ is a Markov chain determined by an initial distribution $\mu$ and Markov transition densities $f_2, \ldots, f_n$. For each $p \in \{1, \ldots, n\}$, $Y_p$ is conditionally independent of all other random variables given $X_p$, and $Y_p \mid (X_p = x)$ has density $g_p(x, \cdot)$. Statistical inference for HMMs involves performing inference after observing $(y_1, \ldots, y_n)$ as a realization of $(Y_1, \ldots, Y_n)$.

A standard use of SMC for such an HMM is as follows. Let the observations $(y_1, \ldots, y_n)$ be given and choose $M_1 = \mu$, $M_p(x, dx') = f_p(x, x') dx'$ for $p \in \{2, \ldots, n\}$ and $G_p(x) = g_p(x, y_p)$ for $p \in \{1, \ldots, n\}$. Notice that the potential function $G_p$ is defined using the observation $y_p$. Then it can be deduced that $\hat{Z}_p$ is the *marginal likelihood* associated with the first $p$ observations, $\eta_p$ is the *predictive* distribution of $X_p$ given the first $p-1$ observations and $\hat{\eta}_p$ is the *filtering* distribution of $X_p$ given the first $p$ observations. That is, \\[ X_p \mid \\{(Y_1, \ldots, Y_{p-1}) = (y_1, \ldots, y_{p-1})\\} \sim \eta_p \\] and \\[ X_p \mid \\{(Y_1, \ldots, Y_{p}) = (y_1, \ldots, y_{p})\\} \sim \hat{\eta}_p. \\]

However, this is only one way to map the HMM into an SMC model. One can instead extend the HMM's state space and define $M_1(d(x_{\rm prev}, x)) = \mu(dx) \delta_x(dx_{\rm prev})$, and \\[ M_p((x_{\rm prev},x), d(x_{\rm prev}',x')) = q_p(x, x') dx' \delta_x(dx_{\rm prev}'), \qquad p \in \\{2, \ldots, n\\}, \\] where $q_p$ is such that $q_p(x, x') > 0$ whenever $f_p(x, x') > 0$. Then one can define $G_1(x_{\rm prev}, x) = g_1(x, y_1)$ and \\[ G_p(x_{\rm prev}, x) = \frac{f_p(x_{\rm prev}, x)}{q_p(x_{\rm prev}, x)} g_p(x, y_p), \qquad p \in \\{2, \ldots, n\\}, \\] and it is still the case that $\hat{Z}_p$ is the marginal likelihood associated with the first $p$ observations, $\eta_p$ is the predictive distribution of $X_p$ given the first $p-1$ observations and $\hat{\eta}_p$ is the filtering distribution of $X_p$ given the first $p$ observations. An example of using different SMC models for a given HMM is provided in [SMCExamples: comparison of Linear Gaussian models](https://github.com/awllee/SMCExamples.jl/blob/master/demo/lgComparisonDemo.jl); the model associated with the "locally optimal proposal" is defined [here](https://github.com/awllee/SMCExamples.jl/blob/master/src/lgLOPModel.jl).
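
As a concrete illustration, here is a minimal sketch of the first, standard mapping for a scalar Gaussian HMM, written against the ```SMCModel``` interface described on the SMC interface page; the particle type ```FloatParticle```, the observation vector ```ys``` and the function names are illustrative assumptions.

```
## Scalar Gaussian HMM: X_1 ~ N(0,1), X_p = X_{p-1} + V_p, Y_p = X_p + W_p,
## with V_p and W_p independent standard normals; observations ys assumed given.
mutable struct FloatParticle
  x::Float64
  FloatParticle() = new()
end

## Mutation: M_1 = mu and M_p(x, dx') = f_p(x, x') dx' for p >= 2.
function M!(newParticle::FloatParticle, rng, p::Int64, particle::FloatParticle, ::Void)
  if p == 1
    newParticle.x = randn(rng)
  else
    newParticle.x = particle.x + randn(rng)
  end
end

## Log-potential: log G_p(x) = log g_p(x, y_p); the normalizing constant is kept
## so that the Z_p approximation targets the marginal likelihood.
function lG(p::Int64, particle::FloatParticle, ::Void)
  return -0.5 * log(2 * pi) - 0.5 * (ys[p] - particle.x)^2
end

model = SMCModel(M!, lG, length(ys), FloatParticle, Void)
```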
10 changes: 6 additions & 4 deletions docs/src/impl.md
@@ -1,9 +1,11 @@
# Implementation notes

The multi-threaded implementation of SMC is fairly straightforward. The attempt to efficiently utilize multiple cores is largely focused around low-overhead resampling / selection, since the remaining steps are embarrassingly parallel.
The multi-threaded implementation of SMC is fairly straightforward. The attempt to efficiently utilize multiple cores is largely focused on using appropriate data structures and low-overhead resampling / selection, since the remaining steps are embarrassingly parallel.

There is also some attention paid to Julia's threading interface and, for both parallel and serial execution, the need to avoid dynamic memory allocations. All code for the main package was written using Julia's core and base functionality.
There is also some attention paid to Julia's threading interface and, for both parallel and serial execution, the need to avoid dynamic memory allocations by pre-allocating reusable memory. All code for the package was written using Julia's core and base functionality.

Generating the ancestor indices in sorted order is accomplished primarily through use of the uniform spacings method for generating a sequence of sorted uniform random variates developed by Lurie & Hartley (1972) and described by Devroye (1986, p. 214). In multi-threaded code, the multinomial variate giving the number of ancestor indices each thread should simulate using a thread-specific local vector of particle weights is simulated by using a combination of inversion sampling and a a Julia implementation of the ```btrd``` algorithm of Hörmann (1993), which simulates binomially distributed variates in expected $\mathcal{O}(1)$ time. Simulating a multinomial random variate is accomplished by simulating a sequence of appropriate Binomial random variates. The code-generating function for copying particles was adapted from [a suggestion on Julia Discourse](https://discourse.julialang.org/t/how-to-copy-all-fields-without-changing-the-referece/945/5) by Greg Plowman.
Generating the ancestor indices in sorted order is accomplished primarily through use of the uniform spacings method for generating a sequence of sorted uniform random variates developed by Lurie & Hartley (1972) and described by Devroye (1986, p. 214). In multi-threaded code, the multinomial variate giving the number of ancestor indices each thread should simulate using a thread-specific subvector of particle weights is simulated using a combination of inversion sampling and an implementation of the ```btrd``` algorithm of Hörmann (1993), which simulates binomially distributed variates in expected $\mathcal{O}(1)$ time. Simulating a multinomial random variate is accomplished by simulating a sequence of appropriate binomial random variates. The code-generating function for copying particles was adapted from [a suggestion on Julia Discourse](https://discourse.julialang.org/t/how-to-copy-all-fields-without-changing-the-referece/945/5) by Greg Plowman.
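
A minimal sketch of the uniform spacings method (illustrative, not the package's internal code): the partial sums of $n+1$ i.i.d. standard exponential variates, normalized by the total, are distributed as the order statistics of $n$ i.i.d. uniform variates.

```
## Fill us with sorted Uniform(0,1) variates in O(n) time via uniform spacings.
function sortedUniforms!(rng, us::Vector{Float64})
  n = length(us)
  s = 0.0
  for i in 1:n
    s += randexp(rng)  # cumulative sums of i.i.d. Exp(1) variates
    us[i] = s
  end
  s += randexp(rng)    # one extra exponential for the normalization
  for i in 1:n
    us[i] /= s         # normalized partial sums are sorted uniforms
  end
  return us
end
```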

For the demonstration code, use of the [StaticArrays.jl](https://github.com/JuliaArrays/StaticArrays.jl) package provides dramatic improvements for multivariate particles, and is highly recommended for fixed-size vector components of particles. It is possible to use the counter-based RNGs in the [RandomNumbers.jl](https://github.com/sunoru/RandomNumbers.jl) package in place of the default random number generator; this was not working on the Julia-0.7 master branch at the time of development, and is slightly slower than the MersenneTwister RNG provided by Julia's base.
For the demonstration code in the [SMCExamples](https://github.com/awllee/SMCExamples.jl) package, use of [StaticArrays.jl](https://github.com/JuliaArrays/StaticArrays.jl) provides dramatic improvements for multivariate particles, and is highly recommended for fixed-size vector components of particles.
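
For example, a particle with a fixed-size vector component might be declared as follows (a hypothetical sketch; the type name and dimension are illustrative):

```
using StaticArrays

## A multivariate particle whose state is a fixed-size, stack-allocated vector.
mutable struct MVFloatParticle
  x::SVector{2, Float64}
  MVFloatParticle() = new()
end
```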

It is possible to use the counter-based RNGs in the [RandomNumbers.jl](https://github.com/sunoru/RandomNumbers.jl) package in place of the default random number generator; this was not working on the Julia-0.7 development branch at the time of writing, and is slightly slower than the MersenneTwister RNG provided by Julia's base. In the future, allowing users to choose between the two, or to specify their own RNGs, would be useful.
2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -3,5 +3,5 @@
*A light interface to serial and multi-threaded Sequential Monte Carlo*

```@contents
Pages = [ "intro.md", "smcintegrals.md", "smcalgorithm.md", "smctheory.md", "smcve.md", "smcadaptive.md", "csmc.md", "impl.md", "smcinterface.md", "guide.md", "refs.md"]
Pages = [ "intro.md", "smcintegrals.md", "smcalgorithm.md", "smctheory.md", "smcve.md", "smcadaptive.md", "csmc.md", "impl.md", "smcinterface.md", "guide.md", "hmm.md", "refs.md"]
```
2 changes: 1 addition & 1 deletion docs/src/intro.md
@@ -1,4 +1,4 @@
# Introduction
# [Introduction](@id intro)

Sequential Monte Carlo (SMC) algorithms are defined in terms of an initial distribution $M_1$, a sequence of Markov transition kernels $M_{2}, \ldots, M_n$ and a sequence of non-negative potential functions $G_1, \ldots, G_n$.

6 changes: 4 additions & 2 deletions docs/src/smcinterface.md
@@ -2,7 +2,7 @@

## Specifying an SMC model

We recall from the [Introduction](@ref) that the SMC algorithm is defined in terms of $M_1, \ldots, M_n$ and $G_1, \ldots, G_n$.
We recall from the [Introduction](@ref intro) that the SMC algorithm is defined in terms of $M_1, \ldots, M_n$ and $G_1, \ldots, G_n$.

A ```model``` of type ```SMCModel``` can be created by calling
```
@@ -47,10 +47,12 @@ with ```hat``` determining which approximation is returned. Calling this functio

One can extract the approximations $(\eta^N_p(f) \geq 0, \log |\gamma^N_p(f)|)$ or $(\hat{\eta}^N_p(f) \geq 0, \log |\hat{\gamma}^N_p(f)|)$ by calling
```
slgamma(smcio, f, hat::Bool, p::Int64)
SequentialMonteCarlo.slgamma(smcio, f, hat::Bool, p::Int64)
```
with ```hat``` determining which approximation is returned. Calling this function with ```p < smcio.n``` requires ```smcio.fullOutput = true```.

If ```smcio.fullOutput = true``` one can call ```SequentialMonteCarlo.allEtas``` or ```SequentialMonteCarlo.allGammas``` to obtain a vector of outputs of ```SequentialMonteCarlo.eta``` or ```SequentialMonteCarlo.slgamma```, respectively.
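
For example (a hypothetical usage sketch, assuming ```smcio.fullOutput = true``` and particles with a scalar field ```x```):
```
SequentialMonteCarlo.eta(smcio, p -> p.x, true, smcio.n)      # approximates eta-hat_n(f)
SequentialMonteCarlo.slgamma(smcio, p -> p.x, true, smcio.n)  # sign and log|gamma-hat_n(f)|
SequentialMonteCarlo.allEtas(smcio, p -> p.x, true)           # eta-hat_p(f) for p = 1, ..., n
```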

## Variance estimators

The following functions relate to the approximations detailed in the [variance estimators page](@ref vepage).
2 changes: 1 addition & 1 deletion docs/src/smcve.md
@@ -36,7 +36,7 @@ Lee & Whiteley (2015) also define approximations $v_{p,n}^N(f)$ and $\hat{v}_{p,
---
**Theorem** [Lee & Whiteley, 2015]. Let the potential functions $G_1, \ldots, G_n$ be bounded and strictly positive. The following hold for an arbitrary, bounded $f$:

1. Lack-of-bias: $\mathbb{E}\left[\left(\hat{Z}_n^N\right)^2\hat{v}_{p,n}^N(f)\right]=\hat{Z}_n^2\hat{\sigma}_n^2(f)$ for all $N\geq1$.
1. Lack-of-bias: $\mathbb{E}\left[\left(\hat{Z}_n^N\right)^2\hat{v}_{p,n}^N(f)\right]=\hat{Z}_n^2\hat{v}_{p,n}(f)$ for all $N\geq1$.

2. Consistency: $\hat{v}_{p,n}^N(f)\overset{P}{\rightarrow}\hat{v}_{p,n}(f)$ and $\hat{v}_{p,n}^N(f-\hat{\eta}_n^N(f))\overset{P}{\rightarrow}\hat{v}_{p,n}(f-\hat{\eta}_n(f))$.

2 changes: 1 addition & 1 deletion test/REQUIRE
@@ -1,2 +1,2 @@
#SMCExamples
SMCExamples
StaticArrays
26 changes: 13 additions & 13 deletions test/runtests.jl
@@ -19,16 +19,16 @@ end
@time include("multinomial_test.jl")
end

# include("finiteFK_test.jl")
#
# @testset "LGModel tests" begin
# @time include("lgModel_test.jl")
# end
#
# @testset "MVLGModel tests" begin
# @time include("mvlgModel_test.jl")
# end
#
# @testset "SMC Sampler tests" begin
# @time include("smcSampler_test.jl")
# end
include("finiteFK_test.jl")

@testset "LGModel tests" begin
@time include("lgModel_test.jl")
end

@testset "MVLGModel tests" begin
@time include("mvlgModel_test.jl")
end

@testset "SMC Sampler tests" begin
@time include("smcSampler_test.jl")
end
