Doc updates, full tests re-enabled, osx disabled in travis
awllee committed Dec 8, 2017
1 parent a399adf commit 441770c
Showing 12 changed files with 43 additions and 32 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -2,7 +2,7 @@
language: julia
os:
- linux
- osx
# - osx
julia:
- 0.6
- nightly
11 changes: 4 additions & 7 deletions README.md
@@ -1,11 +1,8 @@
# SequentialMonteCarlo
# SequentialMonteCarlo.jl

[![Build Status](https://travis-ci.org/awllee/SequentialMonteCarlo.jl.svg?branch=master)](https://travis-ci.org/awllee/SequentialMonteCarlo.jl)

[![Coverage Status](https://coveralls.io/repos/awllee/SequentialMonteCarlo.jl/badge.svg?branch=master&service=github)](https://coveralls.io/github/awllee/SequentialMonteCarlo.jl?branch=master)

[![codecov.io](http://codecov.io/github/awllee/SequentialMonteCarlo.jl/coverage.svg?branch=master)](http://codecov.io/github/awllee/SequentialMonteCarlo.jl?branch=master)

[![](https://img.shields.io/badge/docs-latest-blue.svg)](https://awllee.github.io/SequentialMonteCarlo.jl/latest)

This package provides a light interface to a serial and multi-threaded implementation of the Sequential Monte Carlo (SMC) algorithm. SMC is a randomized algorithm for approximate numerical integration and/or sampling.
@@ -85,9 +82,9 @@ println(SequentialMonteCarlo.allEtas(smcio, p->p.x*p.x, false))
println(SequentialMonteCarlo.allEtas(smcio, p->p.x, true))
println(SequentialMonteCarlo.allEtas(smcio, p->p.x*p.x, true))

## Now try with 4 threads instead of 1 and time the algorithm. There are
## compilation overheads the first time for the parallel parts of the SMC
## implementation. There are still a number of small allocations the second
## Now try with Threads.nthreads() threads instead of 1 and time the algorithm.
## There are compilation overheads the first time for the parallel parts of the
## SMC implementation. There are still a number of small allocations the second
## time; there is an allocation for each parallel region in the algorithm, and
## there are a few such regions at each step of the algorithm. This is due to
## Julia's multi-threading interface.
1 change: 1 addition & 0 deletions docs/make.jl
@@ -17,6 +17,7 @@ makedocs(
"Implementation notes" => "impl.md"
"SMC interface" => "smcinterface.md"
"Types and functions" => "guide.md"
"Hidden Markov models" => "hmm.md"
"References" => "refs.md"
]
)
2 changes: 1 addition & 1 deletion docs/src/csmc.md
@@ -20,7 +20,7 @@ Input: reference path $(x_{1}^{{\rm ref}},\ldots,x_{n}^{{\rm ref}})$
1. Output the path $(\zeta_{1}^{K_{1}},\ldots,\zeta_{n}^{K_{n}})$.
---

cSMC is of special interest as it defines a Markov kernel that is ergodic, for $N \geq 2$, with invariant probability measure given by \\[ \hat{\pi}(A)=\hat{Z}_{n}^{-1}\int_{A}G_{n}(x_{n})M_{1}({\rm d}x_{1})\prod_{p=2}^{n}G_{p-1}(x_{p-1})M_{p}(x_{p-1},{\rm d}x_{p}), \\] for $A \in \mathcal{X}^{\otimes{n}}$.
cSMC is of special interest as it defines a Markov kernel that is ergodic, for $N \geq 2$, with invariant probability measure given by \\[ \hat{\pi}(A)=\hat{Z}_{n}^{-1}\int_{A}G_{n}(x_{n})M_{1}({\rm d}x_{1})\prod_{p=2}^{n}G_{p-1}(x_{p-1})M_{p}(x_{p-1},{\rm d}x_{p}), \qquad A \in \mathcal{X}^{\otimes{n}}. \\]

That is, the Markov kernel $P_{N}(x^{{\rm ref}}, \cdot)$ defined by calling the cSMC algorithm has invariant probability measure $\hat{\pi}$. Theoretical results, in particular concerning the effect of $N$ on the rate of convergence of $P_{N}$, have been developed by Chopin & Singh (2015), Andrieu et al. (2018) and Lindsten et al. (2015).

9 changes: 9 additions & 0 deletions docs/src/hmm.md
@@ -0,0 +1,9 @@
# Connection to hidden Markov models

Probably the most common use for SMC is in statistical applications involving hidden Markov models (HMMs), where the methodology originates. Indeed, there is a very simple correspondence between an HMM and an SMC model. The purpose of this part of the documentation is only to clarify how different SMC models can be associated with the same HMM.

A simple HMM can be described as a bivariate Markov chain $(X_1,Y_1), \ldots, (X_n, Y_n)$ as follows. $X_1, \ldots, X_n$ is a Markov chain determined by an initial distribution $\mu$ and Markov transition densities $f_2, \ldots, f_n$. For each $p \in \{1, \ldots, n\}$, $Y_p$ is conditionally independent of all other random variables given $X_p$, and $Y_p \mid (X_p = x)$ has density $g_p(x, \cdot)$. Statistical inference for HMMs involves performing inference after observing $(y_1, \ldots, y_n)$ as a realization of $(Y_1, \ldots, Y_n)$.

A standard use of SMC for such an HMM is as follows. Let the observations $(y_1, \ldots, y_n)$ be given and choose $M_1 = \mu$, $M_p(x, dx') = f_p(x, x') dx'$ for $p \in \{2, \ldots, n\}$ and $G_p(x) = g_p(x, y_p)$ for $p \in \{1, \ldots, n\}$. Notice that the potential function $G_p$ is defined using the observation $y_p$. Then it can be deduced that $\hat{Z}_p$ is the *marginal likelihood* associated with the first $p$ observations, $\eta_p$ is the *predictive* distribution of $X_p$ given the first $p-1$ observations and $\hat{\eta}_p$ is the *filtering* distribution of $X_p$ given the first $p$ observations. That is, \\[ X_p \mid \\{(Y_1, \ldots, Y_{p-1}) = (y_1, \ldots, y_{p-1})\\} \sim \eta_p \\] and \\[ X_p \mid \\{(Y_1, \ldots, Y_{p}) = (y_1, \ldots, y_{p})\\} \sim \hat{\eta}_p. \\]

However, this is only one way to map the HMM into an SMC model. One can instead extend the HMM's state space and define $M_1(d(x_{\rm prev}, x)) = \mu(dx) \delta_x(dx_{\rm prev})$, and \\[ M_p((x_{\rm prev},x), d(x_{\rm prev}',x')) = q_p(x, x') dx' \delta_x(dx_{\rm prev}'), \qquad p \in \\{2, \ldots, n\\}, \\] where $q_p$ is such that $q_p(x, x') > 0$ whenever $f_p(x, x') > 0$. Then one can define $G_1(x_{\rm prev}, x) = g_1(x, y_1)$ and \\[ G_p(x_{\rm prev}, x) = \frac{f_p(x_{\rm prev}, x)}{q_p(x_{\rm prev}, x)} g_p(x, y_p), \qquad p \in \\{2, \ldots, n\\}, \\] and it is still the case that $\hat{Z}_p$ is the marginal likelihood associated with the first $p$ observations, $\eta_p$ is the predictive distribution of $X_p$ given the first $p-1$ observations and $\hat{\eta}_p$ is the filtering distribution of $X_p$ given the first $p$ observations. An example of using different SMC models for a given HMM is provided in [SMCExamples: comparison of Linear Gaussian models](https://github.com/awllee/SMCExamples.jl/blob/master/demo/lgComparisonDemo.jl); the model associated with the "locally optimal proposal" is defined [here](https://github.com/awllee/SMCExamples.jl/blob/master/src/lgLOPModel.jl).
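
As a concrete illustration, here is a minimal sketch of the first, standard mapping for a scalar Gaussian HMM, written against the ```SMCModel``` interface described on the SMC interface page; the particle type ```FloatParticle```, the observation vector ```ys``` and the function names are illustrative assumptions.

```
## Scalar Gaussian HMM: X_1 ~ N(0,1), X_p = X_{p-1} + V_p, Y_p = X_p + W_p,
## with V_p and W_p independent standard normals; observations ys assumed given.
mutable struct FloatParticle
  x::Float64
  FloatParticle() = new()
end

## Mutation: M_1 = mu and M_p(x, dx') = f_p(x, x') dx' for p >= 2.
function M!(newParticle::FloatParticle, rng, p::Int64, particle::FloatParticle, ::Void)
  if p == 1
    newParticle.x = randn(rng)
  else
    newParticle.x = particle.x + randn(rng)
  end
end

## Log-potential: log G_p(x) = log g_p(x, y_p); the normalizing constant is kept
## so that the Z_p approximation targets the marginal likelihood.
function lG(p::Int64, particle::FloatParticle, ::Void)
  return -0.5 * log(2 * pi) - 0.5 * (ys[p] - particle.x)^2
end

model = SMCModel(M!, lG, length(ys), FloatParticle, Void)
```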
10 changes: 6 additions & 4 deletions docs/src/impl.md
@@ -1,9 +1,11 @@
# Implementation notes

The multi-threaded implementation of SMC is fairly straightforward. The attempt to efficiently utilize multiple cores is largely focused around low-overhead resampling / selection, since the remaining steps are embarrassingly parallel.
The multi-threaded implementation of SMC is fairly straightforward. The attempt to efficiently utilize multiple cores is largely focused on using appropriate data structures and low-overhead resampling / selection, since the remaining steps are embarrassingly parallel.

There is also some attention paid to Julia's threading interface and, for both parallel and serial execution, the need to avoid dynamic memory allocations. All code for the main package was written using Julia's core and base functionality.
There is also some attention paid to Julia's threading interface and, for both parallel and serial execution, the need to avoid dynamic memory allocations by pre-allocating reusable memory. All code for the package was written using Julia's core and base functionality.

Generating the ancestor indices in sorted order is accomplished primarily through use of the uniform spacings method for generating a sequence of sorted uniform random variates developed by Lurie & Hartley (1972) and described by Devroye (1986, p. 214). In multi-threaded code, the multinomial variate giving the number of ancestor indices each thread should simulate using a thread-specific local vector of particle weights is simulated by using a combination of inversion sampling and a a Julia implementation of the ```btrd``` algorithm of Hörmann (1993), which simulates binomially distributed variates in expected $\mathcal{O}(1)$ time. Simulating a multinomial random variate is accomplished by simulating a sequence of appropriate Binomial random variates. The code-generating function for copying particles was adapted from [a suggestion on Julia Discourse](https://discourse.julialang.org/t/how-to-copy-all-fields-without-changing-the-referece/945/5) by Greg Plowman.
Generating the ancestor indices in sorted order is accomplished primarily through use of the uniform spacings method for generating a sequence of sorted uniform random variates developed by Lurie & Hartley (1972) and described by Devroye (1986, p. 214). In multi-threaded code, the multinomial variate giving the number of ancestor indices each thread should simulate using a thread-specific subvector of particle weights is simulated using a combination of inversion sampling and an implementation of the ```btrd``` algorithm of Hörmann (1993), which simulates binomially distributed variates in expected $\mathcal{O}(1)$ time. Simulating a multinomial random variate is accomplished by simulating a sequence of appropriate binomial random variates. The code-generating function for copying particles was adapted from [a suggestion on Julia Discourse](https://discourse.julialang.org/t/how-to-copy-all-fields-without-changing-the-referece/945/5) by Greg Plowman.
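
A minimal sketch of the uniform spacings method (illustrative, not the package's internal code): the partial sums of $n+1$ i.i.d. standard exponential variates, normalized by the total, are distributed as the order statistics of $n$ i.i.d. uniform variates.

```
## Fill us with sorted Uniform(0,1) variates in O(n) time via uniform spacings.
function sortedUniforms!(rng, us::Vector{Float64})
  n = length(us)
  s = 0.0
  for i in 1:n
    s += randexp(rng)  # cumulative sums of i.i.d. Exp(1) variates
    us[i] = s
  end
  s += randexp(rng)    # one extra exponential for the normalization
  for i in 1:n
    us[i] /= s         # normalized partial sums are sorted uniforms
  end
  return us
end
```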

For the demonstration code, use of the [StaticArrays.jl](https://github.com/JuliaArrays/StaticArrays.jl) package provides dramatic improvements for multivariate particles, and is highly recommended for fixed-size vector components of particles. It is possible to use the counter-based RNGs in the [RandomNumbers.jl](https://github.com/sunoru/RandomNumbers.jl) package in place of the default random number generator; this was not working on the Julia-0.7 master branch at the time of development, and is slightly slower than the MersenneTwister RNG provided by Julia's base.
For the demonstration code in the [SMCExamples](https://github.com/awllee/SMCExamples.jl) package, use of [StaticArrays.jl](https://github.com/JuliaArrays/StaticArrays.jl) provides dramatic improvements for multivariate particles, and is highly recommended for fixed-size vector components of particles.
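
For example, a particle with a fixed-size vector component might be declared as follows (a hypothetical sketch; the type name and dimension are illustrative):

```
using StaticArrays

## A multivariate particle whose state is a fixed-size, stack-allocated vector.
mutable struct MVFloatParticle
  x::SVector{2, Float64}
  MVFloatParticle() = new()
end
```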

It is possible to use the counter-based RNGs in the [RandomNumbers.jl](https://github.com/sunoru/RandomNumbers.jl) package in place of the default random number generator; this was not working on the Julia-0.7 development branch at the time of writing, and is slightly slower than the MersenneTwister RNG provided by Julia's base. In the future, allowing users to choose between the two, or to specify their own RNGs, would be useful.
2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -3,5 +3,5 @@
*A light interface to serial and multi-threaded Sequential Monte Carlo*

```@contents
Pages = [ "intro.md", "smcintegrals.md", "smcalgorithm.md", "smctheory.md", "smcve.md", "smcadaptive.md", "csmc.md", "impl.md", "smcinterface.md", "guide.md", "refs.md"]
Pages = [ "intro.md", "smcintegrals.md", "smcalgorithm.md", "smctheory.md", "smcve.md", "smcadaptive.md", "csmc.md", "impl.md", "smcinterface.md", "guide.md", "hmm.md", "refs.md"]
```
2 changes: 1 addition & 1 deletion docs/src/intro.md
@@ -1,4 +1,4 @@
# Introduction
# [Introduction](@id intro)

Sequential Monte Carlo (SMC) algorithms are defined in terms of an initial distribution $M_1$, a sequence of Markov transition kernels $M_{2}, \ldots, M_n$ and a sequence of non-negative potential functions $G_1, \ldots, G_n$.

6 changes: 4 additions & 2 deletions docs/src/smcinterface.md
@@ -2,7 +2,7 @@

## Specifying an SMC model

We recall from the [Introduction](@ref) that the SMC algorithm is defined in terms of $M_1, \ldots, M_n$ and $G_1, \ldots, G_n$.
We recall from the [Introduction](@ref intro) that the SMC algorithm is defined in terms of $M_1, \ldots, M_n$ and $G_1, \ldots, G_n$.

A ```model``` of type ```SMCModel``` can be created by calling
```
@@ -47,10 +47,12 @@ with ```hat``` determining which approximation is returned. Calling this functio

One can extract the approximations $(\eta^N_p(f) \geq 0, \log |\gamma^N_p(f)|)$ or $(\hat{\eta}^N_p(f) \geq 0, \log |\hat{\gamma}^N_p(f)|)$ by calling
```
slgamma(smcio, f, hat::Bool, p::Int64)
SequentialMonteCarlo.slgamma(smcio, f, hat::Bool, p::Int64)
```
with ```hat``` determining which approximation is returned. Calling this function with ```p < smcio.n``` requires ```smcio.fullOutput = true```.

If ```smcio.fullOutput = true``` one can call ```SequentialMonteCarlo.allEtas``` or ```SequentialMonteCarlo.allGammas``` to obtain a vector of outputs of ```SequentialMonteCarlo.eta``` or ```SequentialMonteCarlo.slgamma```, respectively.
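
For example (a hypothetical usage sketch, assuming ```smcio.fullOutput = true``` and particles with a scalar field ```x```):
```
SequentialMonteCarlo.eta(smcio, p -> p.x, true, smcio.n)      # approximates eta-hat_n(f)
SequentialMonteCarlo.slgamma(smcio, p -> p.x, true, smcio.n)  # sign and log|gamma-hat_n(f)|
SequentialMonteCarlo.allEtas(smcio, p -> p.x, true)           # eta-hat_p(f) for p = 1, ..., n
```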

## Variance estimators

The following functions relate to the approximations detailed in the [variance estimators page](@ref vepage).
2 changes: 1 addition & 1 deletion docs/src/smcve.md
@@ -36,7 +36,7 @@ Lee & Whiteley (2015) also define approximations $v_{p,n}^N(f)$ and $\hat{v}_{p,
---
**Theorem** [Lee & Whiteley, 2015]. Let the potential functions $G_1, \ldots, G_n$ be bounded and strictly positive. The following hold for an arbitrary, bounded $f$:

1. Lack-of-bias: $\mathbb{E}\left[\left(\hat{Z}_n^N\right)^2\hat{v}_{p,n}^N(f)\right]=\hat{Z}_n^2\hat{\sigma}_n^2(f)$ for all $N\geq1$.
1. Lack-of-bias: $\mathbb{E}\left[\left(\hat{Z}_n^N\right)^2\hat{v}_{p,n}^N(f)\right]=\hat{Z}_n^2\hat{v}_{p,n}(f)$ for all $N\geq1$.

2. Consistency: $\hat{v}_{p,n}^N(f)\overset{P}{\rightarrow}\hat{v}_{p,n}(f)$ and $\hat{v}_{p,n}^N(f-\hat{\eta}_n^N(f))\overset{P}{\rightarrow}\hat{v}_{p,n}(f-\hat{\eta}_n(f))$.

2 changes: 1 addition & 1 deletion test/REQUIRE
@@ -1,2 +1,2 @@
#SMCExamples
SMCExamples
StaticArrays
26 changes: 13 additions & 13 deletions test/runtests.jl
@@ -19,16 +19,16 @@ end
@time include("multinomial_test.jl")
end

# include("finiteFK_test.jl")
#
# @testset "LGModel tests" begin
# @time include("lgModel_test.jl")
# end
#
# @testset "MVLGModel tests" begin
# @time include("mvlgModel_test.jl")
# end
#
# @testset "SMC Sampler tests" begin
# @time include("smcSampler_test.jl")
# end
include("finiteFK_test.jl")

@testset "LGModel tests" begin
@time include("lgModel_test.jl")
end

@testset "MVLGModel tests" begin
@time include("mvlgModel_test.jl")
end

@testset "SMC Sampler tests" begin
@time include("smcSampler_test.jl")
end
