-
Notifications
You must be signed in to change notification settings - Fork 49
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
8 changed files
with
98 additions
and
189 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
# Parametric bootstrap for linear mixed-effects models | ||
|
||
Julia is well-suited to implementing bootstrapping and other simulation-based methods for statistical models. | ||
The `bootstrap!` function in the [MixedModels package](https://github.com/dmbates/MixedModels.jl) provides | ||
an efficient parametric bootstrap for linear mixed-effects models, assuming that the results of interest | ||
from each simulated response vector can be incorporated into a vector of floating-point values. | ||
|
||
## The parametric bootstrap | ||
|
||
[Bootstrapping](https://en.wikipedia.org/wiki/Bootstrapping_(statistics)) is a family of procedures | ||
for generating sample values of a statistic, allowing for visualization of the distribution of the | ||
statistic or for inference from this sample of values. | ||
|
||
A _parametric bootstrap_ is used with a parametric model, `m`, that has been fitted to data. | ||
The procedure is to simulate `n` response vectors from `m` using the estimated parameter values | ||
and refit `m` to these responses in turn, accumulating the statistics of interest at each iteration. | ||
|
||
The parameters of a linear mixed-effects model as fit by the `lmm` function are the fixed-effects | ||
parameters, `β`, the standard deviation, `σ`, of the per-observation noise, and the covariance | ||
parameter, `θ`, that defines the variance-covariance matrices of the random effects. | ||
|
||
For example, a simple linear mixed-effects model for the `Dyestuff` data in the [`lme4`](http://github.com/lme4/lme4) | ||
package for [`R`](https://www.r-project.org) is fit by | ||
```{julia;term=true} | ||
using DataFrames, Gadfly, MixedModels | ||
``` | ||
```{julia;echo=false;results="hidden"} | ||
include(Pkg.dir("MixedModels", "test", "data.jl")) | ||
``` | ||
```{julia;term=true} | ||
show(ds) # Dyestuff data set | ||
``` | ||
```{julia;term=true} | ||
m1 = fit!(lmm(Yield ~ 1 + (1 | Batch), ds)) | ||
``` | ||
|
||
|
||
## Using the `bootstrap!` function | ||
|
||
This quick explanation is provided for those who only wish to use the `bootstrap!` method and do not need | ||
detailed explanations of how it works. | ||
The three arguments to `bootstrap!` are the matrix that will be overwritten with the results, the model to bootstrap, | ||
and a function that overwrites a vector with the results of interest from the model. | ||
|
||
Suppose the objective is to obtain 100,000 parametric bootstrap samples of the estimates of the "variance | ||
components", `σ²` and `σ₁²`, in this model. In many implementations of mixed-effects models the | ||
estimate of `σ₁²`, the variance of the scalar random effects, is reported along with a | ||
standard error, as if the estimator could be assumed to have a Gaussian distribution. | ||
Is this a reasonable assumption? | ||
|
||
A suitable function to save the results is | ||
```{julia;term=true} | ||
function saveresults!(v, m) | ||
v[1] = varest(m) | ||
v[2] = abs2(getθ(m)[1]) * v[1] | ||
end | ||
``` | ||
The `varest` extractor function returns the estimate of `σ²`. As seen above, the estimate of the | ||
`σ₁` is the product of `Θ` and the estimate of `σ`. The expression `abs2(getΘ(m)[1])` evaluates to | ||
`Θ²`. The `[1]` is necessary because the value returned by `getθ` is a vector and a scalar is needed | ||
here. | ||
|
||
As with any simulation-based method, it is advisable to set the random number seed before calling | ||
`bootstrap!` for reproducibility. | ||
```{julia;term=true;} | ||
srand(1234321); | ||
``` | ||
```{julia;term=true;} | ||
results = bootstrap!(zeros(2, 100000), m1, saveresults!); | ||
``` | ||
The results for each bootstrap replication are stored in the columns of the matrix passed in as the first | ||
argument. A density plot of the first row using the [`Gadfly`](https://github.com/dcjones/Gadfly.jl) package | ||
is created as | ||
```{julia;eval=false;term=true} | ||
plot(x = sub(results, 1, :), Geom.density(), Guide.xlabel("Parametric bootstrap estimates of σ²")) | ||
``` | ||
```{julia;echo=false;fig_cap="Density of parametric bootstrap estimates of σ² from model m1"; fig_width=8;} | ||
plot(x = sub(results, 1, :), Geom.density(), Guide.xlabel("Parametric bootstrap estimates of σ²")) | ||
``` | ||
```{julia;echo=false;fig_cap="Density of parametric bootstrap estimates of σ₁² from model m1"; fig_width=8;} | ||
plot(x = sub(results, 2, :), Geom.density(), Guide.xlabel("Parametric bootstrap estimates of σ₁²")) | ||
``` | ||
|
||
The distribution of the bootstrap samples of `σ²` is a bit skewed but not terribly so. However, the | ||
distribution of the bootstrap samples of the estimate of `σ₁²` is highly skewed and has a spike at | ||
zero. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file not shown.
This file was deleted.
Oops, something went wrong.
This file was deleted.
Oops, something went wrong.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Oops, something went wrong.