# MCMC for LMMs with a single vector-valued random-effects term

Creating a Stan model for linear mixed models (LMMs) with time-varying covariates in the random-effects terms requires a model matrix approach.

A simple example of a model with correlated random effects for slope and intercept is analyzed using both the [MixedModels package](https://github.com/dmbates/MixedModels.jl) and [Stan](http://mc-stan.org/), the latter through the [Stan package](https://github.com/goedman/Stan.jl) for [Julia](http://julialang.org).

First we install the MixedModels package.

In [1]:
Pkg.add("MixedModels")

INFO: Nothing to be done
INFO: METADATA is out-of-date — you may not have the latest version of MixedModels
INFO: Use `Pkg.update()` to get the latest versions of your packages


In [2]:
using DataFrames,Stan,Mamba,RDatasets,MixedModels

Environment variable JULIA_SVG_BROWSER not found.


In [3]:
const slp = dataset("lme4","sleepstudy")

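| Row | Reaction | Days | Subject |
|---|---|---|---|
| 1 | 249.56 | 0 | 308 |
| 2 | 258.7047 | 1 | 308 |
| 3 | 250.8006 | 2 | 308 |
| 4 | 321.4398 | 3 | 308 |
| 5 | 356.8519 | 4 | 308 |
| 6 | 414.6901 | 5 | 308 |
| 7 | 382.2038 | 6 | 308 |
| 8 | 290.1486 | 7 | 308 |
| 9 | 430.5853 | 8 | 308 |
| 10 | 466.3535 | 9 | 308 |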


We start with a model that has vector-valued random effects for a single grouping factor.  In the `MixedModels` package this model is fit as follows.

In [4]:
m1 = fit(lmm(Reaction ~ 1 + Days + (1+Days|Subject), slp))

Linear mixed model fit by maximum likelihood
Formula: Reaction ~ 1 + Days + ((1 + Days) | Subject)

 logLik: -875.969672, deviance: 1751.939344

 Variance components:
                Variance    Std.Dev.  Corr.
 Subject      565.516376   23.780588
               32.682265    5.716840   0.08
 Residual     654.940901   25.591813
 Number of obs: 180; levels of grouping factors: 18

  Fixed-effects parameters:
             Estimate Std.Error z value
(Intercept)   251.405   6.63228 37.9063
Days          10.4673   1.50224 6.96779
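The variance components above can be re-expressed on the relative scale that the Stan model later in this notebook uses, where the random-effects covariance is written as $\sigma^2 \Lambda \Lambda'$ with $\Lambda$ the product of a diagonal matrix of relative standard deviations (`tau`) and the Cholesky factor of the correlation matrix (`L`). The following is a minimal sketch of that arithmetic, assuming a current Julia 1.x with the `LinearAlgebra` standard library. The numbers are copied from the output above, and the correlation is only available rounded to two digits.

```julia
using LinearAlgebra

# Estimates copied from the maximum likelihood fit shown above
sd    = [23.780588, 5.716840]  # random-effects standard deviations
rho   = 0.08                   # reported correlation (rounded in the output)
sigma = 25.591813              # residual standard deviation

# Standard deviations of the random effects relative to sigma
tau = sd ./ sigma

# Lower Cholesky factor of the 2x2 correlation matrix
R = [1.0 rho; rho 1.0]
L = cholesky(R).L

# Relative covariance factor, the analogue of diag_pre_multiply(tau, L) in Stan
Lambda = Matrix(Diagonal(tau) * L)

# sigma^2 * Lambda * Lambda' recovers the covariance matrix of the random effects
Sigma = sigma^2 * Lambda * Lambda'
```

The diagonal of `Sigma` reproduces the two variances in the `Subject` rows of the table, up to rounding.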


Fitting such a model is very fast.

In [5]:
@time fit(lmm(Reaction ~ 1 + Days + (1+Days|Subject), slp));

elapsed time: 0.008625299 seconds (1784296 bytes allocated)


## Creating a Stan model

I have been working through the Stan _User's Guide and Reference Manual_ to create a Stan formulation.  It is a bit tricky because this model has random effects for a time-varying covariate, `Days` in this case, so you need to form the dot product of the random-effects vector for the `m`th `Subject` and a particular row of the random-effects model matrix.

In [Bates et al. (2014)](http://arxiv.org/abs/1406.5823) we describe how we use sparse matrices in the [lme4 package](https://github.com/lme4/lme4) for `R` to "expand" the evaluation of $Z\Lambda u$ across the random effects for each subject.  Matrices $Z$ and $\Lambda$ are both sparse in that formulation.
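For orientation, the model behind the product $Z\Lambda u$ can be written out as follows. This is a summary of the formulation in the paper cited above, with the offset and weights omitted. The per-observation form and the symbols $q$, $s(n)$ and $\Lambda_J$ are notation introduced here for illustration.

$$
\begin{aligned}
(\mathcal{Y} \mid \mathcal{U} = u) &\sim \mathcal{N}\left(X\beta + Z\Lambda u,\ \sigma^2 I_N\right), \\
\mathcal{U} &\sim \mathcal{N}\left(0,\ \sigma^2 I_q\right), \\
b = \Lambda u \quad &\Rightarrow \quad \operatorname{Var}(b) = \sigma^2 \Lambda \Lambda^\top .
\end{aligned}
$$

With a single vector-valued term, $M$ subjects and $J$ random effects per subject, $q = MJ$ and $\Lambda$ is block diagonal with $M$ copies of one $J \times J$ lower-triangular block $\Lambda_J$. The mean of the $n$th observation, which belongs to subject $s(n)$, is then

$$
\mu_n = x_n^\top \beta + z_n^\top \Lambda_J\, u_{s(n)},
$$

where $x_n^\top$ and $z_n^\top$ are the $n$th rows of the fixed-effects and random-effects generator model matrices.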

Here I am using a dense formulation of $Z$.  Suppose that `Zt` is a `J`x`N` matrix and `b` is a `J`x`M` matrix.  I want to create an `N`-vector whose `n`th element is the dot product of the `n`th column of `Zt` and the `subj[n]`th column of `b`, but I haven't been able to work out how to do that in Stan.

It would be straightforward if I knew how to access a column of a matrix.  I have looked through the likely sections of the manual but still don't know how to do that, other than running nested loops and indexing individual array elements, which seems kind of clunky.
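For reference, the target computation itself is easy to state outside Stan. The sketch below writes it in Julia, assuming a current Julia 1.x. The matrices and the helper name `ranef_contrib` are toy values and a hypothetical name made up for illustration, not the sleepstudy data.

```julia
# Toy dimensions: J = 2 random effects, N = 6 observations, M = 2 subjects
Zt = [1.0 1.0 1.0 1.0 1.0 1.0;
      0.0 1.0 2.0 0.0 1.0 2.0]   # J x N, one column per observation
b  = [10.0 -5.0;
       2.0  1.0]                  # J x M, one column per subject
subj = [1, 1, 1, 2, 2, 2]         # subject index of each observation

# Element n is the dot product of column n of Zt with column subj[n] of b
function ranef_contrib(Zt, b, subj)
    J, N = size(Zt)
    out = zeros(N)
    for n in 1:N, j in 1:J
        out[n] += Zt[j, n] * b[j, subj[n]]
    end
    return out
end

contrib = ranef_contrib(Zt, b, subj)

# Equivalent vectorized form: pick the matching column of b for every observation,
# multiply elementwise, and sum down each column
contrib2 = vec(sum(Zt .* b[:, subj], dims=1))
```

The explicit loop is the "clunky" nested-loop version. The vectorized form relies on indexing `b` with the whole `subj` vector to get a `J`x`N` matrix.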

This is my latest attempt, but it fails to compile, with an error message about the assignment in the loop at the end of the `transformed parameters` section (i.e. the loop that assigns `mu[n]`).

In [6]:
VectorOne = """
data {
  int<lower=0>  N; // num observations
  int<lower=1>  K; // length of fixed-effects vector
  int<lower=1>  M; // num subjects
  int<lower=1>  J; // length of individual vector-valued random effects
  int<lower=1,upper=M> subj[N]; // subject indicator, an index in 1..M
  matrix[N,K]   X; // model matrix for fixed-effects parameters
  row_vector[J] Z[N]; // generator model matrix for random-effects
  vector[N]     y; // response vector (reaction time)
}

parameters {
  cholesky_factor_corr[J] L; // Cholesky factor of unconditional correlation of random effects
  vector<lower=0>[J] tau;  // relative standard deviations of unconditional distribution of random effects
  vector[J] u[M]; // spherical random effects
  vector[K] beta; // fixed-effects
  real<lower=0> sigma; // standard deviation of response given random effects
}

transformed parameters {
  matrix[J,J] Lambda; 
  vector[J] b[M];
  vector[N] muX;
  vector[N] mu;
  Lambda <- diag_pre_multiply(tau,L);
  for (m in 1:M)
    b[m] <-  Lambda * u[m];
  muX <- X * beta;
  for (n in 1:N)
    mu[n] = muX[n] + Z[n] * b[subj[n]];
}

model {
  tau ~ cauchy(0,2.5);
  L ~ lkj_corr_cholesky(2);
  for (m in 1:M)
    u[m] ~ normal(0,sigma);  // u[m] is already a vector
  y ~ normal(mu, sigma);
}
""";

In [7]:
const Xt = vcat(ones(Float64,(1,180)),array(slp[:Days])')

2x180 Array{Float64,2}:
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  …  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 0.0  1.0  2.0  3.0  4.0  5.0  6.0  7.0     3.0  4.0  5.0  6.0  7.0  8.0  9.0

In [8]:
const sleepdata = [
    @Compat.Dict("N" => size(slp,1),
    "K" => 2,
    "M" => 18,
    "J" => 2,
    "subj" => array(slp[:Subject]),
    "y" => array(slp[:Reaction]),
    "X" => Xt',
    "Z" => Xt')
]

1-element Array{Dict{ASCIIString,Any},1}:
 ["Z"=>180x2 Array{Float64,2}:
 1.0  0.0
 1.0  1.0
 1.0  2.0
 1.0  3.0
 1.0  4.0
 1.0  5.0
 1.0  6.0
 1.0  7.0
 1.0  8.0
 1.0  9.0
 1.0  0.0
 1.0  1.0
 1.0  2.0
 ⋮       
 1.0  8.0
 1.0  9.0
 1.0  0.0
 1.0  1.0
 1.0  2.0
 1.0  3.0
 1.0  4.0
 1.0  5.0
 1.0  6.0
 1.0  7.0
 1.0  8.0
 1.0  9.0,"J"=>2,"M"=>18,"X"=>180x2 Array{Float64,2}:
 1.0  0.0
 1.0  1.0
 1.0  2.0
 1.0  3.0
 1.0  4.0
 1.0  5.0
 1.0  6.0
 1.0  7.0
 1.0  8.0
 1.0  9.0
 1.0  0.0
 1.0  1.0
 1.0  2.0
 ⋮       
 1.0  8.0
 1.0  9.0
 1.0  0.0
 1.0  1.0
 1.0  2.0
 1.0  3.0
 1.0  4.0
 1.0  5.0
 1.0  6.0
 1.0  7.0
 1.0  8.0
 1.0  9.0,"N"=>180,"subj"=>ASCIIString["308","308","308","308","308","308","308","308","308","308"  …  "372","372","372","372","372","372","372","372","372","372"],"K"=>2,"y"=>[249.56,258.705,250.801,321.44,356.852,414.69,382.204,290.149,430.585,466.353  …  269.412,273.474,297.597,310.632,287.173,329.608,334.482,343.22,369.142,364.124]]
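Note that the `"subj"` entry in the output above holds string labels, while the Stan `data` block declares `subj` as an integer array, so the labels presumably need to be converted to integer codes running from 1 to `M` before the data are passed to Stan. A minimal sketch of one way to do that, assuming a current Julia 1.x and using a short made-up excerpt of labels rather than the full column:

```julia
# Hypothetical excerpt of subject labels, in the same form as the "subj" entry above
labels = ["308", "308", "309", "309", "310", "308"]

# Distinct labels in order of first appearance
ulabels = unique(labels)

# Map each label to its position in ulabels, giving integer codes 1..M
code = Dict(lab => i for (i, lab) in enumerate(ulabels))
subj = [code[lab] for lab in labels]

# Number of subjects implied by the labels
M = length(ulabels)
```

Deriving `M` from the labels in this way also avoids hard-coding the number of subjects in the data dictionary.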

In [9]:
stanmodel = Stanmodel(name="VectorOne", model=VectorOne);


File /home/juser/tmp/VectorOne.stan will be updated.



In [10]:
sim1 = stan(stanmodel, sleepdata)



--- Translating Stan model to C++ code ---
bin/stanc /home/juser/tmp/VectorOne.stan --o=/home/juser/tmp/VectorOne.cpp --no_main
Model name=VectorOne_model
Input file=/home/juser/tmp/VectorOne.stan
Output file=/home/juser/tmp/VectorOne.cpp
ErrorException("failed process: Process(`make /home/juser/tmp/VectorOne`, ProcessExited(2)) [2]")
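The output above only reports that `make` failed and does not show the underlying message. One plausible cause, not verified against the Stan version used here, is that the `mu[n]` line uses `=` while every other assignment in the model uses `<-`. Stan releases contemporary with this notebook used `<-` for assignment, whereas current releases use `=` and declare arrays as `array[N] ...`. The following is an untested sketch of the same model in current Stan syntax, stored in a Julia string in the same way as `VectorOne`. The name `VectorOneFixed` is hypothetical, and the model has not been compiled or run for this notebook.

```julia
VectorOneFixed = """
data {
  int<lower=0> N;                        // num observations
  int<lower=1> K;                        // length of fixed-effects vector
  int<lower=1> M;                        // num subjects
  int<lower=1> J;                        // length of each random-effects vector
  array[N] int<lower=1,upper=M> subj;    // subject index in 1..M
  matrix[N, K] X;                        // fixed-effects model matrix
  array[N] row_vector[J] Z;              // rows of the random-effects generator matrix
  vector[N] y;                           // response vector (reaction time)
}
parameters {
  cholesky_factor_corr[J] L;             // Cholesky factor of the correlation matrix
  vector<lower=0>[J] tau;                // relative standard deviations
  array[M] vector[J] u;                  // spherical random effects
  vector[K] beta;                        // fixed effects
  real<lower=0> sigma;                   // residual standard deviation
}
transformed parameters {
  matrix[J, J] Lambda = diag_pre_multiply(tau, L);
  array[M] vector[J] b;
  vector[N] mu = X * beta;
  for (m in 1:M)
    b[m] = Lambda * u[m];
  for (n in 1:N)
    mu[n] += Z[n] * b[subj[n]];          // row_vector * vector gives a scalar
}
model {
  tau ~ cauchy(0, 2.5);
  L ~ lkj_corr_cholesky(2);
  for (m in 1:M)
    u[m] ~ normal(0, sigma);
  y ~ normal(mu, sigma);
}
""";
```

This sketch uses a single assignment operator throughout, bounds `subj` by `M`, constrains `tau` to be nonnegative, and samples each `u[m]` directly as a vector. Indexing `Z[n]` and `b[subj[n]]` as array elements sidesteps the need to extract a matrix column.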
