# Topics for March 2, 2016

My goal is to provide fast Markov chain Monte Carlo (MCMC) methods for linear mixed-effects models, with possible extensions to generalized linear mixed models.

There are several MCMC frameworks for [Julia](http://julialang.org). An interface to [Stan](http://mc-stan.org) is available and there are two native Julia implementations: [Mamba](https://github.com/brian-j-smith/Mamba.jl) and [Lora](https://github.com/JuliaStats/Lora.jl). I prefer `Mamba` because of its flexibility. The problem with Stan, BUGS and JAGS is that each of them reinvents the data structures, input/output, data manipulation, distribution definitions, etc. in its own environment. Each also defines a domain-specific language (DSL), for which it must provide a parser, an interpreter, a run-time environment, etc.

A native implementation like Mamba can use all of the facilities of Julia and its packages.

## Linear predictor

Whenever a model contains a linear predictor (i.e. an $\bf X\beta$ type of expression) there is a good chance that you can write out the full conditional distribution of $\beta$ or obtain a good approximation to it. If you can write out the conditional distribution, you can use a multivariate Gibbs sampler to obtain a vector-valued sample from the conditional. This avoids one of the underlying problems of MCMC methods: successive sampling from the conditionals of correlated parameters, which makes the chain mix slowly. Consider a case with hundreds or thousands of random effects and dozens of fixed effects. You don't want to sample the coefficients one at a time when you can sample the entire vector in one step.
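As a minimal worked case, assume a Gaussian response, ${\bf y}\sim\mathcal{N}({\bf X}\beta,\sigma^2{\bf I})$, a flat prior on $\beta$, and an ${\bf X}$ of full column rank. These assumptions are for illustration only; the models of interest add random effects and proper priors. Completing the square in $\beta$ gives the full conditional in closed form:

$$
p(\beta\mid{\bf y},\sigma)\;\propto\;\exp\left(-\frac{\|{\bf y}-{\bf X}\beta\|^2}{2\sigma^2}\right)\;\propto\;\exp\left(-\frac{(\beta-\hat\beta)'{\bf X}'{\bf X}(\beta-\hat\beta)}{2\sigma^2}\right),\qquad \hat\beta=({\bf X}'{\bf X})^{-1}{\bf X}'{\bf y}
$$

so that $\beta\mid{\bf y},\sigma\sim\mathcal{N}\left(\hat\beta,\;\sigma^2({\bf X}'{\bf X})^{-1}\right)$, and the whole coefficient vector can be drawn in a single step.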

## Multivariate normal conditionals

In most cases the conditional distribution of the coefficients (i.e. both random and fixed effects) is a multivariate normal with known mean and covariance matrix. It is worthwhile examining the representation of this distribution in Julia. Not surprisingly, the representation involves the mean and the covariance, but the form of the covariance is encoded in the type. For example, a common prior distribution for coefficients is a zero-mean multivariate normal with a covariance that is a large multiple of the identity, i.e. a diffuse prior.

In [1]:
using Distributions, Mamba, PDMats
d = MvNormal(2, 1000.)

ZeroMeanIsoNormal(
dim: 2
μ: [0.0,0.0]
Σ: 2x2 Array{Float64,2}:
 1.0e6  0.0  
 0.0    1.0e6
)


If you take apart the representation itself you discover that, apart from the dimension, the only values that are stored are the scalar $\sigma^2$ and its inverse. Here $\sigma = 1000$, so $\sigma^2 = 10^6$.

In [2]:
fieldnames(d)

2-element Array{Symbol,1}:
 :μ
 :Σ

In [3]:
typeof(d.μ)

Distributions.ZeroVector{Float64}

In [5]:
typeof(d.Σ)

PDMats.ScalMat{Float64}

In [6]:
fieldnames(d.Σ)

3-element Array{Symbol,1}:
 :dim      
 :value    
 :inv_value

In [7]:
(d.Σ.dim, d.Σ.value, d.Σ.inv_value)

(2,1.0e6,1.0e-6)
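A plausible reading of this design (my interpretation, not something stated in the PDMats documentation quoted here) is that the three stored fields are all that density evaluation needs. For a zero-mean normal of dimension $n$ with covariance $\sigma^2{\bf I}$, the log-density at ${\bf x}$ is

$$
\log f({\bf x}) = -\frac{n}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\,{\bf x}'{\bf x}
$$

which uses only `dim` ($n$), `value` ($\sigma^2$) and `inv_value` ($1/\sigma^2$). No $n\times n$ matrix is ever formed or factored.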

## Sampling with a prior

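Let's defer the issue of a prior for a moment and suppose that we have the matrix $\bf X'X$, the vector $\bf X'y$ and the scalar $\sigma$ defining the log-likelihood. Without the prior, the mean of the conditional distribution of $\beta$ is the least squares estimate, $\hat\beta = ({\bf X'X})^{-1}{\bf X'y}$, which the following cells evaluate for a small example.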

In [9]:
X = hcat(ones(5), [1.:5;])

5x2 Array{Float64,2}:
 1.0  1.0
 1.0  2.0
 1.0  3.0
 1.0  4.0
 1.0  5.0

In [10]:
XtX = PDMat(X'X)

Base.LinAlg.Cholesky{Float64,Array{Float64,2}} with factor:


PDMats.PDMat{Float64,Array{Float64,2}}(2,2x2 Array{Float64,2}:
  5.0  15.0
 15.0  55.0,2x2 UpperTriangular{Float64,Array{Float64,2}}:
 2.23607  6.7082 
 0.0      3.16228)

In [14]:
y = [1.,3,3,3,5];
Xty = X'y

2-element Array{Float64,1}:
 15.0
 53.0

In [16]:
βhat = XtX \ Xty

2-element Array{Float64,1}:
 0.6
 0.8
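
With the mean in hand, one vector-valued draw from the conditional follows from the Cholesky factor that has already been computed. The block below is a minimal sketch, not the Mamba sampler: it assumes a flat prior, a known $\sigma$ (the value `1.0` is a placeholder) and Julia 1.x with the standard `LinearAlgebra` library rather than the `PDMats` calls used in the cells above. If ${\bf X'X} = {\bf R'R}$ with ${\bf R}$ upper triangular and ${\bf z}$ is a standard normal vector, then $\hat\beta + \sigma{\bf R}^{-1}{\bf z}$ has mean $\hat\beta$ and covariance $\sigma^2({\bf X'X})^{-1}$.

```julia
using LinearAlgebra, Random

# Same small example as in the cells above
X = hcat(ones(5), collect(1.0:5.0))
y = [1.0, 3.0, 3.0, 3.0, 5.0]

# Factor X'X once: X'X = R'R, with R = F.U upper triangular
F = cholesky(Symmetric(X'X))

# Conditional mean under a flat prior (the least squares estimate)
βhat = F \ (X'y)

σ = 1.0                       # placeholder: σ is assumed known here
rng = MersenneTwister(1234)   # seeded only to make the sketch reproducible

# One draw of the whole coefficient vector: βhat + σ * R⁻¹ z, z ~ N(0, I)
draw = βhat + σ * (F.U \ randn(rng, length(βhat)))
```

The triangular solve `F.U \ z` costs far less than forming $({\bf X'X})^{-1}$ explicitly, and the same factor `F` serves both the mean and the draw.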