# RePsychLing 'Masson, Rabe, & Kliegl (2017)' with Julia
### Reinhold Kliegl
### 2020-02-13
# Update

This version uses `MixedModels.PCA()` to show details about RE structures. 

# Setup

Packages we (might) use.

In [2]:
#cd(joinpath(homedir(),"Google Drive/ZiF_CG_WS2/MRK17_Exp1/"))
cd("/Users/reinholdkliegl/Google Drive/ZiF_CG_WS2/MRK17_Exp1/")

using CSV, DataFrames, DataFramesMeta, MixedModels, RCall 
using StatsBase, StatsModels, BenchmarkTools

# earlier alternative for LRT
# using Printf: @sprintf
# using Distributions: Chisq, ccdf
# include("lrtest.jl")

┌ Info: Precompiling MixedModels [ff71e718-51f3-5ec2-a782-8ffcbfa3c316]
└ @ Base loading.jl:1273
┌ Info: Precompiling RCall [6f49c342-dc21-5d91-9882-a32aef131414]
└ @ Base loading.jl:1273


# Reading data

We read the data, which were preprocessed with R and saved as an RDS file (see `DataPrep.Rmd` for details).

In [3]:
R"dat_r = readRDS('MRK17_Exp1.rds')";

dat = rcopy(R"dat_r")

dat = @linq dat |>
       transform(F = levels!(:F, ["HF", "LF"]),
                 P = levels!(:P, ["rel", "unr"]),
                 Q = levels!(:Q, ["clr", "deg"]),
                lQ = levels!(:lQ, ["clr", "deg"]),
                lT = levels!(:lT, ["WD", "NW"]))

cellmeans = by(dat, [:F, :P, :Q, :lQ, :lT], 
            meanRT = :rt => mean, sdRT = :rt => std, n = :rt => length,
            semean = :rt => x -> std(x)/sqrt(length(x)))

| Row | F | P | Q | lQ | lT | meanRT | sdRT |
|-----|---|---|---|----|----|--------|------|
|  | Categorical… | Categorical… | Categorical… | Categorical… | Categorical… | Float64 | Float64 |
| 1 | HF | rel | clr | clr | WD | 613.094 | 192.13 |
| 2 | HF | rel | clr | clr | NW | 635.319 | 205.19 |
| 3 | HF | rel | clr | deg | WD | 620.569 | 173.764 |
| 4 | HF | rel | clr | deg | NW | 615.647 | 176.826 |
| 5 | HF | rel | deg | clr | WD | 667.685 | 231.145 |
| 6 | HF | rel | deg | clr | NW | 645.493 | 179.962 |
| 7 | HF | rel | deg | deg | WD | 637.793 | 177.496 |
| 8 | HF | rel | deg | deg | NW | 659.266 | 192.637 |
| 9 | HF | unr | clr | clr | WD | 620.838 | 192.689 |
| 10 | HF | unr | clr | clr | NW | 619.626 | 176.55 |


# Complex LMM

This is *not* the maximal factorial LMM because we do not include interaction 
terms and associated correlation parameters in the RE structure.

## Model fit

In [4]:
const HC = HelmertCoding();
const contrasts = Dict(:F => HC, :P => HC, :Q => HC, :lQ => HC, :lT => HC);

m1form = @formula (-1000/rt) ~ 1+F*P*Q*lQ*lT +
                              (1+F+P+Q+lQ+lT | Subj) +
                              (1+P+Q+lQ+lT | Item);
cmplxLMM = @btime fit(MixedModel, m1form, dat, contrasts=contrasts);

  20.942 s (586348 allocations: 151.67 MiB)


## VCs and CPs

We don't look at fixed effects before model selection.

In [5]:
cmplxLMM.λ[1]

5×5 LinearAlgebra.LowerTriangular{Float64,Array{Float64,2}}:
  0.193296      ⋅             ⋅            ⋅          ⋅ 
 -0.00211918   0.0387396      ⋅            ⋅          ⋅ 
 -0.0156009    0.0156498     0.0371364     ⋅          ⋅ 
 -0.00757349   0.000154762  -0.00255928   0.0185151   ⋅ 
  0.00484102  -0.037064      0.0173945   -0.0117176  0.0

In [6]:
cmplxLMM.λ[2]

6×6 LinearAlgebra.LowerTriangular{Float64,Array{Float64,2}}:
  0.597685      ⋅           ⋅           ⋅          ⋅          ⋅       
 -0.00815265   0.0212523    ⋅           ⋅          ⋅          ⋅       
 -0.0134521    0.0317702   0.017194     ⋅          ⋅          ⋅       
 -0.0392939    0.0307922   0.0689438   0.0444246   ⋅          ⋅       
 -0.00205587   0.00566759  0.00245181  0.0362435  0.0         ⋅       
 -0.0283388   -0.0223824   0.0134866   0.0577824  0.0722463  0.0464293
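
The `λ` blocks printed above are lower-triangular relative covariance factors: multiplying a block by its transpose gives the relative covariance matrix of the random effects for that grouping factor. The following is a minimal sketch of that relation, using only the leading 2×2 corner of the 6×6 block copied from the output above. The names `L` and `Σ` are chosen for this illustration and do not appear elsewhere in the notebook.

```julia
using LinearAlgebra

# Leading 2×2 corner of the 6×6 λ block shown above (values copied from the output)
L = LowerTriangular([ 0.597685    0.0;
                     -0.00815265  0.0212523])

# Relative covariance matrix: the factor times its transpose
Σ = L * L'

println(round.(Σ, digits=4))
```

Rounded to four digits, the entries are 0.3572, -0.0049, and 0.0005, which are the values in the upper-left corner of the 6×6 covariance matrix printed by `MixedModels.PCA(cmplxLMM, corr=false)` in the rePCA section.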

##  rePCA

Options for information about rePCAs

```
Base.show(pca::PCA;
          ndigitsmat=2, ndigitsvec=2, ndigitscum=4,
          covcor=true, loadings=true, variances=false, stddevs=false)
```

In [7]:
cmplxLMM.rePCA

(Item = [0.42234405775568895, 0.6863960843028875, 0.9058471562701655, 1.0, 1.0], Subj = [0.4766698203907858, 0.735833655927118, 0.8811936023191282, 0.9430909892241793, 1.0, 1.0])

In [8]:
cmplx_pca=MixedModels.PCA(cmplxLMM, corr=false);
show(stdout, cmplx_pca.Subj, ndigitsmat=4, ndigitsvec=6, variances=true, stddevs=true)
show(stdout, cmplx_pca.Item, ndigitsmat=4, ndigitsvec=6, variances=true, stddevs=true)


Principal components based on (relative) covariance matrix
  0.3572    ⋅        ⋅       ⋅       ⋅       ⋅    
 -0.0049   0.0005    ⋅       ⋅       ⋅       ⋅    
 -0.008    0.0008   0.0015   ⋅       ⋅       ⋅    
 -0.0235   0.001    0.0027  0.0092   ⋅       ⋅    
 -0.0012   0.0001   0.0002  0.002   0.0014   ⋅    
 -0.0169  -0.0002  -0.0001  0.0039  0.0021  0.0122
Standard deviations:
[0.599928, 0.11604, 0.084699, 0.029258, 0.024441, 0.0]
Variances:
[0.359914, 0.013465, 0.007174, 0.000856, 0.000597, 0.0]
Normalized cumulative variances:
[0.9422, 0.9774, 0.9962, 0.9984, 1.0, 1.0]
Component loadings
 -0.9961  -0.0768  -0.0378   0.0175   0.0065  -0.0061
  0.0137   0.0022  -0.1466   0.4029  -0.5132  -0.7434
  0.0229  -0.056   -0.3451   0.6385  -0.2954   0.6183
  0.0675  -0.4893  -0.7714  -0.1511   0.3349  -0.1612
  0.0041  -0.2182  -0.1022  -0.6049  -0.7328   0.1976
  0.0493  -0.839    0.5025   0.2023   0.013    0.0   
Principal components based on (relative) covariance matrix
  0.0374    ⋅

In [9]:
cmplx_pca=MixedModels.PCA(cmplxLMM, corr=true);
show(stdout, cmplx_pca.Subj)
show(stdout, cmplx_pca.Item)


Principal components based on correlation matrix
  1.0     ⋅      ⋅     ⋅     ⋅     ⋅ 
 -0.36   1.0     ⋅     ⋅     ⋅     ⋅ 
 -0.35   0.89   1.0    ⋅     ⋅     ⋅ 
 -0.41   0.45   0.73  1.0    ⋅     ⋅ 
 -0.06   0.16   0.18  0.58  1.0    ⋅ 
 -0.26  -0.1   -0.02  0.37  0.51  1.0
Normalized cumulative variances:
[0.4767, 0.7358, 0.8812, 0.9431, 1.0, 1.0]
Component loadings
 -0.34  -0.02   0.84  -0.31  -0.27   0.11
  0.45   0.42   0.12  -0.41   0.44   0.5 
  0.51   0.35   0.16  -0.17  -0.27  -0.7 
  0.52  -0.15   0.11   0.41  -0.56   0.45
  0.32  -0.51   0.43   0.3    0.56  -0.21
  0.21  -0.65  -0.24  -0.67  -0.17  -0.0 
Principal components based on correlation matrix
  1.0     ⋅      ⋅      ⋅     ⋅ 
 -0.05   1.0     ⋅      ⋅     ⋅ 
 -0.36   0.38   1.0     ⋅     ⋅ 
 -0.38   0.03   0.03   1.0    ⋅ 
  0.11  -0.87  -0.01  -0.35  1.0
Normalized cumulative variances:
[0.4223, 0.6864, 0.9058, 1.0, 1.0]
Component loadings
 -0.3    0.67  -0.02   0.67  -0.07
  0.6    0.38   0.22  -0.05   0.67
  0.

In [10]:
cmplx_pca=MixedModels.PCA(cmplxLMM, corr=false);
show(stdout, cmplx_pca.Subj)
show(stdout, cmplx_pca.Item, stddevs=true, loadings=false)


Principal components based on (relative) covariance matrix
  0.36    ⋅     ⋅    ⋅     ⋅    ⋅  
 -0.0    0.0    ⋅    ⋅     ⋅    ⋅  
 -0.01   0.0   0.0   ⋅     ⋅    ⋅  
 -0.02   0.0   0.0  0.01   ⋅    ⋅  
 -0.0    0.0   0.0  0.0   0.0   ⋅  
 -0.02  -0.0  -0.0  0.0   0.0  0.01
Normalized cumulative variances:
[0.9422, 0.9774, 0.9962, 0.9984, 1.0, 1.0]
Component loadings
 -1.0   -0.08  -0.04   0.02   0.01  -0.01
  0.01   0.0   -0.15   0.4   -0.51  -0.74
  0.02  -0.06  -0.35   0.64  -0.3    0.62
  0.07  -0.49  -0.77  -0.15   0.33  -0.16
  0.0   -0.22  -0.1   -0.6   -0.73   0.2 
  0.05  -0.84   0.5    0.2    0.01   0.0 
Principal components based on (relative) covariance matrix
  0.04    ⋅     ⋅     ⋅    ⋅ 
 -0.0    0.0    ⋅     ⋅    ⋅ 
 -0.0    0.0   0.0    ⋅    ⋅ 
 -0.0    0.0   0.0   0.0   ⋅ 
  0.0   -0.0  -0.0  -0.0  0.0
Standard deviations:
[0.19, 0.06, 0.04, 0.02, 0.0]
Normalized cumulative variances:
[0.8773, 0.9515, 0.9915, 1.0, 1.0]


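The PCA of the random-effect variance-covariance matrices suggests overparameterization
for both subject-related and item-related components: in each case the normalized 
cumulative variances reach 1.0 before the last principal component.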


# Zero-correlation parameter LMM (factors)

## Model fit

We take out correlation parameters.

In [11]:
m2form = @formula (-1000/rt) ~ 1 + F*P*Q*lQ*lT +
                               zerocorr(1+F+P+Q+lQ+lT | Subj) +
                               zerocorr(1+P+Q+lQ+lT | Item);

zcpLMM = @btime fit(LinearMixedModel, m2form, dat, contrasts=contrasts);

  2.803 s (239024 allocations: 136.98 MiB)


## VCs and CPs

In [12]:
zcpLMM.λ[1]

5×5 LinearAlgebra.LowerTriangular{Float64,Array{Float64,2}}:
 0.193111   ⋅          ⋅          ⋅         ⋅       
 0.0       0.0290307   ⋅          ⋅         ⋅       
 0.0       0.0        0.0414897   ⋅         ⋅       
 0.0       0.0        0.0        0.013579   ⋅       
 0.0       0.0        0.0        0.0       0.0368135

In [13]:
zcpLMM.λ[2]

6×6 LinearAlgebra.LowerTriangular{Float64,Array{Float64,2}}:
 0.597014   ⋅          ⋅          ⋅          ⋅          ⋅      
 0.0       0.0134607   ⋅          ⋅          ⋅          ⋅      
 0.0       0.0        0.0340468   ⋅          ⋅          ⋅      
 0.0       0.0        0.0        0.0948922   ⋅          ⋅      
 0.0       0.0        0.0        0.0        0.0372432   ⋅      
 0.0       0.0        0.0        0.0        0.0        0.111137

##  rePCA

Options for information about rePCAs

```
Base.show(pca::PCA;
          ndigitsmat=2, ndigitsvec=2, ndigitscum=4,
          covcor=true, loadings=true, variances=false, stddevs=false)
```

In [14]:
show(stdout, zcpLMM.rePCA)

zcp_pca=MixedModels.PCA(zcpLMM, corr=true);
show(stdout, zcp_pca.Subj, stddevs=true)
show(stdout, zcp_pca.Item, stddevs=true)

(Item = [0.2, 0.4, 0.6, 0.8, 1.0], Subj = [0.16666666666666666, 0.3333333333333333, 0.5, 0.6666666666666666, 0.8333333333333334, 1.0])
Principal components based on correlation matrix
 1.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
 0.0  1.0   ⋅    ⋅    ⋅    ⋅ 
 0.0  0.0  1.0   ⋅    ⋅    ⋅ 
 0.0  0.0  0.0  1.0   ⋅    ⋅ 
 0.0  0.0  0.0  0.0  1.0   ⋅ 
 0.0  0.0  0.0  0.0  0.0  1.0
Standard deviations:
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
Normalized cumulative variances:
[0.1667, 0.3333, 0.5, 0.6667, 0.8333, 1.0]
Component loadings
 0.0  0.0  0.0  0.0  1.0  0.0
 1.0  0.0  0.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0  0.0  0.0
 0.0  0.0  1.0  0.0  0.0  0.0
 0.0  0.0  0.0  1.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  1.0
Principal components based on correlation matrix
 1.0   ⋅    ⋅    ⋅    ⋅ 
 0.0  1.0   ⋅    ⋅    ⋅ 
 0.0  0.0  1.0   ⋅    ⋅ 
 0.0  0.0  0.0  1.0   ⋅ 
 0.0  0.0  0.0  0.0  1.0
Standard deviations:
[1.0, 1.0, 1.0, 1.0, 1.0]
Normalized cumulative variances:
[0.2, 0.4, 0.6, 0.8, 1.0]
Component loadings
 0.0  0.0 

Looks ok, but the last PCs are very tiny. Might be a good idea to prune the LMM. 

# Zero-correlation parameter LMM (indicators)

An alternative solution is to extract the indicators of contrasts from the design matrix.
Sometimes RE structures are more conveniently specified with indicator variables (i.e., 
at the level of contrasts) than with factors.

In [15]:
mm = Int.(zcpLMM.X)

dat = @linq dat |>
       transform(f = mm[:, 2],
                 p = mm[:, 3],
                 q = mm[:, 4],
                lq = mm[:, 5],
                lt = mm[:, 6]);
dat[1:10, 10:14]

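| Row | f | p | q | lq | lt |
|-----|---|---|---|----|----|
|  | Int64 | Int64 | Int64 | Int64 | Int64 |
| 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | -1 | 1 | -1 | 1 | -1 |
| 3 | -1 | -1 | -1 | 1 | 1 |
| 4 | -1 | -1 | 1 | 1 | 1 |
| 5 | -1 | 1 | 1 | 1 | -1 |
| 6 | -1 | -1 | -1 | 1 | -1 |
| 7 | -1 | -1 | 1 | 1 | 1 |
| 8 | 1 | 1 | 1 | -1 | 1 |
| 9 | 1 | 1 | -1 | 1 | -1 |
| 10 | -1 | -1 | -1 | 1 | 1 |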
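
For a two-level factor, Helmert coding reduces to a single -1/+1 indicator, which is why each extracted column contains only -1 and 1. The sketch below builds an unnormalized Helmert contrast matrix by hand to make this visible. `helmert` is a hypothetical helper written for this illustration; it is not a StatsModels function, and the scaling it uses is an assumption about the usual unnormalized convention.

```julia
# Hypothetical helper: unnormalized Helmert contrast matrix for k levels.
# Column j compares level j+1 with the levels that precede it.
function helmert(k::Integer)
    M = zeros(Int, k, k - 1)
    for j in 1:(k - 1)
        M[1:j, j] .= -1   # the preceding levels
        M[j + 1, j] = j   # the level being compared
    end
    return M
end

two   = helmert(2)   # one column: first level -1, second level +1
three = helmert(3)   # two columns for a three-level factor

println(two)
println(three)
```

Each column sums to zero, so with balanced cells the intercept estimates the grand mean and each coefficient is a deviation from it.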


We take out correlation parameters.

In [16]:
m2form_b = @formula (-1000/rt) ~ 1 + f*p*q*lq*lt +
 (1 | Subj) + (0+f | Subj) + (0+p | Subj) + (0+q | Subj) + (0+lq | Subj) + (0+lt | Subj) +
 (1 | Item) +                (0+p | Item) + (0+q | Item) + (0+lq | Item) + (0+lt | Item);

zcpLMM_b = @btime fit(LinearMixedModel, m2form_b, dat, contrasts=contrasts);

const mods = [cmplxLMM, zcpLMM, zcpLMM_b];

  3.104 s (203978 allocations: 134.23 MiB)


In [17]:
gof_summary = DataFrame(dof=dof.(mods), deviance=deviance.(mods),
              AIC = aic.(mods), AICc = aicc.(mods), BIC = bic.(mods))

| Row | dof | deviance | AIC | AICc | BIC |
|-----|-----|----------|-----|------|-----|
|  | Int64 | Float64 | Float64 | Float64 | Float64 |
| 1 | 69 | 7147.55 | 7285.55 | 7286.14 | 7817.24 |
| 2 | 44 | 7188.49 | 7276.49 | 7276.73 | 7615.54 |
| 3 | 44 | 7188.49 | 7276.49 | 7276.73 | 7615.54 |
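
The `dof` column can be reproduced by counting parameters: 2^5 = 32 fixed effects, one residual variance, and, for each RE term with k columns, either k(k+1)/2 variance and correlation parameters (full structure) or k variances (zero-correlation structure). The sketch below does this count; `nre` is a helper name introduced here for illustration only.

```julia
# Number of variance/correlation parameters for an RE term with k columns
nre(k; correlated=true) = correlated ? k * (k + 1) ÷ 2 : k

nfixed = 2^5   # full factorial of five two-level factors
nresid = 1     # residual variance

# Subj term has 6 columns, Item term has 5 columns
dof_cmplx = nfixed + nre(6) + nre(5) + nresid
dof_zcp   = nfixed + nre(6, correlated=false) + nre(5, correlated=false) + nresid

println((dof_cmplx, dof_zcp, dof_cmplx - dof_zcp))
```

The difference between the two counts is the number of correlation parameters removed by `zerocorr`: 15 for `Subj` plus 10 for `Item`.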


In [18]:
#lrtest(cmplxLMM, zcpLMM) # earlier alternative
MixedModels.likelihoodratiotest(zcpLMM, cmplxLMM)

Model Formulae
─────────────────────────────────────────────
   model-dof  deviance     χ²  χ²-dof  P(>χ²)
─────────────────────────────────────────────
1       44.0   7188.49  -0.00     0.0  <1e-99
2       69.0   7147.55  40.94    25.0  0.0233
─────────────────────────────────────────────

1: :(-1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q & lT + lQ & lT + F & P & Q + F & P & lQ + F & Q & lQ + P & Q & lQ + F & P & lT + F & Q & lT + P & Q & lT + F & lQ & lT + P & lQ & lT + Q & lQ & lT + F & P & Q & lQ + F & P & Q & lT + F & P & lQ & lT + F & Q & lQ & lT + P & Q & lQ & lT + F & P & Q & lQ & lT + MixedModels.ZeroCorr((1 + F + P + Q + lQ + lT | Subj)) + MixedModels.ZeroCorr((1 + P + Q + lQ + lT | Item))
2: :(-1000 / rt) ~ 1 + F + P + Q + lQ + lT + F & P + F & Q + P & Q + F & lQ + P & lQ + Q & lQ + F & lT + P & lT + Q & lT + lQ & lT + F & P & Q + F & P & lQ + F & Q & lQ + P & Q & lQ + F & P & lT + F & Q & lT + P & Q & lT + F & lQ & lT + P & lQ & lT + Q & lQ & lT + F & P & Q & lQ + F & P & Q & lT + F & P & lQ & lT + F & Q & lQ & lT + P & Q & lQ & lT + F & P & Q & lQ & lT + (1 + F + P + Q + lQ + lT | Subj) + (1 + P + Q + lQ + lT | Item)


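Results for the two zero-correlation specifications (`zcpLMM`, `zcpLMM_b`) are identical. 
The LRT favors the complex LMM, but only marginally: 2 * ΔDOF > ΔDeviance, so AIC and BIC 
prefer the zero-correlation LMM. 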
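
As a worked check of the comparison, using only the values printed in the outputs above:

$$
\chi^2 = 7188.49 - 7147.55 = 40.94, \qquad \Delta\mathrm{dof} = 69 - 44 = 25
$$

$$
\mathrm{AIC}_{\mathrm{cmplx}} - \mathrm{AIC}_{\mathrm{zcp}} = 2 \cdot 25 - 40.94 = 9.06 = 7285.55 - 7276.49
$$

The LRT is significant at the 5% level (p = 0.0233), yet the gain in deviance is smaller than the AIC penalty of 2 per additional parameter, so the two criteria point in different directions here.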

# A replication of MRK17 LMM

## Indicators

Replication of the final LMM in Masson and Kliegl (2013, Table 1) as well as
reproduction of the final lme4-based LMM in Masson, Rabe, and Kliegl (2017, Figure 2).

In [19]:
m3form = @formula (-1000/rt) ~ 1 + f*p*q*lq*lt +
        (1+q | Subj) + (0+lt | Subj) + (1 | Item) + (0 + p | Item) ;
mrk17_LMM = @btime fit(LinearMixedModel, m3form, dat, contrasts=contrasts);

VarCorr(mrk17_LMM)

  322.761 ms (236750 allocations: 103.75 MiB)


Variance components:
            Column      Variance     Std.Dev.    Corr.
Item     (Intercept)  0.00320580475 0.056619826
         p            0.00006691847 0.008180371   .  
Subj     (Intercept)  0.03061902267 0.174982921
         q            0.00076247734 0.027612992 -0.42
         lt           0.00106213066 0.032590346   .     .  
Residual              0.08640795566 0.293952302


Is the correlation parameter significant?

In [20]:
# remove single CP for nested LMMs
m4form = @formula (-1000/rt) ~ 1 + f*p*q*lq*lt +
        (1 | Subj) + (0+q | Subj) + (0+lt | Subj) + (1 | Item) + (0+p | Item);
rdcdLMM = @btime fit(LinearMixedModel, m4form, dat, contrasts=contrasts);

#compare nested model sequence
# lrtest(rdcdLMM, mrk17_LMM)  # earlier alternative
MixedModels.likelihoodratiotest(rdcdLMM, mrk17_LMM)

  308.328 ms (170632 allocations: 102.14 MiB)


Model Formulae
─────────────────────────────────────────────
   model-dof  deviance     χ²  χ²-dof  P(>χ²)
─────────────────────────────────────────────
1       38.0   7195.87  -0.00     0.0  <1e-99
2       39.0   7186.82   9.06     1.0  0.0026
─────────────────────────────────────────────

1: :(-1000 / rt) ~ 1 + f + p + q + lq + lt + f & p + f & q + p & q + f & lq + p & lq + q & lq + f & lt + p & lt + q & lt + lq & lt + f & p & q + f & p & lq + f & q & lq + p & q & lq + f & p & lt + f & q & lt + p & q & lt + f & lq & lt + p & lq & lt + q & lq & lt + f & p & q & lq + f & p & q & lt + f & p & lq & lt + f & q & lq & lt + p & q & lq & lt + f & p & q & lq & lt + (1 | Subj) + (0 + q | Subj) + (0 + lt | Subj) + (1 | Item) + (0 + p | Item)
2: :(-1000 / rt) ~ 1 + f + p + q + lq + lt + f & p + f & q + p & q + f & lq + p & lq + q & lq + f & lt + p & lt + q & lt + lq & lt + f & p & q + f & p & lq + f & q & lq + p & q & lq + f & p & lt + f & q & lt + p & q & lt + f & lq & lt + p & lq & lt + q & lq & lt + f & p & q & lq + f & p & q & lt + f & p & lq & lt + f & q & lq & lt + p & q & lq & lt + f & p & q & lq & lt + (1 + q | Subj) + (0 + lt | Subj) + (1 | Item) + (0 + p | Item)


Yes, it is! Replicates a previous result. 

Note that `zcpLMM` and `mrk17_LMM` are not nested; we cannot compare them with an LRT.

## rePCA

Options for information about rePCAs
 
```
Base.show(pca::PCA;
          ndigitsmat=2, ndigitsvec=2, ndigitscum=4,
          covcor=true, loadings=true, variances=false, stddevs=false)
```

In [21]:
mrk17_LMM.rePCA

(Item = [0.5, 1.0], Subj = [0.474184709788416, 0.8075180431217494, 1.0])

In [22]:
mrk17_pca=MixedModels.PCA(mrk17_LMM, corr=true);
show(stdout, mrk17_pca.Subj, stddevs=true)
show(stdout, mrk17_pca.Item, stddevs=true)

VarCorr(mrk17_LMM)


Principal components based on correlation matrix
  1.0    ⋅    ⋅ 
 -0.42  1.0   ⋅ 
  0.0   0.0  1.0
Standard deviations:
[1.19, 1.0, 0.76]
Normalized cumulative variances:
[0.4742, 0.8075, 1.0]
Component loadings
 -0.71  0.0  0.71
  0.71  0.0  0.71
  0.0   1.0  0.0 
Principal components based on correlation matrix
 1.0   ⋅ 
 0.0  1.0
Standard deviations:
[1.0, 1.0]
Normalized cumulative variances:
[0.5, 1.0]
Component loadings
 1.0  0.0
 0.0  1.0

Variance components:
            Column      Variance     Std.Dev.    Corr.
Item     (Intercept)  0.00320580475 0.056619826
         p            0.00006691847 0.008180371   .  
Subj     (Intercept)  0.03061902267 0.174982921
         q            0.00076247734 0.027612992 -0.42
         lt           0.00106213066 0.032590346   .     .  
Residual              0.08640795566 0.293952302
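
The standard deviations and normalized cumulative variances in the PCA output can be reproduced from the printed correlation matrix alone. The sketch below is one way to do that with an eigendecomposition; it is an illustration and not necessarily how `MixedModels.PCA` computes its result internally. The matrix `R` is typed in from the subject-level output above, so it carries only two digits of precision.

```julia
using LinearAlgebra

# Subject-level correlation matrix as printed above (rounded to two digits)
R = [ 1.0   -0.42  0.0;
     -0.42   1.0   0.0;
      0.0    0.0   1.0]

# Eigenvalues of the correlation matrix are the variances of the principal components
vars = sort(eigvals(Symmetric(R)), rev=true)

sds    = sqrt.(vars)                 # "Standard deviations"
cumvar = cumsum(vars) ./ sum(vars)   # "Normalized cumulative variances"

println(round.(sds, digits=2))
println(round.(cumvar, digits=4))
```

The standard deviations agree with the printed `[1.19, 1.0, 0.76]`. The cumulative variances differ from the printed values in the third decimal because the correlation was rounded to -0.42 before being entered here.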


## Factors

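This is an excursion with a cautionary note. 
The replication LMM cannot be specified with factors in the RE structure: 
in the output below, `zerocorr(0+lT | Subj)` yields two variance components 
(`lT: WD` and `lT: NW`) instead of the single component for the indicator `lt`.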

In [23]:
m3form_b = @formula (-1000/rt) ~ 1 + F*P*Q*lQ*lT +
        (1+Q | Subj) + zerocorr(0+lT | Subj) + zerocorr(1 + P | Item) ;
mrk17_LMM_b = @btime fit(LinearMixedModel, m3form_b, dat, contrasts=contrasts);

VarCorr(mrk17_LMM_b)

  637.107 ms (353882 allocations: 121.01 MiB)


Variance components:
            Column      Variance     Std.Dev.    Corr.
Item     (Intercept)  0.00320428639 0.056606417
         P: unr       0.00006696904 0.008183462   .  
Subj     (Intercept)  0.02885810591 0.169876737
         Q: deg       0.00076206268 0.027605483 -0.39
         lT: WD       0.00398044057 0.063090733   .     .  
         lT: NW       0.00026436157 0.016259200   .     .     .  
Residual              0.08640886782 0.293953853


In [24]:
VarCorr(mrk17_LMM)

Variance components:
            Column      Variance     Std.Dev.    Corr.
Item     (Intercept)  0.00320580475 0.056619826
         p            0.00006691847 0.008180371   .  
Subj     (Intercept)  0.03061902267 0.174982921
         q            0.00076247734 0.027612992 -0.42
         lt           0.00106213066 0.032590346   .     .  
Residual              0.08640795566 0.293952302


This will be fixed (see https://github.com/JuliaStats/MixedModels.jl/issues/268)

# Model comparisons

In [25]:
const mods = [cmplxLMM, zcpLMM, mrk17_LMM, rdcdLMM];
gof_summary = DataFrame(dof=dof.(mods), deviance=deviance.(mods),
              AIC = aic.(mods), AICc = aicc.(mods), BIC = bic.(mods))



| Row | dof | deviance | AIC | AICc | BIC |
|-----|-----|----------|-----|------|-----|
|  | Int64 | Float64 | Float64 | Float64 | Float64 |
| 1 | 69 | 7147.55 | 7285.55 | 7286.14 | 7817.24 |
| 2 | 44 | 7188.49 | 7276.49 | 7276.73 | 7615.54 |
| 3 | 39 | 7186.82 | 7264.82 | 7265.01 | 7565.33 |
| 4 | 38 | 7195.87 | 7271.87 | 7272.06 | 7564.69 |


Here `dof` (degrees of freedom) is the total number of parameters estimated 
in the model, and `deviance` is simply negative twice the log-likelihood at 
convergence, without a correction for a saturated model. The information 
criteria are on a scale of "smaller is better". AIC and AICc select `mrk17_LMM` 
as "best"; BIC marginally prefers `rdcdLMM` (7564.69 vs. 7565.33).

The correlation parameter was replicated (i.e., -.42 in MRK17)
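
The information criteria in the table follow directly from `dof`, `deviance`, and the number of observations. The sketch below recomputes them for `cmplxLMM` with the standard formulas; the variable names are illustrative, and `n` is the number of observations reported in the model summaries of this notebook.

```julia
# Values for cmplxLMM, copied from the outputs in this notebook
k   = 69        # dof: number of estimated parameters
dev = 7147.55   # deviance (-2 log-likelihood)
n   = 16409     # number of observations

aic_val  = dev + 2k
aicc_val = aic_val + 2k * (k + 1) / (n - k - 1)   # small-sample correction
bic_val  = dev + k * log(n)

println(round.((aic_val, aicc_val, bic_val), digits=2))
```

Because log(16409) is about 9.7, BIC charges nearly five times as much per parameter as AIC, which is why it penalizes the 69-parameter model so heavily.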

# Illustration of crossing and nesting of factors

There is an implementation of Wilkinson & Rogers (1973) formula syntax, allowing the specification of factors not only as crossed, but also as nested in the levels of another factor or combination of factors. We illustrate this functionality with a subset of the MRK17 data. (We use the RE structure of an LMM with only varying intercepts, oviLMM, and rt as the dependent variable.)

## Crossing factors

The default analysis focuses on crossed factors yielding main effects and interactions.

In [26]:
m5form = @formula rt ~ 1 + F*P + (1 | Subj) + (1 | Item);
crossedLMM = @btime fit(LinearMixedModel, m5form, dat, contrasts=contrasts)

  16.152 ms (8609 allocations: 7.94 MiB)


Linear mixed model fit by maximum likelihood
 rt ~ 1 + F + P + F & P + (1 | Subj) + (1 | Item)
     logLik        -2 logLik          AIC             BIC       
 -1.07702642×10⁵  2.15405285×10⁵  2.15419285×10⁵  2.15473224×10⁵

Variance components:
            Column    Variance   Std.Dev.  
Item     (Intercept)    600.6089  24.507323
Subj     (Intercept)   6858.5417  82.816313
Residual              28526.1483 168.896857
 Number of obs: 16409; levels of grouping factors: 240, 73

  Fixed-effects parameters:
──────────────────────────────────────────────────────
                 Estimate  Std.Error  z value  P(>|z|)
──────────────────────────────────────────────────────
(Intercept)     647.616      9.90951    65.35   <1e-99
F: LF             7.76838    2.06014     3.77   0.0002
P: unr            7.13002    1.31934     5.40   <1e-7 
F: LF & P: unr    3.83349    1.31924     2.91   0.0037
──────────────────────────────────────────────────────

Main effects of frequency (F) and priming (P) and their interaction are significant.

In [27]:
cellmeans = by(dat, [:F, :P], 
            meanRT = :rt => mean, sdRT = :rt => std, n = :rt => length)

| Row | F | P | meanRT | sdRT | n |
|-----|---|---|--------|------|---|
|  | Categorical… | Categorical… | Float64 | Float64 | Int64 |
| 1 | HF | rel | 636.888 | 192.78 | 4154 |
| 2 | HF | unr | 642.603 | 184.051 | 4124 |
| 3 | LF | rel | 644.332 | 186.599 | 4107 |
| 4 | LF | unr | 665.376 | 195.107 | 4024 |


## Nesting factors

The interaction tests whether lines visualizing the interaction are parallel, 
but depending on the theoretical context one might be interested in whether the
priming effect is significant for high-frequency targets and for low-frequency targets. 
In other words, the focus is on whether the priming effect is significant within 
each level of the frequency factor.

In [28]:
m6form = @formula rt ~ 1 + F/P + (1 | Subj) + (1 | Item);
nestedLMM = @btime fit(LinearMixedModel, m6form, dat, contrasts=contrasts)

  15.173 ms (8380 allocations: 7.54 MiB)


Linear mixed model fit by maximum likelihood
 rt ~ 1 + F + F & P + (1 | Subj) + (1 | Item)
     logLik        -2 logLik          AIC             BIC       
 -1.07702642×10⁵  2.15405285×10⁵  2.15419285×10⁵  2.15473224×10⁵

Variance components:
            Column    Variance   Std.Dev.  
Item     (Intercept)    600.6089  24.507323
Subj     (Intercept)   6858.5412  82.816310
Residual              28526.1483 168.896857
 Number of obs: 16409; levels of grouping factors: 240, 73

  Fixed-effects parameters:
──────────────────────────────────────────────────────
                 Estimate  Std.Error  z value  P(>|z|)
──────────────────────────────────────────────────────
(Intercept)     647.616      9.90951    65.35   <1e-99
F: LF             7.76838    2.06014     3.77   0.0002
F: HF & P: unr    3.29654    1.85707     1.78   0.0759
F: LF & P: unr   10.9635     1.87441     5.85   <1e-8 
──────────────────────────────────────────────────────

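The results show that the priming effect is not significant for high-frequency
targets, but is significant for low-frequency targets. With -1/+1 contrast coding, 
each estimate is the difference of a cell mean from the mean of the two cells 
being compared, that is, half the difference between them (i.e., 2 x estimate = effect).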
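
The nested estimates are a reparameterization of the crossed ones, which is why both models have the same log-likelihood. With -1/+1 coding, the priming estimate within a frequency level is the priming main effect minus or plus the interaction. A short sketch with the estimates copied from the crossed LMM output (variable names are illustrative):

```julia
# Fixed-effect estimates copied from the crossed LMM (rt ~ 1 + F*P)
b_P  = 7.13002   # P: unr (main effect of priming)
b_FP = 3.83349   # F: LF & P: unr (interaction)

# Priming estimates nested within frequency
p_in_HF = b_P - b_FP   # F = HF is coded -1
p_in_LF = b_P + b_FP   # F = LF is coded +1

# Priming effects in ms (unrelated minus related) are twice the estimates
effect_HF = 2 * p_in_HF
effect_LF = 2 * p_in_LF

println((p_in_HF, p_in_LF, effect_HF, effect_LF))
```

Up to rounding, `p_in_HF` and `p_in_LF` match the nested LMM estimates of 3.29654 and 10.9635, corresponding to priming effects of roughly 6.6 ms and 21.9 ms.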