# RePsychLing Gann and Barr (2014)

## Data from Gann and Barr (2014)

These data are available as `gb12` in the [RePsychLing package](https://github.com/dmbates/RePsychLing) for [R](http://www.r-project.org).

In [1]:
using DataFrames,RCall,MixedModels
gb12 = DataFrame("RePsychLing::gb12")

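| Row | session | phase | list | item | sottrunc2 | T | P | F | TP | TF | PF | TPF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 56 | 2 | 6 | 5 | 2461.0 | 1.0 | -1.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 |
| 2 | 56 | 2 | 6 | 4 | 1497.0 | 1.0 | -1.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 |
| 3 | 56 | 2 | 6 | 10 | 1648.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 | 1.0 | -1.0 |
| 4 | 56 | 2 | 6 | 9 | 3320.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 | 1.0 | -1.0 |
| 5 | 56 | 4 | 6 | 14 | 1671.0 | 1.0 | -1.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 |
| 6 | 56 | 4 | 6 | 15 | 1683.0 | 1.0 | -1.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 |
| 7 | 56 | 4 | 6 | 20 | 1451.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 | 1.0 | -1.0 |
| 8 | 56 | 4 | 6 | 19 | 1097.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 | 1.0 | -1.0 |
| 9 | 56 | 6 | 6 | 24 | 1764.0 | 1.0 | -1.0 | 1.0 | -1.0 | 1.0 | -1.0 | -1.0 |
| 10 | 56 | 6 | 6 | 25 | 1033.0 | 1.0 | -1.0 | 1.0 | -1.0 | 1.0 | -1.0 | -1.0 |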


In [2]:
size(gb12)

(512,12)
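
In the rows displayed above, the interaction columns `TP`, `TF`, `PF` and `TPF` equal the elementwise products of the ±1-coded columns `T`, `P` and `F`. The following minimal sketch reproduces that pattern for three of the displayed rows (rows 1, 3 and 9). It assumes the same coding holds for the rest of the data, which the excerpt alone cannot confirm.

```julia
# ±1 codes copied from rows 1, 3 and 9 of the displayed data.
T = [1.0, -1.0, 1.0]
P = [-1.0, -1.0, -1.0]
F = [-1.0, -1.0, 1.0]

# Interaction columns formed as elementwise products of the main-effect codes.
TP  = T .* P
TF  = T .* F
PF  = P .* F
TPF = T .* P .* F

println((TP = TP, TF = TF, PF = PF, TPF = TPF))
```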

### Maximal linear mixed model (_maxLMM_)

We assume `P`, the partner, is a between-session factor and `F`, feedback, is a between-item factor. Accordingly, `P` is left out of the by-session random-effects terms and `F` is left out of the by-item random-effects terms. The model fit in the paper is:

In [3]:
m0 = fit(lmm(sottrunc2 ~ 1+T+P+F+TP+TF+PF+TPF +
     (1+T+F+TF|session)+(1+T+P+TP|item), gb12))

Linear mixed model fit by maximum likelihood
Formula: sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + ((1 + T + F + TF) | session) + ((1 + T + P + TP) | item)

 logLik: -3963.337715, deviance: 7926.675431

 Variance components:
                Variance    Std.Dev.  Corr.
 session      103476.674271  321.677905
              26882.960829  163.960242   0.99
              15341.632436  123.861344   0.13  0.13
              2776.968858   52.696953   0.12  0.12  0.12
 item         19908.343617  141.096930
              10244.429104  101.214767   1.00
              2543.725722   50.435362   0.28  0.28
              1972.407409   44.411794   0.32  0.32  0.32
 Residual     241932.090259  491.865927
 Number of obs: 512; levels of grouping factors: 32, 16

  Fixed-effects parameters:
             Estimate Std.Error   z value
(Intercept)   1787.92   70.3594   25.4112
T             368.705   44.1915   8.34334
P              63.986   62.1704    1.0292
F            -98.3445   46.8639  -2.09851
TP  

In [4]:
m0.λ

2-element Array{Any,1}:
 PDLCholF(Cholesky{Float64} with factor:
4x4 Triangular{Float64,Array{Float64,2},:L,false}:
 0.653995    0.0        0.0         0.0
 0.32893     0.0540647  0.0         0.0
 0.0319725  -0.249781   1.35534e-6  0.0
 0.0129557  -0.106351   7.86406e-7  0.0)
 PDLCholF(Cholesky{Float64} with factor:
4x4 Triangular{Float64,Array{Float64,2},:L,false}:
 0.286861   0.0        0.0          0.0
 0.205057   0.0172017  0.0          0.0
 0.0288284  0.0984023  0.000361463  0.0
 0.0292098  0.0854366  0.000333003  0.0)
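
Each element of `m0.λ` is a lower-triangular relative covariance factor: multiplying it by its transpose and scaling by the residual variance gives the covariance matrix of the corresponding random effects. As a rough check, the sketch below rebuilds the `session` standard deviations from the printed factor and the residual standard deviation in the model summary. The values are copied from the rounded output, so agreement is only to a few decimal places.

```julia
using LinearAlgebra

# Session factor copied from the displayed m0.λ[1] (rounded values).
Λ = [0.653995   0.0        0.0         0.0
     0.32893    0.0540647  0.0         0.0
     0.0319725 -0.249781   1.35534e-6  0.0
     0.0129557 -0.106351   7.86406e-7  0.0]

σ = 491.865927          # residual standard deviation from the m0 summary

Σ = σ^2 * (Λ * Λ')      # implied covariance matrix of the session random effects
sds = sqrt.(diag(Σ))    # comparable to the Std.Dev. column for session

println(round.(sds, digits = 2))
```

The four values can be compared with the `Std.Dev.` entries reported for `session` (321.68, 163.96, 123.86, 52.70).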

In [5]:
deviance(m0)

7926.675430779498

In [6]:
MixedModels.objective!(m0,
    Float64[abs(x) < 5.e-6 ? 0. : x for x in MixedModels.θ(m0)])

7926.675430779473

In [7]:
m0

Linear mixed model fit by maximum likelihood
Formula: sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + ((1 + T + F + TF) | session) + ((1 + T + P + TP) | item)

 logLik: -3963.337715, deviance: 7926.675431

 Variance components:
                Variance    Std.Dev.  Corr.
 session      103476.674271  321.677905
              26882.960829  163.960242   0.99
              15341.632436  123.861344   0.13  0.13
              2776.968858   52.696953   0.12  0.12  0.12
 item         19908.343617  141.096930
              10244.429104  101.214767   1.00
              2543.725722   50.435362   0.28  0.28
              1972.407409   44.411794   0.32  0.32  0.32
 Residual     241932.090259  491.865927
 Number of obs: 512; levels of grouping factors: 32, 16

  Fixed-effects parameters:
             Estimate Std.Error   z value
(Intercept)   1787.92   70.3594   25.4112
T             368.705   44.1915   8.34334
P              63.986   62.1704    1.0292
F            -98.3445   46.8639  -2.09851
TP  

In [8]:
open("/tmp/gb12m0th.csv","w") do io
    writecsv(io,MixedModels.θ(m0))
end

The model converges without problems, but two correlation parameters, one for `session` (0.99) and one for `item` (1.00), are estimated as very close to 1.

### Singular value analysis for _maxLMM_

In [9]:
map(svdvals,m0.λ)

2-element Array{Any,1}:
 [0.732972,0.276533,0.0,0.0]       
 [0.356004,0.128691,5.49455e-5,0.0]

The `svdvals` results indicate two dimensions with no variability in the random effects for session and another two dimensions with essentially no variability in the random effects for item.
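
The same check can be reproduced from the printed factor alone. The sketch below computes the singular values of the `item` factor as displayed for `m0.λ` and counts how many fall below a threshold. The threshold `tol` is an illustrative assumption, not a default of any package.

```julia
using LinearAlgebra

# Item factor copied from the displayed m0.λ[2] (rounded values).
Λ_item = [0.286861   0.0        0.0          0.0
          0.205057   0.0172017  0.0          0.0
          0.0288284  0.0984023  0.000361463  0.0
          0.0292098  0.0854366  0.000333003  0.0]

sv = svdvals(Λ_item)     # singular values in decreasing order

tol = 1e-3               # illustrative cutoff for "essentially zero" (assumption)
n_degenerate = count(s -> s < tol, sv)

println(sv)
println("dimensions with essentially no variability: ", n_degenerate)
```

A factor with singular values at or near zero corresponds to a covariance matrix of reduced rank, which is what makes the fitted model degenerate in those directions.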

### Zero-correlation-parameter linear mixed model (_zcpLMM_)

In [10]:
m1 = fit(lmm(sottrunc2 ~ 1+T+P+F+TP+TF+PF+TPF + 
(1|session)+(0+T|session)+(0+F|session)+(0+TF|session) +
(1|item)+(0+T|item)+(0+P|item)+(0+TP|item), gb12))

Linear mixed model fit by maximum likelihood
Formula: sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + (1 | session) + ((0 + T) | session) + ((0 + F) | session) + ((0 + TF) | session) + (1 | item) + ((0 + T) | item) + ((0 + P) | item) + ((0 + TP) | item)

 logLik: -3990.068127, deviance: 7980.136254

 Variance components:
                Variance    Std.Dev.  Corr.
 session      99532.706670  315.488045
              20289.346355  142.440677   0.00
              11565.892981  107.544842   0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00
 item         16881.335869  129.928195
                0.000000    0.000000   0.00
                0.000000    0.000000   0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00
 Residual     272586.466142  522.098138
 Number of obs: 512; levels of grouping factors: 32, 16

  Fixed-effects parameters:
             Estimate Std.Error   z value
(Intercept)   1787.92   68.5411   26.0853
T             368.705   34.1531   10.7956

This is a case where `lmer` converges to a better fit, with a lower deviance, at the following parameter values:

In [11]:
MixedModels.objective!(m1,
   [0.615368637774424, 0.282035365967759, 0.215250102492395, 0, 
    0.254447462606184, 0.154461621829023, 0, 0])

7977.369678454903

In [12]:
m1.fit = false;
fit(m1)

Linear mixed model fit by maximum likelihood
Formula: sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + (1 | session) + ((0 + T) | session) + ((0 + F) | session) + ((0 + TF) | session) + (1 | item) + ((0 + T) | item) + ((0 + P) | item) + ((0 + TP) | item)

 logLik: -3988.684839, deviance: 7977.369678

 Variance components:
                Variance    Std.Dev.  Corr.
 session      100314.215619  316.724195
              21071.665701  145.160827   0.00
              12273.801392  110.787190   0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00
 item         17150.936483  130.961584
              6320.241355   79.499946   0.00
                0.000000    0.000000   0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00
 Residual     264906.062881  514.690259
 Number of obs: 512; levels of grouping factors: 32, 16

  Fixed-effects parameters:
             Estimate Std.Error   z value
(Intercept)   1787.92   68.7324   26.0127
T             368.705   39.6346   9.302

In [13]:
MixedModels.lrt(m1,m0)

| Model | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 17 | 7977.369678454892 | | |
| 2 | 29 | 7926.675430779473 | 50.69424767541932 | 1.0546049552543038e-06 |
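
The test statistic in this comparison is the difference in deviance, referred to a chi-squared distribution whose degrees of freedom are the difference in parameter counts (29 - 17 = 12). The sketch below recomputes the p-value using the closed-form upper tail that exists when the degrees of freedom are even. The helper `chisq_upper_tail_even` is a hypothetical name introduced here for illustration and is not part of MixedModels.

```julia
# Upper-tail probability of a chi-squared variate with an even number of
# degrees of freedom, using the closed form
#   P(X > x) = exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
function chisq_upper_tail_even(x::Real, df::Integer)
    (iseven(df) && df > 0) || throw(ArgumentError("df must be a positive even integer"))
    h = x / 2
    term = 1.0               # (x/2)^i / i! for i = 0
    total = 1.0
    for i in 1:(df ÷ 2 - 1)
        term *= h / i        # update to (x/2)^i / i!
        total += term
    end
    return exp(-h) * total
end

# Deviances and parameter counts copied from the comparison of m1 and m0.
dev_zcp, dev_max = 7977.369678454892, 7926.675430779473
df_zcp, df_max = 17, 29

chisq = dev_zcp - dev_max
p = chisq_upper_tail_even(chisq, df_max - df_zcp)

println((chisq = chisq, df = df_max - df_zcp, p = p))
```

For comparisons with an odd difference in degrees of freedom this closed form does not apply, and a general chi-squared distribution function would be needed instead.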


The _zcpLMM_ fits significantly worse than the _maxLMM_, but it reveals several variance components estimated as zero: `TF` for `session`, and `P` and `TP` for `item`.

### Iterative reduction of model complexity

Let's refit the model without the variance components that were estimated as zero.

In [14]:
m2 = lmm(sottrunc2 ~ 1+T+P+F+TP+TF+PF+TPF + 
(1|session)+(0+T|session)+(0+F|session) + (1|item) + (0+T|item), gb12);
MixedModels.objective!(m2,MixedModels.θ(m1)[[1,2,3,5,6]]);
fit(m2)

Linear mixed model fit by maximum likelihood
Formula: sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + (1 | session) + ((0 + T) | session) + ((0 + F) | session) + (1 | item) + ((0 + T) | item)

 logLik: -3988.684839, deviance: 7977.369678

 Variance components:
                Variance    Std.Dev.  Corr.
 session      100314.182538  316.724143
              21071.667686  145.160834   0.00
              12273.800284  110.787185   0.00  0.00
 item         17150.934601  130.961577
              6320.241489   79.499946   0.00
 Residual     264906.068618  514.690265
 Number of obs: 512; levels of grouping factors: 32, 16

  Fixed-effects parameters:
             Estimate Std.Error   z value
(Intercept)   1787.92   68.7324   26.0127
T             368.705   39.6346   9.30261
P              63.986   60.4335   1.05878
F            -98.3445   44.4172  -2.21411
TP            131.556   34.2912   3.83645
TF            12.1399   30.2061  0.401902
PF           -29.3292   30.0158 -0.977122
TPF        

In [15]:
MixedModels.lrt(m2,m0)

| Model | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 14 | 7977.369678454891 | | |
| 2 | 29 | 7926.675430779473 | 50.69424767541841 | 9.268327589153276e-06 |
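
The `Df` values in these comparisons can be recovered by counting parameters: 8 fixed effects, k(k+1)/2 covariance parameters for each k-dimensional random-effects term, and one residual variance. The sketch below does this count for `m0`, `m1` and `m2`. The helper names `ncovpar` and `nparams` are hypothetical and introduced only for illustration.

```julia
# Number of free parameters in a k x k unstructured covariance matrix.
ncovpar(k::Integer) = k * (k + 1) ÷ 2

# Total parameter count: fixed effects, plus covariance parameters for each
# random-effects term, plus one residual variance.
nparams(nfixed, termsizes) = nfixed + sum(ncovpar, termsizes) + 1

nfixed = 8                              # intercept plus T, P, F, TP, TF, PF, TPF

df_m0 = nparams(nfixed, [4, 4])         # two 4-dimensional terms
df_m1 = nparams(nfixed, fill(1, 8))     # eight scalar terms
df_m2 = nparams(nfixed, fill(1, 5))     # five scalar terms

println((m0 = df_m0, m1 = df_m1, m2 = df_m2))   # (m0 = 29, m1 = 17, m2 = 14)
```

Because the components dropped from `m1` were estimated as zero, `m2` reaches the same deviance, so the `Chisq` value is unchanged and only the degrees of freedom (15 instead of 12) alter the p-value.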


In [16]:
m3 = fit(lmm(sottrunc2 ~ 1+T+P+F+TP+TF+PF+TPF + 
            (1+T+F|session) + (1+T|item), gb12))

Linear mixed model fit by maximum likelihood
Formula: sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + ((1 + T + F) | session) + ((1 + T) | item)

 logLik: -3966.676588, deviance: 7933.353176

 Variance components:
                Variance    Std.Dev.  Corr.
 session      102592.553658  320.300724
              26095.792012  161.541920   1.00
              13178.592785  114.798052   0.10  0.10
 item         19381.724597  139.218262
              9873.700903   99.366498   1.00
 Residual     251905.053446  501.901438
 Number of obs: 512; levels of grouping factors: 32, 16

  Fixed-effects parameters:
             Estimate Std.Error   z value
(Intercept)   1787.92   70.0669   25.5173
T             368.705   43.8703   8.40444
P              63.986   60.8113    1.0522
F            -98.3445   45.9912  -2.13833
TP            131.556   36.1593   3.63824
TF            12.1399   33.3033  0.364525
PF           -29.3292   30.0638 -0.975563
TPF          -29.8917   22.1811  -1.34762


In [17]:
MixedModels.lrt(m3,m0)

| Model | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 18 | 7933.35317580533 | | |
| 2 | 29 | 7926.675430779473 | 6.677745025856893 | 0.824547816944365 |


In [18]:
m4 = fit(lmm(sottrunc2 ~ 1+T+P+F+TP+TF+PF+TPF + 
            (1+T+F|session) + (1|item), gb12))

Linear mixed model fit by maximum likelihood
Formula: sottrunc2 ~ 1 + T + P + F + TP + TF + PF + TPF + ((1 + T + F) | session) + (1 | item)

 logLik: -3974.828980, deviance: 7949.657960

 Variance components:
                Variance    Std.Dev.  Corr.
 session      101064.766390  317.906852
              25245.483512  158.888274   1.00
              12029.517288  109.679156   0.10  0.10
 item         16690.092252  129.190140
 Residual     265113.077424  514.891326
 Number of obs: 512; levels of grouping factors: 32, 16

  Fixed-effects parameters:
             Estimate Std.Error   z value
(Intercept)   1787.92   68.6965   26.0263
T             368.705   36.1486   10.1997
P              63.986   60.6306   1.05534
F            -98.3445   44.0097  -2.23461
TP            131.556   36.1486   3.63932
TF            12.1399   22.7552    0.5335
PF           -29.3292   29.8952 -0.981066
TPF          -29.8917   22.7552  -1.31362


In [21]:
MixedModels.lrt(m4,m3)

| Model | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 16 | 7949.657960235922 | | |
| 2 | 18 | 7933.35317580533 | 16.304784430592917 | 0.0002880454679979 |


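We seem to be iterating toward model `m3`: it does not fit significantly worse than the _maxLMM_ (p = 0.82), and it fits significantly better than `m4` (p < 0.001). However, `m3` is itself singular.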

In [22]:
m3.λ

2-element Array{Any,1}:
 PDLCholF(Cholesky{Float64} with factor:
3x3 Triangular{Float64,Array{Float64,2},:L,false}:
 0.638175   0.0       0.0
 0.32186    0.0       0.0
 0.0230519  0.227562  0.0)
 PDLCholF(Cholesky{Float64} with factor:
2x2 Triangular{Float64,Array{Float64,2},:L,false}:
 0.277382  0.0
 0.19798   0.0)                                                

In [23]:
map(svdvals,m3.λ)

2-element Array{Any,1}:
 [0.715159,0.22743,0.0]
 [0.340788,0.0]        
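
The zero singular values correspond to the correlations of 1.00 reported for `m3`. As a rough illustration, the sketch below derives the implied correlation matrix for the `session` random effects from the printed factor. The residual scale cancels in a correlation, so it is omitted here.

```julia
using LinearAlgebra

# Session factor copied from the displayed m3.λ[1] (rounded values).
Λ = [0.638175   0.0       0.0
     0.32186    0.0       0.0
     0.0230519  0.227562  0.0]

V = Λ * Λ'               # relative covariance matrix (residual variance omitted)
d = sqrt.(diag(V))       # relative standard deviations
R = V ./ (d * d')        # implied correlation matrix

println(round.(R, digits = 2))
println("rank of the factor: ", rank(Λ))
```

The second row of the factor has only a first-column entry, so the random effect for `T` is an exact multiple of the random intercept, which is why their correlation is 1 and the factor has rank 2 rather than 3.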