# RePsychLing Kliegl et al. (2011)

This is a set of follow-up analyses to Kliegl et al. (2011).

Reinhold Kliegl, Ping Wei, Michael Dambacher, Ming Yan, & Xiaolin Zhou (2011). Experimental Effects and Individual Differences in Linear Mixed Models: Estimating the Relation between Spatial, Object, and Attraction Effects in Visual Attention. Frontiers in Psychology, 1, 1-12.

We use the final data set from the paper, that is, the data after filtering a few outlier responses, defining `sdif` contrasts for the factor `tar`, and extracting the corresponding vector-valued contrasts `spt`, `c2`, and `c3` from the model matrix. The data frame also includes transformations of `rt`: `lrt = log(rt)`, `srt = sqrt(rt)`, `rrt = 1000/rt` (note the change in effect direction), and `prt = rt^0.4242424` (the exponent suggested by a Box-Cox analysis), as well as `subj = factor(id)`.
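
As a small illustration of these transformations, the sketch below recomputes the log and power transforms for the first response time displayed below (`rt = 506.1`) and shows why the reciprocal transform reverses the direction of effects. The helper name `rrt_of` is hypothetical, and the power transform is only expected to agree with the stored `prt` value approximately, since the exponent is quoted to seven digits.

```julia
# Transformations of a response time in milliseconds, as described in the text.
rt = 506.1                      # first response time shown in the data display

lrt = log(rt)                   # log transform
prt = rt^0.4242424              # power transform with the Box-Cox exponent

# Reciprocal transform (hypothetical helper name): slower responses get
# SMALLER values, so the sign of every effect flips relative to rt.
rrt_of(rt) = 1000 / rt

println("lrt = ", lrt)
println("prt = ", prt)
println("rrt(400) > rrt(500): ", rrt_of(400.0) > rrt_of(500.0))
```

The first two values can be compared with the `lrt` and `prt` columns of the first displayed row.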

In [1]:
using DataFrames,RCall,MixedModels
kwdyz = DataFrame("RePsychLing::KWDYZ")

| | item | tar | dir | rt | subj | c1 | c2 | c3 | srt | lrt | qrt | prt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 39 | dod | hor | 506.1 | 1 | 0.25 | 0.5 | -0.75 | 0.5061 | 6.226734278220032 | 22.4966664197165 | 14.036235047977032 |
| 2 | 52 | dod | hor | 489.6 | 1 | 0.25 | 0.5 | -0.75 | 0.48960000000000004 | 6.193588731198116 | 22.126906697502932 | 13.840242483234347 |
| 3 | 89 | dod | hor | 518.7 | 1 | 0.25 | 0.5 | -0.75 | 0.5187 | 6.251325681357355 | 22.774986278810356 | 14.183437487026127 |
| 4 | 104 | dod | hor | 459.6 | 1 | 0.25 | 0.5 | -0.75 | 0.4596 | 6.130356545974601 | 21.438283513378586 | 13.473903246772165 |
| 5 | 120 | dod | hor | 384.2 | 1 | 0.25 | 0.5 | -0.75 | 0.3842 | 5.9511632503344565 | 19.601020381602584 | 12.487565630398025 |
| 6 | 161 | dod | hor | 470.0 | 1 | 0.25 | 0.5 | -0.75 | 0.47 | 6.152732694704104 | 21.6794833886788 | 13.6024187183506 |
| 7 | 194 | dod | hor | 422.0 | 1 | 0.25 | 0.5 | -0.75 | 0.422 | 6.045005314036012 | 20.54263858417414 | 12.994746292564496 |
| 8 | 248 | dod | hor | 462.8 | 1 | 0.25 | 0.5 | -0.75 | 0.4628 | 6.137294995319522 | 21.51278689523977 | 13.513623211612503 |
| 9 | 270 | dod | hor | 471.9 | 1 | 0.25 | 0.5 | -0.75 | 0.4719 | 6.156767098732342 | 21.723259423944647 | 13.625720058735002 |
| 10 | 277 | dod | hor | 445.5 | 1 | 0.25 | 0.5 | -0.75 | 0.4455 | 6.099197246910864 | 21.106870919205434 | 13.29696266938164 |


## Models

### Maximal linear mixed model (_maxLMM_) 

The maximal model (_maxLMM_) reported in the paper is actually overparameterized (degenerate). Here we show how to identify the overparameterization and how we tried to deal with it.

In [2]:
m0 = fit(lmm(rt ~ 1+c1+c2+c3 + (1+c1+c2+c3|subj), kwdyz))

Linear mixed model fit by maximum likelihood
Formula: rt ~ 1 + c1 + c2 + c3 + ((1 + c1 + c2 + c3) | subj)

 logLik: -162904.774673, deviance: 325809.549347

 Variance components:
                Variance    Std.Dev.  Corr.
 subj         3047.168904   55.201168
              540.507922   23.248826   0.60
              115.651117   10.754121  -0.13 -0.13
               90.443092    9.510157  -0.25 -0.25 -0.25
 Residual     4876.903925   69.834833
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
             Estimate Std.Error z value
(Intercept)   389.734   7.09149 54.9579
c1            33.7817   3.28744  10.276
c2            13.9852   2.30574 6.06539
c3            2.74695   2.21425 1.24058


In [3]:
m0.λ

1-element Array{Any,1}:
 PDLCholF(Cholesky{Float64} with factor:
4x4 Triangular{Float64,Array{Float64,2},:L,false}:
  0.790453    0.0        0.0       0.0
  0.201073    0.26533    0.0       0.0
 -0.0202282   0.0137804  0.152036  0.0
 -0.0338984  -0.119124   0.056618  0.0)

In [4]:
map(svdvals,m0.λ)

1-element Array{Any,1}:
 [0.820001,0.282049,0.161098,0.0]

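Both the singular value decomposition (svd) and the form of the $\lambda$ matrix itself show that the estimated covariance matrix of the unconditional distribution of the random effects is singular: the smallest singular value is zero, and so is the last diagonal element of the lower-triangular factor.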
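
The link between $\lambda$ and the reported variance components can be sketched with the standard library alone. The example assumes the usual relation $\Sigma = \sigma^2 \lambda \lambda'$ between the relative covariance factor and the covariance matrix of the random effects; the numbers are copied from the printed output above (six significant digits), so the results match only to that precision.

```julia
using LinearAlgebra

# Relative covariance factor, copied from the printed m0.λ output.
λ = [ 0.790453    0.0        0.0       0.0
      0.201073    0.26533    0.0       0.0
     -0.0202282   0.0137804  0.152036  0.0
     -0.0338984  -0.119124   0.056618  0.0]

σ = 69.834833                 # residual standard deviation of the fit

Σ = σ^2 * (λ * λ')            # implied covariance matrix of the random effects
sds = sqrt.(diag(Σ))          # compare with the Std.Dev. column for subj

sv = svdvals(λ)               # singular values of the factor
nonzero = count(>(1e-8), sv)  # numerical rank with an explicit tolerance

println("standard deviations: ", round.(sds, digits = 2))
println("singular values:     ", sv)
println("numerical rank:      ", nonzero, " of ", size(λ, 1))
```

A rank of three for a four-dimensional random-effects vector means that one linear combination of the random effects has zero estimated variance.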

### Zero-correlation-parameter linear mixed model (_zcpLMM_)

One option to reduce the complexity of the _maxLMM_ is to force correlation parameters to zero.
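
To make the size of this reduction concrete, the following sketch counts model parameters. It assumes the usual parameterization of a k-dimensional random-effects term (k variances plus k(k-1)/2 correlations); the helper name `ncovpars` is ours and is not part of MixedModels.

```julia
# Free covariance parameters of one k-dimensional random-effects term:
# k variances plus k(k-1)/2 correlations.
ncovpars(k) = k * (k + 1) ÷ 2

nfixed = 4    # intercept, c1, c2, c3
nresid = 1    # residual variance

# maxLMM: one term with a full 4x4 covariance matrix.
df_max = nfixed + ncovpars(4) + nresid

# zcpLMM: four independent scalar terms, hence no correlation parameters.
df_zcp = nfixed + 4 * ncovpars(1) + nresid

println("maxLMM parameters: ", df_max)
println("zcpLMM parameters: ", df_zcp)
println("correlations removed: ", df_max - df_zcp)
```

These totals agree with the `Df` values that the likelihood ratio test reports for the two models later in this section.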

In [5]:
m1 = fit(lmm(rt ~ 1+c1+c2+c3 + 
(1|subj)+(0+c1|subj)+(0+c2|subj)+(0+c3|subj), kwdyz))

Linear mixed model fit by maximum likelihood
Formula: rt ~ 1 + c1 + c2 + c3 + (1 | subj) + ((0 + c1) | subj) + ((0 + c2) | subj) + ((0 + c3) | subj)

 logLik: -162933.484994, deviance: 325866.969987

 Variance components:
                Variance    Std.Dev.  Corr.
 subj         3000.448567   54.776350
              696.365808   26.388744   0.00
                0.000000    0.000000   0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00
 Residual     4888.337817   69.916649
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
             Estimate Std.Error z value
(Intercept)   389.732   7.03734 55.3806
c1            33.7618   3.65606 9.23446
c2            14.0247   1.85135 7.57537
c3            2.77449   1.85134 1.49864


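In this fit the variance components for `c2` and `c3` are estimated as zero. `lmer` obtains a better fit at the following parameter values: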

In [6]:
MixedModels.objective!(m1,[0.783409125819598, 0.343236594542506, 
    0.148817522465133, 0.123267732858873])

325848.60722980055

In [8]:
m1.fit = false; fit(m1)

Linear mixed model fit by maximum likelihood
Formula: rt ~ 1 + c1 + c2 + c3 + (1 | subj) + ((0 + c1) | subj) + ((0 + c2) | subj) + ((0 + c3) | subj)

 logLik: -162924.303615, deviance: 325848.607230

 Variance components:
                Variance    Std.Dev.  Corr.
 subj         2993.405273   54.712021
              574.613927   23.971106   0.00
              108.017873   10.393165   0.00  0.00
               74.111792    8.608821   0.00  0.00  0.00
 Residual     4877.415190   69.838494
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
             Estimate Std.Error z value
(Intercept)   389.728   7.02908  55.445
c1             33.774    3.3715 10.0175
c2            14.0033   2.27855 6.14568
c3            2.78726   2.15314 1.29451
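
The parameter vector passed to `objective!` can be related to the refitted variance components with a short check. The sketch assumes that, for this model of four uncorrelated scalar terms, each parameter is a standard deviation relative to the residual standard deviation; all numbers are copied from the outputs above.

```julia
# Parameter values supplied to objective! (taken from the lmer fit).
θ = [0.783409125819598, 0.343236594542506,
     0.148817522465133, 0.123267732858873]

σ = 69.838494                  # residual standard deviation after refitting

# Relative standard deviations times σ give the random-effect standard
# deviations (compare with the Std.Dev. column of the refitted model).
sds = σ .* θ

# Improvement in the deviance relative to the first, degenerate fit.
dev_first = 325866.969987
dev_refit = 325848.607230
gain = dev_first - dev_refit

println("standard deviations: ", round.(sds, digits = 3))
println("deviance reduction:  ", round(gain, digits = 3))
```

The products agree with the refitted output up to the rounding of the printed residual standard deviation.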


In [10]:
MixedModels.lrt(m1,m0) # significant, too much reduction

| | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 9 | 325848.6072297999 | | |
| 2 | 15 | 325809.5493466815 | 39.057883118395694 | 6.973019046337306e-07 |


There is no exact singularity for the _zcpLMM_. This model, however, fits significantly worse than the _maxLMM_. Thus, removing all correlation parameters was too much of a reduction in model complexity. Before checking individual correlation parameters for inclusion, we check whether any of the variance components are not supported by the data.
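
The test statistic and p-value in the table above can be reproduced by hand. The sketch below uses a closed form for the upper tail of a chi-square distribution that is valid only for an even number of degrees of freedom; it is a stand-in chosen to avoid extra packages and is not necessarily how `MixedModels.lrt` computes the p-value. The function name `chisq_upper_tail` is hypothetical.

```julia
# Upper tail P(X > x) for a chi-square variable with an EVEN number of
# degrees of freedom: exp(-x/2) * sum_{j=0}^{df/2-1} (x/2)^j / j!
function chisq_upper_tail(x, df)
    iseven(df) || throw(ArgumentError("closed form requires even df"))
    h = x / 2
    term, total = 1.0, 1.0
    for j in 1:(df ÷ 2 - 1)
        term *= h / j          # builds (x/2)^j / j! incrementally
        total += term
    end
    return exp(-h) * total
end

# Deviances and parameter counts from the lrt output.
dev_zcp, df_zcp = 325848.6072297999, 9
dev_max, df_max = 325809.5493466815, 15

chisq = dev_zcp - dev_max                       # likelihood ratio statistic
p = chisq_upper_tail(chisq, df_max - df_zcp)    # reference: chi-square, 6 df

println("Chisq = ", chisq)
println("p     = ", p)
```

The same function applies to any other comparison in this section whose difference in `Df` is even.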

In [11]:
m2 = fit(lmm(rt ~ 1+c1+c2+c3 + (1+c1+c2|subj),kwdyz))

Linear mixed model fit by maximum likelihood
Formula: rt ~ 1 + c1 + c2 + c3 + ((1 + c1 + c2) | subj)

 logLik: -162913.894074, deviance: 325827.788149

 Variance components:
                Variance    Std.Dev.  Corr.
 subj         3046.867891   55.198441
              536.876928   23.170605   0.61
               96.794222    9.838405  -0.02 -0.02
 Residual     4881.760085   69.869593
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
             Estimate Std.Error z value
(Intercept)   389.734   7.09117 54.9604
c1            33.7795   3.27867 10.3028
c2            14.0089   2.23836 6.25853
c3            2.78883   1.85014 1.50736


In [12]:
MixedModels.lrt(m2,m0)  # still highly significant change from m0

| | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 11 | 325827.7881488834 | | |
| 2 | 15 | 325809.5493466815 | 18.23880220187129 | 0.001108279331069 |


In [13]:
m2.λ

1-element Array{Any,1}:
 PDLCholF(Cholesky{Float64} with factor:
3x3 Triangular{Float64,Array{Float64,2},:L,false}:
  0.790021    0.0       0.0     
  0.200855    0.263882  0.0     
 -0.00308719  0.076954  0.117882)

### Using `lrt = log(rt)` or `prt = rt^power` (according to Box-Cox)

In [14]:
m2i = fit(lmm(lrt ~ 1 + c1 + c2 + c3 + (1 + c1 + c2 + c3 | subj),kwdyz))

Linear mixed model fit by maximum likelihood
Formula: lrt ~ 1 + c1 + c2 + c3 + ((1 + c1 + c2 + c3) | subj)

 logLik: 6391.186870, deviance: -12782.373741

 Variance components:
                Variance    Std.Dev.  Corr.
 subj           0.020765    0.144100
                0.003385    0.058178   0.48
                0.000753    0.027442  -0.24 -0.24
                0.000622    0.024941  -0.30 -0.30 -0.30
 Residual       0.036854    0.191975
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
              Estimate  Std.Error z value
(Intercept)    5.93583  0.0185187 320.532
c1           0.0877736 0.00837831 10.4763
c2           0.0366027 0.00618003 5.92273
c3           0.0086108 0.00600355 1.43428


In [15]:
m2i.λ

1-element Array{Any,1}:
 PDLCholF(Cholesky{Float64} with factor:
4x4 Triangular{Float64,Array{Float64,2},:L,false}:
  0.750618    0.0         0.0        0.0
  0.144542    0.266359    0.0        0.0
 -0.0345324  -0.00627849  0.138568   0.0
 -0.0391828  -0.116144    0.0430527  0.0)

In [16]:
m2j = fit(lmm(prt ~ 1 + c1 + c2 + c3 + (1 + c1 + c2 + c3 | subj),kwdyz))

Linear mixed model fit by maximum likelihood
Formula: prt ~ 1 + c1 + c2 + c3 + ((1 + c1 + c2 + c3) | subj)

 logLik: -40364.956363, deviance: 80729.912726

 Variance components:
                Variance    Std.Dev.  Corr.
 subj           0.573057    0.757006
                0.096789    0.311109   0.53
                0.020716    0.143931  -0.19 -0.19
                0.016826    0.129714  -0.28 -0.28 -0.28
 Residual       0.957027    0.978277
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
              Estimate Std.Error z value
(Intercept)    12.4741 0.0972639 128.251
c1            0.463096 0.0443697 10.4372
c2            0.192511 0.0317935 6.05505
c3           0.0423165 0.0307731 1.37511


In [17]:
m2j.λ

1-element Array{Any,1}:
 PDLCholF(Cholesky{Float64} with factor:
4x4 Triangular{Float64,Array{Float64,2},:L,false}:
  0.773815    0.0        0.0        0.0
  0.169872    0.268846   0.0        0.0
 -0.028671    0.0031853  0.144271   0.0
 -0.0374287  -0.117442   0.0488643  0.0)
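
The printed factors can be compared at a glance by looking only at their diagonals, since a triangular matrix is singular exactly when one of its diagonal entries is zero. The sketch below copies the diagonals from the outputs above; the labels and the helper name `is_singular` are ours. Read this way, the log and power transformations do not appear to remove the degeneracy of the four-term model, whereas the model without the `c3` variance component has a full-rank factor.

```julia
# Diagonals of the printed relative covariance factors, copied from the
# m0.λ, m2.λ, m2i.λ and m2j.λ outputs.
factors = [
    "m0  (rt, 4 terms)"  => [0.790453, 0.26533,  0.152036, 0.0],
    "m2  (rt, 3 terms)"  => [0.790021, 0.263882, 0.117882],
    "m2i (lrt, 4 terms)" => [0.750618, 0.266359, 0.138568, 0.0],
    "m2j (prt, 4 terms)" => [0.773815, 0.268846, 0.144271, 0.0],
]

# A triangular factor is singular if and only if a diagonal entry is zero.
is_singular(d) = any(iszero, d)

singular = Dict(name => is_singular(d) for (name, d) in factors)

for (name, d) in factors
    println(rpad(name, 20), is_singular(d) ? "singular" : "full rank")
end
```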