# RePsychLing Kliegl et al. (2011)

This is a set of follow-up analyses to Kliegl et al. (2011).

Reinhold Kliegl, Ping Wei, Michael Dambacher, Ming Yan, & Xiaolin Zhou (2011). Experimental Effects and Individual Differences in Linear Mixed Models: Estimating the Relation between Spatial, Object, and Attraction Effects in Visual Attention. Frontiers in Psychology, 1, 1-12.

We use the final data set from the paper, that is, the data after filtering a few outlier responses, defining `sdif` contrasts for the factor `tar`, and extracting the corresponding vector-valued contrasts `c1`, `c2`, `c3` from the model matrix. The dataframe also includes transformations of `rt`: `lrt = log(rt)`, `qrt = sqrt(rt)`, `srt = rt/1000`, and `prt = rt^0.4242424` (the exponent suggested by Box-Cox); `subj = factor(id)`.
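As a quick check on the log and power transformations, the following minimal sketch recomputes them in plain Julia from the first three `rt` values of the data. The vector `rt` is typed in by hand here as an assumption for illustration; in the notebook the values come from the `kwdyz` dataframe.

```julia
# First three response times (ms), copied by hand from the data
rt = [506.1, 489.6, 518.7]

# Log transform
lrt = log.(rt)

# Power transform with the Box-Cox-motivated exponent quoted in the text
prt = rt .^ 0.4242424

println(round.(lrt, digits = 5))
println(round.(prt, digits = 4))
```

The printed values should agree with the `lrt` and `prt` columns of the corresponding rows.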

In [10]:
using BenchmarkTools, LinearAlgebra, MixedModels, RCall


In [1]:
kwdyz = rcopy(R"RePsychLing::KWDYZ")

| Row | item | tar | dir | rt | subj | c1 | c2 | c3 | srt | lrt | qrt | prt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 39 | dod | hor | 506.1 | 1 | 0.25 | 0.5 | -0.75 | 0.5061 | 6.22673 | 22.4967 | 14.0362 |
| 2 | 52 | dod | hor | 489.6 | 1 | 0.25 | 0.5 | -0.75 | 0.4896 | 6.19359 | 22.1269 | 13.8402 |
| 3 | 89 | dod | hor | 518.7 | 1 | 0.25 | 0.5 | -0.75 | 0.5187 | 6.25133 | 22.775 | 14.1834 |
| 4 | 104 | dod | hor | 459.6 | 1 | 0.25 | 0.5 | -0.75 | 0.4596 | 6.13036 | 21.4383 | 13.4739 |
| 5 | 120 | dod | hor | 384.2 | 1 | 0.25 | 0.5 | -0.75 | 0.3842 | 5.95116 | 19.601 | 12.4876 |
| 6 | 161 | dod | hor | 470.0 | 1 | 0.25 | 0.5 | -0.75 | 0.47 | 6.15273 | 21.6795 | 13.6024 |
| 7 | 194 | dod | hor | 422.0 | 1 | 0.25 | 0.5 | -0.75 | 0.422 | 6.04501 | 20.5426 | 12.9947 |
| 8 | 248 | dod | hor | 462.8 | 1 | 0.25 | 0.5 | -0.75 | 0.4628 | 6.13729 | 21.5128 | 13.5136 |
| 9 | 270 | dod | hor | 471.9 | 1 | 0.25 | 0.5 | -0.75 | 0.4719 | 6.15677 | 21.7233 | 13.6257 |
| 10 | 277 | dod | hor | 445.5 | 1 | 0.25 | 0.5 | -0.75 | 0.4455 | 6.0992 | 21.1069 | 13.297 |


## Models

### Maximal linear mixed model (_maxLMM_) 

The maximal model (_maxLMM_) reported in the paper is actually overparameterized (degenerate). Here we show how to identify the overparameterization and how we tried to deal with it.

In [3]:
m0 = fit!(LinearMixedModel(@formula(rt ~ 1+c1+c2+c3 + (1+c1+c2+c3|subj)), kwdyz))

Linear mixed model fit by maximum likelihood
 Formula: rt ~ 1 + c1 + c2 + c3 + ((1 + c1 + c2 + c3) | subj)
     logLik        -2 logLik          AIC             BIC       
 -1.62904775×10⁵  3.25809549×10⁵  3.25839549×10⁵  3.25963524×10⁵

Variance components:
              Column    Variance   Std.Dev.    Corr.
 subj     (Intercept)  3046.68856 55.1968166
          c1            540.42902 23.2471292  0.60
          c2            115.67074 10.7550333 -0.13 -0.01
          c3             90.40940  9.5083858 -0.25 -0.85  0.36
 Residual              4876.90461 69.8348381
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
             Estimate Std.Error z value P(>|z|)
(Intercept)   389.734   7.09094 54.9622  <1e-99
c1            33.7817   3.28724 10.2766  <1e-24
c2            13.9852   2.30581 6.06521   <1e-8
c3            2.74699   2.21412 1.24067  0.2147


In [11]:
@btime fit!(LinearMixedModel(@formula(rt ~ 1+c1+c2+c3 + (1+c1+c2+c3|subj)), kwdyz));

  213.369 ms (6160176 allocations: 109.23 MiB)


In [7]:
Λ = getΛ(m0)[1]

4×4 LowerTriangular{Float64,Array{Float64,2}}:
  0.790391     ⋅         ⋅          ⋅ 
  0.201019    0.26534    ⋅          ⋅ 
 -0.0202177   0.013757  0.152053    ⋅ 
 -0.0338419  -0.119135  0.0565678  0.0

In [8]:
show(svdvals(Λ))

[0.819923, 0.282066, 0.161098, 0.0]

Both the singular value decomposition (SVD) and the form of the $\Lambda$ matrix itself show that the estimated covariance matrix of the unconditional distribution of the random effects is singular: the last diagonal element of $\Lambda$ is exactly zero, and so is its smallest singular value.
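To make the link between $\Lambda$ and the variance components concrete, recall that the covariance matrix of the random effects is parameterized as $\Sigma = \sigma^2 \Lambda \Lambda'$, where $\sigma$ is the residual standard deviation. The sketch below rebuilds $\Sigma$ from the printed estimates using only the standard library. The numbers are copied by hand from the output above and stored in a plain `Matrix` named `L`, so small rounding differences are expected. This is an illustration, not how the package computes its summary.

```julia
using LinearAlgebra

# Relative covariance factor, copied by hand from the printed Λ of m0
L = [ 0.790391    0.0       0.0        0.0
      0.201019    0.26534   0.0        0.0
     -0.0202177   0.013757  0.152053   0.0
     -0.0338419  -0.119135  0.0565678  0.0 ]

σ = 69.8348381          # residual standard deviation reported for m0

# Covariance matrix of the random effects: Σ = σ² Λ Λ'
Σ = σ^2 * (L * L')

# Square roots of the diagonal give the random-effect standard deviations
sds = sqrt.(diag(Σ))
println(round.(sds, digits = 3))

# The zero on the diagonal of L leaves L (and hence Σ) rank deficient
println(rank(L))
println(minimum(svdvals(L)))
```

The standard deviations should reproduce the `Std.Dev.` column of the variance components up to rounding, and the rank of 3 for a 4×4 factor is the signature of the degenerate fit.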

### Zero-correlation parameter linear mixed model (_zcpLMM_)

One option to reduce the complexity of the _maxLMM_ is to force correlation parameters to zero.

In [12]:
@time m1 = fit!(LinearMixedModel(@formula(rt ~ 1+c1+c2+c3 + 
    (1|subj)+(0+c1|subj)+(0+c2|subj)+(0+c3|subj)), kwdyz))

  0.401214 seconds (6.21 M allocations: 112.292 MiB, 30.19% gc time)


Linear mixed model fit by maximum likelihood
 Formula: rt ~ 1 + c1 + c2 + c3 + (1 | subj) + ((0 + c1) | subj) + ((0 + c2) | subj) + ((0 + c3) | subj)
    logLik      -2 logLik        AIC           BIC      
 -1.629243×10⁵ 3.2584861×10⁵ 3.2586661×10⁵ 3.2594099×10⁵

Variance components:
              Column    Variance   Std.Dev.   Corr.
 subj     (Intercept)  2993.35621 54.711573
          c1            574.62586 23.971355  0.00
          c2            108.01087 10.392828  0.00  0.00
          c3             74.11924  8.609253  0.00  0.00  0.00
 Residual              4877.41513 69.838493
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
             Estimate Std.Error z value P(>|z|)
(Intercept)   389.728   7.02902 55.4455  <1e-99
c1             33.774   3.37153 10.0174  <1e-22
c2            14.0033   2.27853 6.14575   <1e-9
c3            2.78726   2.15317 1.29449  0.1955


In [16]:
MixedModels.lrt(m0, m1)

| Row | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 9 | 325849.0 | | |
| 2 | 15 | 325810.0 | 39.0579 | 6.97303e-07 |


There is no exact singularity for the _zcpLMM_. This model, however, fits significantly worse than the _maxLMM_. Thus, removing all correlation parameters was too much of a reduction in model complexity. Before checking individual correlation parameters for inclusion, we check whether any of the variance components are not supported by the data.
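The likelihood ratio test behind this comparison can be reproduced by hand from the two model summaries. The sketch below is an illustration under stated assumptions: the deviances are the rounded `-2 logLik` values printed above, and `chisq_upper_tail` is a hypothetical helper (not part of MixedModels) that uses the closed-form upper tail of a χ² distribution, which exists only for an even number of degrees of freedom.

```julia
# Deviances (-2 log-likelihood) and parameter counts from the printed fits
dev_zcp, npar_zcp = 325848.61, 9     # 4 fixed effects + 4 variances + residual
dev_max, npar_max = 325809.549, 15   # 4 fixed effects + 10 covariance parameters + residual

chisq = dev_zcp - dev_max            # test statistic
df = npar_max - npar_zcp             # difference in number of parameters

# Hypothetical helper: upper-tail probability of a χ² distribution with even df,
# P(X > x) = exp(-x/2) * Σ_{j=0}^{df/2 - 1} (x/2)^j / j!
function chisq_upper_tail(x, df)
    iseven(df) || throw(ArgumentError("closed form requires an even df"))
    h = x / 2
    return exp(-h) * sum(h^j / factorial(j) for j in 0:(df ÷ 2 - 1))
end

p = chisq_upper_tail(chisq, df)
println((chisq = chisq, df = df, p = p))
```

Because the inputs are rounded, the statistic and p-value agree with the table only to about three significant digits. The six degrees of freedom correspond to the six correlation parameters that the _zcpLMM_ fixes at zero.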

In [17]:
m2 = fit!(LinearMixedModel(@formula(rt ~ 1+c1+c2+c3 + (1+c1+c2|subj)),kwdyz))

Linear mixed model fit by maximum likelihood
 Formula: rt ~ 1 + c1 + c2 + c3 + ((1 + c1 + c2) | subj)
     logLik       -2 logLik         AIC            BIC      
 -1.6291389×10⁵  3.2582779×10⁵  3.2584979×10⁵   3.259407×10⁵

Variance components:
              Column    Variance   Std.Dev.   Corr.
 subj     (Intercept)  3046.26125 55.192946
          c1            536.96391 23.172482  0.61
          c2             96.77148  9.837249 -0.02  0.42
 Residual              4881.76064 69.869597
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
             Estimate Std.Error z value P(>|z|)
(Intercept)   389.734   7.09047 54.9659  <1e-99
c1            33.7795   3.27889 10.3021  <1e-24
c2            14.0089   2.23828 6.25877   <1e-9
c3            2.78883   1.85014 1.50736  0.1317


In [18]:
MixedModels.lrt(m2,m0)  # still highly significant change from m0

| Row | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 11 | 325828.0 | | |
| 2 | 15 | 325810.0 | 18.2388 | 0.00110827 |


In [19]:
getΛ(m2)[1]

3×3 LowerTriangular{Float64,Array{Float64,2}}:
  0.789942     ⋅          ⋅      
  0.200844    0.263923    ⋅      
 -0.00303928  0.0768552  0.117928

### Using `lrt = log(rt)` or `prt = rt^power` (according to Box-Cox)

In [20]:
m2i = fit!(LinearMixedModel(@formula(lrt ~ 1 + c1 + c2 + c3 + (1 + c1 + c2 + c3 | subj)),kwdyz))

Linear mixed model fit by maximum likelihood
 Formula: lrt ~ 1 + c1 + c2 + c3 + ((1 + c1 + c2 + c3) | subj)
     logLik        -2 logLik          AIC             BIC       
  6.39118687×10³ -1.27823737×10⁴ -1.27523737×10⁴ -1.26283987×10⁴

Variance components:
              Column     Variance     Std.Dev.    Corr.
 subj     (Intercept)  0.0207649899 0.144100624
          c1           0.0033846953 0.058178134  0.48
          c2           0.0007529822 0.027440521 -0.24 -0.15
          c3           0.0006220570 0.024941070 -0.30 -0.93  0.43
 Residual              0.0368543388 0.191974839
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
              Estimate  Std.Error z value P(>|z|)
(Intercept)    5.93583  0.0185188  320.53  <1e-99
c1           0.0877736 0.00837833 10.4763  <1e-24
c2           0.0366027 0.00617995 5.92281   <1e-8
c3           0.0086108 0.00600357 1.43428  0.1515


In [21]:
getΛ(m2i)[1]

4×4 LowerTriangular{Float64,Array{Float64,2}}:
  0.750622     ⋅          ⋅          ⋅ 
  0.144551    0.266355    ⋅          ⋅ 
 -0.0345388  -0.0062691  0.138561    ⋅ 
 -0.039185   -0.116143   0.0430596  0.0

In [22]:
m2j = fit!(LinearMixedModel(@formula(prt ~ 1 + c1 + c2 + c3 + (1 + c1 + c2 + c3 | subj)),kwdyz))

Linear mixed model fit by maximum likelihood
 Formula: prt ~ 1 + c1 + c2 + c3 + ((1 + c1 + c2 + c3) | subj)
     logLik        -2 logLik          AIC             BIC       
 -4.03649564×10⁴  8.07299127×10⁴  8.07599127×10⁴  8.08838877×10⁴

Variance components:
              Column     Variance   Std.Dev.    Corr.
 subj     (Intercept)  0.573062576 0.75700897
          c1           0.096787923 0.31110757  0.53
          c2           0.020716390 0.14393189 -0.19 -0.09
          c3           0.016825331 0.12971249 -0.28 -0.90  0.40
 Residual              0.957026881 0.97827751
 Number of obs: 28710; levels of grouping factors: 61

  Fixed-effects parameters:
              Estimate Std.Error z value P(>|z|)
(Intercept)    12.4741 0.0972643  128.25  <1e-99
c1            0.463096 0.0443696 10.4372  <1e-24
c2            0.192511 0.0317936 6.05504   <1e-8
c3           0.0423166  0.030773 1.37512  0.1691


In [23]:
getΛ(m2j)[1]

4×4 LowerTriangular{Float64,Array{Float64,2}}:
  0.773818     ⋅           ⋅          ⋅ 
  0.169881    0.268839     ⋅          ⋅ 
 -0.0286692   0.00318065  0.144273    ⋅ 
 -0.0374256  -0.117444    0.0488586  0.0