# RePsychLing Kronmüller and Barr (2007)

We apply the iterative reduction of LMM complexity to truncated response times from a 2x2x2 factorial psycholinguistic experiment (Kronmüller and Barr, 2007, Exp. 2; reanalyzed with an LMM in Barr, Levy, Scheepers and Tily, 2013). The data are from 56 subjects who responded to 32 items. Subjects used a cursor to select one of several objects presented on a monitor. The manipulations were (1) whether auditory instructions maintained or broke a precedent of reference for the objects established over prior trials, (2) whether the instruction was presented by the speaker who established the precedent (i.e., an old speaker) or by a new speaker, and (3) whether the task was performed without or with a cognitive load consisting of six random digits. All factors were varied within subjects and within items.

There were main effects of Load, Speaker, and Precedent; none of the interactions was significant. Although standard errors of fixed-effect coefficients varied slightly across models, our reanalyses afforded the same statistical inference about the experimental manipulations as the original article, irrespective of LMM specification. The purpose of the analysis is to illustrate an assessment of model complexity with respect to variance components and correlation parameters, neither of which was the focus of the original publication.

## Data

The data are available as `kb07` in the [RePsychLing package](https://github.com/dmbates/RePsychLing) for [R](http://www.r-project.org).

In [2]:
using DataFrames,RCall,MixedModels
kb07 = DataFrame("RePsychLing::kb07")

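| Row | subj | item | RTtrunc | S | P | C | SP | SC | PC | SPC |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 30 | 1 | 2267.0 | 1.0 | -1.0 | 1.0 | -1.0 | 1.0 | -1.0 | -1.0 |
| 2 | 30 | 2 | 3856.0 | -1.0 | 1.0 | -1.0 | -1.0 | 1.0 | -1.0 | 1.0 |
| 3 | 30 | 3 | 1567.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 | 1.0 | -1.0 |
| 4 | 30 | 4 | 1732.0 | 1.0 | 1.0 | -1.0 | 1.0 | -1.0 | -1.0 | -1.0 |
| 5 | 30 | 5 | 2660.0 | 1.0 | -1.0 | -1.0 | -1.0 | -1.0 | 1.0 | 1.0 |
| 6 | 30 | 6 | 2763.0 | -1.0 | 1.0 | 1.0 | -1.0 | -1.0 | 1.0 | -1.0 |
| 7 | 30 | 7 | 3528.0 | -1.0 | -1.0 | 1.0 | 1.0 | -1.0 | -1.0 | 1.0 |
| 8 | 30 | 8 | 1741.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 9 | 30 | 9 | 3692.0 | 1.0 | -1.0 | 1.0 | -1.0 | 1.0 | -1.0 | -1.0 |
| 10 | 30 | 10 | 1949.0 | -1.0 | 1.0 | -1.0 | -1.0 | 1.0 | -1.0 | 1.0 |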


### Maximal linear mixed model (_maxLMM_)

Barr et al. (2013, supplement) analyzed Kronmüller and Barr (2007, Exp. 2) with the _maxLMM_ comprising 16 variance component parameters (eight each for the random factors `subj` and `item`). This model takes a relatively long time to fit using `lmm` because there are so many parameters and the likelihood surface is very flat. To save time, we start the optimization near the optimum.

In [3]:
m0 = lmm(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC +
(1+S+P+C+SP+SC+PC+SPC|subj) + (1+S+P+C+SP+SC+PC+SPC|item), kb07);

MixedModels.objective!(m0,
[0.4765945402846765,-0.049465200650367254,-0.05577395841877958,
    0.029513337949340943,0.029148407625182948,0.03194864344405289,
    -0.013919372830411688,-0.04638735990276547,0.10257407134683175,
    -0.0171000112983217,-0.016603717040510488,-0.1109210610843561,
    -0.02435319160421018,0.0126607985319349,0.021792305804416535,
    0.10232424394072726,0.07801670286928138,-0.09440572778256788,
    0.005078526659066913,-0.0134017916950307,-0.06505508009231055,
    0.10841211573450886,0.0054641402665540385,-0.05402178547126058,
    -0.1343169035616224,0.05153142117691683,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,
    0.5699800437077661,-0.02339206691982298,-0.26704717626433516,
    0.01736408969190548,0.029034597610324384,0.01783789093485394,
    0.00848310344219369,0.004879390800290307,0.06399876167032546,
    -0.288060036040779,0.003544393801046275,-0.030843446553322655,
    0.004685779731614644,-0.023824295073761197,-0.05397308647226873,
    0.042916034668339306,-0.01217658229652332,-0.01712436845429653,
    -0.01634632724087294,0.10045183250207336,-0.008212424968067053,
    0.08316744039186683,-0.006491337942497235,0.02291032480523459,
    -0.0007845845820843916,-0.07761538290870172,0.021897215394384946,
    -0.054052700984661306,-0.03210211651923669,0.05507672149152027,
    0.,0.,0.,0.,0.,0.])

28586.317619598813

In [4]:
fit(m0)

Linear mixed model fit by maximum likelihood
Formula: RTtrunc ~ 1 + S + P + C + SP + SC + PC + SPC + ((1 + S + P + C + SP + SC + PC + SPC) | subj) + ((1 + S + P + C + SP + SC + PC + SPC) | item)

 logLik: -14293.158809, deviance: 28586.317618

 Variance components:
                Variance    Std.Dev.  Corr.
 subj         90768.454631  301.278035
              5182.222981   71.987659  -0.43
              5543.917579   74.457488  -0.47 -0.47
              7587.207623   87.104579   0.21  0.21  0.21
              8829.508465   93.965464   0.20  0.20  0.20  0.20
              1821.402959   42.677898   0.47  0.47  0.47  0.47  0.47
              7422.585309   86.154427  -0.10 -0.10 -0.10 -0.10 -0.10 -0.10
              3802.035391   61.660647  -0.48 -0.48 -0.48 -0.48 -0.48 -0.48 -0.48
 item         129824.964990  360.312316
              1855.413006   43.074505  -0.34
              62393.466734  249.786843  -0.68 -0.68
              2948.599566   54.301009   0.20  0.20  0.20
              10

This fit converges and produces what look like reasonable parameter estimates (i.e., no variance components with estimates close to zero; no correlation parameters with values close to $\pm1$).

Further investigation, however, shows that the Cholesky factors of the covariance matrices of the unconditional distribution of the within-subject and within-item random effects are singular.

In [5]:
m0.λ

2-element Array{Any,1}:
 PDLCholF(Cholesky{Float64} with factor:
8x8 Triangular{Float64,Array{Float64,2},:L,false}:
  0.476593    0.0         0.0          0.0         0.0  0.0  0.0  0.0
 -0.0494647   0.102574    0.0          0.0         0.0  0.0  0.0  0.0
 -0.055773   -0.0170985   0.102324     0.0         0.0  0.0  0.0  0.0
  0.0295124  -0.0166024   0.0780166    0.108412    0.0  0.0  0.0  0.0
  0.029148   -0.110922   -0.0944035    0.00546411  0.0  0.0  0.0  0.0
  0.0319486  -0.0243532   0.00507938  -0.0540216   0.0  0.0  0.0  0.0
 -0.0139184   0.0126607  -0.0134017   -0.134316    0.0  0.0  0.0  0.0
 -0.0463876   0.0217912  -0.0650552    0.0515311   0.0  0.0  0.0  0.0)
 PDLCholF(Cholesky{Float64} with factor:
8x8 Triangular{Float64,Array{Float64,2},:L,false}:
  0.56998      0.0          0.0         …   0.0        0.0  0.0  0.0
 -0.0233922    0.0639987    0.0             0.0        0.0  0.0  0.0
 -0.267047    -0.288061     0.04291         0.0        0.0  0.0  0.0
  0.017364     0.0035384

Although the random-effects vectors for `subj` and for `item` are 8-dimensional, there are 4 directions with no variability in the `subj` random effects and 3 directions with no variability in the `item` random effects.
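The rank deficiency for `subj` can be checked directly from the printed factor. The following is a minimal sketch in current Julia (1.x) syntax, using only the `LinearAlgebra` standard library rather than the package version used in this notebook. The matrix `Λsubj` is re-entered by hand from the output above, so it carries only the digits displayed there; the variable names are illustrative.

```julia
using LinearAlgebra

# Lower-triangular relative covariance factor for `subj`, copied from the
# printed value of the first element of m0.λ (rounded as displayed).
Λsubj = [
  0.476593    0.0         0.0          0.0         0.0  0.0  0.0  0.0
 -0.0494647   0.102574    0.0          0.0         0.0  0.0  0.0  0.0
 -0.055773   -0.0170985   0.102324     0.0         0.0  0.0  0.0  0.0
  0.0295124  -0.0166024   0.0780166    0.108412    0.0  0.0  0.0  0.0
  0.029148   -0.110922   -0.0944035    0.00546411  0.0  0.0  0.0  0.0
  0.0319486  -0.0243532   0.00507938  -0.0540216   0.0  0.0  0.0  0.0
 -0.0139184   0.0126607  -0.0134017   -0.134316    0.0  0.0  0.0  0.0
 -0.0463876   0.0217912  -0.0650552    0.0515311   0.0  0.0  0.0  0.0
]

# Columns that are identically zero contribute no variability.
zero_columns = count(j -> all(iszero, view(Λsubj, :, j)), 1:size(Λsubj, 2))

# The numerical rank is the number of directions with nonzero variance.
numerical_rank = rank(Λsubj)

println("all-zero columns: ", zero_columns)
println("numerical rank:   ", numerical_rank)
println("squared singular values: ", svdvals(Λsubj) .^ 2)
```

Four of the eight columns are zero and the remaining four are linearly independent, which corresponds to the 4 directions without variability noted above.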

### Evaluation of singular value decomposition (svd) for _maxLMM_

Considering that there are only 56 subjects and 32 items, it is quite optimistic to expect to estimate 36 highly nonlinear covariance parameters for `subj` and another 36 for `item`.
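The count of 36 follows from the size of an unstructured 8x8 covariance matrix: 8 variances plus 28 correlations. A small sketch of the arithmetic in current Julia syntax; `ncovpar` is a hypothetical helper name, not a function of `MixedModels`.

```julia
# Number of free parameters in an unstructured k×k covariance matrix:
# k variances plus k(k-1)/2 correlations, i.e. k(k+1)/2.
ncovpar(k) = k * (k + 1) ÷ 2

k = 8                                  # intercept plus 7 effects per grouping factor
per_factor = ncovpar(k)                # covariance parameters for subj (same for item)
n_fixed = 8                            # fixed-effect coefficients in the maxLMM
total = n_fixed + 2 * per_factor + 1   # the final 1 is the residual variance

println("covariance parameters per grouping factor: ", per_factor)
println("total parameters in the maxLMM: ", total)
```

The total of 81 agrees with the `Df` reported for `m0` in the likelihood-ratio comparison at the end of this section.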

The singular value decompositions of these factors are equivalent to a _principal components analysis_ (PCA) of the covariance matrices.  The variance components are the squares of the singular values and the component loadings are the left singular vectors.
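The equivalence can be written out explicitly. As a sketch in notation that is assumed here rather than taken from the notebook, let $\Lambda$ be the relative covariance factor for one grouping factor and $\sigma^2$ the residual variance, so that the unconditional covariance matrix of the random effects is $\Sigma = \sigma^2 \Lambda\Lambda'$. With the singular value decomposition $\Lambda = U D V'$,

$$
\Lambda\Lambda' = U D V' V D U' = U D^2 U' ,
$$

so the columns of $U$ are the eigenvectors (principal directions) of $\Sigma$, and the squared singular values $d_i^2$ are the corresponding variances in units of $\sigma^2$. A singular value of zero marks a direction in which the random effects do not vary.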

In [10]:
svds = map(svdfact,m0.λ);
varcomp = [x[:S].^2 for x in svds]  # variances of the principal components

2-element Array{Any,1}:
 [0.238286,0.0395663,0.0305344,0.0193237,0.0,0.0,0.0,0.0]       
 [0.415843,0.0768207,0.0170854,0.0109844,0.00278772,0.0,0.0,0.0]

In [12]:
map(x->x ./sum(x), varcomp)   # proportion of variance in each component

2-element Array{Any,1}:
 [0.727124,0.120736,0.0931747,0.0589659,0.0,0.0,0.0,0.0]       
 [0.794319,0.146738,0.0326356,0.0209818,0.00532495,0.0,0.0,0.0]

In [14]:
map(x->cumsum(x ./ sum(x)), varcomp)  # cumulative proportions of variance

2-element Array{Any,1}:
 [0.727124,0.847859,0.941034,1.0,1.0,1.0,1.0,1.0]     
 [0.794319,0.941058,0.973693,0.994675,1.0,1.0,1.0,1.0]

In [15]:
[x[:U] for x in svds]   # component loadings

2-element Array{Any,1}:
 8x8 Array{Float64,2}:
 -0.975171   -0.0236937   0.110507   …  -0.0232548  -0.0720005   0.0697229
  0.110936   -0.062438    0.360765      -0.0300218   0.168765    0.706214 
  0.115533   -0.285156    0.261866      -0.220285   -0.205188    0.387967 
 -0.0608491  -0.657069   -0.116164       0.655537    0.243853    0.0897252
 -0.0730182   0.322259   -0.722651       0.0587394  -0.0391085   0.566407 
 -0.0669295   0.217408    0.0950017  …  -0.154439    0.912439   -0.0141082
  0.0307724   0.574871    0.394548       0.671865   -0.124433    0.0563242
  0.0944791  -0.0298937  -0.297743       0.203958    0.123327   -0.116805         
 8x8 Array{Float64,2}:
 -0.860721      0.466888   -0.0502723  …   0.0662198   0.0281145  -0.178243
  0.0130449    -0.239939    0.0141917      0.659885    0.579556   -0.390245
  0.505933      0.800512   -0.136057       0.100172    0.0823936  -0.229744
 -0.0288903    -0.0273149  -0.462091       0.325901   -0.158737    0.310215
 -0.0339228     0.

### Zero-correlation-parameter linear mixed model (_zcpLMM_)

As a first step of model reduction, we propose to start with a model including all 16 variance components, but no correlation parameters. Note that here we go through the motions to be consistent with the recommended strategy. The large number of components with zero or close to zero variance in _maxLMM_ already strongly suggests the need for a reduction of the number of variance components, as done in the next step. For this _zcpLMM_, we extract the vector-valued variables from the model matrix without the intercept column, which is provided by the R formula. Then, we use the new double-bar syntax for `lmer()` to force correlation parameters to zero.

At present the `lmm` formulas for these models are rather tedious to write:

In [16]:
m1 = fit(lmm(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC +
(1|subj)+(0+S|subj)+(0+P|subj)+(0+C|subj)+
(0+SP|subj)+(0+SC|subj)+(0+PC|subj)+(0+SPC|subj) +
(1|item)+(0+S|item)+(0+P|item)+(0+C|item)+
(0+SP|item)+(0+SC|item)+(0+PC|item)+(0+SPC|item), kb07))

Linear mixed model fit by maximum likelihood
Formula: RTtrunc ~ 1 + S + P + C + SP + SC + PC + SPC + (1 | subj) + ((0 + S) | subj) + ((0 + P) | subj) + ((0 + C) | subj) + ((0 + SP) | subj) + ((0 + SC) | subj) + ((0 + PC) | subj) + ((0 + SPC) | subj) + (1 | item) + ((0 + S) | item) + ((0 + P) | item) + ((0 + C) | item) + ((0 + SP) | item) + ((0 + SC) | item) + ((0 + PC) | item) + ((0 + SPC) | item)

 logLik: -14341.033697, deviance: 28682.067394

 Variance components:
                Variance    Std.Dev.  Corr.
 subj         90911.807064  301.515849
                0.000000    0.000000   0.00
                0.000000    0.000000   0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00  0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00  0.00  0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00  0.00  0.00  0.00  0.00
 item      
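Because every uncorrelated term has to be written out separately, the right-hand side of such a formula can also be assembled programmatically. The following is only a sketch of the string manipulation in current Julia syntax; `slopes` and `zcpterms` are hypothetical names, and whether the resulting string can be handed to `lmm` depends on the package version, which is not shown here.

```julia
# Random-slope terms of the maximal model, in the order used in the formula above.
slopes = ["S", "P", "C", "SP", "SC", "PC", "SPC"]

# Hypothetical helper: one separate (uncorrelated) term per effect for a grouping factor.
zcpterms(group) = join(vcat(["(1|$group)"], ["(0+$s|$group)" for s in slopes]), "+")

# Fixed effects, then the zero-correlation terms for each grouping factor.
rhs = join(["1+" * join(slopes, "+"), zcpterms("subj"), zcpterms("item")], " + ")

println("RTtrunc ~ ", rhs)
```

The printed text can be compared term by term with the formula typed by hand above.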

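The methods in `lmm` do not fit such models well: the fit above stops at a deviance of 28682.07 with every random-slope variance shown for `subj` estimated as zero. The `lmer` function in the [lme4 package](https://github.com/lme4/lme4) for [R](http://www.r-project.org) is more successful; restarting the optimization from the parameter values below lowers the deviance to 28670.91.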

In [18]:
MixedModels.objective!(m1,[0.461098922494114, 0.0477069109747168, 0.0700075426143427, 
    0.0948946258676561, 0.113579418439824, 0, 0.0987708829960627,
    0, 0.553747500257642, 0, 0.380220894182068, 
    0.0805837316205983, 0, 0.0413129898395064, 
    0.0900636564806011, 0.075488529192162])

28670.913319613523

In [19]:
m1.fit = false; fit(m1)

Linear mixed model fit by maximum likelihood
Formula: RTtrunc ~ 1 + S + P + C + SP + SC + PC + SPC + (1 | subj) + ((0 + S) | subj) + ((0 + P) | subj) + ((0 + C) | subj) + ((0 + SP) | subj) + ((0 + SC) | subj) + ((0 + PC) | subj) + ((0 + SPC) | subj) + (1 | item) + ((0 + S) | item) + ((0 + P) | item) + ((0 + C) | item) + ((0 + SP) | item) + ((0 + SC) | item) + ((0 + PC) | item) + ((0 + SPC) | item)

 logLik: -14335.456660, deviance: 28670.913320

 Variance components:
                Variance    Std.Dev.  Corr.
 subj         91738.707471  302.883984
              982.040875   31.337531   0.00
              2114.728844   45.986181   0.00  0.00
              3885.506056   62.333828   0.00  0.00  0.00
              5566.265760   74.607411   0.00  0.00  0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00  0.00  0.00
              4209.413273   64.879991   0.00  0.00  0.00  0.00  0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00  0.00  0.00  0.00  0.00
 item  

The variance components that are estimated as zero can be dropped without affecting the fit.

In [22]:
m2 = lmm(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC +
    (1|subj)+(0+S|subj)+(0+P|subj)+(0+C|subj)+(0+SP|subj)+(0+PC|subj) +
    (1|item)+(0+P|item)+(0+C|item)+(0+SC|item)+(0+PC|item)+(0+SPC|item), kb07);
MixedModels.objective!(m2,MixedModels.θ(m1)[[1:5,7,9,11,12,14:16]])

28670.913319613494
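The index vector used above selects exactly the nonzero parameters. A sketch in current Julia syntax that makes the correspondence explicit, using the values passed to `MixedModels.objective!` for `m1` as a stand-in for `MixedModels.θ(m1)` (an assumption, since the refitted vector is only partly visible in the output). It also assumes that the parameters are ordered as the terms appear in the formula; `labels` and `keep` are illustrative names.

```julia
# Parameter values supplied to m1 above (subj terms first, then item terms).
θ1 = [0.461098922494114, 0.0477069109747168, 0.0700075426143427,
      0.0948946258676561, 0.113579418439824, 0, 0.0987708829960627,
      0, 0.553747500257642, 0, 0.380220894182068,
      0.0805837316205983, 0, 0.0413129898395064,
      0.0900636564806011, 0.075488529192162]

# Assumed ordering: the eight terms for subj followed by the eight for item.
terms = ["1", "S", "P", "C", "SP", "SC", "PC", "SPC"]
labels = vcat(terms .* "|subj", terms .* "|item")

# Positions of the variance components that are not estimated as zero.
keep = findall(!iszero, θ1)

println(keep)
println(labels[keep])
```

The retained labels match the random-effects terms in the formula for `m2`. In current Julia the index vector itself would be written with semicolons, as `[1:5; 7; 9; 11; 12; 14:16]`.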

To look further for trivial variance components, we examine the cumulative proportion of the variance of the random effects for `subj` and for `item` that is accounted for by successive components:

In [35]:
function cumulativevar{T<:Real}(svds::Vector{T})
    var = cumsum([abs2(x) for x in svds])  # cumulative variances
    var ./ var[end]  # cumulative proportion
end
cumulativevar(m::LinearMixedModel) = map(cumulativevar,map(svdvals, m.λ))

cumulativevar(m2)

2-element Array{Any,1}:
 [0.845544,0.896848,0.935645,0.971457,0.990949,1.0]
 [0.647986,0.953487,0.970629,0.984351,0.996393,1.0]

For `subj` 85% of the variability in the unconditional distribution of the random effects is attributed to the random intercept.  For `item` 95% of the variability in the unconditional distribution is attributed to the random intercept and the random effect for `P`.
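These proportions can be reproduced from the parameter values alone, because in a model without correlation parameters the relative covariance factor of each grouping factor is diagonal. The sketch below uses current Julia syntax and the `LinearAlgebra` standard library. It takes the nonzero values passed to `MixedModels.objective!` for `m1` above as a stand-in for the fitted parameters of `m2` (an assumption, since that vector is not printed), and `cumprop` is a hypothetical name.

```julia
using LinearAlgebra

# Nonzero relative standard deviations for each grouping factor.
θsubj = [0.461098922494114, 0.0477069109747168, 0.0700075426143427,
         0.0948946258676561, 0.113579418439824, 0.0987708829960627]
θitem = [0.553747500257642, 0.380220894182068, 0.0805837316205983,
         0.0413129898395064, 0.0900636564806011, 0.075488529192162]

# Cumulative proportion of variance computed from the singular values of a factor.
function cumprop(Λ::AbstractMatrix)
    v = cumsum(abs2.(svdvals(Λ)))   # singular values are returned largest first
    v ./ v[end]
end

for (name, θ) in (("subj", θsubj), ("item", θitem))
    println(name, ": ", round.(cumprop(diagm(0 => θ)), digits = 4))
end
```

The first entry for each factor is the share of the random intercept, since it has the largest variance in both cases.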

At this point we could reduce to these random effects and reintroduce the correlation between the random effects for `item`.

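For some reason the fit of this reduced model (`m3` below), even starting from the converged `lmer` values, does not produce the same deviance as `lmer`. In the [lme4 package](https://github.com/lme4/lme4) `m0` is not a significantly better fit than this model. Here, it is. Note that the fixed-effects formula for `m3` below omits `C`, which may contribute to the discrepancy.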

In [44]:
m3 = lmm(RTtrunc ~ 1+S+P+SP+SC+PC+SPC + (1+P|item) + (1|subj), kb07);
MixedModels.objective!(m3,
[0.537364783211338, -0.258992402219962, 0.268590222400181,0.440834222302868])
fit(m3,true)

f_1: 28689.95953, [0.537365,-0.258992,0.26859,0.440834]
f_2: 29076.1246, [0.095609,-0.153327,0.0,0.0]
f_3: 28741.4515, [0.425541,-0.246119,0.0930285,0.254338]
f_4: 28690.45069, [0.523578,-0.257675,0.244015,0.414138]
f_5: 28689.94941, [0.535863,-0.258852,0.265861,0.437858]
f_6: 28689.96377, [0.526882,-0.255787,0.26638,0.432938]
f_7: 28689.94822, [0.534093,-0.258254,0.265962,0.436895]
f_8: 28689.99481, [0.529312,-0.247708,0.265213,0.445168]
f_9: 28689.94806, [0.533612,-0.257182,0.265887,0.437733]
f_10: 28689.94795, [0.533297,-0.257448,0.265893,0.437076]
f_11: 28689.94793, [0.53333,-0.25736,0.26589,0.437236]
f_12: 28689.94794, [0.533203,-0.257084,0.265886,0.437292]
f_13: 28689.94793, [0.533312,-0.257321,0.26589,0.437244]
f_14: 28689.94796, [0.53302,-0.257416,0.265886,0.43729]
f_15: 28689.94793, [0.533283,-0.257331,0.265889,0.437248]
f_16: 28689.94793, [0.5333,-0.257325,0.26589,0.437246]
f_17: 28689.94793, [0.53327,-0.257306,0.265887,0.43726]
f_18: 28689.94793, [0.533294,-0.257321,0.265889

Linear mixed model fit by maximum likelihood
Formula: RTtrunc ~ 1 + S + P + SP + SC + PC + SPC + ((1 + P) | item) + (1 | subj)

 logLik: -14344.973964, deviance: 28689.947928

 Variance components:
                Variance    Std.Dev.  Corr.
 item         132302.682783  363.734357
              63690.445101  252.369660  -0.70
 subj         88939.184156  298.226733
 Residual     465196.049819  682.052820
 Number of obs: 1790; levels of grouping factors: 32, 56

  Fixed-effects parameters:
             Estimate Std.Error  z value
(Intercept)   2180.66    77.347  28.1932
S             -67.107   16.1215 -4.16256
P            -333.764   47.4366 -7.03601
SP            22.1176   16.1215  1.37193
SC           -18.8073   16.1215  -1.1666
PC            5.14488   16.1215 0.319131
SPC          -23.9168   16.1215 -1.48353


In [42]:
MixedModels.lrt(m3,m0)

| Model | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 12 | 28689.9479284958 | | |
| 2 | 81 | 28586.317617892324 | 103.63031060347566 | 0.0044324081895381 |
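The entries of this comparison can be checked by hand. Assuming that `Df` counts the fixed-effect coefficients, the covariance parameters of each grouping factor and the residual variance (an interpretation that reproduces both values, not something stated in the output), the arithmetic is:

$$
\begin{aligned}
\mathrm{Df}(m_3) &= 7 + 3 + 1 + 1 = 12, \qquad \mathrm{Df}(m_0) = 8 + 36 + 36 + 1 = 81,\\
\chi^2 &= 28689.9479 - 28586.3176 = 103.6303, \qquad \mathrm{df} = 81 - 12 = 69 .
\end{aligned}
$$

The reported p-value of 0.0044 refers this statistic to a $\chi^2$ distribution with 69 degrees of freedom.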


In [43]:
m3.λ

2-element Array{Any,1}:
 PDLCholF(Cholesky{Float64} with factor:
2x2 Triangular{Float64,Array{Float64,2},:L,false}:
  0.533294  0.0     
 -0.257321  0.265889)
 PDScalF(0.43724873481629667,1)                                                                                                       
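The entries of `m3.λ` are expressed relative to the residual standard deviation, so multiplying by it recovers the standard deviations and the correlation in the variance-component table. A minimal sketch in current Julia syntax with the `LinearAlgebra` standard library; the numbers are copied from the output above and the variable names are illustrative.

```julia
using LinearAlgebra

σ = 682.052820                       # residual standard deviation from the fit of m3
Λitem = [ 0.533294  0.0
         -0.257321  0.265889]        # relative Cholesky factor for `item`
λsubj = 0.43724873481629667          # scalar relative factor for `subj`

# Covariance matrix of the item random effects: σ² Λ Λ'
Σitem = σ^2 .* (Λitem * Λitem')

sd_item = sqrt.(diag(Σitem))                          # SDs for intercept and P
cor_item = Σitem[2, 1] / (sd_item[1] * sd_item[2])    # intercept-P correlation
sd_subj = σ * λsubj                                   # SD of the subj intercept

println("item SDs:         ", round.(sd_item, digits = 2))
println("item correlation: ", round(cor_item, digits = 2))
println("subj SD:          ", round(sd_subj, digits = 2))
```

Up to the rounding of the printed factor, these values agree with the standard deviations of 363.73, 252.37 and 298.23 and the correlation of -0.70 reported for `m3`.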