# RePsychLing Barr and Seyfeddinipur (2010)

## Data from Barr and Seyfeddinipur (2010)

Some of the data from Barr and Seyfeddinipur (2010) are available as the data frame `bs10` in the `RePsychLing` package for [R](http://www.r-project.org).

In [1]:
# using DataFrames,RCall,MixedModels
# bs10 = DataFrame("RePsychLing::bs10")
using DataFrames,MixedModels
bs10 = MixedModels.rdata("bs10")

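| Row | SubjID | ItemID | Spkr | Filler | ms | d | d2 | Spkr2 | dif | SF | F | S |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 12 | SS | NS | -3049.0 | 0.0 | 0.0 | Same | 0.0 | 1.0 | -1.0 | -1.0 |
| 2 | 1 | 3 | DS | FP | -3048.0 | -0.01 | -0.01 | Different | 0.0 | 1.0 | 1.0 | 1.0 |
| 3 | 1 | 6 | DS | NS | -3048.0 | 0.0 | 0.0 | Different | 0.0 | -1.0 | -1.0 | 1.0 |
| 4 | 1 | 9 | SS | NS | -3048.0 | 0.0 | 0.0 | Same | 0.0 | 1.0 | -1.0 | -1.0 |
| 5 | 1 | 11 | SS | FP | -3048.0 | 0.0 | 0.0 | Same | 0.0 | -1.0 | 1.0 | -1.0 |
| 6 | 1 | 10 | SS | NS | -3048.0 | 0.0 | 0.55 | Same | 0.55 | 1.0 | -1.0 | -1.0 |
| 7 | 1 | 5 | SS | FP | -3048.0 | 0.0 | 1.0 | Same | 1.0 | -1.0 | 1.0 | -1.0 |
| 8 | 1 | 7 | DS | NS | -3049.0 | 0.0 | 0.0 | Different | 0.0 | -1.0 | -1.0 | 1.0 |
| 9 | 1 | 2 | DS | FP | -3048.0 | 0.67 | -0.13 | Different | -0.8 | 1.0 | 1.0 | 1.0 |
| 10 | 1 | 1 | DS | FP | -3048.0 | 0.0 | 0.0 | Different | 0.0 | 1.0 | 1.0 | 1.0 |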


In this data frame the subject and item factors are the columns `SubjID` and `ItemID`; the text below refers to them as `subj` and `item`.  The response variable being modelled, `dif`, is the difference between two response times.

The two experimental factors `S` and `F`, both at two levels, are represented in the -1/+1 encoding, as is their interaction, `SF`.  The `S` factor is the speaker condition with levels -1 for the same speaker in both trials and +1 for different speakers.  The `F` factor is the filler condition with levels -1 for `NS` and +1 for `FP`.
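
As a quick check of this coding, the sketch below recodes the `Spkr` and `Filler` values from the first five rows of the table above. It assumes, from the `Spkr` and `Spkr2` columns, that `SS` is the same-speaker level and `DS` the different-speaker level; the variable names are illustrative only. In these rows the `SF` column equals the elementwise product of `S` and `F`.

```julia
# Spkr and Filler values copied from the first five rows of bs10 shown above
spkr   = ["SS", "DS", "DS", "SS", "SS"]
filler = ["NS", "FP", "NS", "NS", "FP"]

# -1/+1 coding as described in the text
# (assumption: "DS" is the different-speaker level, "FP" the +1 filler level)
S = [s == "DS" ? 1.0 : -1.0 for s in spkr]
F = [f == "FP" ? 1.0 : -1.0 for f in filler]

# In these rows the interaction column is the elementwise product of S and F
SF = S .* F
```

Comparing `S`, `F` and `SF` with the corresponding columns of the table is a simple way to confirm which level is coded as +1.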

### Maximal linear mixed model (_maxLMM_)

The maximal model has a full factorial design, `1+S+F+SF`, for the fixed effects and for the potentially correlated vector-valued random effects for subject and for item.

In [2]:
m0 = fit(lmm(dif ~ 1+S+F+SF + (1+S+F+SF|SubjID) + (1+S+F+SF|ItemID), bs10))

Linear mixed model fit by maximum likelihood
Formula: dif ~ 1 + S + F + SF + ((1 + S + F + SF) | SubjID) + ((1 + S + F + SF) | ItemID)

 logLik: -515.477648, deviance: 1030.955295

 Variance components:
                Variance    Std.Dev.  Corr.
 SubjID         0.013393    0.115730
                0.011216    0.105907  -0.56
                0.003373    0.058075   0.99  0.99
                0.005192    0.072053  -0.13 -0.13 -0.13
 ItemID         0.000305    0.017459
                0.000079    0.008904  -1.00
                0.000136    0.011647   1.00  1.00
                0.000083    0.009125  -1.00 -1.00 -1.00
 Residual       0.127779    0.357462
 Number of obs: 1104; levels of grouping factors: 92, 12

  Fixed-effects parameters:
               Estimate Std.Error  z value
(Intercept)    0.039221 0.0169329  2.31626
S            -0.0174094 0.0156289 -1.11392
F             0.0174819 0.0127947  1.36633
SF           -0.0322645 0.0133832 -2.41082


In [4]:
gc(); @time fit(lmm(dif ~ 1+S+F+SF + (1+S+F+SF|SubjID) + (1+S+F+SF|ItemID), bs10));

elapsed time: 0.793719975 seconds (243262944 bytes allocated, 18.93% gc time)


It looks as if additional efforts to conserve memory will be helpful.

Memory considerations aside, the fit of _maxLMM_ is suspect because the estimated correlations of the vector-valued random effects for `item` are all ±1.00 and because several correlation values are repeated in the random effects for `subj`.

Many of the $\theta$ parameters are close to zero in magnitude.  The two $\lambda$ matrices, which are lower Cholesky factors formed from sections of the $\theta$ parameters (in column-major ordering), are singular.

In [5]:
show(MixedModels.θ(m0))

[0.3237548077337227,-0.1663421908266492,0.16112085758748357,-0.026568073084886252,0.24517014284710592,-0.02085235373127605,0.19980923164721606,1.282938788009876e-6,-1.8667508743497564e-6,0.0,0.04884104176916634,-0.02490891281779611,0.03258144488208834,-0.0255267086246206,9.798001877943274e-7,-7.885771115835552e-8,-3.6392924495676325e-8,0.0,5.093674097430509e-10,0.0]

In [6]:
m0.λ

2-element Array{Any,1}:
 PDLCholF(Cholesky{Float64} with factor:
4x4 Triangular{Float64,Array{Float64,2},:L,false}:
  0.323755    0.0         0.0         0.0
 -0.166342    0.24517     0.0         0.0
  0.161121   -0.0208524   1.28294e-6  0.0
 -0.0265681   0.199809   -1.86675e-6  0.0)    
 PDLCholF(Cholesky{Float64} with factor:
4x4 Triangular{Float64,Array{Float64,2},:L,false}:
  0.048841    0.0         0.0          0.0
 -0.0249089   9.798e-7    0.0          0.0
  0.0325814  -7.88577e-8  0.0          0.0
 -0.0255267  -3.63929e-8  5.09367e-10  0.0)
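
To make the column-major layout concrete, here is a minimal sketch that rebuilds the subject-level factor from the first ten elements of $\theta$ (rounded, with the negligible elements set to zero) and compares the result with the `m0` summary. The helper `lowertri` is a hypothetical name, not part of `MixedModels`. The sketch assumes that the covariance of the random effects is $\sigma^2\lambda\lambda'$, with $\sigma$ the residual standard deviation; the numbers it produces agree with the `Std.Dev.` and `Corr.` columns shown earlier.

```julia
using LinearAlgebra

# Fill a k×k lower triangle from a parameter vector in column-major order.
# `lowertri` is an illustrative helper, not a MixedModels function.
function lowertri(θ::Vector{Float64}, k::Int)
    Λ = zeros(k, k)
    pos = 1
    for j in 1:k, i in j:k
        Λ[i, j] = θ[pos]
        pos += 1
    end
    return Λ
end

# Subject-level elements of θ(m0), rounded, negligible values set to zero
θsubj = [0.323755, -0.166342, 0.161121, -0.0265681,
         0.24517, -0.0208524, 0.199809, 0.0, 0.0, 0.0]
σ = 0.357462                      # residual standard deviation of m0

Λ = lowertri(θsubj, 4)
Σ = σ^2 * Λ * Λ'                  # assumed covariance of the subject random effects
sd = sqrt.(diag(Σ))               # compare with the Std.Dev. column for SubjID
ρ21 = Σ[2, 1] / (sd[1] * sd[2])   # compare with the first entry of the Corr. column
r = rank(Λ)                       # the last two columns of Λ are zero
```

A rank below 4 means the estimated covariance matrix is singular: the four-dimensional random effects for `subj` vary in a lower-dimensional subspace.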

We consider elements of $\theta$ of magnitude less than, say, `5.e-6`, to be negligible.  Notice that setting these very small values to zero actually results in a small improvement in the deviance (smaller is better).

In [7]:
th = Float64[abs(x) < 5.e-6 ? 0. : x for x in MixedModels.θ(m0)]

20-element Array{Float64,1}:
  0.323755 
 -0.166342 
  0.161121 
 -0.0265681
  0.24517  
 -0.0208524
  0.199809 
  0.0      
  0.0      
  0.0      
  0.048841 
 -0.0249089
  0.0325814
 -0.0255267
  0.0      
  0.0      
  0.0      
  0.0      
  0.0      
  0.0      

In [8]:
deviance(m0)

1030.9552950545328

In [9]:
MixedModels.objective!(m0,th)

1030.9552950535974

We save the value of `th` in the `R` package, so that the `lmer` fit can use it instead of recreating the value. (The optimization is much slower with `lmer`.)

In [11]:
#globalEnv[:bsm0th] = th;
#reval("save(bsm0th,file='/tmp/bsm0th.rda',compress='xz')")

Alternatively, a `.csv` file will do.

In [12]:
open("bsm0th.csv","w") do io writecsv(io,th) end
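
`writecsv` is no longer part of Base in Julia 1.0 and later. On those versions the `DelimitedFiles` standard library offers `writedlm` and `readdlm`, which can be used for the same purpose. The sketch below is a minimal round trip; the shortened vector and the temporary path are illustrative assumptions, not the values used in the analysis.

```julia
using DelimitedFiles

# Shortened stand-in for the thresholded parameter vector `th`
th_demo = [0.323755, -0.166342, 0.0, 0.048841]

# Write one value per line to a temporary file (path chosen for illustration)
path = joinpath(mktempdir(), "bsm0th.csv")
writedlm(path, th_demo, ',')

# Read the file back as a vector of Float64
th_back = vec(readdlm(path, ',', Float64))
```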

## Zero-correlation-parameter mixed model (_zcpLMM_)

A zero-correlation-parameter model fits independent random effects for the intercept, the experimental factors and their interaction for each of the `subj` and `item` grouping factors.

In [10]:
m1 = fit(lmm(dif ~ 1+S+F+SF + (1|SubjID)+(0+S|SubjID)+(0+F|SubjID)+(0+SF|SubjID)
+ (1|ItemID)+(0+S|ItemID)+(0+F|ItemID)+(0+SF|ItemID), bs10))

Linear mixed model fit by maximum likelihood
Formula: dif ~ 1 + S + F + SF + (1 | SubjID) + ((0 + S) | SubjID) + ((0 + F) | SubjID) + ((0 + SF) | SubjID) + (1 | ItemID) + ((0 + S) | ItemID) + ((0 + F) | ItemID) + ((0 + SF) | ItemID)

 logLik: -540.040342, deviance: 1080.080684

 Variance components:
                Variance    Std.Dev.  Corr.
 SubjID         0.011801    0.108631
                0.010078    0.100391   0.00
                0.000000    0.000000   0.00  0.00
                0.003577    0.059809   0.00  0.00  0.00
 ItemID         0.000000    0.000000
                0.000000    0.000000   0.00
                0.000000    0.000000   0.00  0.00
                0.000000    0.000000   0.00  0.00  0.00
 Residual       0.136033    0.368827
 Number of obs: 1104; levels of grouping factors: 92, 12

  Fixed-effects parameters:
               Estimate Std.Error  z value
(Intercept)    0.039221 0.0158584  2.47321
S            -0.0174094 0.0152567  -1.1411
F             0.0174819 0.011

In [13]:
show(MixedModels.θ(m1))

[0.2945318916149058,0.27219104174740977,0.0,0.16216066533636336,0.0,0.0,0.0,0.0]

A likelihood-ratio test of the two model fits shows that the more complex model, `m0`, fits significantly better than the simpler model, `m1`.

In [14]:
MixedModels.lrt(m1,m0)

| Row | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 13 | 1080.0806840648645 | | |
| 2 | 25 | 1030.9552950535974 | 49.12538901126686 | 1.9886741893319797e-06 |
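
The entries of this table can be reproduced by hand. The sketch below counts parameters as fixed effects plus elements of $\theta$ plus one residual scale parameter, which matches the `Df` column, and uses the closed form of the chi-square upper tail that holds when the degrees of freedom are even. The function name `chisq_uppertail_even` is hypothetical and is not part of `MixedModels`.

```julia
# Upper tail probability of a chi-square variate with an even number of
# degrees of freedom: exp(-x/2) * sum over j < df/2 of (x/2)^j / j!
# (illustrative helper, only valid for even df)
function chisq_uppertail_even(x::Float64, df::Int)
    iseven(df) || throw(ArgumentError("closed form requires even df"))
    y = x / 2
    term = 1.0
    total = 1.0
    for j in 1:(div(df, 2) - 1)
        term *= y / j          # term is now y^j / j!
        total += term
    end
    return exp(-y) * total
end

# Parameter counts: fixed effects + elements of θ + residual scale
npar_m0 = 4 + 20 + 1
npar_m1 = 4 + 8 + 1

# Deviances copied from the table above
dev_m0 = 1030.9552950535974
dev_m1 = 1080.0806840648645

chisq = dev_m1 - dev_m0        # likelihood-ratio statistic
df = npar_m0 - npar_m1         # difference in the number of parameters
p = chisq_uppertail_even(chisq, df)
```

The closed form does not apply to the comparisons with an odd difference in degrees of freedom; those need a general chi-square distribution function.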


Interestingly, the estimated variance components for `item` in model `m1` are all zero, even the one for the random intercepts.

There is no purpose in doing a principal components analysis because the matrix of loadings will be the identity for a zero-correlation-parameter model.

In [15]:
m1.λ

2-element Array{Any,1}:
 PDDiagF(4x4 Diagonal{Float64}:
 0.294532  0.0       0.0  0.0     
 0.0       0.272191  0.0  0.0     
 0.0       0.0       0.0  0.0     
 0.0       0.0       0.0  0.162161)
 PDDiagF(4x4 Diagonal{Float64}:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0)                                                            
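
The same point can be seen numerically. The sketch below, which assumes the covariance of the random effects is $\sigma^2\lambda\lambda'$, uses the `SubjID` part of $\theta$ from `m1`: because $\lambda$ is diagonal the covariance matrix is diagonal too, and each standard deviation is simply $\sigma$ times the corresponding element of $\theta$. The variable names are illustrative.

```julia
using LinearAlgebra

# SubjID elements of θ(m1), copied from the output above
θ1 = [0.2945318916149058, 0.27219104174740977, 0.0, 0.16216066533636336]
σ1 = 0.368827                 # residual standard deviation of m1

Λ1 = Diagonal(θ1)
Σ1 = σ1^2 * Λ1 * Λ1'          # assumed covariance of the subject random effects
sd1 = sqrt.(diag(Σ1))         # compare with the Std.Dev. column of m1

# A diagonal covariance matrix has the coordinate axes as principal axes
covariance_is_diagonal = isdiag(Σ1)
```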

## Iterative reduction of model complexity

First we remove those variance components estimated to have a value of zero.

In [16]:
m2 = fit(lmm(dif ~ 1+S+F+SF + (1|SubjID)+(0+S|SubjID)+(0+SF|SubjID),bs10))

Linear mixed model fit by maximum likelihood
Formula: dif ~ 1 + S + F + SF + (1 | SubjID) + ((0 + S) | SubjID) + ((0 + SF) | SubjID)

 logLik: -540.040342, deviance: 1080.080684

 Variance components:
                Variance    Std.Dev.  Corr.
 SubjID         0.011801    0.108631
                0.010078    0.100391   0.00
                0.003577    0.059809   0.00  0.00
 Residual       0.136033    0.368827
 Number of obs: 1104; levels of grouping factors: 92

  Fixed-effects parameters:
               Estimate Std.Error  z value
(Intercept)    0.039221 0.0158583  2.47321
S            -0.0174094 0.0152566  -1.1411
F             0.0174819 0.0111004  1.57489
SF           -0.0322645 0.0127319 -2.53415


Naturally, the fit for this model is equivalent to that for `m1` because it is only the variance components with zero estimates that are eliminated.

In [17]:
MixedModels.lrt(m2,m1)

| Row | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 8 | 1080.0806840657065 | | |
| 2 | 13 | 1080.0806840648645 | 8.421920938417314e-10 | 1.0 |


Next we check whether the variance component for the interaction, `SF`, could reasonably be zero.

In [18]:
m3 = fit(lmm(dif ~ 1+S+F+SF + (1|SubjID)+(0+S|SubjID),bs10))

Linear mixed model fit by maximum likelihood
Formula: dif ~ 1 + S + F + SF + (1 | SubjID) + ((0 + S) | SubjID)

 logLik: -541.715626, deviance: 1083.431253

 Variance components:
                Variance    Std.Dev.  Corr.
 SubjID         0.011443    0.106972
                0.009721    0.098594   0.00
 Residual       0.140326    0.374601
 Number of obs: 1104; levels of grouping factors: 92

  Fixed-effects parameters:
               Estimate Std.Error  z value
(Intercept)    0.039221 0.0158584  2.47321
S            -0.0174094 0.0152567  -1.1411
F             0.0174819 0.0112742  1.55062
SF           -0.0322645 0.0112742 -2.86181


In [19]:
MixedModels.lrt(m3,m2)

| Row | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 7 | 1083.431252751904 | | |
| 2 | 8 | 1080.0806840657065 | 3.350568686197448 | 0.06718180011321 |


The difference is not quite significant at the 5% level (p ≈ 0.067), so dropping the variance component for `SF` could be considered. The fit is still worse than for the _maxLMM_ `m0`. We now reintroduce a correlation parameter in the vector-valued random effects for `subj`.

### Extending the reduced LMM with correlation parameters

In [20]:
m4 = fit(lmm(dif ~ 1+S+F+SF + (1+S|SubjID), bs10))

Linear mixed model fit by maximum likelihood
Formula: dif ~ 1 + S + F + SF + ((1 + S) | SubjID)

 logLik: -537.693502, deviance: 1075.387003

 Variance components:
                Variance    Std.Dev.  Corr.
 SubjID         0.009877    0.099382
                0.007818    0.088421  -1.00
 Residual       0.143794    0.379202
 Number of obs: 1104; levels of grouping factors: 92

  Fixed-effects parameters:
               Estimate Std.Error  z value
(Intercept)    0.039221 0.0154145  2.54443
S            -0.0174094 0.0146707 -1.18668
F             0.0174819 0.0114126   1.5318
SF           -0.0322645 0.0114126 -2.82708


It turns out that this fit converged to a local optimum.  The fit from `lmer` in the [lme4 package](https://github.com/lme4/lme4) for [R](http://www.r-project.org) provides the parameter vector used in the next cell, which gives a lower deviance.

In [22]:
MixedModels.objective!(m4,[0.285562798374853, -0.1834729162814, 0.188706334932744]);
m4

Linear mixed model fit by maximum likelihood
Formula: dif ~ 1 + S + F + SF + ((1 + S) | SubjID)

 logLik: -536.401757, deviance: 1072.803514

 Variance components:
                Variance    Std.Dev.  Corr.
 SubjID         0.011443    0.106972
                0.009721    0.098594  -0.70
 Residual       0.140326    0.374601
 Number of obs: 1104; levels of grouping factors: 92

  Fixed-effects parameters:
               Estimate Std.Error  z value
(Intercept)    0.039221 0.0158583  2.47321
S            -0.0174094 0.0152567  -1.1411
F             0.0174819 0.0112742  1.55062
SF           -0.0322645 0.0112742 -2.86181
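
The correlation in this summary follows directly from the `lmer` parameter vector. For a 2×2 lower Cholesky factor with elements $\theta_1, \theta_2, \theta_3$ in column-major order, the implied correlation is $\theta_2/\sqrt{\theta_2^2+\theta_3^2}$, so it is exactly ±1 whenever $\theta_3 = 0$. The sketch below illustrates this under the assumption that the covariance is $\sigma^2\lambda\lambda'$; `corr2` is a hypothetical helper, and the boundary vector is constructed for illustration rather than taken from the first fit.

```julia
# Correlation implied by a 2×2 lower Cholesky factor [θ[1] 0; θ[2] θ[3]]
# (illustrative helper, not a MixedModels function)
corr2(θ) = θ[2] / sqrt(θ[2]^2 + θ[3]^2)

θlmer = [0.285562798374853, -0.1834729162814, 0.188706334932744]
σ4 = 0.374601                     # residual standard deviation reported above

ρ = corr2(θlmer)                  # compare with the Corr. column above
sds = σ4 .* [θlmer[1], sqrt(θlmer[2]^2 + θlmer[3]^2)]   # compare with Std.Dev.

# Hypothetical boundary case: with θ[3] = 0 the factor is singular and the
# correlation is -1, the pattern reported by the first fit of m4
ρ_boundary = corr2([0.285562798374853, -0.1834729162814, 0.0])
```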


In [23]:
MixedModels.lrt(m4,m0)

| Row | Df | Deviance | Chisq | pval |
|---|---|---|---|---|
| 1 | 8 | 1072.8035144441687 | | |
| 2 | 25 | 1030.9552950535974 | 41.84821939057133 | 0.0007053628460637 |
