Advanced mixed-models workshop: Session 6
Low-powered study with complex design and categorical DV
Determining Maximal Random Effects
A random intercept is needed whenever there are multiple observations per unit
Any within-unit factor gets a random slope, unless there is only one observation per level per unit
Between-unit factors do not get a random slope
For each interaction, include a slope for the highest-order combination of within-unit factors subsumed by the interaction
For time-series data, include random slopes for time predictors if you have more than one time series per unit
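As a sketch, the rules above can be applied to a hypothetical 2x2x2 design (the factor names here are made up, not from this study) where A is between-subjects and B and C are within-subjects:

```r
# Hypothetical design: A between-subjects, B and C within-subjects.
# Within-unit factors (and their combination) get slopes; A does not.
f_subj <- y ~ A * B * C + (1 + B * C | subject)   # no slope for A (between)
f_item <- y ~ A * B * C + (1 + A * B * C | item)  # all factors within items

# Formulas can be built and inspected without fitting anything:
inherits(f_subj, "formula")  # TRUE
```

Working out these formulas from the design table before fitting anything is the "design-driven" approach advocated at the end of this session.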
Audience design in language production
How do we explain referential misspecification?
Retrieval of past descriptions from memory is an obligatory consequence of attending to a referent with a referential goal in mind
Instance Theory of Automaticity (Logan, 1988)
Retrieved descriptions are checked against context for adequacy, and re-shaped if necessary
Addressee: New or Old (Between)
Novelty (of referent): New or Old
Feedback: Yes or No
16 dyads (1 speaker, 1 addressee)
16 triads (1 speaker, 2 addressees)
Before doing analysis: look at your data
library(lme4)
crefs <- readRDS("crefs.rds")
head(crefs, 3)
crefs_mit <- subset(crefs, ModInTrain)
with(crefs_mit, aggregate(Modifier ~ Novelty + Addressee + Feedback, FUN = mean))
SessionID ItemID RespID Novelty Addressee Feedback ModInTrain Modifier SOT
1 56 2 6314 New New No TRUE 0 1462
2 56 3 6319 New New No TRUE 0 1126
3 56 7 6315 Old New No TRUE 1 1695
Novelty Addressee Feedback Modifier
1 New New No 0.10526316
2 Old New No 0.61224490
3 New Old No 0.08771930
4 Old Old No 0.75510204
5 New New Yes 0.06779661
6 Old New Yes 0.62500000
7 New Old Yes 0.06896552
8 Old Old Yes 0.76000000
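The cell means above are proportions, but the model fit below works in log-odds. As a quick sketch (using two cell means from the table above), `qlogis()` is R's logit function and maps the proportions onto the model's scale:

```r
# Cell proportions from the table above, on the model's (logit) scale:
qlogis(0.105)                  # a "New" cell: negative log-odds (modifiers rare)
qlogis(0.612)                  # an "Old" cell: positive log-odds
qlogis(0.612) - qlogis(0.105)  # simple Novelty effect, as a log odds ratio
```

Eyeballing the data on the link scale like this gives a rough sense of what the fixed-effect estimates should look like.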
Specifying the random effects
Subjects Items
Addressee Between Within
Novelty Within Within
Feedback Within Between
xtabs(~ Addressee + SessionID, crefs)
xtabs(~ Addressee + ItemID, crefs)
Creating predictor variables
crefs_mit2 <- transform(crefs_mit,
                        N = ifelse(Novelty == "New", 1, 0),
                        A = ifelse(Addressee == "New", 1, 0),
                        F = ifelse(Feedback == "Yes", 1, 0))
# note: a column named F shadows R's FALSE shorthand; harmless here, but risky
crefs_mit2c <- transform(crefs_mit2,
                         Nc = N - mean(N),
                         Ac = A - mean(A),
                         Fc = F - mean(F))
head(crefs_mit2c)
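The centering step can be sanity-checked on a toy example: subtracting the mean from a 0/1 indicator yields a predictor with (numerically) zero mean, which is what makes the lower-order coefficients interpretable as average effects.

```r
# Toy 0/1 indicator standing in for N, A, or F (values are made up)
x  <- c(1, 1, 0, 0, 1)
xc <- x - mean(x)
mean(xc)   # zero, up to floating-point error
```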
Our model:
\(\log\left(\frac{p_{ij}}{1-p_{ij}}\right) = \beta_0 + \beta_1 A + \beta_2 N + \beta_3 F + \beta_4 AN + \beta_5 AF + \beta_6 NF + \beta_7 ANF\)
m1 <- glmer(Modifier ~ Ac*Nc*Fc + (1 + Nc*Fc | SessionID) + (1 + Ac*Nc | ItemID),
data=crefs_mit2c, family=binomial(link="logit"),
control=glmerControl(optimizer="bobyqa"))
# takes a long time but doesn't converge
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge with max|grad| = 0.01537 (tol = 0.001, component 20)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
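One standard response to the "Rescale variables?" hint (not the route taken below, which instead simplifies the random-effects structure) is to put predictors on a common scale, e.g. by dividing each centered predictor by its standard deviation. A toy sketch:

```r
# Standardize an already-centered predictor to unit SD (toy values, not the data)
z <- function(x) x / sd(x)
x <- c(-0.6, -0.6, 0.4, 0.4, 0.4)  # toy centered predictor
sd(z(x))                            # 1
```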
Diagonal model (covariance parameters fixed to zero)
# no-random-correlations model
m2 <- glmer(Modifier ~ Ac * Nc * Fc +
              (1 + Nc * Fc || SessionID) + (1 + Ac * Nc || ItemID),
            crefs_mit2c, family = binomial(link = "logit"),
            control = glmerControl(optimizer = "bobyqa"))  # converges
Random effects:
Groups Name Variance Std.Dev.
SessionID (Intercept) 6.566e-01 8.103e-01
SessionID.1 Nc 1.132e+00 1.064e+00
SessionID.2 Fc 0.000e+00 0.000e+00
SessionID.3 Nc:Fc 4.297e-15 6.555e-08
ItemID (Intercept) 1.026e+00 1.013e+00
ItemID.1 Ac 1.580e-01 3.975e-01
ItemID.2 Nc 1.216e+00 1.103e+00
ItemID.3 Ac:Nc 0.000e+00 0.000e+00
Number of obs: 427, groups: SessionID, 32; ItemID, 16
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.13298 0.35710 -3.173 0.00151 **
Ac -0.29116 0.46433 -0.627 0.53063
Nc -4.32096 0.61720 -7.001 2.54e-12 ***
Fc -0.38034 0.62448 -0.609 0.54249
Ac:Nc 1.05604 0.78602 1.344 0.17910
Ac:Fc -0.07195 0.71227 -0.101 0.91954
Nc:Fc -0.32425 0.89398 -0.363 0.71683
Ac:Nc:Fc -0.53686 1.33216 -0.403 0.68695
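The estimates are on the log-odds scale; exponentiating turns them into odds ratios, which are often easier to report. For the large Novelty effect in the output above:

```r
# Novelty coefficient from the fixed-effects table above
b_Nc <- -4.32096
exp(b_Nc)   # ~0.013: a "New" referent cuts the odds of a modifier ~75-fold
```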
Clean up the random effects
Drop the random-effect parameters estimated at zero: we will be testing effects via model comparison, which refits the model repeatedly, and redundant zero-variance terms slow down the estimation procedure
m3 <- glmer(Modifier ~ Ac * Nc * Fc +
              (1 | SessionID) +
              (0 + Nc | SessionID) +
              (1 | ItemID) +
              (0 + Ac | ItemID) +
              (0 + Nc | ItemID),
            crefs_mit2c, family = binomial(link = "logit"),
            control = glmerControl(optimizer = "bobyqa"))  # converges
Perform tests using model comparison
m3_noA   <- update(m3, . ~ . - Ac)
m3_noN   <- update(m3, . ~ . - Nc)
m3_noF   <- update(m3, . ~ . - Fc)
m3_noAN  <- update(m3, . ~ . - Ac:Nc)
m3_noAF  <- update(m3, . ~ . - Ac:Fc)
m3_noNF  <- update(m3, . ~ . - Nc:Fc)
m3_noANF <- update(m3, . ~ . - Ac:Nc:Fc)
anova(m3, m3_noA)
anova(m3, m3_noN)
anova(m3, m3_noF)
anova(m3, m3_noAN)
anova(m3, m3_noAF)
anova(m3, m3_noNF)
anova(m3, m3_noANF)
Chisq Df p
A .382 1 .537
N 32.693 1 <.001
F .372 1 .542
AN 1.843 1 .175
AF .010 1 .920
NF .131 1 .717
ANF .162 1 .687
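Each `anova()` call above is a likelihood-ratio test: the Chisq statistic is the change in deviance between the nested models, and the p value comes from a chi-square distribution with df equal to the number of dropped parameters. The p value for the Novelty row of the table can be reproduced by hand:

```r
# Likelihood-ratio p value for the Novelty effect (Chisq = 32.693, 1 df)
pchisq(32.693, df = 1, lower.tail = FALSE)  # well below .001
```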
Little evidence that speakers took their partner's perspective into account
Use of a modifier was driven by the speaker's own experience with the referent
Supports idea that modifier use is (at least partly) based on memory retrieval
When performing (or reviewing) analyses, it is of utmost importance to ensure that random effects are appropriately specified
Random-intercept-only models are rarely appropriate
Use design-driven rather than data-driven random effects
Development of lme4 is rapid, and there are many "tricks of the trade"; tune into blogs, mailing lists, and social media to keep up
Don’t give up! It will make sense… at some point…