session_06.org
Advanced mixed-models workshop: Session 6

Setup

Low-powered study with complex design and categorical DV

Determining Maximal Random Effects

  • Random intercept is needed whenever there are multiple observations per unit
  • Any within-unit factor gets a random slope, unless there is only one observation per level per unit
  • Between-unit factors do not get a random slope
  • For each interaction, include a slope for the highest-order combination of within-unit factors subsumed by the interaction
  • For time-series data, include random slopes for time predictors if you have more than one time series per unit
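The rules above can be sketched as lme4 model formulas. This is an illustrative sketch with hypothetical factors (`W1`, `W2` within-unit; `B` between-unit; grouping factor `unit`), not code from the workshop data:

```R
## Sketch: mapping the rules above onto lme4 formula syntax.
## W1, W2 are within-unit factors; B is between-unit; "unit" is the grouping factor.

## Multiple observations per unit, no within-unit factors:
## random intercept only
## y ~ B + (1 | unit)

## A within-unit factor gets a random slope
## (unless there is only one observation per level per unit):
## y ~ W1 * B + (1 + W1 | unit)

## Interactions: slope for the highest-order combination of
## within-unit factors only -- B contributes no slope:
## y ~ W1 * W2 * B + (1 + W1 * W2 | unit)
```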

Audience design in language production

  • How do we explain referential misspecification?
  • Retrieval of past descriptions from memory is an obligatory consequence of attending to a referent with a referential goal in mind
    • Instance Theory of Automaticity (Logan, 1988)
  • Retrieved descriptions checked against context for adequacy, and re-shaped if necessary

Design

img/GannBarr.png


  • Addressee: New or Old (Between)
  • Novelty (of referent): New or Old
  • Feedback: Yes or No


  • 16 dyads (1 spkr, 1 addr)
  • 16 triads (1 spkr, 2 addr)

Experimental Items

img/GannBarrItems.png

Before doing analysis: look at your data

library(lme4)

crefs <- readRDS("crefs.rds")
head(crefs, 3)

## keep only observations for which ModInTrain is TRUE
crefs_mit <- subset(crefs, ModInTrain)

## mean modifier use in each cell of the design
with(crefs_mit, aggregate(Modifier ~ Novelty + Addressee + Feedback, FUN = mean))
  SessionID ItemID RespID Novelty Addressee Feedback ModInTrain Modifier  SOT
1        56      2   6314     New       New       No       TRUE        0 1462
2        56      3   6319     New       New       No       TRUE        0 1126
3        56      7   6315     Old       New       No       TRUE        1 1695
  Novelty Addressee Feedback   Modifier
1     New       New       No 0.10526316
2     Old       New       No 0.61224490
3     New       Old       No 0.08771930
4     Old       Old       No 0.75510204
5     New       New      Yes 0.06779661
6     Old       New      Yes 0.62500000
7     New       Old      Yes 0.06896552
8     Old       Old      Yes 0.76000000

Plot cell means

img/GBplot.png
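A plot along these lines can be produced with ggplot2. This is a sketch only (the original figure may have been made differently); it assumes the `crefs_mit` data frame created above and that ggplot2 is installed:

```R
library(ggplot2)

## cell means of modifier use, as computed above
cell_means <- aggregate(Modifier ~ Novelty + Addressee + Feedback,
                        data = crefs_mit, FUN = mean)

## bar plot of the 2 x 2 x 2 design, one panel per Feedback condition
ggplot(cell_means, aes(x = Novelty, y = Modifier, fill = Addressee)) +
  geom_col(position = "dodge") +
  facet_wrap(~ Feedback, labeller = label_both) +
  labs(y = "Proportion of modified descriptions")
```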

Specifying the random effects

| Factor    | Subjects | Items   |
|-----------+----------+---------|
| Addressee | Between  | Within  |
| Novelty   | Within   | Within  |
| Feedback  | Within   | Between |
## a between-unit factor has observations in only one level per unit
xtabs(~ Addressee + SessionID, crefs)  # between subjects
xtabs(~ Addressee + ItemID, crefs)     # within items

Creating predictor variables

## dummy (0/1) codes for each two-level factor
crefs_mit2 <- transform(crefs_mit,
                        N = ifelse(Novelty == "New", 1, 0),
                        A = ifelse(Addressee == "New", 1, 0),
                        F = ifelse(Feedback == "Yes", 1, 0))

## center each predictor (deviation coding)
crefs_mit2c <- transform(crefs_mit2,
                         Nc = N - mean(N),
                         Ac = A - mean(A),
                         Fc = F - mean(F))

head(crefs_mit2c)
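A quick sanity check on the centering step: the deviation-coded predictors should have means of (approximately) zero. A minimal sketch, assuming the `crefs_mit2c` data frame created above:

```R
## means of the centered predictors should be ~0
## (up to floating-point error)
round(colMeans(crefs_mit2c[, c("Ac", "Nc", "Fc")]), 10)
```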

Specifying the model

Our model: \(\log\left(\frac{p_{ij}}{1-p_{ij}}\right) = \beta_0 + \beta_1 A + \beta_2 N + \beta_3 F + \beta_4 AN + \beta_5 AF + \beta_6 NF + \beta_7 ANF\)

| Factor    | Subjects | Items   |
|-----------+----------+---------|
| Addressee | Between  | Within  |
| Novelty   | Within   | Within  |
| Feedback  | Within   | Between |

m1 <- glmer(Modifier ~ Ac*Nc*Fc + (1 + Nc*Fc | SessionID) + (1 + Ac*Nc | ItemID),
                      data=crefs_mit2c, family=binomial(link="logit"),
                      control=glmerControl(optimizer="bobyqa"))

Fitting the model

# takes a long time but doesn't converge
m1 <- glmer(Modifier ~ Ac*Nc*Fc +
            (1 + Nc*Fc | SessionID) + (1 + Ac*Nc | ItemID),
            crefs_mit2c, family=binomial(link="logit"),
            control=glmerControl(optimizer="bobyqa"))
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 0.01537 (tol = 0.001, component 20)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?; Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
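When a fit produces convergence warnings like these, one common diagnostic is to refit the model with every available optimizer and compare the estimates: if they agree closely, the warnings are likely false positives. A sketch, assuming a recent version of lme4 in which `allFit()` is exported:

```R
## refit m1 with all available optimizers; if the fixed-effect
## estimates agree across optimizers, the convergence warnings
## are probably not a real problem
m1_all <- allFit(m1)
summary(m1_all)$fixef
```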

Diagonal model (covariance parameters fixed to zero)

# no-random-correlations model
m2 <- glmer(Modifier ~ Ac*Nc*Fc + 
            (1 + Nc*Fc || SessionID) + (1 + Ac*Nc || ItemID),
            crefs_mit2c, family=binomial(link="logit"),
            control=glmerControl(optimizer="bobyqa")) # converges

Diagonal model (output)

Random effects:
 Groups      Name        Variance  Std.Dev. 
 SessionID   (Intercept) 6.566e-01 8.103e-01
 SessionID.1 Nc          1.132e+00 1.064e+00
 SessionID.2 Fc          0.000e+00 0.000e+00
 SessionID.3 Nc:Fc       4.297e-15 6.555e-08
 ItemID      (Intercept) 1.026e+00 1.013e+00
 ItemID.1    Ac          1.580e-01 3.975e-01
 ItemID.2    Nc          1.216e+00 1.103e+00
 ItemID.3    Ac:Nc       0.000e+00 0.000e+00
Number of obs: 427, groups:  SessionID, 32; ItemID, 16

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -1.13298    0.35710  -3.173  0.00151 ** 
Ac          -0.29116    0.46433  -0.627  0.53063    
Nc          -4.32096    0.61720  -7.001 2.54e-12 ***
Fc          -0.38034    0.62448  -0.609  0.54249    
Ac:Nc        1.05604    0.78602   1.344  0.17910    
Ac:Fc       -0.07195    0.71227  -0.101  0.91954    
Nc:Fc       -0.32425    0.89398  -0.363  0.71683    
Ac:Nc:Fc    -0.53686    1.33216  -0.403  0.68695    

Clean up the random effects

  • drop random-effects parameters (REPs) estimated to be zero: we test effects via model comparison, so the model must be refit many times, and zero-variance terms can slow down the estimation procedure
m3 <- glmer(Modifier ~ Ac*Nc*Fc +
                (1 | SessionID) +
                (0 + Nc | SessionID) +
                (1 | ItemID) +
                (0 + Ac | ItemID) +
                (0 + Nc | ItemID),
            crefs_mit2c, family=binomial(link="logit"),
            control=glmerControl(optimizer="bobyqa")) # converges
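Since m3 only drops variance components that m2 estimated at (or near) zero, a model comparison should show essentially no loss of fit; a quick check, assuming both models fitted as above:

```R
## m3 is nested in m2; with only zero-variance terms removed,
## the log-likelihoods should be essentially identical
anova(m2, m3)
```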

Perform tests using model comparison


## refit, dropping one fixed-effect term at a time
m3_noA <- update(m3, . ~ . - Ac)
m3_noN <- update(m3, . ~ . - Nc)
m3_noF <- update(m3, . ~ . - Fc)
m3_noAN <- update(m3, . ~ . - Ac:Nc)
m3_noAF <- update(m3, . ~ . - Ac:Fc)
m3_noNF <- update(m3, . ~ . - Nc:Fc)
m3_noANF <- update(m3, . ~ . - Ac:Nc:Fc)

anova(m3, m3_noA)
anova(m3, m3_noN)
anova(m3, m3_noF)
anova(m3, m3_noAN)
anova(m3, m3_noAF)
anova(m3, m3_noNF)
anova(m3, m3_noANF)


| Effect | Chisq  | Df | p     |
|--------+--------+----+-------|
| A      |   .382 |  1 | .537  |
| N      | 32.693 |  1 | <.001 |
| F      |   .372 |  1 | .542  |
| AN     |  1.843 |  1 | .175  |
| AF     |   .010 |  1 | .920  |
| NF     |   .131 |  1 | .717  |
| ANF    |   .162 |  1 | .687  |
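A table like this can be assembled programmatically from the `anova()` objects. A sketch, assuming the m3_no* models fitted above (the second row of each `anova()` table holds the likelihood-ratio test):

```R
## collect the likelihood-ratio test from each model comparison
comparisons <- list(A = m3_noA, N = m3_noN, F = m3_noF,
                    AN = m3_noAN, AF = m3_noAF, NF = m3_noNF,
                    ANF = m3_noANF)
t(sapply(comparisons, function(m) {
  aov_tab <- anova(m3, m)
  unlist(aov_tab[2, c("Chisq", "Df", "Pr(>Chisq)")])
}))
```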

Implications

  • Little evidence that speakers took their partner’s perspective into account
    • Use of a modifier was driven by the speaker’s own experience
  • Supports the idea that modifier use is (at least partly) based on memory retrieval

Analysis of count data

Final thoughts

  • When performing (or reviewing) analyses, it is of utmost importance to ensure that random effects are appropriately specified
  • Random-intercept-only models are rarely appropriate
  • Use design-driven rather than data-driven random effects
  • Development of lme4 is rapid, and there are many “tricks of the trade”; tune into blogs, mailing lists, and social media to keep up
  • Don’t give up! It will make sense… at some point…