inconsistent SE estimates by order of data #262

Open
bbolker opened this issue Dec 11, 2014 · 13 comments
bbolker commented Dec 11, 2014

Copied from CrossValidated

library(lme4)
set.seed(999)

# make a somewhat complex data frame
x <- c(rnorm(10000), rnorm(10000, 0.1))
x <- sample(x)
y <- jitter(x, amount = 10)
a <- rep(1:20, length.out = length(x))
y[a == 1] <- jitter(y[a == 1], amount = 3)
y[a == 2] <- jitter(x[a == 2], amount = 1)
y[a > 3 & a < 6] <- rnorm(sum(a > 3 & a < 6))
# convert to binary variables
y <- y > 0
x <- x > 0
# make a data frame
d <- data.frame(x1 = x, y1 = y, a1 = a)

# run model
m1 <- glmer(x1 ~ y1 + (1 + y1 | a1), data = d, family = binomial(link = "logit"))

# shuffle order of rows
d <- d[sample(nrow(d)), ]

# run model again
m2 <- glmer(x1 ~ y1 + (1 + y1 | a1), data = d, family = binomial(link = "logit"))

# show output
summary(m1)
summary(m2)

Results are similar but not identical (though not nearly as different as reported for the original, non-shared dataset, where the p-values were as different as 0.001 vs. 0.08!)

 all.equal(coef(summary(m1)),coef(summary(m2)))
[1] "Mean relative difference: 0.0004398406"

Run with

R Under development (unstable) (2014-09-17 r66626)
Platform: i686-pc-linux-gnu (32-bit)

Raw results (differences are not visible on this scale):

> m1
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: x1 ~ y1 + (1 + y1 | a1)
   Data: d
      AIC       BIC    logLik  deviance  df.resid 
 27220.90  27260.41 -13605.45  27210.90     19995 
Random effects:
 Groups Name        Std.Dev. Corr 
 a1     (Intercept) 0.3067        
        y1TRUE      0.5921   -1.00
Number of obs: 20000, groups:  a1, 20
Fixed Effects:
(Intercept)       y1TRUE  
    -0.1729       0.4331 
> m2
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: x1 ~ y1 + (1 + y1 | a1)
   Data: d
      AIC       BIC    logLik  deviance  df.resid 
 27220.90  27260.41 -13605.45  27210.90     19995 
Random effects:
 Groups Name        Std.Dev. Corr 
 a1     (Intercept) 0.3067        
        y1TRUE      0.5921   -1.00
Number of obs: 20000, groups:  a1, 20
Fixed Effects:
(Intercept)       y1TRUE  
    -0.1729       0.4331  
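
One way to narrow down where the (small) discrepancy lives is to compare the optimizer's parameter estimates and the derived standard errors separately. A minimal sketch (an illustration, not output from the actual session) using lme4's getME() accessor, with m1 and m2 as fitted above:

## compare the optimizer's parameter estimates directly ...
all.equal(getME(m1, "theta"), getME(m2, "theta"))
all.equal(getME(m1, "beta"),  getME(m2, "beta"))

## ... and then the standard errors derived from the estimated covariance matrix
all.equal(sqrt(diag(vcov(m1))), sqrt(diag(vcov(m2))))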
@stevencarlislewalker

I can't reproduce. The two models look identical to me:

> summary(m1)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: x1 ~ y1 + (1 + y1 | a1)
   Data: d

     AIC      BIC   logLik deviance df.resid 
 27220.9  27260.4 -13605.4  27210.9    19995 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.0340 -1.0021  0.4916  0.9566  2.0335 

Random effects:
 Groups Name        Variance Std.Dev. Corr 
 a1     (Intercept) 0.09408  0.3067        
        y1          0.35056  0.5921   -1.00
Number of obs: 20000, groups:  a1, 20

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)   
(Intercept) -0.17293    0.07159  -2.415  0.01571 * 
y1           0.43314    0.13553   3.196  0.00139 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
   (Intr)
y1 -0.979
> summary(m2)
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: x1 ~ y1 + (1 + y1 | a1)
   Data: d

     AIC      BIC   logLik deviance df.resid 
 27220.9  27260.4 -13605.4  27210.9    19995 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.0339 -1.0021  0.4917  0.9566  2.0336 

Random effects:
 Groups Name        Variance Std.Dev. Corr 
 a1     (Intercept) 0.09408  0.3067        
        y1          0.35056  0.5921   -1.00
Number of obs: 20000, groups:  a1, 20

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)   
(Intercept) -0.17293    0.07156  -2.417  0.01567 * 
y1           0.43314    0.13547   3.197  0.00139 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
   (Intr)
y1 -0.979

bbolker commented Dec 11, 2014

See edited comment above: what is all.equal(coef(summary(m1)),coef(summary(m2))) in your case?

@stevencarlislewalker

> all.equal(coef(summary(m1)),coef(summary(m2)))
[1] "Mean relative difference: 0.0003941612"

bbolker commented Dec 11, 2014

The OP on CrossValidated reports (on R 3.0.1, GUI 1.61, Snow Leopard build (6492)):

When I run all.equal(coef(summary(m1)),coef(summary(m2))) I get "Mean relative difference: 0.00276842"

So this is an order of magnitude larger than our differences (3e-3 vs. 4e-4), but still about two orders of magnitude smaller than, e.g., all.equal(0.01, 0.008) (relative error = 0.2).

According to the OP's summary, for the real data the fixed effect of interest is:

mean       SE       z       p
0.3772     0.2123   1.777   0.0756

What would the SE need to be in order to get a two-tailed p-value of 0.001 instead?

m <- 0.3772
se <- 0.2123
p <- 2 * pnorm(abs(m / se), lower.tail = FALSE)
## solving for se: abs(m/se) = qnorm(p/2, lower.tail = FALSE),
## so se = m / qnorm(p/2, lower.tail = FALSE); for p = 0.001:
m / qnorm(0.001 / 2, lower.tail = FALSE)  ## 0.1146321

i.e., roughly a 2-fold difference in the estimated SE would be needed.

More precise tests of equality:

 all.equal(sqrt(diag(vcov(m1))),sqrt(diag(vcov(m2))))

(e.g., perhaps the relative differences in the SEs are much larger than in the rest of the coefficient summary) -- however, I get similar results here (mean relative difference: 0.0003951268). Differences are larger for the p-values, but still (for this example) at least an order of magnitude too small to worry about:

> ff <- function(x) coef(summary(x))[2, "Pr(>|z|)"]
> all.equal(ff(m1), ff(m2))
[1] "Mean relative difference: 0.005135089"

@xinxie-cogsci

Hi, I've recently noticed this old issue while running analyses on my data. The differences in p-values caused by the inconsistent SE estimates are much larger than those in the sample data above. If this bug has not been fixed, I would be happy to share my data so you can reproduce the problem.

bbolker commented Apr 14, 2020

Sure (I haven't got anything else to do right now :-) :-) :-) )

@xinxie-cogsci

Great! I'm not sure of the best way to share the data, but you can access it here: https://www.dropbox.com/sh/0g1alnp7oe3lzdd/AABKGlVe-dXnt3b-zN2kmA2za?dl=0

The R script loads the RData file and runs some analyses. As you will see, I was simply subsetting the whole dataset by subject and running mixed-effects models on each subject's data. It seems that in most cases (i.e., for most subjects' data) the reshuffling of rows does not affect the p-values much (consistent with the magnitudes shown in the sample data above), but in a couple of cases it changes the p-value by up to 0.4! I wonder what makes some data more susceptible to this issue than others.

Some pointers would be really helpful!
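
A minimal sketch (using the reproducer's model and data from the top of this thread, not the poster's per-subject script) of how one might probe whether a given fit is numerically fragile, i.e. whether the SEs move around under row shuffling or a change of optimizer; allFit() is exported by recent versions of lme4:

library(lme4)

fit_once <- function(dat) {
  glmer(x1 ~ y1 + (1 + y1 | a1), data = dat, family = binomial)
}

m_orig  <- fit_once(d)
m_shuff <- fit_once(d[sample(nrow(d)), ])  # same rows, different order

## compare the fixed-effect standard errors directly
cbind(original = sqrt(diag(vcov(m_orig))),
      shuffled = sqrt(diag(vcov(m_shuff))))

## refit with all available optimizers; a large spread across optimizers
## suggests the fit (and hence its SEs) is numerically fragile
aa <- allFit(m_orig)
summary(aa)$fixef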

adamrobinson361 commented Nov 26, 2020

@bbolker - just to add to this thread: a colleague and I have run into this problem recently too.

We're working on a fairly large dataset and, similar to the poster above, we see changes in p-values that suggest quite different conclusions for our analysis when the dataset is loaded in a different order.

In lieu of a short-term solution, are you able to offer any thoughts on why this might happen, or avenues to explore?

Our current plan is to start testing optimisers other than the default and comparing results. We are also planning to read up on the optimiser options that can be passed to glmer.

Are there any other avenues you might suggest?

Thanks for your work on the package!

Cheers,

Adam
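
For reference, a minimal sketch of the optimizer-switching plan described above, using glmerControl(); the formula and data come from the reproducer at the top of the thread rather than the actual (unshared) dataset:

library(lme4)

## fit with the default optimizer settings
m_default <- glmer(x1 ~ y1 + (1 + y1 | a1), data = d, family = binomial)

## the same model with an explicit optimizer and a larger evaluation budget
m_bobyqa <- glmer(x1 ~ y1 + (1 + y1 | a1), data = d, family = binomial,
                  control = glmerControl(optimizer = "bobyqa",
                                         optCtrl = list(maxfun = 2e5)))

## and again with the nloptr-based wrapper
m_nlopt <- update(m_default, control = glmerControl(optimizer = "nloptwrap"))

## if the standard errors agree across optimizers, the row-order sensitivity is
## probably numerical noise; large disagreements point to a fragile fit
sapply(list(default = m_default, bobyqa = m_bobyqa, nloptwrap = m_nlopt),
       function(m) sqrt(diag(vcov(m))))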

bachlaw commented Nov 26, 2020 via email

@adamrobinson361

Thanks for your input @bachlaw!

I think we're probably too far along in our analysis to change programming language, and I'm afraid we have neither experience with Stan nor the ability to use it easily in our corporate environment. I'm sure there would be a way, but I think it's best to stick with R for now.

Thanks for bringing glmmTMB to my attention, though. I'll definitely try to reproduce the model in that package and see whether the problem remains.

bamaly commented Oct 6, 2022

@adamrobinson361 - Were you able to find a solution to your problem? I am facing the same problem right now with a logistic glmm. In my case, glmmTMB seems to provide more robust estimates. In addition, glmer generates singular fits for some models, while this problem does not occur with glmmTMB.
Can someone explain this to me?
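
For a side-by-side comparison, a minimal sketch (again using the reproducer's data from the top of the thread, not bamaly's model) of fitting the same model with both packages and checking for a singular fit:

library(lme4)
library(glmmTMB)

m_lme4 <- glmer(x1 ~ y1 + (1 + y1 | a1), data = d, family = binomial)
m_tmb  <- glmmTMB(x1 ~ y1 + (1 + y1 | a1), data = d, family = binomial)

## lme4 flags boundary (singular) fits explicitly ...
isSingular(m_lme4)
## ... whereas glmmTMB typically reports a very small but nonzero variance
## instead; compare the estimated random-effect standard deviations
VarCorr(m_lme4)
VarCorr(m_tmb)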

bbolker commented Oct 6, 2022

Do you have a reproducible example? We could take a look.
glmmTMB never reports singular fits -- often it estimates very small values for the variances where lme4 would (correctly) find a variance of exactly zero.

bbolker commented Oct 17, 2022

@bamaly: reproducible example(s)?
