Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

lmer vs lme/SAS PROC MIXED; fixed estimate in random-slopes model #376

Open
bbolker opened this Issue May 4, 2016 · 1 comment

Comments

Projects
None yet
1 participant
Owner

bbolker commented May 4, 2016 edited

via e-mail from Georg Ferber:

data set here and my beginnings of exploration here

To give you a brief background: We perform a simulation study where we generate studies investigating the relationship of the QT-interval (or rather its change from a baseline value) and the concentration of a hypothetical drug. In one of the scenarios that we investigated, we observed that in about 10 % of the cases, the estimate for the fixed effect for concentration, which, in this case, should be around 0, was much larger and there was an outlier in the random effects. When we tried to identify the reason for this, we discovered that with the same dataset, the estimates obtained with the lme function in the nlme package gave us much more "reasonable" (i.e. a slope closer to the expected value of 0) results and that SAS proc mixed also gave these "reasonable" results. It should be noted that lme at times did not converrge, Likewise, by using a model with a smaller number of random effects, we could get values we expected. We think that in the meantime we could identify the element in the data that caused the problem to occur, but we still do not understand what happens.

More precisely, we fit this model:

lmer(dQTcF ~ C + trt + TP - 1 + (1 + C|id),  data=data )

where trt and TP are factors, while C is a continuous covariate) and we are interested (among others) in the fixed effect for C.
In the dataset attached (Oddsets1.Rdata), you find a list of 10 data.frames that cause these strange results together with a set of results, where the row named "Slope" is the one of primary interest. Note that this table is not a directly based on the lmer model, but uses the package lsmeans to come up with a Kenward-Rogers approximation. However, the estimate for the coefficient for C is unchanged.

Below is an overview over the results (fixed effect for C) using the three programs.

          lmer           lme        SAS
 [1,] 0.2567071  0.0368688065    0.03687   
 [2,] 0.6657036  0.0149760616    0.01498
 [3,] 0.2412575  0.0008564606    0.000853
 [4,] 0.1646511  0.1646511556    -0.00071
 [5,] 0.2916294 -0.0230802265    -0.02308
 [6,] 0.3600185            NA    NA: stopped for UN, but VC: -0.00527
 [7,] 0.3022839  0.3022838689    UN: 0.3023, but VC: -0.04856
 [8,] 0.2426037            NA    NA: stopped for UN, but VC: -0.01465
 [9,] 0.2897166 -0.0063613888    -0.00636
[10,] 0.2960527  0.2960525330    NA: stopped for UN, but VC: 0.03951

Upon furhter investigation, we found that the large values for the coefficient disappear, when the model is changed to

lmer(dQTcF ~ C + trt + TP - 1 + (1|id),  data=data )

while they remain essentially unchanged if we set REML=F.
Some more results are summarized in the attached word document.

We are currently thinking about strategies to cope with the problem above, but, of course, would be very glad to better understand what happens and how to avoid the problem.
We hope that you are also interested in feedback from the user community and that you find this problem interesting.

Owner

bbolker commented May 4, 2016 edited

findings so far:

  • results vary by replicate data set and optimizer within lme4: for some replicates (reps 1,3,7) all optimizers give the same answer; for all the others, the nloptr-based optimizers give values much closer to zero (presumably more correct), although they also return (possibly/presumably bogus) convergence warnings
  • stuff to check/explore:
    • VarCorr, conditional modes, standard errors of fixed estimates
    • likelihood slices
    • vary starting values? evaluate likelihood at 'best fit'
    • improve broom::tidy to handle all of these cases?
  • I'm experimenting with other tools (INLA, glmmTMB, glmmADMB) as well, not much to report yet
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment