predict vs fitted #403

Open
JannieLee opened this Issue Dec 11, 2016 · 16 comments


Recently I used this package to fit a linear mixed-effects model, but I ran into the following problem.

library(Matrix)
library(lme4)

fit<-lmer(Y~X1+X2+X3+X4+X5+X6+X7+(X1|Subject)+(X2|Subject)+(X3|Subject)+(X4|Subject)+(X5|Subject)+(X6|Subject)+(X7|Subject),data=mydata)
			
f_fit<-fitted(fit)
f_pre<-predict(fit, newdata = mydata , re.form = NULL)

f_fit is not the same as f_pre (e.g., f_fit[1] = 32.5, f_pre[1] = 179.6). Why?

Owner

bbolker commented Dec 12, 2016

Sorry, but there's no way we can answer this question without a reproducible example ... in a simple case, these values are indeed the same, e.g.

> library(lme4)
Loading required package: Matrix
> m1 <- lmer(Reaction~Days+(Days|Subject),sleepstudy)
> fitted(m1)[1]
       1 
253.6637 
> predict(m1,newdata=sleepstudy,re.form=NULL)[[1]][1]
[1] 253.6637

You have a rather complex random-effects structure, and in particular you have (implicitly) repeated intercept terms across all of your random effects. In principle this shouldn't mess up your predictions, but you should at least have seen a warning ...

> m2 <- lmer(Reaction~Days+(1|Subject)+(Days|Subject),sleepstudy)
Warning message:
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?
Contributor

pitakakariki commented Dec 12, 2016

I can reproduce it with this dataset.

N <- 1000

mydata <- data.frame(
    Y = rnorm(N),
    X1 = rnorm(N),
    X2 = rnorm(N),
    X3 = rnorm(N),
    X4 = rnorm(N),
    X5 = rnorm(N),
    X6 = rnorm(N),
    X7 = rnorm(N),
    Subject = sample(letters[1:6], N, replace=TRUE)
)
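For completeness, a sketch of how the discrepancy can be checked on this simulated data, using the OP's original model formula (expect convergence/identifiability warnings, since the random-effects structure is deliberately pathological; with the bug present, `all.equal` reports a mismatch, and after a fix it should return `TRUE`):

```r
library(lme4)

# Fit the OP's model on the simulated data above; each (Xk|Subject) term
# implicitly adds its own redundant random intercept.
fit <- lmer(Y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 +
              (X1|Subject) + (X2|Subject) + (X3|Subject) + (X4|Subject) +
              (X5|Subject) + (X6|Subject) + (X7|Subject),
            data = mydata)

# Compare in-sample fitted values against predictions on the same data.
all.equal(unname(fitted(fit)),
          unname(predict(fit, newdata = mydata, re.form = NULL)))
```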
Owner

bbolker commented Dec 12, 2016

Thanks @pitakakariki . I will dig into this. I will say, though, that I will be more diligent/enthusiastic if I can see some evidence that this also happens in a model that is not pathological ...

Thanks, @bbolker. I also got the same results when I added random effects on only one variable. But why does it not work with a more complex random-effects structure? Thanks for your attention.

Contributor

pitakakariki commented Dec 12, 2016

Just a hunch, but if the model hasn't converged, might mu still be an iteration behind b?

Owner

mmaechler commented Dec 12, 2016

@pitakakariki .. good hunch. I'd also guess something like that (there are more components of the fit object which might not have been completely updated). Do we have any example of a predict / fitted difference in a converged model?

Contributor

pitakakariki commented Dec 13, 2016

I think I've worked out what's happening. Have a look at ranef(fit)$Subject - it's a data frame with multiple (Intercept) columns.

There's a line in mkNewReTrms:

re_x[[rname]][,Rcnms[[i]]]

which in this example translates roughly to e.g.

ranef(fit)[["Subject"]][, c("(Intercept)", "X4")]

That's going to pick up the first column, whereas the "correct" column is the seventh in this case.
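A minimal illustration of that name-matching behaviour in plain base R (not lme4 internals): character subsetting of a data frame with duplicated column names always resolves to the *first* match, so the later `(Intercept)` columns can never be selected by name.

```r
# Data frame with a duplicated column name (check.names = FALSE keeps it).
df <- data.frame(a = 1, b = 2, a = 3, check.names = FALSE)

# Selecting by name matches the first "a" (value 1); the duplicate
# "a" column (value 3) is unreachable by name.
df[, c("a", "b")]
```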

Contributor

pitakakariki commented Dec 13, 2016

> ranef(fit)$Subject
  (Intercept)            X1 (Intercept)            X2 (Intercept)            X3 (Intercept)
a           0  1.441526e-15           0 -3.695788e-13           0 -8.277490e-14 -0.01479389
b           0 -2.271339e-15           0 -2.245552e-14           0  8.921899e-14 -0.01030506
c           0 -1.709907e-15           0  2.729122e-14           0  1.999105e-14  0.01896256
d           0  2.394823e-15           0  1.816504e-13           0 -1.062534e-14  0.06577116
e           0  1.321437e-15           0 -8.350977e-14           0 -2.766679e-14 -0.04958625
f           0 -1.176540e-15           0  2.666024e-13           0  1.185700e-14 -0.01004851
            X4   (Intercept)            X5 (Intercept)            X6 (Intercept)           X7
a  0.013516964  1.105199e-13 -1.187151e-13           0 -2.164586e-14           0  0.001793271
b  0.009415586  8.720571e-14 -9.367217e-14           0  1.063831e-14           0  0.027662202
c -0.017325809  2.194088e-13 -2.356784e-13           0  2.616791e-14           0 -0.019034680
d -0.060094140  7.632466e-14 -8.198427e-14           0  2.097997e-14           0 -0.004971522
e  0.045306224 -3.346151e-13  3.594275e-13           0 -2.636821e-14           0 -0.003153396
f  0.009181175 -1.588440e-13  1.706225e-13           0 -9.772113e-15           0 -0.002295876
Owner

bbolker commented Dec 13, 2016

great! I will see what I can do to fix this. (Nevertheless, it does confirm my suspicion that this will only happen in a weird situation, i.e. when we have redundant/confounded intercept terms)

Contributor

pitakakariki commented Dec 13, 2016

Is it possible to specify the same model but with a single identifiable intercept term --- i.e. a random intercept which is correlated with the random slopes for X1 through X7, but where the random slopes are uncorrelated with each other?

Or would that need flexLambda?

Owner

bbolker commented Dec 13, 2016

You could hack this up with the modular structure. The most immediately sensible fix is to specify the random effects as (1|Subject) + (0+X1|Subject) + (0+X2|Subject) + ... I'd like to hear more from the OP about their problem: do they really have enough data to fit a model this complex? How important is it that the intercept be correlated with the slopes? (Setting the slopes to be all independent of each other is already a fairly strong assumption.) Would a compound-symmetric structure (which also requires some extra work) be more sensible? Or a model with a full unstructured variance-covariance matrix shrunk toward the diagonal (which could be achieved via MCMCglmm, brms, or blme)?
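Spelled out in full for the simulated example above (assuming pitakakariki's `mydata`), the suggested re-specification gives a single random intercept plus independent random slopes, so `ranef()` contains only one `(Intercept)` column:

```r
library(lme4)

# Single random intercept, plus uncorrelated random slopes via 0 + Xk.
fit2 <- lmer(Y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 +
               (1|Subject) +
               (0 + X1|Subject) + (0 + X2|Subject) + (0 + X3|Subject) +
               (0 + X4|Subject) + (0 + X5|Subject) + (0 + X6|Subject) +
               (0 + X7|Subject),
             data = mydata)

# Now there is no duplicated (Intercept) column to mismatch.
names(ranef(fit2)$Subject)
```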

Thanks, @bbolker, I think it works with the structure ((1|Subject) + (0+X1|Subject)+ (0+X2|Subject) + ...).
Thanks for all.

Is it correct to have X1, X2, ..., X7 on both the fixed and the random side simultaneously?

Owner

bbolker commented Mar 10, 2017

Yes. In fact it's generally necessary for a sensible model. y ~ X + (X|f) specifies a fixed population-level effect of X plus random variation of the effect across levels of f. y~1+(X|f) only makes sense in very particular situations (e.g. effects of X have already been standardized to 0 at the population level, or as a null model for testing whether the population-level effect is zero).

I could have sworn I started to work on this problem, but I can't find any evidence thereof (e.g., a branch where I was working on a solution).
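The distinction above can be sketched on the built-in sleepstudy data (a hedged illustration, not output from the OP's model):

```r
library(lme4)

# Population-level effect of Days plus per-subject variation around it.
m_full <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

# Null-slope model: per-subject Days effects vary around a fixed mean of 0;
# only sensible in special cases (pre-standardized effects, null testing).
m_null <- lmer(Reaction ~ 1 + (Days|Subject), sleepstudy)

fixef(m_full)["Days"]  # estimated population-level slope
fixef(m_null)          # intercept only; no fixed Days term
```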

Another example.
Imagine you measure the weight of a person every year for their whole life.
To study each person's weight variation I could write a simple model

Weight ~ Age + (1|ID)

Would it be right to express it as
Weight ~ Age + (Age|ID) instead?

Owner

bbolker commented Mar 11, 2017

This isn't the right venue for this conversation; the issue list is for discussion of the software itself (bugs, wishlist, etc.), not general questions about mixed modeling or lme4's formula specifications. Could you please raise it on r-sig-mixed-models@r-project.org instead?
