Consistency of syntax between lmer and aov_4 #34

Closed
puterleat opened this issue Jun 26, 2017 · 2 comments

Comments

@puterleat

Am I wrong to think that these two models should do the same thing (or at least, be a close approximation of one another)?

library(lmerTest)
library(afex)
aov_4(distance ~ age + (1|Subject), data=nlme::Orthodont, print.formula = TRUE)
anova(lmer(distance ~ factor(age) + (1|Subject), data=nlme::Orthodont))

In practice this aov_4 call doesn't work because it doesn't properly specify the nesting of multiple observations within Subject. To get (what I think is) the equivalent of the lmer model you need to write:

aov_4(distance ~ age + (age|Subject), data=nlme::Orthodont)

I was hoping to use aov_4 to help students transition between RM ANOVA and mixed models, but I'm worried these subtle differences in syntax will make it even more confusing than simply using aov_ez.

@singmann
Owner

I do not agree. The Orthodont data is special in that it has a repeated-measures variable but no replicates within each combination of design cell and unit of observation: each Subject is measured exactly once at each age. Such data are traditionally analyzed with ANOVA; they can also be analyzed with mixed models, though not perfectly. We would usually want to estimate random slopes for the effect of age alongside the Subject random intercept, but without replicates the random slopes are not identified. So in principle, the correct mixed model formula would indeed be (assuming age is a factor): distance ~ age + (age|Subject).
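The "no replicates" point can be checked directly. The following is a minimal sketch using only base R and the nlme package that ships with R; the names `d` and `cells` are illustrative and do not come from afex.

```r
# Orthodont is provided by nlme; convert to a plain data frame for tabulation.
d <- as.data.frame(nlme::Orthodont)

# Cross-tabulate the unit of observation (Subject) against the
# within-subject variable (age).
cells <- table(d$Subject, d$age)

dim(cells)        # number of subjects x number of measurement occasions
all(cells == 1)   # TRUE when every Subject-by-age cell holds exactly one observation
```

If `all(cells == 1)` is TRUE, there is nothing within a cell from which to separate a subject-specific age effect from residual noise, which is the situation described above.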

However, given that we have no replicates, the random intercept model seems indeed the most appropriate, as I have discussed here: http://singmann.org/mixed-models-for-anova-designs-with-one-observation-per-unit-of-observation-and-cell-of-the-design/

So no, aov_4 works exactly as it should. In principle, within-subject factors should have something like a random slope, and that is what you have to specify. Furthermore, the within-subject factors need to be flagged in some way. Note that the correct formula for aov for this design is distance ~ age + Error(Subject/(age)). There, too, age is placed in relation to Subject.
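For reference, here is a minimal sketch of that aov specification in base R. It assumes age is converted to a factor first and Subject is made a plain (unordered) factor; the object names `d` and `fit` are illustrative.

```r
d <- as.data.frame(nlme::Orthodont)
d$age <- factor(d$age)                            # four measurement occasions as a factor
d$Subject <- factor(d$Subject, ordered = FALSE)   # plain factor for the error strata

# age is tested within the Subject:age error stratum.
fit <- aov(distance ~ age + Error(Subject / age), data = d)
summary(fit)
```

The Error() term plays the same role as (age | Subject) in the aov_4 formula: it marks age as varying within Subject.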

To sum up, random-intercept-only models are dangerous in principle, but acceptable in this specific case.

@puterleat
Author

Many thanks for your quick response, and that article, both of which are really helpful in clarifying the problem.

I can see that, from the perspective of experimental data where repetitions are the norm, this problem might seem a bit odd. However, this type of data (no replicates for each cell of the design and unit of observation) is actually quite common in applied settings where one might want to avoid traditional RM ANOVA for other reasons (e.g. because of missing or unbalanced data). The most common case in my experience would be a clinical trial with a single outcome and > 2 measurement occasions. It seems a shame that afex, which does such a brilliant job of abstracting away some of the cruft of aov for experimental data, can't also work in this context.

As a consequence I wonder, if nothing else, whether the afex help pages would benefit from a small clarification? Or, even better, an amended error message when someone tries to specify (1 | grouping) in aov_4 or fails to specify a random slope?

The issue is that, because aov_4 mimics the syntax of lmer (it is described as 'lmer-like'), one might reasonably expect either that the pair of models I included above would be equivalent (they are not, and the aov_4 call fails) or that the models below would be equivalent, when in fact the lmer model would fail (because, as you point out, it is not identified):

aov_4(distance ~ age + (age | Subject), data=nlme::Orthodont)
anova(lmerTest::lmer(distance ~ factor(age) + (factor(age) | Subject), data=nlme::Orthodont))
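Why the lmer version cannot be fitted comes down to simple counting. The sketch below uses base R only (`n_obs` and `n_ranef` are illustrative names). As far as I understand, lme4's default checks refuse to fit a model when the number of observations does not exceed the number of random effects for a term, but treat that as an assumption about lme4's behaviour rather than a guarantee.

```r
d <- as.data.frame(nlme::Orthodont)

n_obs <- nrow(d)   # total number of observations

# With (factor(age) | Subject), each Subject gets one random effect per age level.
n_ranef <- nlevels(d$Subject) * length(unique(d$age))

c(observations = n_obs, random_effects = n_ranef)
n_obs <= n_ranef   # TRUE here: no information is left to estimate the residual variance
```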

The issue here is not the rights and wrongs of intercept-only mixed models, but rather the implied promise that aov_4 will translate lmer model formulae to an equivalent traditional RM ANOVA. Inevitably this translation is not 1:1 because the underlying models are different, but I'd hate for anyone else to waste the time I did trying to work out why. If you'd like me to draft something I'd be happy to do so!
