Closed
While working on a PR to add a new emmeans-tidier, I noticed that the emmeans-tidiers have some internal and external inconsistencies:
- The `lsmobj` method uses the common arguments `conf.int` and `conf.level` as defined in the `param_confint` template. The other methods (e.g., `emmGrid`) do not provide these arguments and instead rely on the argument names native to the `emmeans` `summary()` methods (e.g., `infer` and `level`).
I haven't looked exhaustively at other methods, but I additionally noticed some inconsistencies compared to other contrast tidiers, specifically `tidy.TukeyHSD()`:
- `tidy.TukeyHSD()` reports the contrasted conditions in a column labelled `comparison` in the form of `a-b`. In contrast, the `emmeans` tidiers return the same information in two columns labelled `level1` and `level2` (containing `a` and `b`).

  ```r
  library(broom)
  library(emmeans)

  fm1 <- aov(breaks ~ wool + tension, data = warpbreaks)
  thsd <- TukeyHSD(fm1, "tension", ordered = TRUE)
  tidy(thsd)  # TukeyHSD tidier

  emmp <- pairs(emmeans(fm1, ~ tension))
  tidy(emmp)  # emmeans tidier
  ```

- In the `tibble` returned by `tidy.TukeyHSD()`, the column containing p-values is labelled `adj.p.value`. In contrast, the `emmeans` tidiers label this column `p.value` regardless of whether it has been adjusted for multiple comparisons (see code above). Unless I missed something, the use of `adj.p.value` is currently unique to `tidy.TukeyHSD()`.
It seems desirable to keep things consistent across methods where possible, particularly within the set of tidiers for a given package. I would therefore suggest the following changes, which I'd be willing to implement in a PR:
- Add the arguments `conf.int` and `conf.level` to all `emmeans` tidiers.
- Change reporting of contrast pairs in either `tidy.TukeyHSD()` or the `emmeans` methods. I'm not sure which of the two is preferable here.
- Either use `adj.p.value` in `emmeans` tidiers whenever p-values are adjusted for multiple comparisons or use `p.value` in `tidy.TukeyHSD()`. Again, I'm not sure which is preferable.
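
To illustrate the first suggestion, here is a rough sketch of how `conf.int` and `conf.level` could be mapped onto the `infer` and `level` arguments of the `emmeans` `summary()` method. The helper name `tidy_emm_sketch` is hypothetical and only for illustration. It is not broom's implementation and does no column renaming.

```r
library(emmeans)

# Hypothetical helper for illustration only; not part of broom.
# Maps broom-style conf.int / conf.level onto summary()'s infer / level.
tidy_emm_sketch <- function(x, conf.int = FALSE, conf.level = 0.95, ...) {
  # infer[1] toggles confidence intervals, infer[2] toggles tests and p-values
  s <- summary(x, infer = c(conf.int, TRUE), level = conf.level, ...)
  as.data.frame(s)
}

fm1 <- aov(breaks ~ wool + tension, data = warpbreaks)
emmp <- pairs(emmeans(fm1, ~ tension))

# Pairwise contrasts with 90% confidence intervals
tidy_emm_sketch(emmp, conf.int = TRUE, conf.level = 0.9)
```

With `conf.int = FALSE` the interval columns are simply omitted, which would match how the other tidiers behave.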
Any thoughts?