Dimensionality of Scheffe Method #171

zkuizx · 2020-02-06T21:21:31Z

I am using a one-way anova analysis with six levels. I have three contrasts of interest. When I use Scheffe method to get the confidence interval of them, the dimensionality from the output is 2 instead of 5. This only happens when I use the newest version of emmeans. When I use an old version of emmeans, the dimensionality is still 5. As my understanding, the dimensionality from Scheffe method should be always 5 = 6 - 1 if I have 6 levels and only focus on the contrasts. Is this a bug?

The attached are part of the R codes and output:

my.con = list(tc1.2 = c(0.5, -0.5, 0, 0.5, -0.5, 0), tc1.3 = c(0.5, 0, -0.5, 0.5, 0, -0.5) ,tc2.3 = c(0, 0.5, -0.5, 0, 0.5, -0.5))

summary(contrast(weight.lm, method = my.con, adjust = "scheffe"), level = 1 - alpha, side = "two-sided", infer = c(T,T))

contrast estimate SE df lower.CL upper.CL t.ratio p.value
tc1.2 -5.64 0.445 60 -6.72 -4.55 -12.656 <.0001
tc1.3 -8.00 0.445 60 -9.08 -6.92 -17.963 <.0001
tc2.3 -2.36 0.445 60 -3.45 -1.28 -5.307 <.0001

Confidence level used: 0.94
Conf-level adjustment: scheffe method with dimensionality 2
P value adjustment: scheffe method with dimensionality 2

rvlenth · 2020-02-06T22:00:55Z

Your old emmeans must be VERY old, because I changed this some time ago, from a "textbook" calculation of dimensionality to an actual dimensionality determination. In its underlying theory, the Scheffe method is based on the maximum possible t statistic among all linear combinations of contrasts under test, and its dimensionality is the dimensionality of the linear space containing all those under test. If you were testing all possible contrasts among the 6 means, that dimensionality is indeed 6-1 = 5. But you specified only 3 contrasts, and so the dimensionality cannot exceed 3. And it is in fact only 2 because tc2.3 = tc1.3 – tc1.2 (you can easily verify this) so that the third contrast can be determined from the other two. If these 3 contrasts are the only ones you are interested in, then the Scheffe correction with dimensionality 5 is way, way conservative and the results you have are preferable. If you really do want to view these three contrasts as just three among a vast number of others, then put more contrasts in the mix so that the dimensionality comes out 5; and then just ignore the results for all but those three. An easy way to do this is to use my.conx = c(my.con, emmeans:::consec.emmc(1:6)) summary(contrast(weight.lm, my.conx, adjust="scheffe", infer = TRUE) Russ

zkuizx · 2020-02-06T22:19:33Z

Hi, Russ,

Thank you very much! Your explanation confirmed my initial thought. I know my three contrasts listed above can be written as functions of two contrasts. Actually, I tried several other combinations of contrasts and found the change of dimension.

I agree with you that if there are only three pre-planned contrasts, then the adjustment with 5 dimension here is way too conservative. But if I do some "data snooping" and find that these three contrasts, I think we should still use the dimension of 5 and your suggestion provides me a perfect solution.

Again, thank you very much for the help!

rvlenth · 2020-02-08T18:10:53Z

I think your concerns are important enough to provide for manually specifying the rank to use in the Scheffe adjustment. After all, the purpose is data-snooping, and it seems it would not be uncommon to not specify all the contrasts we want in one call. So I have incorporated the possibility of a "hidden" argument scheffe.rank that may be included among ....

> library(emmeans)
> fm1 = aov(count ~ spray, InsectSprays)
> emm1 = emmeans(fm1, "spray")
> con1 = emmeans:::consec.emmc(LETTERS[1:6])[c(1,2,5)]

> summary(contrast(emm1, con1), adjust = "scheffe")
 contrast estimate  SE df t.ratio p.value
 B - A       0.833 1.6 66  0.520  0.9651 
 C - B     -13.250 1.6 66 -8.276  <.0001 
 F - E      13.167 1.6 66  8.223  <.0001 

P value adjustment: scheffe method with rank 3 

> summary(contrast(emm1, con1), adjust = "scheffe", scheffe.rank = 5)
 contrast estimate  SE df t.ratio p.value
 B - A       0.833 1.6 66  0.520  0.9981 
 C - B     -13.250 1.6 66 -8.276  <.0001 
 F - E      13.167 1.6 66  8.223  <.0001 

P value adjustment: scheffe method with rank 5

Note that I changed the annotation form "dimensionality" to "rank". It was just less awkward to use "rank" in the revised documentation, and so the annotation is changed to make it consistent.

The updated version will be pushed to github soon.

zkuizx · 2020-02-08T22:22:37Z

Hi, Russell,

Thank you very much! This is fantastic. Such implementation of manual specification of rank also provides a solution for another potential problem. When I used the contrast (1, 1, 1, -1, -1, -1) / 3 and several other contrasts, the rank of 6 instead of 5 was used. I think due to the numerical error, sometimes the system may not treat (1, 1, 1, -1, -1, -1) / 3 as an contrast. Your implementation actually provides a solution for this: I can manually set the rank to 5.

Again, thank you very much for the help! Have a nice weekend!

rvlenth · 2020-02-09T02:26:00Z

I think if you got a dimensionality of 6, you must have had a linear function somewhere in there that is not a contrast (coefficients do not sum to zero). I suppose it is possible that division by 3 creates a numerical inexactness, but I seriously doubt that that would exceed the tolerances for rank determination used by qr().

You can check this manually: Make a matrix of the contrasts you used, and run qr((). Then look at the $rank slot of the result. You can also calculate the sum of each set of coefficients.

> mat = cbind(as.matrix(emmeans:::pairwise.emmc(1:6)), c(1, 1, 1, -1, -1, -1) / 3)
> qr(mat)$rank
[1] 5

zkuizx · 2020-02-09T02:51:56Z

Hi, Russel,

I am quite surprised too. Here is what I got from R. You can see that only contrasts are in there. The rank from qr is 5 but the dimensionality is 6. By the way, how did you embed your R code in your message? They look much nicer than mine.

> my.con <- list(a = c(1, 1, 1, -1, -1, -1) / 3, 
    b12 = c(1, 0, 0, -1, 0, 0), 
    b13 = c(0, 1, 0, 0, -1, 0), 
    b23 = c(0, 0, 1, 0, 0, -1))
> my.con.mix <- c(my.con, emmeans:::consec.emmc(1:6))
> summary(contrast(rec.lsm, method = my.con.mix, adjust = "scheffe"), infer = c(T,T))
 contrast estimate      SE df lower.CL upper.CL t.ratio p.value
 a        -0.07233 0.00802 12 -0.10633 -0.03834 -9.021  0.0001 
 b12      -0.08333 0.01389 12 -0.14222 -0.02445 -6.000  0.0042 
 b13      -0.08067 0.01389 12 -0.13955 -0.02178 -5.808  0.0055 
 b23      -0.05300 0.01389 12 -0.11189  0.00589 -3.816  0.0901 
 2 - 1    -0.00633 0.01389 12 -0.06522  0.05255 -0.456  0.9997 
 3 - 2     0.03333 0.01389 12 -0.02555  0.09222  2.400  0.4905 
 4 - 3     0.05633 0.01389 12 -0.00255  0.11522  4.056  0.0647 
 5 - 4    -0.00900 0.01389 12 -0.06789  0.04989 -0.648  0.9981 
 6 - 5     0.00567 0.01389 12 -0.05322  0.06455  0.408  0.9999 

Confidence level used: 0.95 
Conf-level adjustment: scheffe method with dimensionality 6 
P value adjustment: scheffe method with dimensionality 6 
> 
> aa <- cbind(my.con[[1]], my.con[[2]], my.con[[3]], my.con[[4]], as.matrix(emmeans:::consec.emmc(1:6)))
> aa
                      2 - 1 3 - 2 4 - 3 5 - 4 6 - 5
1  0.3333333  1  0  0    -1     0     0     0     0
2  0.3333333  0  1  0     1    -1     0     0     0
3  0.3333333  0  0  1     0     1    -1     0     0
4 -0.3333333 -1  0  0     0     0     1    -1     0
5 -0.3333333  0 -1  0     0     0     0     1    -1
6 -0.3333333  0  0 -1     0     0     0     0     1
> qr(aa)$rank
[1] 5

rvlenth · 2020-02-09T03:35:26Z

Hmmm. OK, I have verified this. The rank determination is actually based on the linfct slot. In debug mode (on a different example with 6 means), I found:

Browse[2]> object@linfct
       (Intercept)     sprayB     sprayC     sprayD     sprayE     sprayF
 [1,] 1.110223e-16  0.3333333  0.3333333 -0.3333333 -0.3333333 -0.3333333
 [2,] 0.000000e+00  0.0000000  0.0000000 -1.0000000  0.0000000  0.0000000
 [3,] 0.000000e+00  1.0000000  0.0000000  0.0000000 -1.0000000  0.0000000
 [4,] 0.000000e+00  0.0000000  1.0000000  0.0000000  0.0000000 -1.0000000
 [5,] 0.000000e+00  1.0000000  0.0000000  0.0000000  0.0000000  0.0000000
 [6,] 0.000000e+00 -1.0000000  1.0000000  0.0000000  0.0000000  0.0000000
 [7,] 0.000000e+00  0.0000000 -1.0000000  1.0000000  0.0000000  0.0000000
 [8,] 0.000000e+00  0.0000000  0.0000000 -1.0000000  1.0000000  0.0000000
 [9,] 0.000000e+00  0.0000000  0.0000000  0.0000000 -1.0000000  1.0000000
Browse[2]> qr(object@linfct)$rank
[1] 6

so apparently, even a minuscule value in the intercept column rules the day. If I add zapsmall() to it,

Browse[2]> zapsmall(object@linfct)
      (Intercept)     sprayB     sprayC     sprayD     sprayE     sprayF
 [1,]           0  0.3333333  0.3333333 -0.3333333 -0.3333333 -0.3333333
 [2,]           0  0.0000000  0.0000000 -1.0000000  0.0000000  0.0000000
 [3,]           0  1.0000000  0.0000000  0.0000000 -1.0000000  0.0000000
 [4,]           0  0.0000000  1.0000000  0.0000000  0.0000000 -1.0000000
 [5,]           0  1.0000000  0.0000000  0.0000000  0.0000000  0.0000000
 [6,]           0 -1.0000000  1.0000000  0.0000000  0.0000000  0.0000000
 [7,]           0  0.0000000 -1.0000000  1.0000000  0.0000000  0.0000000
 [8,]           0  0.0000000  0.0000000 -1.0000000  1.0000000  0.0000000
 [9,]           0  0.0000000  0.0000000  0.0000000 -1.0000000  1.0000000
Browse[2]> qr(zapsmall(object@linfct))$rank
[1] 5

so I modified the code accordingly. Not sure when I'll push that up, but that should at least help.

Thanks again

rvlenth · 2020-02-09T03:42:25Z

Highlight the code and click on <> in the toolbar. OR manually type a line with 3 back-ticks before and after the code block (```r to get syntax highlighting)

zkuizx · 2020-02-09T04:08:51Z

Hi, Russel,

Thank you very much for the quick response! This is really helpful! I will definitely try the tricks you mentioned to me in future.

Best wishes,

rvlenth · 2020-02-10T21:33:49Z

I think this is resolved, so am closing.

meierluk · 2021-05-10T12:31:30Z

Could it be the case the scheffe.rank is not working when there is only one contrast?

> data("PlantGrowth")
> fit <- aov(weight ~ group, data = PlantGrowth)
> fit.emm <- emmeans(fit, specs = ~ group)
> 
> ## Want to test the contrast c(1/2, -1, 1/2) with Scheffé
> 
> ## The following just gives the uncorrected p-value (of the regular t-test)
> summary(contrast(fit.emm, method = list(c(1/2, -1, 1/2)), adjust = "scheffe"),
+         scheffe.rank = 2)
 contrast        estimate    SE df t.ratio p.value
 c(0.5, -1, 0.5)    0.618 0.241 27 2.560   0.0164 

> 
> ## Adding a second contrast seems to help
> summary(contrast(fit.emm, method = list(c(1/2, -1, 1/2),
+                                         c(1, 0, -1)), adjust = "scheffe"),
+         scheffe.rank = 2)
 contrast        estimate    SE df t.ratio p.value
 c(0.5, -1, 0.5)    0.618 0.241 27  2.560  0.0532 
 c(1, 0, -1)       -0.494 0.279 27 -1.772  0.2265 

P value adjustment: scheffe method with rank 2

rvlenth · 2021-05-10T12:43:50Z

That could be. I think I have some code that disables all multiplicity adjustments when there is only one test. It does seem though that in principle, scheffe.rank should override it. I'll take a look at it.

rvlenth · 2021-05-10T21:00:00Z

Indeed. It took more doing than you might think to correct this. But definitely, if we want Scheffe adjustment with specified rank, it should not ignore that. Now I have, with your example:

> confint(contrast(fit.emm, method = list(c(1/2, -1, 1/2)), adjust = "sch"), scheffe.rank = 2)
 contrast        estimate    SE df lower.CL upper.CL
 c(0.5, -1, 0.5)    0.618 0.241 27 -0.00732     1.24

Confidence level used: 0.95 
Conf-level adjustment: scheffe method with rank 2 

> test(contrast(fit.emm, method = list(c(1/2, -1, 1/2)), adjust = "sch"), scheffe.rank = 2)
 contrast        estimate    SE df t.ratio p.value
 c(0.5, -1, 0.5)    0.618 0.241 27 2.560   0.0532 

P value adjustment: scheffe method with rank 2 

> test(contrast(fit.emm, method = list(c(1/2, -1, 1/2)), adjust = "bon"), scheffe.rank = 2)
 contrast        estimate    SE df t.ratio p.value
 c(0.5, -1, 0.5)    0.618 0.241 27 2.560   0.0164

So we have it working right for both CIs and tests. The last run verifies that it still ignores families of one in a non-Scheffe situation, even if we specified a scheffe rank.

One more test case:

> test(contrast(fit.emm, method = list(c(1/2, -1, 1/2)), adjust = "sch"), scheffe.rank = 1)
 contrast        estimate    SE df t.ratio p.value
 c(0.5, -1, 0.5)    0.618 0.241 27 2.560   0.0164 

P value adjustment: scheffe method with rank 1

It could have ignored this case since scheffe.rank == 1, but I think it is OK to explicitly annotate that anyway.

Thanks for reporting this.

Correction to Scheffe correction bug (#171 addendum)

meierluk · 2021-05-11T06:19:03Z

Wonderful, thanks for the fast fix!

rvlenth · 2021-06-30T15:43:57Z

I think this issue is resolved, so am closing.

rvlenth added enhancement question labels Feb 8, 2020

rvlenth added a commit that referenced this issue Feb 8, 2020

Added manually-specified Scheffe rank (#171)

80df0bb

rvlenth closed this as completed Feb 10, 2020

rvlenth added the bug label May 10, 2021

rvlenth reopened this May 10, 2021

rvlenth added a commit that referenced this issue May 10, 2021

Transformed scales in plots

be94689

Correction to Scheffe correction bug (#171 addendum)

rvlenth closed this as completed Jun 30, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dimensionality of Scheffe Method #171

Dimensionality of Scheffe Method #171

zkuizx commented Feb 6, 2020

rvlenth commented Feb 6, 2020 via email •

edited

zkuizx commented Feb 6, 2020

rvlenth commented Feb 8, 2020

zkuizx commented Feb 8, 2020

rvlenth commented Feb 9, 2020

zkuizx commented Feb 9, 2020 •

edited by rvlenth

rvlenth commented Feb 9, 2020

rvlenth commented Feb 9, 2020 •

edited

zkuizx commented Feb 9, 2020

rvlenth commented Feb 10, 2020

meierluk commented May 10, 2021

rvlenth commented May 10, 2021 via email •

edited

rvlenth commented May 10, 2021

meierluk commented May 11, 2021

rvlenth commented Jun 30, 2021

Dimensionality of Scheffe Method #171

Dimensionality of Scheffe Method #171

Comments

zkuizx commented Feb 6, 2020

rvlenth commented Feb 6, 2020 via email • edited

zkuizx commented Feb 6, 2020

rvlenth commented Feb 8, 2020

zkuizx commented Feb 8, 2020

rvlenth commented Feb 9, 2020

zkuizx commented Feb 9, 2020 • edited by rvlenth

rvlenth commented Feb 9, 2020

rvlenth commented Feb 9, 2020 • edited

zkuizx commented Feb 9, 2020

rvlenth commented Feb 10, 2020

meierluk commented May 10, 2021

rvlenth commented May 10, 2021 via email • edited

rvlenth commented May 10, 2021

meierluk commented May 11, 2021

rvlenth commented Jun 30, 2021

rvlenth commented Feb 6, 2020 via email •

edited

zkuizx commented Feb 9, 2020 •

edited by rvlenth

rvlenth commented Feb 9, 2020 •

edited

rvlenth commented May 10, 2021 via email •

edited