Result opposite to expected when using same "ref.group =" and "alternative =" parameters with t.test(), compare_means() and stat_compare_means() #118

OwenDonohoe · 2018-10-12T15:22:39Z

I've recently been playing around with t.test(), compare_means() and stat_compare_means() using R's example dataset ToothGrowth (the one used in common examples), and I have what may be two simple questions.

Here is what I've been doing

For compare_means and stat_compare_means I prepare the data like so.

data("ToothGrowth")
df = ToothGrowth

Note: I only use the "len" (containing measurements) and "supp" (containing group info, VC or OJ) columns in the analysis.

For t.test() I just split the len values by the supp column of the ToothGrowth dataset into two numeric vectors like so (as per here):

VC <- c(4.2, 11.5, 7.3, 5.8, 6.4, 10, 11.2, 11.2, 5.2, 7, 16.5, 16.5, 15.2, 17.3, 22.5, 17.3, 13.6, 14.5, 18.8, 15.5, 23.6, 18.5, 33.9, 25.5, 26.4, 32.5, 26.7, 21.5, 23.3, 29.5)
OJ <- c(15.2, 21.5, 17.6, 9.7, 14.5, 10, 8.2, 9.4, 16.5, 9.7, 19.7, 23.3, 23.6, 26.4, 20, 25.2, 25.8, 21.2, 14.5, 27.3, 25.5, 26.4, 22.4, 24.5, 24.8, 30.9, 26.4, 27.3, 29.4, 23)
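Instead of typing the values by hand, the same two vectors can be pulled straight from the data frame. This is a minimal sketch, assuming the built-in ToothGrowth dataset with its `len` and `supp` columns; the names `VC` and `OJ` simply mirror the ones used above.

```r
# Load the built-in dataset and split len by supplement type.
data("ToothGrowth")

VC <- ToothGrowth$len[ToothGrowth$supp == "VC"]
OJ <- ToothGrowth$len[ToothGrowth$supp == "OJ"]

# Quick sanity check: 30 observations per group, OJ has the larger mean.
length(VC)
length(OJ)
mean(OJ) > mean(VC)
```

Deriving the vectors this way avoids copy-paste errors and keeps the t.test() input tied to the same data used by compare_means().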

Here, I use these 3 functions to compare tooth growth in the two groups, OJ (larger mean) and VC (smaller mean), from the "supp" column of the ToothGrowth dataset.

Firstly, I used t.test(), as I figured it would be useful to get to know, given that it's used in compare_means() and stat_compare_means(). By placing VC second in the call, it is taken as the reference/control in this test (this is the only conclusion I can arrive at, given the outcome). Here I test the alternative hypothesis that tooth growth in OJ is significantly lower, which it clearly is not. So we expect a large p-value, i.e. no support for this alternative hypothesis. And indeed that is the case.

t.test(OJ, VC, alternative="less", var.equal=TRUE)

Two Sample t-test

data:  OJ and VC
t = 1.9153, df = 58, p-value = 0.9698
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
    -Inf 6.92918
sample estimates:
mean of x mean of y 
 20.66333  16.96333

If I switch the order of the groups in the call, taking OJ as reference and testing the alternative hypothesis that tooth growth in the VC group is significantly lower (which is the case), I expect the test to support this alternative hypothesis. And indeed that is the case.

t.test(VC, OJ, alternative="less", var.equal=TRUE)

Two Sample t-test

data:  VC and OJ
t = -1.9153, df = 58, p-value = 0.0302
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
       -Inf -0.4708204
sample estimates:
mean of x mean of y 
 16.96333  20.66333

As an extension of this, if I test a different alternative hypothesis, that tooth growth in the VC group is significantly greater, then as expected the test gives no support for this alternative hypothesis.

t.test(VC, OJ, alternative="greater", var.equal=TRUE)

	Two Sample t-test

data:  VC and OJ
t = -1.9153, df = 58, p-value = 0.9698
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -6.92918      Inf
sample estimates:
mean of x mean of y 
 16.96333  20.66333

All fine
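The three results above follow a simple symmetry of one-sided tests: for the same pair of samples the "less" and "greater" p-values sum to 1, and swapping the two samples has the same effect as flipping the alternative. A small sketch in base R that checks this, assuming the built-in ToothGrowth dataset (the variable names are only illustrative):

```r
data("ToothGrowth")
VC <- ToothGrowth$len[ToothGrowth$supp == "VC"]
OJ <- ToothGrowth$len[ToothGrowth$supp == "OJ"]

# One-sided p-values; the alternative always describes x relative to y.
p_vc_less    <- t.test(VC, OJ, alternative = "less",    var.equal = TRUE)$p.value
p_vc_greater <- t.test(VC, OJ, alternative = "greater", var.equal = TRUE)$p.value
p_oj_greater <- t.test(OJ, VC, alternative = "greater", var.equal = TRUE)$p.value

# "less" and "greater" for the same (x, y) are complementary.
all.equal(p_vc_less + p_vc_greater, 1)

# Swapping x and y is equivalent to flipping the alternative.
all.equal(p_vc_less, p_oj_greater)
```

So whenever a wrapper reports 0.97 where 0.03 was expected (or the reverse), the likely difference is only which group was treated as x.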

In agreement with this, if I use the same test (method = "t.test"), the same data and the same parameters to test the same hypotheses in compare_means(), I get the same answers as t.test(), using OJ as the reference group and testing the "less" or "greater" alternative hypotheses.

>compare_means(len ~ supp, data = df, ref.group = "OJ", method = "t.test", alternative = "less", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2      p p.adj p.format p.signif method
  <chr> <chr>  <chr>   <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.0302  0.03 0.03     *        T-test
>compare_means(len ~ supp, data = df, ref.group = "OJ", method = "t.test", alternative = "greater", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2     p p.adj p.format p.signif method
  <chr> <chr>  <chr>  <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.970  0.97 0.97     ns       T-test

All great, everything checks out so far.

Problem with stat_compare_means()

However, if I perform the exact same tests using stat_compare_means(), I get the opposite answer.

outfile="box_plot_VC_Less_Test.pdf"
pdf(file=outfile)
ggboxplot(df, 
          x = "supp",
          y = "len", 
          color = "supp", 
          palette = "npg", 
          add = "jitter")+
          stat_compare_means(method = "t.test", 
                             ref.group = "OJ", 
                             method.args = list(alternative = "less", 
                                                var.equal=TRUE))
dev.off()

The p-value from the test above is 0.97 (see PDF of resulting plot), indicating no support for the alternative hypothesis that tooth growth in the VC group is smaller than in OJ (despite the other two methods giving the opposite answer).

outfile="box_plot_VC_Greater_Test.pdf"
pdf(file=outfile)
ggboxplot(df, 
          x = "supp",
          y = "len", 
          color = "supp", 
          palette = "npg", 
          add = "jitter")+
          stat_compare_means(method = "t.test", 
                             ref.group = "OJ", 
                             method.args = list(alternative = "greater", 
                                                var.equal=TRUE))
dev.off()

I then get the same reversed result when I test the opposite alternative hypothesis, that tooth growth in the VC group is greater (a hypothesis that I know is not supported from the earlier results). In this case the alternative hypothesis appears to hold, with a p-value of 0.03 (?). The resulting plot itself suggests otherwise.

I'm unsure why this is happening; hopefully it's just something small that I'm doing wrong. Any thoughts?
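One base R behaviour that may be relevant here (this is an assumption, not a confirmed diagnosis of what ggpubr does internally): when t.test() is called through its formula interface, the direction of a one-sided test is set by the order of the factor levels, with the first level playing the role of x. In ToothGrowth the levels of supp are "OJ" then "VC", so "less" means "OJ less than VC" unless the factor is releveled. A minimal sketch:

```r
data("ToothGrowth")

# Default level order is OJ, VC, so this tests mean(OJ) - mean(VC) < 0.
levels(ToothGrowth$supp)
p_default <- t.test(len ~ supp, data = ToothGrowth,
                    alternative = "less", var.equal = TRUE)$p.value

# Putting VC first turns the same call into a test of mean(VC) - mean(OJ) < 0.
tg <- ToothGrowth
tg$supp <- relevel(tg$supp, ref = "VC")
p_releveled <- t.test(len ~ supp, data = tg,
                      alternative = "less", var.equal = TRUE)$p.value

p_default    # about 0.97, matching t.test(OJ, VC, alternative = "less")
p_releveled  # about 0.03, matching t.test(VC, OJ, alternative = "less")
```

If a wrapper passes the formula through without accounting for the reference group, the level order alone would decide which of the two p-values is reported.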

Problem with compare_means()

When I return to compare_means() to play around with the "ref.group" argument as part of troubleshooting, I noticed something else I don't understand. When I switch the reference group from the earlier tests, rather than getting the opposite result, there is no change.

Example

OJ as reference group = p-value of 0.0302

>compare_means(len ~ supp, data = df, ref.group = "OJ", method = "t.test", alternative = "less", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2      p p.adj p.format p.signif method
  <chr> <chr>  <chr>   <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.0302  0.03 0.03     *        T-test

Same test but with VC as reference group, p-value = 0.0302

> compare_means(len ~ supp, data = df, ref.group = "VC", method = "t.test", alternative = "less", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2      p p.adj p.format p.signif method
  <chr> <chr>  <chr>   <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   VC     OJ     0.0302  0.03 0.03     *        T-test

And the same happens if I test the opposite alternative hypothesis: different reference groups, but the same answer.

>compare_means(len ~ supp, data = df, ref.group = "OJ", method = "t.test", alternative = "greater", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2     p p.adj p.format p.signif method
  <chr> <chr>  <chr>  <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.970  0.97 0.97     ns       T-test
> compare_means(len ~ supp, data = df, ref.group = "VC", method = "t.test", alternative = "greater", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2     p p.adj p.format p.signif method
  <chr> <chr>  <chr>  <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   VC     OJ     0.970  0.97 0.97     ns       T-test

Am I missing something here? Any thoughts on this would be much appreciated.
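To make the expectation concrete, here is a small base R sketch of what switching the reference group should do under the convention used in this post (the non-reference group is tested against the reference). The helper `p_vs_reference()` is hypothetical, written only for illustration; it is not part of ggpubr.

```r
data("ToothGrowth")

# Hypothetical helper: one-sided (or two-sided) t-test of the non-reference
# group (x) against the reference group (y), for a two-level grouping column.
p_vs_reference <- function(data, value, group, reference,
                           alternative = "two.sided") {
  groups <- unique(as.character(data[[group]]))
  other  <- setdiff(groups, reference)
  stopifnot(reference %in% groups, length(other) == 1)

  x <- data[[value]][data[[group]] == other]      # group being tested
  y <- data[[value]][data[[group]] == reference]  # reference / control
  t.test(x, y, alternative = alternative, var.equal = TRUE)$p.value
}

# Reference OJ: is VC less than OJ? (about 0.0302, as in the t.test() output)
p_ref_oj <- p_vs_reference(ToothGrowth, "len", "supp", "OJ", "less")

# Reference VC: is OJ less than VC? (about 0.9698)
p_ref_vc <- p_vs_reference(ToothGrowth, "len", "supp", "VC", "less")
```

Under this convention, changing the reference group while keeping the same alternative must change the p-value from p to 1 - p; identical p-values for both reference groups suggest the argument is not being applied.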

kassambara · 2018-11-02T08:56:14Z

Thank you for reporting this issue, fixed now!

The ref.group option was previously only considered when the grouping variable contains more than two levels. In that case, each level is compared against the specified reference group. Now, the ref.group option is also considered in two-sample mean comparisons.

library(ggpubr)
compare_means(len ~ supp, data = ToothGrowth, ref.group = "OJ", 
              method = "t.test", alternative = "less", 
              var.equal=TRUE)

# A tibble: 1 x 8
  .y.   group1 group2      p p.adj p.format p.signif method
  <chr> <chr>  <chr>   <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.0302  0.03 0.03     *        T-test

compare_means(len ~ supp, data = ToothGrowth, ref.group = "VC", 
              method = "t.test", alternative = "less", 
              var.equal=TRUE)

# A tibble: 1 x 8
  .y.   group1 group2     p p.adj p.format p.signif method
  <chr> <chr>  <chr>  <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   VC     OJ     0.970  0.97 0.97     ns       T-test

kassambara added a commit that referenced this issue Nov 2, 2018

Now, ref.group option is also considereded in two samples mean compar…

ace05f9

…isons #118

kassambara closed this as completed Nov 8, 2018
