Result opposite to expected when using same "ref.group =" and "alternative =" parameters with t.test(), compare_means() and stat_compare_means() #118

Closed
OwenDonohoe opened this issue Oct 12, 2018 · 1 comment


OwenDonohoe commented Oct 12, 2018

I've recently been playing around with t.test(), compare_means() and stat_compare_means() using R's example dataset ToothGrowth (the one used in common examples), and I have what may be two simple questions.

Here is what I've been doing

For compare_means and stat_compare_means I prepare the data like so.

data("ToothGrowth")
df = ToothGrowth

Note: I only use the "len" (containing measurements) and "supp" (containing group info, VC or OJ) columns in the analysis.

For t.test() I just split the data by the supp column of the ToothGrowth dataset into two numeric vectors like so (as per here)

 VC <- c(4.2, 11.5, 7.3, 5.8, 6.4, 10, 11.2, 11.2, 5.2, 7, 16.5, 16.5, 15.2, 17.3, 22.5, 17.3, 13.6, 14.5, 18.8, 15.5, 23.6, 18.5, 33.9, 25.5, 26.4, 32.5, 26.7, 21.5, 23.3, 29.5) 
OJ <- c(15.2, 21.5, 17.6, 9.7, 14.5, 10, 8.2, 9.4, 16.5, 9.7, 19.7, 23.3, 23.6, 26.4, 20, 25.2, 25.8, 21.2, 14.5, 27.3, 25.5, 26.4, 22.4, 24.5, 24.8, 30.9, 26.4, 27.3, 29.4, 23)
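
The same two vectors can also be pulled straight from the data frame rather than typed by hand, which avoids transcription errors. A minimal base-R sketch, assuming the built-in ToothGrowth dataset; the names `VC_from_df` and `OJ_from_df` are illustrative only:

```r
# Load the built-in dataset (part of R's datasets package)
data("ToothGrowth")

# Subset the measurement column by supplement group
VC_from_df <- ToothGrowth$len[ToothGrowth$supp == "VC"]
OJ_from_df <- ToothGrowth$len[ToothGrowth$supp == "OJ"]

# Each group should hold 30 measurements
length(VC_from_df)
length(OJ_from_df)

# Group means, for a quick sanity check against the t.test() estimates
mean(VC_from_df)
mean(OJ_from_df)
```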

Here, I use these three functions to compare tooth growth in two groups, OJ (larger mean) and VC (smaller mean), from the "supp" column of the ToothGrowth dataset.

Firstly, I used t.test(), as I figured it would be useful to get to know, given that it is used in compare_means() and stat_compare_means(). By placing VC second in the call, it is taken as the reference/control in this test (this is the only conclusion I can arrive at, given the outcome). Here I test the alternative hypothesis that tooth growth in OJ is significantly lower, which it clearly is not. So we expect a large p-value, i.e. no support for this alternative hypothesis. And indeed that is the case.

t.test(OJ, VC, alternative="less", var.equal=TRUE)

Two Sample t-test

data:  OJ and VC
t = 1.9153, df = 58, p-value = 0.9698
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
    -Inf 6.92918
sample estimates:
mean of x mean of y 
 20.66333  16.96333 

If I switch the order of the groups in the call, taking OJ as reference and testing the alternative hypothesis that tooth growth in the VC group is significantly lower (which is the case), I expect the test to support this alternative hypothesis. And indeed that is the case.

t.test(VC, OJ, alternative="less", var.equal=TRUE)

Two Sample t-test

data:  VC and OJ
t = -1.9153, df = 58, p-value = 0.0302
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
       -Inf -0.4708204
sample estimates:
mean of x mean of y 
 16.96333  20.66333 

As an extension of this, if I test a different alternative hypothesis that tooth growth in the VC group is significantly greater, then as expected the test reveals that we should reject this alternative hypothesis.

t.test(VC, OJ, alternative="greater", var.equal=TRUE)

	Two Sample t-test

data:  VC and OJ
t = -1.9153, df = 58, p-value = 0.9698
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -6.92918      Inf
sample estimates:
mean of x mean of y 
 16.96333  20.66333 

All fine
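
The three results above follow from how t.test(x, y) defines its one-sided alternatives: "less" means mean(x) - mean(y) < 0, so swapping x and y, or switching "less" to "greater" with the same order, gives the complementary p-value. A small base-R check of that relationship, assuming the built-in ToothGrowth data (the `p_*` variable names are illustrative):

```r
data("ToothGrowth")
VC <- ToothGrowth$len[ToothGrowth$supp == "VC"]
OJ <- ToothGrowth$len[ToothGrowth$supp == "OJ"]

# One-sided, pooled-variance tests as in the calls above
p_oj_less    <- t.test(OJ, VC, alternative = "less",    var.equal = TRUE)$p.value
p_vc_less    <- t.test(VC, OJ, alternative = "less",    var.equal = TRUE)$p.value
p_vc_greater <- t.test(VC, OJ, alternative = "greater", var.equal = TRUE)$p.value

# Swapping the argument order gives the complementary p-value
isTRUE(all.equal(p_oj_less + p_vc_less, 1))

# Switching the direction with the same order is also complementary
isTRUE(all.equal(p_vc_less + p_vc_greater, 1))
```

So the position of each group in the call, not just the choice of "less" or "greater", decides which one-sided hypothesis is actually tested.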

In agreement with this, if I use method = "t.test" with the same data and the same parameters to test the same hypotheses in compare_means(), I get the same answers as t.test() when using OJ as the reference group and testing the "less" or "greater" alternative hypotheses.

>compare_means(len ~ supp, data = df, ref.group = "OJ", method = "t.test", alternative = "less", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2      p p.adj p.format p.signif method
  <chr> <chr>  <chr>   <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.0302  0.03 0.03     *        T-test
>compare_means(len ~ supp, data = df, ref.group = "OJ", method = "t.test", alternative = "greater", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2     p p.adj p.format p.signif method
  <chr> <chr>  <chr>  <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.970  0.97 0.97     ns       T-test


All great, everything checks out so far.

Problem with stat_compare_means()

However, if I perform the exact same tests using stat_compare_means(), I get the opposite answer.

outfile="box_plot_VC_Less_Test.pdf"
pdf(file=outfile)
ggboxplot(df, 
          x = "supp",
          y = "len", 
          color = "supp", 
          palette = "npg", 
          add = "jitter")+
          stat_compare_means(method = "t.test", 
                             ref.group = "OJ", 
                             method.args = list(alternative = "less", 
                                                var.equal=TRUE))
dev.off()

The p-value from the test above is 0.97 (see PDF of resulting plot), indicating no support for the alternative hypothesis that tooth growth in the VC group is smaller than in OJ (even though the other two methods give the opposite answer).

outfile="box_plot_VC_Greater_Test.pdf"
pdf(file=outfile)
ggboxplot(df, 
          x = "supp",
          y = "len", 
          color = "supp", 
          palette = "npg", 
          add = "jitter")+
          stat_compare_means(method = "t.test", 
                             ref.group = "OJ", 
                             method.args = list(alternative = "greater", 
                                                var.equal=TRUE))
dev.off()

I then get the same reversed result when I test the opposite alternative hypothesis, that tooth growth in the VC group is greater (a hypothesis I know is not supported, from the earlier results). In this case the alternative hypothesis holds, with a p-value of 0.03 (?). The resulting plot itself suggests otherwise.

I'm unsure why this is happening; hopefully it's just something small that I'm doing wrong. Any thoughts?

Problem with compare_means()

When I returned to compare_means() to play around with the "ref.group" argument as part of troubleshooting, I noticed something else I don't understand. When I switch the reference group from the earlier tests, rather than getting the opposite results, there is no change.

Example

OJ as reference group = p-value of 0.0302

>compare_means(len ~ supp, data = df, ref.group = "OJ", method = "t.test", alternative = "less", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2      p p.adj p.format p.signif method
  <chr> <chr>  <chr>   <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.0302  0.03 0.03     *        T-test

Same test but with VC as reference group = p-value of 0.0302

> compare_means(len ~ supp, data = df, ref.group = "VC", method = "t.test", alternative = "less", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2      p p.adj p.format p.signif method
  <chr> <chr>  <chr>   <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   VC     OJ     0.0302  0.03 0.03     *        T-test

And the same if I test with the opposite alternative hypothesis. Different reference groups, but the same answer

>compare_means(len ~ supp, data = df, ref.group = "OJ", method = "t.test", alternative = "greater", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2     p p.adj p.format p.signif method
  <chr> <chr>  <chr>  <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.970  0.97 0.97     ns       T-test
> compare_means(len ~ supp, data = df, ref.group = "VC", method = "t.test", alternative = "greater", var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2     p p.adj p.format p.signif method
  <chr> <chr>  <chr>  <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   VC     OJ     0.970  0.97 0.97     ns       T-test

Am I missing something here? Any thoughts on this would be much appreciated.

@kassambara
Owner

Thank you for reporting this issue, fixed now!

The option ref.group was only considered when the grouping variable contains more than two levels. In that case, each level is compared against the specified reference group. Now, the ref.group option is also considered in two-sample mean comparisons.

library(ggpubr)
compare_means(len ~ supp, data = ToothGrowth, ref.group = "OJ", 
              method = "t.test", alternative = "less", 
              var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2      p p.adj p.format p.signif method
  <chr> <chr>  <chr>   <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   OJ     VC     0.0302  0.03 0.03     *        T-test
compare_means(len ~ supp, data = ToothGrowth, ref.group = "VC", 
              method = "t.test", alternative = "less", 
              var.equal=TRUE)
# A tibble: 1 x 8
  .y.   group1 group2     p p.adj p.format p.signif method
  <chr> <chr>  <chr>  <dbl> <dbl> <chr>    <chr>    <chr> 
1 len   VC     OJ     0.970  0.97 0.97     ns       T-test
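
For anyone who wants to cross-check these two p-values without ggpubr, here is a base-R sketch. It assumes that, with two groups, the test compares the other group against the reference group; that is an inference from the output above, not something taken from the ggpubr source. `p_vs_ref` is a hypothetical helper, not a ggpubr function:

```r
data("ToothGrowth")

# Hypothetical helper (not part of ggpubr): one-sided, pooled-variance t-test
# of the non-reference group against the reference group.
p_vs_ref <- function(ref, alternative) {
  ref_vals   <- ToothGrowth$len[ToothGrowth$supp == ref]
  other_vals <- ToothGrowth$len[ToothGrowth$supp != ref]
  t.test(other_vals, ref_vals,
         alternative = alternative, var.equal = TRUE)$p.value
}

round(p_vs_ref("OJ", "less"), 4)  # 0.0302, i.e. t.test(VC, OJ, "less")
round(p_vs_ref("VC", "less"), 4)  # 0.9698, i.e. t.test(OJ, VC, "less")
```

Under that assumption, the values agree with the t.test() calls from the original report.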
