Some issues with p.adjust.method and combining p.values from multiple tests #28
PROBLEM 1
This is not a real problem; the output is expected by design. As explained in the Details section of the function documentation, for grouped data the p-value adjustment is performed for each group level independently. In your example, there is only one comparison per label, so the p-value remains unchanged whether or not you specify the p.adjust.method option. In the following example, we have 3 comparisons per group level:
library(rstatix)
# Data
df <- ToothGrowth
df$dose <- factor(df$dose)
# Test 1
df %>%
group_by(supp) %>%
wilcox_test(len ~ dose)
#> # A tibble: 6 x 10
#> supp .y. group1 group2 n1 n2 statistic p p.adj
#> * <fct> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl>
#> 1 OJ len 0.5 1 10 10 7.5 1.00e-3 3.00e-3
#> 2 OJ len 0.5 2 10 10 0 1.81e-4 5.43e-4
#> 3 OJ len 1 2 10 10 26.5 8.20e-2 8.20e-2
#> 4 VC len 0.5 1 10 10 0 1.80e-4 5.40e-4
#> 5 VC len 0.5 2 10 10 0 1.82e-4 5.40e-4
#> 6 VC len 1 2 10 10 3 4.35e-4 5.40e-4
#> # … with 1 more variable: p.adj.signif <chr>
# Test 2
df %>%
group_by(supp) %>%
wilcox_test(len ~ dose, p.adjust.method = "none")
#> # A tibble: 6 x 10
#> supp .y. group1 group2 n1 n2 statistic p p.adj
#> * <fct> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl>
#> 1 OJ len 0.5 1 10 10 7.5 1.00e-3 1.00e-3
#> 2 OJ len 0.5 2 10 10 0 1.81e-4 1.81e-4
#> 3 OJ len 1 2 10 10 26.5 8.20e-2 8.20e-2
#> 4 VC len 0.5 1 10 10 0 1.80e-4 1.80e-4
#> 5 VC len 0.5 2 10 10 0 1.82e-4 1.82e-4
#> 6 VC len 1 2 10 10 3 4.35e-4 4.35e-4
#> # … with 1 more variable: p.adj.signif <chr>
Created on 2020-02-17 by the reprex package (v0.3.0)
PROBLEM 2
This was also by design, to have a fixed output format for pairwise comparisons, where p-values are supposed to always be adjusted.
PROBLEM 3
From my point of view, it makes no sense to combine the p-values of two completely different tests and then adjust them together. So, I would proceed as follows:
# Required package --------------------
library(ggpubr)
#> Loading required package: ggplot2
#> Loading required package: magrittr
library(rstatix) # rstatix version: 0.4.0
#>
#> Attaching package: 'rstatix'
#> The following object is masked from 'package:stats':
#>
#> filter
# I'm on Fedora 31 with R updated and all packages updated, if that detail is relevant
# Generate dataset ------------------------
set.seed(24)
labels <- c("A", "B", "C", "D", "E", "F", "G")
test_dataset <- data.frame(label = NULL, value = NULL)
means_vector <- c(75, 125, 75, 55, 45, 35, 25)
for (i in seq_along(means_vector)) {
  ran_nums <- rnorm(n = 20, mean = means_vector[i], sd = 50)
  temp_data <- data.frame(label = labels[i], value = ran_nums)
  test_dataset <- rbind(test_dataset, temp_data)
}
# Tests -------------------
test_dataset_pvalues_1 <- test_dataset %>%
group_by(label) %>%
wilcox_test(value ~ 1, mu = 100) %>%
adjust_pvalue(method = "BH")
# Then I need to make some pairwise comparisons
comparisons_list <- list(c("A", "B"), c("A", "D"), c("A", "F"))
test_dataset_pvalues_2 <- test_dataset %>%
wilcox_test(value ~ label, comparisons = comparisons_list, p.adjust.method = "BH")
# Plot the data -------------------------------
myplot <- ggplot(test_dataset, aes(x = label, y = value)) +
geom_boxplot() +
geom_hline(yintercept = 100, linetype = 2, colour = "red")
# 1) Plot + One-sample test result
test_dataset_pvalues_1 <- test_dataset_pvalues_1 %>%
add_xy_position(x = "label") %>%
p_round(p.adj, digits = 2)
myplot +
stat_pvalue_manual(test_dataset_pvalues_1, label = "p.adj",
remove.bracket = TRUE, hjust = 1)
# 2) Plot + Two-sample test result
test_dataset_pvalues_2 <- test_dataset_pvalues_2 %>%
add_xy_position(x = "label")
myplot +
stat_pvalue_manual(test_dataset_pvalues_2, label = "p.adj", tip.length = 0)
Created on 2020-02-18 by the reprex package (v0.3.0) |
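To make the difference between adjusting within each group and adjusting everything together concrete, here is a minimal sketch using only base R's p.adjust() and ave(). The p-values and the column names p.adj.by.group and p.adj.pooled are made up for illustration; this is not how rstatix computes its output internally, only the same idea applied by hand.

```r
# Illustrative p-values (invented for this sketch, not taken from the thread)
res <- data.frame(
  group = c("OJ", "OJ", "OJ", "VC", "VC", "VC"),
  p     = c(0.01, 0.02, 0.03, 0.01, 0.02, 0.03)
)

# Adjust within each group level: each family contains 3 tests
res$p.adj.by.group <- ave(
  res$p, res$group,
  FUN = function(p) p.adjust(p, method = "bonferroni")
)

# Adjust all p-values together as a single family of 6 tests
res$p.adj.pooled <- p.adjust(res$p, method = "bonferroni")

# With one test per group, a within-group adjustment cannot change anything
single <- data.frame(group = c("A", "B", "C"), p = c(0.01, 0.02, 0.03))
single$p.adj.by.group <- ave(
  single$p, single$group,
  FUN = function(p) p.adjust(p, method = "BH")
)

print(res)
print(single)
```

With three tests per group, the within-group Bonferroni values are three times the raw p-values, while the pooled values are six times the raw p-values. With one test per group, the adjusted value equals the raw one, which matches the behaviour described under PROBLEM 1.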
|
Ok, this is the murkier part of multiple testing correction for me.
|
|
The x positions calculated seem to be incorrect when making custom comparisons and not starting with the first label.
So here, even if the comparisons are D vs B, C vs D and B vs C the brackets are drawn from A onwards. |
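As a rough illustration of the positions one would expect here, the sketch below maps each compared label to its index among the x-axis levels using base R only. The names x_levels, brackets, xmin and xmax are assumptions chosen for this example; it does not reproduce how add_xy_position() works internally.

```r
# Labels in the order they appear on the x axis
labels <- c("A", "B", "C", "D", "E", "F", "G")
x_levels <- levels(factor(labels))

# Custom comparisons that do not start with the first label
comparisons <- list(c("D", "B"), c("C", "D"), c("B", "C"))

# One row per comparison, with both group names
brackets <- data.frame(
  group1 = vapply(comparisons, `[`, character(1), 1),
  group2 = vapply(comparisons, `[`, character(1), 2)
)

# Expected bracket ends: the position of each group on the x axis
brackets$xmin <- match(brackets$group1, x_levels)
brackets$xmax <- match(brackets$group2, x_levels)

print(brackets)
```

Under this mapping the D vs B bracket spans positions 4 and 2, C vs D spans 3 and 4, and B vs C spans 2 and 3, so none of the brackets should start at A (position 1).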
|
Thank you for reporting this issue, I will fix it as soon as possible |
|
fixed now, thanks! |
|
Thanks for the fix! |
|
stat_compare_means is from the ggpubr package. It doesn't handle multiple testing correction. I'll work on this. Related issue: kassambara/ggpubr#119 |



Hi again,
I'm trying to use your package for stuff I normally do in R. And so as I face difficulties, I'm posting here. If I'm doing something obviously wrong apologies again.
I've three issues this time:
Here is the code from my previous issue again, used as an example, with the issues marked in comments.
I use t_test / wilcox_test.
I think the issues are reproducible in both of them (and maybe in other tests that I've not checked).
Again, if I'm doing something obviously wrong as last time, apologies in advance!