P-value Rounding #257
Could you please provide a small reproducible example of this behavior? The reprex package may be quite handy here. |
library(infer)
library(tidyverse)
head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3.0 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5.0 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
diff_means <- iris %>%
filter(Species != "versicolor") %>%
specify(Sepal.Length ~ Species) %>%
calculate(stat = "diff in means",
order = c("setosa", "virginica"))
permute <- iris %>%
filter(Species != "versicolor") %>%
specify(Sepal.Length ~ Species) %>%
hypothesize(null = "independence") %>%
generate(reps = 1000, type = "permute") %>%
calculate(stat = "diff in means",
order = c("setosa", "virginica"))
get_pvalue(permute, diff_means, direction = "two_sided")
#> # A tibble: 1 x 1
#> p_value
#> <dbl>
#> 1 0
Created on 2019-10-09 by the reprex package (v0.3.0) |
I believe this is a case where the p-value is actually effectively zero. If you plot the observed difference in the context of the null distribution:
library(infer)
library(tidyverse)
diff_means <- iris %>%
filter(Species != "versicolor") %>%
specify(Sepal.Length ~ Species) %>%
calculate(stat = "diff in means",
order = c("setosa", "virginica"))
permute <- iris %>%
filter(Species != "versicolor") %>%
specify(Sepal.Length ~ Species) %>%
hypothesize(null = "independence") %>%
generate(reps = 1000, type = "permute") %>%
calculate(stat = "diff in means",
order = c("setosa", "virginica"))
permute %>%
visualize() +
shade_p_value(obs_stat = diff_means, direction = "two_sided")
Created on 2019-10-09 by the reprex package (v0.3.0)
It is such an extreme value that reporting a p-value of zero seems reasonable to me. |
Yes, you are correct. In this case, the p-value is negligible and effectively zero. Yet, to students, the p-value reported appears to be exactly zero. Concerningly, when using a parametric approach with a t-test, the p-value reported is not exactly 0, but a very small value.
library(infer)
library(tidyverse)
iris %>%
filter(Species != "versicolor") %>%
t_test(Sepal.Length ~ Species,
order = c("setosa", "virginica"),
alternative = "two_sided")
#> # A tibble: 1 x 6
#> statistic t_df p_value alternative lower_ci upper_ci
#> <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 -15.4 76.5 3.97e-25 two.sided -1.79 -1.38
Created on 2019-10-10 by the reprex package (v0.3.0)
Generally, we teach students that the probability of any single value on a continuous distribution is never 0. Furthermore, when employing a permutation test, a p-value is never exactly 0. Instead, it is less than 1 divided by the number of replications you performed. I suggest two possible options to alleviate these concerns when the p-value is effectively 0:
|
For reference: very similar issues are discussed in #205, mainly after this comment. |
Yes, the issues brought up in this issue reflect a similar sentiment towards the reporting of an exact 0 for a p-value when using a permutation test. I agree with @TimHesterberg that "a p-value of zero is literally impossible. You shouldn't report an impossible p-value" (comment). As @ismayc correctly asserted, a p-value in a permutation test is "dependent on how many samples we generated" (comment). Therefore, I believe it would be reasonable to take either of the two approaches I proposed above to remedy this output issue. Adding a simple check to see if the p-value is below a given threshold and modifying the output to be "~0" or "< 1/reps" seems to be a reasonable resolution. |
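The threshold check proposed above could be sketched as a tiny helper (Python for illustration; `format_p_value` is a hypothetical name, not part of infer, and infer's actual output is a numeric tibble, which is exactly why coercing to a string is debatable):

```python
def format_p_value(n_extreme, reps):
    """Hypothetical helper: report a permutation p-value, never as exactly 0.
    n_extreme is the number of null replicates at least as extreme as the
    observed statistic; reps is the total number of replicates generated."""
    if n_extreme == 0:
        return f"< {1 / reps}"      # e.g. "< 0.001" when reps = 1000
    return str(n_extreme / reps)

print(format_p_value(0, 1000))   # "< 0.001"
print(format_p_value(12, 1000))  # "0.012"
```

The drawback, raised in the next comment, is that this turns a numeric column into a character one.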
Hmmm... I just realized that, from a programming perspective, these approaches are tricky. Output of |
Maybe returning a warning stating that a value of 0 is based on the number of simulations chosen and is more likely to be very small would suffice instead of trying to return a string here instead. We could also create a vignette showing this expected behavior so that users are aware of it before it actually happens. |
@echasnovski, I see the difficulty with coercing the p-value to a character and how that may result in issues down the road. I agree with @ismayc that adding in a warning based on the size of the p-value or the number of simulations would be sufficient to make students think twice before reporting a p-value of 0. |
Based on the discussion, the best compromise seems to be adding a warning if output is exactly 0, as @ismayc suggested. Something like "Note that zero p-value is based on number of random samples computed in `generate()` step. Here it means that p-value is likely to be less than {1/reps}.", where reps is the number of samples computed in the `generate()` step (i.e. the number of samples drawn from the null distribution). @atheobold, is this good enough? |
If the true P-value is 3/reps, there is a 5% chance of observing zero (using a Poisson approximation). So a 95% one-sided CI for the true P-value is (0, 3/reps).
|
@echasnovski, I like your compromise. While @TimHesterberg's suggestion is interesting, I don't feel that the output needs to contain a confidence interval for the p-value. Rather, I would appreciate it if, when the output of get_pvalue() is 0, it came with a word of caution. This will help the teachers who use the infer package when teaching permutation tests to better convey to students how p-values from permutation tests should be reported. |
I wasn't thinking of explicitly giving a CI. Instead, I suggest replacing the statement "Here it means that p-value is likely to be less than {1/reps}." with "Here it means that p-value is likely to be less than {3/reps}."
|
Deciding what to use as the upper bound of the p-value when it is zero proved to be challenging. @TimHesterberg, I'll be honest: I didn't really understand the reasoning behind choosing {3/reps}.

That got me thinking about how to approach this, and I ended up with the following framework I can understand. Each time a value is generated from a null distribution, it can land inside the critical region (the region whose probability under the null distribution is, by definition, the p-value) or outside of it. So "is the value inside the region" can be viewed as sampling from a Bernoulli distribution whose main parameter constitutes the desired p-value. A confidence interval for that parameter can thus be viewed as a confidence interval for the p-value. I knew this should be something meaningful when I found this answer on Cross Validated, which states that under certain conditions a 95% CI for a zero estimate is approximately (0, 3/n).

So I suggest the following warning: "Note that zero p-value is based on number of random samples computed in `generate()` step. Here it means that p-value is 95% likely to be between 0 and {ci[2]}.". @andrewpbray, @ismayc, any thoughts?

By the way, this also means that we can implement a confidence interval for the p-value itself. |
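A minimal sketch of this Bernoulli view (Python for illustration; `p_value_ci` is a hypothetical name, not part of infer, and the Wilson score interval is one standard choice of binomial-proportion CI that stays sensible at zero counts):

```python
import math

def p_value_ci(n_extreme, reps, conf=0.95):
    """Wilson score interval for a permutation p-value, treating each
    replicate's 'is it at least as extreme as observed?' as a Bernoulli draw."""
    z = 1.959963984540054          # 97.5% standard-normal quantile
    phat = n_extreme / reps
    denom = 1 + z**2 / reps
    center = (phat + z**2 / (2 * reps)) / denom
    half = z * math.sqrt(phat * (1 - phat) / reps + z**2 / (4 * reps**2)) / denom
    return max(0.0, center - half), center + half

# With 0 extreme replicates out of 1000, the 95% CI is roughly (0, 0.0038),
# close in spirit to the (0, 3/n) rule of three.
lo, hi = p_value_ci(0, 1000)
print(round(lo, 5), round(hi, 5))
```

Note the upper limit for a zero count is about z²/reps ≈ 3.84/reps, in the same ballpark as the 3/reps and 3.7/reps figures discussed in this thread.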
I fear that adding this much content to a warning message is going to further confuse some introductory statistics students when they see it. Something like "Please be cautious. A value of 0 for the p-value here is an approximation based on the number of `reps` used; the true p-value is likely less than {1/reps}." might suffice. I used 1 instead of 3 above since we probably shouldn't assume someone using the {infer} package has experience with distributions like the Poisson. Alternatively, a vignette could be created and then linked in the warning message to explain what is going on, so the message isn't so wordy. |
Ah, so not replace 1/reps and reps with actual values? Yeah, this might work and is a simpler implementation. However, using 1/reps does seem like an imprecise alternative. So I'll soon just add a simple warning without much extra information about the upper bound of the p-value. |
1/reps is misleading, because it implies too much precision. We want to subtly steer people away from using too few samples, by showing them the actual precision rather than letting them just assume their answer is near perfect.

I'd suggest "The true P-value could easily be as much as {3/reps}. {Use the actual value.}" or "The true P-value could easily be as much as {3.7/reps}. {Use the actual value.}" The latter value is from poisson.test(0). Here a Poisson is a good approximation to the binomial. The default is a 95% two-sided interval, which seems reasonable.

Similarly, give warnings if other small numbers are observed. If we observe 1, then "The true P-value could easily be as much as {5.57/reps}." If we observe 2, then "The true P-value could easily be as much as {7.22/reps}." See:
poisson.test(0)
poisson.test(1)
poisson.test(2)
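For reference, those upper limits can be reproduced outside R by numerically inverting the exact Poisson CDF (a Python sketch; `poisson_upper` is a hypothetical name mirroring the upper confidence limit that poisson.test reports):

```python
import math

def poisson_upper(x, conf=0.95):
    """Upper limit of an exact two-sided Poisson CI for x observed events:
    the mean mu at which P(X <= x | mu) equals (1 - conf) / 2."""
    target = (1 - conf) / 2

    def cdf(mu):  # P(X <= x) for a Poisson(mu) variable
        return sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(x + 1))

    lo, hi = 0.0, 50.0
    for _ in range(200):          # bisection; cdf(mu) is decreasing in mu
        mid = (lo + hi) / 2
        if cdf(mid) > target:
            lo = mid
        else:
            hi = mid
    return lo

for x in (0, 1, 2):
    print(x, round(poisson_upper(x), 2))   # 0 3.69, 1 5.57, 2 7.22
```

Dividing each limit by reps gives the {3.7/reps}, {5.57/reps}, and {7.22/reps} figures quoted above.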
|
One reason we avoided all of this is that the stats student using {infer} has been taught to think about the p-value in an intuitive way. I understand the concerns about imprecision, so feel free to adapt as needed, but a warning message saying something like 7.22/1000 is going to be more confusing than it is worth to an intro stats reader while inside R. I recommend adding a simple warning to be cautious of the p-value when it is 0 (or sufficiently small) in R, and then adding a vignette that goes into a lot more detail and linking to that in the warning. In the vignette, all of this discussion can be covered in much greater depth. This helps solve both problems: being cautious of reps size and providing further information as needed, without overwhelming a novice learner. |
I think a vignette is a little bit too much. A section in the help page of `get_pvalue()` should be enough. |
That’s fine with me too! If someone wanted to contribute a vignette via PR to explain further, we would welcome that as well. |
I agree with @ismayc in that, when making changes, we should be mindful of who is using the {infer} package. While statistically I agree with the precision of presenting an upper bound for the p-value of a permutation test, this will likely confuse students. Many introductory statistics textbooks which employ simulation-based inference advocate that p-values be reported as less than 1 over the number of replications generated. I appreciate the revised warning message, but I do think that further information about why the value of |
Currently, when a test results in a very small p-value, the default output rounds this p-value to 0. This hinders the conversation about properly reporting the size of your p-value when teaching hypothesis testing.
Could we modify the default output for small p-values?