Numeric p-value "always" 1 or 0? #13

Closed
wleoncio opened this issue Feb 21, 2024 · 2 comments

@wleoncio
Member

wleoncio commented Feb 21, 2024

I am having a hard time finding an example where the analytic p-value is NA and the numeric one is neither 0 nor 1.

I wonder if the numeric p-value (calculated below) should be computed based on >= or >.

perm_p_value <- sum(perm_chisq_bar >= chisq_bar) / n_perm

If chisq_bar = 0, then every value in perm_chisq_bar is equal to or larger than it, so the numeric p-value is exactly 1. This assumes that the statistic below (eq. 5 from Chacko 1966) is always non-negative, which seems to be the case:

[image: equation 5 from Chacko (1966)]
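To make the difference between the two comparisons concrete, here is a minimal sketch. The permuted statistics are made-up numbers chosen for illustration, not output from permChacko; only the names chisq_bar, perm_chisq_bar, and n_perm come from the line of code quoted above.

```r
# Made-up permuted statistics: non-negative, with several exact zeros
perm_chisq_bar <- c(0, 0, 0, 0.5, 1.2, 0, 2.3, 0, 0.8, 0)
n_perm <- length(perm_chisq_bar)

# Case 1: the observed statistic is exactly zero
chisq_bar <- 0
p_geq <- sum(perm_chisq_bar >= chisq_bar) / n_perm  # every value counts
p_gt <- sum(perm_chisq_bar > chisq_bar) / n_perm    # only strictly positive values count

# Case 2: the observed statistic exceeds every permuted value
chisq_bar_large <- 5
p_geq_large <- sum(perm_chisq_bar >= chisq_bar_large) / n_perm
p_gt_large <- sum(perm_chisq_bar > chisq_bar_large) / n_perm

print(c(p_geq = p_geq, p_gt = p_gt))
print(c(p_geq_large = p_geq_large, p_gt_large = p_gt_large))
```

With these numbers, >= gives 1 in the first case while > gives 0.4, and both give 0 in the second case. The two estimators differ only by the proportion of ties.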

The code below is a template for benchmarking the >= and > solutions. Try different values for runs, reps, and vec. Bias and error are some simple ways to compare the estimates against the tabular value:

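library(permChacko)

runs <- 1000L  # number of independent permChacko() calls
reps <- 1000L  # permutations per call
vec <- c(6, 8, 4, 10, 7, 3, 2)
# Collect the p-values returned by each run
p_values <- sapply(seq_len(runs), function(x) permChacko(vec, n_perm = reps)[["p_values"]])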
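As a rough sketch of how bias and error could be summarized once a set of estimates is in hand: summarize_estimates is a hypothetical helper, and both the estimates and the reference value below are made-up numbers standing in for real benchmark output and the tabular p-value.

```r
# Hypothetical helper: bias and root mean squared error against a reference value
summarize_estimates <- function(estimates, reference) {
  errors <- estimates - reference
  c(bias = mean(errors), rmse = sqrt(mean(errors^2)))
}

# Made-up numeric p-value estimates and a made-up reference (tabular) p-value
estimates <- c(0.04, 0.06, 0.05, 0.07, 0.03)
reference <- 0.05

result <- summarize_estimates(estimates, reference)
print(result)
```

Running the same summary on the >= and > estimates, against the same reference, would give a like-for-like comparison of the two.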
wleoncio self-assigned this Feb 21, 2024
@wleoncio
Member Author

wleoncio commented Mar 7, 2024

As a first attempt at studying the issue through simulations, please see this script, which produced the following output the last time I ran it:

[image: output of the simulation script]

This may indicate that > is the preferable estimator, especially when the analytic solution is absent.

To reproduce this (results vary due to randomization), you need the package version on the issue-13 branch. I used version 0.2.0.9000-1709818294 for this post.

remotes::install_github("ocbe-uio/permChacko@issue-13")

@wleoncio
Member Author

wleoncio commented Apr 5, 2024

Solution: use mid-P values (Lancaster, 1961).
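For reference, the mid-p value counts strict exceedances in full and ties at half weight, so it sits halfway between the > and >= estimates. A minimal sketch follows; mid_p_value is a hypothetical helper written for illustration, not the code that ended up in permChacko, and the permuted statistics are made-up numbers.

```r
# Hypothetical helper: mid-p value of a permutation test
# (strict exceedances plus half of the ties, divided by the number of permutations)
mid_p_value <- function(perm_stat, obs_stat) {
  n_greater <- sum(perm_stat > obs_stat)
  n_equal <- sum(perm_stat == obs_stat)
  (n_greater + 0.5 * n_equal) / length(perm_stat)
}

# Made-up permuted statistics with several exact zeros
perm_chisq_bar <- c(0, 0, 0, 0.5, 1.2, 0, 2.3, 0, 0.8, 0)

mid_p_zero <- mid_p_value(perm_chisq_bar, 0)   # 4 greater, 6 tied
mid_p_large <- mid_p_value(perm_chisq_bar, 5)  # nothing greater or tied
print(c(mid_p_zero = mid_p_zero, mid_p_large = mid_p_large))
```

Here an observed statistic of 0 gives 0.7 rather than 1 or 0.4. The exact == comparison is an assumption that suits this toy data; with floating-point statistics, a tolerance-based comparison may be safer.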
