# p-value

The p-value is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct. A very small p-value means that such an extreme observed outcome would be very unlikely if the null hypothesis were true.

A p-value consists of 3 parts:

- the observed data are randomly sampled
- the probability of obtaining a result as extreme (rare) as the observation
- the probability of obtaining a result more extreme (rarer) than the observation


If the null hypothesis specifies the probability distribution of the test statistic uniquely, then the p-value is given by:

- <img src="resources/p_value_right_tail.png" style="width: 300px; margin-left: 4em"/>
- <img src="resources/p_value_left_tail.png" style="width: 300px; margin-left: 4em"/>
- <img src="resources/p_value_both_sided.png" style="width: 450px; margin-left: 4em"/>
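
The three cases (right-tailed, left-tailed, two-sided) can be sketched in code once the distribution of the test statistic under H0 is known. This is a minimal sketch, not a general-purpose implementation: the helper name `p_values` is hypothetical, the test statistic is assumed to be continuous, and the standard normal null distribution with an observed value of 1.96 is an assumption chosen only for illustration.

```python
from statistics import NormalDist


def p_values(t, null_cdf):
    """Return (right-tailed, left-tailed, two-sided) p-values for an observed statistic t.

    null_cdf(x) must return P(T <= x) under H0 for a continuous statistic T.
    """
    left = null_cdf(t)           # P(T <= t | H0)
    right = 1.0 - null_cdf(t)    # P(T >= t | H0); same as P(T > t) for continuous T
    two_sided = 2.0 * min(left, right)
    return right, left, two_sided


# Assumed example: T is standard normal under H0 and the observed value is 1.96.
right, left, two_sided = p_values(1.96, NormalDist().cdf)
print(right, left, two_sided)  # roughly 0.025, 0.975, 0.05
```

For a discrete statistic such as a count of heads, `P(T >= t)` and `P(T > t)` differ, so the right tail has to be computed as `1 - cdf(t - 1)`, as the coin example below does.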

<!-- p-value of a 2 sided test could also be given as
<img src="resources/p_value_both_sided2.png" style="width: 300px; margin-left: 4em"/> -->

https://en.wikipedia.org/wiki/P-value

### Example

An experiment is performed to determine whether a coin is fair. Suppose the coin turns up heads 14 times out of 20 total flips.

#### one-tailed (right-tailed) test

If one is interested only in the possibility that the coin is biased towards heads, then the p-value of this result is the probability of a fair coin landing on heads at least 14 times out of 20 flips.

- H0: coin is fair, p(head) = 0.5
- Ha: coin favors heads, p(head) > 0.5
- Alpha level: 0.05
- Observation: 14 heads out of 20 flips

That probability can be computed from binomial coefficients as

<img src="resources/coin_flip_right_tail.png" style="width: 400px; margin-left: 4em"/>

In [1]:
# verify the above result with code
from scipy import stats

p_value = 1 - stats.binom.cdf(13, 20, 0.5)
p_value

0.057659149169921875
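
The same number can be reproduced directly from binomial coefficients, which also makes the "as extreme" and "more extreme" parts of the p-value explicit. This is a sketch using only the standard library; the helper name `prob_heads` and the variable names are hypothetical, and the comparison with the alpha level of 0.05 is the usual decision rule rather than something computed in the cell above.

```python
from math import comb

n, observed, alpha = 20, 14, 0.05


def prob_heads(k):
    """P(exactly k heads in n flips of a fair coin) = C(n, k) / 2**n."""
    return comb(n, k) / 2**n


# Probability of a result exactly as extreme as the observation: P(X = 14)
p_as_extreme = prob_heads(observed)
# Probability of a result more extreme than the observation: P(X > 14)
p_more_extreme = sum(prob_heads(k) for k in range(observed + 1, n + 1))

p_value = p_as_extreme + p_more_extreme
print(p_value)          # 0.057659149169921875
print(p_value < alpha)  # False
```

Since the p-value is above the alpha level of 0.05, the one-tailed test does not reject H0.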

#### two-sided (two-tailed) test

Alternatively, one might be interested in deviations in either direction, favoring either heads or tails. In that case the two-tailed p-value is calculated instead.

- H0: coin is fair, p(head) = 0.5
- Ha: coin is not fair, p(head) != 0.5
- Alpha level: 0.05
- Observation: 14 heads out of 20 flips

`2*min(Prob(no. of heads ≥ 14), Prob(no. of heads ≤ 14)) = 2*min(0.058, 0.979) = 2*0.058 = 0.115`

*However, because the binomial distribution with p(head) = 0.5 is symmetric, finding the smaller of the two probabilities is unnecessary here: the two-sided p-value is simply twice the one-sided p-value.*

In [2]:
# verify the above result with code
from scipy import stats

p_ge_14_heads = 1 - stats.binom.cdf(13, 20, 0.5)
p_le_14_heads = stats.binom.cdf(14, 20, 0.5)

p_value = 2 * min(p_ge_14_heads, p_le_14_heads)
p_value

0.11531829833984375
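
The symmetry argument can be checked directly: under a fair coin, 14 or more heads is exactly as likely as 6 or fewer heads, so adding the two tails gives the same value as doubling one of them. This is a sketch using only the standard library; the helper name `prob_heads` and the variable names are hypothetical.

```python
from math import comb

n, observed = 20, 14


def prob_heads(k):
    """P(exactly k heads in n flips of a fair coin) = C(n, k) / 2**n."""
    return comb(n, k) / 2**n


# Upper tail: results at least as far above 10 heads as the observation, P(X >= 14)
upper_tail = sum(prob_heads(k) for k in range(observed, n + 1))
# Mirror-image lower tail: results at least as far below 10 heads, P(X <= 6)
lower_tail = sum(prob_heads(k) for k in range(0, n - observed + 1))

two_sided = upper_tail + lower_tail
print(upper_tail == lower_tail)  # True
print(two_sided)                 # 0.11531829833984375
```

This shortcut relies on p(head) = 0.5 under H0; for a null hypothesis with a different p(head) the two tails are no longer mirror images and the doubling argument does not apply as is.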