### P value and hypothesis test
#### By: Thalia

Incorrect or confusing interpretations of p-values are very common. In this tutorial I want to clarify the concept with a very simple example; hopefully it will help you better understand what a p-value means.

Recall that in the hypothesis testing tutorial, we gave a coin-flipping example of a hypothesis test:

$$ H_0: \text{ This is a fair coin } \qquad  H_a: \text{ This is not a fair coin }$$

When we design an experiment, we have to fix the level of significance $\alpha$ before conducting the experiment and collecting samples. Once the samples are collected, a test statistic is computed from them, and each value of the test statistic has a corresponding p-value. Most commonly, the level of significance is fixed at $0.05$.

$\alpha$ sets the standard for how extreme the data must be before we can reject the null hypothesis. The p-value indicates how extreme the data are. We compare the p-value with $\alpha$ to determine whether the observed data are statistically significantly different from what the null hypothesis predicts.

According to Wikipedia:
>The p-value is the probability of obtaining test results at least as extreme as the observed results during the test, assuming that the null hypothesis is correct

The p-value is used to evaluate how compatible the sample data are with the null hypothesis:
* A **low p-value (smaller than $\alpha$)** indicates strong evidence against the null hypothesis -> **Statistically significant** -> **REJECT** the null hypothesis

* A **high p-value (larger than $\alpha$)** means the data do not provide enough evidence against the null hypothesis -> **NOT statistically significant** -> **FAIL to reject** the null hypothesis. This is not the same as evidence that the null hypothesis is true.
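The decision rule above can be written as a tiny function. This is only a sketch: the name `decide` and its return strings are hypothetical, chosen for illustration, and the strict `<` comparison follows the "smaller than $\alpha$" wording used here.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a p-value at significance level alpha.

    Hypothetical helper for illustration; alpha must be fixed before
    the data are collected.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        # Data this extreme would be rare if H0 were true.
        return "reject H0"
    # Not enough evidence against H0; this does not prove H0.
    return "fail to reject H0"


print(decide(0.003))  # low p-value
print(decide(0.27))   # high p-value
```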


Going back to our coin-flipping example: if we flip the coin 100 times and get mostly heads (or mostly tails), say 83 times, then we have little choice but to conclude that the coin is not fair. In this case we say that the evidence is statistically significant.

To be more specific, the p-value for this example is:

$$\text{p-value} = P(\text{a result at least as extreme as 83 heads out of 100 trials} \mid \text{This is a fair coin})$$

Clearly, the p-value in this case is going to be really small, meaning this test result (83 heads or 83 tails) would be very unusual if the null hypothesis were true. Thus, we can reject the null hypothesis in favor of the alternative hypothesis.

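> We can actually calculate the p-value using the p.m.f. of the Binomial distribution $B(k;n,p)$ with $n=100$, $p = 0.5$. The probability of exactly $k$ heads is $\binom{100}{k} p^{k} (1-p)^{100-k}$, and the p-value adds this up over every outcome at least as extreme as 83 heads. Because a fair coin is symmetric, the "at least 83 tails" side contributes the same amount as the "at least 83 heads" side:

$$ P(\text{at least 83 heads or at least 83 tails out of 100 trials} \mid \text{This is a fair coin}) = 2\sum_{k=83}^{100} \binom{100}{k} p^{k} (1-p)^{100-k}$$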

In [12]:
# Two-sided p-value from the binomial distribution
from scipy.stats import binom
k, n, p = 83, 100, 0.5
# binom.sf(k - 1, n, p) = P(X >= k): the upper tail starting at k heads.
# Doubling it also covers "at least k tails", since a fair coin is symmetric.
p_value = min(1.0, 2 * binom.sf(k - 1, n, p))
print("P value of this hypothesis test is %s than significance \
level 0.05" %("LESS" if p_value < 0.05 else "Bigger"))

P value of this hypothesis test is LESS than significance level 0.05


We reject the null hypothesis at significance level 0.05.
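
The probability of a result "at least as extreme" as the one observed can also be added up directly from the binomial formula, using only the standard library. The sketch below assumes the fair-coin null hypothesis and a two-sided alternative. The helper name `fair_coin_p_value` is hypothetical, and "extreme" is taken to mean "at least as far from $n/2$ heads as the observed count".

```python
from math import comb


def fair_coin_p_value(k, n):
    """Two-sided exact p-value for k heads in n flips under H0: P(heads) = 0.5.

    Hypothetical helper: valid only for the fair-coin null hypothesis.
    """
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    observed_gap = abs(2 * k - n)  # twice the distance of k from n/2
    # Under a fair coin every sequence of n flips has probability 1 / 2**n,
    # so it is enough to count the sequences that are at least as extreme.
    extreme = sum(
        comb(n, i) for i in range(n + 1) if abs(2 * i - n) >= observed_gap
    )
    return extreme / 2 ** n


for heads in (50, 60, 83):
    p_value = fair_coin_p_value(heads, 100)
    verdict = "reject H0" if p_value < 0.05 else "fail to reject H0"
    print(f"{heads} heads: p-value = {p_value:.3g} -> {verdict}")
```

The loop contrasts a perfectly balanced result, a mildly unbalanced one, and the heavily unbalanced 83 heads from the example, so you can see how the p-value shrinks as the data move further from what a fair coin would produce.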