# Bayesian vs. Frequentist Fair Coin Hypothesis Testing

We answer whether a coin is fair or not based on a coin-flip experiment using Bayesian and Frequentist hypothesis testing methodologies.

In [1]:
from scipy import stats
import scipy.special as sps
import numpy as np

# Follows https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter1_Introduction/Ch1_Introduction_PyMC3.ipynb
def coin_flipper(N, pi=0.5, seed=42):
    """Returns an array of N coin flips, where 1 is heads and 0 is tails.

    Parameters
    ----------
    N : int
        Number of flips in the experiment.
    pi : float in [0, 1], optional.
        The probability of the coin returning heads.  Default: 0.5.
    seed : None or int or `np.random.RandomState` instance, optional
        Seed value.  Default: 42.
    """
    return stats.bernoulli.rvs(pi, size=N, random_state=seed)

## Bayesian Methodology

Following [these lecture notes](http://idiom.ucsd.edu/~rlevy/lign251/fall2007/lecture_9.pdf) from [Roger Levy's Linguistics 251 course](http://idiom.ucsd.edu/~rlevy/lign251/fall2007/), let us set up two hypotheses, $H_f$ that the coin is fair, and $H_{uf}$ that it is unfair.  Assume our data is $\vec{x}$.  Then

\begin{eqnarray}
P(H_f|\vec{x}) &=& \frac{P(\vec{x}|H_f)P(H_f)}{P(\vec{x})} \\
P(H_{uf}|\vec{x}) &=& 1 - P(H_f|\vec{x})
\end{eqnarray}

Moreover, we must marginalize over the possible hypotheses:

$$
P(\vec{x}) = P(\vec{x}|H_f)P(H_f) + P(\vec{x}|H_{uf})P(H_{uf})
$$

To use these equations, we need to specify each hypothesis as a distribution over $\pi$, the probability of heads (more generally, the Bernoulli parameter).  For $H_f$, all of the probability mass sits at $\pi = 0.5$: $P(\pi|H_f) = 1$ if $\pi = 0.5$ and $P(\pi|H_f) = 0$ if $\pi \neq 0.5$. Then:

$$
P(\vec{x}|H_f) = \sum_i P(\vec{x}|\pi_i)P_{H_f}(\pi_i) = P(\vec{x}|\pi_i = 0.5) = {n\choose n_\mathrm{heads}}\frac{1}{2^n}
$$

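where $n = n_\mathrm{heads} + n_\mathrm{tails}$, and we have used the fact that the probability of observing $n_\mathrm{heads}$ heads in $n$ flips is binomial: $P(\vec{x}|\pi_i) = {n\choose n_\mathrm{heads}} \pi_i^{n_\mathrm{heads}}(1 - \pi_i)^{n_\mathrm{tails}}$.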
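As a quick sanity check on the fair-coin likelihood, the closed form can be compared against the binomial probability mass function in `scipy.stats`. This is a minimal sketch; the counts `n = 30` and `n_heads = 13` are illustrative values chosen here, not part of the derivation.

```python
from math import comb

from scipy import stats

# Illustrative counts (an assumption made for this check).
n, n_heads = 30, 13

# Closed form from the text: C(n, n_heads) / 2^n.
closed_form = comb(n, n_heads) / 2**n

# The same quantity from the binomial pmf evaluated at pi = 0.5.
from_pmf = stats.binom.pmf(n_heads, n, 0.5)

print(closed_form, from_pmf)
```

Both values should agree to within floating-point error.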

For $H_{uf}$ we must choose a distribution for $\pi$.  With no further information, a natural choice is to let $\pi$ be uniformly distributed on $[0, 1]$, so that it is equally likely to take on any value within that range.  Since $\pi$ is now continuous, the sum over $\pi_i$ becomes an integral:

$$
P(\vec{x}|H_{uf}) = \int_0^1 P(\vec{x}|\pi)P(\pi|H_{uf})d\pi
$$

$P(\pi|H_{uf}) = 1$ for a uniform random variable on $[0, 1]$, and $P(\vec{x}|\pi)$ is the same binomial likelihood as above.  As it turns out, the following relation holds:

$$
\int_0^1 \pi^{a}(1 - \pi)^{b}d\pi = \frac{\Gamma(a + 1)\Gamma(b + 1)}{\Gamma(a + b + 2)} = \frac{a!b!}{(a + b + 1)!}
$$

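where $\Gamma$ is the usual [Gamma function](https://en.wikipedia.org/wiki/Gamma_function), and the last equality holds for non-negative integers $a$ and $b$.  Setting $a = n_\mathrm{heads}$ and $b = n_\mathrm{tails}$, the factorials cancel with the ${n\choose n_\mathrm{heads}}$, and thus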

$$
P(\vec{x}|H_{uf}) = \frac{1}{n_\mathrm{heads} + n_\mathrm{tails} + 1}
$$
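The closed form $1/(n_\mathrm{heads} + n_\mathrm{tails} + 1)$ can be checked numerically by integrating the binomial likelihood over a uniform prior. The sketch below uses `scipy.integrate.quad`; the helper name `marginal_likelihood_uniform` and the test counts are hypothetical and chosen only for illustration.

```python
from scipy import integrate, stats


def marginal_likelihood_uniform(n_heads, n_tails):
    """Numerically integrate the binomial likelihood over pi in [0, 1].

    The uniform prior density is 1 on [0, 1], so it drops out of the integrand.
    """
    n = n_heads + n_tails
    value, _abs_err = integrate.quad(
        lambda p: stats.binom.pmf(n_heads, n, p), 0.0, 1.0
    )
    return value


# Compare the numerical integral with the closed form 1 / (n + 1).
for n_heads, n_tails in [(13, 17), (3, 0), (50, 50)]:
    numeric = marginal_likelihood_uniform(n_heads, n_tails)
    closed = 1.0 / (n_heads + n_tails + 1)
    print(n_heads, n_tails, numeric, closed)
```

Note that the result depends only on the total number of flips, not on how they split into heads and tails: under a uniform prior, every head count from $0$ to $n$ is equally probable.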

Assuming we have no prior information, $P(H_f) = P(H_{uf}) = 0.5$.  In cases like this, where both hypotheses have equal prior probabilities, the priors cancel in the posterior and we can use the [Bayes factor](https://en.wikipedia.org/wiki/Bayes_factor), the ratio of the marginal likelihoods $P(\vec{x}|H_f) / P(\vec{x}|H_{uf})$, to determine which hypothesis to favour: a value above 1 favours $H_f$, and a value below 1 favours $H_{uf}$.  (In cases where there are multiple hypotheses with equal priors, we would pick the one with the largest marginal likelihood, which has a Bayes factor greater than 1 when compared against any other hypothesis.)
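When the priors are not equal, the Bayes factor still applies through the odds form of Bayes' theorem: posterior odds equal the Bayes factor times the prior odds. The following is a minimal sketch of that conversion; the function name `posterior_fair` and its `prior_fair` argument are hypothetical and introduced only for illustration.

```python
def posterior_fair(bayes_factor, prior_fair=0.5):
    """Convert a Bayes factor P(x|H_f) / P(x|H_uf) into P(H_f|x).

    prior_fair is P(H_f); it must lie strictly between 0 and 1.
    """
    prior_odds = prior_fair / (1.0 - prior_fair)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# With equal priors, a Bayes factor of 3 gives P(H_f|x) = 3 / (1 + 3).
print(posterior_fair(3.0))
# A sceptical prior P(H_f) = 0.25 offsets the same evidence.
print(posterior_fair(3.0, prior_fair=0.25))
```

With `prior_fair=0.5` this reduces to $P(\vec{x}|H_f) / (P(\vec{x}|H_f) + P(\vec{x}|H_{uf}))$.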

In [44]:
def get_prob_xhf(heads, tails):
    return sps.comb(heads + tails, heads) / 2.**(heads + tails)

def get_prob_xhuf(heads, tails):
    return float(heads + tails + 1)**(-1)

def get_cointoss_bayesian(experiment):
    """Determine whether the coin is fair using Bayesian hypothesis testing.
    
    Parameters
    ----------
    experiment : list-like
        Output from `coin_flipper`.
    
    Returns
    -------
    prob_hf : float
        Posterior probability P(H_f|x) that the coin is fair, assuming
        equal priors P(H_f) = P(H_uf) = 0.5.
    bayes_factor : float
        Bayes factor P(x|H_f) / P(x|H_uf).
    """
    heads = np.sum(experiment)
    tails = len(experiment) - heads
    prob_xhf = get_prob_xhf(heads, tails)
    prob_xhuf = get_prob_xhuf(heads, tails)
    prob_hf = prob_xhf / (prob_xhf + prob_xhuf)
    return prob_hf, prob_xhf / prob_xhuf

def print_cointoss_bayesian(experiment, name):
    print("Experiment {0} - heads/all: {1:d}/{2:d} = {3:.2f},"
          " P(H_f): {4:.4f}, Bayes Factor: {5:.4f}"
          .format(name, np.sum(experiment), len(experiment),
                  np.sum(experiment)/len(experiment), 
                  *get_cointoss_bayesian(experiment)))

In [73]:
experiment_f1 = coin_flipper(30, pi=0.5, seed=42)
experiment_f2 = coin_flipper(787, pi=0.5, seed=584)
experiment_uf1 = coin_flipper(27, pi=0.34, seed=56)
experiment_uf2 = coin_flipper(491, pi=0.59, seed=26854)
experiment_uf3 = coin_flipper(787, pi=0.55, seed=455)

In [74]:
print_cointoss_bayesian(experiment_f1, "F1")

Experiment F1 - heads/all: 13/30 = 0.43, P(H_f): 0.7757, Bayes Factor: 3.4576


In [75]:
print_cointoss_bayesian(experiment_f2, "F2")

Experiment F2 - heads/all: 385/787 = 0.49, P(H_f): 0.9491, Bayes Factor: 18.6507


In [76]:
print_cointoss_bayesian(experiment_uf1, "UF1")

Experiment UF1 - heads/all: 10/27 = 0.37, P(H_f): 0.6377, Bayes Factor: 1.7599


In [79]:
print_cointoss_bayesian(experiment_uf2, "UF2")

Experiment UF2 - heads/all: 292/491 = 0.59, P(H_f): 0.0026, Bayes Factor: 0.0026


In [78]:
print_cointoss_bayesian(experiment_uf3, "UF3")

Experiment UF3 - heads/all: 442/787 = 0.56, P(H_f): 0.0533, Bayes Factor: 0.0564


We see that in cases where 

https://www.annualreviews.org/doi/full/10.1146/annurev-statistics-031017-100307

http://idiom.ucsd.edu/~rlevy/lign251/fall2007/lecture_7.pdf