# Example: True bias of a coin

We're given a sequence of coin tosses, e.g. `h t t h h h t t h h t t t t ...`

Say we observe 35 heads in 100 tosses.

What is the true ("hidden") parameter $r := P(\text{heads})$ of the coin?

### Quick note: Frequentist vs. Bayesian statistics

**Frequentist stats**:
- Treats probability as the long-run frequency of occurrence of an event
- Parameters considered unknown but *deterministic* quantities
- Relies on sampling, estimators, hypothesis testing, etc., *without* incorporating prior information

**Bayesian stats**:
- Treats probability as a measure of belief or certainty about an event
- Parameters modeled as *random variables* with their own probability distributions
- Combines prior beliefs with observed data through Bayes' theorem

Bayesian statistics is receiving increasing interest, e.g. in medical research:

<img src="img/bayesian-medical.jpg" style="width: 450px;"/>

<small>Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6406060/</small>

### Frequentist approach

The so-called frequentist guess is:
- Mean: $\mu = 35/100 = 0.35$
- Std: (cannot estimate, have a single sample only)

We're already done here :/
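
The point estimate above is just the relative frequency of heads. A minimal sketch, assuming a hypothetical toss record `tosses` built to match the running example (35 heads, 65 tails); the variable names are illustrative only:

```python
# Hypothetical toss record matching the running example: 35 heads, 65 tails.
tosses = ["h"] * 35 + ["t"] * 65

n_heads = tosses.count("h")  # number of observed heads
n_total = len(tosses)  # total number of tosses
r_hat = n_heads / n_total  # relative frequency of heads

print(f"heads: {n_heads}, tosses: {n_total}, estimate: {r_hat:.2f}")
```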

### Bayesian approach

Symbols used:
- $H$, $T$: Random variables expressing the numbers of heads and tails, respectively
- $h$, $t$: Actually observed numbers of heads and tails, respectively
- $N$: Total number of coin tosses, $N = H + T = h + t$
- $r$: Unknown probability of observing heads, $P(\text{heads})$
 

The posterior density of $r$ conditional on $h$ and $t$ is (Bayes' theorem):

$\displaystyle p(r\mid H=h, T=t) = \frac{\Pr(H=h\mid r, N=h+t)\,p(r)}{\int_{0}^{1}\Pr(H=h\mid s, N=h+t)\,p(s)\,ds}$

where
- $p(r) = \text{uniform}(0, 1)$ is the prior distribution, i.e. $p(r) = 1$ for $r \in [0, 1]$
- $\displaystyle \Pr(H=h\mid r,N=h+t)={N \choose h}r^{h}(1-r)^{t}$ is the (binomial) likelihood
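
With the uniform prior $p(s) = 1$, the integral in the denominator reduces to a Beta function and can be evaluated in closed form:

$\displaystyle \int_{0}^{1}{N \choose h}s^{h}(1-s)^{t}\,ds = {N \choose h}\frac{h!\,t!}{(h+t+1)!} = \frac{1}{N+1}$

The sketch below checks this numerically with `scipy.integrate.quad`. The helper name `likelihood` and the values `h = 35`, `t = 65` are illustrative choices taken from the running example, not part of the derivation itself.

```python
from scipy import integrate, special

h, t = 35, 65  # observed heads and tails (running example)
N = h + t


def likelihood(s):
    """Binomial likelihood Pr(H = h | s, N) as a function of the bias s."""
    return special.comb(N, h) * s**h * (1 - s) ** t


# Denominator of Bayes' theorem under the uniform prior p(s) = 1 on [0, 1]
evidence, abs_err = integrate.quad(likelihood, 0, 1)

print(f"numerical integral: {evidence:.8f}")
print(f"1 / (N + 1):        {1 / (N + 1):.8f}")
```

Both numbers should agree up to the integration tolerance.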

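Plugging this back into the posterior (and using a popular trick called [conjugate priors](https://en.wikipedia.org/wiki/Conjugate_prior): the uniform prior is a $\text{Beta}(1, 1)$ distribution, which is conjugate to the binomial likelihood) yields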

${\displaystyle p(r\mid H=h,T=t)={\frac {(h+t+1)!}{h!\,t!}}r^{h}(1-r)^{t}.}$
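
This expression is the density of a $\text{Beta}(h+1, t+1)$ distribution, whose mean is $(h+1)/(N+2) = 36/102 \approx 0.353$ for our data. As a hedged cross-check, the sketch below compares the closed-form expression against `scipy.stats.beta` and reads off the mean and standard deviation; the evaluation point `r = 0.35` is an arbitrary choice for illustration.

```python
from math import factorial

from scipy import stats

h, t = 35, 65  # observed heads and tails (running example)

# Assumption: we read the closed-form posterior as a Beta(h + 1, t + 1) density.
posterior = stats.beta(h + 1, t + 1)

# Evaluate the closed-form expression at an arbitrary point and compare.
r = 0.35
norm_const = factorial(h + t + 1) / (factorial(h) * factorial(t))
closed_form = norm_const * r**h * (1 - r) ** t

print(f"closed form at r={r}: {closed_form:.6f}")
print(f"scipy beta pdf:       {posterior.pdf(r):.6f}")
print(f"posterior mean:       {posterior.mean():.5f}")
print(f"posterior std:        {posterior.std():.5f}")
```

Unlike the frequentist point estimate, the posterior gives a full distribution over $r$, so a spread can be reported alongside the mean.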

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from scipy import special

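N = 100  # total number of tosses
K = 35  # observed number of heads


def p_r(r):
    """Binomial likelihood of r; proportional to the posterior under a uniform prior."""
    return special.comb(N, K) * r**K * (1 - r) ** (N - K)


rs = np.linspace(0, 1, 1000)
ps = p_r(rs)  # vectorized evaluation on the grid
ps /= np.sum(ps)  # re-normalize to a discrete distribution over the grid

# mean and std of the distribution
mean = np.sum(ps * rs)
std = np.sqrt(np.sum(ps * (rs - mean) ** 2))
print(f"mean: {mean:.5f}")
print(f"std:  {std:.5f}")

plt.figure(figsize=(9, 5))
plt.plot(rs, ps)
plt.xlabel("True bias r")
plt.ylabel("p(r)")
plt.grid(True)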