## Guess the Coin

In [2]:
# Week 2 — From Discrete to Continuous Bayes

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta
from ipywidgets import interact, IntSlider
%matplotlib inline


plt.style.use("seaborn-v0_8")

# --- Parameters ---
# prior parameters (Beta distribution)
a_prior, b_prior = 2, 2   # "probably fair" — centered around 0.5

# --- Helper function ---
def plot_posterior(heads=0, tosses=0):
    """
    Plot the prior and the posterior for the given data.
    heads: number of observed heads
    tosses: total number of coin tosses
    """
    x = np.linspace(0, 1, 400)
    
    # prior Beta(a,b)
    prior = beta.pdf(x, a_prior, b_prior)
    
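    # heads cannot exceed tosses; clamp so the posterior parameters stay valid
    heads = min(heads, tosses)

    # posterior Beta(a_prior + heads, b_prior + tails), with tails = tosses - heads
    a_post = a_prior + heads
    b_post = b_prior + tosses - heads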
    posterior = beta.pdf(x, a_post, b_post)
    
    plt.figure(figsize=(7,4))
    plt.plot(x, prior, "--", label=f"Prior Beta({a_prior},{b_prior})")
    plt.plot(x, posterior, label=f"Posterior Beta({a_post},{b_post})")
    plt.fill_between(x, posterior, alpha=0.2)
    plt.xlabel("Coin bias θ (probability of heads)")
    plt.ylabel("Density")
    plt.title(f"{heads} heads out of {tosses} tosses")
    plt.legend()
    plt.show()

# --- Interactive widget ---
interact(plot_posterior,
         heads=IntSlider(min=0, max=10, step=1, value=4, description="Heads"),
         tosses=IntSlider(min=1, max=10, step=1, value=5, description="Tosses"));




### Mini Exercise
Play with the sliders until you find a situation where your prior and data disagree.
- Describe what happens to the posterior.
- Why doesn’t it completely ignore your prior after only a few tosses?


1. The prior is fixed at Beta(2,2), so only the data can be varied here. When the observed data strongly disagrees with the prior belief (for example, every toss lands heads, an observed frequency of 1.0), the posterior shifts to a compromise value that lies between the prior mean (0.5) and the observed frequency. The posterior curve also becomes taller and narrower than the prior curve, because adding real data reduces the variance and so increases certainty. The posterior mean does not reach the observed frequency because the weak but nonzero prior evidence still pulls the estimate back toward 0.5.

2. The posterior does not completely ignore the prior after only a few contradictory tosses because the Beta prior acts mathematically as "pseudo-data" that the actual evidence must outweigh. The default Beta(2,2) is equivalent to having already seen 4 observations (2 heads and 2 tails), so if we enter only a few new tosses, the total evidence is still small. The new likelihood is not strong enough to override those 4 pseudo-observations, so the posterior mean remains a weighted average of the prior mean and the observed frequency.
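
Both answers can be checked numerically. The sketch below is illustrative only: it assumes the notebook's Beta(2,2) prior and a made-up data set of 5 heads in 5 tosses, and the names `pseudo`, `w_prior`, `w_data`, and `weighted` are introduced here, not taken from the cell above. It compares the prior and posterior means and standard deviations, then rebuilds the posterior mean as a weighted average of the prior mean and the observed frequency.

```python
from scipy.stats import beta

a_prior, b_prior = 2, 2   # same prior as the notebook cell above
heads, tosses = 5, 5      # assumed example: every toss lands heads

# Conjugate update: add heads to a, tails to b
a_post = a_prior + heads
b_post = b_prior + (tosses - heads)

prior_mean = beta.mean(a_prior, b_prior)
post_mean = beta.mean(a_post, b_post)
prior_sd = beta.std(a_prior, b_prior)
post_sd = beta.std(a_post, b_post)

# Weighted-average view: the prior counts as (a + b) pseudo-observations
pseudo = a_prior + b_prior
w_prior = pseudo / (pseudo + tosses)   # weight on the prior mean
w_data = tosses / (pseudo + tosses)    # weight on the observed frequency
weighted = w_prior * prior_mean + w_data * (heads / tosses)

print(f"prior:     mean {prior_mean:.3f}, sd {prior_sd:.3f}")
print(f"posterior: mean {post_mean:.3f}, sd {post_sd:.3f}")
print(f"weights:   prior {w_prior:.3f}, data {w_data:.3f}, weighted mean {weighted:.3f}")
```

With these numbers the prior carries 4/9 of the weight and the data 5/9, so the posterior mean is 7/9 (about 0.78) rather than 1.0. Raising `tosses` shrinks `w_prior`, which is why the prior matters less as more data arrives.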