## The Bayesian Mindset (Conceptual Focus)

### What Bayesian thinking is about

**Traditional (frequentist) statistics asks**: If the true parameter were $\theta_0$, how likely is the data we observed?

**Bayesian inference asks**: Given the data I observed, how likely is each possible value of $\theta$?

This shift, from reasoning about the data given a parameter to reasoning about the parameter given the data, is what makes Bayesian methods intuitive for decision-making.

The Bayesian approach treats inference as learning from data: we update our beliefs as new evidence comes in.
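
To make "updating beliefs as evidence comes in" concrete, here is a minimal sketch of flip-by-flip updating. It assumes a Beta prior on the coin's heads probability, for which the update rule is simply adding counts; the flip sequence and the uniform Beta(1, 1) starting point are made-up illustrative choices, not data from this notebook.

```python
# Sequential updating with a Beta prior on a coin's heads probability.
# The flips below are made-up illustrative data (1 = heads, 0 = tails).
flips = [1, 0, 1, 1, 0, 1, 1, 1]

a, b = 1.0, 1.0  # Beta(1, 1), i.e. a uniform prior (an assumption)
history = []
for flip in flips:
    a += flip        # each head adds one to a
    b += 1 - flip    # each tail adds one to b
    history.append(a / (a + b))  # posterior mean after this flip

# Updating on all flips at once gives the same posterior as going flip by flip.
a_batch = 1.0 + sum(flips)
b_batch = 1.0 + len(flips) - sum(flips)

print(f"Sequential: Beta({a:.0f}, {b:.0f}); batch: Beta({a_batch:.0f}, {b_batch:.0f})")
print("Posterior mean after each flip:", [round(m, 2) for m in history])
```

Yesterday's posterior acts as today's prior, so the order in which the flips arrive does not change the final belief.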

### 🧩 The Building Blocks

(a) **Prior** — $P(\theta)$

Represents your initial belief **before** seeing data.  

> Example: “I think most coins are fair, so $\theta \approx 0.5$.”
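
One possible way to turn that verbal belief into a distribution is a symmetric Beta prior. The sketch below compares two hypothetical choices, Beta(2, 2) and Beta(10, 10); both are centered at 0.5, and the specific parameter values are assumptions picked for illustration.

```python
from scipy.stats import beta

# Two hypothetical ways to encode "most coins are fair" as a Beta prior.
# Both are centered at 0.5; a larger a = b expresses a more confident belief.
weak_prior = beta(2, 2)
strong_prior = beta(10, 10)

for name, dist in [("Beta(2, 2)", weak_prior), ("Beta(10, 10)", strong_prior)]:
    lo, hi = dist.interval(0.95)  # central 95% prior interval
    print(f"{name}: mean={dist.mean():.2f}, sd={dist.std():.3f}, "
          f"95% interval=({lo:.2f}, {hi:.2f})")
```

The means are identical, but the stronger prior has a smaller spread, so it takes more data to move it away from 0.5.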


(b) **Likelihood** — $P(D \mid \theta)$

How probable are the observed data, **given** a specific value of $\theta$?  

> Example: “If the coin bias were 0.7, how likely is getting 8 heads out of 10?”
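
The question in this example can be answered directly with the binomial probability mass function. The sketch below uses `scipy.stats.binom` and, for contrast, evaluates the same data under a fair coin; the comparison value 0.5 is an added illustration.

```python
from scipy.stats import binom

# Probability of exactly 8 heads in 10 flips if the coin's bias is 0.7.
lik_07 = binom.pmf(8, 10, 0.7)
# The same data evaluated under a fair coin, for comparison.
lik_05 = binom.pmf(8, 10, 0.5)

print(f"P(8 heads of 10 | theta=0.7) = {lik_07:.4f}")
print(f"P(8 heads of 10 | theta=0.5) = {lik_05:.4f}")
```

The data are roughly five times more probable under a bias of 0.7 (about 0.23) than under a fair coin (about 0.04). The likelihood is a function of $\theta$ for fixed data; it is not a probability distribution over $\theta$.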


(c) **Posterior** — $P(\theta \mid D)$

The **updated belief** after seeing data.  

$$
\text{Posterior} \propto \text{Likelihood} \times \text{Prior}
$$

> The prior and likelihood are combined like a recipe: prior belief filtered through data evidence. *Richard McElreath, Statistical Rethinking*
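
Written out in full, the proportionality hides only a normalizing constant. For a coin bias $\theta \in [0, 1]$, Bayes' theorem reads:

$$
P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}, \qquad P(D) = \int_0^1 P(D \mid \theta)\, P(\theta)\, d\theta
$$

As a worked case, assume a $\text{Beta}(a, b)$ prior and data consisting of $h$ heads and $t$ tails (the symbols $a$, $b$, $h$, $t$ are introduced here for illustration). Dropping constants that do not depend on $\theta$:

$$
P(\theta \mid D) \propto \underbrace{\theta^{h}(1-\theta)^{t}}_{\text{likelihood}} \times \underbrace{\theta^{a-1}(1-\theta)^{b-1}}_{\text{prior}} = \theta^{a+h-1}(1-\theta)^{b+t-1}
$$

This is the kernel of a $\text{Beta}(a + h,\; b + t)$ distribution, so under these assumptions the update amounts to adding the observed counts to the prior parameters.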

In [53]:
# Define two candidate priors for the coin bias θ on a grid over [0, 1]
import numpy as np
from scipy.stats import beta

theta = np.linspace(0, 1, 200)
prior = np.ones_like(theta)  # Uniform(0, 1) prior, kept for comparison (not used below)
prior_beta = beta.pdf(theta, a=10, b=10)  # Beta(10, 10) prior, centered at 0.5 (used below)

In [54]:
# Define likelihood: suppose we observe 100 heads and 50 tails
heads, tails = 100, 50
# Binomial likelihood up to a constant (the binomial coefficient does not depend on θ)
likelihood = theta**heads * (1 - theta)**tails

In [55]:
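# Unnormalized posterior on the grid: likelihood × prior
posterior_unnorm = likelihood * prior_beta
# Normalize so the curve integrates to 1 (np.trapz is named np.trapezoid in NumPy 2.0+)
posterior = posterior_unnorm / np.trapz(posterior_unnorm, theta)
# Closed-form posterior from Beta-Binomial conjugacy: Beta(heads + 10, tails + 10)
posterior_beta = beta.pdf(theta, a=heads + 10, b=tails + 10)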

In [56]:
# Posterior mean via numerical integration over the grid
mean_theta = np.trapz(theta * posterior, theta)
# Approximate CDF on the uniform grid, then read off the 2.5% and 97.5% quantiles
cdf = np.cumsum(posterior) / posterior.sum()
ci = (theta[np.searchsorted(cdf, 0.025)],
      theta[np.searchsorted(cdf, 0.975)])
print(f"Posterior mean θ ≈ {mean_theta:.2f}")
print(f"95% credible interval: {ci}")

Posterior mean θ ≈ 0.65
95% credible interval: (0.5728643216080402, 0.7185929648241206)
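
As a cross-check on the grid result above, the same quantities can be read off the closed-form Beta(110, 60) posterior. This is a sketch that rebuilds the setup locally (Beta(10, 10) prior, 100 heads, 50 tails); the variable names are chosen for this example and do not come from the cells above.

```python
import numpy as np
from scipy.stats import beta

# Closed-form posterior: Beta(10 + 100, 10 + 50)
a_post, b_post = 10 + 100, 10 + 50
exact = beta(a_post, b_post)

exact_mean = exact.mean()
exact_ci = exact.ppf([0.025, 0.975])  # equal-tailed 95% credible interval

# Grid approximation of the same posterior, for comparison
theta = np.linspace(0, 1, 200)
unnorm = theta**100 * (1 - theta)**50 * beta.pdf(theta, 10, 10)
grid_post = unnorm / unnorm.sum()      # probability mass per grid point
grid_mean = (theta * grid_post).sum()

print(f"Exact mean: {exact_mean:.4f}, grid mean: {grid_mean:.4f}")
print(f"Exact 95% interval: ({exact_ci[0]:.4f}, {exact_ci[1]:.4f})")
```

The grid interval endpoints can differ from the exact ones by up to about one grid step (here 1/199), because the grid version snaps to the nearest grid point.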


In [59]:
# Plot prior, likelihood, and posterior
from lets_plot import *
LetsPlot.setup_html()
import pandas as pd

# Rescale each curve to sum to 1 over the grid so all three fit on one axis
prior_norm = prior_beta / prior_beta.sum()
like_norm = likelihood / likelihood.sum()
post_norm = posterior_beta / posterior_beta.sum()

df = pd.DataFrame({
    "θ": np.concatenate([theta, theta, theta]),
    "density": np.concatenate([prior_norm, like_norm, post_norm]),
    "Distribution": ["Prior (Beta(10, 10))"] * len(theta)
                    + ["Likelihood"] * len(theta)
                    + ["Posterior"] * len(theta)
})
p = (
    ggplot(df, aes(x="θ", y="density", color="Distribution"))
    + geom_line(size=1.3)
    + ggtitle("Bayesian Update: Coin Bias after 100 Heads, 50 Tails")
    + xlab("θ (probability of Heads)")
    + ylab("Density")
    + theme_classic()  # base theme first, then the specific overrides
    + theme(legend_position="right",
            plot_title=element_text(size=14, face="bold"))
)

p.show()

In [60]:
p

In [62]:
# Second example: estimating a website's signup (conversion) rate
# 1. Define the parameter space (the possible conversion rates)
# We focus on 0 to 0.1 (0% to 10%) since conversion rates are typically low.
theta = np.linspace(0, 0.1, 200)

# 2. (a) Define the Prior belief
# We believe the rate is around 3%. A Beta(3, 97) distribution fits this well.
a_prior = 3
b_prior = 97
prior_dist = beta.pdf(theta, a_prior, b_prior)

# 3. (b) Define the Likelihood from the data
# We observe 1000 visitors and 40 signups.
visitors = 1000
signups = 40
non_signups = visitors - signups

# The likelihood is the probability of the data for each possible theta
likelihood = theta**signups * (1 - theta)**non_signups

# 4. (c) Compute the Posterior belief
# Posterior is proportional to Prior x Likelihood.
# Because the Beta prior is conjugate to the binomial likelihood,
# the posterior is also a Beta distribution: just add the counts.
a_posterior = a_prior + signups
b_posterior = b_prior + non_signups
posterior = beta.pdf(theta, a_posterior, b_posterior)
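
To see how the prior and the data are balanced in this update, the posterior mean can be written as a weighted average of the prior mean and the observed signup rate. The sketch below reuses the numbers from the cell above; treating `a_prior + b_prior` as a count of prior "pseudo-observations" is a common interpretation of Beta parameters, stated here as an assumption.

```python
from scipy.stats import beta

a_prior, b_prior = 3, 97          # prior from the cell above
signups, visitors = 40, 1000      # observed data from the cell above

a_post = a_prior + signups
b_post = b_prior + (visitors - signups)
post = beta(a_post, b_post)

prior_mean = a_prior / (a_prior + b_prior)   # 0.03
sample_rate = signups / visitors             # 0.04
n_prior = a_prior + b_prior                  # prior "pseudo-observations"

# Posterior mean as a weighted average of prior mean and observed rate
weighted = (n_prior * prior_mean + visitors * sample_rate) / (n_prior + visitors)

lo, hi = post.ppf([0.025, 0.975])  # equal-tailed 95% credible interval
print(f"Posterior mean: {post.mean():.4f} (weighted average: {weighted:.4f})")
print(f"95% credible interval: ({lo:.4f}, {hi:.4f})")
```

With 1000 real observations against 100 pseudo-observations, the posterior mean (43/1100, about 0.039) sits much closer to the observed 4% than to the prior's 3%.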


In [63]:
# Rescale each curve to sum to 1 over the grid so all three fit on one axis
prior_norm = prior_dist / prior_dist.sum()
like_norm = likelihood / likelihood.sum()
post_norm = posterior / posterior.sum()

df = pd.DataFrame({
    "θ": np.concatenate([theta, theta, theta]),
    "density": np.concatenate([prior_norm, like_norm, post_norm]),
    "Distribution": ["Prior (Beta(3, 97))"] * len(theta)
                    + ["Likelihood"] * len(theta)
                    + ["Posterior"] * len(theta)
})
p = (
    ggplot(df, aes(x="θ", y="density", color="Distribution"))
    + geom_line(size=1.3)
    + ggtitle("Bayesian Update: Conversion Rate after 40 Signups in 1000 Visitors")
    + xlab("θ (conversion rate)")
    + ylab("Density")
    + theme_classic()  # base theme first, then the specific overrides
    + theme(legend_position="right",
            plot_title=element_text(size=14, face="bold"))
)

p.show()
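
Because the posterior is a full distribution over the conversion rate, it can answer decision-style questions directly. As a hedged illustration, the sketch below asks how probable it is that the true rate exceeds 3%; the threshold `target` is a hypothetical business cutoff, not something specified in this notebook.

```python
from scipy.stats import beta

# Posterior from the conversion-rate example: Beta(3 + 40, 97 + 960)
post = beta(3 + 40, 97 + 960)

target = 0.03  # hypothetical threshold, chosen for illustration
p_above = post.sf(target)  # survival function: P(theta > target | data)

print(f"P(conversion rate > {target:.0%} | data) = {p_above:.3f}")
```

A statement of this form, a probability about the parameter itself, is what the posterior provides and what a frequentist p-value does not.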