
# Bayesian modeling of minds, brains and behavior
---
**Overview.** This notebook lets you run Python code in your browser, mixing text and code in a single document. It is intended to support the lectures by letting me demo certain concepts in a hands-on way. It is also designed for you to work through slowly after lectures: read the text and do the exercises by playing with the plots, or even by altering or writing code if you want to. The emphasis is on play. Mess around. Ask yourself "I wonder what happens if I change this?". Let's start with a simple example of a Gaussian distribution.

---

## Lecture 2

**Beliefs as distributions.** Here you can adjust the sliders to change the mean (μ) and standard deviation (σ) of this Gaussian distribution. The mean shifts the whole distribution left or right, so the belief is centered on a lower or higher value. The standard deviation changes the spread, and therefore the uncertainty of the belief.

**Exercise.** Click on the code cell below and press play to run. Play around with the distributions and think about how they would represent belief. Use the sliders to generate distributions that represent different prior beliefs for theta:

  - Certain belief that theta is high

  - Certain belief that theta is low

  - Uncertain belief that theta is high

  - Uncertain belief that theta is low

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from ipywidgets import interact
from scipy.stats import norm

def plot_gaussian(mu=0.0, sigma=1.0):
    # Fixed x range
    x = np.linspace(-20, 20, 1000)
    y = norm.pdf(x, mu, sigma)

    fig, ax = plt.subplots(figsize=(8, 6))
    ax.plot(x, y, label=f'N({mu:.2f}, {sigma:.2f}²)', color='purple')
    ax.fill_between(x, y, color='plum', alpha=0.4)
    ax.set_xlabel("θ")
    ax.set_ylabel("Probability Density")
    ax.set_xlim(-20, 20)
    ax.set_ylim(0, 3)
    ax.grid(True)
    ax.legend()
    plt.show()

interact(
    plot_gaussian,
    mu=widgets.FloatSlider(min=-10, max=10, step=0.1, value=0.0, description="μ"),
    sigma=widgets.FloatSlider(min=0.1, max=5.0, step=0.1, value=1.0, description="σ")
);


**Interpreting probability distributions.** There are intuitive ways to read off probabilities from probability distributions like these. According to the belief encoded by the distribution, the probability of theta being larger than, say, 0.5 or 5 is the area under the curve above this value. Similarly, the probability of theta being between, say, 0.1 and 0.7 is the area under the curve between these two values.
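These areas can also be computed directly rather than read off the plot. The sketch below assumes an illustrative belief with μ = 0.5 and σ = 1.0 (values chosen here for the example, not taken from the lectures) and uses the cumulative distribution function from `scipy.stats.norm`.

```python
from scipy.stats import norm

# Illustrative belief about theta (assumed values): mean 0.5, standard deviation 1.0
mu, sigma = 0.5, 1.0

# P(theta > 0.5): area under the curve to the right of 0.5
p_above = norm.sf(0.5, loc=mu, scale=sigma)

# P(0.1 < theta < 0.7): area between the two values, as a difference of CDFs
p_between = norm.cdf(0.7, loc=mu, scale=sigma) - norm.cdf(0.1, loc=mu, scale=sigma)

print(f"P(theta > 0.5)       = {p_above:.3f}")
print(f"P(0.1 < theta < 0.7) = {p_between:.3f}")
```

Changing `mu` or `sigma` here should move these numbers in the same way the shaded area changes when you move the sliders.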

**Exercise.** Click on the code cell below and press play to run. 

- Play with the sliders to see how the area under the curve changes as you change the minimum and maximum values. 

- The shaded area represents the probability of theta being in the range you selected.

- What happens when you make the distribution narrower or wider?

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from ipywidgets import interact
from scipy.stats import norm

def plot_gaussian(mu=0.0, sigma=1.0, x_min=-1.0, x_max=1.0):
    # Fixed x range
    x = np.linspace(-20, 20, 1000)
    y = norm.pdf(x, mu, sigma)

    fig, ax = plt.subplots(figsize=(8, 6))
    ax.plot(x, y, label=f'N({mu:.2f}, {sigma:.2f}²)', color='purple')

    # Fill full curve lightly
    ax.fill_between(x, y, color='plum', alpha=0.2)

    # Highlight area between x_min and x_max
    mask = (x >= x_min) & (x <= x_max)
    ax.fill_between(x[mask], y[mask], color='mediumvioletred', alpha=0.6,
                    label=f"P({x_min:.2f} < θ < {x_max:.2f}) ≈ {norm.cdf(x_max, mu, sigma) - norm.cdf(x_min, mu, sigma):.2f}")

    ax.set_title("Probability as Area Under the Curve")
    ax.set_xlabel("θ")
    ax.set_ylabel("Probability Density")
    ax.set_xlim(-20, 20)
    ax.set_ylim(0, max(y) * 1.1)
    ax.grid(True)
    ax.legend()
    plt.show()

interact(
    plot_gaussian,
    mu=widgets.FloatSlider(min=-10, max=10, step=0.1, value=0.0, description="μ"),
    sigma=widgets.FloatSlider(min=0.1, max=5.0, step=0.1, value=1.0, description="σ"),
    x_min=widgets.FloatSlider(min=-10, max=10, step=0.1, value=-1.0, description="x min"),
    x_max=widgets.FloatSlider(min=-10, max=10, step=0.1, value=1.0, description="x max")
)



**Prior beliefs for theta.** We need to think carefully about what theta means. Theta represents cognitive ability in our go-no-go task. 0 is the worst possible cognitive ability, and 1 is the best. So we should set a prior distribution for theta between 0 and 1. The distributions above do not do this, so let's fix this. We can use a beta distribution to represent our prior belief about theta. The beta distribution is a continuous probability distribution defined on the interval [0, 1], so it is perfect for our needs.


**Exercise.** Play around with beta distribution parameters to see how the shape of the distribution changes:

  - You are completely uncertain about the ability → Set the widest prior

  - You are quite certain that ability will be high  → Set a narrow prior centered on a high value

  - You are quite certain that ability will be low  → Set a narrow prior centered on a low value

  - You are uncertain but you think ability is low  → Set a wide prior centered on a low value
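To connect slider settings to the shape of the prior, it can help to print a couple of summary numbers. The sketch below assumes the same mapping as the code cell that follows, Beta(k + 1, n - k + 1); the helper name `describe_prior` and the example settings are made up for illustration and are not the "right answers" to the exercise.

```python
from scipy.stats import beta

def describe_prior(k, n):
    """Summarise the Beta(k + 1, n - k + 1) prior implied by slider values k and n."""
    a, b = k + 1, (n - k) + 1
    dist = beta(a, b)
    return a, b, dist.mean(), dist.std()

# Illustrative slider settings (assumed)
for k, n in [(5, 10), (9, 10), (90, 100), (1, 10)]:
    a, b, mean, sd = describe_prior(k, n)
    print(f"k={k:3d}, n={n:3d} -> Beta({a}, {b}): mean={mean:.2f}, sd={sd:.2f}")
```

The mean says where the belief is centered and the standard deviation says how wide it is, so larger n with the same k/n ratio gives a narrower prior.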

In [3]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta
import ipywidgets as widgets
from IPython.display import display, clear_output

# Sliders
n_slider = widgets.IntSlider(value=10, min=1, max=100, step=1, description="n")
k_slider = widgets.IntSlider(value=5, min=0, max=100, step=1, description="k")

# Output area
plot_output = widgets.Output()

# Plot function
def plot_beta(k, n):
    x = np.linspace(0, 1, 500)
    a = k + 1
    b = (n - k) + 1
    y = beta.pdf(x, a, b)

    with plot_output:
        clear_output(wait=True)
        fig, ax = plt.subplots(figsize=(8, 5))
        ax.plot(x, y, label=f"Beta({a}, {b})", color='blue')
        ax.fill_between(x, y, color='skyblue', alpha=0.3)
        ax.set_title("Prior beliefs over θ")
        ax.set_xlabel("θ")
        ax.set_ylabel("Probability Density")
        ax.grid(True)
        ax.legend()
        plt.show()

# Unified update function
def on_slider_change(change=None):
    # Avoid recursive triggers by updating value only if needed
    if k_slider.value > n_slider.value:
        k_slider.unobserve(on_slider_change, names='value')
        k_slider.value = n_slider.value
        k_slider.observe(on_slider_change, names='value')
    plot_beta(k_slider.value, n_slider.value)

# Set up observers (after defining callback)
n_slider.observe(on_slider_change, names='value')
k_slider.observe(on_slider_change, names='value')

# Layout and display
display(widgets.VBox([n_slider, k_slider]), plot_output)

# Initial plot
plot_beta(k_slider.value, n_slider.value)



**Multiplying prior and likelihood.**  We’ve learned that the posterior is proportional to the prior times the likelihood:

$$
\text{Posterior} \propto \text{Prior} \times \text{Likelihood}
$$

The likelihood tells us how well each value of $\theta$ predicted the data. If a value of $\theta$ predicts the data well → the posterior belief increases. If it predicts the data poorly → the posterior belief decreases. So the posterior reshapes your belief, using the likelihood as a reweighting function over your prior.

**Exercise.** Now you can explore this directly: use the sliders to define a prior distribution and a likelihood. The resulting product is shown as the posterior. By default, this is not normalized (it doesn't integrate to 1), so you are just seeing the raw shape. This shape still carries meaning: it shows which values of $\theta$ are most consistent with both your prior beliefs and the observed data.
- Set a prior and likelihood.
- Pick a few values of theta, and multiply the prior by the likelihood.
- See if the height of the posterior curve matches your expectation.
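The pointwise multiplication in the exercise can be checked numerically. This is a minimal sketch with assumed slider settings (prior k = 1, n = 2; likelihood k = 7, n = 10) and with the likelihood drawn as a Beta curve, as the demo cell does for visualization.

```python
import numpy as np
from scipy.stats import beta

# Illustrative slider settings (assumptions): prior k=1, n=2; likelihood k=7, n=10
a_prior, b_prior = 1 + 1, (2 - 1) + 1      # Beta(2, 2)
a_like, b_like = 7 + 1, (10 - 7) + 1       # Beta(8, 4), the curve drawn for the likelihood

# A few hand-picked values of theta
theta = np.array([0.2, 0.5, 0.8])
prior = beta.pdf(theta, a_prior, b_prior)
like = beta.pdf(theta, a_like, b_like)

# Unnormalized posterior height: prior height times likelihood height at each theta
unnormalised_posterior = prior * like

for t, p, l, u in zip(theta, prior, like, unnormalised_posterior):
    print(f"theta={t:.1f}: prior={p:.3f} x likelihood={l:.3f} = {u:.3f}")
```

Where either curve is close to zero, the product is close to zero too, which is why the posterior only has height where both prior and likelihood do.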

**Normalising the Posterior.** You can tick the "Normalize Posterior" checkbox to normalize the posterior.
To turn the posterior into a true probability distribution, we need to divide by the marginal likelihood: the constant that ensures the posterior integrates to 1 and is therefore a proper probability distribution.

$$
p(\theta \mid \text{data}) = \frac{p(\theta) \cdot p(\text{data} \mid \theta)}{p(\text{data})}
$$
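On a grid of θ values, the division above amounts to dividing the product curve by its own area. The sketch below assumes illustrative Beta(2, 2) and Beta(8, 4) curves for the prior and the plotted likelihood, and writes the trapezoid rule out by hand (the helper `area` is defined here, not a library function).

```python
import numpy as np
from scipy.stats import beta

theta = np.linspace(0, 1, 1001)
prior = beta.pdf(theta, 2, 2)        # illustrative prior (assumed)
like = beta.pdf(theta, 8, 4)         # illustrative likelihood curve (assumed)
unnormalised = prior * like

def area(y, x):
    """Area under the curve y(x) using the trapezoid rule."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# The normalizing constant is the area under the unnormalized curve
constant = area(unnormalised, theta)
posterior = unnormalised / constant

print(f"area before normalising: {constant:.3f}")
print(f"area after normalising:  {area(posterior, theta):.3f}")
print(f"area under the prior:    {area(prior, theta):.3f}")
```

Normalizing rescales the height of the curve but does not change its shape.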

**Exercise.** Tick the checkbox to normalize the posterior.

- Notice how the area under the curve is now 1, and the posterior is a proper probability distribution.

- The area under the prior and the posterior should be the same. 

- Do you need the same amount of paint to colour each in?

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta
import ipywidgets as widgets
from IPython.display import display

x = np.linspace(0, 1, 1000)

# --- Plotting function ---
def plot_bayes(k_prior, n_prior, k_like, n_like, show_norm_post):
    # Prior
    a_prior = k_prior + 1
    b_prior = (n_prior - k_prior) + 1
    y_prior = beta.pdf(x, a_prior, b_prior)

    # Likelihood (visual only)
    a_like = k_like + 1
    b_like = (n_like - k_like) + 1
    y_like = beta.pdf(x, a_like, b_like)

    # Unnormalized Posterior
    y_post = y_prior * y_like

    # Normalize if requested
    if show_norm_post:
        norm_const = np.trapz(y_post, x)
        y_post = y_post / norm_const if norm_const > 0 else np.zeros_like(x)

    # Plot
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.plot(x, y_prior, label="Prior", color="blue")
    ax.plot(x, y_like, label="Likelihood", color="red")
    ax.plot(x, y_post, label="Posterior", color="green")

    ax.set_xlabel("θ")
    ax.set_ylabel("Probability Density")
    ax.grid(True)
    ax.legend()
    ax.set_ylim(0, 10)  # Or another sensible fixed upper limit

    plt.show()

# --- Sliders ---
k_prior_slider = widgets.IntSlider(min=0, max=100, step=1, value=1, description="k_prior")
n_prior_slider = widgets.IntSlider(min=0, max=100, step=1, value=2, description="n_prior")

k_like_slider = widgets.IntSlider(min=0, max=100, step=1, value=1, description="k_like")
n_like_slider = widgets.IntSlider(min=0, max=100, step=1, value=2, description="n_like")

# --- Checkbox to toggle normalization
show_norm_post = widgets.Checkbox(value=False, description="Normalize Posterior", indent=False)

# --- Ensure k ≤ n ---
def update_k_max(slider_k, slider_n):
    slider_k.max = slider_n.value
    if slider_k.value > slider_k.max:
        slider_k.value = slider_k.max

def update_k_max_prior(*args):
    update_k_max(k_prior_slider, n_prior_slider)

def update_k_max_like(*args):
    update_k_max(k_like_slider, n_like_slider)

n_prior_slider.observe(update_k_max_prior, names='value')
n_like_slider.observe(update_k_max_like, names='value')
update_k_max_prior()
update_k_max_like()

# --- UI layout ---
ui = widgets.VBox([
    widgets.HTML("<b style='color:blue'>Prior sliders</b>"),
    widgets.HBox([n_prior_slider, k_prior_slider]),
    widgets.HTML("<b style='color:red'>Likelihood sliders</b>"),
    widgets.HBox([n_like_slider, k_like_slider]),
    show_norm_post
])

# --- Plot binding
out = widgets.interactive_output(plot_bayes, {
    'k_prior': k_prior_slider,
    'n_prior': n_prior_slider,
    'k_like': k_like_slider,
    'n_like': n_like_slider,
    'show_norm_post': show_norm_post
})

display(ui, out)



**Is the Likelihood a proper probability distribution?** Does the likelihood integrate to 1? Not the way we plot it. The likelihood is $p(\mathrm{data} \mid \theta)$: a probability distribution over the data, given a fixed value of $\theta$. It is not a probability distribution over $\theta$. So when we plot $\theta \mapsto p(\mathrm{data} \mid \theta)$, we are not plotting a probability density function — we're plotting a likelihood function, and it does not need to integrate to 1 over $\theta$. However, for each fixed $\theta$, the function is a proper probability distribution over data, and it satisfies:

$$
\int p(\mathrm{data} \mid \theta) \, d\,\mathrm{data} = 1
$$

So yes — the likelihood does integrate to 1, but only over the data, not over $\theta$.
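Both statements can be checked numerically for a binomial likelihood. The values n = 10, k = 5 and θ = 0.5 below are assumptions for illustration. Because the data are counts, the "integral" over data is a sum over k = 0, …, n. For the binomial, the integral over θ has a known value, 1 / (n + 1).

```python
import numpy as np
from scipy.stats import binom
from scipy.integrate import quad

n, k, theta_fixed = 10, 5, 0.5   # illustrative values (assumed)

# Sum over all possible data values k = 0..n for a fixed theta
total_over_data = binom.pmf(np.arange(n + 1), n, theta_fixed).sum()

# Integrate over theta from 0 to 1 for the fixed observed k
area_over_theta, _ = quad(lambda t: binom.pmf(k, n, t), 0.0, 1.0)

print(f"sum over data k:     {total_over_data:.3f}")
print(f"integral over theta: {area_over_theta:.3f}  (1 / (n + 1) = {1 / (n + 1):.3f})")
```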

**Exercise.** Wiggle the sliders to get different likelihoods:

- Left: plotted over different values of theta, the likelihood does not integrate to 1.

- Right: plotted over different values of the data, the likelihood does sum to 1.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom
import ipywidgets as widgets
from IPython.display import display

# --- Plotting function ---
def plot_likelihood_views(n, k, theta_fixed):
    theta_vals = np.linspace(0, 1, 500)
    likelihood_theta = binom.pmf(k, n, theta_vals)
    area_theta = np.trapz(likelihood_theta, theta_vals)

    k_vals = np.arange(0, n + 1)
    likelihood_data = binom.pmf(k_vals, n, theta_fixed)
    area_data = np.sum(likelihood_data)

    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # Left: likelihood as function of θ
    axes[0].plot(theta_vals, likelihood_theta, color='black', lw=2)
    axes[0].set_title(rf"$\theta \mapsto p(k={k} \mid \theta, n={n})$")
    axes[0].set_xlabel("θ")
    axes[0].set_ylabel(r"$p(k \mid \theta, n)$")
    axes[0].grid(True, linestyle='--', alpha=0.5)
    axes[0].set_ylim(0, 1)
    axes[0].text(0.95, 0.95, f"∫ dθ ≈ {area_theta:.3f}", transform=axes[0].transAxes,
                 ha='right', va='top', fontsize=10, bbox=dict(boxstyle='round', facecolor='white', edgecolor='gray'))

    # Right: likelihood as function of data (k)
    axes[1].bar(k_vals, likelihood_data, color='black', alpha=0.8)
    axes[1].set_title(rf"$k \mapsto p(k \mid \theta={theta_fixed:.2f}, n={n})$")
    axes[1].set_xlabel("k (number of successes)")
    axes[1].set_ylabel("Probability")
    axes[1].grid(True, linestyle='--', alpha=0.5)
    axes[1].set_ylim(0, 1)
    axes[1].text(0.95, 0.95, f"∑ = {area_data:.3f}", transform=axes[1].transAxes,
                 ha='right', va='top', fontsize=10, bbox=dict(boxstyle='round', facecolor='white', edgecolor='gray'))

    plt.tight_layout()
    plt.show()

# --- Sliders ---
n_slider = widgets.IntSlider(min=1, max=50, value=10, step=1, description="n (trials)")
k_slider = widgets.IntSlider(min=0, max=50, value=5, step=1, description="k (successes)")
theta_slider = widgets.FloatSlider(min=0.01, max=0.99, value=0.5, step=0.01, description="θ (fixed)")

# --- Enforce k ≤ n ---
def update_k_max(*args):
    k_slider.max = n_slider.value
    if k_slider.value > k_slider.max:
        k_slider.value = k_slider.max

n_slider.observe(update_k_max, names='value')
update_k_max()

# --- Layout ---
ui = widgets.VBox([
    widgets.HTML("Left: Likelihood as a function of θ"),
    widgets.HTML("Right: Likelihood as a function of data"),
    widgets.HBox([n_slider, k_slider, theta_slider])
])

out = widgets.interactive_output(plot_likelihood_views, {
    'n': n_slider,
    'k': k_slider,
    'theta_fixed': theta_slider
})

display(ui, out)




**Bayesian credibility intervals.** A Bayesian credibility interval (also called a credible interval) is a range of values that contains the true value of the parameter with a certain probability.
It is similar to a confidence interval in frequentist statistics, but it is based on the posterior distribution of the parameter rather than the sampling distribution. Its interpretation is actually what most people think of when they hear the term "confidence interval". It's what they are looking for when they ask for a confidence interval. You can construct a Bayesian credibility interval by picking the probability you want to contain the true value of the parameter, and then finding the range of values that contains that probability.
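One common construction is the equal-tailed interval, which cuts the same amount of probability off each tail. The sketch below assumes a standard normal belief and a 95% interval purely for illustration.

```python
from scipy.stats import norm

mu, sigma, mass = 0.0, 1.0, 0.95   # illustrative belief and interval mass (assumed)

# Equal-tailed interval: cut (1 - mass) / 2 of the probability off each tail
tail = (1 - mass) / 2
lower = norm.ppf(tail, loc=mu, scale=sigma)
upper = norm.ppf(1 - tail, loc=mu, scale=sigma)

# Check: the probability between the bounds equals the requested mass
contained = norm.cdf(upper, loc=mu, scale=sigma) - norm.cdf(lower, loc=mu, scale=sigma)

print(f"{mass:.0%} interval: [{lower:.2f}, {upper:.2f}], contains {contained:.3f}")
```

Other constructions exist, for example the narrowest interval containing the chosen probability. For a symmetric distribution like the Gaussian the two coincide.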

**Exercise.** Click on the code cell below and press play to run. 

- Play with the sliders to see how the credibility interval changes as you change the probability:

- The most typical is the 95% credibility interval $BCI_{95}$, which contains the value of the parameter with 95% probability. 

- The 50% credibility interval $BCI_{50}$ would contain the value of the parameter with 50% probability. And so on.

- How is this different from the frequentist confidence interval? 

In [7]:
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from ipywidgets import interact
from scipy.stats import norm

def plot_gaussian_bci(mu=0.0, sigma=1.0, bci=95.0):
    # Fixed x and y ranges
    x = np.linspace(-10, 10, 1000)
    y = norm.pdf(x, mu, sigma)

    # Compute credible interval bounds
    alpha = 1 - bci / 100
    lower_bound = norm.ppf(alpha / 2, mu, sigma)
    upper_bound = norm.ppf(1 - alpha / 2, mu, sigma)
    y_bound = norm.pdf([lower_bound, upper_bound], mu, sigma).min()

    fig, ax = plt.subplots(figsize=(8, 6))

    # Plot curve and shading
    ax.plot(x, y, label=f'N({mu:.2f}, {sigma:.2f}²)', color='purple')
    ax.fill_between(x, y, color='plum', alpha=0.2)

    # Highlight credible interval
    mask = (x >= lower_bound) & (x <= upper_bound)
    ax.fill_between(x[mask], y[mask], color='mediumvioletred', alpha=0.6,
                    label=f"{bci:.0f}% BCI: [{lower_bound:.2f}, {upper_bound:.2f}]")

    # Dashed lines at interval bounds
    ax.hlines(y=y_bound, xmin=lower_bound, xmax=upper_bound,
              color='black', linestyle='--', linewidth=1)
    ax.vlines([lower_bound, upper_bound], ymin=0, ymax=y_bound,
              color='black', linestyle='--', linewidth=1)

    # Fixed axes
    ax.set_xlim(-10, 10)
    ax.set_ylim(0, 0.5)  # Fixed y-range to accommodate all practical normal densities

    # Labels and legend
    ax.set_xlabel("θ")
    ax.set_ylabel("Probability Density")
    ax.grid(True, linestyle='--', alpha=0.5)
    ax.legend(loc="upper left")
    plt.show()

# Interactive sliders
interact(
    plot_gaussian_bci,
    mu=widgets.FloatSlider(min=-5, max=5, step=0.1, value=0.0, description="μ"),
    sigma=widgets.FloatSlider(min=0.1, max=3.0, step=0.1, value=1.0, description="σ"),
    bci=widgets.FloatSlider(min=50, max=99, step=1, value=95, description="BCI (%)")
)



**Summarising the posterior.** With a proper normalised posterior, we may want to summarise it with credible intervals. These are the Bayesian equivalent of confidence intervals. But better, obvs. 

**Exercise.** Play with the slider that finds the Xth percentile credible interval.

- What happens to the interval when you drop the percentile lower?

- Plot the MAP - Maximum a posteriori. What is it? 

- How would you define it?
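If you want to check your answers afterwards, here is a small sketch that computes the summaries for an assumed Beta(8, 4) posterior (chosen for illustration). It finds the MAP on a grid, as the code cell below does, and compares it with the closed-form mode of a Beta(a, b) distribution, which holds when a > 1 and b > 1.

```python
import numpy as np
from scipy.stats import beta

a_post, b_post = 8, 4   # illustrative posterior Beta(8, 4) (assumed)

# MAP: the value of theta where the posterior density is highest (grid search)
theta = np.linspace(0, 1, 10001)
map_grid = theta[np.argmax(beta.pdf(theta, a_post, b_post))]

# Closed-form mode of Beta(a, b), valid for a > 1 and b > 1
map_exact = (a_post - 1) / (a_post + b_post - 2)

# Posterior mean and equal-tailed 95% credible interval
posterior_mean = a_post / (a_post + b_post)
lower, upper = beta.ppf([0.025, 0.975], a_post, b_post)

print(f"MAP (grid) = {map_grid:.3f}, MAP (closed form) = {map_exact:.3f}")
print(f"mean = {posterior_mean:.3f}, 95% BCI = [{lower:.3f}, {upper:.3f}]")
```

For a skewed posterior the mean and the MAP differ: the mean is pulled toward the longer tail.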

In [10]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta
from scipy.special import betaln
import ipywidgets as widgets
from IPython.display import display

x = np.linspace(0, 1, 1000)

# --- Plotting function ---
def plot_bayes(k_prior, n_prior, k_like, n_like, ci_width, show_shading, show_mean, show_map, show_curves):
    # Prior parameters
    a_prior = k_prior + 1
    b_prior = (n_prior - k_prior) + 1
    y_prior = beta.pdf(x, a_prior, b_prior)

    # Likelihood as Beta PDF (for visualization only)
    a_like = k_like + 1
    b_like = (n_like - k_like) + 1
    y_like = beta.pdf(x, a_like, b_like)

    # Posterior parameters (from conjugate Beta-Binomial update)
    a_post = a_prior + k_like
    b_post = b_prior + n_like - k_like
    y_post = beta.pdf(x, a_post, b_post)

    # Compute proper marginal likelihood
    log_ml = betaln(k_like + a_prior, n_like - k_like + b_prior) - betaln(a_prior, b_prior)
    marginal_likelihood = np.exp(log_ml)

    # Compute Bayesian credible interval (BCI)
    lower = beta.ppf((1 - ci_width / 100) / 2, a_post, b_post)
    upper = beta.ppf(1 - (1 - ci_width / 100) / 2, a_post, b_post)

    # Posterior Mean and MAP
    posterior_mean = a_post / (a_post + b_post)
    map_index = np.argmax(y_post)
    posterior_map = x[map_index]

    # Plotting
    fig, ax = plt.subplots(figsize=(8, 6))
    
    if show_curves:
        ax.plot(x, y_prior, label="Prior", color="blue")
        ax.plot(x, y_like, label="Likelihood (visualized)", color="red")

    ml_text = f"Posterior (marginal likelihood ≈ {marginal_likelihood:.3f})"
    ax.plot(x, y_post, label=ml_text, color="green")

    if show_shading:
        ax.fill_between(x, y_post, where=(x >= lower) & (x <= upper), color='gray', alpha=0.3, label=f"{ci_width}% BCI")
    else:
        ax.axvline(lower, color='gray', linestyle='--', label=f"{ci_width}% BCI")
        ax.axvline(upper, color='gray', linestyle='--')

    if show_mean:
        ax.axvline(posterior_mean, color='black', linestyle=':', label="Posterior Mean")

    if show_map:
        ax.axvline(posterior_map, color='purple', linestyle='-.', label="Posterior MAP")

    ax.set_title("Bayesian Updating with BCI, Mean, and MAP")
    ax.set_xlabel("θ")
    ax.set_ylabel("Probability Density")
    ax.grid(True)
    ax.legend()
    plt.show()

# Sliders
k_prior_slider = widgets.IntSlider(min=0, max=100, step=1, value=1, description="k_prior", style={'description_width': 'initial'})
n_prior_slider = widgets.IntSlider(min=0, max=100, step=1, value=2, description="n_prior", style={'description_width': 'initial'})

k_like_slider = widgets.IntSlider(min=0, max=100, step=1, value=1, description="k_like", style={'description_width': 'initial'})
n_like_slider = widgets.IntSlider(min=0, max=100, step=1, value=2, description="n_like", style={'description_width': 'initial'})

ci_slider = widgets.IntSlider(min=50, max=99, step=1, value=95, description="BCI (%)", style={'description_width': 'initial'})

# Toggles (defaults set to False)
shading_toggle = widgets.Checkbox(value=False, description='Shade BCI')
mean_toggle = widgets.Checkbox(value=False, description='Show Posterior Mean')
map_toggle = widgets.Checkbox(value=False, description='Show MAP')
curves_toggle = widgets.Checkbox(value=False, description='Show Prior & Likelihood')

# Reusable fix for k ≤ n
def enforce_k_leq_n(k_slider, n_slider):
    k_slider.max = n_slider.value
    if k_slider.value > k_slider.max:
        k_slider.value = k_slider.max

# Observers
def update_k_max_prior(*args):
    enforce_k_leq_n(k_prior_slider, n_prior_slider)

def update_k_max_like(*args):
    enforce_k_leq_n(k_like_slider, n_like_slider)

n_prior_slider.observe(update_k_max_prior, names='value')
n_like_slider.observe(update_k_max_like, names='value')

# Initial sync
update_k_max_prior()
update_k_max_like()

# Layout
ui = widgets.VBox([
    widgets.HTML("<b style='color:blue'>Prior sliders</b>"),
    widgets.HBox([n_prior_slider, k_prior_slider]),
    widgets.HTML("<b style='color:red'>Likelihood sliders</b>"),
    widgets.HBox([n_like_slider, k_like_slider]),
    widgets.HTML("<b>Posterior Display Options</b>"),
    ci_slider,
    shading_toggle,
    mean_toggle,
    map_toggle,
    curves_toggle
])

out = widgets.interactive_output(plot_bayes, {
    'k_prior': k_prior_slider,
    'n_prior': n_prior_slider,
    'k_like': k_like_slider,
    'n_like': n_like_slider,
    'ci_width': ci_slider,
    'show_shading': shading_toggle,
    'show_mean': mean_toggle,
    'show_map': map_toggle,
    'show_curves': curves_toggle
})

display(ui, out)



**Compute the posterior by updating the beta.** With a beta prior, computing the posterior is very simple: add the number of successes k to the first parameter of the beta distribution and the number of failures n - k to the second. This is called computing the posterior analytically, and it works because the beta is a *conjugate* prior for this likelihood. This simplicity is, alas, not always possible.

**Advanced explanation for why this works.**
You can skip this for the moment; it's OK to assume that the equation works out this way for this specific case.
- Beta is the conjugate prior for the binomial likelihood
- Prior: $p(\theta) \propto \theta^{\alpha - 1}(1 - \theta)^{\beta - 1}$
- Likelihood: $p(D \mid \theta) \propto \theta^k (1 - \theta)^{n - k}$
- Multiply prior and likelihood, and the exponents add: $\theta^{\alpha + k - 1}(1 - \theta)^{\beta + n - k - 1}$
- Posterior: $\theta \mid D \sim \text{Beta}(\alpha + k,\ \beta + n - k)$
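The conjugate result can be checked against brute force. The sketch below assumes an illustrative Beta(2, 3) prior and data k = 7, n = 10. It computes the posterior twice: once by multiplying prior and binomial likelihood on a grid and normalizing numerically, and once with the shortcut Beta(α + k, β + n - k).

```python
import numpy as np
from scipy.stats import beta, binom

alpha_prior, beta_prior = 2.0, 3.0   # illustrative prior (assumed)
k, n = 7, 10                         # illustrative data (assumed)

theta = np.linspace(0, 1, 1001)

# Route 1: multiply prior by likelihood on a grid, then normalize with the trapezoid rule
unnormalised = beta.pdf(theta, alpha_prior, beta_prior) * binom.pmf(k, n, theta)
area = np.sum((unnormalised[1:] + unnormalised[:-1]) * np.diff(theta)) / 2
grid_posterior = unnormalised / area

# Route 2: the conjugate shortcut, adding k and n - k to the prior parameters
analytic_posterior = beta.pdf(theta, alpha_prior + k, beta_prior + n - k)

gap = np.max(np.abs(grid_posterior - analytic_posterior))
print(f"largest difference between the two curves: {gap:.6f}")
```

The two curves should agree up to small numerical error from the grid.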

**Exercise.** Click on the code cell below and press play to run.

- Play with the sliders to see how the posterior changes as you change the number of successes and failures.

- Notice that the posterior is another beta distribution. With the Beta(1, 1) prior used here, its parameters are $1 + k$ and $1 + n - k$.

- This makes it easy to compute the posterior. 

In [11]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta
import ipywidgets as widgets
from IPython.display import display, HTML

x = np.linspace(0, 1, 1000)

def plot_special_case(n, k):
    if k > n:
        print("Error: k must be ≤ n.")
        return

    # Prior: Beta(1, 1)
    a_prior, b_prior = 1, 1
    y_prior = beta.pdf(x, a_prior, b_prior)

    # Posterior: Beta(1 + k, 1 + n - k)
    a_post = 1 + k
    b_post = 1 + (n - k)
    y_post = beta.pdf(x, a_post, b_post)

    # Plot
    fig, ax = plt.subplots(figsize=(8, 6))

    # Plot prior
    ax.plot(x, y_prior, label="Prior: Beta(1, 1)", color="blue")
    ax.fill_between(x, y_prior, color="skyblue", alpha=0.4)

    # Plot posterior
    full_label = f"Posterior: Beta(1 + k, 1 + (n − k)) = Beta({a_post}, {b_post})"
    ax.plot(x, y_post, label=full_label, color="green")
    ax.fill_between(x, y_post, color="lightgreen", alpha=0.4)

    # Adjust layout and axis
    ax.set_xlabel("θ")
    ax.set_ylabel("Probability Density")
    ax.set_ylim(0, max(np.max(y_post), np.max(y_prior)) * 1.2)
    ax.grid(True)

    # Move legend outside to the right
    fig.subplots_adjust(right=0.75)
    ax.legend(loc='center left', bbox_to_anchor=(1.02, 0.5), frameon=False)

    plt.show()

# Sliders
n_slider = widgets.IntSlider(min=1, max=100, step=1, value=56, description="n (trials)", style={'description_width': 'initial'})
k_slider = widgets.IntSlider(min=0, max=56, step=1, value=43, description="k (correct)", style={'description_width': 'initial'})

# Auto-adjust k bounds
def update_k_slider(*args):
    k_slider.max = n_slider.value
    if k_slider.value > k_slider.max:
        k_slider.value = k_slider.max

n_slider.observe(update_k_slider, names='value')
update_k_slider()  # Initial sync

# Layout
ui = widgets.VBox([
    widgets.HBox([n_slider, k_slider])
])

out = widgets.interactive_output(plot_special_case, {'n': n_slider, 'k': k_slider})

display(ui, out)



**Sequential vs aggregated updating.**  Bayesian updating can be done in two ways. You can update your beliefs step by step as new data arrives (sequential updating), or you can combine all the data and update in one go (aggregated updating). With conjugate priors like the Beta distribution, both methods give the same result. Let’s see how this works in a simple example.

We’ll start with a uniform prior: Beta(1, 1). You then observe two datasets:
- First: 9 correct out of 10 (k = 9, n = 10)
- Second: 3 correct out of 5 (k = 3, n = 5)

There are two ways to compute the posterior:

**Sequential updating**  
   - Start with Beta(1, 1)  
   - Update with the first dataset → Beta(10, 2)  
   - Use that as the new prior, update with the second dataset → Beta(13, 4)

**Aggregated updating**  
   - Combine the data: k = 12, n = 15  
   - Start with Beta(1, 1), update once → Beta(13, 4)
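The two routes above can be written as a few lines of arithmetic. The helper `update` is a name introduced here for illustration; it only applies the conjugate rule of adding successes to the first parameter and failures to the second.

```python
def update(a, b, k, n):
    """Conjugate update: Beta(a, b) prior plus k successes in n trials."""
    return a + k, b + (n - k)

prior = (1, 1)

# Sequential: update with the first dataset, then use the result as the new prior
after_first = update(*prior, k=9, n=10)
sequential = update(*after_first, k=3, n=5)

# Aggregated: pool both datasets and update once
aggregated = update(*prior, k=9 + 3, n=10 + 5)

print("after first dataset:", after_first)
print("sequential:         ", sequential)
print("aggregated:         ", aggregated)
```

Because the update is just addition, the order in which the data arrive does not matter.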


**Exercise.** Use the sliders to try both methods and confirm they give the same result.

- First, try the sequential approach:  
  Set prior to Beta(1, 1), then enter k = 9, n = 10.  
  Use the resulting posterior as the new prior, then enter k = 3, n = 5.

- Then try the aggregated approach:  
  Set prior to Beta(1, 1), then enter k = 12, n = 15.

- Do you get the same posterior in both cases?



In [12]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta
import ipywidgets as widgets
from IPython.display import display

x = np.linspace(0, 1, 1000)

def plot_update(n_prior, k_prior, n_new, k_new):
    if k_prior > n_prior or k_new > n_new:
        print("Error: k must be ≤ n.")
        return

    # Prior = Beta(1 + k_prior, 1 + n_prior - k_prior)
    a_prior = 1 + k_prior
    b_prior = 1 + (n_prior - k_prior)
    y_prior = beta.pdf(x, a_prior, b_prior)

    # Posterior = Beta(1 + total_k, 1 + total_n - total_k)
    total_k = k_prior + k_new
    total_n = n_prior + n_new
    a_post = 1 + total_k
    b_post = 1 + (total_n - total_k)
    y_post = beta.pdf(x, a_post, b_post)

    # Plot
    fig, ax = plt.subplots(figsize=(8, 6))

    ax.plot(x, y_prior, label=f"Prior: Beta(1 + {k_prior}, 1 + {n_prior - k_prior})", color="blue")
    ax.fill_between(x, y_prior, color="skyblue", alpha=0.4)

    ax.plot(x, y_post, label=f"Posterior: Beta(1 + {total_k}, 1 + {total_n - total_k})", color="green")
    ax.fill_between(x, y_post, color="lightgreen", alpha=0.4)

    ax.set_xlabel("θ")
    ax.set_ylabel("Probability Density")
    ax.set_ylim(0, max(np.max(y_prior), np.max(y_post)) * 1.2)
    ax.grid(True)

    # Adjust spacing and move legend to right
    fig.subplots_adjust(right=0.75)
    ax.legend(loc='center left', bbox_to_anchor=(1.02, 0.5), frameon=False)

    plt.show()

# Sliders
n_prior_slider = widgets.IntSlider(min=0, max=100, step=1, value=20, description="n_prior")
k_prior_slider = widgets.IntSlider(min=0, max=20, step=1, value=10, description="k_prior")

n_new_slider = widgets.IntSlider(min=0, max=100, step=1, value=30, description="n_new")
k_new_slider = widgets.IntSlider(min=0, max=30, step=1, value=15, description="k_new")

# Auto-limit k sliders
def update_k_prior_max(*args):
    k_prior_slider.max = n_prior_slider.value
    if k_prior_slider.value > k_prior_slider.max:
        k_prior_slider.value = k_prior_slider.max

def update_k_new_max(*args):
    k_new_slider.max = n_new_slider.value
    if k_new_slider.value > k_new_slider.max:
        k_new_slider.value = k_new_slider.max

n_prior_slider.observe(update_k_prior_max, names='value')
n_new_slider.observe(update_k_new_max, names='value')

# Initial sync
update_k_prior_max()
update_k_new_max()

# Layout
ui = widgets.VBox([
    widgets.HTML("🔵 <b>Prior</b>"),
    widgets.HBox([n_prior_slider, k_prior_slider]),
    widgets.HTML("🟢 <b>New Data</b>"),
    widgets.HBox([n_new_slider, k_new_slider])
])

out = widgets.interactive_output(plot_update, {
    'n_prior': n_prior_slider,
    'k_prior': k_prior_slider,
    'n_new': n_new_slider,
    'k_new': k_new_slider
})

display(ui, out)



**MCMC.** To illustrate the power of MCMC, we use this interactive demo, which runs JAGS via Python. JAGS is a sampler that implements MCMC, though there are many other samplers available. The model is written in JAGS's own modeling language: it is everything inside "model {...}" in the code cell below. Here the model is a simple Bernoulli model, with a prior on theta and a likelihood for observing k successes out of n trials.

**Exercise.** Play with the data k and n and set the number of samples that the sampler draws. Keep it low to begin with.

- Compare sampling to the analytical posterior for the same data. Click the "Show analytic posterior" button to overlay the analytical posterior.

- How do you get the MCMC posterior to better match the analytical posterior? 
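
The analytical posterior used for the overlay comes from conjugacy: a Beta(1, 1) prior updated with k successes and n − k failures gives Beta(1 + k, 1 + n − k). The sketch below, under the assumption that we draw independent samples directly from that Beta (so not MCMC), shows how a sample-based estimate compares with the exact value for different sample sizes.

```python
import numpy as np
from scipy.stats import beta

k, n = 6, 10
posterior = beta(1 + k, 1 + n - k)  # Beta(1, 1) prior + k successes, n - k failures

rng = np.random.default_rng(1)
for n_samples in (100, 1000, 10000):
    # Independent draws from the exact posterior (an idealised stand-in for a sampler)
    draws = posterior.rvs(size=n_samples, random_state=rng)
    error = abs(draws.mean() - posterior.mean())
    print(f"{n_samples:>6} samples: mean = {draws.mean():.3f}, error = {error:.4f}")

print(f"Exact posterior mean: {posterior.mean():.3f}")
```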

In [None]:
import pyjags
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display
from scipy.stats import beta

# Ensure plots show inline
%matplotlib inline

def run_jags(k=6, n=10, samples=100, show_analytic=False):
    if k > n:
        print("k must be ≤ n")
        return

    # Data setup
    y = np.array([1]*k + [0]*(n-k))

    # JAGS model string
    model_code = """
    model {
      for (i in 1:N) {
        y[i] ~ dbern(theta)
      }
      theta ~ dbeta(1, 1)
    }
    """

    data = {"y": y, "N": len(y)}
    inits = [{"theta": 0.5}]

    # Run JAGS model
    model = pyjags.Model(code=model_code, data=data, init=inits, chains=1)
    model.update(100)  # Burn-in
    result = model.sample(samples, vars=["theta"])
    theta_samples = result["theta"].reshape(-1)

    # Plotting
    fig, axs = plt.subplots(1, 3, figsize=(15, 4))

    # --- Trace plot ---
    axs[0].plot(theta_samples)
    axs[0].set_title("Trace of θ (Chain 1)")
    axs[0].set_xlabel("Sample")
    axs[0].set_ylabel("θ")
    axs[0].set_ylim(0, 1)
    axs[0].grid(True)

    # --- Horizontal histogram ---
    axs[1].hist(theta_samples, bins=30, density=True, alpha=0.6, color='gray', orientation='horizontal', label='MCMC posterior')
    axs[1].set_title("Posterior of θ")
    axs[1].set_xlabel("Density")
    axs[1].set_ylabel("θ")
    axs[1].set_ylim(0, 1)

    # --- Line-over-histogram plot ---
    axs[2].hist(theta_samples, bins=30, density=True, alpha=0.5, color='steelblue', label='MCMC posterior')

    # Optional analytical overlay
    if show_analytic:
        alpha_post = 1 + k
        beta_post = 1 + (n - k)
        theta_vals = np.linspace(0, 1, 200)
        analytical_posterior = beta.pdf(theta_vals, alpha_post, beta_post)
        axs[1].plot(analytical_posterior, theta_vals, 'r-', lw=2, label='Analytical posterior')
        axs[1].legend()

        axs[2].plot(theta_vals, analytical_posterior, 'r-', lw=2, label='Analytical posterior')
        axs[2].legend()

    axs[2].set_title(f"Posterior of θ (k={k}, n={n})")
    axs[2].set_xlabel("θ")
    axs[2].set_ylabel("Density")
    axs[2].set_xlim(0, 1)
    axs[2].grid(True)

    plt.tight_layout()
    plt.show()

# Widgets
k_slider = widgets.IntSlider(value=6, min=0, max=20, step=1, description='Successes (k):')
n_slider = widgets.IntSlider(value=10, min=1, max=20, step=1, description='Trials (n):')
sample_slider = widgets.IntSlider(value=100, min=100, max=10000, step=100, description='Samples:')

analytic_toggle = widgets.ToggleButton(
    value=False,
    description='Show analytic posterior',
    tooltip='Toggle analytical posterior overlay',
    icon='line-chart'
)

# Link function to widgets
ui = widgets.VBox([k_slider, n_slider, sample_slider, analytic_toggle])
out = widgets.interactive_output(run_jags, {
    'k': k_slider,
    'n': n_slider,
    'samples': sample_slider,
    'show_analytic': analytic_toggle
})

# Display
display(ui, out)



**Binomial distribution.**  Here we can see the binomial distribution: the probability of getting `k` successes in `n` independent trials, where each trial has a success probability of `θ`. Use the sliders below to adjust `n` and `θ`, and watch how the shape of the distribution changes. The equation updates to show how the probability is computed based on the current values.

**Exercise.** Play with the sliders and answer the following:

- What happens when θ is close to 0 or close to 1?

- How does the distribution change as you increase n?

- When is the distribution symmetric, and when is it skewed?

- What value of `k` seems most likely, and how does that relate to `θ × n`?

Try predicting the outcome before you move the sliders — then use the plot to check your intuition.
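
If you want to check your answers numerically as well as by eye, a small sketch like the one below may help. It uses `scipy.stats.binom` rather than the hand-written formula in the cell below, and the particular (n, θ) pairs are just examples chosen here.

```python
import numpy as np
from scipy.stats import binom

modes = {}
for n, theta in [(10, 0.5), (10, 0.9), (100, 0.9)]:
    k = np.arange(n + 1)
    pmf = binom.pmf(k, n, theta)          # P(k | theta, n) for every possible k
    modes[(n, theta)] = int(k[np.argmax(pmf)])
    print(f"n={n:>3}, θ={theta}: most likely k = {modes[(n, theta)]}, "
          f"θ × n = {theta * n:.1f}, sd = {binom.std(n, theta):.2f}")
```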


In [14]:
import math

import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, Math, Latex, clear_output

# Define sliders
theta_slider = widgets.FloatSlider(value=0.5, min=0.01, max=0.99, step=0.01, description='θ')
n_slider = widgets.IntSlider(value=10, min=1, max=100, step=1, description='n')

# Define function to update the plot and LaTeX
def update_plot(theta, n):
    clear_output(wait=True)
    
    # Binomial support and PMF
    # (math.comb replaces np.math.comb, which recent NumPy versions no longer provide)
    k = np.arange(0, n+1)
    pmf = [math.comb(n, int(ki)) * theta**ki * (1 - theta)**(n - ki) for ki in k]
    
    # Display formula with current values
    display(Math(f"P(k\\mid\\theta={theta:.2f},\\ n={n}) = \\binom{{n}}{{k}} \\theta^k (1 - \\theta)^{{n - k}}"))
    
    # Plot
    plt.figure(figsize=(8, 4))
    plt.bar(k, pmf, color='skyblue', edgecolor='black')
    plt.title(f'Binomial Distribution: n={n}, θ={theta:.2f}')
    plt.xlabel('k')
    plt.ylabel('P(k | θ, n)')
    plt.grid(True, axis='y', linestyle=':')
    plt.show()

# Interactive output
widgets.interact(update_plot, theta=theta_slider, n=n_slider)



**MCMC convergence checks.** This is the same model as in the MCMC demo, now written in binomial form: a prior on theta and a likelihood for observing k successes out of n trials. The model is written in the JAGS modelling language; it is everything inside "model {...}" in the code cell below. We run it here with four chains so that we can look at the model outputs and check that they look OK.

**Exercise.** Play with the sliders, run the model, and check the convergence diagnostics.
- Do the chains look like they have converged?
- Do they look like "hairy caterpillars"?
- Is the R-hat statistic close to 1?
- Is the Pope Catholic?
- Just checking you are awake. x 
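
To see what R-hat is sensitive to, here is a small sketch that applies the same formula used in the cell below to synthetic "chains". The chains are independent Gaussian draws invented for this example (an assumption, not real MCMC output): one set agrees, and in the other set one chain sits at a different value.

```python
import numpy as np

def rhat(chains):
    """Gelman-Rubin statistic for an array of shape (n_chains, n_samples)."""
    chains = np.asarray(chains)
    n = chains.shape[1]
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
good = rng.normal(0.6, 0.1, size=(4, 1000))            # four chains, same target
bad = good + np.array([[0.0], [0.0], [0.0], [0.3]])    # last chain shifted upwards

print(f"R-hat, chains that agree:  {rhat(good):.3f}")
print(f"R-hat, one shifted chain:  {rhat(bad):.3f}")
```

When the chains disagree, the between-chain variance B inflates the numerator and R-hat moves away from 1.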


In [15]:
import pyjags
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, Math, Latex, clear_output

# Define the JAGS model as a string
model_code = """
model {
  theta ~ dbeta(1,1)
  k ~ dbin(theta, n)
}
"""

# R-hat computation
def compute_rhat(chains):
    m = len(chains)
    n = len(chains[0])
    
    chain_means = np.array([np.mean(c) for c in chains])
    chain_vars = np.array([np.var(c, ddof=1) for c in chains])

    B = n * np.var(chain_means, ddof=1)
    W = np.mean(chain_vars)
    var_hat = ((n - 1) / n) * W + (1 / n) * B
    Rhat = np.sqrt(var_hat / W)

    return Rhat, B, W, var_hat, chain_means, chain_vars

# Main model runner
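def run_model(k, n, samples):
    clear_output(wait=True)
    if k > n:
        # The k slider can exceed n, but successes cannot exceed trials
        print("k must be ≤ n")
        return
    print(f"Running with k={k}, n={n}, samples={samples} per chain\n")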
    
    chains = []
    for i in range(4):
        data = {'k': k, 'n': n}
        model = pyjags.Model(code=model_code, data=data, chains=1, adapt=500)
        model.update(1000)
        result = model.sample(samples, vars=['theta'])
        theta_samples = result['theta'].reshape(-1)
        chains.append(theta_samples)
        plt.plot(theta_samples, label=f"Chain {i+1}")

    plt.title("Trace plots of θ for 4 chains")
    plt.xlabel("Sample")
    plt.ylabel("θ")
    plt.ylim(0, 1)
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.show()

    # Compute R-hat
    Rhat, B, W, var_hat, chain_means, chain_vars = compute_rhat(chains)

    # Print results
    print("Per-chain statistics:")
    for i, (mean, var) in enumerate(zip(chain_means, chain_vars)):
        print(f"  Chain {i+1}: mean = {mean:.4f}, var = {var:.5f}")

    print(f"\nBetween-chain variance (B): {B:.5f}")
    print(f"Within-chain variance (W): {W:.5f}")
    print(f"Estimated variance (var_hat): {var_hat:.5f}")
    print(f"R-hat: {Rhat:.4f}\n")

    display(Math(r"""
    \hat{R} = \sqrt{ \frac{ \left( \frac{n-1}{n} \right) W + \left( \frac{1}{n} \right) B }{W} }
    """))

# Interactive widgets
k_slider = widgets.IntSlider(value=6, min=0, max=20, description='Successes (k)')
n_slider = widgets.IntSlider(value=10, min=1, max=20, description='Trials (n)')
samples_slider = widgets.IntSlider(value=1000, min=100, max=5000, step=100, description='Samples')

run_button = widgets.Button(description="Run model")
output = widgets.Output()

def on_click(b):
    with output:
        run_model(k_slider.value, n_slider.value, samples_slider.value)

run_button.on_click(on_click)
display(widgets.VBox([k_slider, n_slider, samples_slider, run_button, output]))



**Inferring the difference between two rates.** In this model, we observe two processes — for example, two groups or experimental conditions — each with a number of successes out of a number of trials. We assume the underlying success rates are governed by parameters $\theta_1$ and $\theta_2$, with uniform Beta priors:

$\theta_1 \sim \text{Beta}(1, 1)$, $\theta_2 \sim \text{Beta}(1, 1)$  
$k_1 \sim \text{Binomial}(n_1, \theta_1)$, $k_2 \sim \text{Binomial}(n_2, \theta_2)$

Our quantity of interest is the difference in success rates:

$\delta = \theta_1 - \theta_2$

This tells us how much more (or less) likely success is in one group than the other.

Use the sliders to adjust $k_1$, $n_1$, $k_2$, and $n_2$. The trace plots show MCMC samples of $\delta$, and the histogram shows its posterior distribution. You can also overlay an approximate analytical solution for comparison.

**Exercise.** Explore the model using the sliders and answer the following:

- When does the posterior of $\delta$ look symmetric? When is it skewed?
- How does increasing $n_1$ or $n_2$ affect the width of the posterior?
- What happens when $k_1 = k_2$ but $n_1 \ne n_2$?
- Use the checkbox to compare the MCMC posterior with the analytical approximation. How close are they?

Try predicting what the posterior will look like before adjusting the sliders — then use the plot to check your intuition.
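
One way to get a reference solution for δ without MCMC: because the two rates have independent priors and independent data, their posteriors are independent Beta distributions, so we can draw from each and subtract. The sketch below assumes the default slider values; the sample size and seed are arbitrary choices for illustration.

```python
import numpy as np

k1, n1 = 15, 20
k2, n2 = 10, 20

rng = np.random.default_rng(0)
# Conjugate posteriors: Beta(1 + k, 1 + n - k) for each rate
theta1 = rng.beta(1 + k1, 1 + n1 - k1, size=20000)
theta2 = rng.beta(1 + k2, 1 + n2 - k2, size=20000)
delta = theta1 - theta2

lower, upper = np.percentile(delta, [2.5, 97.5])
print(f"Posterior mean of δ: {delta.mean():.3f}")
print(f"95% interval: [{lower:.3f}, {upper:.3f}]")
print(f"P(δ > 0 | data): {(delta > 0).mean():.3f}")
```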


In [16]:
import pyjags
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, Math, clear_output

# JAGS model code
model_code = """
model {
  k1 ~ dbin(theta1, n1)
  k2 ~ dbin(theta2, n2)
  theta1 ~ dbeta(1,1)
  theta2 ~ dbeta(1,1)
  delta <- theta1 - theta2
}
"""

# R-hat calculation
def compute_rhat(chains):
    m = len(chains)
    n = len(chains[0])
    chain_means = np.array([np.mean(c) for c in chains])
    chain_vars = np.array([np.var(c, ddof=1) for c in chains])
    B = n * np.var(chain_means, ddof=1)
    W = np.mean(chain_vars)
    var_hat = ((n - 1) / n) * W + (1 / n) * B
    Rhat = np.sqrt(var_hat / W)
    return Rhat, B, W, var_hat, chain_means, chain_vars

# Main model function
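def run_model(k1, n1, k2, n2, samples):
    clear_output(wait=True)
    if k1 > n1 or k2 > n2:
        # The k sliders can exceed the n sliders, but successes cannot exceed trials
        print("Each k must be ≤ its n")
        return
    print(f"Running with (k1={k1}, n1={n1}), (k2={k2}, n2={n2}), samples={samples} per chain")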

    data = {"k1": k1, "n1": n1, "k2": k2, "n2": n2}
    chains = []

    for i in range(4):
        model = pyjags.Model(code=model_code, data=data, chains=1, adapt=500)
        model.update(1000)
        result = model.sample(samples, vars=["delta"])
        delta_chain = result["delta"].reshape(-1)
        chains.append(delta_chain)

    # Plot traces and histogram
    fig = plt.figure(figsize=(12, 6))
    grid = plt.GridSpec(1, 2, width_ratios=[3, 1], wspace=0.3)

    ax_traces = fig.add_subplot(grid[0])
    for i in range(4):
        ax_traces.plot(chains[i], label=f"Chain {i+1}")
    ax_traces.set_title("Trace plots of δ (4 chains)")
    ax_traces.set_xlabel("Sample")
    ax_traces.set_ylabel("δ = θ₁ − θ₂")
    ax_traces.axhline(0, color='black', linestyle='--', lw=1)
    ax_traces.legend()
    ax_traces.grid(True)

    ax_hist = fig.add_subplot(grid[1])
    all_samples = np.concatenate(chains)
    ax_hist.hist(all_samples, bins=30, orientation='horizontal', density=True, color='gray', alpha=0.6, label='Posterior')
    ax_hist.set_title("Posterior of δ")
    ax_hist.set_xlabel("Density")
    ax_hist.set_ylabel("δ")
    ax_hist.grid(True)

    plt.tight_layout()
    plt.show()

    # R-hat diagnostics
    Rhat, B, W, var_hat, chain_means, chain_vars = compute_rhat(chains)

    print("Per-chain statistics:")
    for i, (mean, var) in enumerate(zip(chain_means, chain_vars)):
        print(f"  Chain {i+1}: mean = {mean:.4f}, var = {var:.5f}")

    print(f"\nBetween-chain variance (B): {B:.5f}")
    print(f"Within-chain variance (W): {W:.5f}")
    print(f"Estimated variance (var_hat): {var_hat:.5f}")
    print(f"R-hat: {Rhat:.4f}\n")

    display(Math(r"""
    \hat{R} = \sqrt{ \frac{ \left( \frac{n-1}{n} \right) W + \left( \frac{1}{n} \right) B }{W} }
    """))

# Sliders
k1_slider = widgets.IntSlider(value=15, min=0, max=30, description='k₁')
n1_slider = widgets.IntSlider(value=20, min=1, max=40, description='n₁')
k2_slider = widgets.IntSlider(value=10, min=0, max=30, description='k₂')
n2_slider = widgets.IntSlider(value=20, min=1, max=40, description='n₂')
samples_slider = widgets.IntSlider(value=1000, min=100, max=5000, step=100, description='Samples')

run_button = widgets.Button(description="Run model")
output = widgets.Output()

def on_click(b):
    with output:
        run_model(
            k1_slider.value, n1_slider.value,
            k2_slider.value, n2_slider.value,
            samples_slider.value
        )

run_button.on_click(on_click)

display(widgets.VBox([
    k1_slider, n1_slider, k2_slider, n2_slider,
    samples_slider, run_button, output
]))



**Inferring a common rate.** In this model, we assume a single underlying common rate $\theta$ governs two independent processes: one produces $k_1$ successes out of $n_1$ trials, and the other produces $k_2$ out of $n_2$. We use a uniform Beta prior for $\theta$ and observe both outcomes to update our belief about its value. This model is useful when we believe the same latent process drives two different datasets. Later we can turn this into the question of whether the data are better modelled by a single rate (e.g. ability) or by two different rates.

**Exercise.**  Use the sliders to explore how the posterior of $\theta$ responds to different inputs:

- What happens when $k_1$ and $k_2$ are both small or both large?
- How does increasing $n_1$ or $n_2$ while keeping $k_1$ and $k_2$ fixed affect the posterior? 
- What if one process supports a high $\theta$ and the other supports a low one?
- How can you change the sliders such that process 1 or 2 has more influence over the posterior? Why does this work?

Try guessing what the posterior will look like before you run the code — then use the plots to test your expectations.
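
For a quick check against the sampler, the common-rate posterior also has a closed form: with a Beta(1, 1) prior, successes and failures from both processes simply add up. A minimal sketch, assuming the default slider values:

```python
from scipy.stats import beta

k1, n1 = 6, 10
k2, n2 = 4, 10

# Both datasets inform the same θ, so their counts are pooled
a_post = 1 + k1 + k2
b_post = 1 + (n1 - k1) + (n2 - k2)
posterior = beta(a_post, b_post)

lower, upper = posterior.interval(0.95)
print(f"Posterior: Beta({a_post}, {b_post})")
print(f"Mean = {posterior.mean():.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

Notice that each process enters only through its counts, which is a hint for the last exercise question.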


In [None]:
import pyjags
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, clear_output

# JAGS model code
model_code = """
model {
  k1 ~ dbin(theta, n1)
  k2 ~ dbin(theta, n2)
  theta ~ dbeta(1,1)
}
"""

# Function to run the model
def run_model(k1, k2, n1, n2, samples):
    clear_output(wait=True)
    print(f"Running with k1={k1}, n1={n1}, k2={k2}, n2={n2}, samples={samples}")
    
    data = {"k1": k1, "k2": k2, "n1": n1, "n2": n2}
    model = pyjags.Model(code=model_code, data=data, chains=1, adapt=500)
    model.update(1000)
    samples_dict = model.sample(samples, vars=["theta"])
    theta_chain = samples_dict["theta"].reshape(-1)

    # Plotting
    fig, axs = plt.subplots(1, 2, figsize=(12, 4), gridspec_kw={"width_ratios": [3, 1]})
    
    # Trace plot
    axs[0].plot(theta_chain)
    axs[0].set_title("Trace plot of θ")
    axs[0].set_xlabel("Iteration")
    axs[0].set_ylabel("θ")
    axs[0].set_ylim(0, 1)
    axs[0].grid(True)

    # Horizontal histogram
    axs[1].hist(theta_chain, bins=30, orientation="horizontal", density=True, color="gray", edgecolor="black")
    axs[1].set_title("Posterior of θ")
    axs[1].set_xlabel("Density")
    axs[1].set_ylabel("θ")
    axs[1].set_ylim(0, 1)
    axs[1].grid(True)

    plt.tight_layout()
    plt.show()

# Sliders
n1_slider = widgets.IntSlider(value=10, min=1, max=30, description="n₁")
k1_slider = widgets.IntSlider(value=6, min=0, max=n1_slider.value, description="k₁")

n2_slider = widgets.IntSlider(value=10, min=1, max=30, description="n₂")
k2_slider = widgets.IntSlider(value=4, min=0, max=n2_slider.value, description="k₂")

samples_slider = widgets.IntSlider(value=1000, min=100, max=5000, step=100, description="Samples")
run_button = widgets.Button(description="Run model")
output = widgets.Output()

# Link k1 max to n1
def update_k1_max(*args):
    k1_slider.max = n1_slider.value
    if k1_slider.value > k1_slider.max:
        k1_slider.value = k1_slider.max

def update_k2_max(*args):
    k2_slider.max = n2_slider.value
    if k2_slider.value > k2_slider.max:
        k2_slider.value = k2_slider.max

n1_slider.observe(update_k1_max, names='value')
n2_slider.observe(update_k2_max, names='value')

# Run button action
def on_click(b):
    with output:
        run_model(
            k1_slider.value, k2_slider.value,
            n1_slider.value, n2_slider.value,
            samples_slider.value
        )

run_button.on_click(on_click)

# Display everything
display(widgets.VBox([
    n1_slider, k1_slider,
    n2_slider, k2_slider,
    samples_slider,
    run_button,
    output
]))



**Simulate guessers and tryers.** This code lets you simulate a mix of guessers and tryers, along with the ability of the tryers. You can adjust the number of guessers, the number of tryers, and the number of trials in the test (10 by default, up to 100) using the sliders. The mean ability of the tryers is also adjustable, and you can see how this affects the distribution of scores.

**Exercise.** Play around with the sliders to see how the distribution of scores changes as you adjust the number of guessers and tryers, and the ability of the tryers. 

- What happens when the number of trials is low?

- Is it harder or easier to tell the difference between guessers and tryers?

- What happens when you change the ability of the tryers? Why?

- How would this inform your experimental design?
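
One rough way to put a number on "how hard is it to tell them apart" is the overlap between the two score distributions. The sketch below is a simplification: it assumes every tryer has the same fixed ability, whereas the simulation below draws each tryer's θ from a Beta distribution. The `overlap` helper is a name made up for this example.

```python
import numpy as np
from scipy.stats import binom

def overlap(n, theta_tryer, theta_guesser=0.5):
    """Shared probability mass of the two score distributions (0 = separable, 1 = identical)."""
    k = np.arange(n + 1)
    shared = np.minimum(binom.pmf(k, n, theta_guesser), binom.pmf(k, n, theta_tryer))
    return shared.sum()

for n in (5, 20, 100):
    print(f"n = {n:>3}: overlap with tryer θ=0.9 is {overlap(n, 0.9):.3f}, "
          f"with tryer θ=0.6 is {overlap(n, 0.6):.3f}")
```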

In [49]:
import pyjags
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display

# Store for simulated data
sim_data = {}
sim_output = widgets.Output()

# --- SIMULATION ---
def simulate_data(num_guessers, num_tryers, phi, n):
    sim_output.clear_output(wait=True)

    # Clamp phi to valid Beta range
    phi = np.clip(phi, 0.01, 0.99)

    p = num_guessers + num_tryers
    z_true = np.array([0]*num_guessers + [1]*num_tryers)
    np.random.shuffle(z_true)
    theta = np.where(z_true == 0, 0.5, np.random.beta(phi*10, (1-phi)*10, size=p))
    k = np.random.binomial(n, theta)

    sim_data.update({"z_true": z_true, "k": k, "p": p, "n": n, "theta": theta})

    with sim_output:
        plt.figure(figsize=(10, 4))
        plt.bar(np.arange(p), k, color='steelblue', edgecolor='black')
        plt.title(f"Simulated correct responses (n = {n})")
        plt.xlabel("Participant")
        plt.ylabel("Correct responses")
        plt.ylim(0, n)
        plt.grid(True)
        plt.show()

        print("True z values (0 = guesser, 1 = tryer):")
        print(z_true)
        print("Simulated theta values:")
        print(np.round(theta, 3))

# --- WIDGETS ---
guess_slider = widgets.IntSlider(value=5, min=0, max=30, description="Guessers")
tryer_slider = widgets.IntSlider(value=15, min=0, max=30, description="Tryers")
phi_slider = widgets.FloatSlider(value=0.9, min=0.01, max=0.99, step=0.01, description="Mean Ability φ")
n_slider = widgets.IntSlider(value=10, min=1, max=100, step=1, description="Trials (n)")

simulate_button = widgets.Button(description="Simulate")

simulate_button.on_click(lambda b: simulate_data(
    guess_slider.value, tryer_slider.value, phi_slider.value, n_slider.value
))

# --- DISPLAY UI ---
display(widgets.VBox([
    widgets.HTML("<b>1. Simulate data:</b>"),
    guess_slider,
    tryer_slider,
    phi_slider,
    n_slider,
    simulate_button,
    sim_output
]))



**Simulate a single subject then infer guesser or tryer.** This code allows you to simulate a single subject's performance on the test, and then infer whether they are a guesser or a tryer based on their score. The model assumes that the subject is either a guesser or a tryer, and uses the observed score to update the prior beliefs about their type. 

**Exercise.** 
- Play around with the sliders to see how the posterior probabilities of being a guesser or a tryer change as you make the guesser and the tryer closer, or further apart. 
- Play around with the amount of data you have from the experiment. Does this make it easier to tell the difference between guessers and tryers when they are closer together?
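
For this particular inference model the answer can also be worked out exactly, which gives something to compare the sampler against. A guesser has θ fixed at 0.5, so the probability of the data is just the binomial probability. A tryer has a uniform Beta(1, 1) prior on θ, and averaging the binomial likelihood over that prior gives 1 / (n + 1) for every k. The sketch below assumes equal prior probability for the two types, as in the JAGS model in the cell below; `p_guesser` is a name made up here.

```python
from math import comb

def p_guesser(k, n):
    """Posterior probability of 'guesser' under equal prior odds."""
    m_guess = comb(n, k) * 0.5 ** n   # marginal likelihood with θ fixed at 0.5
    m_try = 1.0 / (n + 1)             # binomial likelihood averaged over a uniform prior
    return m_guess / (m_guess + m_try)

for k, n in [(5, 10), (9, 10), (50, 100), (90, 100)]:
    print(f"k = {k:>2}, n = {n:>3}: p(guesser | data) = {p_guesser(k, n):.3f}")
```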

In [71]:
import pyjags
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, clear_output

# Output areas
sim_output = widgets.Output()
infer_output = widgets.Output()

# Store simulated data
sim_data = {}

# --- SIMULATE ONE SUBJECT ---
def simulate_one_subject(z_true, phi, n):
    sim_output.clear_output(wait=True)
    
    theta = 0.5 if z_true == 0 else np.random.beta(phi * 10, (1 - phi) * 10)
    k = np.random.binomial(n, theta)
    
    sim_data.update({"z_true": z_true, "k": k, "n": n})

    with sim_output:
        print(f"True z: {z_true} ({'guesser' if z_true == 0 else 'tryer'})")
        print(f"Generated θ: {theta:.3f}")
        print(f"Observed correct responses: k = {k} out of n = {n}")

# --- INFER ONE SUBJECT ---
def infer_one_subject(samples):
    infer_output.clear_output(wait=True)
    
    if not sim_data:
        with infer_output:
            print("⚠️ Please simulate data first.")
        return

    k = sim_data["k"]
    n = sim_data["n"]
    z_true = sim_data["z_true"]

    model_code = """
    model {
      z ~ dcat(pi[1:2])
      pi[1] <- 0.5
      pi[2] <- 0.5

      theta[1] <- 0.5
      theta[2] ~ dbeta(1, 1)

      k ~ dbin(theta[z], n)
    }
    """

    data = {"k": k, "n": n}

    model = pyjags.Model(code=model_code, data=data, chains=1, adapt=500)
    model.update(1000)
    result = model.sample(samples, vars=["z"])
    z_chain = result["z"].reshape(-1)

    with infer_output:
        plt.figure(figsize=(10, 4))

        plt.hist(z_chain, bins=[0.5, 1.5, 2.5], rwidth=0.7, align='mid', color='gray')
        plt.xticks([1, 2], ['z=1 (guesser)', 'z=2 (tryer)'])
        plt.title("Posterior distribution of z")
        plt.xlabel("Model")
        plt.ylabel("Frequency")
        plt.grid(True)
        plt.tight_layout()
        plt.show()

        p1 = np.mean(z_chain == 1)
        p2 = np.mean(z_chain == 2)
        print(f"Posterior p(z = 1): {p1:.3f}")
        print(f"Posterior p(z = 2): {p2:.3f}")
        print(f"True type: {'guesser' if z_true == 0 else 'tryer'} (simulated z = {z_true}; JAGS codes guesser as 1, tryer as 2)")

# --- WIDGETS ---
z_true_slider = widgets.Dropdown(options=[("Guesser (z=0)", 0), ("Tryer (z=1)", 1)], value=0, description="True z:")
phi_slider = widgets.FloatSlider(value=0.9, min=0.5, max=0.99, step=0.01, description="Ability φ:")  # φ = 1 would give the Beta a zero shape parameter
n_slider = widgets.IntSlider(value=10, min=1, max=100, step=1, description="Trials (n):")
samples_slider = widgets.IntSlider(value=1000, min=500, max=5000, step=100, description="Samples:")

simulate_button = widgets.Button(description="Simulate 1 subject")
infer_button = widgets.Button(description="Run Inference")

simulate_button.on_click(lambda b: simulate_one_subject(
    z_true_slider.value, phi_slider.value, n_slider.value
))

infer_button.on_click(lambda b: infer_one_subject(samples_slider.value))

# --- DISPLAY UI ---
display(widgets.VBox([
    widgets.HTML("<b>1. Simulate a single subject:</b>"),
    z_true_slider,
    phi_slider,
    n_slider,
    simulate_button,
    sim_output,
    widgets.HTML("<b>2. Run inference for that subject:</b>"),
    samples_slider,
    infer_button,
    infer_output
]))



**Latent mixture models for model comparison.** In the above example we are inferring what type of person a subject is based on their performance. This is really just a type of model comparison. We are comparing two models: one where the subject is a guesser, and one where the subject is a tryer. The model comparison is done by computing the posterior probability of each model given the data. Here we make it more explicit that we are comparing two models. You can set the data (k and n), the fixed value of θ under the null model, and the Beta prior parameters of the alternative model.
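
With equal prior probability on the two models, the ratio of posterior model probabilities equals the Bayes factor, and for a point null against a Beta prior that Bayes factor has a closed form. The sketch below computes it in log space using `scipy.special.betaln`. The function name is invented for this example, and it assumes 0 < θ_null < 1.

```python
import numpy as np
from scipy.special import betaln

def log_bf_null_vs_alt(k, n, theta_null, alpha, beta):
    """Log Bayes factor for θ = theta_null versus θ ~ Beta(alpha, beta).

    The binomial coefficient appears in both marginal likelihoods and cancels.
    Assumes 0 < theta_null < 1.
    """
    log_m_null = k * np.log(theta_null) + (n - k) * np.log1p(-theta_null)
    log_m_alt = betaln(alpha + k, beta + n - k) - betaln(alpha, beta)
    return log_m_null - log_m_alt

bf = np.exp(log_bf_null_vs_alt(k=30, n=50, theta_null=0.5, alpha=1.0, beta=1.0))
print(f"BF null/alt = {bf:.3f}")
```

Values above 1 favour the null model and values below 1 favour the alternative, matching the direction of the ratio printed by the demo.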



In [2]:
import pyjags
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, clear_output
from scipy.stats import beta as beta_dist

# Output display
output = widgets.Output()

# --- RUN THE MODEL COMPARISON ---
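def run_model_comparison(k, n, theta_null, alpha, beta_param, samples):
    output.clear_output(wait=True)
    if k > n:
        # The k slider can exceed n, but successes cannot exceed trials
        with output:
            print("k must be ≤ n")
        return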

    model_code = """
    model {
      z ~ dcat(pi[1:2])
      pi[1] <- 0.5
      pi[2] <- 0.5

      theta[1] <- theta_null
      theta[2] ~ dbeta(alpha, beta)

      k ~ dbin(theta[z], n)
    }
    """

    data = {
        "k": k,
        "n": n,
        "theta_null": theta_null,
        "alpha": alpha,
        "beta": beta_param
    }

    model = pyjags.Model(code=model_code, data=data, chains=1, adapt=500)
    model.update(1000)
    samples_dict = model.sample(samples, vars=["z"])
    z_samples = samples_dict["z"].reshape(-1)

    # Posterior stats
    p_null = np.mean(z_samples == 1)
    p_alt = np.mean(z_samples == 2)
    bf = p_null / p_alt if p_alt > 0 else np.inf

    with output:
        # DATA summary
        print("## Data")
        print(f"Observed successes (k): {k}")
        print(f"Number of trials (n):   {n}")
        print()

        # PRIORS
        print("## Model Priors")
        print(f"Null model: θ = {theta_null:.2f}")
        print(f"Alternative model: θ ~ Beta({alpha}, {beta_param})")
        print()

        # Plot priors and posterior of z
        theta_vals = np.linspace(0, 1, 300)
        plt.figure(figsize=(12, 4))

        plt.subplot(1, 2, 1)
        plt.axvline(theta_null, color='blue', linestyle='--', label=f"θ_null = {theta_null:.2f}")
        plt.plot(theta_vals, beta_dist.pdf(theta_vals, alpha, beta_param),
                 label=f"Beta({alpha}, {beta_param})", color='red')
        plt.title("Model Priors for θ")
        plt.xlabel("θ")
        plt.ylabel("Density")
        plt.legend()
        plt.grid(True)

        plt.subplot(1, 2, 2)
        plt.hist(z_samples, bins=[0.5, 1.5, 2.5], align='mid', rwidth=0.6, color='gray')
        plt.xticks([1, 2], ["Null (θ fixed)", "Alt (θ ~ Beta)"])
        plt.title("Posterior of z")
        plt.ylabel("Frequency")
        plt.grid(True)

        plt.tight_layout()
        plt.show()

        # Posterior summaries
        print("## Posterior Model Probabilities")
        print(f"p(z = 1 | data)  = {p_null:.3f} (Model Null)")
        print(f"p(z = 2 | data)  = {p_alt:.3f} (Model Alt)")

        if np.isinf(bf):
            print("Bayes Factor BF_null/alt = ∞ (Model Alt never sampled)")
        else:
            print(f"Bayes Factor BF_null/alt = {bf:.3f}")

# --- WIDGETS ---
# Group 1: Data
k_slider = widgets.IntSlider(value=30, min=0, max=100, description="Successes k:")
n_slider = widgets.IntSlider(value=50, min=1, max=200, step=1, description="Trials n:")
data_box = widgets.VBox([k_slider, n_slider])

# Group 2: Null model
theta_null_slider = widgets.FloatSlider(value=0.5, min=0, max=1, step=0.01, description="θ_null:")
null_box = widgets.VBox([theta_null_slider])

# Group 3: Alternative model
alpha_slider = widgets.FloatSlider(value=1, min=0.1, max=10, step=0.1, description="Beta α:")
beta_slider = widgets.FloatSlider(value=1, min=0.1, max=10, step=0.1, description="Beta β:")
alt_box = widgets.VBox([alpha_slider, beta_slider])

# Group 4: Sampling
samples_slider = widgets.IntSlider(value=1000, min=500, max=5000, step=100, description="Samples:")
sample_box = widgets.VBox([samples_slider])

# Button
run_button = widgets.Button(description="Run Model Comparison")

def on_click(b):
    run_model_comparison(
        k_slider.value,
        n_slider.value,
        theta_null_slider.value,
        alpha_slider.value,
        beta_slider.value,
        samples_slider.value
    )

run_button.on_click(on_click)

# Display UI
display(widgets.VBox([
    widgets.HTML("<h3>Compare Null vs. Alternative Model</h3>"),
    widgets.HTML("<b>Data</b>"),
    data_box,
    widgets.HTML("<b>Null Model (θ = constant)</b>"),
    null_box,
    widgets.HTML("<b>Alternative Model (θ ~ Beta)</b>"),
    alt_box,
    widgets.HTML("<b>Sampling</b>"),
    sample_box,
    run_button,
    output
]))



**Compare octopuses via marginal likelihood.** This illustration shows how two different predictive models, one broad and vague (Alice) and one narrow and focused (Bob), assign different probability mass to a fixed data point (the submarine). The model that predicts the actual outcome more precisely (without wasting too much probability elsewhere) achieves a higher marginal likelihood. We approximate this using a grid and visualize the predictions alongside a log-scale ratio of their marginal likelihoods.
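
The same idea can be sketched in one dimension. Each model spreads a total probability of 1 over a grid of possible outcomes, and its marginal likelihood is whatever it placed on the cell where the data landed. The three predictions below are invented for illustration and are not Alice's or Bob's actual models.

```python
import numpy as np
from scipy.stats import norm

grid = np.linspace(0, 1, 101)
observed = 0.8
idx = np.abs(grid - observed).argmin()   # grid cell closest to the observation

def normalise(weights):
    """Scale weights so the prediction sums to 1 over the grid."""
    return weights / weights.sum()

vague = normalise(np.ones_like(grid))                  # belief spread evenly
focused_hit = normalise(norm.pdf(grid, 0.75, 0.05))    # narrow, near the data
focused_miss = normalise(norm.pdf(grid, 0.20, 0.05))   # narrow, far from the data

for name, pred in [("vague", vague),
                   ("focused, close", focused_hit),
                   ("focused, far", focused_miss)]:
    print(f"{name:>15}: probability on the observation = {pred[idx]:.5f}")
```

A narrow prediction is a gamble: it wins big when it is close to the data and loses badly when it is not.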

**Exercise.**  
- Try changing Bob’s prediction location or spread. When does he outperform Alice?  
- Try switching Alice’s hemisphere. When does she beat Bob?  
- What happens to the marginal likelihood ratio as Bob’s prediction gets more precise?  
- Meet Chris the squid. He's a fun guy but he's very vague. He thinks the submarine is "you know, probably just in the water somewhere." What would his marginal likelihood be? 


In [17]:
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import FloatSlider, RadioButtons, VBox, HBox, Label, interactive_output
from scipy.stats import multivariate_normal

# Grid setup
grid_res = 100
x = np.linspace(0, 1, grid_res)
y = np.linspace(0, 1, grid_res)
X, Y = np.meshgrid(x, y)
grid_points = np.dstack([X, Y])

# Submarine location
true_x, true_y = 0.65, 0.8
ix = (np.abs(x - true_x)).argmin()
iy = (np.abs(y - true_y)).argmin()

def compute_bob_marginal_likelihood(mu_x, mu_y, sigma):
    rv = multivariate_normal(mean=[mu_x, mu_y], cov=sigma**2 * np.eye(2))
    pred = rv.pdf(grid_points)
    pred /= pred.sum()
    return pred, pred[iy, ix]

def compute_alice_marginal_likelihood(hemisphere):
    mask = Y > 0.5 if hemisphere == 'North' else Y < 0.5
    pred = np.zeros_like(X)
    pred[mask] = 1.0
    pred /= pred.sum()
    return pred, pred[iy, ix]

def plot_predictions(mu_x=0.5, mu_y=0.75, sigma=0.1, hemisphere='North'):
    fig = plt.figure(figsize=(8, 8))
    ax1 = plt.subplot2grid((3, 2), (0, 0))
    ax2 = plt.subplot2grid((3, 2), (0, 1))
    ax3 = plt.subplot2grid((3, 2), (1, 0), colspan=2)

    alice_pred, alice_ml = compute_alice_marginal_likelihood(hemisphere)
    bob_pred, bob_ml = compute_bob_marginal_likelihood(mu_x, mu_y, sigma)

    # Alice's prediction
    ax1.imshow(alice_pred, extent=[0, 1, 0, 1], origin='lower', cmap='Reds', alpha=0.3)
    ax1.scatter(true_x, true_y, marker='x', color='black', s=100)
    ax1.set_title("Alice's prediction")
    ax1.text(0.02, 0.95, f"Marginal likelihood: {alice_ml:.5f}", transform=ax1.transAxes)
    ax1.set_xlabel("Longitude")
    ax1.set_ylabel("Latitude")
    ax1.set_aspect('equal')

    # Bob's prediction
    ax2.imshow(bob_pred, extent=[0, 1, 0, 1], origin='lower', cmap='Reds', alpha=0.6)
    ax2.scatter(true_x, true_y, marker='x', color='black', s=100)
    ax2.set_title("Bob's prediction")
    ax2.text(0.02, 0.95, f"Marginal likelihood: {bob_ml:.5f}", transform=ax2.transAxes)
    ax2.set_xlabel("Longitude")
    ax2.set_ylabel("Latitude")
    ax2.set_aspect('equal')

    # Ratio plot (vertical and narrow)
    ax3.set_title("Ratio of marginal likelihoods (Bob / Alice)")
    ax3.set_yscale('log')
    ax3.set_ylim(1/100, 100)
    ax3.set_xlim(0.495, 0.505)
    ax3.set_yticks([1/100, 1/30, 1/3, 1, 3, 10, 30, 100])
    ax3.set_yticklabels(["1/100", "1/30", "1/3", "1", "3", "10", "30", "100"])
    ax3.set_xticks([])
    ax3.set_ylabel("Bayes factor (log scale)")

    if alice_ml > 0:
        ratio = np.clip(bob_ml / alice_ml, 1/100, 100)
        ax3.plot([0.5], [ratio], 'ro', markersize=12)
    else:
        ax3.text(0.5, 1, "undefined", ha='center', va='top')

    # Arrows near left margin
    ax3.annotate("Bob's model better",
                 xy=(0.496, 100), xytext=(0.496, 30),
                 arrowprops=dict(facecolor='black', arrowstyle='->'),
                 ha='left', va='top', fontsize=10)

    ax3.annotate("Alice's model better",
                 xy=(0.496, 1/100), xytext=(0.496, 1/30),
                 arrowprops=dict(facecolor='black', arrowstyle='->'),
                 ha='left', va='bottom', fontsize=10)

    plt.tight_layout()
    plt.show()

# Widgets
alice_controls = VBox([
    Label("Alice’s Predictions"),
    RadioButtons(options=["North", "South"], value="North", description="Hemisphere:")
])
bob_controls = VBox([
    Label("Bob’s Predictions"),
    FloatSlider(value=0.67, min=0, max=1, step=0.01, description="X (lon)"),
    FloatSlider(value=0.75, min=0, max=1, step=0.01, description="Y (lat)"),
    FloatSlider(value=0.1, min=0.01, max=0.3, step=0.01, description="Sigma")
])

output = interactive_output(
    plot_predictions,
    {
        "mu_x": bob_controls.children[1],
        "mu_y": bob_controls.children[2],
        "sigma": bob_controls.children[3],
        "hemisphere": alice_controls.children[1],
    }
)

display(VBox([HBox([alice_controls, bob_controls]), output]))



**Bayes factor scale interpretation.** The Bayes factor measures the strength of evidence in favor of one model over another. It is defined as the ratio of the marginal likelihoods of two models, $BF_{12} = p(D \mid M_1) / p(D \mid M_2)$, so it compares how well each model predicted the same data.
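
As a small sketch of that definition, the snippet below computes both directions of the Bayes factor for a few pairs of marginal likelihoods. The helper name `bayes_factors` and the example numbers are hypothetical, chosen only to illustrate the ratio.

```python
import numpy as np

def bayes_factors(m1_likelihood, m2_likelihood):
    """Return (BF_12, BF_21) for two marginal likelihoods."""
    bf12 = m1_likelihood / m2_likelihood
    return bf12, 1.0 / bf12

# Illustrative values: equal models, two good models, two poor models
for m1, m2 in [(0.5, 0.5), (0.8, 0.4), (0.002, 0.001)]:
    bf12, bf21 = bayes_factors(m1, m2)
    print(f"p(D|M1)={m1}, p(D|M2)={m2}: "
          f"BF_12={bf12:.2f} (log10 {np.log10(bf12):+.2f}), "
          f"BF_21={bf21:.2f} (log10 {np.log10(bf21):+.2f})")
```

Two things are worth noticing. Swapping the models inverts the ratio, which on a log scale simply flips the sign. And the Bayes factor depends only on the ratio, so the pair (0.8, 0.4) and the pair (0.002, 0.001) give the same value even though the second pair of models predicted the data far worse.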

**Exercise.** Play around with the sliders to see how the Bayes factor changes as you adjust the marginal likelihoods of the two models.
- What happens when you switch from $BF_{12}$ to $BF_{21}$?
- What happens to the Bayes factor when both models have the same marginal likelihood?
- What happens if both models are really good?
- Is it possible to have an extreme Bayes factor when both models are bad? What does this mean?

In [9]:
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, HTML

def plot_bayes_factor_vertical(m1_likelihood, m2_likelihood):
    bf12 = m1_likelihood / m2_likelihood
    bf21 = m2_likelihood / m1_likelihood

    fig, ax = plt.subplots(figsize=(7.5, 6))  # Wider to make room for left-aligned labels

    # Define Jeffreys scale positions and labels
    y_vals = np.log10([1/1000, 1/100, 1/30, 1/10, 1/3, 1, 3, 10, 30, 100, 1000])
    yticks = np.log10([1/100, 1/30, 1/10, 1/3, 1, 3, 10, 30, 100])
    yticklabels = ['1/100', '1/30', '1/10', '1/3', '1', '3', '10', '30', '100']
    # One label per band between consecutive y_vals (BF = 1 is a boundary, not a band)
    descriptions = [
        "Extreme evidence for $M_2$", "Very strong evidence for $M_2$",
        "Strong evidence for $M_2$", "Moderate evidence for $M_2$",
        "Anecdotal evidence for $M_2$",
        "Anecdotal evidence for $M_1$", "Moderate evidence for $M_1$",
        "Strong evidence for $M_1$", "Very strong evidence for $M_1$",
        "Extreme evidence for $M_1$"
    ]

    # Shading bands
    for i in range(len(y_vals) - 1):
        ax.axhspan(y_vals[i], y_vals[i+1], color='lightgray', alpha=0.3)

    # Plot Bayes Factors
    ax.plot(0, np.log10(bf12), 'bo', label=f'$BF_{{12}}$ = {bf12:.2f}')
    ax.plot(0, np.log10(bf21), 'ro', label=f'$BF_{{21}}$ = {bf21:.2f}')

    # Plot formatting
    ax.set_ylim(np.log10(1/1000), np.log10(1000))
    ax.set_xlim(-0.5, 1)
    ax.set_xticks([])
    ax.set_yticks(yticks)
    ax.set_yticklabels(yticklabels)
    ax.set_title("Bayes Factors on Jeffreys Scale")
    ax.legend(loc='upper right')

    # Label each band just inside the left edge of the axes (data coordinates)
    for i in range(len(y_vals) - 1):
        ypos = (y_vals[i] + y_vals[i+1]) / 2
        ax.text(-0.48, ypos, descriptions[i], ha='left', va='center', fontsize=8)

    plt.tight_layout()
    plt.show()

    # Colour-coded summary; the trailing \033[0m resets the terminal colour
    print(f"\033[34mp(D|M₁): {m1_likelihood:.3f}    \033[31mp(D|M₂): {m2_likelihood:.3f}\033[0m")

# Color-coded slider labels
m1_label = widgets.HTML('<span style="color:blue">p(D|M₁):</span>')
m2_label = widgets.HTML('<span style="color:red">p(D|M₂):</span>')
m1_slider = widgets.FloatSlider(value=0.5, min=0.001, max=0.999, step=0.001)
m2_slider = widgets.FloatSlider(value=0.5, min=0.001, max=0.999, step=0.001)

ui = widgets.VBox([
    widgets.HBox([m1_label, m1_slider]),
    widgets.HBox([m2_label, m2_slider])
])

out = widgets.interactive_output(plot_bayes_factor_vertical, {
    "m1_likelihood": m1_slider,
    "m2_likelihood": m2_slider
})

display(ui, out)



**Bayes factor calculation step by step.** This code walks through the calculation of a Bayes factor one step at a time. You can adjust the data (the number of trials $n$ and the number of successes $k$) and the fixed value of $\theta$ assumed by the guessing model, and see how this affects the Bayes factor. The code also shows the marginal likelihood for each model and how the two are combined to give the Bayes factor.
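
The cell below uses the shortcut $p(D \mid M_1) = 1/(n+1)$ for the uniform-prior model. As a rough check on where that comes from, this sketch averages the binomial likelihood over $\theta \sim \mathrm{Uniform}(0,1)$ by numerical integration. The helper name `marginal_uniform` is hypothetical, and the values of `k`, `n` and `theta_fixed` match the cell's defaults.

```python
from scipy.integrate import quad
from scipy.stats import binom

def marginal_uniform(k, n):
    """p(D | M1): binomial likelihood averaged over a Uniform(0, 1) prior on theta."""
    value, _ = quad(lambda theta: binom.pmf(k, n, theta), 0, 1)
    return value

k, n, theta_fixed = 9, 10, 0.5

marginal_M1 = marginal_uniform(k, n)         # numerical integral over theta
marginal_M2 = binom.pmf(k, n, theta_fixed)   # theta is fixed, so nothing to integrate
BF_12 = marginal_M1 / marginal_M2

print(f"Integrated p(D|M1) = {marginal_M1:.4f}, shortcut 1/(n+1) = {1 / (n + 1):.4f}")
print(f"p(D|M2) = {marginal_M2:.4f}, BF_12 = {BF_12:.2f}")
```

Under the uniform prior the marginal likelihood does not depend on $k$ at all: every possible number of successes from $0$ to $n$ is predicted equally well, which is why the value is $1/(n+1)$.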

**Exercise.** Play around with the sliders to see how the Bayes factor changes as you adjust the data. With $\theta = 0.5$ for the guessing model, can you choose data so that the guessing model wins?

- Adjust the data so that the Bayes factor is in favour of $M_1$.

- How many different ways can you do this?

In [18]:
from scipy.stats import beta, binom
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, Markdown

# --- Core computation and display logic ---
def compute_and_display(k=9, n=10, theta_fixed=0.5):
    if k > n:
        display(Markdown("**⚠️ Error:** $k$ (successes) cannot be greater than $n$ (trials)."))
        return

    # Marginal likelihoods
    marginal_M1 = 1 / (n + 1)
    marginal_M2 = binom.pmf(k, n, theta_fixed)
    BF_12 = marginal_M1 / marginal_M2 if marginal_M2 > 0 else np.inf

    # --- Data section ---
    display(Markdown(f"""
## Data  
- Number of trials: $n = {n}$  
- Number of successes: $k = {k}$  
"""))

    # --- Models section ---
    display(Markdown(f"""
## Models  
- **$M_1$**: Unknown ability → $\\theta \\sim \\mathrm{{Uniform}}(0,1)$  
- **$M_2$**: Guessing → $\\theta = {theta_fixed:.2f}$
"""))

    # --- Priors section ---
    theta = np.linspace(0, 1, 500)
    prior_M1 = np.ones_like(theta)
    fig, ax = plt.subplots(figsize=(6, 2.5))
    ax.plot(theta, prior_M1, label=r"$M_1$: $\theta \sim \mathrm{Uniform}(0,1)$", color='blue')
    ax.axvline(theta_fixed, color='red', linestyle='--', linewidth=2, label=rf"$M_2$: $\theta = {theta_fixed:.2f}$")
    ax.set_title("Priors")
    ax.set_ylabel("Density")
    ax.set_xlabel(r"$\theta$")
    ax.set_xlim(0, 1)
    ax.set_yticks([])
    ax.legend()
    plt.tight_layout()
    plt.show()

    # --- Marginal Likelihoods section ---
    display(Markdown(f"""
## Marginal Likelihoods  

- $p(D \\mid M_1) = \\frac{{1}}{{n+1}} = \\frac{{1}}{{{n}+1}} = {marginal_M1:.4f}$  

- $p(D \\mid M_2) = \\binom{{{n}}}{{{k}}} ({theta_fixed:.2f})^{{{k}}} (1 - {theta_fixed:.2f})^{{{n-k}}} = {marginal_M2:.4f}$  
"""))

    # --- Posterior section ---
    display(Markdown("## Posterior"))
    posterior_M1 = beta.pdf(theta, k + 1, n - k + 1)
    fig, ax = plt.subplots(figsize=(6, 2.5))
    ax.plot(theta, posterior_M1, label=fr"$M_1$: Beta($\alpha$={k+1}, $\beta$={n-k+1})", color='blue')
    ax.axvline(theta_fixed, color='red', linestyle='--', linewidth=2, label=rf"$M_2$: $\theta = {theta_fixed:.2f}$")
    # Raw string so that \theta is not read as a tab followed by "heta"
    ax.set_title(r"Posterior for $M_1$ and fixed $\theta$ under $M_2$")
    ax.set_ylabel("Density")
    ax.set_xlabel(r"$\theta$")
    ax.set_xlim(0, 1)
    ax.legend()
    plt.tight_layout()
    plt.show()

    # --- Bayes Factor section ---
    display(Markdown(f"""
## Bayes Factor  

- $BF_{{12}} = \\frac{{p(D \\mid M_1)}}{{p(D \\mid M_2)}} = \\frac{{{marginal_M1:.4f}}}{{{marginal_M2:.4f}}} = {BF_12:.2f}$  

**Interpretation:**  
The data is about {BF_12:.1f}× more likely under $M_1$ than under $M_2$.
"""))

# --- Interactive widget setup ---
def interactive_bf(n, k, theta_fixed):
    compute_and_display(k=min(k, n), n=n, theta_fixed=theta_fixed)

n_slider = widgets.IntSlider(value=10, min=1, max=50, description='Trials (n):')
k_slider = widgets.IntSlider(value=9, min=0, max=50, description='Successes (k):')
theta_slider = widgets.FloatSlider(value=0.5, min=0.01, max=0.99, step=0.01, description='θ (M₂):')

def update_k_slider_range(*args):
    k_slider.max = n_slider.value

n_slider.observe(update_k_slider_range, names='value')

# Layout: Sliders grouped by section
data_section = widgets.VBox([
    widgets.HTML("<h4>Data</h4>"),
    n_slider,
    k_slider
])

model_section = widgets.VBox([
    widgets.HTML("<h4>Models</h4>"),
    theta_slider
])

ui = widgets.VBox([
    data_section,
    model_section
])

out = widgets.interactive_output(interactive_bf, {
    'n': n_slider,
    'k': k_slider,
    'theta_fixed': theta_slider
})

display(ui, out)



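**Posterior odds calculation.** This code calculates the posterior odds for two models from the prior odds and the Bayes factor. The posterior odds are the odds in favour of one model over the other after seeing the data. By Bayes' theorem they equal the prior odds multiplied by the Bayes factor: $\frac{p(M_1 \mid D)}{p(M_2 \mid D)} = \frac{p(M_1)}{p(M_2)} \times \frac{p(D \mid M_1)}{p(D \mid M_2)}$.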
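
The relationship posterior odds = prior odds × Bayes factor can be sketched in a few lines. Note the assumptions: the cell below integrates on a grid of $\theta$ values, whereas this sketch swaps in the closed-form Beta-Binomial marginal likelihood from `scipy.stats.betabinom`. The helper name `posterior_model_probs` is hypothetical, and the data and priors match the cell's slider defaults ($k=5$, $n=10$, Beta(1, 1) versus Beta(2, 2)).

```python
from scipy.stats import betabinom

def posterior_model_probs(prior_odds, bayes_factor):
    """Turn prior odds (M1:M2) and BF_12 into posterior odds and model probabilities."""
    posterior_odds = prior_odds * bayes_factor
    p_m1 = posterior_odds / (1 + posterior_odds)
    return posterior_odds, p_m1, 1 - p_m1

k, n = 5, 10

# Marginal likelihood of k successes in n trials under a Beta(a, b) prior on theta
marginal1 = betabinom.pmf(k, n, 1, 1)   # M1: theta ~ Beta(1, 1), i.e. uniform
marginal2 = betabinom.pmf(k, n, 2, 2)   # M2: theta ~ Beta(2, 2), peaked at 0.5
bf12 = marginal1 / marginal2

for prior_odds in (1.0, 3.0):
    odds, p_m1, p_m2 = posterior_model_probs(prior_odds, bf12)
    print(f"prior odds {prior_odds:.1f} x BF_12 {bf12:.3f} = posterior odds {odds:.3f} "
          f"-> p(M1|D) = {p_m1:.3f}, p(M2|D) = {p_m2:.3f}")
```

With these data the Bayes factor is below 1, so the evidence mildly favours $M_2$. Starting from prior odds above 1 can still leave $M_1$ ahead, because the prior odds and the Bayes factor are simply multiplied.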

In [13]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta, binom
import ipywidgets as widgets
from IPython.display import display, HTML

# --- Widgets ---
k_slider = widgets.IntSlider(value=5, min=0, max=20, step=1, description='Successes (k):')
n_slider = widgets.IntSlider(value=10, min=1, max=20, step=1, description='Trials (n):')

alpha1_slider = widgets.FloatSlider(value=1, min=0.1, max=10, step=0.1, description='α₁ (M1):')
beta1_slider = widgets.FloatSlider(value=1, min=0.1, max=10, step=0.1, description='β₁ (M1):')

alpha2_slider = widgets.FloatSlider(value=2, min=0.1, max=10, step=0.1, description='α₂ (M2):')
beta2_slider = widgets.FloatSlider(value=2, min=0.1, max=10, step=0.1, description='β₂ (M2):')

prior_odds_slider = widgets.FloatSlider(value=1, min=0.1, max=10, step=0.1, description='Prior Odds (M1:M2)')
show_sampling_toggle = widgets.ToggleButton(value=False, description='Show sampling data', icon='flask')
samples_slider = widgets.IntSlider(value=1000, min=100, max=5000, step=100, description='Samples:')

# --- Core logic ---
def update(k, n, alpha1, beta1, alpha2, beta2, prior_odds, show_sampling):
    if k > n:
        k = n
        k_slider.value = n

    theta = np.linspace(0, 1, 500)
    likelihood = binom.pmf(k, n, theta)

    prior1 = beta.pdf(theta, alpha1, beta1)
    prior2 = beta.pdf(theta, alpha2, beta2)

    unnorm_post1 = likelihood * prior1
    unnorm_post2 = likelihood * prior2

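    # NumPy 2.0 deprecates np.trapz in favour of np.trapezoid; use whichever is available
    trapezoid = getattr(np, "trapezoid", None) or np.trapz

    # The normalising constant of each posterior is that model's marginal likelihood
    marginal1 = trapezoid(unnorm_post1, theta)
    marginal2 = trapezoid(unnorm_post2, theta)

    post1 = unnorm_post1 / marginal1
    post2 = unnorm_post2 / marginal2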

    bayes_factor = marginal1 / marginal2 if marginal2 > 0 else np.inf
    posterior_odds = prior_odds * bayes_factor

    p1 = prior_odds / (1 + prior_odds)
    p2 = 1 / (1 + prior_odds)
    post1_prob = posterior_odds / (1 + posterior_odds)
    post2_prob = 1 / (1 + posterior_odds)

    # --- Plotting ---
    fig, axs = plt.subplots(3, 2, figsize=(12, 10), sharex=True)
    fig.suptitle(f"Bayesian Model Comparison (k={k}, n={n})", fontsize=16)

    axs[0, 0].plot(theta, prior1, label='Prior', color='blue')
    axs[0, 1].plot(theta, prior2, label='Prior', color='red')
    axs[1, 0].plot(theta, likelihood, label='Likelihood', color='black')
    axs[1, 1].plot(theta, likelihood, label='Likelihood', color='black')
    axs[2, 0].plot(theta, post1, label='Posterior', color='blue')
    axs[2, 1].plot(theta, post2, label='Posterior', color='red')

    if show_sampling:
        samples = samples_slider.value
        sampled_prior1 = np.random.beta(alpha1, beta1, samples)
        sampled_prior2 = np.random.beta(alpha2, beta2, samples)
        sampled_post1 = np.random.beta(alpha1 + k, beta1 + n - k, samples)
        sampled_post2 = np.random.beta(alpha2 + k, beta2 + n - k, samples)

        axs[0, 0].hist(sampled_prior1, bins=50, density=True, alpha=0.3, color='blue')
        axs[0, 1].hist(sampled_prior2, bins=50, density=True, alpha=0.3, color='red')
        axs[2, 0].hist(sampled_post1, bins=50, density=True, alpha=0.3, color='blue')
        axs[2, 1].hist(sampled_post2, bins=50, density=True, alpha=0.3, color='red')

    for ax_row in axs:
        for ax in ax_row:
            ax.set_xlim(0, 1)
            ax.set_ylim(bottom=0)
            ax.grid(True)

    axs[0, 0].set_title("Model 1: Prior")
    axs[0, 1].set_title("Model 2: Prior")
    axs[1, 0].set_title("Model 1: Likelihood")
    axs[1, 1].set_title("Model 2: Likelihood")
    axs[2, 0].set_title("Model 1: Posterior")
    axs[2, 1].set_title("Model 2: Posterior")

    plt.tight_layout(rect=[0, 0.05, 1, 0.95])
    plt.show()

    # Posterior model probabilities pie chart
    fig, ax = plt.subplots(figsize=(5, 3))
    ax.pie([post1_prob, post2_prob], labels=['Model 1', 'Model 2'], autopct='%.1f%%',
           colors=['blue', 'red'], startangle=90)
    ax.set_title("Posterior Model Probabilities")
    plt.subplots_adjust(top=0.8, bottom=0.2)
    plt.show()

    # Equation table using plain text
    html_table = f"""
    <h4>Posterior Odds Calculation</h4>
    <table style="width:100%; text-align:center; border-collapse: collapse;">
      <tr style="border-bottom:1px solid #ccc;">
        <th></th>
        <th>Prior Odds</th>
        <th>Bayes Factor</th>
        <th>Posterior Odds</th>
      </tr>
      <tr>
        <td><b>Equation</b></td>
        <td>p(M1)/p(M2)</td>
        <td>p(D|M1)/p(D|M2)</td>
        <td>p(M1|D)/p(M2|D)</td>
      </tr>
      <tr>
        <td><b>Substitution</b></td>
        <td>{p1:.2f} / {p2:.2f}</td>
        <td>{marginal1:.4f} / {marginal2:.4f}</td>
        <td>{post1_prob:.2f} / {post2_prob:.2f}</td>
      </tr>
      <tr style="border-top:1px solid #ccc;">
        <td><b>Value</b></td>
        <td>{prior_odds:.2f}</td>
        <td>{bayes_factor:.2f}</td>
        <td>{posterior_odds:.2f}</td>
      </tr>
    </table>
    """
    display(HTML(html_table))

# --- UI setup ---
ui = widgets.VBox([
    widgets.HTML("<h3>Data:</h3>"),
    widgets.HBox([k_slider, n_slider]),
    widgets.HTML("<h3>Model 1 Prior (θ ~ Beta):</h3>"),
    widgets.HBox([alpha1_slider, beta1_slider]),
    widgets.HTML("<h3>Model 2 Prior (θ ~ Beta):</h3>"),
    widgets.HBox([alpha2_slider, beta2_slider]),
    widgets.HTML("<h3>Model Comparison:</h3>"),
    prior_odds_slider,
    show_sampling_toggle,
    samples_slider
])

out = widgets.interactive_output(update, {
    "k": k_slider,
    "n": n_slider,
    "alpha1": alpha1_slider,
    "beta1": beta1_slider,
    "alpha2": alpha2_slider,
    "beta2": beta2_slider,
    "prior_odds": prior_odds_slider,
    "show_sampling": show_sampling_toggle
})

display(ui, out)

