# Bayesian Statistics Lab

In this hands-on lab, we continue our demonstration of "estimating the mass of a new fundamental particle".
We will generate multiple experiments, each giving a noisy measurement of the particle's mass, and sequentially update our posterior distribution after each experiment.
We will then discuss what to do when a new theoretical prior appears.

## Physical Setup (Brief Recap)

Let's update some notation from the notes.

We have a particle of **true mass** $m_\text{true}$, measured in TeV.
Each experiment yields an observed mass $m_\text{obs}$ with Gaussian noise:
\begin{align}
m_\text{obs} \;\sim\; \mathcal{N}(m_\text{true},\sigma_\text{expr}^2).
\end{align}
Here, $\sigma_\text{expr}$ is the detector resolution or statistical uncertainty.

We know that $m_\text{true}$ lies in some range $[2,5]$ TeV---our *initial theory* suggests it cannot be outside this window.
Hence, our **initial prior**:
\begin{align}
p(m_\text{true}) =
\begin{cases}
\frac{1}{5-2}, & 2 \le m_\text{true} \le 5,\\
0, & \text{otherwise}.
\end{cases}
\end{align}

Each measurement modifies our belief (the prior) into a **posterior** via Bayes' Theorem:
\begin{align}
p(m_\text{true} \mid m_\text{obs}) \propto p(m_\text{obs} \mid m_\text{true})\,p(m_\text{true}).
\end{align}

Here, the **likelihood** $p(m_\text{obs} \mid m_\text{true})$ is given by the Gaussian formula:
\begin{align}
p(m_\text{obs} \mid m_\text{true}) =
\frac{1}{\sqrt{2\pi}\sigma_\text{expr}}\exp\left[-\frac{(m_\text{obs} - m_\text{true})^2}{2\sigma_\text{expr}^2}\right].
\end{align}
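
As a short worked step, assuming the single-measurement setup above, combining the flat prior with this Gaussian likelihood gives a posterior that is a Gaussian centered on $m_\text{obs}$, truncated to $[2,5]$ and renormalized. The dummy variable $m'$ is introduced here only for the normalization integral:
\begin{align}
p(m_\text{true} \mid m_\text{obs}) =
\begin{cases}
\dfrac{\exp\left[-\frac{(m_\text{obs} - m_\text{true})^2}{2\sigma_\text{expr}^2}\right]}{\int_2^5 \exp\left[-\frac{(m_\text{obs} - m')^2}{2\sigma_\text{expr}^2}\right]dm'}, & 2 \le m_\text{true} \le 5,\\
0, & \text{otherwise}.
\end{cases}
\end{align}
The constant factors $\frac{1}{5-2}$ and $\frac{1}{\sqrt{2\pi}\sigma_\text{expr}}$ cancel between numerator and denominator, which is why only the shape of the likelihood inside the window matters.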

## Single Experiment Code (Grid Approximation)

Below is a quick code snippet that:
1. Defines a **grid** over $m\in[2,5]$.
2. Multiplies the prior by the Gaussian likelihood for an observed mass $m_\text{obs}$.
3. Normalizes the result to get the posterior.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

In [None]:
# Suppose this is the true mass
m_true = 3.6

In [None]:
# Let's also pick a detector resolution
sigma_expr = 1.0

In [None]:
# For reproducibility
np.random.seed(42)

In [None]:
# Suppose we have a single measurement
m_obs = np.random.normal(loc=m_true, scale=sigma_expr)
print(m_obs)

In [None]:
m_min, m_max = 2.0, 5.0
n_grid = 2000
ms     = np.linspace(m_min, m_max, n_grid)

def initial_prior(ms):
    # Uniform in [2,5], zero outside
    return np.where((ms >= m_min) & (ms <= m_max), 1.0/(m_max-m_min), 0.0)

In [None]:
def likelihood(m_obs, ms, sigma_expr):
    # Gaussian formula
    norm = 1.0 / (np.sqrt(2*np.pi)*sigma_expr)
    return norm * np.exp(-0.5*((m_obs - ms)/sigma_expr)**2)

In [None]:
# Compute prior
prior = initial_prior(ms)

# Compute likelihood
like = likelihood(m_obs, ms, sigma_expr)

In [None]:
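# Posterior ~ prior * likelihood (unnormalized)
unnorm_post = prior * like
# Normalize so the posterior integrates to 1 over the grid.
# Note: np.trapezoid requires NumPy >= 2.0; on older versions use np.trapz.
post = unnorm_post / np.trapezoid(unnorm_post, ms)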

In [None]:
plt.plot(ms, post, 'k', label="Posterior (1 experiment)")
plt.title("Posterior after a single measurement")
plt.xlabel("Mass (TeV)")
plt.ylabel("Probability Density")
plt.legend()

## Multiple Experiments

Now we simulate $N$ experiments. Each experiment $i$ provides a pair $(m_{\text{obs},i}, \sigma_i)$; here all experiments share the same resolution, $\sigma_i = \sigma_\text{expr}$.
We update our posterior step by step:
1. Start with the prior (initially uniform in $[2,5]$).
2. For each experiment $i$, multiply the current posterior by the new likelihood.
3. Normalize to get the updated posterior, which becomes the prior for the next experiment.
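
One property of this recipe is worth checking numerically: updating one experiment at a time should give the same posterior as multiplying all likelihoods together and normalizing once. The block below is a minimal stand-alone sketch of that check, not the lab solution. The names `gaussian_likelihood` and `normalize`, the seed, the five simulated observations, and the Riemann-sum normalization are all assumptions made for this illustration.

```python
import numpy as np

# Hypothetical stand-alone values, chosen only for this illustration
m_min, m_max = 2.0, 5.0
sigma = 1.0
grid = np.linspace(m_min, m_max, 2001)
dm = grid[1] - grid[0]

rng = np.random.default_rng(0)
observations = rng.normal(loc=3.6, scale=sigma, size=5)

def gaussian_likelihood(m_obs, grid, sigma):
    # Gaussian likelihood evaluated at every candidate mass on the grid
    return np.exp(-0.5 * ((m_obs - grid) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

def normalize(density, dm):
    # Simple Riemann-sum normalization on a uniform grid
    return density / (density.sum() * dm)

# Sequential: the posterior of one step is the prior of the next
posterior_seq = normalize(np.ones_like(grid), dm)  # flat prior on the grid
for m_obs in observations:
    posterior_seq = normalize(posterior_seq * gaussian_likelihood(m_obs, grid, sigma), dm)

# Batch: multiply all likelihoods first, normalize once at the end
joint_like = np.ones_like(grid)
for m_obs in observations:
    joint_like *= gaussian_likelihood(m_obs, grid, sigma)
posterior_batch = normalize(joint_like, dm)

# The two results should agree up to floating-point rounding
print(np.max(np.abs(posterior_seq - posterior_batch)))
```

The agreement follows because each intermediate normalization only rescales the curve by a constant, and the final normalization removes any overall constant.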

In [None]:
# For reproducibility
np.random.seed(42)

In [None]:
# Let's simulate multiple experiments
N = 10
ms_obs = pass # HANDSON: draw N samples from a normal distribution.  You may use np.random.normal().

print("Simulated experiment results:")
for i, m_obs in enumerate(ms_obs):
    print(f"\tExperiment {i+1}: observed mass = {m_obs:.3f} ± {sigma_expr}")

In [None]:
# Perform sequential Bayesian updates using the same grid approach

prior = pass  # HANDSON: compute the prior from initial_prior()

plt.figure(figsize=(8,5))
plt.axvline(m_true, color='k', ls=':', label=r'$m_\text{true}$')
plt.plot(ms, prior, label="Initial Prior", lw=2)

for i, m_obs in enumerate(ms_obs):
    # HANDSON: compute the posterior
    like        = pass 
    unnorm_post = pass
    norm        = pass
    post        = pass

    # Plot the posterior
    plt.plot(ms, post, label=f"Posterior after Exp {i+1}")

    # HANDSON: posterior becomes prior for next iteration
    prior = pass

plt.title("Sequential Bayesian Updates of Particle Mass")
plt.xlabel("Mass (TeV)")
plt.ylabel("Probability Density")
plt.legend()

You will see each new experiment narrowing or shifting the distribution.
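
To put numbers on "narrowing" and "shifting", one option is to track the posterior mean and standard deviation. When the posterior sits well inside $[2,5]$, a flat prior with $N$ Gaussian measurements of equal resolution gives a width close to $\sigma_\text{expr}/\sqrt{N}$. The sketch below is an illustration under stated assumptions: `posterior_summary` is a hypothetical helper, the seed and grid size are arbitrary, and the log-likelihoods are summed in log space instead of using the multiply-and-normalize loop above.

```python
import numpy as np

def posterior_summary(grid, density):
    """Return (mean, std) of a density tabulated on a uniform grid."""
    dm = grid[1] - grid[0]
    weights = density * dm
    weights = weights / weights.sum()  # normalize to unit total probability
    mean = np.sum(grid * weights)
    std = np.sqrt(np.sum((grid - mean) ** 2 * weights))
    return mean, std

# Hypothetical stand-alone setup mirroring the lab's values
grid = np.linspace(2.0, 5.0, 3001)
sigma = 1.0
rng = np.random.default_rng(1)
observations = rng.normal(loc=3.6, scale=sigma, size=100)

# Flat prior on the grid: log-prior is constant, so start from zeros
log_post = np.zeros_like(grid)
for n, m_obs in enumerate(observations, start=1):
    # Add the Gaussian log-likelihood (constant terms dropped)
    log_post += -0.5 * ((m_obs - grid) / sigma) ** 2
    if n in (1, 10, 100):
        # Subtract the maximum before exponentiating to avoid underflow
        density = np.exp(log_post - log_post.max())
        mean, std = posterior_summary(grid, density)
        print(f"N={n:3d}: mean={mean:.3f}, std={std:.3f}, "
              f"sigma/sqrt(N)={sigma / np.sqrt(n):.3f}")
```

For small $N$ the truncation at the edges of the window makes the width smaller than $\sigma_\text{expr}/\sqrt{N}$; the two should approach each other as the posterior concentrates away from the boundaries.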

In [None]:
# HANDSON:
# Try increasing `N` to, e.g., 100 and plot the posterior every 10 experiments.
# What do you see?

In [None]:
# HANDSON:
# Try changing `m_true` to a value outside the range allowed by the theory, e.g., 5.5, and plot the posterior.
# What do you see?