# Lecture 2: A First Bayesian Statistical Model

This notebook accompanies **Lecture 2** of MT2002: Statistical Modeling.

**Objectives:**
- Introduce the concept of a statistical model
- Define parameters and observed data
- Build and sample a simple Bayesian model using PyMC

In [None]:
import pymc as pm
import arviz as az
import numpy as np

## What Is a Statistical Model?

A statistical model is a simplified representation of a data-generating process.

In this lecture:
- We assume data are generated by a simple probabilistic process
- We express uncertainty about unknown quantities using probability distributions
- We use PyMC to formalize these assumptions
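
To make the idea of a data-generating process concrete, the sketch below simulates binary outcomes from a known success probability. The names `true_p`, `n_obs`, and `simulated_data`, and the value 0.6, are assumptions chosen only for illustration; in the lecture's setting the probability is unknown and has to be inferred from the data.

```python
import numpy as np

# Hypothetical values for illustration only; in practice p is unknown.
true_p = 0.6
n_obs = 7

rng = np.random.default_rng(seed=0)

# Each observation is an independent draw: 1 (success) with probability true_p, else 0.
simulated_data = rng.binomial(n=1, p=true_p, size=n_obs)

print(simulated_data)
print("Observed success rate:", simulated_data.mean())
```

With so few observations, the observed success rate can differ noticeably from `true_p`, which is one reason to describe uncertainty about the probability with a distribution.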

In [None]:
# Observed binary outcomes (e.g., success/failure)
observed_data = np.array([1, 0, 1, 1, 0, 1, 0])

## Model Specification

We assume:
- Each observation is a success with unknown probability `p`
- Observations are conditionally independent given `p`
- `p` has a Beta(1, 1) prior, which is uniform on the interval [0, 1]

This model is intentionally simple and serves only as an introduction.
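
Written out in symbols, these assumptions might be summarized as follows. The notation ($y_i$ for the $i$-th observation, $n = 7$ for the number of observations) is introduced here for illustration. The last line uses the standard Beta-Bernoulli conjugacy result, which this lecture does not derive.

$$
\begin{aligned}
p &\sim \mathrm{Beta}(1, 1) \\
y_i \mid p &\sim \mathrm{Bernoulli}(p), \qquad i = 1, \dots, n \\
p \mid y_1, \dots, y_n &\sim \mathrm{Beta}\Big(1 + \sum_{i=1}^{n} y_i,\; 1 + n - \sum_{i=1}^{n} y_i\Big) = \mathrm{Beta}(5, 4)
\end{aligned}
$$

The final equality uses the observed data, which contain 4 successes and 3 failures.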

In [None]:
with pm.Model() as model:
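    # Prior: Beta(1, 1) is uniform on [0, 1], so every value of p is equally plausible a priori
    p = pm.Beta("p", alpha=1, beta=1)

    # Likelihood: each observed outcome is a Bernoulli draw with success probability p
    y = pm.Bernoulli("y", p=p, observed=observed_data)

    # Posterior sampling: 1000 draws per chain, after 1000 tuning steps per chain
    trace = pm.sample(1000, tune=1000, progressbar=True)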

In [None]:
az.summary(trace, var_names=["p"])
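
As a sanity check on the sampled summary, this model happens to have a closed-form posterior: a Beta(1, 1) prior with Bernoulli data gives a Beta posterior whose parameters are the prior parameters plus the success and failure counts. The sketch below computes its mean and standard deviation with the standard library only. The variable names are illustrative and not part of the lecture code.

```python
import math

observed = [1, 0, 1, 1, 0, 1, 0]  # same outcomes as `observed_data`
successes = sum(observed)
failures = len(observed) - successes

# Beta(1, 1) prior updated by the counts (standard Beta-Bernoulli conjugacy)
alpha_post = 1 + successes
beta_post = 1 + failures

# Mean and standard deviation of a Beta(alpha, beta) distribution
total = alpha_post + beta_post
post_mean = alpha_post / total
post_sd = math.sqrt(alpha_post * beta_post / (total**2 * (total + 1)))

print(f"Posterior: Beta({alpha_post}, {beta_post})")
print(f"mean = {post_mean:.3f}, sd = {post_sd:.3f}")
```

The `mean` and `sd` columns reported by `az.summary` should be close to these values (about 0.556 and 0.157), up to Monte Carlo error.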

## Interpretation (Qualitative)

- The posterior distribution describes the uncertainty about `p` that remains after observing the data
- With only seven observations, a wide range of values of `p` remains plausible
- The goal is not a single estimate, but a distribution of possible values
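
One way to see the posterior as a whole distribution, without any sampler, is a grid approximation: evaluate prior times likelihood on a grid of candidate values of `p` and normalize. This is a minimal sketch under the same Beta(1, 1) prior. The grid size, the 90% interval, and all variable names are choices made here for illustration. The interval is equal-tailed, so it need not match the HDI that ArviZ reports.

```python
import numpy as np

observed = np.array([1, 0, 1, 1, 0, 1, 0])  # same outcomes as `observed_data`
k = observed.sum()   # number of successes
n = observed.size    # number of observations

# Candidate values of p on an evenly spaced grid
grid = np.linspace(0, 1, 1001)

# Unnormalized posterior: Bernoulli likelihood times the flat Beta(1, 1) prior
unnormalized = grid**k * (1 - grid) ** (n - k)
posterior = unnormalized / unnormalized.sum()  # sums to 1 over the grid

# Summaries of the whole distribution rather than a single estimate
post_mean = (grid * posterior).sum()
cdf = np.cumsum(posterior)
lower = grid[np.searchsorted(cdf, 0.05)]
upper = grid[np.searchsorted(cdf, 0.95)]
prob_above_half = posterior[grid > 0.5].sum()

print(f"Posterior mean of p: {post_mean:.3f}")
print(f"Central 90% interval: [{lower:.3f}, {upper:.3f}]")
print(f"P(p > 0.5 | data): {prob_above_half:.3f}")
```

The width of the interval, rather than the mean alone, is what conveys how much the seven observations tell us about `p`.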

Formal interpretation will be developed in later lectures.

## Notes

- This is a minimal example designed for intuition
- Neither sampling diagnostics nor model comparison is covered yet
- More complex models will follow in later lectures