# HW: Hypothesis testing for a crooked die 

We observe 60 rolls of a die, with 19 outcomes equal to 3.

We want a hypothesis test matched to the *specific* alternative:

- the die has a higher probability of rolling a 3

Let $p$ be the probability of rolling a 3, and let $K$ be the number of 3's in $n$ rolls.

## Hypotheses
\[
H_0: p = \frac{1}{6}
\qquad
H_1: p > \frac{1}{6}
\]

## Test statistic
A natural statistic for this one-sided alternative is the count of 3's
\[
T = K = \sum_{i=1}^n \mathbf{1}\{X_i = 3\}
\]
Large values of $K$ support $H_1$.
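
As a minimal sketch of how the statistic is computed from raw data, the snippet below counts the indicator $\mathbf{1}\{X_i = 3\}$ over a short sequence of rolls. The list `rolls` is hypothetical and only for illustration; it is not the homework data, for which only the summary $n = 60$, $k_{\mathrm{obs}} = 19$ is given.

```python
# Hypothetical sequence of die rolls, for illustration only (not the homework data).
rolls = [3, 1, 6, 3, 2, 5, 3, 4]

# T = K = sum over i of the indicator 1{X_i = 3}
k = sum(1 for x in rolls if x == 3)
print(k)  # → 3
```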

## p-value definition
The p-value is the probability, under the null model, that the test statistic is at least as extreme (here, at least as large) as the observed value $k_{\mathrm{obs}}$:
\[
p\text{-value} = \Pr(K \ge k_{\mathrm{obs}} \mid H_0)
\]

We'll compute:
1. an exact p-value (binomial tail)
2. a simulation p-value (Monte Carlo)


In [None]:
import numpy as np
import math

# Observed data
n = 60
k_obs = 19

# Null model: fair die => p(3) = 1/6
p0 = 1/6

n, k_obs, p0


## Exact p-value (binomial tail)

Under $H_0$, the number of 3's satisfies
\[
K \sim \mathrm{Bin}(n, p_0),
\qquad
p_0 = \frac{1}{6}
\]

Therefore, the one-sided $p$-value is
\[
p\text{-value}
= \Pr(K \ge k_{\mathrm{obs}} \mid H_0)
= \sum_{k = k_{\mathrm{obs}}}^{n}
\binom{n}{k}
\left(\frac{1}{6}\right)^k
\left(\frac{5}{6}\right)^{\,n-k}
\]
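
As a cross-check on floating-point summation, the same tail sum can be evaluated in exact rational arithmetic. This is a sketch using only the standard library; the helper name `binomial_upper_tail` is an assumption introduced here, not part of the original notebook.

```python
from fractions import Fraction
from math import comb

def binomial_upper_tail(n, k_obs, p):
    """Return Pr(K >= k_obs) for K ~ Bin(n, p) by summing the pmf directly.

    If p is a Fraction, the result is an exact Fraction.
    """
    q = 1 - p
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(k_obs, n + 1))

# Same inputs as the homework: n = 60 rolls, 19 threes, fair-die null p0 = 1/6.
p_exact = binomial_upper_tail(60, 19, Fraction(1, 6))
print(float(p_exact))
```

Because every term is an exact fraction, summing the whole pmf from $k = 0$ gives exactly 1, which is a convenient check that the formula has been coded correctly.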


In [None]:
p_value_exact = sum(
    math.comb(n, k) * (p0**k) * ((1 - p0)**(n - k))
    for k in range(k_obs, n + 1)
)

p_value_exact


## Simulation p-value (Monte Carlo)

Simulate many experiments under $H_0$:

- In each experiment: roll a fair die $n$ times
- Count how many 3's occur
- Estimate the p-value as the fraction of experiments with $K \ge k_{\mathrm{obs}}$



In [None]:
rng = np.random.default_rng(0)

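def simulate_k_under_H0(n, iters):
    """Simulate `iters` experiments of `n` fair-die rolls; return the count of 3s in each."""
    # Each row is one experiment; integers(1, 7) draws uniformly from 1..6 (upper bound exclusive)
    rolls = rng.integers(1, 7, size=(iters, n))
    # Count the 3s in each row to get one simulated K per experiment
    return np.sum(rolls == 3, axis=1)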

iters = 200_000
ks = simulate_k_under_H0(n=n, iters=iters)

p_value_sim = np.mean(ks >= k_obs)
p_value_sim


## Compare exact vs simulation

The simulation estimate should agree with the exact binomial tail up to Monte Carlo error, which shrinks as the number of simulated experiments grows.
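
To judge how close is close enough, one option is to compare the gap with the approximate standard error of a simulated proportion, $\sqrt{\hat p (1 - \hat p) / N}$, where $N$ is the number of simulated experiments. The sketch below is not part of the original notebook: the helper `mc_standard_error` and the input value `0.003` are illustrative assumptions, to be replaced by `p_value_sim` and `iters` from the cells above.

```python
import math

def mc_standard_error(p_hat, iters):
    """Approximate standard error of a Monte Carlo estimate of a proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / iters)

# Hypothetical inputs for illustration; substitute p_value_sim and iters.
se = mc_standard_error(p_hat=0.003, iters=200_000)
print(f"approximate standard error: {se:.2e}")
```

A difference between the exact and simulated p-values of a few standard errors or less is what Monte Carlo noise alone would typically produce.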


In [None]:
p_value_exact, p_value_sim, abs(p_value_exact - p_value_sim)


## Conclusion

The $p$-value is the probability, assuming a fair die, of seeing 19 or more threes in 60 rolls.

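A small $p$-value indicates that an outcome at least this extreme would be unlikely under $H_0$, which is evidence in favor of $p > 1/6$. If the $p$-value falls below a significance level chosen in advance (for example, 0.05), we reject $H_0$.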
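
As a rough sanity check on scale (a heuristic, not part of the test itself), the null mean and standard deviation of $K$ for $n = 60$ and $p_0 = 1/6$ are
\[
\mathbb{E}[K] = n p_0 = 10,
\qquad
\mathrm{sd}(K) = \sqrt{n p_0 (1 - p_0)} = \sqrt{60 \cdot \tfrac{1}{6} \cdot \tfrac{5}{6}} \approx 2.89,
\]
so the observed count lies about $(19 - 10)/2.89 \approx 3.1$ standard deviations above what a fair die would produce on average, which is consistent with a small upper-tail probability.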


In [None]:
print(f"Exact one-sided p-value: {p_value_exact:.6f}")
print(f"Simulated one-sided p-value (iters={iters:,}): {p_value_sim:.6f}")
