# Week 11: Binomial and Poisson Distributions

**Course**: BSMA1002 - Statistics for Data Science I  
**Topic**: Discrete Probability Distributions  
**Week**: 11

## 🎯 Objectives
- Understand Bernoulli Trials and Binomial Distribution
- Work with Poisson Distribution for rare events
- Compare Binomial and Poisson (Limiting case)
- Apply distributions to A/B testing and quality control

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

plt.style.use('seaborn-v0_8-whitegrid')
print("Setup complete!")

## 1. Binomial Distribution

Models the number of successes in $n$ independent Bernoulli trials, each with the same success probability $p$.
$$X \sim B(n, p)$$
$$P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$$

In [None]:
# Parameters
n = 10
p = 0.5
binom_rv = stats.binom(n, p)

# Probabilities
print(f"P(X=5): {binom_rv.pmf(5):.4f}")
print(f"P(X<=5): {binom_rv.cdf(5):.4f}")

# Visualization
x = np.arange(0, n+1)
plt.figure(figsize=(8, 5))
plt.bar(x, binom_rv.pmf(x), alpha=0.7, color='blue', label=f'Binomial(n={n}, p={p})')
plt.title(f"Binomial Distribution (n={n}, p={p})")
plt.xlabel("Number of Successes")
plt.ylabel("Probability")
plt.legend()
plt.show()
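
As a cross-check on the PMF formula for $B(n, p)$, the same two probabilities can be computed directly with the standard library. This is a minimal sketch; `binomial_pmf` is a hypothetical helper name introduced here for illustration, not a SciPy function.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), computed directly from the formula."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5

# P(X = 5) = C(10, 5) / 2^10 = 252 / 1024
p_eq_5 = binomial_pmf(5, n, p)

# P(X <= 5) is the sum of the PMF over k = 0..5 = 638 / 1024
p_le_5 = sum(binomial_pmf(k, n, p) for k in range(6))

print(f"P(X=5): {p_eq_5:.4f}")
print(f"P(X<=5): {p_le_5:.4f}")
```

The printed values should agree with `binom_rv.pmf(5)` and `binom_rv.cdf(5)` to four decimal places.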

## 2. Poisson Distribution

Models the number of events occurring in a fixed interval of time or space, assuming events occur independently at a constant average rate $\lambda$.
$$X \sim \text{Poisson}(\lambda)$$
$$P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$$

In [None]:
# Parameters
lam = 3
poisson_rv = stats.poisson(lam)

# Probabilities
print(f"P(X=3): {poisson_rv.pmf(3):.4f}")
print(f"P(X>5): {poisson_rv.sf(5):.4f}")

# Visualization
x = np.arange(0, 15)
plt.figure(figsize=(8, 5))
plt.bar(x, poisson_rv.pmf(x), alpha=0.7, color='green', label=f'Poisson(λ={lam})')
plt.title(f"Poisson Distribution (λ={lam})")
plt.xlabel("Number of Events")
plt.ylabel("Probability")
plt.legend()
plt.show()
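
The Poisson PMF formula can be checked the same way, without SciPy. The sketch below assumes $\lambda = 3$, as in the cell above; `poisson_pmf` is a hypothetical helper name. It also shows that the survival function $P(X > 5)$ is one minus the sum of the PMF over $k = 0, \dots, 5$.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed directly from the formula."""
    return exp(-lam) * lam**k / factorial(k)

lam = 3

# P(X = 3) = e^{-3} * 3^3 / 3! = 4.5 * e^{-3}
p_eq_3 = poisson_pmf(3, lam)

# P(X > 5) = 1 - P(X <= 5), the quantity returned by a survival function at 5
p_gt_5 = 1 - sum(poisson_pmf(k, lam) for k in range(6))

print(f"P(X=3): {p_eq_3:.4f}")
print(f"P(X>5): {p_gt_5:.4f}")
```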

## 3. Poisson as Limit of Binomial

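When $n$ is large and $p$ is small, $B(n, p) \approx \text{Poisson}(\lambda)$ where $\lambda = np$. More precisely, if $n \to \infty$ and $p \to 0$ while $np = \lambda$ stays fixed, the Binomial PMF converges to the Poisson PMF for every $k$.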

In [None]:
# Comparison
n_large = 100
p_small = 0.05
lam_approx = n_large * p_small  # 5

binom_limit = stats.binom(n_large, p_small)
poisson_limit = stats.poisson(lam_approx)

x = np.arange(0, 20)

plt.figure(figsize=(10, 6))
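# Build the labels from the parameters so they stay correct if the values change
plt.plot(x, binom_limit.pmf(x), 'bo-', label=f'Binomial({n_large}, {p_small})')
plt.plot(x, poisson_limit.pmf(x), 'r--', label=f'Poisson({lam_approx:g})')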
plt.title("Poisson Approximation to Binomial")
plt.xlabel("k")
plt.ylabel("Probability")
plt.legend()
plt.show()
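
To put a number on how good the approximation is, one option is to hold $\lambda = np$ fixed and watch the largest pointwise gap between the two PMFs shrink as $n$ grows. The sketch below is illustrative only: the choice of $n \in \{20, 100, 1000\}$, the range $k = 0, \dots, 30$, and the helper names are assumptions made here, not part of the original notebook.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p); zero when k exceeds n."""
    if k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 5  # keep lambda = n * p fixed while n grows
max_abs_error = {}

for n in (20, 100, 1000):
    p = lam / n
    # Largest pointwise difference between the two PMFs over k = 0..30
    max_abs_error[n] = max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(31)
    )
    print(f"n={n:>4}, p={p:.3f}: max |Binomial - Poisson| = {max_abs_error[n]:.5f}")
```

The gap should get smaller at each step, which is the limiting behaviour the plot illustrates for a single pair $(n, p)$.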

## 4. Application: A/B Testing

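Suppose the baseline conversion rate is 10%, and we test a new feature on 100 users.
If 15 users convert, is the improvement statistically significant at the 5% level?
Under the null hypothesis that the rate is unchanged, model the number of conversions as $X \sim B(100, 0.1)$ and calculate the one-sided p-value $P(X \ge 15)$.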

In [None]:
# A/B Test Significance
n_test = 100
p_baseline = 0.10
observed_conversions = 15

# Null Hypothesis: p = 0.10
# P-value: Probability of observing >= 15 conversions given H0 is true
# sf(k - 1) = P(X > k - 1) = P(X >= k); it avoids the rounding error of 1 - cdf
p_value = stats.binom(n_test, p_baseline).sf(observed_conversions - 1)

print(f"Observed: {observed_conversions}/{n_test} ({observed_conversions / n_test:.0%})")
print(f"Baseline: {p_baseline:.0%}")
print(f"P-value: {p_value:.4f}")

if p_value < 0.05:
    print("Result is Statistically Significant! (Reject H0)")
else:
    print("Result is NOT Significant (Fail to reject H0)")
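
The same one-sided test can also be run with SciPy's built-in exact binomial test, which is a useful cross-check on the hand-computed tail probability. This sketch assumes SciPy 1.7 or later, where `scipy.stats.binomtest` is available; the variable names `result` and `tail_p_value` are introduced here for illustration.

```python
from scipy import stats

n_test = 100
p_baseline = 0.10
observed_conversions = 15

# Exact one-sided binomial test of H0: p = 0.10 against H1: p > 0.10
result = stats.binomtest(
    observed_conversions, n=n_test, p=p_baseline, alternative="greater"
)

# Tail probability P(X >= 15) computed directly from the Binomial distribution
tail_p_value = stats.binom(n_test, p_baseline).sf(observed_conversions - 1)

print(f"binomtest p-value: {result.pvalue:.4f}")
print(f"tail-sum p-value:  {tail_p_value:.4f}")
```

Both numbers should match, and with these inputs the p-value falls above 0.05, so 15 conversions out of 100 is not enough evidence to reject the 10% baseline at the 5% level.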