# Lecture 1 Hands-On: Poisson Warm-Up

This notebook pairs with Lecture 1. Start with a plain NumPy simulation of Poisson counts, build a minimal M/M/1 queue simulator, then move on to a SimPy arrival process.

## Setup

In [None]:
import pathlib

import numpy as np

try:
    import simpy
except ImportError as exc:
    raise SystemExit("SimPy is required for this notebook. Install via 'pip install simpy'.") from exc

RNG = np.random.default_rng(42)
NOTEBOOK_DIR = pathlib.Path.cwd()


---
## Part 1 — Plain Python Warm-Up
Follow the checklist from the slides:
1. Set the hourly rate `lam` and scale it to a half-hour window.
2. Simulate `n_windows = 1_000` Poisson counts for the 30 minute window.
3. Report the sample mean and variance versus the theoretical value `lambda_window`.
4. Estimate `P(X >= 1)` empirically as the fraction of non-zero draws; its complement gives `P(X = 0)`.
5. Summarise the takeaways in the Markdown cell below.
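
As a reference point for the checklist above, here is one possible sketch, not the only valid solution. It assumes a locally seeded generator (`rng`) rather than the notebook's global `RNG`, and uses the unbiased sample variance (`ddof=1`), which is a choice rather than a requirement.

```python
import numpy as np

rng = np.random.default_rng(42)  # local generator, assumed for reproducibility

lam_per_hour = 4
lambda_window = lam_per_hour * 0.5  # expected arrivals in a 30 minute window
n_windows = 1_000

# One Poisson draw per half-hour window
counts = rng.poisson(lam=lambda_window, size=n_windows)

sample_mean = counts.mean()
sample_var = counts.var(ddof=1)   # unbiased sample variance
p_zero = np.mean(counts == 0)     # fraction of windows with no event
p_ge_one = 1.0 - p_zero           # fraction of windows with at least one event

print(sample_mean, sample_var, p_ge_one, p_zero)
```

Both the sample mean and the sample variance should land near `lambda_window`, since a Poisson distribution has equal mean and variance.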

In [None]:
lam_per_hour = 4
lambda_window = lam_per_hour * 0.5  # TODO: confirm the scaling
n_windows = 1_000

# TODO: simulate Poisson counts and compute sample mean, variance, and probability of at least one event
counts = None
sample_mean = None
sample_var = None
p_ge_one = None
p_zero = None

sample_mean, sample_var, p_ge_one, p_zero


### Theoretical Benchmarks

For a Poisson random variable $X \sim \mathrm{Poi}(\lambda)$ with $\lambda = 2$ (half-hour window):
- $\mathbb{E}[X] = \lambda$.
- $\operatorname{Var}(X) = \lambda$.
- $\mathbb{P}[X = 0] = e^{-\lambda}$, hence $\mathbb{P}[X \ge 1] = 1 - e^{-\lambda}$.

Fill in the following cell to compute the exact values numerically.
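
A minimal sketch of the closed-form values, assuming $\lambda = 2$ as stated above. The variable names here are illustrative and deliberately differ from the ones in the exercise cell.

```python
import math

lam = 2.0  # half-hour rate: 4 arrivals per hour * 0.5 hours

mean_exact = lam                      # E[X] = lambda
var_exact = lam                       # Var(X) = lambda
p_zero_exact = math.exp(-lam)         # P(X = 0) = exp(-lambda)
p_ge_one_exact = 1.0 - p_zero_exact   # P(X >= 1) = 1 - exp(-lambda)

print(mean_exact, var_exact, round(p_ge_one_exact, 4), round(p_zero_exact, 4))
```

For $\lambda = 2$ this gives roughly 0.865 for at least one event and 0.135 for an empty window.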

In [None]:
import math

exact_mean = None
exact_variance = None
exact_p_ge_one = None
exact_p_zero = None

exact_mean, exact_variance, exact_p_ge_one, exact_p_zero

### Compare Simulation vs. Theory

Compute absolute errors between your simulated statistics and the theoretical values above. Check whether the discrepancies are within the tolerance you expect for $n_\text{windows} = 1000$.

In [None]:
# TODO: replace `sample_mean`, `sample_var`, `p_ge_one` after computing them above
errors = {
    'mean_error': abs(sample_mean - exact_mean),
    'variance_error': abs(sample_var - exact_variance),
    'prob_error': abs(p_ge_one - exact_p_ge_one),
}
errors

### Reasonable Tolerance Bounds

Determine a quantitative tolerance for each statistic. One approach is to use a normal approximation or Chebyshev's inequality to bound the error of the estimated mean, variance, and probability. Justify your choice and verify that the simulation output lies within your bounds.
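
One way to make this concrete is a normal approximation with a band of three standard errors. The sketch below is an assumption-laden example, not the required answer: the multiplier `z = 3` is arbitrary, and the standard error of the sample variance uses the large-sample approximation $\operatorname{Var}(S^2) \approx (\mu_4 - \sigma^4)/n$, which for a Poisson variable (fourth central moment $\mu_4 = \lambda + 3\lambda^2$) becomes $(\lambda + 2\lambda^2)/n$.

```python
import math

lam = 2.0   # half-hour Poisson rate
n = 1_000   # number of simulated windows
z = 3.0     # assumed width of the band, in standard errors

# Standard error of the sample mean: sqrt(Var(X) / n)
se_mean = math.sqrt(lam / n)

# Large-sample standard error of the sample variance for Poisson data
se_var = math.sqrt((lam + 2.0 * lam**2) / n)

# Standard error of a binomial proportion estimating P(X >= 1)
p = 1.0 - math.exp(-lam)
se_prob = math.sqrt(p * (1.0 - p) / n)

tolerances = {
    "mean_tolerance": z * se_mean,
    "variance_tolerance": z * se_var,
    "prob_tolerance": z * se_prob,
}
print(tolerances)
```

Under the normal approximation, an estimate falls outside a three-standard-error band only about 0.3% of the time, so a violation is more likely a bug than bad luck. Chebyshev's inequality gives a distribution-free but much looser guarantee for the same band.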

In [None]:
# TODO: Define tolerances justified by your reasoning
mean_tolerance = None
variance_tolerance = None
prob_tolerance = None

checks = {
    'mean_within_tolerance': errors['mean_error'] <= mean_tolerance if mean_tolerance is not None else None,
    'variance_within_tolerance': errors['variance_error'] <= variance_tolerance if variance_tolerance is not None else None,
    'prob_within_tolerance': errors['prob_error'] <= prob_tolerance if prob_tolerance is not None else None,
}
checks

### Commentary

Briefly discuss whether the simulation agrees with theory given your tolerances. If it does not, refine your reasoning or increase the number of simulations until the comparison is satisfactory.

---
## Part 2 — M/M/1 Queue in Plain Python

We begin with an end-to-end view of the M/M/1 queue before touching SimPy.
- **Arrivals** follow a Poisson process with rate $\lambda$ (exponential inter-arrival times).
- **Service times** are i.i.d. exponential with rate $\mu$.
- A single server works first-come/first-served, with unlimited waiting room.

Key quantities to track: utilisation $\rho = \lambda/\mu$, waiting times $W_q$, sojourn times $W$, and queue length process $L(t)$. We'll implement a minimal discrete-event simulator using only basic Python + NumPy.

In [None]:
# Queue and simulation parameters (feel free to tweak)
lambda_rate = 4.0   # arrivals per hour
mu_rate = 6.0       # services per hour
sim_hours = 2.0
max_events = 5_000  # safety cap to avoid infinite loops

rho = lambda_rate / mu_rate
rho

### Task: Implement a Plain-Python Simulator
Steps to follow inside the skeleton below:
1. Generate exponential inter-arrival and service times using `rng.exponential` (remember rate vs. scale).
2. Keep track of the next arrival time and when the server becomes free.
3. For each arrival, decide when service starts (max of arrival time and server-available time), then update departure time.
4. Record per-customer metrics: waiting time in queue, total time in system, and queue length just before arrival.
5. Stop when simulated time exceeds `sim_hours` or you hit the safety cap.
6. Return a dictionary with raw logs to analyse later.
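
The steps above can be sketched as follows. This is one possible shape for the simulator, shown under a different name (`sketch_mm1`) so it does not replace your own implementation. The dictionary keys and the choice to stop admitting customers once an arrival falls past `sim_hours` are assumptions of this sketch.

```python
import bisect

import numpy as np


def sketch_mm1(lambda_rate, mu_rate, sim_hours, rng, max_events=5_000):
    """Illustrative M/M/1 simulator; returns per-customer logs as arrays."""
    arrivals, starts, departures, seen_in_system = [], [], [], []
    t = 0.0               # time of the most recent arrival
    server_free_at = 0.0  # time at which the server next becomes idle

    while len(arrivals) < max_events:
        # NumPy's exponential takes the scale (mean), i.e. 1 / rate
        t += rng.exponential(1.0 / lambda_rate)
        if t > sim_hours:
            break

        # Under FCFS with one server, departures are sorted, so a binary
        # search counts how many earlier customers have already left.
        n_departed = bisect.bisect_right(departures, t)
        seen_in_system.append(len(departures) - n_departed)

        start = max(t, server_free_at)  # wait if the server is busy
        server_free_at = start + rng.exponential(1.0 / mu_rate)

        arrivals.append(t)
        starts.append(start)
        departures.append(server_free_at)

    arrivals = np.array(arrivals)
    starts = np.array(starts)
    departures = np.array(departures)
    return {
        "arrival": arrivals,
        "start": starts,
        "departure": departures,
        "wait": starts - arrivals,         # time in queue
        "sojourn": departures - arrivals,  # total time in system
        "seen_in_system": np.array(seen_in_system),
    }


demo_logs = sketch_mm1(4.0, 6.0, 200.0, np.random.default_rng(0), max_events=100_000)
print(len(demo_logs["arrival"]), demo_logs["wait"].mean(), demo_logs["sojourn"].mean())
```

The count in `seen_in_system` includes the customer in service; subtract one (floored at zero) to get the number waiting in the queue just before each arrival.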

In [None]:
# TODO: complete the simulator

def simulate_mm1_basic(lambda_rate, mu_rate, sim_hours, rng, max_events=5_000):
    """Return a dict containing arrival/departure logs for an M/M/1 queue."""
    raise NotImplementedError

mm1_logs = simulate_mm1_basic(lambda_rate, mu_rate, sim_hours, RNG, max_events=max_events)
list(mm1_logs.keys())

### Theoretical Benchmarks for M/M/1
For $\rho = \lambda/\mu < 1$ the steady-state metrics are:
- $L = \dfrac{\rho}{1-\rho}$ (expected number in system)
- $L_q = \dfrac{\rho^2}{1-\rho}$ (expected number waiting)
- $W = \dfrac{1}{\mu - \lambda}$ (expected time in system)
- $W_q = \dfrac{\rho}{\mu - \lambda}$ (expected waiting time)

Compute them numerically below for the chosen parameters.

In [None]:
theoretical = {
    'L': rho / (1 - rho),
    'L_q': (rho**2) / (1 - rho),
    'W': 1 / (mu_rate - lambda_rate),
    'W_q': rho / (mu_rate - lambda_rate),
}
theoretical

### Analyse the Simulation Output
Using the raw logs, derive empirical estimates for the same metrics:
- Average number in system / queue (e.g., via time averaging or Little's Law).
- Sample means for waiting time and total time in system.
Then compare to the theoretical values above.
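
A hedged sketch of the time-averaging approach follows. The helper `estimate_mm1_metrics` and its argument names are hypothetical; adapt them to whatever keys your own log dictionary uses. It assumes every logged arrival falls inside `[0, horizon]` and truncates time in the system or queue at the horizon.

```python
import numpy as np


def estimate_mm1_metrics(arrivals, starts, departures, horizon):
    """Estimate L, L_q, W, W_q from per-customer timestamps (illustrative)."""
    arrivals = np.asarray(arrivals, dtype=float)
    starts = np.asarray(starts, dtype=float)
    departures = np.asarray(departures, dtype=float)

    # Per-customer averages
    waits = starts - arrivals
    sojourns = departures - arrivals

    # Time averages: total customer-hours observed inside [0, horizon],
    # divided by the horizon length
    time_in_system = np.minimum(departures, horizon) - arrivals
    time_in_queue = np.minimum(starts, horizon) - arrivals

    return {
        "L_est": time_in_system.sum() / horizon,
        "L_q_est": time_in_queue.sum() / horizon,
        "W_est": sojourns.mean(),
        "W_q_est": waits.mean(),
    }


# Tiny hand-checkable example: three customers over a one-hour horizon
toy = estimate_mm1_metrics(
    arrivals=[0.0, 0.1, 0.5],
    starts=[0.0, 0.2, 0.5],
    departures=[0.2, 0.4, 0.6],
    horizon=1.0,
)
print(toy)
```

In the toy example the customers spend 0.2, 0.3, and 0.1 hours in the system, so the time-average number in system is 0.6 and the mean sojourn time is 0.2. With an observed arrival rate of 3 per hour, this matches Little's Law, $L = \lambda W$.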

In [None]:
# TODO: derive empirical metrics from mm1_logs
empirical = {
    'L_est': None,
    'L_q_est': None,
    'W_est': None,
    'W_q_est': None,
}
empirical

### Sanity Check
Compute absolute errors between empirical estimates and theory. Comment on whether the run length (`sim_hours`) and sample size are enough to match steady-state predictions, and what adjustments you would make if not.

In [None]:
# TODO: compare empirical metrics to theory (once filled)
if all(value is not None for value in empirical.values()):
    mm1_errors = {
        'L_error': abs(empirical['L_est'] - theoretical['L']),
        'L_q_error': abs(empirical['L_q_est'] - theoretical['L_q']),
        'W_error': abs(empirical['W_est'] - theoretical['W']),
        'W_q_error': abs(empirical['W_q_est'] - theoretical['W_q']),
    }
else:
    mm1_errors = None
mm1_errors

### Reflection
Summarise your findings: does the simple simulator agree with the closed-form M/M/1 results? Note any sources of discrepancy (finite horizon, warm-up bias, random fluctuation) and proposed fixes.

### From Analytical Models to Event Simulation
Parts 1 and 2 gave us the arrival distribution and queue behaviour using plain NumPy. We now carry those ingredients into SimPy so that the event scheduling matches the assumptions we've already validated.

---
## Part 3 — SimPy Arrival Stream
Recreate the same Poisson arrival process inside SimPy. Treat this as the event-driven counterpart to the NumPy simulations:
1. Reuse `lam_per_hour` from Part 1 for the arrival rate.
2. Implement `arrival_process(env, lam, rng, log)` that draws exponential inter-arrival times with mean `1/lam` (the same process that produces the Poisson counts of Part 1) and records `env.now` at each arrival, like the arrival timestamps you logged for M/M/1 in Part 2.
3. Write `simulate_arrivals` that creates a SimPy environment, launches the process, and runs it for `duration_hours`.
4. After the run, analyse the timestamps to recover **both** the inter-arrival distribution and the half-hour counts. Compare these to the theoretical benchmarks from Part 1 and use the tolerances you developed there.

In [None]:
lam = lam_per_hour  # reuse the rate from Part 1
window_hours = 0.5
simpy_duration_hours = 2.0
num_windows = int(simpy_duration_hours / window_hours)  # half-hour windows in the run
simpy_rng = RNG  # reuse the global RNG unless you prefer a fresh seed

In [None]:
# TODO: implement the arrival process and simulation driver
def arrival_process(env, lam, rng, log):
    raise NotImplementedError

def simulate_arrivals(lam, duration_hours, rng):
    raise NotImplementedError

arrival_log = simulate_arrivals(lam, simpy_duration_hours, simpy_rng)
arrival_log[:5]

### Analyse the Arrival Log
- Compute inter-arrival samples and verify their mean and variance against exponential theory (mean $1/\lambda$, variance $1/\lambda^2$).
- Bucket arrivals into half-hour windows and compare empirical counts to the Part 1 metrics (mean, variance, $P(X\ge 1)$, etc.).
- Comment on whether the SimPy results respect the tolerances you set earlier.
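
The binning step can be sketched with `np.diff` and `np.histogram`. To stay self-contained, the example below builds a stand-in timestamp array from cumulative exponential gaps instead of using your SimPy log, and it assumes a 50 hour horizon so that there are 100 windows to average over (a 2 hour run yields only 4 windows, which is too few for stable estimates).

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed for the stand-in data
lam = 4.0                       # arrivals per hour
duration_hours = 50.0           # assumed horizon for this sketch
window_hours = 0.5

# Stand-in arrival log: cumulative sum of exponential gaps, cut at the horizon
gaps = rng.exponential(1.0 / lam, size=int(3 * lam * duration_hours))
timestamps = np.cumsum(gaps)
timestamps = timestamps[timestamps < duration_hours]

# Inter-arrival times: theory says mean 1/lam and variance 1/lam**2
inter_arrivals = np.diff(timestamps)

# Half-hour counts: one histogram bin per window
edges = np.arange(0.0, duration_hours + window_hours, window_hours)
window_counts, _ = np.histogram(timestamps, bins=edges)

print(inter_arrivals.mean(), inter_arrivals.var(ddof=1))
print(window_counts.mean(), window_counts.var(ddof=1), np.mean(window_counts >= 1))
```

The window statistics on the last line are directly comparable to the Part 1 benchmarks, but remember to recompute your tolerances for the actual number of windows rather than reusing the ones derived for 1,000 draws.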

In [None]:
# TODO: derive inter-arrival times, compute summary statistics, and produce at least one plot
# Suggested structure:
# 1. inter_arrivals = np.diff(arrival_log)
# 2. counts = ... (e.g., np.histogram or manual binning per half-hour)
# 3. Compare to `exact_mean`, `exact_variance`, `exact_p_ge_one`, etc.

### Stretch Goal
Replace the manual M/M/1 simulator with a SimPy version: inject the same arrivals into a SimPy `Resource`, add exponential services with rate `mu_rate`, and replicate the queueing metrics from Part 2. Compare the two implementations and explain any differences.