### 02b_matrix_coverage_fast_ops — end-to-end, sparse-matrix engine 

**Notebook purpose (plain language)**  
This notebook swaps the *engine* that finds minimum times in `02a_coverage`.  
Instead of pandas group-bys on the long travel table, we build two *sparse* matrices and use fast, vectorised reductions. All inputs, thresholds, blue-light factors, KPIs, and maps stay the same—only the internals get faster and more scalable, enabling instant “what-if” scenarios.

---

#### What this notebook does

- **Builds matrices (reshape only, no routing):**  
  - **R** (response): rows = demand LSOAs, cols = station LSOAs, values = minutes station→LSOA.  
  - **C** (conveyance): rows = demand LSOAs, cols = acute LSOAs, values = minutes LSOA→acute.
- **Computes nearest times (vectorised, no loops):**  
  - `t_resp = rowwise_min(R[:, active_stations])`  
  - `t_conv = rowwise_min(C[:, active_acutes])`
- **Applies business rules:** blue-light factors applied *after* minima (per-leg), thresholds as in 02a.
- **Outputs unchanged:** coverage KPIs (% pop within 7/15 and 18/40), binary coverage columns, maps.
- **Adds optional diagnostic:** `t_total = t_resp + on_scene_buffer + t_conv` (three-leg view).
- **Enables scenarios:** select different station sets by column subset—no re-grouping or re-reading.
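
The reductions listed above can be sketched on a tiny dense array. The values, `active_stations`, and `BLUE_LIGHT_FACTOR` below are made up for illustration, and missing pairs are represented as `np.inf`. Because the factor is a single positive scalar per leg, scaling after the minimum gives the same result as scaling every entry first.

```python
import numpy as np

# Toy response matrix: 3 demand LSOAs (rows) x 4 stations (cols), in minutes.
# np.inf marks a pair with no travel time (illustrative values only).
R = np.array([
    [12.0,   5.0,    np.inf, 30.0],
    [8.0,    20.0,   3.0,    np.inf],
    [np.inf, np.inf, 45.0,   9.0],
], dtype=np.float32)

active_stations = [0, 2]   # hypothetical scenario: only stations 0 and 2 are open
BLUE_LIGHT_FACTOR = 0.8    # hypothetical per-leg factor

# Row-wise minimum over the active columns, then apply the factor.
t_resp = R[:, active_stations].min(axis=1) * np.float32(BLUE_LIGHT_FACTOR)

# Scaling first and reducing second is equivalent for a positive scalar.
t_resp_alt = (R[:, active_stations] * np.float32(BLUE_LIGHT_FACTOR)).min(axis=1)

print(t_resp)
print(np.array_equal(t_resp, t_resp_alt))
```

Changing `active_stations` is the only step needed to evaluate a different station set; the matrix itself is untouched.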

---

#### Inputs (same as 02a)

- LSOA universe (`lsoa_index`), centroids, populations (+ optional IMD / rural-urban).  
- Station & acute site files resolved to LSOA codes.  
- Long-form travel-time table already in the repo (response & conveyance legs).  
- Thresholds & blue-light factors (ARP, handover, conveyance) defined up front.

---

#### Outputs

- KPIs for response & conveyance at configured thresholds (overall and, optionally, by IMD/rural-urban).  
- Binary coverage columns per threshold for mapping.  
- (Optional) End-to-end time columns for transparency in pathway discussions.

---

#### Performance & storage

- **Sparse CSR** matrices for R and C; optional “has-edge” masks to distinguish true zeros from missing pairs.  
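- **Radius thinning** (e.g., drop times > 60 min) to shrink matrices. Keep the radius at or above the largest threshold being reported so coverage KPIs are unchanged; LSOAs whose nearest site lies beyond the radius become ∞.  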
- **Caching:** save matrices + ordered labels to `data/.../matrices/*.npz` for instant reloads.
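
A minimal sketch of the storage ideas above, on made-up values. The mask is built from the matrix's own sparsity pattern, so an explicitly stored 0.0 (a genuine zero-minute pair) stays distinct from a pair that was never supplied. The matrix is then round-tripped through `scipy.sparse.save_npz` and `load_npz`. The file name `toy_csr.npz` and the temporary directory are placeholders, not the notebook's cache location.

```python
import tempfile
from pathlib import Path

import numpy as np
from scipy import sparse

# Toy minutes: 3 demand rows x 2 site columns. The pair (row 0, col 0) is a
# genuine 0-minute trip; (row 1, col 1) and all of row 2 are missing.
rows = np.array([0, 0, 1])
cols = np.array([0, 1, 0])
vals = np.array([0.0, 7.5, 4.0], dtype=np.float32)

mat = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 2)).tocsr()

# Mask shares the sparsity pattern: 1 wherever a travel time was supplied.
mask = sparse.csr_matrix(
    (np.ones(len(mat.data), dtype=np.uint8), mat.indices, mat.indptr),
    shape=mat.shape,
)

dense = mat.toarray()
dense[~mask.toarray().astype(bool)] = np.inf  # missing pairs become +inf
row_min = dense.min(axis=1)

# Round-trip through the .npz cache format.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy_csr.npz"
    sparse.save_npz(path, mat)
    reloaded = sparse.load_npz(path)

print(row_min)
```

Without the mask, the dense view of row 2 would read as two 0-minute trips rather than "no reachable site".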

---

#### Scenario selector

Define scenarios as sets of active station/acute columns (e.g., `baseline`, `baseline + Site X`).  
Switching scenario = re-taking per-row minima → near-instant “what-if” diffs and coverage deltas.
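
As a rough illustration of a coverage delta, the sketch below compares two scenarios on a toy matrix. The matrix, populations, scenario names, and the helper `pct_within` are all invented for the example; only the idea of scenarios as lists of active column indices comes from the notebook.

```python
import numpy as np

# Toy inputs (illustrative): 4 LSOAs x 3 candidate stations, in minutes.
R = np.array([
    [6.0,  20.0, 14.0],
    [25.0, 5.0,  30.0],
    [40.0, 35.0, 6.5],
    [16.0, 22.0, 19.0],
])
population = np.array([1000.0, 2000.0, 1500.0, 500.0])

# A scenario is just a list of active column indices.
scenarios = {
    "baseline": [0, 1],
    "baseline_plus_site_x": [0, 1, 2],  # hypothetical extra site in column 2
}

def pct_within(times, threshold, weights):
    """Population-weighted percentage of LSOAs with time <= threshold."""
    return float((weights * (times <= threshold)).sum() / weights.sum() * 100.0)

coverage = {
    name: pct_within(R[:, cols].min(axis=1), 7, population)
    for name, cols in scenarios.items()
}
delta = coverage["baseline_plus_site_x"] - coverage["baseline"]
print(coverage, delta)
```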

---

#### Validation (first run)

- **Parity check vs 02a:** times and KPIs should match within tight tolerance on the Cornwall slice.  
- After validation, retire the legacy `min_time_from_any_origin` calls in this notebook.
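
One way to frame the parity check is shown below on a toy long table. The group-by here is only a stand-in for the 02a engine (the real `min_time_from_any_origin` is not reproduced), and the codes and times are invented.

```python
import numpy as np
import pandas as pd

# Toy long-form table: station (origin) -> demand LSOA (dest), in minutes.
travel = pd.DataFrame({
    "origin_lsoa": ["S1", "S1", "S2", "S2", "S1"],
    "dest_lsoa": ["A", "B", "A", "B", "C"],
    "time_car_min": [10.0, 4.0, 6.0, 9.0, 12.0],
})
demand = pd.Index(["A", "B", "C"], name="lsoa_code")
stations = pd.Index(["S1", "S2"])

# Legacy-style engine: group-by minimum on the long table.
t_groupby = (
    travel[travel["origin_lsoa"].isin(stations)]
    .groupby("dest_lsoa")["time_car_min"]
    .min()
    .reindex(demand)
)

# Matrix engine: pivot to demand x station, missing pairs as +inf, row-wise min.
R = (
    travel.pivot(index="dest_lsoa", columns="origin_lsoa", values="time_car_min")
    .reindex(index=demand, columns=stations)
    .fillna(np.inf)
    .to_numpy()
)
t_matrix = R.min(axis=1)

# Parity: raises AssertionError if the two engines disagree.
np.testing.assert_allclose(t_matrix, t_groupby.to_numpy(), rtol=0, atol=1e-6)
print(t_matrix)
```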

---

#### Notes & cautions

- Any change to travel-time inputs, site lists, or the thinning radius **invalidates caches** → rebuild matrices.  
- LSOAs with no reachable station/acute remain at **∞** and are reported explicitly.  
- Keep LSOA codes categorical and ordering stable to avoid misalignment bugs.
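
Two of these cautions can be checked mechanically. The sketch below, with placeholder labels and times, compares cached row labels against the current universe before a cache is reused and lists the LSOAs left at ∞.

```python
import numpy as np

# Row labels saved with a cached matrix vs. the current universe (placeholders).
cached_labels = np.array(["A", "B", "C"])
current_labels = np.array(["A", "B", "C"])

# Same codes in the same order; otherwise the cache should be rebuilt.
cache_is_aligned = np.array_equal(cached_labels, current_labels)

# Nearest times for those rows; +inf marks an LSOA with no reachable site.
t_resp = np.array([6.2, np.inf, 14.8])
unreachable = current_labels[~np.isfinite(t_resp)]

print(cache_is_aligned, unreachable.tolist())
```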

---

#### Quick explainer (02a → 02b)

| Area / Step | 02a does now | 02b change | Benefit |
|---|---|---|---|
| Min times | pandas group-by on long table | rowwise min on sparse matrices | Much faster; scalable; no loops |
| Scenarios | re-filter + re-group | column subset on R/C | Instant “what-if” |
| Blue-light | applied in KPIs | apply after minima per leg | Nearest site is unaffected by a per-leg scalar; factors can change without rebuilding R/C |
| Outputs | KPIs, maps | same | No UX change |



In [10]:
# Step 0 — Imports, params (updated: ARP + handover thresholds)
from pathlib import Path
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
from matplotlib.lines import Line2D

# Paths
DATA_ROOT = Path(
    "/Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/"
    "GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level"
)

# Files
LOOKUP_CSV   = DATA_ROOT / "cornwall_icb_lsoa_lookup.csv"
AGE_GPKG     = DATA_ROOT / "demographics_age_continuous_icb.gpkg"
AGE_LAYER    = "LSOA_continuous_age_icb"
TRAVEL_CSV   = DATA_ROOT / "travel_matrix_lsoa_icb.csv"
STATIONS_CSV = DATA_ROOT / "ambulance_stations_icb.csv"
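ACUTE_CSV    = DATA_ROOT / "acute_hospitals_icb.csv"   # overlay optional

# Output folders used by Steps 2-4 (locations match the paths printed in the cell outputs)
MATRICES_DIR = DATA_ROOT / "matrices"
TABLES_DIR   = DATA_ROOT / "tables"
MATRICES_DIR.mkdir(parents=True, exist_ok=True)
TABLES_DIR.mkdir(parents=True, exist_ok=True)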

# --- Targets / thresholds ----------------------------------------------------
# ARP response standards (minutes)
RESP = {
    "cat1": {"mean": 7,  "p90": 15},
    "cat2": {"mean": 18, "p90": 40},
    # optional for completeness:
    "cat3": {"p90": 120},
    "cat4": {"p90": 180},
}
# Convenience tuple for KPI lookups used in this notebook
RESPONSE_THRESHOLDS = (RESP["cat1"]["mean"], RESP["cat1"]["p90"],
                       RESP["cat2"]["mean"], RESP["cat2"]["p90"])  # -> (7, 15, 18, 40)

# Hospital handover/turnaround (NOT scene→A&E drive time)
HANDOVER = {"target": 15, "breach": 30, "severe": 60}
HANDOVER_THRESHOLDS = (HANDOVER["target"], HANDOVER["breach"], HANDOVER["severe"])  # (15, 30, 60)

# Scene→A&E conveyance (geographic potential only; no national target)
# Keep if you want to map the drive leg; adjust locally if desired.
SCENE_TO_AE_THRESHOLDS = (30, 45, 60)

# Optional blue-light factors applied to car travel times (1.0 = off)
BLUE_LIGHT_FACTOR_RESPONSE = 1.0
BLUE_LIGHT_FACTOR_CONVEY   = 1.0

# Display prefs
pd.set_option("display.width", 120)
pd.set_option("display.max_columns", 120)


In [12]:
# Step 1 — Load, inspect, and clean core data (universe, population, travel, sites)
# (Expected on the Cornwall ICB slice: 336 LSOAs, 112,560 OD rows, 14 station LSOAs, 3 acute LSOAs)

from __future__ import annotations

from typing import Iterable, Dict
import re

# ---------- configurable cleaning rules (tweak if needed) ----------
ALLOW_DIAGONAL_ZERO = True     # keep 0 min where origin_lsoa == dest_lsoa
FLOOR_OFFDIAG_ZERO_MIN = 0.5   # minutes; applied only if origin != dest and time <= 0
DROP_NEGATIVES = True          # drop rows with time < 0 (data error)
MAX_SANITY_MIN = 300           # flag values > this as outliers (warn only)

# ---------- small helpers ----------
def _ok(msg: str) -> None:
    print(f"[OK] {msg}")

def _warn(msg: str) -> None:
    print(f"[WARN] {msg}")

def _fail(msg: str) -> None:
    raise AssertionError(msg)

def _expect_columns(df: pd.DataFrame, cols: Iterable[str], label: str) -> None:
    missing = [c for c in cols if c not in df.columns]
    if missing:
        _fail(f"{label}: missing columns {missing}")

# ---------- 1) LSOA universe ----------
lookup = pd.read_csv(LOOKUP_CSV, dtype={"lsoa_code": "string"})
_expect_columns(lookup, ["lsoa_code"], "LSOA lookup")
lookup = lookup.drop_duplicates(subset=["lsoa_code"]).copy()

lsoa_index = pd.Index(lookup["lsoa_code"].astype("string"), name="lsoa_code")
if lsoa_index.empty or not lsoa_index.is_unique:
    _fail("LSOA lookup must provide a non-empty, unique list of LSOA codes.")
_ok(f"Universe ready: {len(lsoa_index):,} LSOAs")

# ---------- 2) Population & geometry ----------
lsoa_g = gpd.read_file(AGE_GPKG, layer=AGE_LAYER)
_expect_columns(lsoa_g, ["lsoa_code", "geometry"], "Age GPKG")
lsoa_g["lsoa_code"] = lsoa_g["lsoa_code"].astype("string")

# Determine population: prefer population_total/population; else sum continuous ages
pop_col = next((c for c in ("population_total", "population") if c in lsoa_g.columns), None)
if pop_col:
    population = lsoa_g.set_index("lsoa_code")[pop_col].astype("float64")
else:
    age_cols = []
    for c in lsoa_g.columns:
        if c in ("lsoa_code", "geometry"):
            continue
        if (re.fullmatch(r"\d{1,3}\+?", str(c)) or str(c).startswith("age_")) and np.issubdtype(lsoa_g[c].dtype, np.number):
            age_cols.append(c)
    if not age_cols:
        _fail("Could not infer population: no 'population_total' nor numeric continuous age columns found.")
    population = lsoa_g.set_index("lsoa_code")[age_cols].sum(axis=1).astype("float64")

# Align to universe
population = population.reindex(lsoa_index).fillna(0.0)
lsoa_g = (
    lsoa_g[["lsoa_code", "geometry"]]
    .drop_duplicates("lsoa_code")
    .set_index("lsoa_code")
    .reindex(lsoa_index)
)
lsoa_g = gpd.GeoDataFrame(lsoa_g, geometry="geometry", crs=lsoa_g.crs)

_ok(f"Population loaded (sum={int(population.sum()):,}; non-zero LSOAs={(population > 0).sum():,}); CRS={lsoa_g.crs}")

# ---------- 3) Travel table (load → inspect → clean) ----------
travel = pd.read_csv(
    TRAVEL_CSV,
    dtype={"origin_lsoa": "string", "dest_lsoa": "string"},
)

# Normalise time column to 'time_car_min'
time_col = next((c for c in ("time_car_min", "time_min", "minutes", "drive_min", "t_min") if c in travel.columns), None)
if time_col is None:
    _fail("Travel CSV must include a minutes column (e.g., 'time_car_min' / 'time_min' / 'minutes').")

travel = travel.rename(columns={time_col: "time_car_min"})
_expect_columns(travel, ["origin_lsoa", "dest_lsoa", "time_car_min"], "Travel CSV")

# Type and NA clean
travel["origin_lsoa"] = travel["origin_lsoa"].astype("string")
travel["dest_lsoa"] = travel["dest_lsoa"].astype("string")
travel["time_car_min"] = pd.to_numeric(travel["time_car_min"], errors="coerce").astype("float32")
travel = travel.dropna(subset=["origin_lsoa", "dest_lsoa", "time_car_min"]).copy()

# Keep only rows within the LSOA universe (both origin and dest)
in_universe = travel["origin_lsoa"].isin(lsoa_index) & travel["dest_lsoa"].isin(lsoa_index)
dropped_outside = int((~in_universe).sum())
if dropped_outside:
    _warn(f"Dropping {dropped_outside:,} travel rows outside LSOA universe.")
travel = travel.loc[in_universe].copy()

# Inspect problematic times
is_diag = travel["origin_lsoa"] == travel["dest_lsoa"]
is_zero = travel["time_car_min"] == 0
is_neg = travel["time_car_min"] < 0
is_large = travel["time_car_min"] > MAX_SANITY_MIN
offdiag_zero = (~is_diag) & is_zero

n_rows0 = len(travel)
n_neg = int(is_neg.sum())
n_zero_diag = int((is_diag & is_zero).sum())
n_zero_offdiag = int(offdiag_zero.sum())
n_large = int(is_large.sum())

# Clean according to rules
if DROP_NEGATIVES and n_neg:
    travel = travel.loc[~is_neg].copy()
    _warn(f"Dropped {n_neg:,} rows with negative minutes.")

if n_zero_offdiag:
    # Set small positive floor for off-diagonal zeros
    travel.loc[offdiag_zero, "time_car_min"] = np.float32(FLOOR_OFFDIAG_ZERO_MIN)
    _warn(f"Floored {n_zero_offdiag:,} off-diagonal zero-minute rows to {FLOOR_OFFDIAG_ZERO_MIN} min.")

if not ALLOW_DIAGONAL_ZERO and n_zero_diag:
    travel.loc[is_diag & is_zero, "time_car_min"] = np.float32(FLOOR_OFFDIAG_ZERO_MIN)
    _warn(f"Replaced {n_zero_diag:,} diagonal zeros with {FLOOR_OFFDIAG_ZERO_MIN} min (ALLOW_DIAGONAL_ZERO=False).")

# Recompute quick stats post-clean
n_rows1 = len(travel)
_ok(
    f"Travel rows: {n_rows1:,} (was {n_rows0:,}); "
    f"origins={travel['origin_lsoa'].nunique():,}; dests={travel['dest_lsoa'].nunique():,}"
)
if n_large:
    _warn(f"{n_large:,} rows have very large minutes (>{MAX_SANITY_MIN}); keep but flagged.")

# Coverage against universe (warn-only)
orig_set = pd.Index(travel["origin_lsoa"].unique(), dtype="string")
dest_set = pd.Index(travel["dest_lsoa"].unique(), dtype="string")
miss_orig = lsoa_index.difference(orig_set)
miss_dest = lsoa_index.difference(dest_set)
if len(miss_orig):
    _warn(f"{len(miss_orig)} LSOAs absent as origins (e.g., {list(miss_orig[:5])})")
if len(miss_dest):
    _warn(f"{len(miss_dest)} LSOAs absent as destinations (e.g., {list(miss_dest[:5])})")

# ---------- 4) Sites (stations & acutes resolved to LSOA) ----------
def _load_site_lsoas(csv_path: Path, label: str) -> pd.Index:
    if not csv_path.exists():
        if label.lower().startswith("acute"):
            _warn("Acute CSV not found; conveyance leg will be optional.")
            return pd.Index([], dtype="string", name="lsoa_code")
        _fail(f"{label} CSV not found: {csv_path}")

    df = pd.read_csv(csv_path)
    df.columns = [c.strip().lower() for c in df.columns]
    code_col = next((c for c in ("lsoa_code", "lsoa21cd") if c in df.columns), None)
    if code_col is None:
        _fail(f"{label}: expected an LSOA code column ('lsoa_code' or 'lsoa21cd').")
    codes = pd.Index(df[code_col].astype("string"), name="lsoa_code").dropna().drop_duplicates()
    codes = codes[codes.isin(lsoa_index)]
    if codes.empty:
        _warn(f"{label}: no valid LSOAs after filtering to universe.")
    return codes

station_lsoas = _load_site_lsoas(STATIONS_CSV, "Ambulance stations")
acute_lsoas   = _load_site_lsoas(ACUTE_CSV,    "Acute hospitals")

_ok(f"Stations mapped to {len(station_lsoas):,} LSOAs; Acutes mapped to {len(acute_lsoas):,} LSOAs")

# Ensure coverage in travel table for sites we’ll need later
missing_station_as_origin = station_lsoas.difference(orig_set)
missing_station_as_dest   = station_lsoas.difference(dest_set)  # may be irrelevant for response
if len(missing_station_as_origin):
    _warn(f"{len(missing_station_as_origin)} station LSOAs not present as travel origins "
          f"(e.g., {list(missing_station_as_origin[:5])}) — check travel construction.")
missing_acute_as_dest = acute_lsoas.difference(dest_set)
if len(missing_acute_as_dest):
    _warn(f"{len(missing_acute_as_dest)} acute LSOAs not present as travel destinations "
          f"(e.g., {list(missing_acute_as_dest[:5])}) — check travel construction.")

# ---------- 5) Integer mappings for matrices ----------
lsoa_to_idx: Dict[str, int] = {code: i for i, code in enumerate(lsoa_index)}
idx_to_lsoa = lsoa_index.to_numpy()

station_idx = np.array([lsoa_to_idx[c] for c in station_lsoas], dtype=np.int32) if len(station_lsoas) else np.array([], dtype=np.int32)
acute_idx   = np.array([lsoa_to_idx[c] for c in acute_lsoas],   dtype=np.int32) if len(acute_lsoas)   else np.array([], dtype=np.int32)

# ---------- 6) Report diagonals & zero counts explicitly ----------
diag_zero_ct = int(((travel["origin_lsoa"] == travel["dest_lsoa"]) & (travel["time_car_min"] == 0)).sum())
offdiag_zero_ct = int(((travel["origin_lsoa"] != travel["dest_lsoa"]) & (travel["time_car_min"] == 0)).sum())
_ok(f"Diagonal 0-min rows kept: {diag_zero_ct:,} (ALLOW_DIAGONAL_ZERO={ALLOW_DIAGONAL_ZERO})")
if offdiag_zero_ct:
    _warn(f"Off-diagonal 0-min rows remaining after cleaning: {offdiag_zero_ct:,}")

# ---------- 7) Step 1 summary ----------
summary = pd.Series(
    {
        "n_lsoas": len(lsoa_index),
        "population_sum": int(population.sum()),
        "travel_rows": len(travel),
        "unique_origins": travel["origin_lsoa"].nunique(),
        "unique_dests": travel["dest_lsoa"].nunique(),
        "zeros_diag": diag_zero_ct,
        "zeros_offdiag": offdiag_zero_ct,
        "n_station_lsoas": len(station_lsoas),
        "n_acute_lsoas": len(acute_lsoas),
        "station_idx_dtype": str(station_idx.dtype),
        "acute_idx_dtype": str(acute_idx.dtype),
    }
)
print("\n== STEP 1 SUMMARY (cleaned) ==")
print(summary.to_string())

_ok("Step 1 complete — data aligned, cleaned, and inspected. Ready for Step 2 (sparse matrices).")


[OK] Universe ready: 336 LSOAs
[OK] Population loaded (sum=575,628; non-zero LSOAs=336); CRS=EPSG:27700
[WARN] Floored 41 off-diagonal zero-minute rows to 0.5 min.
[OK] Travel rows: 112,560 (was 112,560); origins=336; dests=336
[OK] Stations mapped to 14 LSOAs; Acutes mapped to 3 LSOAs
[OK] Diagonal 0-min rows kept: 0 (ALLOW_DIAGONAL_ZERO=True)

== STEP 1 SUMMARY (cleaned) ==
n_lsoas                 336
population_sum       575628
travel_rows          112560
unique_origins          336
unique_dests            336
zeros_diag                0
zeros_offdiag             0
n_station_lsoas          14
n_acute_lsoas             3
station_idx_dtype     int32
acute_idx_dtype       int32
[OK] Step 1 complete — data aligned, cleaned, and inspected. Ready for Step 2 (sparse matrices).


In [13]:
# Step 2 — Build & cache sparse matrices (R: station→LSOA, C: LSOA→acute)
# Produces CSR matrices + boolean masks and saves them to MATRICES_DIR.
# Also prints a quick baseline sanity check (nearest times & coverage bands).

from __future__ import annotations

from typing import Tuple, Sequence, Dict
from scipy import sparse

# Optional radius thinning (minutes). Set to None to keep all pairs.
MAX_RADIUS_MIN: float | None = None  # e.g., 60.0 to thin, or None to keep all


def _build_sparse_matrix(
    travel_df: pd.DataFrame,
    row_codes: pd.Index,            # demand LSOAs (rows)
    col_codes: pd.Index,            # site LSOAs for this leg (columns)
    row_key: str,                   # travel column holding row codes
    col_key: str,                   # travel column holding col codes
    value_key: str = "time_car_min",
    max_radius: float | None = MAX_RADIUS_MIN,
) -> Tuple[sparse.csr_matrix, sparse.csr_matrix]:
    """
    Reshape long travel table into a CSR matrix of minutes plus a boolean mask CSR
    with identical sparsity (1=has edge). Rows are row_codes, columns are col_codes.
    """
    if len(col_codes) == 0:
        # Empty column set → return empty (n_rows x 0) matrices
        shape = (len(row_codes), 0)
        return sparse.csr_matrix(shape, dtype=np.float32), sparse.csr_matrix(shape, dtype=np.uint8)

    # Fast membership filters
    idx_row = row_codes.astype("string")
    idx_col = col_codes.astype("string")
    m = travel_df[row_key].isin(idx_row) & travel_df[col_key].isin(idx_col)

    df = travel_df.loc[m, [row_key, col_key, value_key]].dropna(subset=[value_key]).copy()

    # Optional radius thinning
    if max_radius is not None:
        df = df.loc[df[value_key] <= float(max_radius)].copy()

    # Group duplicates to their minimum (defensive)
    df = (
        df.groupby([row_key, col_key], observed=True, sort=False)[value_key]
        .min()
        .reset_index()
    )

    # Build integer lookups for rows/cols local to this matrix
    row_lookup: Dict[str, int] = {code: i for i, code in enumerate(idx_row)}
    col_lookup: Dict[str, int] = {code: j for j, code in enumerate(idx_col)}

    rows = df[row_key].map(row_lookup).to_numpy(dtype=np.int32, na_value=-1)
    cols = df[col_key].map(col_lookup).to_numpy(dtype=np.int32, na_value=-1)
    vals = df[value_key].astype(np.float32).to_numpy()

    # Drop any pairs that mapped to -1 (should not happen given filters)
    good = (rows >= 0) & (cols >= 0) & np.isfinite(vals)
    rows, cols, vals = rows[good], cols[good], vals[good]

    shape = (len(row_codes), len(col_codes))
    coo = sparse.coo_matrix((vals, (rows, cols)), shape=shape, dtype=np.float32)
    mat = coo.tocsr()

    # Boolean mask with identical sparsity (1 where an edge exists)
    mask = sparse.csr_matrix((np.ones_like(mat.data, dtype=np.uint8), mat.indices, mat.indptr), shape=mat.shape)

    return mat, mask


def _rowwise_min_with_mask(
    mat: sparse.csr_matrix,
    mask: sparse.csr_matrix,
) -> np.ndarray:
    """
    Compute row-wise minima while treating 'no edge' as +inf (instead of 0).
    Returns a float32 dense vector of shape (n_rows,).
    """
    if mat.shape[1] == 0:
        return np.full(mat.shape[0], np.inf, dtype=np.float32)
    arr = mat.toarray()
    arr_mask = mask.toarray().astype(bool)
    arr[~arr_mask] = np.inf
    mins = arr.min(axis=1).astype(np.float32)
    return mins


def _pct_covered(times_min: np.ndarray, threshold_min: float, weights: np.ndarray) -> float:
    """
    Weighted percent of population with times <= threshold.
    """
    w = weights.astype(np.float64)
    covered = (times_min <= threshold_min)
    covered_pop = (w * covered).sum()
    total_pop = w.sum()
    return float(covered_pop / total_pop * 100.0) if total_pop > 0 else 0.0


# ---- Build R (station → demand LSOA) ----
R, R_mask = _build_sparse_matrix(
    travel_df=travel,
    row_codes=lsoa_index,          # rows: demand LSOAs
    col_codes=station_lsoas,       # cols: station LSOAs
    row_key="dest_lsoa",           # time is origin→dest; we want station(origin)→demand(dest)
    col_key="origin_lsoa",
    value_key="time_car_min",
    max_radius=MAX_RADIUS_MIN,
)

# ---- Build C (demand LSOA → acute) ----
C, C_mask = _build_sparse_matrix(
    travel_df=travel,
    row_codes=lsoa_index,          # rows: demand LSOAs
    col_codes=acute_lsoas,         # cols: acute LSOAs
    row_key="origin_lsoa",         # time is origin→dest; we want demand(origin)→acute(dest)
    col_key="dest_lsoa",
    value_key="time_car_min",
    max_radius=MAX_RADIUS_MIN,
)

# ---- Basic integrity & density report ----
def _report_matrix(name: str, mat: sparse.csr_matrix):
    nnz = mat.nnz
    m, n = mat.shape
    density = nnz / (m * n) if (m > 0 and n > 0) else 0.0
    print(f"[OK] {name}: shape={mat.shape}, nnz={nnz:,}, density={density:.4f}")

_report_matrix("R (station→LSOA)", R)
_report_matrix("C (LSOA→acute)", C)

# ---- Cache matrices & metadata ----
sparse.save_npz(MATRICES_DIR / "R_response_csr.npz", R)
sparse.save_npz(MATRICES_DIR / "R_response_mask_csr.npz", R_mask)
sparse.save_npz(MATRICES_DIR / "C_convey_csr.npz", C)
sparse.save_npz(MATRICES_DIR / "C_convey_mask_csr.npz", C_mask)

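# The label arrays are object-dtype strings, so np.savez pickles them;
# reload with np.load(path, allow_pickle=True).
np.savez(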
    MATRICES_DIR / "matrix_metadata.npz",
    lsoa_index=lsoa_index.to_numpy(),
    station_lsoas=station_lsoas.to_numpy(),
    acute_lsoas=acute_lsoas.to_numpy(),
    response_thresholds=np.array(RESPONSE_THRESHOLDS, dtype=np.int32),
    scene_to_ae_thresholds=np.array(SCENE_TO_AE_THRESHOLDS, dtype=np.int32),
    blue_light_factors=np.array([BLUE_LIGHT_FACTOR_RESPONSE, BLUE_LIGHT_FACTOR_CONVEY], dtype=np.float32),
    max_radius=np.array([np.nan if MAX_RADIUS_MIN is None else float(MAX_RADIUS_MIN)], dtype=np.float32),
)
print(f"[OK] Cached matrices to: {MATRICES_DIR}")

# ---- Quick baseline sanity: nearest times & coverage (all stations / all acutes) ----
t_resp_base = _rowwise_min_with_mask(R, R_mask) * np.float32(BLUE_LIGHT_FACTOR_RESPONSE)
t_conv_base = _rowwise_min_with_mask(C, C_mask) * np.float32(BLUE_LIGHT_FACTOR_CONVEY)

# Stats
def _summ(name: str, arr: np.ndarray) -> str:
    finite = np.isfinite(arr)
    if not finite.any():
        return f"{name}: all inf"
    vals = arr[finite]
    q = np.percentile(vals, [0, 25, 50, 90, 95, 100]).round(2)
    return f"{name}: min={q[0]}, p25={q[1]}, median={q[2]}, p90={q[3]}, p95={q[4]}, max={q[5]} (n={finite.sum()})"

print(_summ("t_resp (min to any station)", t_resp_base))
print(_summ("t_conv (min to any acute)",   t_conv_base))

# Weighted coverage (overall)
pop_w = population.reindex(lsoa_index).to_numpy(dtype=np.float64)

def _coverage_report(times: np.ndarray, thresholds: Sequence[int], label: str):
    parts = []
    for thr in thresholds:
        pct = _pct_covered(times, thr, pop_w)
        parts.append(f"≤{thr} min: {pct:5.1f}%")
    print(f"{label}: " + " | ".join(parts))

# Response coverage at ARP-like thresholds (7, 15, 18, 40) for visibility
_coverage_report(t_resp_base, RESPONSE_THRESHOLDS, "Response coverage (all stations)")

# Conveyance coverage at scene→A&E bands (30, 45, 60)
if C.shape[1] > 0:
    _coverage_report(t_conv_base, SCENE_TO_AE_THRESHOLDS, "Conveyance coverage (all acutes)")
else:
    print("[WARN] No acute columns; conveyance coverage skipped.")

print("[OK] Step 2 complete — matrices built, cached, and baseline sanity computed.")


[OK] R (station→LSOA): shape=(336, 14), nnz=4,690, density=0.9970
[OK] C (LSOA→acute): shape=(336, 3), nnz=1,005, density=0.9970
[OK] Cached matrices to: /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/matrices
t_resp (min to any station): min=0.5, p25=5.7, median=11.22, p90=24.64, p95=28.76, max=92.22 (n=336)
t_conv (min to any acute): min=0.5, p25=16.95, median=29.28, p90=74.47, p95=80.82, max=129.12 (n=336)
Response coverage (all stations): ≤7 min:  30.8% | ≤15 min:  62.3% | ≤18 min:  75.2% | ≤40 min:  99.3%
Conveyance coverage (all acutes): ≤30 min:  51.8% | ≤45 min:  69.4% | ≤60 min:  79.0%
[OK] Step 2 complete — matrices built, cached, and baseline sanity computed.
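
The `_rowwise_min_with_mask` helper in Step 2 converts the whole matrix to a dense array, which is cheap at 336 × 14 but grows with rows × columns. A possible alternative, sketched here under the assumption that the matrix is in canonical CSR form, reduces directly over the stored entries using the `data` and `indptr` arrays that a SciPy CSR matrix exposes. The function name `rowwise_min_stored` and the toy arrays are illustrative, not part of the notebook.

```python
import numpy as np

def rowwise_min_stored(data: np.ndarray, indptr: np.ndarray) -> np.ndarray:
    """Row-wise minimum over stored CSR entries; rows with no entries get +inf."""
    n_rows = len(indptr) - 1
    out = np.full(n_rows, np.inf, dtype=np.float64)
    has_entries = np.diff(indptr) > 0
    if has_entries.any():
        # Segment starts for non-empty rows only; each segment ends where the
        # next non-empty row begins, so empty rows never contribute.
        starts = indptr[:-1][has_entries]
        out[has_entries] = np.minimum.reduceat(data, starts)
    return out

# Toy CSR arrays for a 3 x 2 matrix: row 0 stores [0.0, 7.5] (a true zero),
# row 1 stores nothing, row 2 stores [3.0].
data = np.array([0.0, 7.5, 3.0], dtype=np.float32)
indptr = np.array([0, 2, 2, 3], dtype=np.int32)

mins = rowwise_min_stored(data, indptr)
print(mins)
```

With a SciPy matrix `mat`, this would be called as `rowwise_min_stored(mat.data, mat.indptr)`. A true zero survives because it is a stored entry, so no separate mask is needed for the reduction itself.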


In [14]:
# Step 3 — Vectorised helpers, baseline KPIs, and exports (no per-LSOA loops)

from __future__ import annotations
from typing import Sequence, Dict, Tuple
from scipy import sparse

# --- Optional: set a visible on-scene buffer for end-to-end diagnostic (minutes)
ON_SCENE_BUFFER_MIN: float = 0.0  # set >0 later if you want t_total

# Column lookups (matrix columns are ordered by these indices)
station_col_lookup: Dict[str, int] = {code: j for j, code in enumerate(station_lsoas)}
acute_col_lookup: Dict[str, int]   = {code: j for j, code in enumerate(acute_lsoas)}

def rowwise_min_with_mask(mat: sparse.csr_matrix, mask: sparse.csr_matrix) -> np.ndarray:
    """Row-wise minima treating 'no edge' as +inf (not 0)."""
    if mat.shape[1] == 0:
        return np.full(mat.shape[0], np.inf, dtype=np.float32)
    arr = mat.toarray()
    msk = mask.toarray().astype(bool)
    arr[~msk] = np.inf
    return arr.min(axis=1).astype(np.float32)

def min_times_response(active_station_cols: np.ndarray | list[int]) -> pd.Series:
    """Nearest response time per LSOA (minutes), blue-light factor applied after min."""
    if R.shape[1] == 0 or len(active_station_cols) == 0:
        mins = np.full(R.shape[0], np.inf, dtype=np.float32)
    else:
        sub = R[:, active_station_cols]
        sub_mask = R_mask[:, active_station_cols]
        mins = rowwise_min_with_mask(sub, sub_mask)
    mins = mins * np.float32(BLUE_LIGHT_FACTOR_RESPONSE)
    return pd.Series(mins, index=lsoa_index, name="t_resp_min")

def min_times_convey(active_acute_cols: np.ndarray | list[int]) -> pd.Series:
    """Nearest conveyance time per LSOA (minutes), blue-light factor applied after min."""
    if C.shape[1] == 0 or len(active_acute_cols) == 0:
        mins = np.full(C.shape[0], np.inf, dtype=np.float32)
    else:
        sub = C[:, active_acute_cols]
        sub_mask = C_mask[:, active_acute_cols]
        mins = rowwise_min_with_mask(sub, sub_mask)
    mins = mins * np.float32(BLUE_LIGHT_FACTOR_CONVEY)
    return pd.Series(mins, index=lsoa_index, name="t_conv_min")

def coverage_table(times: pd.Series, thresholds: Sequence[int], weights: pd.Series) -> pd.DataFrame:
    """Population-weighted coverage for multiple thresholds."""
    w = weights.reindex(times.index).astype("float64")
    total = float(w.sum())
    out = []
    a = times.to_numpy()
    for thr in thresholds:
        covered = (a <= thr)
        pct = float((w.to_numpy() * covered).sum() / total * 100.0) if total > 0 else 0.0
        out.append({"threshold_min": int(thr), "pct_population": round(pct, 2)})
    return pd.DataFrame(out)

def label_columns_for_export(df: pd.DataFrame) -> pd.DataFrame:
    """Make boolean coverage columns tidy (0/1 uint8) and keep friendly order."""
    bool_cols = [c for c in df.columns if c.startswith(("resp_le_", "conv_le_"))]
    for c in bool_cols:
        df[c] = df[c].astype("uint8")
    order = ["t_resp_min", "t_conv_min", "t_total_min"] + bool_cols
    return df[[c for c in order if c in df.columns]]

# --- Baseline: use all stations and all acutes ---
baseline_station_cols = np.arange(R.shape[1], dtype=np.int32)
baseline_acute_cols   = np.arange(C.shape[1], dtype=np.int32)

t_resp = min_times_response(baseline_station_cols)
t_conv = min_times_convey(baseline_acute_cols)

# Optional end-to-end diagnostic
t_total = (t_resp.to_numpy() + ON_SCENE_BUFFER_MIN + t_conv.to_numpy()).astype(np.float32)
t_total = pd.Series(t_total, index=lsoa_index, name="t_total_min")

# --- Binary coverage flags for mapping/summary ---
out_df = pd.DataFrame(index=lsoa_index)
out_df["t_resp_min"]  = t_resp
out_df["t_conv_min"]  = t_conv
out_df["t_total_min"] = t_total

for thr in RESPONSE_THRESHOLDS:
    out_df[f"resp_le_{thr}"] = (out_df["t_resp_min"] <= thr)
for thr in SCENE_TO_AE_THRESHOLDS:
    out_df[f"conv_le_{thr}"] = (out_df["t_conv_min"] <= thr)

out_df = label_columns_for_export(out_df)

# --- KPI tables (overall coverage) ---
resp_kpis = coverage_table(t_resp, RESPONSE_THRESHOLDS, population)
conv_kpis = coverage_table(t_conv, SCENE_TO_AE_THRESHOLDS, population)

print("\n== Baseline KPIs ==")
print("Response (ARP-like):")
print(resp_kpis.to_string(index=False))
print("\nConveyance (scene→A&E bands):")
print(conv_kpis.to_string(index=False))

# --- Exports (tables/) ---
times_path = TABLES_DIR / "times_baseline.csv"
kpi_resp_path = TABLES_DIR / "coverage_response_baseline.csv"
kpi_conv_path = TABLES_DIR / "coverage_conveyance_baseline.csv"
by_lsoa_path = TABLES_DIR / "coverage_by_lsoa_baseline.csv"

t_export = out_df.copy()
t_export.insert(0, "lsoa_code", t_export.index)
t_export.to_csv(times_path, index=False)
resp_kpis.to_csv(kpi_resp_path, index=False)
conv_kpis.to_csv(kpi_conv_path, index=False)

# Coverage by LSOA (include population)
by_lsoa = out_df.copy()
by_lsoa.insert(0, "lsoa_code", by_lsoa.index)
by_lsoa["population"] = population.reindex(by_lsoa.index).astype(int).to_numpy()
by_lsoa.to_csv(by_lsoa_path, index=False)

print("\n[OK] Exported:")
print(" -", times_path)
print(" -", kpi_resp_path)
print(" -", kpi_conv_path)
print(" -", by_lsoa_path)
print("[OK] Step 3 complete — vectorised helpers run, KPIs exported, and per-LSOA flags ready for mapping.")



== Baseline KPIs ==
Response (ARP-like):
 threshold_min  pct_population
             7           30.77
            15           62.33
            18           75.15
            40           99.26

Conveyance (scene→A&E bands):
 threshold_min  pct_population
            30           51.77
            45           69.35
            60           79.03

[OK] Exported:
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/tables/times_baseline.csv
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/tables/coverage_response_baseline.csv
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/tables/coverage_conveyance_baseline.csv
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/tables/coverage_by_lsoa_

In [15]:
# Step 4 — Scenario scaffold (instant what-ifs by column subset; add stations later)

from __future__ import annotations

from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    name: str
    station_cols: np.ndarray  # indices into R's columns (0..R.shape[1]-1)
    acute_cols:   np.ndarray  # indices into C's columns (0..C.shape[1]-1)

def run_scenario(scn: Scenario) -> Dict[str, pd.DataFrame]:
    """Compute times and KPI tables for a scenario."""
    t_r = min_times_response(scn.station_cols)
    t_c = min_times_convey(scn.acute_cols)
    out = {
        "times": pd.DataFrame(
            {"lsoa_code": t_r.index, "t_resp_min": t_r.values, "t_conv_min": t_c.values}
        )
    }
    out["resp_kpis"] = coverage_table(t_r, RESPONSE_THRESHOLDS, population)
    out["conv_kpis"] = coverage_table(t_c, SCENE_TO_AE_THRESHOLDS, population)
    return out

# Baseline scenario = use all available columns as in Step 3
SCENARIOS = [
    Scenario("baseline", station_cols=baseline_station_cols, acute_cols=baseline_acute_cols),
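    # Example “add one station” by code (uncomment and change the code to a station LSOA present in R).
    # Note: baseline already uses every column of R, so this changes nothing until R is rebuilt
    # with candidate-site columns; to model a closure instead, drop a column with np.setdiff1d.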
    # Scenario(
    #     "add_station_E01XXXXXX",
    #     station_cols=np.sort(np.unique(np.r_[baseline_station_cols, station_col_lookup["E01XXXXXX"]])).astype(np.int32),
    #     acute_cols=baseline_acute_cols,
    # ),
]

# Run and export all scenarios
rows = []
for scn in SCENARIOS:
    res = run_scenario(scn)
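    # Save per-scenario times. Note: for "baseline" this overwrites Step 3's times_baseline.csv
    # with a narrower table (no t_total_min or coverage-flag columns).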
    times_path = TABLES_DIR / f"times_{scn.name}.csv"
    res["times"].to_csv(times_path, index=False)
    # Record KPIs in a flat table
    for _, r in res["resp_kpis"].iterrows():
        rows.append({"scenario": scn.name, "leg": "response", "threshold_min": int(r["threshold_min"]), "pct_population": float(r["pct_population"])})
    for _, r in res["conv_kpis"].iterrows():
        rows.append({"scenario": scn.name, "leg": "conveyance", "threshold_min": int(r["threshold_min"]), "pct_population": float(r["pct_population"])})

scen_kpis = pd.DataFrame(rows).sort_values(["scenario", "leg", "threshold_min"])
scen_path = TABLES_DIR / "scenario_kpis.csv"
scen_kpis.to_csv(scen_path, index=False)

print("\n[OK] Scenario exports written:")
for scn in SCENARIOS:
    print(" -", TABLES_DIR / f"times_{scn.name}.csv")
print(" -", scen_path)
print("[OK] Step 4 complete — scenario scaffold in place. Add station codes to test what-ifs instantly.")



[OK] Scenario exports written:
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/tables/times_baseline.csv
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/tables/scenario_kpis.csv
[OK] Step 4 complete — scenario scaffold in place. Add station codes to test what-ifs instantly.
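
The scenario mechanism rests on one operation: take a column subset of a sparse matrix and reduce each row to its minimum. `min_times_response` and `min_times_convey` are defined in an earlier step that is not reproduced in this section, so the following is only a minimal standalone sketch of the idea, not the notebook's implementation. It assumes a SciPy CSR matrix in which missing LSOA→site pairs are simply not stored, and it returns `inf` for rows with no stored entry in the selected columns. The function name `rowwise_min_stored` and the toy matrix are hypothetical.

```python
import numpy as np
from scipy import sparse


def rowwise_min_stored(M: sparse.csr_matrix, cols: np.ndarray) -> np.ndarray:
    """Row-wise minimum over the stored entries of M[:, cols].

    Rows with no stored entry in the selected columns get +inf, so a missing
    pair is never mistaken for a zero-minute journey.
    """
    sub = M[:, cols].tocsr()
    out = np.full(sub.shape[0], np.inf, dtype=np.float64)
    has_entry = np.diff(sub.indptr) > 0           # rows with at least one stored value
    if sub.nnz:
        starts = sub.indptr[:-1][has_entry]       # first data offset of each non-empty row
        # reduceat takes the minimum of each slice data[starts[i]:starts[i+1]];
        # empty rows contribute no data, so each slice is exactly one row.
        out[has_entry] = np.minimum.reduceat(sub.data, starts)
    return out


# Toy response matrix: 3 demand LSOAs (rows) x 3 station LSOAs (columns), minutes.
data = np.array([5.0, 3.0, 8.0, 2.0, 9.0])
row = np.array([0, 0, 1, 2, 2])
col = np.array([0, 1, 2, 0, 2])
R = sparse.csr_matrix((data, (row, col)), shape=(3, 3))

baseline = rowwise_min_stored(R, np.array([0]))        # only station 0 active
what_if = rowwise_min_stored(R, np.array([0, 1, 2]))   # all three stations active
print(baseline, what_if)
```

Because a scenario is just a different `cols` array, adding stations can only lower (or leave unchanged) each row's minimum, which is the monotonicity property that Step 5 checks.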


In [17]:
# STEP 5 — Scenario diffs, exports, and QA (robust to no scenarios)

from __future__ import annotations
from typing import Dict, Sequence

# --- Helper: build a scenario by adding station LSOA codes (validates codes) ---
def make_add_station_scenario(name: str, add_station_codes: Sequence[str]) -> Scenario:
    missing = [c for c in add_station_codes if c not in station_col_lookup]
    if missing:
        raise ValueError(f"Unknown station LSOA codes: {missing}")
    add_cols = np.array([station_col_lookup[c] for c in add_station_codes], dtype=np.int32)
    new_cols = np.sort(np.unique(np.r_[baseline_station_cols, add_cols])).astype(np.int32)
    return Scenario(name=name, station_cols=new_cols, acute_cols=baseline_acute_cols)

# Example usage (uncomment and replace with real code):
# SCENARIOS.append(make_add_station_scenario("add_station_E01XXXXXX", ["E01XXXXXX"]))

def _scenario_times(station_cols: np.ndarray, acute_cols: np.ndarray) -> pd.DataFrame:
    """Return per-LSOA times for a scenario (response, convey, total)."""
    t_r = min_times_response(station_cols)
    t_c = min_times_convey(acute_cols)
    t_total = (t_r.to_numpy() + ON_SCENE_BUFFER_MIN + t_c.to_numpy()).astype(np.float32)
    return pd.DataFrame(
        {"lsoa_code": lsoa_index, "t_resp_min": t_r.values, "t_conv_min": t_c.values, "t_total_min": t_total}
    )

def _coverage_kpis(times_df: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    """KPI tables for response and conveyance."""
    resp_kpis = coverage_table(times_df.set_index("lsoa_code")["t_resp_min"], RESPONSE_THRESHOLDS, population)
    conv_kpis = coverage_table(times_df.set_index("lsoa_code")["t_conv_min"], SCENE_TO_AE_THRESHOLDS, population)
    return resp_kpis, conv_kpis

# Baseline artefacts (compute once)
baseline_times = _scenario_times(baseline_station_cols, baseline_acute_cols)
base_resp_kpi, base_conv_kpi = _coverage_kpis(baseline_times)

# Filter to non-baseline scenarios
scenarios_to_run = [s for s in SCENARIOS if s.name != "baseline"]

rows_summary: list[Dict] = []
if len(scenarios_to_run) == 0:
    # Write an empty summary with headers so downstream steps don't break
    summary_df = pd.DataFrame(
        columns=["scenario", "w_mean_resp_minutes_saved", "w_mean_total_minutes_saved", "n_worsened_resp"]
    )
    summary_path = TABLES_DIR / "scenario_delta_summary.csv"
    summary_df.to_csv(summary_path, index=False)
    print("[WARN] No non-baseline scenarios defined. Wrote an empty scenario_delta_summary.csv.")
else:
    for scn in scenarios_to_run:
        scn_times = _scenario_times(scn.station_cols, scn.acute_cols)

        # --- LSOA-level deltas (positive = minutes saved) ---
        merged = baseline_times.merge(scn_times, on="lsoa_code", suffixes=("_base", "_scn"))
        merged["d_resp_min"]  = merged["t_resp_min_base"]  - merged["t_resp_min_scn"]
        merged["d_conv_min"]  = merged["t_conv_min_base"]  - merged["t_conv_min_scn"]
        merged["d_total_min"] = merged["t_total_min_base"] - merged["t_total_min_scn"]

        # Monotonicity QA — response shouldn't worsen when adding stations
        worsened = int((merged["d_resp_min"] < -1e-6).sum())
        if worsened:
            print(f"[WARN] {scn.name}: {worsened} LSOAs have worse response times than baseline.")

        # Top improvements (absolute minutes saved) — export a ranked view
        topN = (
            merged[["lsoa_code", "d_resp_min", "d_conv_min", "d_total_min"]]
            .sort_values("d_resp_min", ascending=False)
            .head(25)
            .copy()
        )

        # KPI deltas
        scn_resp_kpi, scn_conv_kpi = _coverage_kpis(scn_times)

        resp_delta = scn_resp_kpi.merge(base_resp_kpi, on="threshold_min", suffixes=("_scn", "_base"))
        resp_delta["delta_pp"] = resp_delta["pct_population_scn"] - resp_delta["pct_population_base"]

        conv_delta = scn_conv_kpi.merge(base_conv_kpi, on="threshold_min", suffixes=("_scn", "_base"))
        conv_delta["delta_pp"] = conv_delta["pct_population_scn"] - conv_delta["pct_population_base"]

        # Population-weighted average minutes saved
        pop = population.reindex(merged["lsoa_code"]).to_numpy(dtype=float)
        pop = np.nan_to_num(pop, nan=0.0)  # LSOAs with no population figure get zero weight
        tot_pop = pop.sum() if pop.sum() > 0 else 1.0
        w_mean_resp_save  = float((pop * merged["d_resp_min"].to_numpy()).sum() / tot_pop)
        w_mean_total_save = float((pop * merged["d_total_min"].to_numpy()).sum() / tot_pop)

        rows_summary.append(
            {
                "scenario": scn.name,
                "w_mean_resp_minutes_saved": round(w_mean_resp_save, 3),
                "w_mean_total_minutes_saved": round(w_mean_total_save, 3),
                "n_worsened_resp": worsened,
            }
        )

        # --- Exports per scenario ---
        per_lsoa_path = TABLES_DIR / f"delta_by_lsoa_{scn.name}.csv"
        topN_path     = TABLES_DIR / f"top25_improvements_{scn.name}.csv"
        resp_kpi_path = TABLES_DIR / f"delta_kpi_response_{scn.name}.csv"
        conv_kpi_path = TABLES_DIR / f"delta_kpi_convey_{scn.name}.csv"

        merged.to_csv(per_lsoa_path, index=False)
        topN.to_csv(topN_path, index=False)
        resp_delta.to_csv(resp_kpi_path, index=False)
        conv_delta.to_csv(conv_kpi_path, index=False)

        print(f"[OK] {scn.name}: wrote")
        print("   -", per_lsoa_path.name)
        print("   -", topN_path.name)
        print("   -", resp_kpi_path.name)
        print("   -", conv_kpi_path.name)

    # Consolidated scenario summary
    summary_df = pd.DataFrame(rows_summary).sort_values("scenario")
    summary_path = TABLES_DIR / "scenario_delta_summary.csv"
    summary_df.to_csv(summary_path, index=False)
    print("[OK] Scenario delta summary →", summary_path)

print("[OK] Step 5 complete — per-scenario deltas, KPI shifts, and QA checks exported.")


[WARN] No non-baseline scenarios defined. Wrote an empty scenario_delta_summary.csv.
[OK] Step 5 complete — per-scenario deltas, KPI shifts, and QA checks exported.
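
Steps 4 and 5 both call `coverage_table`, which is defined earlier in the notebook and is not shown in this section. From the way its result is consumed, it returns one row per threshold with `threshold_min` and `pct_population` columns. A hedged sketch of a function with that shape follows. The name `coverage_table_sketch`, the 0 to 100 percentage scale, and the treatment of missing or infinite times as uncovered are assumptions, not the author's code.

```python
import numpy as np
import pandas as pd


def coverage_table_sketch(times: pd.Series, thresholds, population: pd.Series) -> pd.DataFrame:
    """Share of population whose time is within each threshold (illustrative only)."""
    # Align population to the LSOAs in `times`; unknown populations carry no weight.
    pop = population.reindex(times.index).fillna(0.0).astype(float)
    total = pop.sum()
    rows = []
    for thr in thresholds:
        # NaN and +inf both compare False against the threshold, so they count as uncovered.
        covered = pop[times <= thr].sum()
        pct = 100.0 * covered / total if total > 0 else float("nan")
        rows.append({"threshold_min": int(thr), "pct_population": pct})
    return pd.DataFrame(rows)


# Toy data: three LSOAs, one of them unreachable.
times = pd.Series([5.0, 10.0, np.inf], index=["A", "B", "C"])
population = pd.Series([100, 300, 600], index=["A", "B", "C"])
kpis = coverage_table_sketch(times, (7, 15), population)
print(kpis)
```

Under these assumptions, subtracting two such tables on `threshold_min` gives a difference in percentage points, which matches the `delta_pp` column that Step 5 exports.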

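
Step 5 summarises each scenario as a population-weighted mean of minutes saved. One edge case is worth spelling out: if unreachable LSOAs are encoded as `inf` (an assumption about the upstream steps, which are not shown here), then `inf - inf` yields NaN and a single such LSOA turns the whole weighted mean into NaN. The sketch below is one possible way to guard against that by restricting the average to LSOAs with a finite delta. Note that this changes the denominator to the population of those LSOAs only, which differs from the total-population denominator used in the cell above. The function name and toy values are hypothetical.

```python
import numpy as np
import pandas as pd


def weighted_mean_minutes_saved(base: pd.Series, scn: pd.Series, pop: pd.Series) -> float:
    """Population-weighted mean of (base - scn), ignoring non-finite deltas."""
    delta = base - scn                              # positive = minutes saved
    weights = pop.reindex(delta.index).astype(float)
    ok = np.isfinite(delta) & np.isfinite(weights)  # drop NaN/inf deltas and missing weights
    denom = weights[ok].sum()
    if denom <= 0:
        return float("nan")
    return float((delta[ok] * weights[ok]).sum() / denom)


idx = ["A", "B", "C", "D"]
base = pd.Series([10.0, 20.0, 30.0, np.inf], index=idx)   # D is unreachable in both runs
scn = pd.Series([8.0, 20.0, 25.0, np.inf], index=idx)
pop = pd.Series([100, 200, 700, 50], index=idx)

saved = weighted_mean_minutes_saved(base, scn, pop)
print(round(saved, 3))
```

Whether unreachable LSOAs should be excluded, capped, or reported separately is a modelling choice; the point of the sketch is only that the choice should be explicit rather than left to NaN propagation.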

In [18]:
# STEP 6 — Maps (binary coverage choropleths + continuous time layers)

from __future__ import annotations
import contextlib

# Requirements from previous steps:
# - lsoa_g  : GeoDataFrame aligned to lsoa_index (EPSG:27700)
# - out_df  : DataFrame with columns t_resp_min, t_conv_min, (optional) t_total_min,
#             and boolean flags resp_le_{thr}, conv_le_{thr}
# - station_lsoas, acute_lsoas : Index of LSOA codes
# - MAPS_DIR : output directory (Path)
# - RESPONSE_THRESHOLDS, SCENE_TO_AE_THRESHOLDS

# ---------- prepare mapping frame ----------
# Join mapping geometry with times/flags
gmap = lsoa_g.join(out_df, how="left")

# Derive station/acute points by taking LSOA centroids (no site coordinates assumed)
# (EPSG:27700 is projected, so centroid is safe)
centroids = gmap.geometry.centroid  # noqa: SHP.W001 (projected CRS)
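# Pass the geometry as a plain array: a GeoSeries indexed by LSOA code would be aligned
# against the new frame's default RangeIndex and every point would come out missing.
station_pts = gpd.GeoDataFrame(
    {"lsoa_code": station_lsoas}, geometry=centroids.reindex(station_lsoas).values, crs=gmap.crs
)
acute_pts = gpd.GeoDataFrame(
    {"lsoa_code": acute_lsoas}, geometry=centroids.reindex(acute_lsoas).values, crs=gmap.crs
)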

# ---------- styling helpers ----------
COVERED_COLOUR = "#2ca25f"
UNCOVERED_COLOUR = "#de2d26"
BORDER_COLOUR = "#ffffff"
BG_COLOUR = "#f7f7f7"
PTS_STATION_COLOUR = "#1f78b4"
PTS_ACUTE_COLOUR = "#6a3d9a"

# Legend handles are imported here so this cell does not rely on earlier imports
from matplotlib.lines import Line2D
from matplotlib.patches import Patch

def _legend_binary(ax, covered_label: str = "Covered", uncovered_label: str = "Not covered"):
    """Add a legend with covered / not covered swatches and station / acute markers."""
    handles = [
        Patch(facecolor=COVERED_COLOUR, edgecolor=BORDER_COLOUR, label=covered_label),
        Patch(facecolor=UNCOVERED_COLOUR, edgecolor=BORDER_COLOUR, label=uncovered_label),
        Line2D([0], [0], marker="o", color="w", markerfacecolor=PTS_STATION_COLOUR, markeredgecolor="none",
               markersize=8, label="Station (centroid)"),
        Line2D([0], [0], marker="o", color="w", markerfacecolor=PTS_ACUTE_COLOUR, markeredgecolor="none",
               markersize=8, label="Acute (centroid)"),
    ]
    ax.legend(handles=handles, loc="lower left", frameon=True, framealpha=0.9)

def _plot_binary(layer_col: str, title: str, outfile: Path):
    """Draw a covered / not covered choropleth for one boolean flag column and save it."""
    fig, ax = plt.subplots(figsize=(8.5, 9), dpi=150, facecolor="white")
    ax.set_facecolor(BG_COLOUR)

    # Fallback if column missing: save a placeholder figure naming the absent column
    if layer_col not in gmap.columns:
        ax.text(0.5, 0.5, f"Column '{layer_col}' not found.", ha="center", va="center", transform=ax.transAxes)
        fig.savefig(outfile, bbox_inches="tight")
        plt.close(fig)
        return

    # Missing flags (NaN after the left join) count as not covered
    covered = gmap[gmap[layer_col] == True]   # noqa: E712
    not_covered = gmap[gmap[layer_col] != True]

    # Draw polygons (uncovered first so covered areas sit on top)
    with contextlib.suppress(Exception):
        not_covered.plot(ax=ax, color=UNCOVERED_COLOUR, edgecolor=BORDER_COLOUR, linewidth=0.2)
        covered.plot(ax=ax, color=COVERED_COLOUR, edgecolor=BORDER_COLOUR, linewidth=0.2)

    # Overlays
    if not station_pts.empty:
        station_pts.plot(ax=ax, markersize=10, color=PTS_STATION_COLOUR, alpha=0.9)
    if not acute_pts.empty:
        acute_pts.plot(ax=ax, markersize=10, color=PTS_ACUTE_COLOUR, alpha=0.9)

    # Legend: coverage swatches plus station and acute markers
    _legend_binary(ax)

    ax.set_title(title, fontsize=13, pad=10)
    ax.set_axis_off()
    fig.tight_layout()
    fig.savefig(outfile, bbox_inches="tight")
    plt.close(fig)

def _plot_continuous(value_col: str, title: str, outfile: Path, vmin: float | None = None, vmax: float | None = None):
    fig, ax = plt.subplots(figsize=(8.5, 9), dpi=150, facecolor="white")
    ax.set_facecolor(BG_COLOUR)

    if value_col not in gmap.columns:
        ax.text(0.5, 0.5, f"Column '{value_col}' not found.", ha="center", va="center", transform=ax.transAxes)
        plt.savefig(outfile, bbox_inches="tight")
        plt.close(fig)
        return

    # Clip to finite range; auto vmin/vmax if not provided
    data = gmap[value_col].replace([np.inf, -np.inf], np.nan)
    if vmin is None:
        vmin = float(np.nanpercentile(data, 2)) if np.isfinite(data).any() else 0.0
    if vmax is None:
        vmax = float(np.nanpercentile(data, 98)) if np.isfinite(data).any() else 1.0
    vmin, vmax = (min(vmin, vmax), max(vmin, vmax))

    with contextlib.suppress(Exception):
        gmap.plot(
            column=value_col, ax=ax, cmap="viridis", vmin=vmin, vmax=vmax,
            edgecolor=BORDER_COLOUR, linewidth=0.2, legend=True,
            legend_kwds={"label": "Minutes", "shrink": 0.6},
        )

    if not station_pts.empty:
        station_pts.plot(ax=ax, markersize=10, color=PTS_STATION_COLOUR, alpha=0.9)
    if not acute_pts.empty:
        acute_pts.plot(ax=ax, markersize=10, color=PTS_ACUTE_COLOUR, alpha=0.9)

    ax.set_title(title, fontsize=13, pad=10)
    ax.set_axis_off()
    plt.tight_layout()
    plt.savefig(outfile, bbox_inches="tight")
    plt.close(fig)

# ---------- generate & save maps ----------
written = []

# Binary coverage — response thresholds
for thr in RESPONSE_THRESHOLDS:
    col = f"resp_le_{thr}"
    title = f"Response coverage ≤{thr} min (ARP-aligned)"
    out = MAPS_DIR / f"map_response_le_{thr}min.png"
    _plot_binary(col, title, out)
    written.append(out)

# Binary coverage — conveyance thresholds
for thr in SCENE_TO_AE_THRESHOLDS:
    col = f"conv_le_{thr}"
    title = f"Conveyance coverage ≤{thr} min (scene→A&E)"
    out = MAPS_DIR / f"map_conveyance_le_{thr}min.png"
    _plot_binary(col, title, out)
    written.append(out)

# Continuous time surfaces
_plot_continuous("t_resp_min",  "Nearest response time (min)", MAPS_DIR / "map_t_resp_min.png")
written.append(MAPS_DIR / "map_t_resp_min.png")

_plot_continuous("t_conv_min",  "Nearest conveyance time (min)", MAPS_DIR / "map_t_conv_min.png")
written.append(MAPS_DIR / "map_t_conv_min.png")

if "t_total_min" in gmap.columns:
    _plot_continuous("t_total_min", "End-to-end time (resp + scene + convey)", MAPS_DIR / "map_t_total_min.png")
    written.append(MAPS_DIR / "map_t_total_min.png")

print("[OK] Step 6 complete — maps written:")
for p in written:
    print(" -", p)


[OK] Step 6 complete — maps written:
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/maps/map_response_le_7min.png
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/maps/map_response_le_15min.png
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/maps/map_response_le_18min.png
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/maps/map_response_le_40min.png
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/maps/map_conveyance_le_30min.png
 - /Users/rosstaylor/Downloads/Code Repositories/REACH Map (NHS SW)/GitHub Repo/REACH-Map-NHS-SW/data/raw/test_data_ICB_level/maps/map_conveyance_le_45min.png
 - /Users/rosstayl