# 02 — Estimator Comparison

This notebook compares the three creatinine-based eGFR/CrCl equations implemented in `eGFR/models.py`:

| Equation | Output | Units | Primary Use |
|----------|--------|-------|-------------|
| **CKD-EPI 2021** | eGFR | mL/min/1.73 m² | Current KDIGO standard, CKD staging |
| **MDRD** | eGFR | mL/min/1.73 m² | Legacy lab reporting, backward compatibility |
| **Cockcroft-Gault** | CrCl | mL/min | Drug dosing (FDA labels) |

We apply all three equations to NHANES-like data and examine:

1. **Scatter plots** — pairwise agreement between estimators
2. **CKD stage reclassification** — how patients shift between stages depending on the equation
3. **Interpretation** — clinical implications of disagreements

---

In [None]:
import sys, os

# Ensure the project root is on the path so we can import eGFR
project_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
if project_root not in sys.path:
    sys.path.insert(0, project_root)

import warnings
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for headless execution
import matplotlib.pyplot as plt

# eGFR package imports
from eGFR.data import clean_kidney_data
from eGFR.models import (
    calc_egfr_ckd_epi_2021,
    calc_egfr_mdrd,
    calc_crcl_cockcroft_gault,
)
from eGFR.utils import egfr_to_ckd_stage

print("Imports OK")

---

## Step 1 — Prepare Data

We generate synthetic NHANES-like data (same approach as Notebook 01) so the notebook
executes without requiring a CDC download.  If real XPT files are available in `data/raw/`,
they will be used instead.

In [None]:
def _make_synthetic_data(n=2000, seed=42):
    """Generate synthetic NHANES-like DataFrames for demonstration."""
    rng = np.random.default_rng(seed)
    seqn = np.arange(1, n + 1, dtype=float)
    sex = rng.choice([1.0, 2.0], size=n)  # 1=male, 2=female
    age = rng.uniform(18, 85, size=n)

    # Creatinine: lognormal, males higher than females
    cr = np.where(
        sex == 1,
        rng.lognormal(np.log(1.0), 0.2, size=n),
        rng.lognormal(np.log(0.8), 0.2, size=n),
    )

    # Weight and height
    weight = np.where(
        sex == 1,
        rng.normal(85, 15, size=n),
        rng.normal(72, 15, size=n),
    )
    height = np.where(
        sex == 1,
        rng.normal(175, 8, size=n),
        rng.normal(162, 7, size=n),
    )

    biopro = pd.DataFrame({"SEQN": seqn, "LBXSCR": cr})
    demo = pd.DataFrame({"SEQN": seqn, "RIDAGEYR": age, "RIAGENDR": sex})
    bmx = pd.DataFrame({"SEQN": seqn, "BMXWT": weight, "BMXHT": height})
    return biopro, demo, bmx


# Try loading real XPT files; fall back to synthetic data
USE_REAL_DATA = False
RAW_DIR = os.path.join(project_root, "data", "raw")

try:
    from eGFR.data import read_xpt
    biopro_df = read_xpt(os.path.join(RAW_DIR, "BIOPRO_J.XPT"))
    demo_df = read_xpt(os.path.join(RAW_DIR, "DEMO_J.XPT"))
    bmx_df = read_xpt(os.path.join(RAW_DIR, "BMX_J.XPT"))
    USE_REAL_DATA = True
    print("Loaded real NHANES XPT files")
except Exception as e:
    print(f"Could not load XPT files ({type(e).__name__}: {e})")
    print("  -> Using synthetic demonstration data instead.\n")
    biopro_df, demo_df, bmx_df = _make_synthetic_data(n=2000)

# Clean & merge (record=True captures any warnings instead of printing them)
with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    df = clean_kidney_data(
        biopro_df=biopro_df,
        demo_df=demo_df,
        bmx_df=bmx_df,
        apply_idms_correction=False,
    )

data_label = "NHANES 2017-2018" if USE_REAL_DATA else "Synthetic Demo Data"
print(f"Dataset: {data_label}")
print(f"Records: {len(df):,}")
print(f"Columns: {list(df.columns)}")

---

## Step 2 — Compute All Three Estimators

We apply each equation row-by-row.  Notes:

- **CKD-EPI 2021** and **MDRD** return **eGFR** (mL/min/1.73 m²)
- **Cockcroft-Gault** returns **CrCl** (mL/min) — weight-dependent, not BSA-normalised
- MDRD warnings (eGFR > 60) are suppressed during batch computation
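
For reference, the published forms of the three equations can be sketched as below. This is a minimal standalone sketch, not the code in `eGFR/models.py`: the function names and the boolean `female` argument are assumptions made for illustration (the package functions take an integer sex code), and the MDRD version shown is the IDMS-traceable form (coefficient 175) without a race term, which may differ from what the package implements.

```python
def ckd_epi_2021(scr_mgdl: float, age: float, female: bool) -> float:
    """CKD-EPI 2021 creatinine equation, in mL/min/1.73 m^2 (illustrative)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mgdl / kappa
    egfr = 142.0 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.200 * 0.9938 ** age
    return egfr * 1.012 if female else egfr


def mdrd_idms(scr_mgdl: float, age: float, female: bool) -> float:
    """4-variable MDRD, IDMS-traceable form, no race term (illustrative)."""
    egfr = 175.0 * scr_mgdl ** -1.154 * age ** -0.203
    return egfr * 0.742 if female else egfr


def cockcroft_gault(scr_mgdl: float, age: float, weight_kg: float, female: bool) -> float:
    """Cockcroft-Gault creatinine clearance, in mL/min (illustrative)."""
    crcl = (140.0 - age) * weight_kg / (72.0 * scr_mgdl)
    return crcl * 0.85 if female else crcl


# Hypothetical patient: 50-year-old male, creatinine 1.0 mg/dL, 80 kg
print(round(ckd_epi_2021(1.0, 50, False), 1))
print(round(mdrd_idms(1.0, 50, False), 1))
print(round(cockcroft_gault(1.0, 50, 80.0, False), 1))
```

Only Cockcroft-Gault takes body weight, which is why its output is an absolute clearance in mL/min rather than a BSA-indexed value.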

In [None]:
# Compute CKD-EPI 2021 eGFR
df["egfr_ckd_epi"] = df.apply(
    lambda r: calc_egfr_ckd_epi_2021(r["cr_mgdl"], r["age_years"], int(r["sex"])),
    axis=1,
)

# Compute MDRD eGFR (suppress > 60 warnings)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    df["egfr_mdrd"] = df.apply(
        lambda r: calc_egfr_mdrd(r["cr_mgdl"], r["age_years"], int(r["sex"])),
        axis=1,
    )

# Compute Cockcroft-Gault CrCl
df["crcl_cg"] = df.apply(
    lambda r: calc_crcl_cockcroft_gault(
        r["cr_mgdl"], r["age_years"], r["weight_kg"], int(r["sex"])
    ),
    axis=1,
)

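# CKD stage per equation
# NOTE: G-stages are defined on GFR in mL/min/1.73 m². Staging raw
# Cockcroft-Gault CrCl (mL/min) is done here only to compare equations.
df["stage_ckd_epi"] = df["egfr_ckd_epi"].apply(egfr_to_ckd_stage)
df["stage_mdrd"]    = df["egfr_mdrd"].apply(egfr_to_ckd_stage)
df["stage_cg"]      = df["crcl_cg"].apply(egfr_to_ckd_stage)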

# Summary statistics
summary_cols = ["egfr_ckd_epi", "egfr_mdrd", "crcl_cg"]
print(df[summary_cols].describe().round(1).to_string())

---

## Step 3 — Scatter Plots: Pairwise Equation Agreement

### 3.1  CKD-EPI 2021 vs MDRD

Both estimate eGFR in mL/min/1.73 m², so they should agree closely, especially
for eGFR < 60 (CKD stages G3a to G5).  MDRD is known to be less accurate above 60.
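
Agreement between two estimators that share units is commonly summarised with Bland-Altman statistics: the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. The helper below is a minimal sketch of that calculation; the name `bland_altman_summary` is hypothetical and is not part of the `eGFR` package.

```python
import numpy as np


def bland_altman_summary(a, b):
    """Return bias, SD and 95% limits of agreement for paired values a - b."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample SD of the paired differences
    return {
        "bias": bias,
        "sd": sd,
        "loa_low": bias - 1.96 * sd,
        "loa_high": bias + 1.96 * sd,
    }


# Made-up paired values, for illustration only
summary = bland_altman_summary([10, 20, 30, 40], [8, 19, 31, 38])
print(summary)
```

In this notebook the same function could be applied to `df["egfr_ckd_epi"]` and `df["egfr_mdrd"]`, assuming both columns have already been computed.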

In [None]:
fig, ax = plt.subplots(figsize=(8, 8))

ax.scatter(
    df["egfr_ckd_epi"], df["egfr_mdrd"],
    alpha=0.20, s=8, color="#5C6BC0", edgecolors="none",
)

# Identity line
lims = [0, max(df["egfr_ckd_epi"].max(), df["egfr_mdrd"].max()) * 1.05]
ax.plot(lims, lims, "--", color="#E53935", linewidth=1.5, label="Line of identity")

# CKD threshold lines
for thresh in [15, 30, 45, 60, 90]:
    ax.axhline(thresh, color="gray", linewidth=0.5, alpha=0.5)
    ax.axvline(thresh, color="gray", linewidth=0.5, alpha=0.5)

ax.set_xlabel("CKD-EPI 2021 eGFR (mL/min/1.73 m²)", fontsize=12)
ax.set_ylabel("MDRD eGFR (mL/min/1.73 m²)", fontsize=12)
ax.set_title(f"CKD-EPI 2021 vs MDRD — {data_label}", fontsize=14, fontweight="bold")
ax.set_xlim(lims)
ax.set_ylim(lims)
ax.set_aspect("equal")
ax.legend(fontsize=11, loc="upper left")
ax.grid(True, alpha=0.2)

plt.tight_layout()
plt.savefig(os.path.join(project_root, "data", "scatter_ckdepi_vs_mdrd.png"),
            dpi=150, bbox_inches="tight")
plt.show()

# Agreement statistics
diff_epi_mdrd = df["egfr_ckd_epi"] - df["egfr_mdrd"]
print("\nCKD-EPI vs MDRD difference:")
print(f"  Mean bias:  {diff_epi_mdrd.mean():+.1f} mL/min/1.73 m²")
print(f"  SD:         {diff_epi_mdrd.std():.1f}")
print(f"  Median:     {diff_epi_mdrd.median():+.1f}")
print(f"  IQR:        [{diff_epi_mdrd.quantile(0.25):+.1f}, {diff_epi_mdrd.quantile(0.75):+.1f}]")

### 3.2  CKD-EPI 2021 vs Cockcroft-Gault

CKD-EPI returns eGFR (BSA-normalised), while Cockcroft-Gault returns CrCl (absolute).
These are **fundamentally different quantities**, so perfect agreement is not expected.

Key differences:
- CrCl overestimates GFR because tubular secretion of creatinine adds to total clearance
- CrCl is weight-dependent → obese patients show higher CrCl than eGFR
- In underweight patients, CrCl may be lower than eGFR
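
To put CrCl on the same scale as eGFR, it can be indexed to a body surface area of 1.73 m². The sketch below uses the Du Bois BSA formula; the function names are hypothetical, and this notebook itself compares the raw (non-indexed) Cockcroft-Gault values.

```python
def bsa_du_bois(weight_kg: float, height_cm: float) -> float:
    """Body surface area in m^2 using the Du Bois formula."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def index_to_bsa(crcl_ml_min: float, weight_kg: float, height_cm: float) -> float:
    """Convert an absolute clearance (mL/min) to mL/min/1.73 m^2."""
    return crcl_ml_min * 1.73 / bsa_du_bois(weight_kg, height_cm)


# Same absolute CrCl, two hypothetical body sizes
print(round(index_to_bsa(100.0, 70.0, 170.0), 1))   # average build
print(round(index_to_bsa(100.0, 120.0, 180.0), 1))  # larger build, lower indexed value
```

A larger body surface area shrinks the indexed value, which is one reason raw CrCl sits above eGFR in heavier patients.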

In [None]:
fig, ax = plt.subplots(figsize=(8, 8))

ax.scatter(
    df["egfr_ckd_epi"], df["crcl_cg"],
    alpha=0.20, s=8, color="#26A69A", edgecolors="none",
)

# Identity line
lims = [0, max(df["egfr_ckd_epi"].max(), df["crcl_cg"].max()) * 1.05]
ax.plot(lims, lims, "--", color="#E53935", linewidth=1.5, label="Line of identity")

# CKD threshold lines
for thresh in [15, 30, 45, 60, 90]:
    ax.axhline(thresh, color="gray", linewidth=0.5, alpha=0.5)
    ax.axvline(thresh, color="gray", linewidth=0.5, alpha=0.5)

ax.set_xlabel("CKD-EPI 2021 eGFR (mL/min/1.73 m²)", fontsize=12)
ax.set_ylabel("Cockcroft-Gault CrCl (mL/min)", fontsize=12)
ax.set_title(f"CKD-EPI 2021 vs Cockcroft-Gault — {data_label}",
             fontsize=14, fontweight="bold")
ax.set_xlim(lims)
ax.set_ylim(lims)
ax.set_aspect("equal")
ax.legend(fontsize=11, loc="upper left")
ax.grid(True, alpha=0.2)

plt.tight_layout()
plt.savefig(os.path.join(project_root, "data", "scatter_ckdepi_vs_cg.png"),
            dpi=150, bbox_inches="tight")
plt.show()

# Agreement statistics
diff_epi_cg = df["egfr_ckd_epi"] - df["crcl_cg"]
print("\nCKD-EPI vs Cockcroft-Gault difference:")
print(f"  Mean bias:  {diff_epi_cg.mean():+.1f} (eGFR - CrCl)")
print(f"  SD:         {diff_epi_cg.std():.1f}")
print(f"  Median:     {diff_epi_cg.median():+.1f}")
print(f"  IQR:        [{diff_epi_cg.quantile(0.25):+.1f}, {diff_epi_cg.quantile(0.75):+.1f}]")

---

## Step 4 — CKD Stage Reclassification Analysis

Different equations can assign the same patient to different CKD stages.
This is clinically important because CKD stage drives:

- Referral decisions (stage G3b+ → nephrology)
- Drug dose adjustments
- Monitoring frequency

We build a **reclassification matrix** showing how patients move between stages
when switching from CKD-EPI 2021 to each alternative equation.
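
Beyond the matrix itself, it can help to summarise the direction of movement: how many patients shift to a more severe stage, a less severe stage, or stay put. The helper below is a sketch under the assumption that stages are the strings `G1` to `G5` as used in this notebook; `shift_direction` is a hypothetical name, not a function from the `eGFR` package.

```python
import pandas as pd

STAGES = ["G1", "G2", "G3a", "G3b", "G4", "G5"]  # least to most severe


def shift_direction(ref_stages, alt_stages, stages=STAGES):
    """Count patients who move up, down, or stay when switching equations."""
    rank = {s: i for i, s in enumerate(stages)}
    # Compare positionally; labels not in `stages` map to NaN and are not counted
    ref_rank = pd.Series(list(ref_stages)).map(rank)
    alt_rank = pd.Series(list(alt_stages)).map(rank)
    delta = alt_rank - ref_rank
    return {
        "more_severe": int((delta > 0).sum()),
        "unchanged": int((delta == 0).sum()),
        "less_severe": int((delta < 0).sum()),
    }


# Made-up stages for four patients
print(shift_direction(["G1", "G2", "G3a", "G3a"], ["G2", "G2", "G2", "G3b"]))
# -> {'more_severe': 2, 'unchanged': 1, 'less_severe': 1}
```

The `unchanged` count corresponds to the diagonal of the reclassification matrix; the other two counts split the off-diagonal cells by direction.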

In [None]:
stage_order = ["G1", "G2", "G3a", "G3b", "G4", "G5"]


def reclassification_matrix(df, stage_ref, stage_alt, ref_label, alt_label):
    """Build a reclassification matrix (rows: reference stage, cols: alternative stage)."""
    cross = pd.crosstab(
        df[stage_ref], df[stage_alt],
        rownames=[ref_label], colnames=[alt_label],
    )
    # reindex adds any missing stage as an all-zero row/column and fixes the order
    return cross.reindex(index=stage_order, columns=stage_order, fill_value=0)


# --- CKD-EPI vs MDRD reclassification ---
reclass_mdrd = reclassification_matrix(
    df, "stage_ckd_epi", "stage_mdrd",
    "CKD-EPI 2021", "MDRD",
)

print("=" * 60)
print("CKD Stage Reclassification: CKD-EPI 2021 (rows) vs MDRD (cols)")
print("=" * 60)
print(reclass_mdrd.to_string())

# Concordance
concordant_mdrd = sum(reclass_mdrd.loc[s, s] for s in stage_order)
total = len(df)
print(f"\nConcordance: {concordant_mdrd}/{total} ({100*concordant_mdrd/total:.1f}%)")
print(f"Reclassified: {total - concordant_mdrd}/{total} "
      f"({100*(total - concordant_mdrd)/total:.1f}%)")

In [None]:
# --- CKD-EPI vs Cockcroft-Gault reclassification ---
reclass_cg = reclassification_matrix(
    df, "stage_ckd_epi", "stage_cg",
    "CKD-EPI 2021", "Cockcroft-Gault",
)

print("=" * 60)
print("CKD Stage Reclassification: CKD-EPI 2021 (rows) vs CG (cols)")
print("=" * 60)
print(reclass_cg.to_string())

# Concordance
concordant_cg = sum(reclass_cg.loc[s, s] for s in stage_order)
print(f"\nConcordance: {concordant_cg}/{total} ({100*concordant_cg/total:.1f}%)")
print(f"Reclassified: {total - concordant_cg}/{total} "
      f"({100*(total - concordant_cg)/total:.1f}%)")

In [None]:
# --- Heatmap visualisation of reclassification matrices ---
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

for ax, matrix, title in zip(
    axes,
    [reclass_mdrd, reclass_cg],
    ["CKD-EPI 2021 vs MDRD", "CKD-EPI 2021 vs Cockcroft-Gault"],
):
    im = ax.imshow(matrix.values, cmap="YlOrRd", aspect="auto")
    ax.set_xticks(range(len(stage_order)))
    ax.set_yticks(range(len(stage_order)))
    ax.set_xticklabels(stage_order, fontsize=11)
    ax.set_yticklabels(stage_order, fontsize=11)
    ax.set_xlabel(title.split(" vs ")[1], fontsize=12)
    ax.set_ylabel(title.split(" vs ")[0], fontsize=12)
    ax.set_title(f"Reclassification: {title}", fontsize=13, fontweight="bold")

    # Annotate cells
    for i in range(len(stage_order)):
        for j in range(len(stage_order)):
            val = matrix.values[i, j]
            if val > 0:
                color = "white" if val > matrix.values.max() * 0.6 else "black"
                ax.text(j, i, str(val), ha="center", va="center",
                        fontsize=10, fontweight="bold", color=color)

plt.tight_layout()
plt.savefig(os.path.join(project_root, "data", "reclassification_heatmaps.png"),
            dpi=150, bbox_inches="tight")
plt.show()
print("Figure saved to data/reclassification_heatmaps.png")

---

## Step 5 — Distribution Comparison

Overlay the distributions of all three estimators to visualise systematic differences.

In [None]:
fig, ax = plt.subplots(figsize=(12, 5))

bins = np.arange(0, 200, 3)
ax.hist(df["egfr_ckd_epi"], bins=bins, alpha=0.5, color="#1E88E5",
        label="CKD-EPI 2021 (eGFR)", density=True)
ax.hist(df["egfr_mdrd"], bins=bins, alpha=0.5, color="#43A047",
        label="MDRD (eGFR)", density=True)
ax.hist(df["crcl_cg"], bins=bins, alpha=0.5, color="#FB8C00",
        label="Cockcroft-Gault (CrCl)", density=True)

# CKD threshold lines
for thresh, stage in [(90, "G1/G2"), (60, "G2/G3a"), (45, "G3a/G3b"),
                      (30, "G3b/G4"), (15, "G4/G5")]:
    ax.axvline(thresh, color="red", linewidth=0.8, linestyle="--", alpha=0.6)
    ax.text(thresh + 1, ax.get_ylim()[1] * 0.92, stage,
            fontsize=8, color="red", alpha=0.7)

ax.set_xlabel("eGFR / CrCl (mL/min/1.73 m² or mL/min)", fontsize=12)
ax.set_ylabel("Density", fontsize=12)
ax.set_title(f"Distribution of All Three Estimators — {data_label}",
             fontsize=14, fontweight="bold")
ax.legend(fontsize=11)
ax.set_xlim(0, 200)
ax.grid(axis="y", alpha=0.2)

plt.tight_layout()
plt.savefig(os.path.join(project_root, "data", "estimator_distributions.png"),
            dpi=150, bbox_inches="tight")
plt.show()
print("Figure saved to data/estimator_distributions.png")

---

## Interpretation

### CKD-EPI 2021 vs MDRD

- **Below eGFR 60:** Both equations generally agree well, which is expected since MDRD
  was developed and validated primarily in populations with reduced kidney function.
- **Above eGFR 60:** MDRD is known to underestimate GFR in healthy individuals.
  The CKD-EPI equation was developed (originally in 2009) to improve accuracy at higher
  GFR values, and the 2021 refit, which drops the race coefficient, keeps the same
  functional form. Consequently, CKD-EPI 2021 tends to produce higher values in this range.
- **Clinical impact:** Patients near the G2/G3a boundary (eGFR ≈ 60) may be classified
  as G3a by MDRD but G2 by CKD-EPI 2021, potentially leading to unnecessary referrals
  if MDRD staging is used for clinical decisions. This is a key reason why KDIGO
  recommends CKD-EPI 2021 as the primary equation.
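
The boundary effect can be made concrete with the standard G-stage cut-offs (90, 60, 45, 30, 15 mL/min/1.73 m²). The function below is an illustrative sketch and may not match `egfr_to_ckd_stage` in every detail (for example, handling of missing values); the two eGFR values are made up to show a patient straddling the G2/G3a boundary.

```python
def ckd_g_stage(egfr: float) -> str:
    """Map an eGFR (mL/min/1.73 m^2) to a CKD G-stage."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


# Hypothetical patient: the two equations land on either side of 60
print(ckd_g_stage(62.0))  # -> G2  (e.g. CKD-EPI 2021)
print(ckd_g_stage(58.0))  # -> G3a (e.g. MDRD)
```

A difference of only a few mL/min/1.73 m² is enough to change the stage label when the estimates fall on opposite sides of a cut-off.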

### CKD-EPI 2021 vs Cockcroft-Gault

- **Different quantities:** CKD-EPI 2021 returns eGFR (BSA-normalised to 1.73 m²),
  while Cockcroft-Gault returns absolute CrCl (mL/min). Direct comparison is inherently
  approximate.
- **Weight effect:** Cockcroft-Gault is weight-dependent. Obese patients show higher
  CrCl than eGFR; cachectic patients show lower CrCl.
- **Overestimation of GFR:** CrCl systematically overestimates true GFR because
  creatinine is both filtered and secreted by the tubules.
- **Drug dosing relevance:** Despite its limitations, Cockcroft-Gault CrCl remains the
  reference for many FDA drug labels, so it is essential for pharmacy use cases.
  Substituting eGFR for CrCl in drug dosing decisions can lead to under- or over-dosing.

### Summary

| Comparison | Key Finding |
|------------|-------------|
| CKD-EPI vs MDRD | Good agreement below 60; CKD-EPI higher above 60 |
| CKD-EPI vs CG | Systematic offset due to CrCl ≠ eGFR; weight-driven scatter |
| Stage reclassification | Substantial reclassification possible near CKD thresholds |

> **Clinical recommendation:** Use CKD-EPI 2021 for CKD staging and screening.
> Use Cockcroft-Gault CrCl specifically for drug dosing when the FDA label
> references creatinine clearance.