# MCBOMs Harwood (2003) Validation

This notebook validates the MCBOMs optimizer against benchmark results from:

> Harwood et al. (2003). Systemwide optimization of safety improvements for 3R projects. *Transportation Research Record*, 1840, 148-157.

**Run each cell in order (Shift+Enter)**

## Step 1: Clone Your GitHub Repository

⚠️ **EDIT THE URL BELOW** to match your GitHub repository!

In [None]:
# CHANGE THIS to your GitHub repo URL!
GITHUB_REPO = "https://github.com/YOUR-USERNAME/mcboms-optimization.git"

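# Derive the folder name from the URL so it still matches after you edit GITHUB_REPO
REPO_DIR = GITHUB_REPO.rstrip("/").split("/")[-1].removesuffix(".git")

# Clone the repository (remove any previous copy first)
!rm -rf {REPO_DIR}
!git clone {GITHUB_REPO}

# Navigate into the folder
%cd {REPO_DIR}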

print("\n✓ Repository cloned successfully!")

## Step 2: Install Dependencies

In [None]:
# Install Gurobi (the pip package includes a size-limited license, sufficient for small problems)
!pip install gurobipy pandas numpy -q

# Install the mcboms package
!pip install -e . -q

print("✓ Dependencies installed!")
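
Because the install commands run with `-q`, a failed install can go unnoticed. The sketch below is one way to confirm that the packages are importable before continuing. The helper name `missing_packages` is hypothetical, and the top-level package name `mcboms` is inferred from the imports used later in this notebook.

```python
from importlib import util

# Top-level package names this notebook relies on (mcboms is inferred from later imports).
REQUIRED = ("gurobipy", "pandas", "numpy", "mcboms")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if util.find_spec(name) is None]


missing = missing_packages()
print("Missing packages:", missing or "none")
```

If the list is not empty, re-run the install cell without `-q` to see the full pip output.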

## Step 3: Run the Validation Script

In [None]:
!python run_harwood_validation.py
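
A `!python ...` line does not stop the notebook when the script fails. As a minimal sketch, the script can instead be launched through `subprocess` so that its exit code is available in Python. The helper `run_script` is hypothetical, and whether `run_harwood_validation.py` returns a non-zero exit code on a failed validation is an assumption to check against the script itself.

```python
import subprocess
import sys


def run_script(args):
    """Run the current Python interpreter with `args`; return (exit_code, output)."""
    completed = subprocess.run(
        [sys.executable, *args],
        capture_output=True,  # collect stdout and stderr instead of streaming them
        text=True,            # decode bytes to str
    )
    return completed.returncode, completed.stdout + completed.stderr


# Assumes the current directory is the cloned repository.
exit_code, output = run_script(["run_harwood_validation.py"])
print(output)
print("Exit code:", exit_code)
```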

---

## Alternative: Run Step by Step (to inspect the details)

In [None]:
# Import modules
from mcboms.io.readers import load_harwood_sites
from mcboms.data.harwood_alternatives import get_harwood_alternatives, get_expected_solution_50m, get_expected_solution_10m
from mcboms.core.optimizer import Optimizer

print("✓ Modules imported successfully!")

In [None]:
# Load data
sites = load_harwood_sites()
alternatives = get_harwood_alternatives()

print(f"Sites: {len(sites)}")
print(f"Alternatives: {len(alternatives)}")
print()
print("Site Data:")
display(sites)

In [None]:
# View alternatives
print("Alternatives Data:")
display(alternatives[["site_id", "alt_id", "description", "total_cost", "total_benefit", "net_benefit"]])

In [None]:
# Run $50M Budget Optimization (budget large enough that it should not be binding)
print("=" * 60)
print("OPTIMIZATION: $50M Budget (Unconstrained)")
print("=" * 60)

optimizer_50m = Optimizer(
    sites=sites,
    alternatives=alternatives,
    budget=50_000_000,
    discount_rate=0.04,
)

result_50m = optimizer_50m.solve()

print(result_50m.summary())
print()
print("Selected Alternatives:")
display(result_50m.selected_alternatives)

In [None]:
# Compare with expected results
expected_50m = get_expected_solution_50m()

print("COMPARISON - $50M Budget")
print("-" * 50)
print(f"{'Metric':<25} {'Expected':>15} {'Actual':>15}")
print("-" * 50)
print(f"{'Total Cost':<25} ${expected_50m['total_cost']:>14,} ${result_50m.total_cost:>14,.0f}")
print(f"{'Total Benefit':<25} ${expected_50m['total_benefit']:>14,} ${result_50m.total_benefit:>14,.0f}")
print(f"{'Net Benefit':<25} ${expected_50m['net_benefit']:>14,} ${result_50m.net_benefit:>14,.0f}")
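
The table above leaves the pass/fail judgment to the reader. A minimal sketch of an explicit check is shown below, using a relative tolerance. The helper `compare_metric` and the 1% default tolerance are assumptions made for illustration, not part of the mcboms package or the paper, and the numbers in the usage line are placeholders.

```python
import math


def compare_metric(name, expected, actual, rel_tol=0.01):
    """Compare one metric against its benchmark value within a relative tolerance."""
    ok = math.isclose(actual, expected, rel_tol=rel_tol)
    # Signed percentage difference relative to the expected value.
    diff_pct = 100.0 * (actual - expected) / expected if expected else float("nan")
    return {"metric": name, "ok": ok, "diff_pct": diff_pct}


# Placeholder numbers, not values from the paper.
check = compare_metric("Net Benefit", expected=1_000_000, actual=1_004_000)
print(f"{check['metric']}: {'OK' if check['ok'] else 'DIFFERS'} ({check['diff_pct']:+.2f}%)")
```

In the notebook, the same helper could be called with `expected_50m['net_benefit']` and `result_50m.net_benefit`, and likewise for cost and benefit.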

In [None]:
# Run $10M Budget Optimization
print("=" * 60)
print("OPTIMIZATION: $10M Budget (Constrained)")
print("=" * 60)

optimizer_10m = Optimizer(
    sites=sites,
    alternatives=alternatives,
    budget=10_000_000,
    discount_rate=0.04,
)

result_10m = optimizer_10m.solve()

print(result_10m.summary())
print()
print("Selected Alternatives:")
display(result_10m.selected_alternatives)
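
To build intuition for why a tighter budget pushes some sites to "do nothing", the sketch below solves a toy version of the selection problem by brute force: pick exactly one alternative per site, stay within the budget, and maximize total net benefit. This is an illustration only. The data are hypothetical, and the assumption that `Optimizer` solves a problem of this shape is inferred from its inputs, not confirmed by the source.

```python
from itertools import product

# Hypothetical toy data: {site_id: [(alt_id, cost, net_benefit), ...]}.
# alt_id 0 is the do-nothing option with zero cost and zero net benefit.
TOY_ALTERNATIVES = {
    1: [(0, 0, 0), (1, 400, 900), (2, 700, 1200)],
    2: [(0, 0, 0), (1, 500, 600)],
    3: [(0, 0, 0), (1, 300, 200), (2, 600, 1000)],
}


def best_selection(alternatives, budget):
    """Enumerate one alternative per site; return (net_benefit, cost, {site: alt_id})."""
    sites = sorted(alternatives)
    best = None
    for combo in product(*(alternatives[s] for s in sites)):
        cost = sum(alt[1] for alt in combo)
        net = sum(alt[2] for alt in combo)
        # Keep the feasible combination with the highest total net benefit.
        if cost <= budget and (best is None or net > best[0]):
            best = (net, cost, {s: alt[0] for s, alt in zip(sites, combo)})
    return best


print("Loose budget:", best_selection(TOY_ALTERNATIVES, budget=10_000))
print("Tight budget:", best_selection(TOY_ALTERNATIVES, budget=1_000))
```

With the loose budget every site receives its highest-net-benefit alternative. With the tight budget, site 2 falls back to alternative 0 and site 1 drops to a cheaper alternative. Brute force is only practical for a handful of sites, which is why a solver such as Gurobi is used for the real problem.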

In [None]:
# Check which sites got "Do Nothing"
expected_10m = get_expected_solution_10m()

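# Sites that appear in the solution at all
selected_sites = set(result_10m.selected_alternatives["site_id"].tolist())
# Assumes the benchmark sites are numbered 1 through 10
all_sites = set(range(1, 11))
# Sites missing from the solution are treated as implicit do-nothing
do_nothing_sites = all_sites - selected_sites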

# Also check for alt_id == 0 (explicit do-nothing)
explicit_do_nothing = result_10m.selected_alternatives[
    result_10m.selected_alternatives["alt_id"] == 0
]["site_id"].tolist()

actual_do_nothing = list(do_nothing_sites.union(set(explicit_do_nothing)))

print("DO-NOTHING SITES COMPARISON")
print("-" * 40)
print(f"Expected: {expected_10m['do_nothing_sites']}")
print(f"Actual:   {sorted(actual_do_nothing)}")
print()

if set(actual_do_nothing) == set(expected_10m['do_nothing_sites']):
    print("✓ MATCH! Correct sites deferred.")
else:
    print("✗ MISMATCH - Check results.")
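
When the check reports a mismatch, it helps to see which sites differ. A small sketch using set differences is given below. The helper `diff_do_nothing` is hypothetical, and the site IDs in the usage line are placeholders, not the benchmark solution.

```python
def diff_do_nothing(expected, actual):
    """Return which do-nothing sites are missing from, or unexpected in, the actual result."""
    expected_set, actual_set = set(expected), set(actual)
    return {
        "missing": sorted(expected_set - actual_set),     # expected to be deferred, but funded
        "unexpected": sorted(actual_set - expected_set),  # deferred, but expected to be funded
    }


# Placeholder site IDs for illustration only.
print(diff_do_nothing(expected=[2, 5, 9], actual=[2, 5, 7]))
```

In the notebook, the same helper could be called with `expected_10m['do_nothing_sites']` and `actual_do_nothing`.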

---

## Summary

If all cells ran successfully and the results match:

✅ **VALIDATION PASSED** - MCBOMs optimizer correctly replicates the Harwood et al. (2003) benchmark results

### Next Steps:
1. Implement full CMF-based benefit calculations
2. Add partial-length alternative enumeration
3. Run MCBOMs methodology and compare against Harwood baseline
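
As a rough sketch of what the first step might involve, the code below converts an annual crash reduction into a present-value benefit with the standard uniform-series present worth factor, using the 4% discount rate already passed to `Optimizer`. Everything else is an assumption for illustration: the function names, the 20-year analysis period, the crash frequency, the CMF, and the cost per crash are hypothetical and do not come from the mcboms package or the paper.

```python
def present_worth_factor(rate, years):
    """Uniform-series present worth factor (P/A) for a given discount rate and period."""
    if rate == 0:
        return float(years)
    growth = (1 + rate) ** years
    return (growth - 1) / (rate * growth)


def crash_reduction_benefit(crashes_per_year, cmf, cost_per_crash, rate, years):
    """Present value of crashes avoided by a treatment with crash modification factor `cmf`."""
    annual_benefit = crashes_per_year * (1 - cmf) * cost_per_crash
    return annual_benefit * present_worth_factor(rate, years)


# Hypothetical inputs: 2 crashes/year, CMF of 0.85, $100,000 per crash, 20-year period.
benefit = crash_reduction_benefit(2.0, 0.85, 100_000, rate=0.04, years=20)
print(f"Present-value benefit: ${benefit:,.0f}")
```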