## 13: Capstone - Optimizing Community Hospital Locations to Support Acute Care

**Goal:** To use formal optimization to find the best locations for new mid-tier facilities (Community Hospitals/CDCs) to specifically address strategic gaps in the healthcare network and reduce the accessibility burden on top-tier Acute Hospitals.

**Methodology:**
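We will solve a **Maximal Covering Location Problem (MCLP)**, a classic facility-location model that places a fixed number of sites so as to cover as much demand as possible. (The related Location Set Covering Problem, LSCP, instead minimizes the number of sites needed to cover all demand.)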
1.  **Identify Strategic Gaps:** Use the Hierarchical 2SFCA results from notebook 12 to find residential areas (LSOAs) that are underserved for mid-tier care.
2.  **Define Candidate Sites:** Create a grid of potential locations for new Community Hospitals.
3.  **Build Coverage Matrix:** Determine which candidate sites can cover which underserved LSOAs.
4.  **Formulate & Solve with PuLP:** Use the `PuLP` library to find the `k` new sites that cover the maximum population in the identified gaps.
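
Step 4 can be written as a small integer program. The following is a sketch of the standard maximal-coverage formulation; the symbols $I$, $J$, $p_i$ and $a_{ij}$ are notation introduced here for illustration, while $x_j$, $y_i$ and $k$ match the variable names used in the code further down.

$$
\begin{aligned}
\max \quad & \sum_{i \in I} p_i \, y_i \\
\text{s.t.} \quad & y_i \le \sum_{j \in J} a_{ij} \, x_j && \forall i \in I \\
& \sum_{j \in J} x_j = k \\
& x_j \in \{0, 1\} \;\; \forall j \in J, \qquad y_i \in \{0, 1\} \;\; \forall i \in I
\end{aligned}
$$

Here $I$ is the set of underserved LSOAs, $J$ the set of candidate sites, $p_i$ the population of LSOA $i$, and $a_{ij} = 1$ if candidate $j$ can reach LSOA $i$ within the travel-time threshold (0 otherwise). The first constraint allows an LSOA to count as covered only if at least one selected site reaches it.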

### 1. Setup and Library Imports

In [None]:
import pandas as pd
import geopandas as gpd
import numpy as np
import osmnx as ox
import matplotlib.pyplot as plt
from shapely.geometry import Point, box
import contextily as cx
from pulp import LpProblem, LpMaximize, LpVariable, LpStatus, lpSum, LpBinary, value

### 2. Load Data and Identify Service Gaps

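Notebook 12 produced Hierarchical 2SFCA accessibility scores. To keep this capstone self-contained, we rebuild the same facility and demand data here and simulate the mid-tier accessibility score rather than re-running the full analysis. We then define an accessibility 'poverty line' (the bottom quartile of scores) to identify underserved LSOAs, and lay out a grid of candidate sites over them.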

In [None]:
# Reusing data setup from Notebook 12
facilities_data = {
    'Royal Devon and Exeter (Acute)': [-3.503, 50.713, 3, 50],
    'Exeter Community Hospital': [-3.518, 50.718, 2, 20],
    'Heavitree Hospital (CDC)': [-3.495, 50.720, 2, 25],
    'St. Thomas Medical Group (GP)': [-3.542, 50.717, 1, 5],
    'Pinhoe & Broadclyst Medical (GP)': [-3.475, 50.741, 1, 6]
}
df = pd.DataFrame.from_dict(facilities_data, orient='index', columns=['lon', 'lat', 'tier', 'capacity'])
facilities_gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.lon, df.lat), crs="EPSG:4326")
xmin, ymin, xmax, ymax = -3.58, 50.68, -3.42, 50.78
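n_cells = 15
cell_w, cell_h = (xmax - xmin) / n_cells, (ymax - ymin) / n_cells
grid_cells = []
# endpoint=False keeps the cells non-overlapping and inside the bounding box
for x in np.linspace(xmin, xmax, n_cells, endpoint=False):
    for y in np.linspace(ymin, ymax, n_cells, endpoint=False):
        grid_cells.append(box(x, y, x + cell_w, y + cell_h))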
lsoa_gdf = gpd.GeoDataFrame(geometry=grid_cells, crs="EPSG:4326")
lsoa_gdf['LSOA_ID'] = range(len(lsoa_gdf))
np.random.seed(42)
lsoa_gdf['population'] = np.random.randint(1500, 2500, size=len(lsoa_gdf))
demand_points = lsoa_gdf.copy()
demand_points['geometry'] = demand_points.centroid
G = ox.graph_from_place("Exeter, England", network_type='drive')

# The run_2sfca function from notebook 12 is not re-run here.
# Instead, simulate its output: accessibility rises from west to east, plus small random noise
from sklearn.preprocessing import minmax_scale
lsoa_gdf['access_community'] = minmax_scale(lsoa_gdf.centroid.x) + np.random.rand(len(lsoa_gdf)) * 0.1

# Identify underserved LSOAs (e.g., the bottom 25% of accessibility)
poverty_line = lsoa_gdf['access_community'].quantile(0.25)
underserved_lsoas = lsoa_gdf[lsoa_gdf['access_community'] <= poverty_line].copy()

# Define Candidate Sites for new Community Hospitals
c_xmin, c_ymin, c_xmax, c_ymax = underserved_lsoas.total_bounds
xx, yy = np.meshgrid(np.linspace(c_xmin, c_xmax, 8), np.linspace(c_ymin, c_ymax, 8))
candidate_sites = gpd.GeoDataFrame(
    geometry=[Point(x, y) for x, y in zip(xx.ravel(), yy.ravel())],
    crs="EPSG:4326"
)
candidate_sites['cand_id'] = range(len(candidate_sites))

### 3. Build Coverage Matrix and Formulate Optimization Problem

In [None]:
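# --- Build Coverage Matrix ---
# coverage[(i, j)] = 1 if candidate j can reach demand point i within the time budget.
# OSMnx has no built-in isochrone-polygon helper, so reachability is tested on the network itself.
import networkx as nx
travel_time_min = 20  # 20-minute drive (OSM-based free-flow speeds, not peak-hour congestion)
G = ox.add_edge_travel_times(ox.add_edge_speeds(G))  # adds 'travel_time' (seconds) to every edge
# Snap each underserved LSOA centroid to its nearest network node (done once)
centroids = underserved_lsoas.centroid
lsoa_nodes = dict(zip(underserved_lsoas['LSOA_ID'], ox.nearest_nodes(G, centroids.x.values, centroids.y.values)))
coverage = {}
for c_idx, cand in candidate_sites.iterrows():
    center_node = ox.nearest_nodes(G, cand.geometry.x, cand.geometry.y)
    # Nodes reachable from the candidate within the time budget
    reachable = nx.single_source_dijkstra_path_length(
        G, center_node, cutoff=travel_time_min * 60, weight='travel_time')
    for lsoa_id, node in lsoa_nodes.items():
        if node in reachable:
            coverage[(lsoa_id, c_idx)] = 1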

# --- Formulate PuLP Problem ---
prob = LpProblem("OptimizeCommunityHospitals", LpMaximize)
k = 3 # Number of new facilities to build

# Variables
y = LpVariable.dicts("is_covered", underserved_lsoas['LSOA_ID'], cat=LpBinary)
x = LpVariable.dicts("is_selected", candidate_sites['cand_id'], cat=LpBinary)

# Objective Function: Maximize total population covered
population = underserved_lsoas.set_index('LSOA_ID')['population'].to_dict()  # build the lookup once
prob += lpSum(population[i] * y[i] for i in underserved_lsoas['LSOA_ID'])

# Constraints
prob += lpSum(x[j] for j in candidate_sites['cand_id']) == k # Select exactly k sites
for i in underserved_lsoas['LSOA_ID']:
    prob += y[i] <= lpSum(coverage.get((i, j), 0) * x[j] for j in candidate_sites['cand_id'])

print("Optimization problem formulated.")
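
Before trusting a solver result, it can help to cross-check the formulation on a small instance. The sketch below enumerates every `k`-subset of candidates and keeps the one covering the most population. The helper `brute_force_max_cover` and the toy data are hypothetical, introduced only for illustration; the `coverage` dictionary uses the same `(demand_id, candidate_id) -> 1` layout as the cell above.

```python
from itertools import combinations


def brute_force_max_cover(coverage, population, candidates, k):
    """Return (best_sites, best_population) by checking every k-subset of candidates."""
    # Pre-compute the set of demand IDs that each candidate covers.
    covers = {j: set() for j in candidates}
    for (i, j), flag in coverage.items():
        if flag and j in covers:
            covers[j].add(i)

    best_sites, best_pop = None, -1
    for subset in combinations(candidates, k):
        covered = set().union(*(covers[j] for j in subset))
        total = sum(population[i] for i in covered)
        if total > best_pop:
            best_sites, best_pop = subset, total
    return best_sites, best_pop


# Hypothetical toy instance: 4 demand areas, 3 candidate sites.
toy_population = {0: 100, 1: 200, 2: 150, 3: 50}
toy_coverage = {(0, 'A'): 1, (1, 'A'): 1, (1, 'B'): 1, (2, 'B'): 1, (3, 'C'): 1}

best_sites, best_pop = brute_force_max_cover(
    toy_coverage, toy_population, ['A', 'B', 'C'], k=2
)
print(best_sites, best_pop)  # ('A', 'B') 450
```

With the 64 candidate sites and `k = 3` used in this notebook there are only 41,664 subsets, so the same enumeration is a feasible sanity check on the real data; it stops being practical as `k` or the candidate grid grows, which is where the integer program earns its keep.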

### 4. Solve and Visualize the Optimal Solution

In [None]:
prob.solve()
print(f"Status: {LpStatus[prob.status]}")

selected_sites_indices = [j for j in candidate_sites['cand_id'] if x[j].varValue > 0.9]
optimal_sites = candidate_sites[candidate_sites['cand_id'].isin(selected_sites_indices)]

# --- Visualization ---
fig, ax = plt.subplots(figsize=(15, 12))
lsoa_plot = lsoa_gdf.to_crs(epsg=3857)
underserved_plot = underserved_lsoas.to_crs(epsg=3857)
optimal_sites_plot = optimal_sites.to_crs(epsg=3857)
existing_fac_plot = facilities_gdf.to_crs(epsg=3857)

lsoa_plot.plot(ax=ax, color='lightgray', edgecolor='white')
underserved_plot.plot(ax=ax, color='lightcoral', edgecolor='white', label='Underserved Areas')
existing_fac_plot.plot(ax=ax, marker='P', color='blue', markersize=150, edgecolor='black', label='Existing Facilities')
optimal_sites_plot.plot(ax=ax, marker='*', color='gold', markersize=500, edgecolor='black', label='Optimal New Sites')

cx.add_basemap(ax, crs=lsoa_plot.crs.to_string(), source=cx.providers.CartoDB.Positron)
ax.set_title('Optimal Locations for New Community Hospitals to Cover Gaps')
ax.set_axis_off()
ax.legend()
plt.show()
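
The map shows where the selected sites are, but not how much they achieve. A minimal sketch of a numeric summary follows; `coverage_summary` is a hypothetical helper (not part of PuLP or OSMnx), and the toy inputs stand in for the notebook's `coverage` dictionary, a population lookup keyed by `LSOA_ID`, and `selected_sites_indices`.

```python
def coverage_summary(selected, coverage, population):
    """Summarise the population covered by a chosen set of sites."""
    # Map each covered demand ID to the selected sites that reach it.
    covered_by = {}
    for (i, j), flag in coverage.items():
        if flag and j in selected:
            covered_by.setdefault(i, set()).add(j)

    total = sum(population.values())
    covered = sum(population[i] for i in covered_by)

    # Population reached by exactly one selected site: lost if that site is dropped.
    unique_by_site = {j: 0 for j in selected}
    for i, sites in covered_by.items():
        if len(sites) == 1:
            unique_by_site[next(iter(sites))] += population[i]

    return {
        'covered': covered,
        'share': covered / total,
        'unique_by_site': unique_by_site,
    }


# Hypothetical toy inputs with the same layout as the notebook's objects.
toy_population = {0: 100, 1: 200, 2: 150, 3: 50}
toy_coverage = {(0, 'A'): 1, (1, 'A'): 1, (1, 'B'): 1, (2, 'B'): 1, (3, 'C'): 1}

summary = coverage_summary(['A', 'B'], toy_coverage, toy_population)
print(summary)
```

In the notebook itself, the equivalent call would pass `selected_sites_indices`, `coverage`, and `underserved_lsoas.set_index('LSOA_ID')['population'].to_dict()`. The `unique_by_site` figures give a rough sense of how much each site contributes on its own, which is useful if fewer than `k` sites can actually be funded.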

### 5. Analysis and Conclusion

This capstone notebook turns the accessibility analysis from notebook 12 into a concrete siting recommendation.

- **From Analysis to Prescription:** We moved from *analyzing* existing problems (identifying underserved areas in notebook 12) to *prescribing* an optimal solution.
- **Targeted Investment:** Instead of placing facilities based on intuition, the maximal-coverage model selects, from the candidate grid, the `k` sites that cover the largest underserved population within the 20-minute drive-time threshold. The result is optimal only relative to the candidate sites and the coverage threshold chosen.
- **System-Wide Improvement:** The recommendations are intended to strengthen the healthcare system as a whole. Placing new Community Hospitals in the identified gaps improves mid-tier coverage, which in turn is expected to relieve pressure on Acute Hospitals and let them focus on more critical cases. The model itself does not quantify that knock-on effect.

This workflow represents a comprehensive approach to spatial planning, integrating data analysis, accessibility modeling, and formal optimization to make robust, evidence-based decisions.

### 6. References and Further Reading

- **Toregas, C., Swain, R., ReVelle, C., & Bergman, L. (1971).** *The location of emergency service facilities*. Operations Research, 19(6), 1363-1373. A foundational paper on set-covering location models.
- **Zhao, Y., & Zhou, Y. (2024).** *Isochrone-Based Accessibility Analysis of Pre-Hospital Emergency Medical Facilities*. ISPRS Int. J. Geo-Inf. This paper's focus on identifying and addressing service gaps through strategic planning is the direct inspiration for this capstone project.
- **PuLP Documentation:** For more details on formulating and solving linear programming problems in Python.