### MCLP Demonstration

Author: Kaiwen Dong

This notebook demonstrates how the Maximal Covering Location Problem (MCLP) can be used to identify optimal recycling bin placement locations, using Hilo, Hawaii as a case study.
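
For reference, the standard MCLP can be written as the integer program below. The notation is chosen here for illustration and is not taken from the project code: $I$ is the set of demand (client) points with weights $w_i$, $J$ is the set of candidate facility sites, $d_{ij}$ is the cost between client $i$ and site $j$, $S$ is the service radius (`SERVICE_RADIUS` in this notebook), and $p$ is the number of facilities to place (`P_FACILITIES`).

$$
\begin{aligned}
\max \quad & \sum_{i \in I} w_i \, y_i \\
\text{s.t.} \quad & y_i \le \sum_{j \in N_i} x_j && \forall i \in I, \qquad N_i = \{\, j \in J : d_{ij} \le S \,\} \\
& \sum_{j \in J} x_j = p \\
& x_j \in \{0, 1\} \quad \forall j \in J, \qquad y_i \in \{0, 1\} \quad \forall i \in I
\end{aligned}
$$

Here $x_j = 1$ when site $j$ is selected and $y_i = 1$ when client $i$ lies within the service radius of at least one selected site, so the objective is the total covered demand.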

#### Imports

In [None]:
# Standard-library imports
import os
import warnings
from pathlib import Path

# Third-party imports
import geopandas as gpd
import spaghetti
from dotenv import load_dotenv

# Application imports
from utils.constants import DATA_DIR
from utils.mclp import (
    load_and_clean_data,
    summarize_clusters,
    cluster_foot_traffic,
    clean_coordinates,
    calculate_weights_and_cost_matrix,
    setup_and_solve_mclp,
    print_coverage_results,
    calculate_lattice_extent,  # used below; assumed to live in utils.mclp
    create_network_with_lattice,
    snap_observations_to_network,
    generate_mapbox_cost_matrix,
    visualize_results,
    perform_parameter_sweep_on_service_radius,
    visualize_folium_results,
)

# Suppress warnings for the rest of the notebook.
# (A `with warnings.catch_warnings():` block only applies inside the block.)
warnings.filterwarnings("ignore")

#### Define Constants

In [None]:
# Define parameters
CLIENT_COUNT = 200
FACILITY_COUNT = 125
SERVICE_RADIUS = 30
P_FACILITIES = 10

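# Load the environment variables from the .env file.
# `__file__` is not defined inside a notebook, so the path is resolved from the
# working directory instead (assumes the notebook is run from its own folder).
ENV_PATH = Path.cwd().resolve().parents[2] / ".env"
load_dotenv(dotenv_path=ENV_PATH)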

# Access the Mapbox API token
MAPBOX_ACCESS_TOKEN = os.getenv("MAPBOX_ACCESS_TOKEN")

#### Load and Clean Data

In [None]:
# Load and Clean Data
# We load and clean the datasets for the analysis. This includes the API data, foot traffic data, and apartment data.
# The data is then cleaned to remove duplicates and ensure consistency.
hilo_all_gdf, foot, large_apartments_NJ = load_and_clean_data(
    api_data_path=f"{DATA_DIR}/hilo_api_data.json",
    foot_traffic_path=f"{DATA_DIR}/foot_traffic.parquet",
    large_apartments_path=f"{DATA_DIR}/large_apartments.geojson",
    small_apartments_path=f"{DATA_DIR}/small_apartments.geojson",
)

#### Perform Clustering

In [None]:
# We perform K-means clustering on the foot traffic data to identify clusters of high-demand areas.
# The clusters are then summarized to calculate the total visit counts for each cluster, which are used as weights in the MCLP analysis.
foot_traffic_gdf = gpd.GeoDataFrame(
    foot, geometry=gpd.points_from_xy(foot.longitude, foot.latitude)
)
foot_traffic_gdf.set_crs(large_apartments_NJ.crs, inplace=True)
foot_traffic_gdf = cluster_foot_traffic(foot_traffic_gdf)

# Summarize clusters
cluster_summary = summarize_clusters(foot_traffic_gdf)
# Scale the raw visit totals so that one weight unit equals 100,000 visits
cluster_summary["weights"] = cluster_summary["total_visit_counts"] / 100000
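
The clustering and weighting steps described above can be sketched in plain Python. This is a minimal illustration of Lloyd's K-means algorithm followed by per-cluster visit totals, not the implementation behind `cluster_foot_traffic` or `summarize_clusters`, whose internals (including the number of clusters) are not shown in this notebook. The helper names `kmeans` and `cluster_weights` and the toy data are hypothetical; only the division by 100,000 mirrors the cell above.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


def cluster_weights(labels, visit_counts, k, scale=100_000):
    """Sum visit counts per cluster and scale them, as in the cell above."""
    totals = [0.0] * k
    for lab, visits in zip(labels, visit_counts):
        totals[lab] += visits
    return [total / scale for total in totals]


# Toy data (hypothetical): two obvious groups of foot-traffic locations.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
visits = [50_000, 200_000, 150_000, 100_000]

labels, centroids = kmeans(points, k=2)
weights = cluster_weights(labels, visits, k=2)
```

Each cluster then acts as a single demand point located at its centroid, with its scaled visit total as the MCLP weight.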

#### Clean Coordinates

In [None]:
# We clean the coordinates of the client points and facility points to ensure all geometries are valid and finite.
client_points = clean_coordinates(cluster_summary[["geometry", "weights"]])
facility_points = clean_coordinates(large_apartments_NJ)[["geometry"]]
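
As a rough illustration of what "valid and finite" means here, the sketch below drops records whose coordinates are missing, NaN, or infinite. It works on plain dictionaries rather than GeoDataFrames, and `keep_finite_points` is a hypothetical stand-in; the actual checks in `clean_coordinates` may differ.

```python
import math


def keep_finite_points(records):
    """Return only the records whose x and y are present and finite."""
    cleaned = []
    for record in records:
        x, y = record["x"], record["y"]
        if x is None or y is None:
            continue  # missing coordinate
        if math.isfinite(x) and math.isfinite(y):
            cleaned.append(record)
    return cleaned


# Toy records (hypothetical): only the first one has usable coordinates.
raw = [
    {"x": -155.09, "y": 19.72, "weights": 2.0},
    {"x": float("nan"), "y": 19.70, "weights": 1.0},
    {"x": -155.08, "y": float("inf"), "weights": 0.5},
    {"x": None, "y": 19.71, "weights": 0.3},
]
cleaned = keep_finite_points(raw)
```

Filtering before building the cost matrix matters because a single NaN distance would otherwise propagate into the coverage test.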

#### Solve MCLP

In [None]:
# We calculate the cost matrix and weights for the client and facility points.
# Using these, we set up and solve the Maximal Covering Location Problem (MCLP)
# to identify the optimal placement of facilities to maximize coverage.
# The results are then printed to show the coverage achieved by the selected facilities.
weights, cost_matrix = calculate_weights_and_cost_matrix(
    client_points, facility_points, SERVICE_RADIUS
)

# Solve MCLP
mclp_result = setup_and_solve_mclp(
    cost_matrix, weights, SERVICE_RADIUS, P_FACILITIES
)
print_coverage_results(mclp_result)
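
To make the coverage logic concrete, the following sketch solves a tiny MCLP instance exactly by trying every combination of `p` facilities. It is an illustration only: `setup_and_solve_mclp` presumably hands the problem to an optimization solver, and the function name `solve_mclp_brute_force` and the toy matrix are hypothetical.

```python
from itertools import combinations


def solve_mclp_brute_force(cost_matrix, weights, service_radius, p):
    """Return (best_sites, covered_weight) by exhaustive search.

    cost_matrix[i][j] is the cost from client i to candidate facility j.
    A client counts as covered if any chosen facility is within the radius.
    """
    n_clients = len(cost_matrix)
    n_facilities = len(cost_matrix[0])
    best_sites, best_covered = (), -1.0
    for sites in combinations(range(n_facilities), p):
        covered = sum(
            weights[i]
            for i in range(n_clients)
            if any(cost_matrix[i][j] <= service_radius for j in sites)
        )
        if covered > best_covered:
            best_sites, best_covered = sites, covered
    return best_sites, best_covered


# Toy instance (hypothetical): 4 clients, 3 candidate sites.
cost_matrix = [
    [10, 50, 40],
    [20, 45, 60],
    [70, 25, 35],
    [80, 90, 15],
]
weights = [3.0, 1.0, 2.0, 4.0]

sites, covered = solve_mclp_brute_force(
    cost_matrix, weights, service_radius=30, p=2
)
coverage_pct = 100 * covered / sum(weights)
```

Exhaustive search is only workable for toy inputs. With 125 candidate sites and 10 facilities there are far too many combinations to enumerate, which is why an integer programming solver is the usual choice at that scale.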

#### Calculate lattice extent and create network

In [None]:
# Create Network with Lattice and Snap Observations
# We create a network with a regular lattice based on the extent of the client points.
# The client and facility points are then snapped to this network for visualization and further analysis.
minx, miny, maxx, maxy = calculate_lattice_extent(client_points)
ntw = create_network_with_lattice(minx, miny, maxx, maxy)

# Snap observations to network
clients_snapped, facilities_snapped = snap_observations_to_network(
    ntw, client_points, facility_points
)

# Visualize results
streets = spaghetti.element_as_gdf(ntw, arcs=True)
visualize_results(clients_snapped, facilities_snapped, streets)
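
The two geometric ideas in this cell, taking the extent of the client points and snapping points onto a regular lattice, can be sketched without any GIS dependencies. The functions `bounding_box` and `snap_to_lattice` are hypothetical and simplified; `calculate_lattice_extent`, `create_network_with_lattice`, and `snap_observations_to_network` may handle padding, cell counts, and snapping differently.

```python
def bounding_box(points, padding=0.0):
    """Return (minx, miny, maxx, maxy) for (x, y) pairs, grown by padding."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (
        min(xs) - padding,
        min(ys) - padding,
        max(xs) + padding,
        max(ys) + padding,
    )


def snap_to_lattice(point, bounds, cells):
    """Move a point inside bounds onto the nearest line of a regular lattice.

    The lattice has `cells` cells per side, so its lines are evenly spaced
    between the bounds on each axis.
    """
    minx, miny, maxx, maxy = bounds
    step_x = (maxx - minx) / cells
    step_y = (maxy - miny) / cells
    x, y = point
    nearest_x = minx + round((x - minx) / step_x) * step_x
    nearest_y = miny + round((y - miny) / step_y) * step_y
    # Keep whichever move is shorter: onto a vertical or a horizontal line.
    if abs(nearest_x - x) <= abs(nearest_y - y):
        return (nearest_x, y)
    return (x, nearest_y)


# Toy points (hypothetical) spanning a 10 x 10 extent.
bounds = bounding_box([(0.0, 0.0), (10.0, 4.0), (6.0, 10.0)])
snapped = snap_to_lattice((3.2, 4.9), bounds, cells=5)
```

In the example the lattice lines are 2 units apart, so the point moves 0.8 units horizontally onto the line `x = 4.0` rather than 0.9 units vertically onto `y = 4.0`.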

#### Generate Mapbox Cost Matrix

In [None]:
# We generate the cost matrix using the Mapbox API. This allows for a more accurate representation of travel distances between points.
cost_matrix = generate_mapbox_cost_matrix(
    clients_snapped,
    facilities_snapped,
    mapbox_access_token=MAPBOX_ACCESS_TOKEN,
)
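
For orientation, the sketch below builds a request URL in the shape used by the public Mapbox Matrix API, without sending it. It is an assumption about how a helper like `generate_mapbox_cost_matrix` might talk to Mapbox, not a description of its code; `build_matrix_url`, the `walking` profile, and the placeholder token are all hypothetical choices.

```python
from urllib.parse import urlencode

# Base path of the Mapbox Matrix API (profile and coordinates are appended).
MATRIX_ENDPOINT = "https://api.mapbox.com/directions-matrix/v1/mapbox"


def build_matrix_url(sources, destinations, token, profile="walking"):
    """Build a Matrix API URL for (lon, lat) sources and destinations.

    All coordinates go into one semicolon-separated list; the `sources` and
    `destinations` parameters then refer to positions in that list.
    """
    coords = list(sources) + list(destinations)
    coord_str = ";".join(f"{lon},{lat}" for lon, lat in coords)
    params = {
        "sources": ";".join(str(i) for i in range(len(sources))),
        "destinations": ";".join(
            str(i) for i in range(len(sources), len(coords))
        ),
        "annotations": "distance,duration",
        "access_token": token,
    }
    # Keep ';' and ',' unescaped so the index lists stay readable.
    query = urlencode(params, safe=";,")
    return f"{MATRIX_ENDPOINT}/{profile}/{coord_str}?{query}"


# Toy coordinates near Hilo (hypothetical) and a placeholder token.
url = build_matrix_url(
    sources=[(-155.09, 19.72)],
    destinations=[(-155.08, 19.71), (-155.07, 19.70)],
    token="YOUR_TOKEN",
)
```

Two practical points are worth checking against the Mapbox documentation. The Matrix API caps the number of coordinates per request, so a 200 by 125 matrix would have to be assembled from many smaller requests. It also reports distances in meters and durations in seconds, so any service radius compared against this matrix needs to be in the same unit.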

#### Perform Grid Search on Service Radius

In [None]:
# We sweep the service radius over a range of values and record the coverage achieved at each one.
# This helps to choose a suitable service radius for the MCLP analysis.
coverage_results = perform_parameter_sweep_on_service_radius(
    cost_matrix, weights, P_FACILITIES
)

# Print the coverage rates for each service radius
for service_radius, coverage in coverage_results:
    print(f"Service Radius: {service_radius} units, Coverage: {coverage}%")
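
A sweep of this kind can be sketched as a loop that re-solves the problem once per candidate radius. The version below uses exhaustive search on a toy instance, so it only illustrates the shape of the output: a list of `(service_radius, coverage_percent)` pairs. The helper names and the radii tried are hypothetical, and `perform_parameter_sweep_on_service_radius` may choose its radii and solver differently.

```python
from itertools import combinations


def best_coverage_pct(cost_matrix, weights, radius, p):
    """Best achievable coverage (percent of total weight) with p facilities."""
    n_facilities = len(cost_matrix[0])
    best = 0.0
    for sites in combinations(range(n_facilities), p):
        covered = sum(
            w
            for row, w in zip(cost_matrix, weights)
            if any(row[j] <= radius for j in sites)
        )
        best = max(best, covered)
    return 100 * best / sum(weights)


def sweep_service_radius(cost_matrix, weights, p, radii):
    """Return one (radius, coverage_percent) pair per radius tried."""
    return [
        (radius, best_coverage_pct(cost_matrix, weights, radius, p))
        for radius in radii
    ]


# Toy instance (hypothetical): 4 clients, 3 candidate sites, 1 facility.
cost_matrix = [
    [10, 50, 40],
    [20, 45, 60],
    [70, 25, 35],
    [80, 90, 15],
]
weights = [3.0, 1.0, 2.0, 4.0]

results = sweep_service_radius(cost_matrix, weights, p=1, radii=[10, 20, 40])
```

For a fixed number of facilities, coverage can only stay the same or grow as the radius increases, so the sweep is most useful for spotting where the curve flattens.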

#### Find and Visualize the Best Result on a Folium Map

In [None]:
# We find the best result from the grid search and visualize it on an interactive Folium map.
# This provides a clear and interactive way to understand the optimal bin placement locations.
best_service_radius, best_coverage = max(coverage_results, key=lambda x: x[1])

mclp_result = setup_and_solve_mclp(
    cost_matrix, weights, best_service_radius, P_FACILITIES
)
demand_coords = [(point.x, point.y) for point in clients_snapped.geometry]
facility_coords = [(point.x, point.y) for point in facilities_snapped.geometry]
visualize_folium_results(
    demand_coords,
    weights,
    facility_coords,
    mclp_result,
    "Hilo_All_apartments.html",
)

print(
    f"Best Service Radius: {best_service_radius} units, Coverage:"
    f" {best_coverage}%"
)
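
Because coverage never decreases as the radius grows, picking the maximum coverage tends to favor the largest radii tested. One alternative selection rule, offered here as an assumption rather than something this notebook does, is to take the smallest radius that reaches a target coverage. The function name and the sample results below are hypothetical.

```python
def smallest_radius_meeting_target(coverage_results, target_pct):
    """Return the (radius, coverage) pair with the smallest radius that
    reaches target_pct, or None if no radius reaches it."""
    feasible = [(r, c) for r, c in coverage_results if c >= target_pct]
    if not feasible:
        return None
    return min(feasible, key=lambda pair: pair[0])


# Sample sweep output (hypothetical): coverage plateaus at 90%.
coverage_results = [(10, 30.0), (20, 40.0), (40, 90.0), (60, 90.0)]

# Rule used in the cell above: highest coverage (first one wins on ties).
by_max = max(coverage_results, key=lambda x: x[1])

# Alternative rule: smallest radius that covers at least 85% of demand.
by_target = smallest_radius_meeting_target(coverage_results, 85.0)
```

On this sample both rules agree on a radius of 40, but they diverge once the target is set below the plateau, for example a 35% target selects the radius of 20.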

#### Conclusions
- The Maximal Covering Location Problem (MCLP) was used to identify optimal bin placement locations in Hilo, Hawaii.
- Clustering the foot traffic data identified high-demand areas, and each cluster's total visit count supplied its weight in the MCLP.
- Using both a locally computed cost matrix and a Mapbox API cost matrix allows a quick first solve followed by one based on more accurate travel distances.
- The results can inform recycling bin placement strategies that aim to maximize coverage.

