# Tutorial 6: Multi-Day Simulation Testing

**WSmart+ Route Tutorial Series**

This tutorial covers multi-day waste collection simulation for comparing routing policies. You'll learn:

1. **Bins**: waste accumulation and overflow dynamics
2. **Simulation concepts**: daily fill-collect-log cycle
3. **Routing policies**: regular, threshold, greedy, and priority heuristics, plus neural and classical solvers via the CLI
4. **Running simulations** and collecting metrics
5. **Analyzing results** with visualizations

**Previous**: [05_evaluation_and_decoding.ipynb](05_evaluation_and_decoding.ipynb) | **Next**: [07_extending_the_codebase.ipynb](07_extending_the_codebase.ipynb)

> **Note**: This tutorial uses synthetic data to demonstrate the simulation framework. Real-world simulations use empirical waste data from sensor-equipped bins.

In [None]:
import os
import sys
import warnings

warnings.filterwarnings("ignore")

PROJECT_ROOT = os.path.abspath(os.path.join(os.getcwd(), "..", ".."))
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch

torch.manual_seed(42)
np.random.seed(42)
print(f"Project root: {PROJECT_ROOT}")

---
## 1. Understanding Waste Bins

In real-world waste collection, bins fill up over time. Each day, citizens deposit waste, and the municipality must decide which bins to collect. The challenge: collect bins efficiently before they overflow, while minimizing travel costs.

### Key Concepts

| Concept | Description |
|---------|-------------|
| **Fill level** | Current waste level (0-100%) |
| **Overflow** | Fill level would exceed 100%; the excess waste is lost |
| **Collection** | Emptying a bin (reset to 0%) |
| **Revenue** | Income from collected waste (per kg) |
| **Expenses** | Cost of travel (per km) |
| **Profit** | Revenue - Expenses |
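
As a worked example of the last three rows, the sketch below computes profit for one collection day. The rates `REVENUE_PER_KG` and `COST_PER_KM` and the helper `daily_profit` are hypothetical, chosen only for illustration; they are not WSmart+ Route defaults.

```python
# Hypothetical rates, chosen only for illustration.
REVENUE_PER_KG = 0.25  # income per kg of collected waste
COST_PER_KM = 1.0      # travel cost per km driven


def daily_profit(waste_kg, distance_km,
                 revenue_per_kg=REVENUE_PER_KG, cost_per_km=COST_PER_KM):
    """Profit = revenue from collected waste - travel expenses."""
    revenue = waste_kg * revenue_per_kg
    expenses = distance_km * cost_per_km
    return revenue - expenses


# Same waste collected, two different tour lengths
profit_long = daily_profit(waste_kg=500.0, distance_km=40.0)
profit_short = daily_profit(waste_kg=500.0, distance_km=25.0)
print(f"Long tour profit:  {profit_long:.2f}")
print(f"Short tour profit: {profit_short:.2f}")
```

With the revenue fixed, the shorter tour is more profitable, which is why routing quality matters even when the set of collected bins is the same.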

In [None]:
# Simulate bin filling over 15 days with no collection (simplified)
n_bins = 10
n_days = 15

np.random.seed(42)

# Generate daily fill rates from gamma distribution (realistic waste patterns)
alpha, beta = 5.0, 10.0  # Shape and rate (np.random.gamma takes scale = 1 / beta)
daily_fills = np.random.gamma(alpha, 1.0 / beta, size=(n_days, n_bins)) * 100  # Scale to percentage

# Simulate without collection (to see overflow dynamics)
fill_levels = np.zeros((n_days + 1, n_bins))
overflows = np.zeros((n_days, n_bins))

for day in range(n_days):
    new_level = fill_levels[day] + daily_fills[day]
    overflows[day] = np.maximum(new_level - 100, 0)
    fill_levels[day + 1] = np.minimum(new_level, 100)

print("Bin Dynamics (no collection):")
print(f"  Bins: {n_bins}")
print(f"  Days simulated: {n_days}")
print(f"  Mean daily fill rate: {daily_fills.mean():.1f}%")
print(f"  Total overflow events: {(overflows > 0).sum()}")
print(f"  Total waste lost to overflow: {overflows.sum():.1f}% equivalent")
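
The gamma parameters in the cell above determine how quickly bins fill. With shape `alpha` and rate `beta`, the distribution has mean `alpha / beta` and standard deviation `sqrt(alpha) / beta`. The sketch below checks those moments against a large sample; the names `mean_fill`, `std_fill`, and `days_to_full` are introduced here for illustration, and a local generator is used so the global NumPy seed is left untouched.

```python
import numpy as np

alpha, beta = 5.0, 10.0  # shape and rate, matching the simulation cell

# Analytic moments of Gamma(shape=alpha, scale=1/beta), scaled to percent
mean_fill = 100 * alpha / beta          # expected daily fill increment (%)
std_fill = 100 * np.sqrt(alpha) / beta  # day-to-day spread (%)
days_to_full = 100 / mean_fill          # expected days for an empty bin to fill

# Empirical check with a large sample from a local generator
rng = np.random.default_rng(0)
sample = rng.gamma(alpha, 1.0 / beta, size=100_000) * 100

print(f"Analytic mean/std:  {mean_fill:.1f}% / {std_fill:.1f}%")
print(f"Empirical mean/std: {sample.mean():.1f}% / {sample.std():.1f}%")
print(f"Expected days until an empty bin is full: {days_to_full:.1f}")
```

Under these parameters an uncollected bin is expected to fill in about two days, which explains how quickly overflows appear in the no-collection run.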

In [None]:
fig, axes = plt.subplots(2, 1, figsize=(12, 8), gridspec_kw={"height_ratios": [3, 1]})

# Top: Fill levels over time
ax = axes[0]
for i in range(min(5, n_bins)):
    ax.plot(range(n_days + 1), fill_levels[:, i], linewidth=1.5, label=f"Bin {i+1}")
ax.axhline(y=100, color="red", linestyle="--", linewidth=1, alpha=0.7, label="Overflow threshold")
ax.set_xlabel("Day")
ax.set_ylabel("Fill Level (%)")
ax.set_title("Bin Fill Levels Over Time (No Collection)")
ax.legend(bbox_to_anchor=(1.05, 1), loc="upper left", fontsize=9)
ax.grid(True, alpha=0.3)

# Bottom: Daily overflow events
ax = axes[1]
daily_overflow_count = (overflows > 0).sum(axis=1)
ax.bar(range(n_days), daily_overflow_count, color="red", alpha=0.6, edgecolor="darkred")
ax.set_xlabel("Day")
ax.set_ylabel("Bins Overflowing")
ax.set_title("Daily Overflow Events")
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

---
## 2. Collection Policies

A **routing policy** decides which bins to collect each day and in what order. Different policies trade off between:
- **Collection coverage**: How many bins to empty
- **Travel efficiency**: How to minimize distance
- **Overflow prevention**: Prioritizing nearly-full bins
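
To see why visiting order matters as well as bin selection, the sketch below compares two orderings of the same three bins. The layout (`demo_coords`) and the helper `tour_length` are hypothetical and exist only for this illustration; they follow the convention used in this tutorial that bins are 0-based and the depot is node 0 of the distance matrix.

```python
import numpy as np

# Hypothetical layout: depot at the origin, three bins along a line.
demo_coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
demo_dist = np.sqrt(((demo_coords[:, None] - demo_coords[None, :]) ** 2).sum(axis=-1))


def tour_length(bin_order, dist):
    """Length of depot -> bins in the given order -> depot.

    Bins are 0-based; bin b is row/column b + 1 of dist (row 0 is the depot).
    """
    route = [0] + [b + 1 for b in bin_order] + [0]
    return sum(dist[a, b] for a, b in zip(route[:-1], route[1:]))


in_order = tour_length([0, 1, 2], demo_dist)  # visit bins left to right
zigzag = tour_length([1, 0, 2], demo_dist)    # same bins, worse order
print(f"In-order tour length: {in_order:.1f}")
print(f"Zigzag tour length:   {zigzag:.1f}")
```

Both tours collect the same waste, but the zigzag order travels farther, so a policy that returns bins in an arbitrary order pays for it in distance.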

In [None]:
def simulate_policy(daily_fills, distance_matrix, policy_fn, n_days=None):
    """Run a multi-day simulation with a given collection policy.
    
    Args:
        daily_fills: (n_days, n_bins) array of daily fill increments
        distance_matrix: (n_bins+1, n_bins+1) distance matrix (index 0 = depot)
        policy_fn: function(fill_levels, distance_matrix) -> list of 0-based bin indices to collect, in visiting order
        n_days: number of days to simulate (defaults to all days in daily_fills)
    
    Returns:
        Dictionary with simulation metrics
    """
    if n_days is None:
        n_days = len(daily_fills)
    n_bins = daily_fills.shape[1]
    
    fill_levels = np.zeros(n_bins)
    metrics = {
        "fill_history": [],
        "collections": [],
        "overflows": [],
        "distance": [],
        "waste_collected": [],
        "waste_lost": [],
    }
    
    for day in range(n_days):
        # Step 1: Fill bins
        fill_levels += daily_fills[day]
        
        # Step 2: Record overflow BEFORE collection
        overflow = np.maximum(fill_levels - 100, 0)
        waste_lost = overflow.sum()
        fill_levels = np.minimum(fill_levels, 100)
        
        # Step 3: Decide which bins to collect
        tour = policy_fn(fill_levels.copy(), distance_matrix)
        
        # Step 4: Calculate tour distance
        if len(tour) > 0:
            # Policies return 0-based bin indices; bin b is node b + 1 (node 0 is the depot)
            route = [0] + [b + 1 for b in tour] + [0]  # Start and end at depot
            total_dist = sum(distance_matrix[route[i], route[i+1]] for i in range(len(route)-1))
        else:
            total_dist = 0.0
        
        # Step 5: Collect waste
        waste_collected = fill_levels[tour].sum() if len(tour) > 0 else 0.0
        if len(tour) > 0:
            fill_levels[tour] = 0.0
        
        # Record metrics
        metrics["fill_history"].append(fill_levels.copy())
        metrics["collections"].append(len(tour))
        metrics["overflows"].append(int((overflow > 0).sum()))
        metrics["distance"].append(total_dist)
        metrics["waste_collected"].append(waste_collected)
        metrics["waste_lost"].append(waste_lost)
    
    return metrics


# Generate a Euclidean distance matrix from random 2D coordinates (symmetric, zero diagonal)
n_bins = 10
np.random.seed(42)
coords = np.random.rand(n_bins + 1, 2)  # +1 for depot at index 0
dist_matrix = np.sqrt(((coords[:, None] - coords[None, :]) ** 2).sum(axis=-1))

In [None]:
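def regular_policy(fill_levels, distance_matrix, interval=3):
    """Collect all bins every `interval` days.

    The day counter is stored on the function object, so reset
    `regular_policy._day_count = 0` before each new simulation.
    """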
    regular_policy._day_count = getattr(regular_policy, "_day_count", 0) + 1
    if regular_policy._day_count % interval == 0:
        return list(range(len(fill_levels)))
    return []


def threshold_policy(fill_levels, distance_matrix, threshold=70.0):
    """Collect bins that exceed a fill threshold."""
    return list(np.where(fill_levels >= threshold)[0])


def greedy_nearest_policy(fill_levels, distance_matrix, threshold=50.0):
    """Collect high-fill bins using nearest-neighbor ordering."""
    candidates = np.where(fill_levels >= threshold)[0]
    if len(candidates) == 0:
        return []
    
    # Nearest-neighbor ordering from depot
    tour = []
    remaining = set(candidates.tolist())
    current = 0  # Start at depot
    
    while remaining:
        # Add 1 to index since distance_matrix includes depot at 0
        distances = {b: distance_matrix[current, b + 1] for b in remaining}
        nearest = min(distances, key=distances.get)
        tour.append(nearest)
        current = nearest + 1
        remaining.remove(nearest)
    
    return tour


def priority_policy(fill_levels, distance_matrix, max_collections=5):
    """Collect the most-full bins first, limited to max per day."""
    sorted_bins = np.argsort(fill_levels)[::-1]
    # Only collect bins above 30% fill
    candidates = [b for b in sorted_bins if fill_levels[b] > 30.0]
    return candidates[:max_collections]

In [None]:
# Reset regular policy counter
regular_policy._day_count = 0

# Generate more days of data
np.random.seed(42)
n_days = 20
daily_fills = np.random.gamma(5.0, 1.0 / 10.0, size=(n_days, n_bins)) * 100

policies = {
    "Regular (every 3 days)": lambda fl, dm: regular_policy(fl, dm, interval=3),
    "Threshold (70%)": lambda fl, dm: threshold_policy(fl, dm, threshold=70.0),
    "Greedy Nearest (50%)": lambda fl, dm: greedy_nearest_policy(fl, dm, threshold=50.0),
    "Priority (top 5)": lambda fl, dm: priority_policy(fl, dm, max_collections=5),
}

results = {}
for name, policy_fn in policies.items():
    # Reset state for regular policy
    if "Regular" in name:
        regular_policy._day_count = 0
    results[name] = simulate_policy(daily_fills, dist_matrix, policy_fn, n_days)

print("Simulation Complete!")
print(f"{'Policy':<25} {'Total Collected':>15} {'Total Lost':>12} {'Total Dist':>12} {'Overflows':>10}")
print("-" * 78)
for name, m in results.items():
    print(f"{name:<25} {sum(m['waste_collected']):>15.1f} {sum(m['waste_lost']):>12.1f} "
          f"{sum(m['distance']):>12.2f} {sum(m['overflows']):>10d}")

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# 1. Cumulative waste collected
ax = axes[0, 0]
for name, m in results.items():
    ax.plot(np.cumsum(m["waste_collected"]), linewidth=2, label=name)
ax.set_xlabel("Day")
ax.set_ylabel("Cumulative Waste Collected (%)")
ax.set_title("Waste Collection Over Time")
ax.legend(fontsize=8)
ax.grid(True, alpha=0.3)

# 2. Cumulative overflow
ax = axes[0, 1]
for name, m in results.items():
    ax.plot(np.cumsum(m["overflows"]), linewidth=2, label=name)
ax.set_xlabel("Day")
ax.set_ylabel("Cumulative Overflow Events")
ax.set_title("Overflow Events Over Time")
ax.legend(fontsize=8)
ax.grid(True, alpha=0.3)

# 3. Daily distance traveled
ax = axes[1, 0]
for name, m in results.items():
    ax.plot(m["distance"], linewidth=1.5, alpha=0.8, label=name)
ax.set_xlabel("Day")
ax.set_ylabel("Distance Traveled")
ax.set_title("Daily Travel Distance")
ax.legend(fontsize=8)
ax.grid(True, alpha=0.3)

# 4. Summary bar chart
ax = axes[1, 1]
policy_names = list(results.keys())
total_collected = [sum(results[n]["waste_collected"]) for n in policy_names]
total_lost = [sum(results[n]["waste_lost"]) for n in policy_names]

x = np.arange(len(policy_names))
width = 0.35
ax.bar(x - width/2, total_collected, width, label="Collected", color="steelblue", alpha=0.8)
ax.bar(x + width/2, total_lost, width, label="Lost (overflow)", color="red", alpha=0.6)
ax.set_xticks(x)
ax.set_xticklabels([n.split("(")[0].strip() for n in policy_names], rotation=15, ha="right")
ax.set_ylabel("Total Waste (%)")
ax.set_title("Collection vs Overflow")
ax.legend()
ax.grid(True, alpha=0.3, axis="y")

plt.tight_layout()
plt.show()

---
## 3. WSmart+ Route Simulation Framework

The WSmart+ Route simulator provides a comprehensive framework for multi-day simulation testing with:

- **Bins class**: Manages bin state with stochastic/empirical filling
- **Distance matrices**: Real-world road network distances
- **Policy adapters**: Unified interface for all routing strategies
- **Parallel execution**: Multi-process simulation across seeds
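
The idea behind running across seeds can be sketched without the framework: repeat the same simulation with different random seeds and report the spread of a metric. The code below is a minimal stand-in, not the WSmart+ Route API; `run_one_seed` is a hypothetical helper that reuses the synthetic gamma fills and a 70% threshold policy from this tutorial, and it runs the seeds sequentially.

```python
import numpy as np


def run_one_seed(seed, n_days=20, n_bins=10, threshold=70.0):
    """Simulate one replicate of a threshold policy; return total overflow events."""
    rng = np.random.default_rng(seed)
    fills = rng.gamma(5.0, 0.1, size=(n_days, n_bins)) * 100
    levels = np.zeros(n_bins)
    overflow_events = 0
    for day in range(n_days):
        levels += fills[day]                       # bins fill
        overflow_events += int((levels > 100).sum())
        levels = np.minimum(levels, 100)           # excess waste is lost
        levels[levels >= threshold] = 0.0          # collect bins at or above threshold
    return overflow_events


seeds = [0, 1, 2, 3, 4]
totals = np.array([run_one_seed(s) for s in seeds])
print(f"Overflow events over {len(seeds)} seeds: "
      f"mean={totals.mean():.1f}, std={totals.std():.1f}")
```

Because each call to `run_one_seed` depends only on its seed, the list comprehension could be replaced by a map over a process pool (for example `concurrent.futures.ProcessPoolExecutor`) without changing the results.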

### CLI Usage

```bash
# Run simulation with multiple policies
python main.py test_sim --policies regular alns hgs --days 31 --size 50

# With neural agent
python main.py test_sim --policies neural regular --days 31 --model weights/best.pt
```

---
## 4. Results Analysis

Let's create a comprehensive analysis of the simulation results.

In [None]:
# Create summary DataFrame
summary_data = []
for name, m in results.items():
    summary_data.append({
        "Policy": name,
        "Total Collected (%)": sum(m["waste_collected"]),
        "Total Lost (%)": sum(m["waste_lost"]),
        "Total Distance": sum(m["distance"]),
        "Total Overflows": sum(m["overflows"]),
        "Avg Collections/Day": np.mean(m["collections"]),
        "Collection Efficiency": (
            sum(m["waste_collected"]) / max(sum(m["distance"]), 0.001)
        ),
    })

df = pd.DataFrame(summary_data)
df = df.set_index("Policy")
print("Simulation Summary:")
print(df.to_string())

In [None]:
# Radar chart for multi-metric comparison
# Every axis is oriented so that higher is better; overflow and distance are inverted below

fig, ax = plt.subplots(figsize=(8, 8), subplot_kw=dict(polar=True))

categories = ["Collection\nVolume", "Efficiency\n(waste/km)", "Collection\nFrequency",
              "Low\nOverflow", "Low\nDistance"]
n_cats = len(categories)

# Collect raw metric values per policy (min-max normalized to [0, 1] further down)
radar_data = {}
for name in df.index:
    vals = [
        df.loc[name, "Total Collected (%)"],
        df.loc[name, "Collection Efficiency"],
        df.loc[name, "Avg Collections/Day"],
        1.0 / max(df.loc[name, "Total Overflows"] + 1, 1),  # Inverse: fewer overflows = better
        1.0 / max(df.loc[name, "Total Distance"] + 0.1, 0.1),  # Inverse: less distance = better
    ]
    radar_data[name] = vals

# Normalize across policies
all_vals = np.array(list(radar_data.values()))
mins = all_vals.min(axis=0)
maxs = all_vals.max(axis=0)
ranges = maxs - mins
ranges[ranges == 0] = 1  # Avoid division by zero

angles = np.linspace(0, 2 * np.pi, n_cats, endpoint=False).tolist()
angles += angles[:1]

colors = ["steelblue", "coral", "seagreen", "purple"]
for idx, (name, vals) in enumerate(radar_data.items()):
    normalized = (np.array(vals) - mins) / ranges
    values = normalized.tolist()
    values += values[:1]
    
    ax.plot(angles, values, "o-", linewidth=2, label=name.split("(")[0].strip(),
            color=colors[idx], markersize=6)
    ax.fill(angles, values, alpha=0.1, color=colors[idx])

ax.set_xticks(angles[:-1])
ax.set_xticklabels(categories, fontsize=10)
ax.set_ylim(0, 1.1)
ax.legend(loc="upper right", bbox_to_anchor=(1.3, 1.1), fontsize=9)
ax.set_title("Multi-Metric Policy Comparison", fontsize=13, pad=20)

plt.tight_layout()
plt.show()

In [None]:
# Heatmap of end-of-day fill levels for the policy that lost the least waste to overflow
best_policy = min(results.keys(), key=lambda k: sum(results[k]["waste_lost"]))
fill_history = np.array(results[best_policy]["fill_history"])

fig, ax = plt.subplots(figsize=(12, 5))
im = ax.imshow(fill_history.T, aspect="auto", cmap="YlOrRd", vmin=0, vmax=100)
ax.set_xlabel("Day", fontsize=12)
ax.set_ylabel("Bin Index", fontsize=12)
ax.set_title(f"Fill Levels Over Time - {best_policy}", fontsize=13)
plt.colorbar(im, ax=ax, label="Fill Level (%)", shrink=0.8)
ax.set_yticks(range(n_bins))
ax.set_yticklabels([f"Bin {i+1}" for i in range(n_bins)])

plt.tight_layout()
plt.show()

---
## 5. Key Insights

### Policy Trade-offs

- **Regular collection** is simple but wasteful: it collects every bin regardless of fill level, and fast-filling bins can overflow between collection days
- **Threshold-based** policies are reactive: they wait until bins are nearly full, which risks overflow
- **Nearest-neighbor greedy** policies balance fill-level awareness with travel efficiency
- **Priority-based** policies focus on the most urgent bins but visit them in fill order, ignoring travel distance
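
The gap between reactive and proactive collection can be illustrated with a simple hand-written heuristic. The sketch below is not one of the policies simulated above and is not a neural policy; `lookahead_policy` and its `expected_daily_fill` argument are hypothetical. It collects any bin whose projected fill after one more day would exceed 100%, using a per-bin estimate of the daily fill rate.

```python
import numpy as np


def lookahead_policy(fill_levels, distance_matrix, expected_daily_fill=50.0):
    """Collect bins projected to overflow after one more day of filling.

    expected_daily_fill may be a scalar or a per-bin array (in %).
    distance_matrix is unused; it is kept so the signature matches the
    other policies in this tutorial.
    """
    projected = np.asarray(fill_levels) + expected_daily_fill
    return np.where(projected > 100.0)[0].tolist()


levels = np.array([20.0, 55.0, 80.0, 49.0])
rates = np.array([30.0, 50.0, 10.0, 60.0])  # hypothetical per-bin fill rates
selected = lookahead_policy(levels, None, expected_daily_fill=rates)
print(f"Bins selected by lookahead: {selected}")
```

Here the bin at 80% is skipped because it fills slowly, while the bin at 49% is collected because it fills fast. A fixed 70% threshold would make the opposite choice for both bins.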

### Why Neural Policies?

Neural routing policies (trained in Tutorial 4) can learn to:
- **Predict** which bins will overflow soon
- **Optimize** routes for both collection urgency and travel efficiency
- **Generalize** across different demand patterns and problem sizes
- Make decisions in **milliseconds** (vs. seconds/minutes for classical solvers)

---
## Summary

In this tutorial, you learned:

- **Bin dynamics**: Waste accumulates daily following stochastic patterns; overflows cause waste loss
- **Collection policies** make daily decisions about which bins to collect and in what order
- **Simulation framework** enables multi-day testing with various policies and metrics
- **Key metrics**: waste collected, overflow events, travel distance, collection efficiency
- **Trade-offs**: Every policy balances collection coverage, travel cost, and overflow prevention
- **Neural policies** can learn to outperform hand-crafted heuristics through RL training

### Full Tutorial Series

1. **[Data Generation](01_data_generation.ipynb)** - Creating problem instances and datasets
2. **[Environments](02_environments.ipynb)** - RL environment abstraction and state management
3. **[Models & Policies](03_models_and_policies.ipynb)** - Neural and classical routing policies
4. **[Training](04_training_with_lightning.ipynb)** - RL training with PyTorch Lightning
5. **[Evaluation](05_evaluation_and_decoding.ipynb)** - Decoding strategies and metrics
6. **[Simulation](06_simulation_testing.ipynb)** - Multi-day simulation testing (this tutorial)

For production-scale experiments, use the CLI:
```bash
python main.py test_sim --policies regular neural alns hgs --days 31 --size 50
```