## Preprocess the Sim2Real dataset

The Sim2Real dataset stores wildfire masks as jpeg images. This representation is very lightweight but slow to work with.
To benchmark faster, we convert the images to binary NumPy files. These files are ~100 times larger and the conversion takes some upfront time, but once they are created and stored, operations on them run 50--100x faster.

1. Download the dataset from the [Sim2Real-Fire GitHub repository](https://github.com/TJU-IDVLab/Sim2Real-Fire).
2. Extract all files from the dataset into your `Dataset` folder.
3. Run the preprocessing function to convert the jpeg representation to the NumPy one and to compute the burn maps.
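
To make the size/speed trade-off concrete, here is a minimal sketch that stores a synthetic stack of binary masks as an uncompressed `.npy` file and reads it back. The `(time, rows, cols)` layout, the `uint8` dtype and the file name are assumptions made for illustration; this is not the repository's preprocessing code.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for one scenario: 20 time steps of a 64 x 64 binary fire mask.
# The (time, rows, cols) layout and the uint8 dtype are assumptions for this sketch.
rng = np.random.default_rng(0)
masks = (rng.random((20, 64, 64)) > 0.9).astype(np.uint8)

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "scenario.npy")
    np.save(npy_path, masks)  # uncompressed: one byte per pixel plus a small header

    file_size = os.path.getsize(npy_path)
    loaded = np.load(npy_path)  # raw array read, no image decoding step

print(f"{masks.nbytes} bytes of pixels, {file_size} bytes on disk")
```

Because the file holds the raw array, it is at least as large as the pixel data itself, which is where the extra disk usage comes from.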

In [1]:
pip install -r requirements.txt


[notice] A new release of pip is available: 24.3.1 -> 25.0.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


In [3]:
# Import required modules
import sys
import os
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Add code to path
module_path = os.path.join(os.path.abspath("."), "code")
if module_path not in sys.path:
    sys.path.append(module_path)


from dataset import preprocess_sim2real_dataset, load_scenario_npy, compute_and_save_burn_maps_sim2real_dataset
from wrappers import wrap_log_sensor_strategy, wrap_log_drone_strategy
from Strategy import RandomDroneRoutingStrategy, return_no_custom_parameters, SensorPlacementOptimization, RandomSensorPlacementStrategy, LoggedOptimizationSensorPlacementStrategy, DroneRoutingOptimizationSlow, DroneRoutingOptimizationModelReuse, DroneRoutingOptimizationModelReuseIndex, LoggedDroneRoutingStrategy, LogWrapperDrone, LogWrapperSensor
from benchmark import run_benchmark_scenario, run_benchmark_scenarii_sequential, get_burnmap_parameters, run_benchmark_scenarii_sequential_precompute, benchmark_on_sim2real_dataset_precompute
from displays import create_scenario_video
from new_clustering import get_wrapped_strategy

Run the cell below to preprocess the dataset.
- `n_max_scenarii_per_layout` controls the number of scenarios we convert from jpeg to NumPy files for each layout.
- If executed on all ~50 layouts of the dataset, the code below takes about 15 minutes to run and generates **400 GB** of data. Make sure you have enough free disk space.
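
Before launching the full conversion, it can help to check the free space on the target drive. The sketch below uses the standard-library `shutil.disk_usage`; the helper name `has_enough_disk_space` is hypothetical and not part of the repository.

```python
import shutil


def has_enough_disk_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem containing `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1e9


# 400 GB is the figure quoted above for the full dataset; lower it for a subset.
if not has_enough_disk_space(".", required_gb=400):
    print("Less than 400 GB free: consider lowering n_max_scenarii_per_layout.")
```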

In [None]:
preprocess_sim2real_dataset("MinimalDataset/", n_max_scenarii_per_layout=100) 

Converting JPG scenarios to NPY for MinimalDataset/
Computing burn maps...
Computing burn map for MinimalDataset/0002/scenarii/ and files with extension .npy


100%|██████████| 5/5 [00:00<00:00, 31.16it/s]


Computing burn map for MinimalDataset/0001/scenarii/ and files with extension .npy


100%|██████████| 5/5 [00:00<00:00, 50.21it/s]


We can now run the benchmark function on any scenario.
1. We load the scenario using `load_scenario_npy`, since it is in the preprocessed `npy` format.
2. We run the benchmark using `run_benchmark_scenario`, which takes as input:
    - The `scenario`
    - The sensor placement strategy and the drone routing strategy
    - A dictionary `custom_initialization_parameters` that contains any custom initialization inputs for your strategy functions (such as the burn map, if your strategy needs one as input)
    - A Python function `custom_step_parameters_function` that returns a dictionary of custom inputs for your routing function. The benchmarking code executes this function at each time step and internally calls your routing function as `routing(custom_step_parameters_function())`
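
The per-step calling convention described in the last bullet can be sketched with a toy loop. Everything below (`run_toy_loop`, `no_custom_parameters`, `toy_routing` and the placeholder action) is hypothetical and only mirrors the `routing(custom_step_parameters_function())` pattern; the real benchmark loop lives in the repository's `benchmark` module and may differ.

```python
from typing import Callable


def run_toy_loop(
    routing: Callable[[dict], list],
    custom_step_parameters_function: Callable[[], dict],
    n_steps: int,
) -> list:
    """Call the routing function once per time step with freshly computed custom inputs."""
    history = []
    for _ in range(n_steps):
        step_inputs = custom_step_parameters_function()
        history.append(routing(step_inputs))
    return history


def no_custom_parameters() -> dict:
    # Hypothetical stand-in for `return_no_custom_parameters`: nothing extra to pass.
    return {}


def toy_routing(custom_step_parameters: dict) -> list:
    # A strategy that ignores its inputs and returns one placeholder action.
    return [("fly", (0, 0))]


actions = run_toy_loop(toy_routing, no_custom_parameters, n_steps=3)
print(len(actions))
```

Passing a function rather than a fixed dictionary lets the custom inputs change from one time step to the next.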

In [None]:
# change values here to change benchmarking parameters
def my_automatic_layout_parameters(scenario:np.ndarray):
    return {
        "N": scenario.shape[1],
        "M": scenario.shape[2],
        "max_battery_distance": -10,
        "max_battery_time": 20,
        "n_drones": 10,
        "n_ground_stations": 1,
        "n_charging_stations": 5,
    }
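
The function above reads the grid size from axes 1 and 2 of the scenario array, which suggests that axis 0 indexes time steps. A quick check on a dummy array, with that time-first layout stated as an assumption and `grid_size_from_scenario` being a hypothetical helper:

```python
import numpy as np


def grid_size_from_scenario(scenario: np.ndarray) -> dict:
    """Read the grid dimensions from axes 1 and 2, as the cell above does."""
    return {"N": scenario.shape[1], "M": scenario.shape[2]}


# Dummy scenario: 12 time steps on a 64 x 48 grid (time-first layout is an assumption).
dummy_scenario = np.zeros((12, 64, 48), dtype=np.uint8)
params = grid_size_from_scenario(dummy_scenario)
print(params)
```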

In [None]:
# That's very fast to run!
print("starting benchmark")
time_start = time.time()
scenario = load_scenario_npy("MinimalDataset/0001/scenarii/0001_00002.npy")
results, (position_history, ground, charging)  = run_benchmark_scenario(scenario, SensorPlacementOptimization, DroneRoutingOptimizationModelReuseIndex, custom_initialization_parameters = {"burnmap_filename": "./MinimalDataset/0001/burn_map.npy", "load_from_logfile": False, "reevaluation_step": 5, "optimization_horizon":5, "strategy_drone": DroneRoutingOptimizationModelReuseIndex, "strategy_sensor": RandomSensorPlacementStrategy}, custom_step_parameters_function = return_no_custom_parameters, automatic_initialization_parameters_function=my_automatic_layout_parameters, return_history=True)
print(results)
print(f"Time taken to run benchmark on the scenario: {time.time() - time_start} seconds")




In [None]:
def make_clustered_drone_strategy(auto_params, custom_params):
    
    # Wrap strategy
    LoggedDroneStrategy = wrap_log_drone_strategy(DroneRoutingOptimizationModelReuseIndex)

    # Use clustering helper to split drones
    dummy_helper = get_wrapped_strategy(LoggedDroneStrategy, [], 6.0, [])
    clusters = dummy_helper.find_clusters(dummy_helper, auto_params["charging_stations_locations"], 6.0)

    total_drones = auto_params["n_drones"]
    num_clusters = len(clusters)
    base = total_drones // num_clusters
    drones_per_cluster = [base] * num_clusters
    for i in range(total_drones % num_clusters):
        drones_per_cluster[i] += 1

    # Create wrapped class with layout-specific cluster info
    ClusteredStrategyClass = get_wrapped_strategy(
        BaseStrategy=LoggedDroneStrategy,
        charging_stations=auto_params["charging_stations_locations"],
        drone_battery=6.0,
        drones_per_cluster=drones_per_cluster
    )

    # Instantiate strategy
    return ClusteredStrategyClass(auto_params, custom_params)
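
The drone allocation inside `make_clustered_drone_strategy` gives every cluster the same base number of drones and hands the remainder to the first clusters. Here is the same logic as a standalone sketch; the function name is hypothetical, and the guard against zero clusters is an addition (the original code would raise `ZeroDivisionError` in that case).

```python
def split_drones_across_clusters(total_drones: int, num_clusters: int) -> list:
    """Spread drones as evenly as possible; the first clusters receive the extras."""
    if num_clusters <= 0:
        raise ValueError("num_clusters must be positive")
    base, remainder = divmod(total_drones, num_clusters)
    return [base + 1 if i < remainder else base for i in range(num_clusters)]


print(split_drones_across_clusters(10, 3))  # [4, 3, 3]
```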


In [None]:
def custom_initialization_parameters_function(input_dir: str, layout_name: str = None):
    print(f"input_dir: {input_dir}")
    burnmap_path = f"{'/'.join(input_dir.strip('/').split('/')[:-1])}/burn_map.npy"
    os.makedirs("logs", exist_ok=True)

    return {
        "burnmap_filename": burnmap_path,
        "reevaluation_step": 5,
        "optimization_horizon": 10,
        "strategy_drone": DroneRoutingOptimizationModelReuseIndex,
        "strategy_sensor": RandomSensorPlacementStrategy,
        "recompute_logfile": False
    }

# Wrap the base strategy with clustering, then add logging on top

ClusteredStrategyClass = get_wrapped_strategy(
    BaseStrategy=DroneRoutingOptimizationModelReuseIndex
)
LoggedDroneStrategy = wrap_log_drone_strategy(ClusteredStrategyClass)
# Run the benchmark and collect metrics
print("Running benchmarks...")
metrics_by_layout = benchmark_on_sim2real_dataset_precompute(
    dataset_folder_name="MinimalDataset/",
    ground_placement_strategy=RandomSensorPlacementStrategy,
    drone_routing_strategy=LoggedDroneStrategy,
    custom_initialization_parameters_function=custom_initialization_parameters_function,
    custom_step_parameters_function=return_no_custom_parameters,
    max_n_scenarii=100,
    starting_time=0
)


Running benchmarks...

 --- 
 Processing layout MinimalDataset/0002
input_dir: MinimalDataset/0002/scenarii/
RandomSensorPlacementStrategy
[init] Number of clusters: 5
  Cluster 0: [(57, 96)]
  Cluster 1: [(58, 189)]
  Cluster 2: [(33, 92)]
  Cluster 3: [(85, 29)]
  Cluster 4: [(170, 217)]

🚀 Running cluster 0 with 1 charging stations and 1 drones
  🧱 Bounding grid: 7 x 7, origin: (54, 93)
[wrap_log_drone_strategy] 🚫 No log file found at MinimalDataset/0002/logs/DroneRoutingOptimizationModelReuseIndex_5_drones_5_charging_stations_10_ground_stations_33-92_57-96_58-189_85-29_170-217_10_logged_drone_routing.json. Logging will be enabled.
  ✅ Strategy initialized for cluster 0

🚀 Running cluster 1 with 1 charging stations and 1 drones
  🧱 Bounding grid: 7 x 7, origin: (55, 186)
[wrap_log_drone_strategy] 🚫 No log file found at MinimalDataset/0002/logs/DroneRoutingOptimizationModelReuseIndex_5_drones_5_charging_stations_10_ground_stations_33-92_57-96_58-189_85-29_170-217_10_logged_drone_rout

ValueError: too many values to unpack (expected 2)

movement_plan: [[("charge", (34, 93)), ("charge", (34, 93)), ("charge", (58, 97)), ("charge", (58, 97)), ("fly", (171, 218))], [("fly", (34, 93)), ("fly", (34, 93)), ("fly", (58, 97)), ("fly", (58, 97)), ("fly", (171, 217))], [("fly", (34, 93)), ("fly", (34, 93)), ("fly", (58, 98)), ("fly", (58, 98)), ("fly", (170, 216))], [("fly", (34, 93)), ("fly", (34, 93)), ("fly", (58, 98)), ("fly", (58, 98)), ("fly", (170, 216))], [("fly", (34, 93)), ("fly", (34, 93)), ("fly", (58, 98)), ("fly", (58, 98)), ("fly", (170, 216))]]
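
As a side note on `custom_initialization_parameters_function` above: it derives the burn map path by stripping slashes from `input_dir` and dropping the last path component. Since `str.strip('/')` also removes a leading slash, an absolute input path would become relative. A possible alternative based on `pathlib`, offered as a sketch (the helper name `burn_map_path` is hypothetical):

```python
from pathlib import Path


def burn_map_path(input_dir: str) -> str:
    """Return the burn_map.npy path located one level above the scenario folder."""
    return (Path(input_dir).parent / "burn_map.npy").as_posix()


print(burn_map_path("MinimalDataset/0002/scenarii/"))
```

For the relative paths used in this notebook, both approaches give the same result.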


In [None]:
from benchmark import benchmark_on_sim2real_dataset_precompute, build_custom_init_params, return_no_custom_parameters
from Strategy import RandomSensorPlacementStrategy, DroneRoutingOptimizationExample
from plot_metrics import plot_all_metrics_across_layouts
from plot_violin import gather_data_from_layouts, plot_violin_for_each_metric

# Create output directories
os.makedirs("plots", exist_ok=True)
os.makedirs("plots_violin", exist_ok=True)

# --- Step 2: Generate line plots per layout ---
plot_all_metrics_across_layouts(
    root_folder="MinimalDataset",
    sensor_strategy_cls=RandomSensorPlacementStrategy,
    drone_strategy_cls=DroneRoutingOptimizationExample,
    max_n_scenarii=100,
    starting_time=0
)

# --- Step 3: Generate violin plots across strategies (optional but useful for comparison) ---
from Strategy import RandomDroneRoutingStrategy, LoggedSensorPlacementStrategy, LoggedDroneRoutingStrategy

STRATEGIES = [
    ("Random", RandomSensorPlacementStrategy, RandomDroneRoutingStrategy),
    ("Logged", LoggedSensorPlacementStrategy, LoggedDroneRoutingStrategy),
    ("Optimized", RandomSensorPlacementStrategy, DroneRoutingOptimizationExample)
]

df_all = gather_data_from_layouts(
    root_folder="MinimalDataset",
    strategies=STRATEGIES,
    custom_init_params_fn=build_custom_init_params,
    custom_step_params_fn=return_no_custom_parameters,
    starting_time=0,
    max_n_scenarii=100
)

if not df_all.empty:
    plot_violin_for_each_metric(df_all, output_dir="plots_violin")

ImportError: cannot import name 'plot_metrics' from 'plot_metrics' (/Users/josephye/Desktop/wildfire_drone_routing/code/plot_metrics.py)

In [None]:
create_scenario_video(scenario[:len(position_history)],drone_locations_history=position_history,starting_time=0,out_filename='test_simulation', ground_sensor_locations = ground, charging_stations_locations = charging)

We can visualize the strategy in action by creating a video of the scenario.
1. Use the `return_history` parameter of `run_benchmark_scenario` to output the log of ground sensor and drone positions during the benchmark.
2. Use the `create_scenario_video` function to compile the video. This can take a few seconds.

In [None]:
delta , device , (position_history, ground, charging)  = run_benchmark_scenario(scenario, SensorPlacementOptimization, RandomDroneRoutingStrategy, custom_initialization_parameters = {"burnmap_filename": "./WideDataset/0001/burn_map.npy"}, custom_step_parameters_function = return_no_custom_parameters, automatic_initialization_parameters_function=my_automatic_layout_parameters, return_history=True)
create_scenario_video(scenario[:len(position_history)],drone_locations_history=position_history,starting_time=0,out_filename='test_simulation', ground_sensor_locations = ground, charging_stations_locations = charging)

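Instead of running the benchmark on a single scenario, we usually want to run it on all scenarios of a given layout (potentially in parallel).

1. We use `run_benchmark_scenarii_sequential`, which processes the scenarios of a folder one after the other. The second call below reads the original jpg masks directly by passing `file_format="jpg"`.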
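
For the "in parallel" case, one possible pattern is to list the scenario files of a layout and map a per-scenario function over them with a worker pool. The sketch below is an assumption-laden illustration: `list_scenario_files` and `count_burning_cells` are hypothetical helpers, the per-scenario task is a stand-in for a real benchmark run, and the repository's own runners are not shown here.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def list_scenario_files(folder: str, max_n_scenarii: int) -> list:
    """Return up to `max_n_scenarii` .npy files from `folder`, in sorted order."""
    names = sorted(name for name in os.listdir(folder) if name.endswith(".npy"))
    return [os.path.join(folder, name) for name in names[:max_n_scenarii]]


def count_burning_cells(path: str) -> int:
    # Hypothetical per-scenario task standing in for a real benchmark run.
    return int(np.load(path).sum())


with tempfile.TemporaryDirectory() as folder:
    # Write three tiny synthetic scenarios so the sketch runs without the dataset.
    for i in range(3):
        scenario = np.full((2, 4, 4), i, dtype=np.uint8)
        np.save(os.path.join(folder, f"0001_{i:05d}.npy"), scenario)

    files = list_scenario_files(folder, max_n_scenarii=2)
    with ThreadPoolExecutor(max_workers=2) as pool:
        results = list(pool.map(count_burning_cells, files))

print(results)
```

Threads keep the sketch simple; for CPU-bound strategies a process pool may be a better fit.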

In [None]:
run_benchmark_scenarii_sequential("MinimalDataset/0001/scenarii/", RandomSensorPlacementStrategy, RandomDroneRoutingStrategy, custom_initialization_parameters_function = get_burnmap_parameters, custom_step_parameters_function = return_no_custom_parameters, starting_time=0, max_n_scenarii=3)
run_benchmark_scenarii_sequential("MinimalDataset/0001/Satellite_Images_Mask/", RandomSensorPlacementStrategy, RandomDroneRoutingStrategy, custom_initialization_parameters_function = get_burnmap_parameters, custom_step_parameters_function = return_no_custom_parameters, starting_time=0, max_n_scenarii=3, file_format="jpg")

We can also run the benchmark on the whole dataset:

In [None]:
benchmark_on_sim2real_dataset("WideDataset/", SensorPlacementStrategy, DroneRoutingStrategy, custom_initialization_parameters_function = lambda x: None, custom_step_parameters_function = return_no_custom_parameters, max_n_scenarii=100, starting_time=0)

In [None]:
# That's very fast to run!
time_start = time.time()
scenario = load_scenario_npy("WideDataset/0001/scenarii/0001_00002.npy")
device, delta_t, _ = run_benchmark_scenario(scenario, RandomSensorPlacementStrategy, DroneRoutingOptimizationSlow, custom_initialization_parameters = {"burnmap_filename": "./WideDataset/0001/burn_map.npy", "call_every_n_steps": 5, "optimization_horizon": 10}, custom_step_parameters_function = return_no_custom_parameters, automatic_initialization_parameters_function=my_automatic_layout_parameters, return_history=True)
print("Fire detected in ", delta_t, "time steps by device: ", device)
print(f"Time taken to run benchmark on the scenario: {time.time() - time_start} seconds")


In [None]:
# define function to get parameters for optimal drone routing strategy
# def get_routing_optimization_parameters(input_dir:str):
#     return {
#         "burnmap_filename": f"{input_dir}/burn_map.npy",
#         "call_every_n_steps": 5,
#         "optimization_horizon": 10
#     }

# add the log
