## Preprocess the Sim2Real dataset

The Sim2Real dataset stores wildfire masks as JPEG images. This representation is very lightweight but slow to work with.
To benchmark faster, we convert the masks to binary NumPy files. These files are ~100 times larger and the conversion takes some time up front, but once they are created and stored, operations on them run 50--100x faster.
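
As a rough illustration of the conversion idea, the sketch below thresholds a decoded grayscale mask into a binary array and stores it with `np.save`. It is not the repository's `preprocess_sim2real_dataset`: the helper name `mask_to_binary`, the threshold of 127, and the synthetic input array are all assumptions made for this example.

```python
import io
import numpy as np

def mask_to_binary(gray: np.ndarray, threshold: int = 127) -> np.ndarray:
    """Turn a decoded grayscale mask (values 0-255) into a 0/1 uint8 array."""
    # JPEG compression leaves intermediate grey values near edges,
    # so we threshold instead of testing for exact 0 or 255.
    return (gray > threshold).astype(np.uint8)

# Synthetic stand-in for a decoded JPEG mask: a bright square on a dark background.
gray = np.zeros((8, 8), dtype=np.uint8)
gray[2:5, 3:6] = 250
gray[5, 3] = 40  # compression noise below the threshold

binary = mask_to_binary(gray)

# Save to an in-memory buffer (a file path works the same way) and reload.
buffer = io.BytesIO()
np.save(buffer, binary)
buffer.seek(0)
restored = np.load(buffer)
```

Reloading the array with `np.load` skips image decoding entirely, which is where the speed-up comes from; the price is that the uncompressed array takes more space on disk.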

1. Download the dataset from [Sim2Real-Fire GitHub repository](https://github.com/TJU-IDVLab/Sim2Real-Fire).
2. Extract all files from the dataset into your `Dataset` folder.
3. Run the preprocessing function to convert the JPEG masks to NumPy files and compute the burn maps.

In [1]:
pip install -r requirements.txt


[notice] A new release of pip is available: 24.3.1 -> 25.0.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


In [2]:
# Import required modules
import sys
import os
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Add code to path
module_path = os.path.abspath(".") + "/code"
if module_path not in sys.path:
    sys.path.append(module_path)


from dataset import preprocess_sim2real_dataset, load_scenario_npy, compute_and_save_burn_maps_sim2real_dataset
from wrappers import wrap_log_sensor_strategy, wrap_log_drone_strategy
from Strategy import RandomDroneRoutingStrategy, return_no_custom_parameters, SensorPlacementOptimization, RandomSensorPlacementStrategy, LoggedOptimizationSensorPlacementStrategy, DroneRoutingOptimizationSlow, DroneRoutingOptimizationModelReuse, DroneRoutingOptimizationModelReuseIndex, LoggedDroneRoutingStrategy, LogWrapperDrone, LogWrapperSensor
from benchmark import run_benchmark_scenario, run_benchmark_scenarii_sequential, get_burnmap_parameters, run_benchmark_scenarii_sequential_precompute, benchmark_on_sim2real_dataset_precompute
from displays import create_scenario_video
from new_clustering import get_wrapped_strategy

Detected IPython. Loading juliacall extension. See https://juliapy.github.io/PythonCall.jl/stable/compat/#IPython
Initializing the Julia session. This can take up to 1 minute.
initializing the ground sensor julia module
installing packages
initializing the drone julia module
Julia session initialized.
=== TEST PRINT: Entered RandomPlacementStrategy class definition ===
=== TEST PRINT: Entered DroneRoutingOptimizationModelReuseIndex class definition ===
=== TEST PRINT: Entered DroneRoutingOptimizationModelReuseIndex class definition ===


Uncomment the cell below to preprocess the dataset.
- `n_max_scenarii_per_layout` sets the maximum number of scenarios converted from JPEG to NumPy files for each layout.
- When run on all ~50 layouts of the dataset, the code below takes ~15 minutes and generates **400 GB** of data. Make sure you have enough free disk space.

In [None]:
# preprocess_sim2real_dataset("MinimalDataset/", n_max_scenarii_per_layout=100) 
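
Before launching a conversion of this size, it can help to check the free space on the target drive. The sketch below uses only the standard library. The helper name `has_enough_space` is hypothetical, the 400 GB figure is the estimate quoted above, and this check is not part of the repository's preprocessing code.

```python
import shutil

def has_enough_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem containing `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1e9

# "." stands in for the dataset folder; 400 GB is the estimate for the full dataset.
if not has_enough_space(".", required_gb=400):
    print("Not enough free space for the full dataset; consider lowering n_max_scenarii_per_layout.")
```

If space is tight, a smaller `n_max_scenarii_per_layout` converts fewer scenarios per layout and should reduce the output size accordingly.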

We can now run the benchmark function on any scenario.
1. We load the scenario using `load_scenario_npy`, since it is stored in the preprocessed `npy` format.
2. We run the benchmark using `run_benchmark_scenario`, which takes as input:
    - The `scenario`.
    - The sensor placement strategy and the drone routing strategy.
    - A dictionary `custom_initialization_parameters` containing any custom initialization inputs for your strategy functions (such as the burn map, if your strategy needs one).
    - A Python function `custom_step_parameters_function` that returns a dictionary of custom inputs for your routing function. The benchmarking code calls this function at each time step; internally, your routing function is called as `routing(custom_step_parameters_function())`.
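
To make the role of `custom_step_parameters_function` concrete, here is a toy version of the calling pattern described above. The loop, the `toy_routing` function, and the `wind_speed` key are hypothetical stand-ins, not the actual internals of `run_benchmark_scenario`.

```python
def my_step_parameters() -> dict:
    """Called once per time step; returns the custom inputs for the routing function."""
    return {"wind_speed": 3.5}  # hypothetical custom input

def toy_routing(custom_parameters: dict) -> str:
    """Stand-in for a routing function that reads its custom inputs."""
    return f"routing with wind_speed={custom_parameters['wind_speed']}"

def toy_benchmark(n_steps: int, routing, custom_step_parameters_function) -> list:
    """Mimics the pattern routing(custom_step_parameters_function()) at each step."""
    history = []
    for _ in range(n_steps):
        history.append(routing(custom_step_parameters_function()))
    return history

log = toy_benchmark(3, toy_routing, my_step_parameters)
```

Because the function is re-evaluated at every step, it can return values that change over time. A strategy that needs no extra inputs can pass a function returning an empty dictionary, which is presumably what `return_no_custom_parameters` does.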

In [18]:
# Change values here to change benchmarking parameters
def my_automatic_layout_parameters(scenario: np.ndarray, input_dir: str = '', simulation_parameters: dict = {}):
    return {
        "N": scenario.shape[1],
        "M": scenario.shape[2],
        "max_battery_distance": -1,
        "max_battery_time": 20,
        "n_drones": 10,
        "n_ground_stations": 1,
        "n_charging_stations": 6,
        "ground_sensor_locations": [(97, 103)],
        "charging_stations_locations": [(86, 77), (88, 77), (88, 81), (90, 84), (91, 93), (93, 93)],
        # The key is the string "input_dir"; the original used the variable's value as the key
        "input_dir": input_dir,
    }

In [19]:
# That's very fast to run!
print("starting benchmark")
time_start = time.time()
scenario = load_scenario_npy("MinimalDataset/0002/scenarii/0002_00010.npy")
results, (position_history, ground, charging) = run_benchmark_scenario(
    scenario,
    SensorPlacementOptimization,
    DroneRoutingOptimizationModelReuseIndex,
    custom_initialization_parameters={
        "burnmap_filename": "code/ML_burn_map/ML/MLDatasets/0002/predicted_burnmaps/burnmap_0002_00010.npy",
        "load_from_logfile": False,
        "reevaluation_step": 5,
        "optimization_horizon": 5,
        "strategy_drone": DroneRoutingOptimizationModelReuseIndex,
        "strategy_sensor": RandomSensorPlacementStrategy,
        # "model_path": "code/ML_burn_map/models/ok_residualbatch128_normfirst_auc2_best_ap_improvement.pt",
        # "weather_file": "code/ML_burn_map/ML/MLDatasets/0002/Weather_Data_Processed/0002_00010.txt",
    },
    custom_step_parameters_function=return_no_custom_parameters,
    automatic_initialization_parameters_function=my_automatic_layout_parameters,
    return_history=True,
)
print(results)
print(f"Time taken to run benchmark on the scenario: {time.time() - time_start} seconds")




starting benchmark
calling julia optimization model
NEW STRATEGY 3 - Optimized for speed
Number of cells discarded: 42250
Set parameter Username
Set parameter LicenseID to value 2638288
Academic license - for non-commercial use only - expires 2026-03-18
Took 4.687589700799435 seconds to create model
optimization finished
Charging Station Locations from Julia Optimization Model:  []
ground sensor locations
[(92, 86)]
charging station locations
[(90, 84), (90, 86), (92, 88), (91, 90), (93, 90), (92, 92)]
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: code/ML_burn_map/ML/MLDatasets/0002/predicted_burnmaps/burnmap_0002_00010.npy
n_drones: 10
charging_stations_locations: [(91, 85), (91, 87), (93, 89), (92, 91), (94, 91), (93, 93)]
ground_sensor_locations: [(93, 87)]
optimization_horizon: 5
Took 5.516063038958237 seconds total
Set parameter Username
Set parameter LicenseID to value 2638288
Academic license - for non-commercial use on

In [8]:
print("working on it...")
def custom_initialization_parameters_function(input_dir: str, layout_name: str = None):
    print(f"input_dir: {input_dir}")
    burnmap_path = f"{'/'.join(input_dir.strip('/').split('/')[:-1])}/burn_map.npy"
    os.makedirs("logs", exist_ok=True)

    return {
        "burnmap_filename": burnmap_path,
        "reevaluation_step": 5,
        "optimization_horizon": 10,
        "strategy_drone": DroneRoutingOptimizationModelReuseIndex,
        "strategy_sensor": RandomSensorPlacementStrategy,
        "recompute_logfile": False  # toggle this to force refresh
    }

clustered_strategy = get_wrapped_strategy(DroneRoutingOptimizationModelReuseIndex)

# change values here to change benchmarking parameters

simulation_parameters =  {
    "max_battery_distance": -1,
    "max_battery_time": 20,
    "n_drones": 10,
    "n_ground_stations": 1,
    "n_charging_stations": 5,
}

# Run the benchmark and collect metrics
print("Running benchmarks...")
metrics_by_layout = benchmark_on_sim2real_dataset_precompute(
    dataset_folder_name="MinimalDataset/",
    ground_placement_strategy=RandomSensorPlacementStrategy,
    drone_routing_strategy=clustered_strategy,
    custom_initialization_parameters_function=custom_initialization_parameters_function,
    custom_step_parameters_function=return_no_custom_parameters,
    max_n_scenarii=1,
    starting_time=0,
    simulation_parameters=simulation_parameters
)

working on it...
Running benchmarks...
simulation_parameters:  {'max_battery_distance': -1, 'max_battery_time': 20, 'n_drones': 10, 'n_ground_stations': 1, 'n_charging_stations': 5}

 --- 
 Processing layout MinimalDataset/0002
input_dir: MinimalDataset/0002/scenarii/
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: MinimalDataset/0002/burn_map.npy
n_drones: 2
charging_stations_locations: [(5, 10)]
ground_sensor_locations: []
optimization_horizon: 10
Set parameter Username
Set parameter LicenseID to value 2638288
Academic license - for non-commercial use only - expires 2026-03-18
Model created in 2.440480759134516 seconds
[(5, 10)]
Initial optimization finished

DEBUG: Available Charging Stations (after model creation): [(4, 9)]
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: MinimalDataset/0002/burn_map.npy
n_drones: 2
charging_stations_locations: [(11, 130)]
ground_senso

  0%|          | 0/1 [00:00<?, ?it/s]

Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: MinimalDataset/0002/burn_map.npy
n_drones: 2
charging_stations_locations: [(64, 203)]
ground_sensor_locations: []
optimization_horizon: 10
Optimizing model took 0.17150619602762163 seconds
Set parameter Username
Set parameter LicenseID to value 2638288
Academic license - for non-commercial use only - expires 2026-03-18
Model created in 4.17862508399412 seconds
[(64, 203)]
Initial optimization finished

DEBUG: Available Charging Stations (after model creation): [(63, 202)]
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: MinimalDataset/0002/burn_map.npy
n_drones: 2
charging_stations_locations: [(4, 125)]
ground_sensor_locations: []
optimization_horizon: 10
Optimizing model took 0.06010172492824495 seconds
Set parameter Username
Set parameter LicenseID to value 2638288
Academic license - for non-commercial use only - expires 20

100%|██████████| 1/1 [00:33<00:00, 33.62s/it]

Optimizing model took 0.3320451008621603 seconds
Next move optimization finished
Optimizing model took 0.9821190689690411 seconds
Next move optimization finished
Optimizing model took 0.10221059410832822 seconds
Next move optimization finished
Optimizing model took 0.019798944937065244 seconds
Next move optimization finished
Optimizing model took 0.5065168628934771 seconds
Next move optimization finished

 --- 
 Processing layout MinimalDataset/0001
input_dir: MinimalDataset/0001/scenarii/
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: MinimalDataset/0001/burn_map.npy
n_drones: 2
charging_stations_locations: [(5, 158)]
ground_sensor_locations: []
optimization_horizon: 10





Optimizing model took 0.33446280797943473 seconds
Set parameter Username
Set parameter LicenseID to value 2638288
Academic license - for non-commercial use only - expires 2026-03-18
Model created in 1.7798721571452916 seconds
[(5, 158)]
Initial optimization finished

DEBUG: Available Charging Stations (after model creation): [(4, 157)]
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: MinimalDataset/0001/burn_map.npy
n_drones: 2
charging_stations_locations: [(94, 17)]
ground_sensor_locations: []
optimization_horizon: 10
Optimizing model took 0.1622808009851724 seconds
Set parameter Username
Set parameter LicenseID to value 2638288
Academic license - for non-commercial use only - expires 2026-03-18
Model created in 2.5573616370093077 seconds
[(94, 17)]
Initial optimization finished

DEBUG: Available Charging Stations (after model creation): [(93, 16)]
Creating initial routing model (reusable)
--- parameters for julia (Julia indexin

  0%|          | 0/1 [00:00<?, ?it/s]

Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: MinimalDataset/0001/burn_map.npy
n_drones: 2
charging_stations_locations: [(42, 63)]
ground_sensor_locations: []
optimization_horizon: 10
Optimizing model took 0.33709546690806746 seconds
Set parameter Username
Set parameter LicenseID to value 2638288
Academic license - for non-commercial use only - expires 2026-03-18
Model created in 3.2056421898305416 seconds
[(42, 63)]
Initial optimization finished

DEBUG: Available Charging Stations (after model creation): [(41, 62)]
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: MinimalDataset/0001/burn_map.npy
n_drones: 2
charging_stations_locations: [(58, 252)]
ground_sensor_locations: []
optimization_horizon: 10
Optimizing model took 0.07137801591306925 seconds
Set parameter Username
Set parameter LicenseID to value 2638288
Academic license - for non-commercial use only - expires 20

100%|██████████| 1/1 [00:58<00:00, 58.71s/it]


Optimizing model took 0.19675317918881774 seconds
Next move optimization finished
Optimizing model took 0.20172362099401653 seconds
Next move optimization finished
Optimizing model took 0.032544674118980765 seconds
Next move optimization finished
Optimizing model took 0.183430909877643 seconds
Next move optimization finished
Optimizing model took 0.19834058405831456 seconds
Next move optimization finished
Optimizing model took 0.3995769638568163 seconds


In [None]:
from benchmark import benchmark_on_sim2real_dataset_precompute, build_custom_init_params, return_no_custom_parameters
# NOTE: `Strategy` does not export `DroneRoutingOptimizationExample`; importing it raises
# ImportError: cannot import name 'DroneRoutingOptimizationExample' from 'Strategy'.
# Assumption: we substitute `DroneRoutingOptimizationModelReuseIndex`, which the import cell above loads successfully.
from Strategy import RandomSensorPlacementStrategy, DroneRoutingOptimizationModelReuseIndex
from plot_metrics import plot_all_metrics_across_layouts
from plot_violin import gather_data_from_layouts, plot_violin_for_each_metric

# Create output directories
os.makedirs("plots", exist_ok=True)
os.makedirs("plots_violin", exist_ok=True)

# --- Step 2: Generate line plots per layout ---
plot_all_metrics_across_layouts(
    root_folder="MinimalDataset",
    sensor_strategy_cls=RandomSensorPlacementStrategy,
    drone_strategy_cls=DroneRoutingOptimizationModelReuseIndex,
    max_n_scenarii=100,
    starting_time=0
)

# --- Step 3: Generate violin plots across strategies (optional but useful for comparison) ---
from Strategy import RandomDroneRoutingStrategy, LoggedSensorPlacementStrategy, LoggedDroneRoutingStrategy

STRATEGIES = [
    ("Random", RandomSensorPlacementStrategy, RandomDroneRoutingStrategy),
    ("Logged", LoggedSensorPlacementStrategy, LoggedDroneRoutingStrategy),
    ("Optimized", RandomSensorPlacementStrategy, DroneRoutingOptimizationModelReuseIndex)
]

df_all = gather_data_from_layouts(
    root_folder="MinimalDataset",
    strategies=STRATEGIES,
    custom_init_params_fn=build_custom_init_params,
    custom_step_params_fn=return_no_custom_parameters,
    starting_time=0,
    max_n_scenarii=100
)

if not df_all.empty:
    plot_violin_for_each_metric(df_all, output_dir="plots_violin")

In [None]:
create_scenario_video(
    scenario[:len(position_history)],
    drone_locations_history=position_history, starting_time=0, out_filename='test_simulation',
    ground_sensor_locations=ground, charging_stations_locations=charging,
)

We can visualize the strategy in action by creating a video of the scenario.
1. Use the `return_history` parameter of `run_benchmark_scenario` to output the log of ground sensor and drone positions during the benchmark.
2. Use the `create_scenario_video` function to compile the video. This can take a couple of seconds.

In [None]:
# Unpack as in the executed benchmark cell above: metrics first, then the position history
results, (position_history, ground, charging) = run_benchmark_scenario(
    scenario, SensorPlacementOptimization, RandomDroneRoutingStrategy,
    custom_initialization_parameters={"burnmap_filename": "./WideDataset/0001/burn_map.npy"},
    custom_step_parameters_function=return_no_custom_parameters,
    automatic_initialization_parameters_function=my_automatic_layout_parameters,
    return_history=True,
)
create_scenario_video(
    scenario[:len(position_history)],
    drone_locations_history=position_history, starting_time=0, out_filename='test_simulation',
    ground_sensor_locations=ground, charging_stations_locations=charging,
)

Instead of benchmarking a single scenario, we usually want to run the benchmark on all scenarios of a given layout (potentially in parallel!).

1. We use `run_benchmark_scenarii_sequential`

In [None]:
# Benchmark on the preprocessed .npy scenarios
run_benchmark_scenarii_sequential(
    "MinimalDataset/0001/scenarii/", RandomSensorPlacementStrategy, RandomDroneRoutingStrategy,
    custom_initialization_parameters_function=get_burnmap_parameters,
    custom_step_parameters_function=return_no_custom_parameters,
    starting_time=0, max_n_scenarii=3,
)
# Same benchmark, reading the original JPEG masks directly
run_benchmark_scenarii_sequential(
    "MinimalDataset/0001/Satellite_Images_Mask/", RandomSensorPlacementStrategy, RandomDroneRoutingStrategy,
    custom_initialization_parameters_function=get_burnmap_parameters,
    custom_step_parameters_function=return_no_custom_parameters,
    starting_time=0, max_n_scenarii=3, file_format="jpg",
)
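
The sequential runner processes one scenario at a time. If the per-scenario work is independent, it could in principle be spread over several workers. The sketch below shows the general pattern with `concurrent.futures`. The function `run_one_scenario` is a hypothetical placeholder rather than a repository function, and whether the Julia-backed strategies can safely run concurrently is not something this example establishes.

```python
from concurrent.futures import ThreadPoolExecutor

def run_one_scenario(scenario_path: str) -> dict:
    """Hypothetical placeholder: a real version would load one scenario and return its metrics."""
    return {"scenario": scenario_path, "n_chars": len(scenario_path)}

def run_many(scenario_paths: list, max_workers: int = 4) -> list:
    """Run the placeholder over all paths concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one_scenario, scenario_paths))

paths = [f"MinimalDataset/0001/scenarii/0001_{i:05d}.npy" for i in range(3)]
all_metrics = run_many(paths)
```

For CPU-bound work, `ProcessPoolExecutor` exposes the same `map` interface, at the cost of each worker process needing its own initialization.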

We can also run the benchmark on the entire dataset:

In [None]:
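# NOTE: `benchmark_on_sim2real_dataset`, `SensorPlacementStrategy` and `DroneRoutingStrategy` are not
# imported in the import cell above; import them (or substitute the strategies you want to benchmark) first.
benchmark_on_sim2real_dataset(
    "WideDataset/", SensorPlacementStrategy, DroneRoutingStrategy,
    custom_initialization_parameters_function=lambda x: None,
    custom_step_parameters_function=return_no_custom_parameters,
    max_n_scenarii=100, starting_time=0,
)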

In [None]:
# That's very fast to run!
time_start = time.time()
scenario = load_scenario_npy("WideDataset/0001/scenarii/0001_00002.npy")
# Unpack as in the executed benchmark cell above: metrics first, then the position history
results, _ = run_benchmark_scenario(
    scenario, RandomSensorPlacementStrategy, DroneRoutingOptimizationSlow,
    custom_initialization_parameters={"burnmap_filename": "./WideDataset/0001/burn_map.npy", "call_every_n_steps": 5, "optimization_horizon": 10},
    custom_step_parameters_function=return_no_custom_parameters,
    automatic_initialization_parameters_function=my_automatic_layout_parameters,
    return_history=True,
)
print(results)
print(f"Time taken to run benchmark on the scenario: {time.time() - time_start} seconds")


In [None]:
# define function to get parameters for optimal drone routing strategy
# def get_routing_optimization_parameters(input_dir:str):
#     return {
#         "burnmap_filename": f"{input_dir}/burn_map.npy",
#         "call_every_n_steps": 5,
#         "optimization_horizon": 10
#     }

# add the log
