## Preprocess the Sim2Real dataset

The Sim2Real dataset stores wildfire masks as JPEG images. This representation is lightweight but slow to work with.
To benchmark faster, we convert the images to binary NumPy (`.npy`) files. These files are ~100 times larger and the conversion has a one-time cost, but once they are created and stored, they allow 50--100x faster operations.

1. Download the dataset from the [Sim2Real-Fire GitHub repository](https://github.com/TJU-IDVLab/Sim2Real-Fire).
2. Extract all files from the dataset into your `Dataset` folder.
3. Run the preprocessing function to convert the JPEG masks to NumPy files and compute the burn maps.
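
As a rough illustration of the conversion in step 3, the sketch below stacks per-time-step binary masks into a single array and stores it as a `.npy` file. It is not the repository's `preprocess_sim2real_dataset`: the helper name `save_mask_stack`, the `(T, N, M)` layout, and the toy 4x4 frames are assumptions made for this example.

```python
import os
import tempfile

import numpy as np


def save_mask_stack(frames, out_path):
    """Stack per-time-step masks into one (T, N, M) uint8 array and save it."""
    # Any non-zero pixel is treated as "burning" (assumption for this sketch).
    stack = np.stack([np.asarray(f) > 0 for f in frames]).astype(np.uint8)
    np.save(out_path, stack)
    return stack


# Hypothetical 3-step scenario on a 4x4 grid: the fire grows by one row per step.
frames = [np.zeros((4, 4), dtype=np.uint8) for _ in range(3)]
for t, frame in enumerate(frames):
    frame[: t + 1, :] = 255  # image-like intensity for burning cells

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scenario.npy")
    saved = save_mask_stack(frames, path)
    loaded = np.load(path)  # a single read restores the whole scenario

print(loaded.shape)
```

Loading one array per scenario avoids decoding an image per time step, which is where the speed-up described above is expected to come from.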

In [9]:
pip install -r requirements.txt

Defaulting to user installation because normal site-packages is not writeable
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


In [1]:
# import required modules
import sys
import os
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Add the local `code` directory to the import path
module_path = os.path.join(os.path.abspath("."), "code")
if module_path not in sys.path:
    sys.path.append(module_path)


from dataset import preprocess_sim2real_dataset, load_scenario_npy, compute_and_save_burn_maps_sim2real_dataset
from wrappers import wrap_log_sensor_strategy, wrap_log_drone_strategy
from Strategy import (
    RandomDroneRoutingStrategy, return_no_custom_parameters,
    SensorPlacementOptimization, RandomSensorPlacementStrategy,
    LoggedOptimizationSensorPlacementStrategy, DroneRoutingOptimizationSlow,
    DroneRoutingOptimizationModelReuse, DroneRoutingOptimizationModelReuseIndex,
    LoggedDroneRoutingStrategy, LogWrapperDrone, LogWrapperSensor,
    DroneRoutingRegularizedMaxCoverageResetStatic, FixedPlacementStrategy,
)
from benchmark import (
    run_benchmark_scenario, run_benchmark_scenarii_sequential,
    get_burnmap_parameters, run_benchmark_scenarii_sequential_precompute,
    benchmark_on_sim2real_dataset_precompute,
)
from displays import create_scenario_video
from new_clustering import get_wrapped_strategy

Detected IPython. Loading juliacall extension. See https://juliapy.github.io/PythonCall.jl/stable/compat/#IPython
Initializing the Julia session. This can take up to 1 minute.
initializing the ground sensor julia module
installing packages
initializing the drone julia module
Julia session initialized.
=== TEST PRINT: Entered RandomPlacementStrategy class definition ===
=== TEST PRINT: Entered DroneRoutingOptimizationModelReuseIndex class definition ===
=== TEST PRINT: Entered DroneRoutingOptimizationModelReuseIndex class definition ===
=== TEST PRINT: Entered DroneRoutingLinearMinTime class definition ===


Uncomment the cell below to preprocess the dataset.
- `n_max_scenarii_per_layout` controls how many scenarios are converted from JPEG to NumPy files for each layout.
- If executed on all ~50 layouts of the dataset, the code below takes ~15 minutes to run and generates **400 GB** of data. Make sure you have enough free disk space.
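
Before launching the full preprocessing, it may be worth checking the free disk space. The snippet below is a small sketch using only the standard library; `check_free_space` is a hypothetical helper, not part of the repository, and the 400 GB threshold is simply the figure quoted above.

```python
import shutil


def check_free_space(path, required_gb):
    """Return (enough, free_gb) for the filesystem that contains `path`."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb, free_gb


# 400 GB is the estimate for all layouts; a smaller
# `n_max_scenarii_per_layout` should need less.
enough, free_gb = check_free_space(".", required_gb=400)
print(f"Free space: {free_gb:.1f} GB, enough for full preprocessing: {enough}")
```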

In [5]:
# preprocess_sim2real_dataset("MinimalDataset/", n_max_scenarii_per_layout=100) 

Converting JPG scenarios to NPY for MinimalDataset/
Computing burn maps...
Computing burn map for MinimalDataset/0002/scenarii/ and files with extension .npy


100%|██████████| 5/5 [00:00<00:00, 31.16it/s]


Computing burn map for MinimalDataset/0001/scenarii/ and files with extension .npy


100%|██████████| 5/5 [00:00<00:00, 50.21it/s]


We can now run the benchmark function on any scenario.
1. Load the scenario with `load_scenario_npy`, since it is stored in the preprocessed `npy` format.
2. Run the benchmark with `run_benchmark_scenario`, which takes as input:
    - The `scenario`.
    - The sensor placement strategy and the drone routing strategy.
    - A dictionary `custom_initialization_parameters` that contains any custom initialization inputs for your strategy functions (such as the burn map, if your strategy needs one as input).
    - A Python function `custom_step_parameters_function` that returns a dictionary of custom inputs for your routing function. The benchmarking code calls it at each time step and, internally, invokes your routing function as `routing(custom_step_parameters_function())`.
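
To make the role of `custom_step_parameters_function` concrete, here is a toy loop that mimics the call pattern described above. It is not the benchmarking code itself: `toy_benchmark_loop`, `toy_routing`, and `make_step_counter` are hypothetical names, and the real routing strategies receive more inputs than shown here.

```python
def make_step_counter():
    """Build a step-parameter function that reports how many times it was called."""
    state = {"step": 0}

    def custom_step_parameters_function():
        state["step"] += 1
        return {"step": state["step"]}

    return custom_step_parameters_function


def toy_routing(custom_step_parameters):
    """Stand-in for a routing function: it only echoes the step it was given."""
    return f"routing called at step {custom_step_parameters['step']}"


def toy_benchmark_loop(routing, custom_step_parameters_function, n_steps):
    """Call the step-parameter function once per time step and pass its result on."""
    return [routing(custom_step_parameters_function()) for _ in range(n_steps)]


history = toy_benchmark_loop(toy_routing, make_step_counter(), n_steps=3)
print(history)
```

A strategy that needs no per-step inputs can pass a function that returns an empty dictionary, which is what `return_no_custom_parameters` does.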

In [2]:
# change values here to change benchmarking parameters
def my_automatic_layout_parameters(scenario: np.ndarray, b, c):
    # Grid size is read from the scenario itself (axes 1 and 2); b and c are unused here.
    print(scenario.shape[1])
    return {
        "N": scenario.shape[1],
        "M": scenario.shape[2],
        "max_battery_distance": -1,
        "max_battery_time": 28,
        "n_drones": 2,
        "n_ground_stations": 0,
        "n_charging_stations": 2,
    }

simulation_parameters = {
    "N": 50,
    "M": 50,
    "max_battery_distance": -1,
    "max_battery_time": 28,
    "n_drones": 2,
    "n_ground_stations": 0,
    "n_charging_stations": 2,
}

def return_no_custom_parameters():
    return {}

In [12]:
def custom_initialization_parameters_function(input_dir: str, layout_name: str = None):
    return {
        "burnmap_filename": "IP_Dataset/0101_02057/cropped_burn_map_new.npy",
        "load_from_logfile": False,
        "reevaluation_step": 25,
        "optimization_horizon": 25,
        "regularization_param": 1,  # alternative: 0.0001
    }


run_benchmark_scenarii_sequential_precompute(
    "IP_Dataset/0101_02057/cropped_scenarii",
    FixedPlacementStrategy,
    # alternative: wrap_log_drone_strategy(get_wrapped_strategy(DroneRoutingLinearMinTime))
    wrap_log_drone_strategy(get_wrapped_strategy(DroneRoutingRegularizedMaxCoverageResetStatic)),
    custom_initialization_parameters_function,
    return_no_custom_parameters,
    file_format="npy",
    simulation_parameters=simulation_parameters,
)

Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: IP_Dataset/0101_02057/cropped_burn_map_new.npy
n_drones: 1
charging_stations_locations: [(36, 15)]
ground_sensor_locations: []
optimization_horizon: 25
Set parameter Username
Set parameter LicenseID to value 2612529
Academic license - for non-commercial use only - expires 2026-01-21
Model created in 9.027104874956422 seconds
[(36, 15)]
Set parameter OutputFlag to value 1
Initial optimization finished

DEBUG: Available Charging Stations (after model creation): [(35, 14)]
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: IP_Dataset/0101_02057/cropped_burn_map_new.npy
n_drones: 1
charging_stations_locations: [(31, 43)]
ground_sensor_locations: []
optimization_horizon: 25
Optimizing model took 125.25347329198848 seconds
Set parameter Username
Set parameter LicenseID to value 2612529
Academic license - for non-commercial use only -

 99%|█████████▌| 137/138 [00:01<00:00, 115.00it/s]


{'min_delta': 16,
 'min_delta_file': 'IP_Dataset/0101_02057/cropped_scenarii/0101_00069.npy',
 'avg_time_to_detection': 19.95575221238938,
 'avg_delta': 20.133928571428573,
 'device_percentages': {'ground sensor': 0.0,
  'charging station': 8.7,
  'drone': 72.46,
  'undetected': 18.12},
 'avg_execution_time': 1.5879778160168334e-06,
 'avg_fire_size': 298.15328467153284,
 'avg_fire_percentage': 11.926131386861314,
 'avg_map_explored': 1.7462773722627738,
 'avg_distance': 75.94890510948905,
 'avg_drone_entropy': 0.6931471805599452,
 'avg_sensor_entropy': 0.0,
 'std_deltas': 9.703180200063171,
 'std_execution_time': 1.0521367101503507e-06,
 'std_fire_size': 217.64596211503064,
 'std_fire_percentage': 8.705838484601225,
 'std_map_explored': 0.6997716074739468,
 'std_distance': 36.75631355825085,
 'std_drone_entropy': 1.2147079399144166e-16,
 'std_sensor_entropy': 0.0,
 'raw_execution_times': [1.5735626220703124e-06,
  1.4816011701311384e-06,
  1.0791577790912829e-06,
  1.152356465657552e-0

Optimizing model took 26.049688208964653 seconds


In [3]:
# That's very fast to run!
print("starting benchmark")
time_start = time.time()
scenario = load_scenario_npy("IP_Dataset/0101_02057/cropped_scenarii/0101_00002.npy")
results, (position_history, ground, charging) = run_benchmark_scenario(
    scenario, FixedPlacementStrategy,
    wrap_log_drone_strategy(get_wrapped_strategy(DroneRoutingRegularizedMaxCoverageResetStatic)),
    custom_initialization_parameters={
        "burnmap_filename": "IP_Dataset/0101_02057/cropped_burn_map.npy",
        "load_from_logfile": False, "reevaluation_step": 2, "optimization_horizon": 11, "regularization_param": 0,
    },
    custom_step_parameters_function=return_no_custom_parameters,
    automatic_initialization_parameters_function=my_automatic_layout_parameters,
    return_history=True,
)
print(results)
print(f"Time taken to run benchmark on the scenario: {time.time() - time_start} seconds")
create_scenario_video(scenario[:len(position_history)],drone_locations_history=position_history,starting_time=0,out_filename='test_simulation', ground_sensor_locations = ground, charging_stations_locations = charging)

starting benchmark
50
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: IP_Dataset/0101_02057/cropped_burn_map.npy
n_drones: 1
charging_stations_locations: [(36, 15)]
ground_sensor_locations: []
optimization_horizon: 11
Set parameter Username
Set parameter LicenseID to value 2612529
Academic license - for non-commercial use only - expires 2026-01-21
Model created in 12.717115000006743 seconds
[(36, 15)]
Set parameter OutputFlag to value 1
Initial optimization finished

DEBUG: Available Charging Stations (after model creation): [(35, 14)]
Creating initial routing model (reusable)
--- parameters for julia (Julia indexing) ---
burnmap_filename: IP_Dataset/0101_02057/cropped_burn_map.npy
n_drones: 1
charging_stations_locations: [(31, 43)]
ground_sensor_locations: []
optimization_horizon: 11
Optimizing model took 2.4032869169604965 seconds
Set parameter Username
Set parameter LicenseID to value 2612529
Academic license - for non-commer

  prob_grid = prob_grid / np.sum(prob_grid)


Optimizing model took 0.43928941601188853 seconds
Next move optimization finished
Optimizing model took 0.4700314159854315 seconds
Next move optimization finished
[wrap_log_drone_strategy] Calling parent's next_actions
len log_data: 2
step_counter: 2
log name: IP_Dataset/0101_02057/logs/DroneRoutingRegularizedMaxCoverageResetStatic_2_drones_2_charging_stations_0_ground_stations_30-42_35-14_11__2_0_logged_drone_routing.json
Optimizing model took 0.3344633340020664 seconds
Next move optimization finished
Optimizing model took 0.5194597920053639 seconds
Next move optimization finished
[wrap_log_drone_strategy] Calling parent's next_actions
len log_data: 3
step_counter: 3
log name: IP_Dataset/0101_02057/logs/DroneRoutingRegularizedMaxCoverageResetStatic_2_drones_2_charging_stations_0_ground_stations_30-42_35-14_11__2_0_logged_drone_routing.json
Optimizing model took 0.3798152499948628 seconds
Next move optimization finished
Optimizing model took 0.6025895419879816 seconds
Next move optimiz

In [4]:
bm = load_scenario_npy("tmp_burnmaps/tmp_burnmap_251727.npy")
create_scenario_video(bm, burn_map=True, out_filename="test_burn_map")

Video saved at: display_test_burn_map/test_burn_map.mp4


In [None]:
print("working on it...")
def custom_initialization_parameters_function(input_dir: str, layout_name: str = None):
    print(f"input_dir: {input_dir}")
    # The burn map lives in the layout folder, i.e. the parent of the scenario folder
    burnmap_path = os.path.join(os.path.dirname(input_dir.rstrip("/")), "burn_map.npy")
    os.makedirs("logs", exist_ok=True)

    return {
        "burnmap_filename": burnmap_path,
        "reevaluation_step": 5,
        "optimization_horizon": 10,
        "strategy_drone": DroneRoutingOptimizationModelReuseIndex,
        "strategy_sensor": RandomSensorPlacementStrategy,
        "recompute_logfile": False  # toggle this to force refresh
    }

clustered_strategy = get_wrapped_strategy(DroneRoutingOptimizationModelReuseIndex)

# change values here to change benchmarking parameters

simulation_parameters =  {
    "max_battery_distance": -1,
    "max_battery_time": 20,
    "n_drones": 10,
    "n_ground_stations": 1,
    "n_charging_stations": 5,
}

# Run the benchmark and collect metrics
print("Running benchmarks...")
metrics_by_layout = benchmark_on_sim2real_dataset_precompute(
    dataset_folder_name="MinimalDataset/",
    ground_placement_strategy=RandomSensorPlacementStrategy,
    drone_routing_strategy=clustered_strategy,
    custom_initialization_parameters_function=custom_initialization_parameters_function,
    custom_step_parameters_function=return_no_custom_parameters,
    max_n_scenarii=1,
    starting_time=0,
    simulation_parameters=simulation_parameters
)

In [8]:
from benchmark import benchmark_on_sim2real_dataset_precompute, build_custom_init_params, return_no_custom_parameters
from Strategy import RandomSensorPlacementStrategy, DroneRoutingOptimizationExample
from plot_metrics import plot_all_metrics_across_layouts
from plot_violin import gather_data_from_layouts, plot_violin_for_each_metric

# Create output directories
os.makedirs("plots", exist_ok=True)
os.makedirs("plots_violin", exist_ok=True)

# --- Step 2: Generate line plots per layout ---
plot_all_metrics_across_layouts(
    root_folder="MinimalDataset",
    sensor_strategy_cls=RandomSensorPlacementStrategy,
    drone_strategy_cls=DroneRoutingOptimizationExample,
    max_n_scenarii=100,
    starting_time=0
)

# --- Step 3: Generate violin plots across strategies (optional but useful for comparison) ---
from Strategy import RandomDroneRoutingStrategy, LoggedSensorPlacementStrategy, LoggedDroneRoutingStrategy

STRATEGIES = [
    ("Random", RandomSensorPlacementStrategy, RandomDroneRoutingStrategy),
    ("Logged", LoggedSensorPlacementStrategy, LoggedDroneRoutingStrategy),
    ("Optimized", RandomSensorPlacementStrategy, DroneRoutingOptimizationExample)
]

df_all = gather_data_from_layouts(
    root_folder="MinimalDataset",
    strategies=STRATEGIES,
    custom_init_params_fn=build_custom_init_params,
    custom_step_params_fn=return_no_custom_parameters,
    starting_time=0,
    max_n_scenarii=100
)

if not df_all.empty:
    plot_violin_for_each_metric(df_all, output_dir="plots_violin")

ImportError: cannot import name 'DroneRoutingOptimizationExample' from 'Strategy' (/Users/puech/Desktop/Climate/wildfire_drone_routing/code/Strategy.py)

In [None]:
create_scenario_video(scenario[:len(position_history)],drone_locations_history=position_history,starting_time=0,out_filename='test_simulation', ground_sensor_locations = ground, charging_stations_locations = charging)

We can visualize the strategy in action by creating a video of the scenario.
1. Use the `return_history` parameter of `run_benchmark_scenario` to output the log of ground sensor and drone positions during the benchmark.
2. Use the `create_scenario_video` function to compile the video. This can take a few seconds.

In [None]:
delta , device , (position_history, ground, charging)  = run_benchmark_scenario(scenario, SensorPlacementOptimization, RandomDroneRoutingStrategy, custom_initialization_parameters = {"burnmap_filename": "./WideDataset/0001/burn_map.npy"}, custom_step_parameters_function = return_no_custom_parameters, automatic_initialization_parameters_function=my_automatic_layout_parameters, return_history=True)
create_scenario_video(scenario[:len(position_history)],drone_locations_history=position_history,starting_time=0,out_filename='test_simulation', ground_sensor_locations = ground, charging_stations_locations = charging)

Instead of running the benchmark on a single scenario, we usually want to run it on all scenarios of a given layout (potentially in parallel).

1. We use `run_benchmark_scenarii_sequential`.
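
Conceptually, a sequential run over a layout lists the scenario files, optionally caps their number, runs the single-scenario benchmark on each, and aggregates the results. The sketch below illustrates that pattern only; `run_sequential` and `fake_run` are hypothetical stand-ins, and the real function also places sensors, routes drones, and reports many more metrics.

```python
import os
import tempfile


def run_sequential(input_dir, run_one, file_format="npy", max_n_scenarii=None):
    """Run `run_one(path)` on each matching file, in sorted order, and average the results."""
    files = sorted(f for f in os.listdir(input_dir) if f.endswith("." + file_format))
    if max_n_scenarii is not None:
        files = files[:max_n_scenarii]
    deltas = [run_one(os.path.join(input_dir, f)) for f in files]
    avg = sum(deltas) / len(deltas) if deltas else None
    return {"n_scenarii": len(deltas), "avg_delta": avg}


# Toy usage: five empty placeholder files and a fake single-scenario benchmark
# that returns the scenario index as its detection delay.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(5):
        open(os.path.join(tmp, f"0001_{i:05d}.npy"), "w").close()

    def fake_run(path):
        return int(os.path.basename(path)[5:10])

    summary = run_sequential(tmp, fake_run, max_n_scenarii=3)

print(summary)
```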

In [None]:
run_benchmark_scenarii_sequential("MinimalDataset/0001/scenarii/", RandomSensorPlacementStrategy, RandomDroneRoutingStrategy, custom_initialization_parameters_function = get_burnmap_parameters, custom_step_parameters_function = return_no_custom_parameters, starting_time=0, max_n_scenarii=3)
run_benchmark_scenarii_sequential("MinimalDataset/0001/Satellite_Images_Mask/", RandomSensorPlacementStrategy, RandomDroneRoutingStrategy, custom_initialization_parameters_function = get_burnmap_parameters, custom_step_parameters_function = return_no_custom_parameters, starting_time=0, max_n_scenarii=3, file_format="jpg")

We can also run the benchmark on the entire dataset:

In [None]:
benchmark_on_sim2real_dataset("WideDataset/", SensorPlacementStrategy, DroneRoutingStrategy, custom_initialization_parameters_function = lambda x: None, custom_step_parameters_function = return_no_custom_parameters, max_n_scenarii=100, starting_time=0)

In [None]:
# That's very fast to run!
time_start = time.time()
scenario = load_scenario_npy("WideDataset/0001/scenarii/0001_00002.npy")
device, delta_t, _ = run_benchmark_scenario(scenario, RandomSensorPlacementStrategy, DroneRoutingOptimizationSlow, custom_initialization_parameters = {"burnmap_filename": "./WideDataset/0001/burn_map.npy", "call_every_n_steps": 5, "optimization_horizon": 10}, custom_step_parameters_function = return_no_custom_parameters, automatic_initialization_parameters_function=my_automatic_layout_parameters, return_history=True)
print("Fire detected in ", delta_t, "time steps by device: ", device)
print(f"Time taken to run benchmark on the scenario: {time.time() - time_start} seconds")


In [None]:
# define function to get parameters for optimal drone routing strategy
# def get_routing_optimization_parameters(input_dir:str):
#     return {
#         "burnmap_filename": f"{input_dir}/burn_map.npy",
#         "call_every_n_steps": 5,
#         "optimization_horizon": 10
#     }

# add the log
