## Preprocess the Sim2Real dataset

The Sim2Real dataset stores wildfire masks as JPEG images. This representation is lightweight on disk but slow to work with.
To benchmark faster, we convert the masks to binary NumPy files. These files are ~100 times larger, and the conversion takes some time up front, but once they are created and stored they make operations 50--100x faster.

1. Download the dataset from the [Sim2Real-Fire GitHub repository](https://github.com/TJU-IDVLab/Sim2Real-Fire).
2. Extract all files from the dataset into your `Dataset` folder.
3. Run the preprocessing function to convert from the JPEG representation to the NumPy one, and compute the burn maps.
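
The conversion in step 3 can be pictured with a minimal sketch. It is not the code of `preprocess_sim2real_dataset`: the helper name `masks_to_npy`, the threshold of 127, and the synthetic input frames are assumptions made for illustration. In practice the grayscale frames would come from decoding each JPEG file with an image library.

```python
import os
import tempfile

import numpy as np


def masks_to_npy(gray_frames, out_path, threshold=127):
    """Binarize grayscale mask frames and save them as one .npy file.

    gray_frames: iterable of 2-D uint8 arrays, one per time step
    (hypothetical input; real frames would be decoded from JPEG files).
    """
    stack = np.stack([np.asarray(f) for f in gray_frames])  # shape (T, H, W)
    # JPEG is lossy, so pixels are thresholded instead of compared to 255.
    binary = (stack > threshold).astype(np.uint8)
    np.save(out_path, binary)
    return binary


# Synthetic stand-in for three decoded 4x4 mask frames.
frames = [np.full((4, 4), v, dtype=np.uint8) for v in (0, 120, 250)]
out_file = os.path.join(tempfile.mkdtemp(), "scenario.npy")
saved = masks_to_npy(frames, out_file)

# Loading the stored array back is a single call, with no image decoding.
loaded = np.load(out_file)
print(loaded.shape, loaded.dtype, int(loaded.sum()))
```

Storing one array per scenario trades disk space for speed: the file is uncompressed, but reading it needs no decoding step.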

In [None]:
pip install -r requirements.txt

In [1]:
# Import required modules
import sys
import os
import time

# Add code to path
module_path = os.path.join(os.path.abspath("."), "code")
if module_path not in sys.path:
    sys.path.append(module_path)

from dataset import preprocess_sim2real_dataset, load_scenario_npy, compute_and_save_burn_maps_sim2real_dataset
from Strategy import SensorPlacementStrategy, DroneRoutingStrategy, return_no_custom_parameters, SensorPlacementOptimization
from benchmark import run_benchmark_scenario, run_benchmark_scenarii_sequential, benchmark_on_sim2real_dataset
from displays import create_scenario_video

Detected IPython. Loading juliacall extension. See https://juliapy.github.io/PythonCall.jl/stable/compat/#IPython
Initializing the Julia session. This can take up to 1 minute.
installing packages


   Resolving package versions...
  No Changes to `/opt/anaconda3/julia_env/Project.toml`
  No Changes to `/opt/anaconda3/julia_env/Manifest.toml`

Starting test
In julia fuinction
Loading burn map from ./WideDataset/0001/burn_map.npy
Burn map loaded
72116287
1
2
Set parameter Username
Set parameter LicenseID to value 2612529
Academic license - for non-commercial use only - expires 2026-01-21
3
Gurobi Optimizer version 12.0.0 build v12.0.0rc1 (mac64[arm] - Darwin 24.3.0 24D60)

CPU model: Apple M4
Thread count: 10 physical cores, 10 logical processors, using up to 10 threads

Optimize a model with 33295 rows, 66584 columns and 166460 nonzeros
Model fingerprint: 0x0cdf4dbd
Variable types: 0 continuous, 66584 integer (66584 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [4e-05, 3e-02]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e+00, 1e+00]
Found heuristic solution: objective 0.0000784
Presolve removed 29475 rows and 58948 columns
Presolve time: 0.06s
Presolved: 3820 rows, 7636 columns, 15272 nonzeros
Variable types: 0 continuous, 7636 integer (7636 binary)

Root relaxation: objective

Uncomment the cell below to preprocess the dataset.
- `n_max_scenarii_per_layout` controls how many scenarios per layout are converted from JPEG to NumPy files.
- If executed on all ~50 layouts of the dataset, the code below takes ~15 minutes to run and generates **400 GB** of data. Make sure you have enough disk space.
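
A back-of-the-envelope estimate can help check the disk budget before starting. The sketch below only multiplies array dimensions. The scenario count, frame count, resolution, and cell size are made-up placeholders rather than properties of the Sim2Real dataset, and `estimate_npy_bytes` is a hypothetical helper.

```python
def estimate_npy_bytes(n_scenarii, n_frames, height, width, bytes_per_cell=1):
    """Rough size of uncompressed arrays, ignoring the small .npy header."""
    return n_scenarii * n_frames * height * width * bytes_per_cell


# Placeholder numbers for one layout; replace them with the real dimensions.
per_layout = estimate_npy_bytes(n_scenarii=100, n_frames=50, height=1000, width=1000)
print(f"{per_layout / 1e9:.1f} GB per layout")
```

Lowering `n_max_scenarii_per_layout` reduces the footprint proportionally under this estimate.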

In [2]:
# preprocess_sim2real_dataset("WideDataset/", n_max_scenarii_per_layout=100) 

We can now run the benchmark function on any scenario.
1. We load the scenario using `load_scenario_npy`, since it is in the preprocessed `npy` format.
2. We run the benchmark using `run_benchmark_scenario`, which takes as input:
    - The `scenario`
    - The sensor placement strategy and the drone routing strategy
    - A dictionary `custom_initialization_parameters` that contains any custom initialization inputs for your strategy functions (such as the burn map, if your strategy needs one)
    - A Python function `custom_step_parameters_function` that returns a dictionary of custom inputs for your routing function. The benchmarking code executes this function at each time step; internally, your routing function is called as `routing(custom_step_parameters_function())`
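
The interplay between the two custom-parameter arguments can be sketched with a toy loop. This is not the actual `run_benchmark_scenario` implementation: `ToyRouting`, `toy_benchmark_loop`, the `next_actions` method name, and the `wind` key are all hypothetical. They only illustrate that the initialization dictionary is consumed once, while the step function is called again at every time step.

```python
class ToyRouting:
    """Hypothetical routing strategy used only for this illustration."""

    def __init__(self, custom_initialization_parameters=None):
        # Consumed once, when the strategy is created.
        self.init_params = custom_initialization_parameters or {}
        self.seen = []

    def next_actions(self, custom_step_parameters):
        # Called at every time step with a freshly built dictionary.
        self.seen.append(custom_step_parameters)
        return []  # no drone moves in this toy example


def toy_benchmark_loop(n_steps, strategy_cls, init_params, step_params_fn):
    """Hypothetical loop: one strategy instance, one callback call per step."""
    routing = strategy_cls(init_params)
    for _ in range(n_steps):
        routing.next_actions(step_params_fn())  # mirrors routing(fn())
    return routing


calls = {"count": 0}


def wind_parameters():
    """Hypothetical step callback returning a different dict on each call."""
    calls["count"] += 1
    return {"wind": calls["count"]}


routing = toy_benchmark_loop(
    3, ToyRouting, {"burnmap_filename": "burn_map.npy"}, wind_parameters
)
print(routing.init_params)
print(routing.seen)
```

A strategy that needs no per-step inputs can pass a function that returns an empty dictionary, which is presumably what `return_no_custom_parameters` does.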

In [2]:
# That's very fast to run!
print("starting benchmark")
time_start = time.time()
scenario = load_scenario_npy("WideDataset/0001/scenarii/0001_00002.npy")
print("scenario loaded")
device, delta_t, _ = run_benchmark_scenario(scenario, SensorPlacementOptimization, DroneRoutingStrategy, custom_initialization_parameters = {"burnmap_filename": "./WideDataset/0001/burn_map.npy"}, custom_step_parameters_function = return_no_custom_parameters)
print("Fire detected in ", delta_t, "time steps by device: ", device)
print(f"Time taken to run benchmark on the scenario: {time.time() - time_start} seconds")


starting benchmark
scenario loaded
running benchmark scenario
getting sensor locations
initializing sensor placement optimization
calling julia optimization model
In julia fuinction
Loading burn map from ./WideDataset/0001/burn_map.npy
Burn map loaded
72116287
1
2
Set parameter Username
Set parameter LicenseID to value 2612529
Academic license - for non-commercial use only - expires 2026-01-21
3
Gurobi Optimizer version 12.0.0 build v12.0.0rc1 (mac64[arm] - Darwin 24.3.0 24D60)

CPU model: Apple M4
Thread count: 10 physical cores, 10 logical processors, using up to 10 threads

Optimize a model with 33295 rows, 66584 columns and 166460 nonzeros
Model fingerprint: 0x0cdf4dbd
Variable types: 0 continuous, 66584 integer (66584 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [4e-05, 3e-02]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e+00, 1e+00]
Found heuristic solution: objective 0.0000784
Presolve removed 29475 rows and 58948 columns
Presolv

NotImplementedError: get_locations is an abstract method and should be implemented by subclasses.
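
The `NotImplementedError` above is the usual symptom of calling a method that a base class declares but a subclass never overrides. A minimal sketch of that pattern follows. `BasePlacement`, `RandomPlacement`, and the constructor arguments are hypothetical stand-ins, not the classes in `Strategy`; only the `get_locations` name and the error message are taken from the traceback.

```python
import random


class BasePlacement:
    """Hypothetical base class with an abstract-style method."""

    def get_locations(self):
        raise NotImplementedError(
            "get_locations is an abstract method and should be implemented by subclasses."
        )


class RandomPlacement(BasePlacement):
    """Hypothetical subclass that overrides get_locations."""

    def __init__(self, n_sensors, grid_size, seed=0):
        self.n_sensors = n_sensors
        self.grid_size = grid_size
        self.rng = random.Random(seed)

    def get_locations(self):
        # Draw n_sensors random (row, column) cells inside the grid.
        return [
            (self.rng.randrange(self.grid_size), self.rng.randrange(self.grid_size))
            for _ in range(self.n_sensors)
        ]


base_error = None
try:
    BasePlacement().get_locations()
except NotImplementedError as err:
    base_error = str(err)
    print("base class:", base_error)

locations = RandomPlacement(n_sensors=3, grid_size=10).get_locations()
print("subclass:", locations)
```

If the error appears for a strategy you wrote or selected, checking that the class defines its own `get_locations` is a reasonable first step.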

In [9]:
import numpy as np
np.load("./WideDataset/0001/burn_map.npy", allow_pickle=True)

array([[[0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
         0.0000000e+00, 0.0000000e+00, 0.0000000e+00],
        [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
         0.0000000e+00, 0.0000000e+00, 0.0000000e+00],
        [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
         0.0000000e+00, 0.0000000e+00, 0.0000000e+00],
        ...,
        [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
         0.0000000e+00, 0.0000000e+00, 0.0000000e+00],
        [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
         0.0000000e+00, 0.0000000e+00, 0.0000000e+00],
        [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
         0.0000000e+00, 0.0000000e+00, 0.0000000e+00]],

       [[0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
         0.0000000e+00, 0.0000000e+00, 0.0000000e+00],
        [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
         0.0000000e+00, 0.0000000e+00, 0.0000000e+00],
        [0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
         0.000
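
Printing the whole array is rarely informative, as the truncated output above shows. A short summary is often more useful. The sketch below runs on a small synthetic array, because the shape and value range of the real burn map are not stated here, and `summarize_array` is a hypothetical helper.

```python
import numpy as np


def summarize_array(arr):
    """Return a few cheap statistics instead of the full array contents."""
    return {
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        "min": float(arr.min()),
        "max": float(arr.max()),
        "nonzero_fraction": float(np.count_nonzero(arr)) / arr.size,
    }


# Synthetic 3-D stand-in: all zeros except one cell.
demo = np.zeros((2, 4, 4), dtype=np.float32)
demo[1, 2, 3] = 0.5
summary = summarize_array(demo)
print(summary)
```

The same function could be applied to the result of `np.load("./WideDataset/0001/burn_map.npy")`.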

We can visualize the strategy in action by creating a video of the scenario.
1. Use the `return_history` parameter of `run_benchmark_scenario` to output the log of ground sensor and drone positions during the benchmark.
2. Use the `create_scenario_video` function to compile the video. This can take a few seconds.

In [4]:
delta, device, (position_history, ground, charging) = run_benchmark_scenario(
    scenario,
    SensorPlacementStrategy,
    DroneRoutingStrategy,
    custom_initialization_parameters=None,
    custom_step_parameters_function=return_no_custom_parameters,
    return_history=True,
)
create_scenario_video(
    scenario[:len(position_history)],
    drone_locations_history=position_history,
    starting_time=0,
    out_filename='test_simulation',
    ground_sensor_locations=ground,
    charging_stations_locations=charging,
)

Video saved at: display_test_simulation/test_simulation.mp4


Instead of benchmarking a single scenario, we usually want to run the benchmark on all scenarii of a given layout (potentially in parallel).

To do so, we use `run_benchmark_scenarii_sequential`:

In [5]:
run_benchmark_scenarii_sequential("WideDataset/0001/scenarii/", SensorPlacementStrategy, DroneRoutingStrategy, custom_initialization_parameters_function = lambda x: None, custom_step_parameters_function = return_no_custom_parameters, starting_time=0, max_n_scenarii=100)

100%|██████████| 100/100 [00:00<00:00, 141.79it/s]

This strategy took on average 11.909090909090908 time steps to find the fire.
Fire found 29.0% of the time by ground sensor
Fire found 27.0% of the time by charging station
Fire found 43.0% of the time by drone
Fire found 1.0% of the time by undetected
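
A summary like the one above can be derived from per-scenario results in a few lines. The sketch below is an assumption about the aggregation, not the benchmark's own code: the `(delta_t, device)` tuples are invented, `summarize_results` is a hypothetical helper, and averaging only over detected fires is a guess.

```python
from collections import Counter


def summarize_results(results):
    """results: list of (delta_t, device) pairs; device is 'undetected' on a miss."""
    detected = [dt for dt, device in results if device != "undetected"]
    # Assumption: undetected fires are excluded from the average.
    average = sum(detected) / len(detected) if detected else float("nan")
    counts = Counter(device for _, device in results)
    shares = {device: 100.0 * n / len(results) for device, n in counts.items()}
    return average, shares


# Invented results for four scenarios.
toy_results = [
    (5, "drone"),
    (10, "ground sensor"),
    (15, "drone"),
    (-1, "undetected"),
]
average, shares = summarize_results(toy_results)
print(f"average: {average} time steps")
for device, share in shares.items():
    print(f"Fire found {share}% of the time by {device}")
```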





We can also run the benchmark on the entire dataset:

In [None]:
benchmark_on_sim2real_dataset("WideDataset/", SensorPlacementStrategy, DroneRoutingStrategy, custom_initialization_parameters_function = lambda x: None, custom_step_parameters_function = return_no_custom_parameters, max_n_scenarii=100, starting_time=0)

In [None]:
# That's very fast to run!
time_start = time.time()
scenario = load_scenario_npy("WideDataset/0001/scenarii/0001_00002.npy")
device, delta_t, _ = run_benchmark_scenario(scenario, SensorPlacementStrategy, SensorPlacementOptimization, custom_initialization_parameters = None, custom_step_parameters_function = return_no_custom_parameters)
print("Fire detected in ", delta_t, "time steps by device: ", device)
print(f"Time taken to run benchmark on the scenario: {time.time() - time_start} seconds")
