Luke Rowe*, Roger Girgis*, Anthony Gosselin, Bruno Carrez, Florian Golemo, Felix Heide, Liam Paull, Christopher Pal
We propose CtRL-Sim, a framework that leverages offline reinforcement learning for controllable closed-loop multi-agent driving simulation.
Start by cloning the repository
git clone https://github.com/montrealrobotics/ctrl-sim
cd ctrl-sim
This repository assumes you have a scratch directory /path/to/scratch
for storing larger files/data. If storage is not a concern, you can simply let your scratch directory be the ctrl-sim working directory.
To setup the environment, we currently provide support for Singularity.
First, ensure the project root (/path/to/ctrl-sim
) is in your PATH. Also, make sure the CONFIG_PATH
variable in cfgs/config.py
and the dataset_root
and project_root
variables in config.yaml
are set appropriately.
We require singularity version >= 3.7.1. Download the singularity image format file nocturne.sif
for creating the nocturne-compatible singularity container and place it in your scratch directory. Create the singularity container and conda environment:
# shell into singularity container
singularity shell -H $HOME:/home -B /path/to/scratch:/scratch --nv /path/to/scratch/nocturne.sif
cd ctrl-sim
# load conda module
source /opt/miniconda3/etc/profile.d/conda.sh
# create conda environment
conda env create -f environment.yml
conda activate ctrl_sim
# install additional packages
pip install torch_scatter -f https://data.pyg.org/whl/torch-2.2.0+cu121.html
To build nocturne, run the following:
python setup.py develop
First, download the pre-processed Waymo dataset for the Nocturne simulator nocturne_waymo
(150 GB), which can be found here. Additionally, download the pre-processed validation_interactive split of the Waymo dataset for the Nocturne simulator, formatted_json_v2_no_tl_valid_interactive.tar
, which can be found here. These are used as evaluation scenarios for evaluating planners against CtRL-Sim adversarial agents. Place the unzipped formatted_json_v2_no_tl_valid_interactive
directory inside the nocturne_waymo
directory.
We assume this dataset is in your local scratch repository with the following structure:
/path/to/scratch/
├── nocturne_waymo/
| └── formatted_json_v2_no_tl_train/
| | ├── tfrecord-00000-of-01000_108.json
| | ├── tfrecord-00000-of-01000_109.json
| | ├── ...
| | ├── valid_files.json
| | ├── valid_files_backup.json
| | ├── valid_files.txt
| | ├── invalid_files.json
| | ├── invalid_files.txt
| └── formatted_json_v2_no_tl_valid/
| | ├── tfrecord-00000-of-00150_101.json
| | ├── tfrecord-00000-of-00150_102.json
| | ├── ...
| | ├── valid_files.json
| | ├── valid_files.txt
| | ├── invalid_files.json
| | ├── invalid_files.txt
| └── formatted_json_v2_no_tl_valid_interactive/
| | ├── tfrecord-00000-of-00150_0.json
| | ├── tfrecord-00000-of-00150_106.json
| | ├── tfrecord-00000-of-00150_109.json
| | ├── ...
| | ├── valid_files.json
| | ├── invalid_files.json
The preprocessed offline RL dataset for CtRL-Sim model training preprocess.tar
can be found here (193 GB). Also, download the pickle file test_filenames.pkl
containing the names of the test files here.
We now assume this dataset is in your local scratch repository with the following structure:
/path/to/scratch/
├── nocturne_waymo/
| ├── ...
├── test_filenames.pkl/
├── preprocess/
| └── train/
| | ├── tfrecord-00000-of-01000_108_physics.pkl
| | ├── tfrecord-00000-of-01000_109_physics.pkl
| | ├── ...
| └── val/
| | ├── tfrecord-00000-of-00150_101_physics.pkl
| | ├── tfrecord-00000-of-00150_102_physics.pkl
| | ├── ...
| └── val_interactive/
| | ├── tfrecord-00000-of-00150_0_physics.pkl
| | ├── tfrecord-00000-of-00150_106_physics.pkl
| | ├── ...
| └── test/
| | ├── tfrecord-00000-of-00150_103_physics.pkl
| | ├── tfrecord-00000-of-00150_108_physics.pkl
| | ├── ...
Warning This step is not necessary, given that the dataset was already downloaded in the previous step. This is only needed if you wish to rebuild the dataset from scratch.
To generate the offline RL datasets for training a CtRL-Sim model, we generate the offline RL dataset in chunks, which can be parallelized across different machines:
# generate validation interactive dataset
python data/generate_offline_rl_dataset.py offline_rl.chunk_idx=0 offline_rl.chunk_size=25000 offline_rl.mode=val_interactive
# generate validation dataset
python data/generate_offline_rl_dataset.py offline_rl.chunk_idx=0 offline_rl.chunk_size=25000 offline_rl.mode=val
# generate train dataset
python data/generate_offline_rl_dataset.py offline_rl.chunk_idx=0 offline_rl.chunk_size=25000 offline_rl.mode=train
python data/generate_offline_rl_dataset.py offline_rl.chunk_idx=1 offline_rl.chunk_size=25000 offline_rl.mode=train
python data/generate_offline_rl_dataset.py offline_rl.chunk_idx=2 offline_rl.chunk_size=25000 offline_rl.mode=train
python data/generate_offline_rl_dataset.py offline_rl.chunk_idx=3 offline_rl.chunk_size=25000 offline_rl.mode=train
python data/generate_offline_rl_dataset.py offline_rl.chunk_idx=4 offline_rl.chunk_size=25000 offline_rl.mode=train
python data/generate_offline_rl_dataset.py offline_rl.chunk_idx=5 offline_rl.chunk_size=25000 offline_rl.mode=train
Next, we preprocess the offline RL dataset for efficient CtRL-Sim model training. We preprocess the dataset in chunks, which can be parallelized across different machines:
# preprocess val_interactive dataset
python data/preprocess_rl_waymo.py preprocess_rl_waymo.chunk_idx=0 preprocess_rl_waymo.chunk_size=25000 preprocess_rl_waymo.mode=val_interactive dataset.waymo.preprocess_real_data=True
# preprocess val dataset
python data/preprocess_rl_waymo.py preprocess_rl_waymo.chunk_idx=0 preprocess_rl_waymo.chunk_size=25000 preprocess_rl_waymo.mode=val dataset.waymo.preprocess_real_data=True
# preprocess train dataset
python data/preprocess_rl_waymo.py preprocess_rl_waymo.chunk_idx=0 preprocess_rl_waymo.chunk_size=25000 preprocess_rl_waymo.mode=train dataset.waymo.preprocess_real_data=True
python data/preprocess_rl_waymo.py preprocess_rl_waymo.chunk_idx=1 preprocess_rl_waymo.chunk_size=25000 preprocess_rl_waymo.mode=train dataset.waymo.preprocess_real_data=True
python data/preprocess_rl_waymo.py preprocess_rl_waymo.chunk_idx=2 preprocess_rl_waymo.chunk_size=25000 preprocess_rl_waymo.mode=train dataset.waymo.preprocess_real_data=True
python data/preprocess_rl_waymo.py preprocess_rl_waymo.chunk_idx=3 preprocess_rl_waymo.chunk_size=25000 preprocess_rl_waymo.mode=train dataset.waymo.preprocess_real_data=True
python data/preprocess_rl_waymo.py preprocess_rl_waymo.chunk_idx=4 preprocess_rl_waymo.chunk_size=25000 preprocess_rl_waymo.mode=train dataset.waymo.preprocess_real_data=True
python data/preprocess_rl_waymo.py preprocess_rl_waymo.chunk_idx=5 preprocess_rl_waymo.chunk_size=25000 preprocess_rl_waymo.mode=train dataset.waymo.preprocess_real_data=True
Next, we split the validation set into a val and test set and generate test_filenames.pkl file:
python data/split_val_test.py
We finetune CtRL-Sim on a dataset of adversarial scenarios from the val_interactive split of Waymo using CAT. The preprocessed CAT finetuning dataset for CtRL-Sim model finetuning preprocessed_cat.tar
can be found here (3.8 GB). Also, download the pickle file eval_planner_dict.pkl
here.
We now assume this dataset is in your local scratch repository with the following structure:
/path/to/scratch/
├── preprocess/
| ├── ...
├── nocturne_waymo/
| ├── ...
├── test_filenames.pkl/
├── preprocessed_cat/
| ├── tfrecord-00000-of-00150_111_physics_cat.pkl
| ├── tfrecord-00000-of-00150_134_physics_cat.pkl
| ├── ...
├── eval_planner_dict.pkl/
Warning This step is not necessary, given that the dataset was already downloaded in the previous step. This is only needed if you wish to rebuild the CAT finetuning dataset from scratch.
The first step is to follow the steps in the CAT repository to convert the Waymo validation interactive tfrecord files into MetaDrive scenario descriptions. Make sure to replace cat_advgen.py
and scripts/convert_WOMD_to_MD.py
in the cat
repository files with the corresponding files in modified_cat_files
. The Waymo validation interactive tfrecord files can be downloaded from the Waymo Open Dataset website.
After setting up the CAT repository and replacing the necessary files, run (see here for context):
python /path/to/cat/scripts/convert_WOMD_to_MD.py --input /path/to/val_interactive/waymo/tfrecord/files --output /scratch/md_scenarios
Once the MetaDrive scenario files are generated, we must find the valid scenarios that overlap with the valid Nocturne files, from which we create a finetuning dataset and test set. All valid scenarios are randomized and reindexed from 0, where the first 1000 indices comprise the test set (test set for Table 2 in paper), and the remaining indices comprise the finetuning dataset. The reindexed valid scenarios are stored in /scratch/valid_md_scenarios
, with the corresponding metadata stored in /scratch/eval_planner_dict.pkl
. To generate these, run:
python data/filter_valid_cat_scenarios.py
Next, we run CAT to generate the adversarial trajectories, which will be stored in the eval_planner_dict.pkl
file:
python /path/to/cat/cat_advgen.py --dict_path /path/to/eval_planner_dict.pkl
Finally, to generate and preprocess the CAT finetuning dataset, run:
python data/generate_offline_rl_cat_dataset.py
python data/preprocess_rl_waymo.py dataset.waymo.preprocess_simulated_data=True
To train a CtRL-Sim model, run:
python train.py
This will run according to the configurations specified in cfgs/config.yaml
. By default, this will train on 4 GPUs with a total batch size of 64. Set train.track=True
to log the run to wandb, where the wandb project and entity can be configured in cfgs/
. The model checkpoint save directory can also be configured in cfgs/
, which is set to [save_dir]/[run_name]
. By default, it is set to /home/checkpoints/test
.
To train a CtRL-Sim model on 1 GPU, run:
python train.py train.devices=1 datamodule.train_batch_size=64 datamodule.val_batch_size=64
Warning The batch size may need to be adjusted to support 1 GPU training without running out of GPU memory, but this may affect performance.
To train a DT model on 4 GPUs, run:
python train.py model=dt
To train an IL model on 4 GPUs, run:
python train.py model=il
To train an Actions Only model on 4 GPUs, run:
python train.py model=trajeglish
To train a CTG++ model on 4 GPUs, run:
python train.py model=ctg_plus_plus dataset/waymo=ctg_plus_plus datamodule=ctg_plus_plus train=ctg_plus_plus
Once the preprocessed CAT scenarios have been downloaded or generated, you can now finetune a CtRL-Sim model. To finetune a CtRL-Sim model (specified by run_name
), run:
python train.py train.run_name=[replace_with_run_name] train=ctrl_sim_finetuning dataset/waymo=ctrl_sim_finetuning
To evaluate CtRL-Sim with run name run_name
on multi agent simulation (Table 1) over 1000 test scenes, run:
python eval_sim.py eval.policy.run_name=run_name eval.eval_mode=multi_agent
One can also evaluate a (+10) positively-tilted CtRL-Sim model by running:
python eval_sim.py eval.policy.run_name=run_name eval.eval_mode=multi_agent eval.policy.veh_veh_tilt=10 eval.policy.veh_edge_tilt=10 eval.policy.goal_tilt=10
To evaluate a DT model with run name run_name
, run:
python eval_sim.py policy@eval.policy=dt eval.policy.run_name=run_name eval.eval_mode=multi_agent
To evaluate an IL model with run name run_name
, run:
python eval_sim.py policy@eval.policy=il eval.policy.run_name=run_name eval.eval_mode=multi_agent
To evaluate an Actions Only model with run name run_name
, run:
python eval_sim.py policy@eval.policy=trajeglish eval.policy.run_name=run_name eval.eval_mode=multi_agent
To evaluate a CTG++ model with run name run_name
, we split evaluation into 4 partitions (which can be parallelized across multiple devices) by running:
python eval_sim.py policy@eval.policy=ctg_plus_plus eval.policy.run_name=run_name eval.eval_mode=multi_agent eval=partitioned eval.partition=0
python eval_sim.py policy@eval.policy=ctg_plus_plus eval.policy.run_name=run_name eval.eval_mode=multi_agent eval=partitioned eval.partition=1
python eval_sim.py policy@eval.policy=ctg_plus_plus eval.policy.run_name=run_name eval.eval_mode=multi_agent eval=partitioned eval.partition=2
python eval_sim.py policy@eval.policy=ctg_plus_plus eval.policy.run_name=run_name eval.eval_mode=multi_agent eval=partitioned eval.partition=3
# TODO: missing MetricsComputer line
To conduct one-agent tilting evaluation (Figure 4) of a CtRL-Sim model with run name run_name
over 1000 test scenes with with specified tilting values GOAL_TILT, VEH_VEH_TILT, and VEH_EDGE_TILT, run:
python eval_sim.py eval.policy.run_name=run_name eval.eval_mode=one_agent eval.policy.goal_tilt=GOAL_TILT eval.policy.veh_veh_tilt=VEH_VEH_TILT eval.policy.veh_edge_tilt=VEH_EDGE_TILT
To evaluate a CtRL-Sim planner (with run name run_name_planner
) against a finetuned CtRL-Sim adversary (with run anem run_name_adversary
) over 1000 test scenes (held out from Waymo validation_interactive split), run:
python eval_planner.py eval_planner_adversary.planner.run_name=run_name_planner eval_planner_adversary.adversary.run_name=run_name_adversary
To evaluate a CtRL-Sim planner (with run name run_name_planner
) against a CAT adversary, run:
python eval_planner.py eval_planner_adversary.planner.run_name=run_name_planner policy@eval_planner_adversary.adversary=cat
Check LICENSE
@inproceedings{
rowe2024ctrlsim,
title={Ct{RL}-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning},
author={Luke Rowe and Roger Girgis and Anthony Gosselin and Bruno Carrez and Florian Golemo and Felix Heide and Liam Paull and Christopher Pal},
booktitle={8th Annual Conference on Robot Learning},
year={2024},
url={https://openreview.net/forum?id=MfIUKzihC8}
}
We would like to acknowledge the following public repositories: