Hongbo Lu1,2,*, Liang Yao1,*, Chenghao He1,*, Haoyu Wang1,*,
Xiang Gu3, Xianfei Li1,2, Wenlong Liao1,†, Tao He1, Pai Peng1,†,‡
1COWARobot Co. Ltd 2Shanghai Jiao Tong University 3Hohai University
* Equal Contribution † Corresponding Author ‡ Project Lead
Offline inference/evaluation repo for world-model planners on nuScenes data converted to the NavSim PKL layout.
The entrypoint loads a training YAML, builds the encoder, predictor, and planner, restores a checkpoint, runs the val_command.py validation forward path, and writes metrics plus optional trajectory visualizations.
pip install -e .
The code expects the converted nuScenes/NavSim PKL paths and model checkpoints referenced by the YAML to exist on the machine.
Download the required model weights before running inference.
- Pretrained weights (used for config.meta.pretrain_checkpoint_full): pretrained.pt: https://pan.baidu.com/s/1SgmdMop50yd-Vv2nl0k2Mg Code: ukx8
- Inference checkpoint weights (used for --checkpoint or best_open_loop.pt): best_open_loop.pt: https://pan.baidu.com/s/1uVHHEGp8qEJR9Gz3XGS05Q Code: xhku
For example (the example.com URLs below are placeholders; substitute the actual download location):
mkdir -p checkpoints
wget -O checkpoints/pretrained.pt https://example.com/path/to/pretrained.pt
wget -O checkpoints/best_open_loop.pt https://example.com/path/to/best_open_loop.pt
Then update the checkpoint path in your YAML, or pass it on the command line with --checkpoint.
Convert raw nuScenes data (labels JSON + GT box NPZ + CAN bus) to the NavSim PKL layout used by NavSimWorldModelDataset:
python tools/convert_nuscenes_to_navsim_pkl.py \
--nuscenes-root /path/nuScenes \
--output-root /path/nuScenes/navsim_format \
--split trainval \
--workers 8
Input directory layout (--nuscenes-root):
/path/nuScenes/
├── labels/
│ ├── scene-0001.json # per-scene frame metadata
│ ├── scene-0001/ # per-scene GT box NPZ files
│ │ ├── 0001.npz
│ │ └── ...
│ └── scene-0002.json
├── samples/ # original camera images (symlinked, not copied)
│ └── CAM_FRONT/...
└── can_bus/
└── scene-0001_pose.json # CAN bus ego pose (velocity, acceleration)
Output directory layout (--output-root):
/path/nuScenes/navsim_format/
├── train/
│ ├── scene-0001.pkl # List[Dict], each entry = one keyframe (2 Hz)
│ └── ...
├── val/
│ ├── scene-0003.pkl
│ └── ...
└── sensor_blobs/
└── scene-0001/
└── CAM_F0/
└── n015-2018-...jpg # symlinks to original images
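To sanity-check a converted scene, a short script along the following lines can help. It is a minimal sketch that relies only on what the layout above states: each scene file is a pickled list of per-keyframe dicts sampled at 2 Hz. The helper names `load_scene` and `summarize_scene` are hypothetical and not part of this repo. Only unpickle files you produced yourself, since loading an untrusted pickle can execute arbitrary code.

```python
import pickle


def load_scene(pkl_path):
    """Load one converted scene file; it is expected to hold a list of keyframe dicts."""
    with open(pkl_path, "rb") as f:
        frames = pickle.load(f)
    if not isinstance(frames, list):
        raise TypeError(f"expected a list of keyframes, got {type(frames).__name__}")
    return frames


def summarize_scene(frames, hz=2.0):
    """Return the frame count, the approximate duration, and the keys of the first frame."""
    keys = sorted(frames[0].keys()) if frames else []
    # N keyframes at a fixed rate span (N - 1) sampling intervals.
    duration_s = max(len(frames) - 1, 0) / hz
    return {"num_frames": len(frames), "duration_s": duration_s, "keys": keys}
```

For instance, `summarize_scene(load_scene("/path/nuScenes/navsim_format/val/scene-0003.pkl"))` would report how many keyframes the scene holds and which fields each entry carries.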
What gets converted per keyframe:
- ego2global_translation / ego2global_rotation: from the nuScenes ego pose matrix
- ego_dynamic_state ([vx, vy, ax, ay] in ego frame): interpolated from CAN bus pose data
- driving_command (one-hot: [GO_STRAIGHT / TURN_LEFT / TURN_RIGHT / U_TURN]): inferred from cumulative yaw change over the scene
- cams: only CAM_FRONT is retained; images are symlinked, not copied
- anns (GT boxes): NPZ annotations converted to NavSim format; categories mapped to [vehicle/pedestrian/bicycle]; barriers and traffic cones are filtered out
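As an illustration of the driving_command step, the sketch below derives a one-hot command from a sequence of ego yaw angles. It is not the converter's actual logic: the 30° and 150° thresholds, the sign convention (positive yaw change means a left turn), and the function names are all assumptions made for this example. Only the one-hot ordering is taken from the list above.

```python
import math

# One-hot order as listed above.
COMMANDS = ("GO_STRAIGHT", "TURN_LEFT", "TURN_RIGHT", "U_TURN")


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def cumulative_yaw_change(yaws):
    """Sum wrapped frame-to-frame yaw differences so crossing +/-pi adds no jump."""
    return sum(wrap_angle(b - a) for a, b in zip(yaws, yaws[1:]))


def infer_driving_command(yaws,
                          turn_thresh=math.radians(30.0),    # assumed threshold
                          uturn_thresh=math.radians(150.0)):  # assumed threshold
    """Return a one-hot list in COMMANDS order for a sequence of yaw angles (radians)."""
    delta = cumulative_yaw_change(yaws)
    if abs(delta) >= uturn_thresh:
        idx = 3
    elif delta >= turn_thresh:
        idx = 1  # assumed convention: positive (counter-clockwise) yaw change is a left turn
    elif delta <= -turn_thresh:
        idx = 2
    else:
        idx = 0
    return [1 if i == idx else 0 for i in range(len(COMMANDS))]
```

Summing wrapped differences rather than subtracting the first yaw from the last keeps the result correct when the heading crosses the ±π boundary, and it lets turns of more than 180° accumulate.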
Scene split follows the standard nuScenes v1.0-trainval partition: 700 train + 150 val scenes.
PYTHONPATH="$PWD" /usr/bin/python3 scripts/infer_nuscenes_val.py \
--config configs/nuscenes_inference_example.yaml \
--checkpoint /path/to/best_open_loop.pt \
--output-dir outputs/nuscenes_eval \
--disable-vis
If --checkpoint is omitted, the script tries <config.folder>/best_open_loop.pt, then <config.folder>/latest.pt, then meta.resume_checkpoint.
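The fallback order described above can be sketched as a small helper. This is an illustration under stated assumptions, not the script's actual code: the function name `resolve_checkpoint` is hypothetical, and it assumes that a candidate is skipped when the file does not exist and that an explicit --checkpoint value is used as given.

```python
from pathlib import Path


def resolve_checkpoint(cli_checkpoint, config_folder, resume_checkpoint=None):
    """Pick a checkpoint path using the documented fallback order."""
    if cli_checkpoint:
        # An explicit --checkpoint always wins (assumed: no existence check here).
        return Path(cli_checkpoint)
    candidates = [
        Path(config_folder) / "best_open_loop.pt",
        Path(config_folder) / "latest.pt",
    ]
    if resume_checkpoint:
        candidates.append(Path(resume_checkpoint))
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"no checkpoint found; tried: {tried}")
```

Raising with the list of attempted paths makes a misconfigured `config.folder` easy to spot before any model is built.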
This repo targets nuScenes converted to the local NavSim-style PKL format used by NavSimWorldModelDataset:
- data.navsim.val_data_path: directory containing validation .pkl scene files
- data.navsim.val_sensor_blobs_path: camera image root
- data.navsim.camera_name: camera folder name, usually CAM_F0
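Because the entrypoint expects these paths to exist, a pre-flight check can catch a bad config early. The sketch below assumes the YAML has already been loaded into a nested dict with the `data.navsim` keys listed above. The function name `check_navsim_paths` is hypothetical, and the check that the validation directory contains at least one `.pkl` file is an assumption about what counts as a usable setup.

```python
from pathlib import Path


def check_navsim_paths(config):
    """Return a list of human-readable problems found in the data.navsim settings."""
    navsim = config.get("data", {}).get("navsim", {})
    problems = []
    for key in ("val_data_path", "val_sensor_blobs_path"):
        value = navsim.get(key)
        if not value:
            problems.append(f"data.navsim.{key} is not set")
        elif not Path(value).is_dir():
            problems.append(f"data.navsim.{key} is not a directory: {value}")
    val_dir = navsim.get("val_data_path")
    if val_dir and Path(val_dir).is_dir() and not any(Path(val_dir).glob("*.pkl")):
        # Assumed requirement: at least one converted scene file is present.
        problems.append(f"no .pkl scene files found in {val_dir}")
    if not navsim.get("camera_name"):
        problems.append("data.navsim.camera_name is not set")
    return problems
```

An empty list means the three settings look usable; otherwise each entry names the offending key.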
The validation path computes ADE/FDE/minADE@K/minFDE@K, World4Drive L2 horizons, and collision metrics when BEV segmentation is present in the dataloader batch.
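For readers unfamiliar with the displacement metrics named above, the following pure-Python sketch shows their usual definitions. It assumes trajectories are sequences of (x, y) waypoints at matching timestamps and that minADE@K and minFDE@K take the minimum over the first K candidate trajectories. The function names are hypothetical and this is not the repo's validation code. The World4Drive L2 horizons and collision metrics are not covered here.

```python
import math


def _check_lengths(pred, gt):
    if len(pred) != len(gt) or not gt:
        raise ValueError("prediction and ground truth need the same non-zero length")


def ade(pred, gt):
    """Average displacement error: mean Euclidean distance over all waypoints."""
    _check_lengths(pred, gt)
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)


def fde(pred, gt):
    """Final displacement error: Euclidean distance at the last waypoint."""
    _check_lengths(pred, gt)
    return math.dist(pred[-1], gt[-1])


def min_ade_at_k(candidates, gt, k):
    """Smallest ADE among the first k candidates (assumed ordered by confidence)."""
    return min(ade(c, gt) for c in candidates[:k])


def min_fde_at_k(candidates, gt, k):
    """Smallest FDE among the first k candidates (assumed ordered by confidence)."""
    return min(fde(c, gt) for c in candidates[:k])
```

Note that the candidate with the lowest ADE is not necessarily the one with the lowest FDE, so the two minimum metrics may be achieved by different trajectories.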