WilliamXuanYu/CLOVER

CLOVER

[Figure: Pipeline]

End-to-end autonomous driving planners are commonly trained by imitating a single logged trajectory, yet they are evaluated by rule-based planning metrics that measure safety, feasibility, progress, and comfort. This creates a training-evaluation mismatch: trajectories close to the logged path may still violate planning rules, while alternative trajectories farther from the demonstration can remain valid and high-scoring. The mismatch is especially limiting for proposal-selection planners, whose performance depends on both candidate-set coverage and scorer ranking quality. We propose CLOVER, a Closed-LOop Value Estimation and Ranking framework for end-to-end driving planning. CLOVER first expands single-trajectory imitation into set-level proposal coverage by constructing evaluator-filtered pseudo-expert trajectories. It then performs conservative closed-loop self-distillation: a trajectory-level scorer is fitted to true evaluator sub-scores on generated proposals, while the generator is refined toward teacher-selected top-k and vector-Pareto proposal targets with stability regularization. We also analyze when an imperfect scorer can improve the generator, showing that scorer-mediated refinement is reliable under local scorer accuracy, conservative updates, and selected-set enrichment.
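The difference between top-k and vector-Pareto proposal targets can be illustrated with a small sketch. This is not the repository's implementation: the sub-score names, the example values, and the mean-based scalarization used for top-k are all assumptions made for illustration. It only shows that ranking by an aggregate score and keeping the non-dominated set can select different proposals.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every sub-score and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of proposals that no other proposal dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def top_k(scores: List[Sequence[float]], k: int) -> List[int]:
    """Indices of the k proposals with the highest mean sub-score.

    The mean is a hypothetical scalarization chosen only for this sketch.
    """
    means = [sum(s) / len(s) for s in scores]
    return sorted(range(len(scores)), key=lambda i: means[i], reverse=True)[:k]


# Hypothetical evaluator sub-scores per proposal: (safety, progress, comfort).
# Higher is better for every component.
proposals = [
    (1.0, 0.2, 0.9),  # very safe and comfortable, little progress
    (0.9, 0.8, 0.7),  # balanced
    (0.5, 0.5, 0.5),  # dominated by the balanced proposal
    (0.9, 0.8, 0.6),  # slightly less comfortable copy of the balanced proposal
]

front = pareto_front(proposals)
best2 = top_k(proposals, 2)
print("Pareto front:", front)
print("Top-2 by mean:", best2)
```

In this toy set the Pareto front keeps the cautious proposal 0 because no other proposal beats it on safety, while top-2 by mean drops it in favor of proposal 3, which is dominated by proposal 1.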

Paper: https://arxiv.org/abs/2605.15120

TODO

  • Release paper
  • Release inference code, scripts, and ckpt
  • Release preview training scripts[1]
  • Release official training code
  • Release pseudo-expert trajectory generation code and NAVSIM-v2 evaluation scripts

Note [1] To facilitate early community discussion and reproduction, we are releasing this preview version of the training scripts first. The preview may still contain unfinished details, deprecated interfaces, or fixed-path assumptions, which will be cleaned up in the formal release. The epoch schedule may also differ slightly from the final paper setup: the current stage-2 preview defaults to 30 alternating cycles (30 x 2 = 60 epochs in total). Empirically, the best checkpoint is often reached after around 20 to 30 epochs. However, iterative alternating training can occasionally be unstable, and a score drop during the first several epochs is normal, so the preview release keeps a longer default schedule.
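As a rough sketch of what this schedule implies, the snippet below enumerates 30 cycles of 2 epochs and picks the best checkpoint by score instead of taking the last one. The phase names, the assumption that each cycle is one scorer epoch followed by one generator epoch, and the score values are hypothetical; the actual behavior is defined by the stage-2 training script.

```python
from typing import List, Tuple


def alternating_schedule(num_cycles: int = 30, epochs_per_cycle: int = 2) -> List[Tuple[int, str]]:
    """Return (cycle, phase) pairs, one per epoch.

    Assumption for illustration: epochs inside a cycle alternate between
    a scorer phase and a generator phase.
    """
    phases = ("scorer", "generator")
    return [
        (cycle, phases[step % len(phases)])
        for cycle in range(num_cycles)
        for step in range(epochs_per_cycle)
    ]


def best_epoch(scores: List[float]) -> int:
    """Index of the highest validation score; ties resolve to the earliest epoch."""
    return max(range(len(scores)), key=lambda i: (scores[i], -i))


schedule = alternating_schedule()
print(len(schedule), schedule[:2])

# Hypothetical validation scores with an early drop followed by recovery.
toy_scores = [0.80, 0.74, 0.78, 0.83, 0.86, 0.85]
print("best epoch:", best_epoch(toy_scores))
```

Selecting by validation score rather than by final epoch is one way to tolerate the early drop and occasional instability described in the note.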

Diversity Visualization

[Figure: Diversity visualization appendix]

Releases

  • Checkpoints and release assets: https://github.com/WilliamXuanYu/CLOVER/releases
  • DINOv2 ViT-S backbone weights: https://huggingface.co/timm/vit_small_patch14_reg4_dinov2.lvd142m/tree/main
  • Stage-1 pseudo-expert trajectory package: https://drive.google.com/drive/folders/1oNTv5Pe-naw_i81rqaKk8KIs0VcUqGZ-?usp=drive_link

Installation

conda create -n clover python=3.8
conda activate clover
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
pip install -e /path/to/nuplan-devkit
pip install -e .

If you prefer to use the vendored nuplan-devkit copy in this repository instead of an external checkout, replace the nuplan-devkit install step above with:

pip install -e ./nuplan-devkit
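After installation, a quick import check can confirm that the environment is complete. The sketch below uses only the Python standard library. The package names `navsim` and `nuplan` are assumptions about what the two editable installs expose, so adjust the list if your checkout differs.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level packages that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed top-level package names; not confirmed by the README.
required = ["torch", "torchvision", "torchaudio", "navsim", "nuplan"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Run it inside the activated `clover` environment; an empty result means every listed package can be located, though it does not verify versions or CUDA availability.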

Documentation

Public Entrypoints

  • Train metric cache: python navsim/planning/script/run_train_metric_caching.py
  • Stage-1 training: bash scripts/run_training_multi_expert.sh
  • Stage-2 training: bash scripts/run_training_stage2_vector_pareto_alternating.sh
  • NAVSIM-v1 evaluation: bash scripts/eval_multi_expert_navtest.sh
