New work: Check out When the City Teaches the Car: label-free 3D perception from infrastructure!
A label-efficient framework for 3D detection using expert predictions
ICLR 2025 · DriveX @ ICCV 2025 (Oral)
Jinsu Yoo1, Zhenyang Feng1, Tai-Yu Pan1, Yihong Sun2, Cheng Perng Phoo2, Xiangyu Chen2, Mark Campbell2, Kilian Q. Weinberger2, Bharath Hariharan2, Wei-Lun Chao1
1The Ohio State University, 2Cornell University
Can an autonomous vehicle learn 3D perception by observing predictions from a nearby expert agent, without accessing its raw sensor data or model weights? R&B-POP answers yes, but shows that naively using expert predictions as pseudo-labels yields poor performance due to two fundamental challenges:
- Mislocalization: GPS inaccuracies and timing delays introduce positional error, causing pseudo-labels to be offset from the true object locations.
- Viewpoint mismatch: Objects visible to the expert may be occluded or outside the ego vehicle's field of view, resulting in false positives and missed detections.
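To make the mislocalization issue concrete, here is a minimal sketch of how a small pose error on the expert's side shifts a projected box center in the ego frame. It is 2D only, and the poses, the error magnitudes, and the helper `ref_to_ego` are illustrative assumptions, not values or code from the repository.

```python
import math


def ref_to_ego(x, y, ref_pose, ego_pose):
    """Map a point from the reference car's frame into the ego frame.

    Poses are (x, y, yaw) in a shared world frame; 2D only for brevity.
    """
    rx, ry, ryaw = ref_pose
    ex, ey, eyaw = ego_pose
    # Reference frame -> world frame.
    wx = rx + x * math.cos(ryaw) - y * math.sin(ryaw)
    wy = ry + x * math.sin(ryaw) + y * math.cos(ryaw)
    # World frame -> ego frame (inverse rigid transform).
    dx, dy = wx - ex, wy - ey
    return (dx * math.cos(eyaw) + dy * math.sin(eyaw),
            -dx * math.sin(eyaw) + dy * math.cos(eyaw))


ego_pose = (0.0, 0.0, 0.0)
true_ref_pose = (20.0, 0.0, 0.0)
# Hypothetical GPS and heading error on the reference car's pose.
noisy_ref_pose = (20.5, -0.3, math.radians(1.0))

box_center_in_ref = (30.0, 2.0)  # a detection 30 m ahead of the reference car
true_center = ref_to_ego(*box_center_in_ref, true_ref_pose, ego_pose)
noisy_center = ref_to_ego(*box_center_in_ref, noisy_ref_pose, ego_pose)
offset = math.dist(true_center, noisy_center)
print(f"pseudo-label offset in the ego frame: {offset:.2f} m")
```

Even a sub-meter position error combined with a one-degree heading error moves the box noticeably, and the heading term grows with the object's distance from the reference car.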
R&B-POP addresses these challenges with a two-stage self-training pipeline on V2V4Real, a real-world collaborative driving dataset. A lightweight PointNet-based box ranker, trained with fewer than 1% labeled frames (~40 frames total), refines and filters the noisy pseudo-labels before training the ego detector. A distance-based curriculum further improves training by first focusing on nearby objects (where pseudo-labels are more reliable) before gradually expanding to longer ranges. The pipeline iterates: refined labels train a better detector, which in turn generates cleaner pseudo-labels for the next stage.
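The distance-based curriculum can be pictured with a small sketch. It assumes the curriculum is applied by keeping pseudo-label boxes within a maximum range of the ego car, using the 0–40 m and 0–90 m ranges listed for the two pipeline stages; the repository may instead select whole frames, and `within_range` and the sample boxes are hypothetical.

```python
import math


def within_range(boxes, max_range_m):
    """Keep pseudo-label boxes whose center lies within max_range_m of the ego car."""
    return [b for b in boxes if math.hypot(b["x"], b["y"]) <= max_range_m]


# Hypothetical pseudo-label centers in the ego frame (meters).
pseudo_labels = [
    {"x": 5.0, "y": 1.0},
    {"x": 35.0, "y": -10.0},
    {"x": 60.0, "y": 20.0},
    {"x": 95.0, "y": 0.0},
]

# Near range first, then the full range.
curriculum = [40.0, 90.0]
for stage, max_range in enumerate(curriculum, start=1):
    kept = within_range(pseudo_labels, max_range)
    print(f"stage {stage}: {len(kept)} boxes within {max_range:.0f} m")
```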
conda create -n rnb-pop python=3.8 -y
conda activate rnb-pop
Install PyTorch matching your CUDA version. The codebase was developed with CUDA 11.8:
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 --index-url https://download.pytorch.org/whl/cu118
spconv is required for the voxelization backbone. Install the version matching your CUDA:
pip install spconv-cu118
git clone https://github.com/jinsuyoo/rnb-pop.git
cd rnb-pop
pip install -r requirements.txt
pip install -e .
This makes opencood, ranker, and tools importable from anywhere within the project.
Note: On systems that default to Intel compilers (e.g., OSC), prepend CC=gcc to avoid linker errors with Intel-specific symbols.
# Cython extension for 2D box overlap computation
CC=gcc python opencood/utils/setup.py build_ext --inplace
# CUDA extension for 3D IoU / NMS
cd opencood/utils/iou3d_nms
CC=gcc python setup.py build_ext --inplace
cd ../../..
R&B-POP is evaluated on V2V4Real, a real-world collaborative driving dataset with a Tesla (ego car, car_id=0) and a Honda (reference car, car_id=1) driving within 100m of each other.
- Download the dataset from the V2V4Real website.
- Set data_path in configs/rnb_pop_v2v4real.yaml to your dataset root.
| Model | File | Description |
|---|---|---|
| Ego Car Detector | pretrained_models/ego_detector.pth | R&B-POP trained detector |
| Box Ranker | pretrained_models/ranker.pth | Trained with 2 annotated frames per scenario (~40 frames total) |
| Reference Car Detector | pretrained_models/refcar_detector.pth | PointPillars (32-beam) trained on reference car LiDAR |
Set your dataset root once and reuse it throughout:
DATA_DIR=/path/to/v2v4real # <-- set this to your V2V4Real dataset root
The pipeline consists of the following steps:
[Step 1] Generate ranker training data (skip if using pretrained)
↓
[Step 2] Train box ranker (skip if using pretrained)
↓
[Step 3] Run R&B-POP pipeline (2-stage self-training)
↓
[Step 4] Evaluate
We provide preprocessed initial pseudo-labels (reference car predictions projected into the ego frame, z-adjusted, FP-filtered) as exp/refcar_predictions_preprocessed.tar.gz (~1 MB), included in this repository. After cloning, simply extract:
tar -xzf exp/refcar_predictions_preprocessed.tar.gz -C exp/
The initial_label_path in the pipeline scripts is already set to exp/refcar_predictions_preprocessed.
The pretrained reference car detector checkpoint (pretrained_models/refcar_detector.pth) is also provided in case you want to regenerate the predictions yourself.
Skip this step if using the pretrained ranker (pretrained_models/ranker.pth).
The box ranker uses per-frame above-ground point masks stored under above_ground_ransac/ inside the dataset root. Since the ranker is trained on ego car data only, generate them for the ego car:
python data_preprocessing/generate_ground_plane.py \
--root_dir $DATA_DIR \
--train_split subset2 \
--car_id 0
python ranker/generate_data/generate_ranker_data.py \
--root_dir $DATA_DIR \
--num_annotate_frames 2 \
--num_samples_per_box 1000 \
--save_dir exp/ranker_training_data
This uses only the first 2 annotated frames per scenario (~40 labeled frames total across ~20 scenarios).
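For intuition about the above-ground point masks used in this step, the following is a minimal pure-Python RANSAC sketch: it fits the dominant plane in a point cloud and flags points sufficiently above it. The directory name above_ground_ransac suggests a RANSAC plane fit, but the threshold, iteration count, and function names here are assumptions and do not reproduce data_preprocessing/generate_ground_plane.py.

```python
import random


def fit_plane(p1, p2, p3):
    """Plane (a, b, c, d) with unit normal through three points, or None if collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    a, b, c = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-9:
        return None
    a, b, c = a / norm, b / norm, c / norm
    return a, b, c, -(a * p1[0] + b * p1[1] + c * p1[2])


def above_ground_mask(points, threshold=0.2, iterations=100, seed=0):
    """Return one boolean per point: True if it lies more than threshold above the fitted plane."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, -1
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = sum(abs(a * x + b * y + c * z + d) < threshold for x, y, z in points)
        if inliers > best_inliers:
            best_plane, best_inliers = plane, inliers
    if best_plane is None:
        raise ValueError("could not fit a plane to the given points")
    a, b, c, d = best_plane
    if c < 0:  # orient the normal upward so positive distance means "above"
        a, b, c, d = -a, -b, -c, -d
    return [a * x + b * y + c * z + d > threshold for x, y, z in points]


# Synthetic scene: a nearly flat ground at z = -1.7 m plus a small cluster above it.
rng = random.Random(1)
ground = [(rng.uniform(-20, 20), rng.uniform(-20, 20), -1.7 + rng.uniform(-0.03, 0.03))
          for _ in range(200)]
car = [(5 + rng.uniform(0, 2), 3 + rng.uniform(0, 1), rng.uniform(-1.0, 0.0))
       for _ in range(30)]
mask = above_ground_mask(ground + car)
print(sum(mask), "of", len(mask), "points flagged as above ground")
```

The sketch assumes the ground is the dominant plane in the scan; on slopes or in cluttered scenes a real implementation would need a more careful fit.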
Skip this step if using the pretrained ranker (pretrained_models/ranker.pth).
bash scripts/train_ranker.sh
Or manually:
python ranker/train_ranker.py \
--root_dir $DATA_DIR \
--train_data_dir exp/ranker_training_data \
--num_annotate_frames 2 \
--batch_size 256 \
--epoch 100 \
--save_dir exp/ranker \
--use_offset \
--random_drop_points \
--no_dist
Edit the path variables at the top of the script:
root_dir=$DATA_DIR
ranker_path="pretrained_models/ranker.pth" # pretrained; or exp/ranker/... if trained from scratch
Then run:
SLURM (4 nodes × 2 GPU):
sbatch scripts/run_rnb_pop.sh
Local (multi-GPU, single machine):
bash scripts/run_rnb_pop_local.sh
Set ngpus at the top of the script to match the number of GPUs on your machine.
The pipeline runs two stages automatically:
- Stage 1 (Rank & Build): Refine reference car labels → filter by ranker score → train ego detector on 0–40m frames
- Stage 2 (Self-training): Generate new pseudo-labels with trained detector → refine → filter → train on all frames (0–90m)
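The control flow of the two stages can be sketched as follows. This is only an outline under stated assumptions: `refine`, `score`, `train`, and `predict` are hypothetical callables standing in for the ranker and detector code, and the 0.5 score threshold is a placeholder, not a value taken from the repository.

```python
def run_stage(labels, refine, score, train, min_score, max_range_m):
    """One refine -> filter -> train pass over a set of pseudo-label boxes."""
    refined = [refine(box) for box in labels]
    filtered = [box for box in refined if score(box) >= min_score]
    return train(filtered, max_range_m)


def two_stage_self_training(initial_labels, refine, score, train, predict,
                            min_score=0.5):
    # Stage 1: start from the reference car's projected predictions, near range only.
    detector = run_stage(initial_labels, refine, score, train, min_score, 40.0)
    # Stage 2: relabel with the stage 1 detector, then train on the full range.
    new_labels = predict(detector)
    return run_stage(new_labels, refine, score, train, min_score, 90.0)


# Toy stand-ins so the sketch runs end to end; none of these are repository functions.
toy_refine = lambda box: dict(box, refined=True)
toy_score = lambda box: box["quality"]
toy_train = lambda boxes, max_range_m: {"num_labels": len(boxes), "max_range_m": max_range_m}
toy_predict = lambda detector: [{"quality": 0.9}, {"quality": 0.8}, {"quality": 0.2}]

final_detector = two_stage_self_training(
    [{"quality": 0.7}, {"quality": 0.3}], toy_refine, toy_score, toy_train, toy_predict
)
print(final_detector)
```

Passing the stages in as callables keeps the loop itself trivial, which mirrors how each stage reuses the same refine and filter steps on a different label source.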
Output structure under exp/:
exp/
├── stage_1_1_refined/ # ranker-refined labels
├── stage_1_2_filtered/ # score-filtered labels
├── stage_1_3_trained/ # pseudo-labels from stage 1 detector
├── stage_2_1_refined/
├── stage_2_2_filtered/
└── checkpoints/
    ├── stage_1/ # detector checkpoints from stage 1
    └── stage_2/ # detector checkpoints from stage 2
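After a run, a quick way to confirm that every stage produced its output is to check for the directories listed above. The helper below is a hypothetical convenience, not part of the repository, and it only tests that the directories exist.

```python
from pathlib import Path

# Directory names taken from the output structure above.
EXPECTED_DIRS = [
    "stage_1_1_refined",
    "stage_1_2_filtered",
    "stage_1_3_trained",
    "stage_2_1_refined",
    "stage_2_2_filtered",
    "checkpoints/stage_1",
    "checkpoints/stage_2",
]


def missing_outputs(exp_dir="exp"):
    """Return the expected output directories that do not exist under exp_dir."""
    root = Path(exp_dir)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


missing = missing_outputs("exp")
print("all stage outputs present" if not missing else f"missing: {missing}")
```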
To evaluate the pretrained ego car detector:
python test.py \
--model_dir pretrained_models \
--strict_model_path pretrained_models/ego_detector.pth \
--data_split test
Note: Exact numbers may vary slightly depending on the environment (CUDA version, hardware, etc.), though the differences are not meaningful.
To evaluate a detector trained from scratch via the pipeline:
python test.py \
--model_dir exp/checkpoints \
--strict_model_path exp/checkpoints/stage_2/net_epoch060.pth \
--data_split test
To evaluate pseudo-label quality against ego car GT:
exp/ego_gt_labels.tar.gz (~1.2 MB) is included in the repo. After cloning, simply extract:
tar -xzf exp/ego_gt_labels.tar.gz -C exp/
Then run:
python eval_label_quality.py \
--root_dir $DATA_DIR \
--gt_label_path exp/ego_gt_labels \
--gt_label_idx ego \
--pseudo_label_path exp/stage_1_2_filtered \
--pseudo_label_idx pred
@inproceedings{yoo2025rnbpop,
title={Learning 3D Perception from Others' Predictions},
author={Yoo, Jinsu and Feng, Zhenyang and Pan, Tai-Yu and Sun, Yihong and Phoo, Cheng Perng and Chen, Xiangyu and Campbell, Mark and Weinberger, Kilian Q and Hariharan, Bharath and Chao, Wei-Lun},
booktitle={International Conference on Learning Representations (ICLR)},
year={2025}
}
This codebase builds on OpenCOOD, V2V4Real, and pointnet.pytorch. We thank the authors for their open-source contributions.