This repository hosts the official implementation of MonoPRIO, a model for monocular 3D detection with query-adaptive prior conditioning for robust size estimation under size-depth ambiguity. MonoPRIO builds upon MonoDETR and MonoDGP.
Our core idea is to stabilise ambiguous monocular size estimation with class-aware offline prior banks, query-level routed mixture priors, uncertainty-aware log-space size conditioning, and CAP regularisation in the size pathway.
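As a loose illustration of what a class-aware size prior bank contains (a minimal sketch with hypothetical names and layout; the actual banks are produced offline by `tools/build_priors.py` and may store different features), one can precompute per-class statistics of log object dimensions from training labels and use them to condition size regression:

```python
import numpy as np

def build_size_prior_bank(labels):
    """Per-class mean/std of log object dimensions (h, w, l).

    `labels` is a list of (class_name, (h, w, l)) tuples. Working in
    log-space makes the prior multiplicative, which is a common way to
    tame size-depth ambiguity in monocular regression.
    Hypothetical layout, for illustration only.
    """
    grouped = {}
    for cls, dims in labels:
        grouped.setdefault(cls, []).append(
            np.log(np.asarray(dims, dtype=np.float64)))
    return {cls: {"mu": np.stack(v).mean(axis=0),
                  "sigma": np.stack(v).std(axis=0)}
            for cls, v in grouped.items()}

# Toy labels in KITTI-style (h, w, l) metres.
labels = [
    ("Car", (1.52, 1.63, 3.88)),
    ("Car", (1.48, 1.60, 3.92)),
    ("Pedestrian", (1.76, 0.66, 0.84)),
]
bank = build_size_prior_bank(labels)
print(sorted(bank))             # ['Car', 'Pedestrian']
print(bank["Car"]["mu"].shape)  # (3,)
```

Such statistics could then be persisted with `np.savez`; the `.npz` format the repo actually uses may differ.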
The table below reports per-seed validation results for our unified model (all metrics AP|R40 at the Easy/Moderate/Hard settings, abbreviated E/M/H), together with the median over the five seeds used in the paper. For the corresponding MonoDGP and MonoCLUE baseline results, see the comparison table further below.

| Run | Car AP<sub>BEV</sub> E | M | H | Car AP<sub>3D</sub> E | M | H | Ped. AP<sub>3D</sub> E | M | H | Cyc. AP<sub>3D</sub> E | M | H | Log | Checkpoint |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Seed 444 | 37.7385 | 27.4488 | 24.7231 | 30.4130 | 21.9274 | 18.7024 | 12.4075 | 9.6523 | 7.4288 | 10.2545 | 5.3419 | 5.0021 | log | ckpt |
| Seed 445 | 40.3231 | 28.7343 | 25.1377 | 31.0523 | 21.9635 | 19.5214 | 12.2332 | 9.3608 | 7.1981 | 14.9668 | 7.9881 | 7.0977 | log | ckpt |
| Seed 446 | 38.3903 | 28.5380 | 24.7171 | 30.1636 | 21.7001 | 18.9655 | 10.9028 | 7.9890 | 6.3868 | 12.4020 | 5.9881 | 5.8987 | log | ckpt |
| Seed 447 | 35.8443 | 26.6558 | 23.4343 | 27.4737 | 20.8672 | 18.0712 | 13.1752 | 9.9896 | 7.7053 | 10.9032 | 5.8934 | 5.1463 | log | ckpt |
| Seed 448 | 40.6979 | 28.8151 | 25.0044 | 31.3979 | 21.8560 | 19.2577 | 12.7353 | 9.1760 | 7.3320 | 13.5628 | 6.7914 | 6.2966 | log | ckpt |
| Median of 5 (paper) | 38.390 | 28.538 | 24.723 | 30.413 | 21.856 | 18.965 | 12.408 | 9.361 | 7.332 | 12.402 | 5.988 | 5.899 | - | - |
We also provide test-set checkpoints and prediction files for direct verification on the KITTI 3D object detection benchmark. All metrics are AP|R40 at Easy/Moderate/Hard (E/M/H).

Unified model:

| Model | Car AP<sub>BEV</sub> E | M | H | Car AP<sub>3D</sub> E | M | H | Ped. AP<sub>3D</sub> E | M | H | Cyc. AP<sub>3D</sub> E | M | H | Checkpoint | Predictions |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MonoPRIO | 35.50 | 24.66 | 21.54 | 26.83 | 18.93 | 16.25 | 16.31 | 10.74 | 9.08 | 7.72 | 4.32 | 3.61 | ckpt | pred |
Car-only model:

| Model | Car AP<sub>BEV</sub> E | M | H | Car AP<sub>3D</sub> E | M | H | Checkpoint | Predictions |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MonoPRIO | 35.74 | 26.05 | 22.90 | 28.12 | 19.85 | 16.93 | ckpt | pred |
Test results submitted to the official KITTI Benchmark:
Car category:
All categories:
For fairness and reproducibility, we also release the exact local validation logs used for the 5-seed median comparisons against MonoDGP and MonoCLUE.
MonoCLUE references:
| Method | Car AP<sub>BEV</sub> E | M | H | Car AP<sub>3D</sub> E | M | H | Ped. AP<sub>3D</sub> E | M | H | Cyc. AP<sub>3D</sub> E | M | H |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MonoDGP (Pu et al., 2025) | 37.451 | 26.938 | 24.416 | 29.536 | 21.136 | 18.802 | 12.311 | 9.270 | 7.268 | 9.813 | 4.921 | 4.422 |
| MonoCLUE (Yang et al., 2025) | 38.249 | 28.119 | 24.656 | 29.344 | 21.416 | 18.495 | 11.658 | 8.579 | 7.043 | 11.180 | 5.741 | 5.259 |
| MonoPRIO | 38.390 | 28.538 | 24.723 | 30.413 | 21.856 | 18.965 | 12.408 | 9.361 | 7.332 | 12.402 | 5.988 | 5.899 |
MonoDGP seed logs:
MonoCLUE seed logs:
1. Clone this project and create a conda environment:

   ```bash
   git clone https://github.com/bigggs/MonoPRIO.git
   cd MonoPRIO
   conda create -n monoprio python=3.10 -y
   conda activate monoprio
   ```

2. Install PyTorch and torchvision matching your CUDA version:

   ```bash
   pip install torch==2.10.0 torchvision==0.25.0 --index-url https://download.pytorch.org/whl/cu130
   ```

3. Install the requirements and compile the deformable attention ops:

   ```bash
   pip install -r requirements.txt
   cd lib/models/monoprio/ops/
   bash make.sh
   cd ../../../..
   ```

4. Download the KITTI dataset and arrange the directory structure as follows:

   ```
   MonoPRIO/
   ├── ...
   └── data/kitti/
       ├── ImageSets/
       ├── training/
       │   ├── image_2
       │   ├── label_2
       │   └── calib
       └── testing/
           ├── image_2
           └── calib
   ```
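If you are preparing the dataset folder from scratch, the skeleton above can be created from the repository root before copying or symlinking the KITTI files in (a convenience sketch, not a script shipped with the repo):

```bash
# Create the expected KITTI directory skeleton (run from MonoPRIO/).
mkdir -p data/kitti/ImageSets
mkdir -p data/kitti/training/image_2 data/kitti/training/label_2 data/kitti/training/calib
mkdir -p data/kitti/testing/image_2 data/kitti/testing/calib
```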
5. Prepare the prior banks.

   Option A: download and use our priors.

   Place the downloaded files under `MonoPRIO/priors/`, then set `model/prior_path` in your config.

   | Setup | Bank | Link |
   | --- | --- | --- |
   | Validation | Unified | Download |
   | Validation | Car-only | Download |
   | Test | Unified | Download |
   | Test | Car-only | Download |

   Option B: generate priors locally from the KITTI labels/images.
   Install CLIP:

   ```bash
   pip install git+https://github.com/openai/CLIP.git
   ```

   Generate priors for the split given in the config (defaults to `dataset.train_split` in `configs/monoprio.yaml`):

   ```bash
   python tools/build_priors.py --config configs/monoprio.yaml --out-dir priors
   ```

   To split a unified bank into per-class banks (`ped`/`car`/`cyclist`), e.g. if you only want to train on the Car class:

   ```bash
   python tools/split_banks.py --input priors/priors_unified.npz --out-dir priors/individual_banks
   ```
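What the split step amounts to can be sketched as follows, under the assumption (hypothetical, for illustration) that the unified `.npz` stores per-class arrays under class-prefixed keys such as `car_mu`; the real key layout is whatever `tools/build_priors.py` writes:

```python
import os
import tempfile

import numpy as np

def split_bank(unified_path, out_dir):
    """Split a unified prior bank into one .npz file per class.

    Assumes keys of the form '<class>_<field>', e.g. 'car_mu' ends up
    in priors_car.npz as array 'mu'. Hypothetical layout only.
    """
    data = np.load(unified_path)
    classes = sorted({k.split("_", 1)[0] for k in data.files})
    for cls in classes:
        fields = {k.split("_", 1)[1]: data[k]
                  for k in data.files if k.startswith(cls + "_")}
        np.savez(os.path.join(out_dir, f"priors_{cls}.npz"), **fields)
    return classes

# Demo on a dummy unified bank.
with tempfile.TemporaryDirectory() as d:
    unified = os.path.join(d, "priors_unified.npz")
    np.savez(unified, car_mu=np.zeros(3), car_sigma=np.ones(3),
             ped_mu=np.zeros(3), ped_sigma=np.ones(3))
    classes = split_bank(unified, d)
    car_keys = sorted(np.load(os.path.join(d, "priors_car.npz")).files)

print(classes)   # ['car', 'ped']
print(car_keys)  # ['mu', 'sigma']
```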
By default, training uses:

```yaml
model:
  prior_path: 'priors/priors_unified.npz'
```

Update this path in your config if you want to use a different bank file.
You can modify the model and training settings in `configs/monoprio.yaml` and select the GPU in `train.sh`:

```bash
bash train.sh configs/monoprio.yaml > logs/monoprio.log
```

The best checkpoint is evaluated by default. You can change it via `tester/checkpoint` in `configs/monoprio.yaml`:

```bash
bash test.sh configs/monoprio.yaml
```

You can test the inference time on your own device:

```bash
python tools/test_runtime.py
```

If you find our work useful in your research, please consider giving us a star and citing:
```bibtex
@misc{davies2026monoprioadaptivepriorconditioning,
      title={MonoPRIO: Adaptive Prior Conditioning for Unified Monocular 3D Object Detection},
      author={Leon Davies and Qinggang Meng and Mohamad Saada and Baihua Li and Simon Sølvsten},
      year={2026},
      eprint={2605.14781},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.14781},
}
```

This repo benefits from the excellent work of MonoDETR and MonoDGP. If you find this work useful, please also consider checking out and citing their work.


