This repository contains the implementation of the paper:
Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases
Yanze Li*, Wenhua Zhang*, Kai Chen*, Yanxin Liu, Pengxiang Li, Ruiyuan Gao, Lanqing Hong, Meng Tian, Xinhai Zhao, Zhenguo Li, Dit-Yan Yeung, Huchuan Lu, Xu Jia†
*Equal Contribution †Corresponding Author
The instructions for downloading CODA-LM are as follows:
- Download the image files following the official CODA instructions here.
- Download the CODA-LM annotation files and decompress them in the same root directory.
| Split | Size | Image Source | Download |
|---|---|---|---|
| Train | 4884 | CODA2022 val | HF Hub |
| Val | 4384 | CODA2022 test | HF Hub |
| Test | 500 | CODA2022 test | HF Hub |
| Mini | 50 | CODA2022 test | HF Hub |
Note that:
- Images of CODA-LM train set come from CODA2022 val set, while images of CODA-LM val and test sets come from CODA2022 test set.
- CODA-LM mini set is a 50-image subset of CODA-LM val set for demonstration.
After decompression, the data is organized as follows:

```
├── val                 -- CODA2022 val (we only use images)
│   └── images
│       └── *_*.jpg
├── test                -- CODA2022 test (we only use images)
│   └── images
│       └── *_*.jpg
└── CODA-LM
    ├── Train           -- CODA-LM train (we use 4884 images from CODA2022 val)
    │   └── val_*.json
    ├── Val             -- CODA-LM val (we use 4384 images from CODA2022 test)
    │   └── test_*.json
    ├── Test            -- CODA-LM test (we use 500 images from CODA2022 test)
    │   └── test_*.json
    └── Mini            -- CODA-LM mini (a 50-image subset of CODA-LM val)
        └── test_*.json
```
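Because the annotation filenames encode the CODA2022 split they come from (`val_*.json` for Train, `test_*.json` for Val/Test/Mini), the source image path can be recovered directly from the filename. Below is a minimal sketch of this mapping; the helper name is ours, and it assumes the annotation and image files share the same stem (e.g. `val_0001.json` pairing with `val/images/val_0001.jpg`), consistent with the patterns in the tree above.

```python
from pathlib import Path

def image_path_for_annotation(data_root: str, ann_file: str) -> Path:
    """Map a CODA-LM annotation filename to its CODA2022 image.

    Assumes the annotation and the image share the same stem, i.e.
    'val_0001.json' pairs with 'val/images/val_0001.jpg'.
    """
    stem = Path(ann_file).stem          # e.g. "val_0001" (hypothetical name)
    coda_split = stem.split("_", 1)[0]  # "val" or "test"
    return Path(data_root) / coda_split / "images" / f"{stem}.jpg"

# Train annotations (val_*.json) resolve into CODA2022 val images:
print(image_path_for_annotation("CODA", "val_0001.json"))
# CODA/val/images/val_0001.jpg
```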
The annotation files contain question-answering pairs for all three tasks, in the following format:

```
{
    "general_perception": {
        "vehicles": [
            {
                "description": <str>,
                "explanation": <str>
            },
            ...
        ],
        "vulnerable_road_users": [...],
        "traffic signs": [...],
        "traffic lights": [...],
        "traffic cones": [...],
        "barriers": [...],
        "other objects": [...]
    },
    "region_perception": {
        "1": {
            "description and explanation": <str>,
            "box": <list of float>,
            "category_name": <str>
        },
        "2": {...},
        "3": {...}
    },
    "driving_suggestion": <str>
}
```
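For a quick look at a single annotation file, the standard library is enough. The sketch below prints a summary of the three tasks; the filename is a hypothetical example matching the `test_*.json` pattern of the Mini split.

```python
import json

# Hypothetical example file from the Mini split (pattern: test_*.json).
with open("CODA-LM/Mini/test_0001.json") as f:
    ann = json.load(f)

# Task 1: general perception, grouped into seven object categories.
for category, objects in ann["general_perception"].items():
    print(f"{category}: {len(objects)} object(s)")

# Task 2: region perception, keyed by region index, with a box per region.
for region_id, region in ann["region_perception"].items():
    print(region_id, region["category_name"], region["box"])

# Task 3: a free-form driving suggestion.
print(ann["driving_suggestion"])
```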
The instructions for checking CODA-LM (mini set) examples are as follows:
- Install the dependencies, including torch and numpy.
- Run the following command:

```bash
python codalm_dataloader.py --data_root $DATA_ROOT/CODA/CODA-LM --version Mini --batch_size 1 --num_workers 8
```
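For reference, a minimal dataset over the annotation JSONs might look like the sketch below. This is our own illustration using `torch.utils.data`, not the actual implementation in `codalm_dataloader.py`; the class name and the identity collate function are ours.

```python
import json
from pathlib import Path

from torch.utils.data import DataLoader, Dataset

class CodaLMAnnotations(Dataset):
    """Minimal sketch: yields raw CODA-LM annotation dicts for one split."""

    def __init__(self, data_root, version="Mini"):
        # data_root points at .../CODA-LM; version is Train/Val/Test/Mini.
        self.files = sorted(Path(data_root, version).glob("*.json"))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        with open(self.files[idx]) as f:
            return json.load(f)

def identity_collate(batch):
    # Keep the nested annotation dicts as-is instead of tensor collation.
    return batch

loader = DataLoader(CodaLMAnnotations("CODA/CODA-LM", "Mini"),
                    batch_size=1, collate_fn=identity_collate)
for batch in loader:
    print(batch[0]["driving_suggestion"])
    break
```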
```bibtex
@article{li2024automated,
  title={Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases},
  author={Li, Yanze and Zhang, Wenhua and Chen, Kai and Liu, Yanxin and Li, Pengxiang and Gao, Ruiyuan and Hong, Lanqing and Tian, Meng and Zhao, Xinhai and Li, Zhenguo and others},
  journal={arXiv preprint arXiv:2404.10595},
  year={2024}
}
```