This repository contains the implementation of the paper:
Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases
Yanze Li*, Wenhua Zhang*, Kai Chen*, Yanxin Liu, Pengxiang Li, Ruiyuan Gao, Lanqing Hong, Meng Tian, Xinhai Zhao, Zhenguo Li, Dit-Yan Yeung, Huchuan Lu, Xu Jia†
*Equal Contribution †Corresponding Author
The instructions for downloading CODA-LM are as follows:
- Download the image files following the official CODA instructions here.
- Download the CODA-LM annotation files and decompress them in the same root directory.
| Split | Size | Image Source | Download |
|---|---|---|---|
| Train | 4884 | CODA2022 val | HF Hub |
| Val | 4384 | CODA2022 test | HF Hub |
| Test | 500 | CODA2022 test | HF Hub |
| Mini | 50 | CODA2022 test | HF Hub |
Note that:
- Images of CODA-LM train set come from CODA2022 val set, while images of CODA-LM val and test sets come from CODA2022 test set.
- CODA-LM mini set is a 50-image subset of CODA-LM val set for demonstration.
After decompression, the data is organized as follows:

```
├── val                 -- CODA2022 val (we only use images)
│   └── images
│       └── *_*.jpg
├── test                -- CODA2022 test (we only use images)
│   └── images
│       └── *_*.jpg
└── CODA-LM
    ├── Train           -- CODA-LM train (we use 4884 images from CODA2022 val)
    │   └── val_*.json
    ├── Val             -- CODA-LM val (we use 4384 images from CODA2022 test)
    │   └── test_*.json
    ├── Test            -- CODA-LM test (we use 500 images from CODA2022 test)
    │   └── test_*.json
    └── Mini            -- CODA-LM mini (a 50-image subset of CODA-LM val)
        └── test_*.json
```
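Because the annotation filenames encode the CODA2022 split they come from (`val_*.json` for Train, `test_*.json` for Val/Test/Mini), the source image path can be recovered directly from the filename. Below is a minimal sketch of this mapping; the helper name is ours, and it assumes the annotation and image files share the same stem (e.g. `val_0001.json` pairing with `val/images/val_0001.jpg`), consistent with the patterns in the tree above.

```python
from pathlib import Path

def image_path_for_annotation(data_root: str, ann_file: str) -> Path:
    """Map a CODA-LM annotation filename to its CODA2022 image.

    Assumes the annotation and the image share the same stem, i.e.
    'val_0001.json' pairs with 'val/images/val_0001.jpg'.
    """
    stem = Path(ann_file).stem          # e.g. "val_0001" (hypothetical name)
    coda_split = stem.split("_", 1)[0]  # "val" or "test"
    return Path(data_root) / coda_split / "images" / f"{stem}.jpg"

# Train annotations (val_*.json) resolve into CODA2022 val images:
print(image_path_for_annotation("CODA", "val_0001.json"))
# CODA/val/images/val_0001.jpg
```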
The annotation files contain question-answering pairs for all three tasks, in the following format:

```
{
    "general_perception": {
        "vehicles": [
            {
                "description": <str>,
                "explanation": <str>
            },
            ...
        ],
        "vulnerable_road_users": [...],
        "traffic signs": [...],
        "traffic lights": [...],
        "traffic cones": [...],
        "barriers": [...],
        "other objects": [...]
    },
    "region_perception": {
        "1": {
            "description and explanation": <str>,
            "box": <list of float>,
            "category_name": <str>
        },
        "2": {...},
        "3": {...}
    },
    "driving_suggestion": <str>
}
```
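For a quick look at a single annotation file, the standard library is enough. The sketch below prints a summary of the three tasks; the filename is a hypothetical example matching the `test_*.json` pattern of the Mini split.

```python
import json

# Hypothetical example file from the Mini split (pattern: test_*.json).
with open("CODA-LM/Mini/test_0001.json") as f:
    ann = json.load(f)

# Task 1: general perception, grouped into seven object categories.
for category, objects in ann["general_perception"].items():
    print(f"{category}: {len(objects)} object(s)")

# Task 2: region perception, keyed by region index, with a box per region.
for region_id, region in ann["region_perception"].items():
    print(region_id, region["category_name"], region["box"])

# Task 3: a free-form driving suggestion.
print(ann["driving_suggestion"])
```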
The instructions for checking CODA-LM (mini set) examples are as follows:
- Install the dependencies, including torch and numpy.
- Run the following command:

```bash
python codalm_dataloader.py --data_root $DATA_ROOT/CODA/CODA-LM --version Mini --batch_size 1 --num_workers 8
```
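For reference, a minimal dataset over the annotation JSONs might look like the sketch below. This is our own illustration using `torch.utils.data`, not the actual implementation in `codalm_dataloader.py`; the class name and the identity collate function are ours.

```python
import json
from pathlib import Path

from torch.utils.data import DataLoader, Dataset

class CodaLMAnnotations(Dataset):
    """Minimal sketch: yields raw CODA-LM annotation dicts for one split."""

    def __init__(self, data_root, version="Mini"):
        # data_root points at .../CODA-LM; version is Train/Val/Test/Mini.
        self.files = sorted(Path(data_root, version).glob("*.json"))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        with open(self.files[idx]) as f:
            return json.load(f)

def identity_collate(batch):
    # Keep the nested annotation dicts as-is instead of tensor collation.
    return batch

loader = DataLoader(CodaLMAnnotations("CODA/CODA-LM", "Mini"),
                    batch_size=1, collate_fn=identity_collate)
for batch in loader:
    print(batch[0]["driving_suggestion"])
    break
```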
```bibtex
@article{li2024automated,
  title={Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases},
  author={Li, Yanze and Zhang, Wenhua and Chen, Kai and Liu, Yanxin and Li, Pengxiang and Gao, Ruiyuan and Hong, Lanqing and Tian, Meng and Zhao, Xinhai and Li, Zhenguo and others},
  journal={arXiv preprint arXiv:2404.10595},
  year={2024}
}
```