# Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge

This is the official implementation of the [Paper], accepted by ECCV'24.
- Clone this repository and navigate to the QA-Prompts folder:

```bash
git clone https://github.com/WHB139426/QA-Prompts.git
cd QA-Prompts
```
- Install the required packages:

```bash
conda create -n qaprompts python=3.9.16
conda activate qaprompts
pip install -r requirements.txt
```
We provide the annotation files of [A-OKVQA] in `./annotations`. You can also download them directly from [🤗HF].

The images can be downloaded from [COCO2017], and you should organize the data as follows:
```
├── coco2017
│   ├── train2017
│   ├── val2017
│   └── test2017
├── QA-Prompts
│   ├── annotations
│   │   ├── aokvqa_v1p0_train.json
│   │   ├── sub_qa.json
│   │   └── ...
│   ├── datasets
│   ├── models
│   └── ...
```
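Before training, it can save time to verify that the data is laid out as expected. The helper below is not part of this repo; it is a minimal sketch that checks the required paths and loads one annotation file, assuming the layout above and a common `root` directory containing both `coco2017` and `QA-Prompts`.

```python
import json
from pathlib import Path

# Hypothetical helper (not part of the repo): verify the expected directory
# layout exists under `root` before launching training.
REQUIRED = [
    "coco2017/train2017",
    "coco2017/val2017",
    "coco2017/test2017",
    "QA-Prompts/annotations/aokvqa_v1p0_train.json",
    "QA-Prompts/annotations/sub_qa.json",
]

def missing_paths(root):
    """Return the required paths that are absent under `root`."""
    root = Path(root)
    return [p for p in REQUIRED if not (root / p).exists()]

def load_annotations(root):
    """Load the A-OKVQA training annotations as a Python object."""
    path = Path(root) / "QA-Prompts/annotations/aokvqa_v1p0_train.json"
    with open(path) as f:
        return json.load(f)
```

If `missing_paths(root)` returns a non-empty list, those entries are the files or folders still to be downloaded or moved into place.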
You can prepare the pretrained weights of InstructBLIP-Vicuna-7B according to [InstructBLIP].
Since we have changed the structure of the model code, we recommend downloading the pretrained weights of EVA-CLIP, Vicuna-7B-v1.1, and the Q-Former directly from [🤗HF]. The pretrained weights should be organized as follows:
```
├── QA-Prompts
│   └── experiments
│       ├── eva_vit_g.pth
│       ├── qformer_vicuna.pth
│       ├── query_tokens_vicuna.pth
│       ├── vicuna-7b
│       └── llm_proj_vicuna.pth
```
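A quick way to confirm the downloads completed is to check each expected checkpoint by name. This is a hypothetical convenience function (not shipped with the repo), assuming the `experiments` layout above:

```python
from pathlib import Path

# Hypothetical check (not part of the repo): confirm the pretrained weights
# are all in place under QA-Prompts/experiments before training.
WEIGHTS = [
    "eva_vit_g.pth",
    "qformer_vicuna.pth",
    "query_tokens_vicuna.pth",
    "llm_proj_vicuna.pth",
]

def check_weights(experiments_dir):
    """Map each expected checkpoint name to whether it exists on disk."""
    d = Path(experiments_dir)
    status = {name: (d / name).is_file() for name in WEIGHTS}
    # vicuna-7b is a directory of LLM files, not a single .pth checkpoint
    status["vicuna-7b"] = (d / "vicuna-7b").is_dir()
    return status
```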
We recommend using GPUs with more than 24 GB of memory. Otherwise, you may need to extract the vision features in advance, which saves the memory EVA-CLIP would otherwise occupy and avoids out-of-memory (OOM) errors.
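The pre-extraction idea above can be sketched as a simple on-disk feature cache. This is only an illustration of the technique, not the repo's actual extraction code: `encode` stands in for the real EVA-CLIP forward pass, and the file naming is an assumption.

```python
import numpy as np
from pathlib import Path

# Sketch only: run the vision encoder once per image and cache the result,
# so the encoder does not need to stay resident in GPU memory during
# fine-tuning. `encode` is a stand-in for the real EVA-CLIP forward pass.
def cache_features(image_ids, encode, cache_dir):
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    for image_id in image_ids:
        out = cache_dir / f"{image_id}.npy"
        if not out.exists():  # skip images that are already cached
            np.save(out, encode(image_id))

def load_cached(image_id, cache_dir):
    """Read back a cached feature array instead of re-running the encoder."""
    return np.load(Path(cache_dir) / f"{image_id}.npy")
```

During training, the dataloader would then call `load_cached` instead of invoking the vision encoder.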
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=1111 finetune_ans.py
```