VLLaVO: Mitigating Visual Gap through LLMs

This is the official code of VLLaVO.

Overview of VLLaVO

First, we extract descriptions of images with VLMs (CLIP and BLIP); then we finetune an LLM (LLaMA) on these descriptions. The finetuned LLM can then be used for classification.
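
As a rough sketch of the description-extraction idea (not the repository's descriptions_extractor.py), the step might look like the following. The checkpoints Salesforce/blip-image-captioning-base and openai/clip-vit-base-patch32 and the caption-plus-tags output format are assumptions:

```python
# Sketch only: caption an image with BLIP and rank candidate tags with CLIP.
# Checkpoints and output format are assumptions, not the repository's code.
import torch
from PIL import Image
from transformers import (BlipProcessor, BlipForConditionalGeneration,
                          CLIPProcessor, CLIPModel)

blip_proc = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
blip = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")
clip_proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")

def describe(image_path, candidate_tags):
    image = Image.open(image_path).convert("RGB")

    # BLIP: free-form caption of the image.
    inputs = blip_proc(images=image, return_tensors="pt")
    caption = blip_proc.decode(blip.generate(**inputs, max_new_tokens=30)[0],
                               skip_special_tokens=True)

    # CLIP: rank candidate tags by image-text similarity.
    inputs = clip_proc(text=candidate_tags, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        probs = clip(**inputs).logits_per_image.softmax(dim=-1)[0]
    top = [candidate_tags[i] for i in probs.topk(3).indices.tolist()]

    return f"Caption: {caption}. Tags: {', '.join(top)}."
```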

Prepare models

All models used in this work are listed below:

CLIP (for extracting image descriptions)
BLIP (for extracting image descriptions)
LLaMA-2 (llama2, the LLM that is finetuned)
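
The checkpoints can be fetched ahead of time; this is a minimal sketch assuming the Hugging Face Hub repo IDs below, which may differ from the exact variants used in the paper:

```python
# Sketch: fetch the assumed Hugging Face checkpoints ahead of time.
# Repo IDs are assumptions; LLaMA-2 is gated and additionally requires
# accepting Meta's license on the Hub and logging in (huggingface-cli login).
from huggingface_hub import snapshot_download

for repo_id in [
    "openai/clip-vit-base-patch32",           # CLIP (assumed variant)
    "Salesforce/blip-image-captioning-base",  # BLIP (assumed variant)
    "meta-llama/Llama-2-7b-hf",               # LLaMA-2 (assumed variant)
]:
    snapshot_download(repo_id=repo_id)
```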

Prepare dataset

Directly download the existing dataset

We provide the dataset used in our paper at the following link: dataset link

Description Extraction

If you want to extract the descriptions and construct the dataset yourself, the following command can be used.

CUDA_VISIBLE_DEVICES=1 python descriptions_extractor.py -s dataset/office_home/image_list/Product.txt --save_path ../datasets/Office_home --base_path dataset/office_home/
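
Here, -s points at an image-list file, --base_path is the image root, and --save_path is the output directory. For illustration only, an instruction-style record built from an extracted description might look like the sketch below; the field names and prompt template are assumptions, not the repository's confirmed format:

```python
# Illustration only: an instruction-tuning record built from an extracted
# description. Field names and the prompt template are assumptions.
import json

record = {
    "instruction": "Classify the image based on its textual description.",
    "input": "Caption: a black office chair on a white background. "
             "Tags: chair, furniture, office.",
    "output": "Chair",
}
with open("office_home_product.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```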

Finetune

See DG_llama.sh in ./script/bash_command for finetuning the llama2 model.
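
As a minimal sketch of what such a finetuning run could look like (the use of LoRA via the peft library, the hyperparameters, and the dataset path are all assumptions; DG_llama.sh is the authoritative recipe):

```python
# Sketch: finetune LLaMA-2 on description records with LoRA (peft).
# Hyperparameters, dataset path, and LoRA itself are assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

def to_text(ex):
    # Flatten one record into a single training string.
    return {"text": f"{ex['instruction']}\n{ex['input']}\n{ex['output']}"}

ds = (load_dataset("json", data_files="office_home_product.jsonl")["train"]
      .map(to_text)
      .map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
           remove_columns=["instruction", "input", "output", "text"]))

Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=4,
                           num_train_epochs=3, learning_rate=2e-4),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```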

Evaluate

See classification_llama.sh for evaluating the finetuned llama2 model.
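
Conceptually, evaluation prompts the finetuned LLM with an image's description and reads the predicted class from the generation. A minimal sketch, assuming the adapter directory, prompt template, and greedy decoding (none of which are confirmed by the repository):

```python
# Sketch: classify via the finetuned model by generating from a
# description prompt and matching the output to a class name.
# Adapter path, prompt template, and decoding settings are assumptions.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = PeftModel.from_pretrained(
    AutoModelForCausalLM.from_pretrained(model_id), "out")  # assumed LoRA dir
model.eval()

def classify(description, class_names):
    prompt = f"Classify the image based on its textual description.\n{description}\n"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=10, do_sample=False)
    text = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
    # Fall back to the first class whose name appears in the generation.
    return next((c for c in class_names if c.lower() in text.lower()),
                class_names[0])
```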
