Under Construction ...


Tell Model Where to Look: Mitigating Hallucinations in MLLMs by Vision-Guided Attention

We propose Vision-Guided Attention (VGA), a method that mitigates hallucinations in MLLMs by using visual grounding to steer the model's visual attention toward relevant image regions.
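
To make the idea concrete, below is a minimal sketch of one way grounding could steer attention: boost the attention mass on image tokens that fall inside grounded regions, then renormalize. This is an illustration only, not the paper's implementation; the function name, the multiplicative bias, and the alpha hyperparameter are all assumptions.

import torch

def bias_attention_to_grounded_tokens(attn, grounded_mask, alpha=2.0):
    # attn: (num_queries, num_keys) row-normalized attention weights.
    # grounded_mask: (num_keys,) bool, True for image tokens inside grounded regions.
    # alpha: multiplicative boost for grounded tokens (assumed hyperparameter).
    scale = torch.where(grounded_mask,
                        torch.full_like(attn[0], alpha),
                        torch.ones_like(attn[0]))
    biased = attn * scale                              # boost grounded image tokens
    return biased / biased.sum(dim=-1, keepdim=True)   # renormalize each row

# Toy usage: 2 queries over 5 keys, with keys 1 and 2 grounded.
attn = torch.softmax(torch.randn(2, 5), dim=-1)
mask = torch.tensor([False, True, True, False, False])
print(bias_attention_to_grounded_tokens(attn, mask))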


Setup

Environment

conda create -n vga -y python=3.10
conda activate vga

pip install -r requirements.txt

Note: The dependencies follow LLaVA-v1.5. For LLaVA-NeXT and Qwen2.5-VL-Instruct, you can set up the environment by following the instructions in their official repositories.

Datasets

All benchmarks need to be processed into structurally consistent JSON files.

Sample entries can be found in data/samples.json.
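
To inspect the expected structure, you can print one entry from the sample file; this snippet assumes the file holds a list of entries, and the schema is defined by the file itself, not by this sketch.

import json

with open("data/samples.json") as f:
    samples = json.load(f)

# Print one entry to see the expected fields (assuming a list of entries).
print(json.dumps(samples[0], indent=2))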

Evaluate VGA

Quick start

We provide a shell script, scripts/all.sh, that runs the benchmarks end-to-end; see the usage sketch below.
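
Assuming the script is run from the repository root and reads its model and benchmark configuration internally (an assumption; check the script header for any required arguments or paths):

bash scripts/all.sh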
