This repository contains the official PyTorch implementation of the ICLR 2026 paper "Hallucination-aware Intermediate Representation Edit in Large Vision-Language Models".
```shell
conda create -n HIRE python=3.10
conda activate HIRE
git clone https://github.com/ASGO-MM/HIRE
cd HIRE
pip install -r requirements.txt
```

- To train the editor, please download and extract the images and annotations from this link.
- To train the router, please download and extract the MSCOCO 2014 dataset from this link.
Pre-trained model checkpoints
- LLaVA-1.5: Download LLaVA-1.5 merged 7B
First, extract the positive and negative hidden states.
```shell
bash train_hire/scripts/extract_hidden_states.sh
```

You need to specify `model_path`, `data_path`, and `hidden_states_path`.
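For reference, a hypothetical sketch of the variables to set inside `extract_hidden_states.sh` — all paths below are placeholders, not values shipped with the repo:

```shell
# Placeholder paths (assumptions) -- adjust to your local setup.
model_path=/path/to/llava-v1.5-7b           # merged LLaVA-1.5 7B checkpoint
data_path=/path/to/editor_annotations      # images/annotations downloaded above
hidden_states_path=/path/to/hidden_states  # where extracted features are written
```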
Then, train the editor.
```shell
bash train_hire/scripts/train_hire_editor.sh
```

You need to specify `model_path`, `data_path`, and `hidden_states_path`.
Finally, train the router.
```shell
bash train_hire/scripts/train_hire_router.sh
```

You need to specify `hidden_states_path`, `checkpoint_path`, and `direction_save_path`.
```shell
bash train_hire/scripts/generate_caption.sh
```

You need to specify `editor-model-path`, `model-path`, `image-folder`, `anno-folder`, and `chair-path`.
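Taken together, the steps above can be chained into a single driver script — a minimal sketch that assumes the per-script variables (`model_path`, `data_path`, etc.) have already been filled in inside each script:

```shell
#!/usr/bin/env bash
# Sketch of the full HIRE pipeline (assumption: each script below has
# already been edited to set its required paths).
set -euo pipefail

bash train_hire/scripts/extract_hidden_states.sh  # 1. extract positive/negative hidden states
bash train_hire/scripts/train_hire_editor.sh      # 2. train the editor
bash train_hire/scripts/train_hire_router.sh      # 3. train the router
bash train_hire/scripts/generate_caption.sh       # 4. generate captions for evaluation
```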
This codebase is based on LLaVA, TruthX, and CHAIR. Many thanks to the authors for generously sharing their code!
