AllPath

This is the official PyTorch implementation for our NeurIPS 2025 paper:

Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats
> Jiaye Qian¹,²*, Ge Zheng²*, Yuchen Zhu², Sibei Yang¹†
> ¹School of Computer Science and Engineering, Sun Yat-sen University  ²ShanghaiTech University

Figure: AllPath pipeline

arXiv:2511.17254

Abstract

Despite their impressive performance across a wide range of tasks, Large Vision-Language Models (LVLMs) remain prone to hallucination. In this study, we propose a comprehensive intervention framework aligned with the transformer’s causal architecture in LVLMs, integrating the effects of different intervention paths on hallucination. We find that hallucinations in LVLMs do not arise from a single causal path, but rather from the interplay among image-to-input-text, image-to-output-text, and text-to-text pathways. For the first time, we also find that LVLMs rely on different pathways depending on the question–answer alignment format. Building on these insights, we propose simple yet effective methods to identify and intervene on critical hallucination heads within each pathway, tailored to discriminative and generative formats. Experiments across multiple benchmarks demonstrate that our approach consistently reduces hallucinations across diverse alignment types.

Setup

Environment Setup

Our codebase requires Python ≥ 3.9. When running evaluations, each model family depends on a specific version of the transformers library. Since the transformers API changes over time, using a different version may lead to unexpected issues. To avoid this, several modules include version checks to ensure the correct environment is used. The required versions are:

Model      | transformers version
LLaVA 1.5  | 4.37.2
Qwen VL    | 4.32.0

We recommend setting up the environment according to the official instructions provided by each model’s GitHub repository.
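
The version guards in this codebase may be implemented differently, but a minimal sketch of this kind of check, assuming the LLaVA 1.5 pin from the table above, looks like this:

import transformers

# Minimal sketch of a version guard; the actual checks in this repository
# may be stricter or live in different modules.
REQUIRED = "4.37.2"  # LLaVA 1.5 pin from the table above (use 4.32.0 for Qwen VL)
if transformers.__version__ != REQUIRED:
    raise RuntimeError(
        f"Expected transformers=={REQUIRED}, found {transformers.__version__}"
    )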

In addition, please install the following dependencies:

pip install nltk pycocotools

Path Setup

You will need to configure the paths in playground/path_table.py by replacing each path/to/xxx placeholder with the actual locations on your system.
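
The sections below describe the individual entries. As a rough sketch of what a configured path_table.py might look like, assuming a simple dictionary layout (the key names follow the entries referenced below; the real file's structure may differ):

# Illustrative sketch only; the real playground/path_table.py may use a
# different structure (e.g. plain variables instead of a dictionary).
PATH_TABLE = {
    "COCO Path": "path/to/val2014",                       # directory that directly contains the COCO images
    "COCO annotation": "path/to/instances_val2014.json",  # COCO instance annotations
    "MME root": "path/to/MME_Benchmark_release_version/MME_Benchmark",
    "GQA path": "path/to/gqa_images",                     # folder that directly contains the GQA images
}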

COCO

To evaluate CHAIR, you should download the COCO dataset from their official website.

After downloading, please set COCO Path to the val2014/ directory that directly contains the image files, and set COCO annotation to the corresponding instances_val2014.json file.

MME

To evaluate MME, you should download the MME dataset from this link.

After downloading, unzip the archive and set MME root in playground/path_table.py to the MME_Benchmark_release_version/MME_Benchmark/ folder.

GQA

To evaluate GQA, you should download the GQA images from this website.

After downloading, you should unzip this file and set GQA path to the folder that directly contains the GQA images.

MCQ POPE

Our method includes a dataset called MCQ POPE, which can be found in the benchs/mcq_pope/ directory. Files beginning with resampled are the datasets we use for extracting attention heads. These files contain no images that overlap with the final evaluation datasets.

Heads Identification

Our approach consists of two components: (1) heads identification and (2) hallucination mitigation. If you want to proceed directly to evaluation, we have already provided the identified heads in the head_ours/ directory.

File Structure

For each model, the extracted heads on different datasets are stored in head_ours/[model name]/heads-[benchmark name]-{format|image}.jsonl. Here, format indicates T2T heads, and image indicates I2T heads.

Each line in the file follows the format:

[[layer id, head id], I2T/T2T score]

The entries are sorted in ascending order by I2T/T2T score, meaning that heads appearing earlier in the file are more likely to promote hallucinations.
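
As an illustration of how these files can be consumed, the sketch below loads one file and keeps the first k entries; the path and the value of k are hypothetical placeholders, and the actual intervention code in this repository may differ:

import json

# Illustrative only: read an identified-heads file and take the first k entries,
# which (being sorted by ascending I2T/T2T score) are the heads most likely to
# promote hallucinations.
heads_file = "head_ours/llava/heads-[benchmark name]-format.jsonl"  # hypothetical path
k = 16  # hypothetical number of heads to keep

with open(heads_file) as f:
    entries = [json.loads(line) for line in f]

# Each line has the form [[layer id, head id], I2T/T2T score].
top_heads = [(layer, head) for (layer, head), score in entries[:k]]
print(top_heads)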

How to Reproduce

To reproduce the extracted attention heads, you will need to run one of the extraction scripts and specify the model and benchmark. Here’s a structured overview:

  • Scripts

    • inference_t2t_scores.py – extracts T2T heads
    • inference_i2t_scores.py – extracts I2T heads
  • Model (--model)

    • llava – LLaVA v1.5 7B
    • qwenvl – Qwen VL
  • Benchmark (--eval)

    • ResampledPOPE
    • ResampledMCQPOPE
    • ResampledCHAIR

Benchmarks prefixed with Resampled ensure that the dataset used for head extraction does not overlap with the dataset used for final evaluation.

  • Dataset and Split (only for POPE/MCQ POPE)

    • --dataset coco
    • --split can be random, popular, or adversarial

For example, to extract T2T heads from LLaVA on the POPE dataset using the COCO adversarial split:

python inference_t2t_scores.py \
    --model llava \
    --eval ResampledPOPE \
    --dataset coco \
    --split adversarial

If you want to extract heads for the CHAIR benchmark, the --dataset and --split arguments are not needed. For example:

python inference_t2t_scores.py \
    --model llava \
    --eval ResampledCHAIR

Evaluation

POPE / MCQ POPE

To evaluate on the POPE or MCQ POPE benchmarks, you need to specify the following:

  • Model (--model): choose which model to evaluate

    • llava – LLaVA v1.5 7B
    • qwenvl – Qwen VL
  • Method (--method): choose the evaluation method

    • baseline – vanilla model evaluation
    • vcd – Visual Contrastive Decoding
    • icd – Instruction Contrastive Decoding
    • pai – Paying More Attention to Image
    • adhh – Countering Description Contrastive Decoding
    • allpath – our proposed method
  • Benchmark (--eval):

    • pope – POPE benchmark
    • mcqpope – MCQ POPE benchmark
  • Dataset and split (--dataset and --split)

    • --dataset can be coco, aokvqa, or gqa
    • --split can be random, popular, or adversarial
  • Sampling (--sample): enables sampling with temperature=1.0, which is the setting reported in our paper. You can also specify the temperature by passing --temperature <float>

For example, to evaluate LLaVA on the MCQ POPE benchmark using our method and the adversarial split:

python main.py \
    --model llava \
    --method allpath \
    --sample \
    --eval mcqpope \
    --dataset coco \
    --split adversarial

CHAIR

To evaluate on the CHAIR benchmark, use the --fixed True flag to run evaluation on a fixed set of 500 questions instead of randomly sampling them.

For example, to evaluate LLaVA using our method on CHAIR:

python main.py \
    --model llava \
    --method allpath \
    --eval chair \
    --fixed True

MME

Evaluating on the MME benchmark does not require any additional parameters.

For example, to run LLaVA with our method on MME:

python main.py \
    --model llava \
    --method allpath \
    --eval mme \
    --sample

Acknowledgements

Our implementation incorporates or modifies code from the following open-source repositories. We extend our sincere gratitude to the authors of these projects (listed in no particular order):

Citation

If you find our work useful, please cite us as:

@inproceedings{qian2025interveneallpaths,
    title     = {Intervene-All-Paths: Unified Mitigation of {LVLM} Hallucinations across Alignment Formats},
    author    = {Qian, Jiaye and Zheng, Ge and Zhu, Yuchen and Yang, Sibei},
    booktitle = {The Thirty-ninth Annual Conference on Neural Information Processing Systems},
    year      = {2025},
    url       = {https://openreview.net/forum?id=HRBhNqkG03}
}
