pseudoc18/EAZY

Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs (ICCV 2025)

License: MIT Paper Project Page

This repository provides the official PyTorch implementation of the paper:

Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs
Liwei Che, Tony Qingze Liu, Jing Jia, Weiyi Qin, Ruixiang Tang, Vladimir Pavlovic
ICCV 2025

Overview

EAZY is a training-free approach for detecting and mitigating object hallucinations in Large Vision-Language Models (LVLMs). It identifies hallucinatory image tokens and suppresses them during generation, which improves the reliability of LVLMs without additional training data or external knowledge.

Setup

Environment

conda env create -f environment.yml
conda activate eazy
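
After activating the environment, a quick check that the key packages resolve can save a failed run later. The sketch below uses only the standard library. The package list is an assumption: the repository is described as a PyTorch implementation, but the contents of environment.yml are not shown here, so adjust the names as needed.

```python
from importlib.util import find_spec


def check_packages(names):
    """Return a {package_name: is_importable} mapping for top-level packages."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed package list; extend it to match environment.yml.
    for name, ok in check_packages(["torch"]).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```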

Model Modification (Important)

To enable EAZY on LLaVA models, you need to modify the LlavaMetaForCausalLM class in minigpt4/models/llava_arch.py.

  1. Find class LlavaMetaForCausalLM in minigpt4/models/llava_arch.py.
  2. Locate the encode_images() method.
  3. Insert the following code after line 240:
        if zero_out_list is not None:
            # Shift the indices by 34 so that they index into image_features.
            zero_out_list = [idx - 34 for idx in zero_out_list]
            for idx in zero_out_list:
                # Zero out the feature vector of the selected image token.
                image_features[:, idx, :] = 0

Note: You must also ensure that the zero_out_list parameter is passed from the model input arguments down to the encode_images() method.
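
The snippet above does two things: it shifts each index by 34, then zeroes the matching row of image_features. The pure-Python sketch below mirrors that logic on nested lists so the index arithmetic can be checked without loading a model. The helper names, the bounds check, and the toy shapes are illustrative assumptions and not part of the repository; only the offset of 34 comes from the snippet above.

```python
OFFSET = 34  # taken from the snippet above


def to_feature_indices(zero_out_list, num_image_tokens, offset=OFFSET):
    """Shift indices by `offset` and check that they fall inside the image features."""
    shifted = [idx - offset for idx in zero_out_list]
    for original, idx in zip(zero_out_list, shifted):
        if not 0 <= idx < num_image_tokens:
            raise IndexError(
                f"index {original} maps to {idx}, outside [0, {num_image_tokens})"
            )
    return shifted


def zero_out(image_features, zero_out_list=None, offset=OFFSET):
    """Zero selected token rows of a [batch][token][dim] nested list, in place."""
    if zero_out_list is None:
        return image_features  # nothing to suppress
    num_tokens = len(image_features[0])
    for idx in to_feature_indices(zero_out_list, num_tokens, offset):
        for sample in image_features:
            sample[idx] = [0.0] * len(sample[idx])
    return image_features


# Toy input: 1 image, 4 tokens, 2 feature dimensions.
features = [[[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]]
zero_out(features, zero_out_list=[35, 37])  # 35 -> row 1, 37 -> row 3
print(features)
```

In the tensor version, an index smaller than 34 becomes negative after the shift and, under standard tensor indexing, wraps around to the end of the token dimension instead of failing. The bounds check here is an added safeguard against that case.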

Evaluation

The evaluation requires the MSCOCO 2014 dataset. Please download it from here and extract it to your data path.
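
Before launching an evaluation, it can help to confirm that the extracted dataset is where you expect it. The sketch below assumes the standard MSCOCO 2014 layout (a val2014 image folder and annotations/instances_val2014.json). The exact paths the evaluation scripts read are not stated here, so treat these names as assumptions and adjust them to your setup.

```python
from pathlib import Path

# Assumed layout of the extracted archive; adjust to what the scripts expect.
EXPECTED = ("val2014", "annotations/instances_val2014.json")


def missing_coco_paths(data_root, expected=EXPECTED):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_coco_paths("/path/to/coco")  # placeholder path
    print("dataset looks complete" if not missing else f"missing: {missing}")
```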

HallCOCO Dataset

The HallCOCO dataset is included in this repository and can be found at: dataset/hall_coco

CHAIR Evaluation

To evaluate using the CHAIR metric with EAZY (One-Pass):

python eval_script/chair_eval_one_pass.py --model llava-1.5 --k 1 --gpu-id 0 --beam 1
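
For readers unfamiliar with CHAIR, the metric is usually reported in two forms: CHAIR_i, the fraction of mentioned objects that are not in the image, and CHAIR_s, the fraction of captions that contain at least one such object. The sketch below computes both from object sets that have already been extracted. It is a simplified illustration with hypothetical names, not the repository's evaluation code: it counts each object once per caption and skips the caption parsing and synonym mapping that a full CHAIR implementation performs.

```python
def chair_scores(mentioned, ground_truth):
    """Compute (CHAIR_i, CHAIR_s) from per-caption object sets.

    mentioned[n]    : set of objects named in caption n
    ground_truth[n] : set of objects actually present in image n
    """
    total_objects = hallucinated_objects = hallucinated_captions = 0
    for said, present in zip(mentioned, ground_truth):
        wrong = said - present  # objects named but not in the image
        total_objects += len(said)
        hallucinated_objects += len(wrong)
        hallucinated_captions += bool(wrong)
    chair_i = hallucinated_objects / total_objects if total_objects else 0.0
    chair_s = hallucinated_captions / len(mentioned) if mentioned else 0.0
    return chair_i, chair_s


captions = [{"dog", "frisbee"}, {"cat", "sofa", "remote"}]
images = [{"dog", "frisbee", "person"}, {"cat", "sofa"}]
print(chair_scores(captions, images))  # (0.2, 0.5)
```

Lower is better for both scores. In the toy example, one of five mentioned objects ("remote") is hallucinated, and it appears in one of the two captions.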

POPE Evaluation

To evaluate using the POPE benchmark:

python eval_script/pope_eval_eazy_onepass.py --model llava-1.5 --pope-type adversarial --gpu-id 0 --k 3 --beam 1
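
POPE poses yes/no questions about whether an object is present in the image and scores the answers as binary classification. The sketch below computes the commonly reported accuracy, precision, recall, F1, and yes-ratio from lists of answers, treating "yes" as the positive class. It is an illustrative helper with hypothetical names, not the scoring code in eval_script/pope_eval_eazy_onepass.py.

```python
def pope_metrics(predictions, labels):
    """Score yes/no answers; both inputs are lists of "yes"/"no" strings."""
    pairs = list(zip(predictions, labels))
    tp = sum(p == "yes" and l == "yes" for p, l in pairs)
    fp = sum(p == "yes" and l == "no" for p, l in pairs)
    fn = sum(p == "no" and l == "yes" for p, l in pairs)
    tn = sum(p == "no" and l == "no" for p, l in pairs)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "yes_ratio": (tp + fp) / total if total else 0.0,
    }


answers = ["yes", "yes", "no", "no"]
truth = ["yes", "no", "yes", "no"]
print(pope_metrics(answers, truth))
```

The yes-ratio is worth reporting alongside F1: a model that answers "yes" to nearly every question can reach high recall while still hallucinating objects that are absent.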

Hallucination Detection

To run the hallucination detection task:

python eval_script/hall_detection_eazy.py --model llava-1.5 --gpu-id 0

Citation

If you find this work useful for your research, please cite our paper:

@InProceedings{Che_2025_ICCV,
    author    = {Che, Liwei and Liu, Tony Qingze and Jia, Jing and Qin, Weiyi and Tang, Ruixiang and Pavlovic, Vladimir},
    title     = {Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {21635-21644}
}

Acknowledgement

This repository is built upon the OPERA codebase. We thank the authors for their excellent work.
