Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs (ICCV 2025)
This repository provides the official PyTorch implementation of the paper:
Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs
Liwei Che, Tony Qingze Liu, Jing Jia, Weiyi Qin, Ruixiang Tang, Vladimir Pavlovic
ICCV 2025
EAZY is a novel training-free approach designed to detect and mitigate object hallucinations in Large Vision-Language Models (LVLMs). By identifying hallucinatory image tokens and suppressing them during the generation process, EAZY significantly improves the reliability of LVLMs without the need for additional training data or external knowledge.
conda env create -f environment.yml
conda activate eazy

To enable EAZY on LLaVA models, you need to modify the LlavaMetaForCausalLM class in minigpt4/models/llava_arch.py.
- Find class LlavaMetaForCausalLM in minigpt4/models/llava_arch.py.
- Locate the def encode_images() function.
- Insert the following code after line 240:
if zero_out_list is not None:
    # Shift each index by 34 so that it addresses a position in image_features
    zero_out_list = [idx - 34 for idx in zero_out_list]
    for idx in zero_out_list:
        # Zero out the feature vector of this image token
        image_features[:, idx, :] = 0

- Note: You must also ensure that the zero_out_list parameter is passed from the model input arguments down to the encode_images function.
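
The index handling in the snippet above can be sketched in isolation. This is a minimal, dependency-free illustration and not the repository's code: the helper names to_feature_indices and zero_out are hypothetical, nested Python lists stand in for the image_features tensor, and the default of 576 image tokens assumes LLaVA-1.5 with a 336px CLIP ViT-L/14 encoder (24 x 24 patches). The sketch also drops indices that fall outside the image-token range after shifting, because a negative index would otherwise count from the end of the sequence and silently zero the wrong token.

```python
from typing import Iterable, List

# Assumptions for illustration only; adjust to the model actually used.
IMAGE_TOKEN_OFFSET = 34   # the offset subtracted in the snippet above
NUM_IMAGE_TOKENS = 576    # assumed: 24 x 24 patches for LLaVA-1.5


def to_feature_indices(zero_out_list: Iterable[int],
                       offset: int = IMAGE_TOKEN_OFFSET,
                       num_tokens: int = NUM_IMAGE_TOKENS) -> List[int]:
    """Shift indices by `offset` and keep only those inside [0, num_tokens)."""
    shifted = (idx - offset for idx in zero_out_list)
    return [i for i in shifted if 0 <= i < num_tokens]


def zero_out(features: List[List[float]], indices: Iterable[int]) -> None:
    """Zero the feature vectors at the given token positions, in place."""
    for i in indices:
        features[i] = [0.0] * len(features[i])


# Toy example: 4 image tokens with 2 feature dimensions each.
features = [[1.0, 1.0] for _ in range(4)]
indices = to_feature_indices([10, 35, 36, 700], num_tokens=4)
zero_out(features, indices)
print(indices)
print(features)
```

In the toy example, 10 and 700 are discarded because they map outside the four image tokens, while 35 and 36 map to positions 1 and 2. Whether the repository should filter such indices or raise an error instead is a design choice left open here.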
The evaluation requires the MSCOCO 2014 dataset. Please download it from here and extract it to your data path.
The HallCOCO dataset is included in this repository and can be found at:
dataset/hall_coco
To evaluate using the CHAIR metric with EAZY (One-Pass):
python eval_script/chair_eval_one_pass.py --model llava-1.5 --k 1 --gpu-id 0 --beam 1

To evaluate using the POPE benchmark:
python eval_script/pope_eval_eazy_onepass.py --model llava-1.5 --pope-type adversarial --gpu-id 0 --k 3 --beam 1

To run the hallucination detection task:
python eval_script/hall_detection_eazy.py --model llava-1.5 --gpu-id 0

If you find this work useful for your research, please cite our paper:
@InProceedings{Che_2025_ICCV,
author = {Che, Liwei and Liu, Tony Qingze and Jia, Jing and Qin, Weiyi and Tang, Ruixiang and Pavlovic, Vladimir},
title = {Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2025},
pages = {21635-21644}
}

This repository is built upon the OPERA codebase. We thank the authors for their excellent work.