
Overcoming Shortcut Learning in Vision-Language Models for Robust Out-of-Distribution Detection

CVPR 2025 Project page Kaggle Dataset

Zhuo Xu1, Xiang Xiang1,2, Yifan Liang1

1 Huazhong University of Science and Technology, China

2 Peng Cheng National Laboratory, China

The official code repository for "Overcoming Shortcut Problem in VLM for Robust Out-of-Distribution Detection".

🚀 News

[05/2025] 🎉 We have released the code and checkpoints for training and evaluation.

[04/2025] 🎉 Our paper has been selected as a Highlight paper.

[03/2025] 🎉 We have released the proposed ImageNet-Bg along with the code for OOD generation.

[02/2025] 🎉 Our paper has been accepted by CVPR 2025.

👀 Table of Contents

  1. The proposed ImageNet-Bg
  2. Dataset Preparation
  3. Train and Evaluate OSPCoOp
  4. Visualization
  5. Acknowledgements
  6. Citation

✨ The proposed ImageNet-Bg

To further test the robustness of models against background interference, we propose ImageNet-Bg, a background-interference test set of 48,285 images built from the ImageNet validation set. Every image in the dataset is generated by removing the ID-relevant regions from an ImageNet validation sample. We further filter these images to obtain the ImageNet-Bg(S) test set, a 24,863-image subset containing purer background information.

🔧 Dataset Preparation

  • ID Dataset: ImageNet-1K. The ImageNet-1K dataset (ILSVRC-2012) can be downloaded here.

  • OOD Datasets: iNaturalist, SUN, Places, and Texture. Please follow the instructions from MOS.

  • Background Interference OOD Datasets: ImageNet-Bg, ImageNet-Bg(S).

After downloading the datasets, update the data root path in ./OSPCoOp/root_config.py to point to your local dataset location.
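For reference, here is a minimal sketch of what the data root configuration could look like; the variable names below are illustrative assumptions, so match them to the identifiers actually used in ./OSPCoOp/root_config.py:

# Illustrative sketch of ./OSPCoOp/root_config.py -- the variable names
# are assumptions, not the repository's actual identifiers.
import os

DATA_ROOT = "/path/to/datasets"  # your local dataset location

IMAGENET_DIR = os.path.join(DATA_ROOT, "imagenet")        # ID dataset (ILSVRC-2012)
OOD_ROOT = os.path.join(DATA_ROOT, "ood_data")            # iNaturalist, SUN, Places, Texture
IMAGENET_BG_DIR = os.path.join(DATA_ROOT, "imagenet-bg")  # ImageNet-Bg / ImageNet-Bg(S)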

Please run the following command to sample few-shot training data for the subsequent Pseudo-OOD generation.

CUDA_VISIBLE_DEVICES=0 python train.py  --trainer OSPCoOp --shots 16 --few_shot_sampler True --seed 1

Pseudo-OOD Generation

For a quick start, we provide our generated Pseudo-OOD data, which can be downloaded here (Google Drive or Baidu Cloud).

Image Masking

Please follow these steps:

Step 1: Install Grounded-Segment-Anything.

Step 2: Copy the ./GroundingSAM/masking.py and ./GroundingSAM/classnames.py files into your Grounded-Segment-Anything project directory.

Step 3: Run the following command:

python masking.py --config GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py   --grounded_checkpoint groundingdino_swint_ogc.pth   --sam_checkpoint sam_vit_h_4b8939.pth   --output_dir "output_dir"   --box_threshold 0.3   --text_threshold 0.25    --device "cuda"

Note: Set "output_dir" to your image directory, and place the images to be processed in the "./output_dir/raw" directory.
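If your few-shot images are still spread across per-class folders, a small helper like the one below can stage them into the expected "./output_dir/raw" layout. This is a hedged sketch: the source folder name and the flat-copy layout are our assumptions, so adapt it to how masking.py actually reads its input.

# stage_raw.py -- copy images into <output_dir>/raw for masking.py (sketch;
# the source folder name below is a placeholder, not a repository path)
import os
import shutil

src_root = "./ImageNet_16shot_seed1"   # assumed few-shot sample folder
raw_dir = os.path.join("./output_dir", "raw")
os.makedirs(raw_dir, exist_ok=True)

for dirpath, _, filenames in os.walk(src_root):
    for name in filenames:
        if name.lower().endswith((".jpg", ".jpeg", ".png")):
            # Prefix each file with its class folder to keep names unique.
            cls = os.path.basename(dirpath)
            shutil.copy(os.path.join(dirpath, name),
                        os.path.join(raw_dir, f"{cls}_{name}"))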

Image Inpainting

Please follow these steps:

Step 1: Install Inpaint-Anything.

Step 2: Copy the ./Inpainting/inpainting.py, ./Inpainting/Texture_inpainting.py, and ./GroundingSAM/classnames.py files into your Inpaint-Anything project directory.

Step 3: Run the following command to generate the inpainted images:

python inpainting.py --output_dir output    --lama_ckpt ./pretrained_models/big-lama  --min=0.0  --max=0.25
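The --min and --max flags appear to bound the fraction of the image that gets inpainted. Below is a minimal sketch of such an area-ratio check; this is our reading of the flags, not inpainting.py's actual logic:

# mask_ratio.py -- keep masks whose foreground covers between lo and hi
# of the image area (illustrative; the script's real criterion may differ)
import numpy as np
from PIL import Image

def mask_in_range(mask_path, lo=0.0, hi=0.25):
    mask = np.array(Image.open(mask_path).convert("L")) > 127
    ratio = mask.mean()  # fraction of pixels that are masked
    return lo <= ratio <= hi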

Step 4: Run the following command to compute the CLIP scores of the inpainted images:

python get_clip_score.py  --output_dir output
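As a rough illustration of what such a score could measure, the snippet below computes the CLIP similarity between an inpainted image and its ID class name, using OpenAI's clip package; get_clip_score.py may use a different model, prompt, or scoring rule:

# clip_score.py -- CLIP image-text similarity for an inpainted image (sketch)
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/16", device=device)

@torch.no_grad()
def clip_score(image_path, classname):
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0).to(device)
    text = clip.tokenize([f"a photo of a {classname}"]).to(device)
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    return (img_feat @ txt_feat.T).item()  # higher = more ID-like content remains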

Step 5: To generate the texture OOD augmentation data, please run the following command:

python Texture_inpainting.py  --output_dir output    --lama_ckpt ./pretrained_models/big-lama  --min=0.0  --max=0.25

💻 Train and Evaluate OSPCoOp

Our experiments are conducted with Python 3.9.18 and PyTorch 2.1.0. Following LoCoOp and CoOp, the training code is built on top of the awesome toolbox Dassl, so you need to install Dassl first. Please note that this requires a new environment, separate from the one used for the aforementioned Pseudo-OOD generation.

Training

After preparing Pseudo-OOD data, please place the Pseudo-OOD data in the ./OSPCoOp/data/ directory and run the following command to train OSPCoOp:

CUDA_VISIBLE_DEVICES=0 python train.py   --loss1 1.5 --loss2 0.5 --trainer OSPCoOp --shots 16 --output_dir ./runs/16shots 

To train OSPCoOp with ID augmentation data, first run ./OSPCoOp/idaug.py with the following command:

CUDA_VISIBLE_DEVICES=0 python idaug.py --train_root "ImageNet_1shot_seed1" --mask_thre 0.5 --shots 1 --seed 1 --id_aug_options "1,2,3" --id_aug_times "1,1,1" --id_aug_rate "0.5,0.5,0.5"
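The three --id_aug_* flags take comma-separated values, one entry per augmentation option. Here is a hedged sketch of how such flags are commonly parsed; the parsing and the assumed semantics in the final comment may differ from idaug.py:

# Illustrative parsing of the comma-separated --id_aug_* flags (sketch)
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--id_aug_options", type=str, default="1,2,3")
parser.add_argument("--id_aug_times", type=str, default="1,1,1")
parser.add_argument("--id_aug_rate", type=str, default="0.5,0.5,0.5")
args = parser.parse_args()

options = [int(x) for x in args.id_aug_options.split(",")]
times = [int(x) for x in args.id_aug_times.split(",")]
rates = [float(x) for x in args.id_aug_rate.split(",")]
# Assumed semantics: augmentation type options[i] is applied times[i] times
# to a rates[i] fraction of the few-shot training images.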

Then, run the following command to train OSPCoOp:

CUDA_VISIBLE_DEVICES=0 python train.py   --loss1 1.5 --loss2 0.5 --trainer OSPCoOp --shots 1 --eval_freq 20 --config-file "configs/trainers/OSPCoOp/vit_b16_ep20.yaml" --id_aug_dir "xxx" --seed 1  --output_dir "./runs/xxxx"

OOD Detection Evaluation

For a quick start, we share our 16-shot OSPCoOp checkpoint; please download it via Google Drive or Baidu Cloud.

To evaluate on the iNaturalist, SUN, Places, and Texture OOD datasets, please run the following command.

CUDA_VISIBLE_DEVICES=0 python train.py  --trainer OSPCoOp --shots 16 --eval_only True --model-dir ./runs/16shots 

To evaluate on ImageNet-Bg, please run the following command.

CUDA_VISIBLE_DEVICES=0 python train.py  --trainer OSPCoOp --shots 16 --eval_only True --model-dir ./runs/16shots  --eval-bg True
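For reference, OOD detection results of this kind are typically reported as AUROC and FPR95 computed from per-image scores. The snippet below is a minimal sketch of the standard formulation of these two metrics, not the repository's evaluation code:

# ood_metrics.py -- AUROC and FPR@95%TPR from ID/OOD scores (sketch)
import numpy as np
from sklearn import metrics

def ood_metrics(id_scores, ood_scores):
    # Convention: higher score = more ID-like.
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    labels = np.concatenate([np.ones_like(id_scores), np.zeros_like(ood_scores)])
    scores = np.concatenate([id_scores, ood_scores])
    auroc = metrics.roc_auc_score(labels, scores)
    # FPR at the threshold that accepts 95% of ID samples.
    thresh = np.percentile(id_scores, 5)
    fpr95 = float((ood_scores >= thresh).mean())
    return auroc, fpr95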

🔬 Visualization

🍻 Acknowledgements

This work is based on the following repositories: Grounded-SAM, LoCoOp, Inpaint Anything, and LaMa. Thanks for their excellent work!

📚 Citation

If you find our work interesting or use our methods, please consider citing:

@InProceedings{Xu_2025_CVPR,
    author    = {Xu, Zhuo and Xiang, Xiang and Liang, Yifan},
    title     = {Overcoming Shortcut Problem in VLM for Robust Out-of-Distribution Detection},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {15402-15412}
}
