Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey [Miyai+, arXiv2024]


![Generalized OOD Detection v2 framework](figures/generalized_ood_v2.png)

🚀 Our framework encapsulates the evolution of OOD detection and related tasks in the VLM era, fostering collaborative efforts among each community 🤝

¹The University of Tokyo  ²S-Lab, Nanyang Technological University  ³Duke University  ⁴Salesforce AI Research  ⁵LY Corporation  ⁶Tokyo University of Science  ⁷University of Wisconsin-Madison

About This Repository

This is the repository for our survey paper. We hope the survey helps readers and practitioners better understand the demanding challenges of OOD detection and related topics in the VLM era.
This repository plays the following two roles:

  • It provides an easily accessible list of the references cited in Table 2 of the paper. The list will continue to grow as promising new works emerge, so please feel free to recommend relevant, high-quality works via a Pull Request.
  • We hope it will serve as a discussion panel where readers can ask questions, raise concerns, and make constructive comments. Feel free to post your ideas in Issues.

Abstract

We present generalized OOD detection v2, a framework encapsulating the evolution of Anomaly Detection (AD), Novelty Detection (ND), Open-set Recognition (OSR), Out-of-distribution (OOD) detection, and Outlier Detection (OD) in the VLM era. Our framework reveals that, with some fields becoming inactive or merging, the demanding challenges in the VLM era have become OOD detection and AD. Beyond this inter-field evolution, we also highlight significant shifts in definitions, problem settings, and benchmarks; our work thus features a comprehensive review of the methodology for OOD detection, including in-depth discussion of related tasks to clarify their relationship to and influence on OOD detection. Finally, we explore advancements in the emerging Large Vision Language Model (LVLM) era, represented by GPT-4V. We conclude this survey with open challenges and potential research directions for OOD detection in the VLM and LVLM eras.

Common Benchmarks

CLIP-based OOD Detection
CLIP-based AD

Methodology

We introduce methods for CLIP-based OOD detection and CLIP-based AD.
To provide diverse perspectives on OOD detection approaches, we include a wide range of methods, including preprints.

Timeline

![Timeline of CLIP-based OOD detection and AD methods](timeline.png)

Paper List

![Overview of the methods in the paper list](methods.png)

CLIP-based OOD Detection

Zero-shot
  • ZOC
  • MCM
  • GL-MCM
  • CLIPN
  • NegLabel
  • EOE
  • SeTAR
Few-shot
  • PEFT-MCM
  • LoCoOp
  • LSN
  • IDPrompt
  • NegPrompt
  • GalLoP
  • Dual-Adapter
  • LAPT
Others
  • LSA
  • EmptyClass
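Most of the zero-shot methods above score OOD-ness directly from CLIP's image-text similarities. As a rough illustration of the idea behind MCM (maximum concept matching), here is a minimal NumPy sketch, not the authors' implementation: embeddings are L2-normalized, cosine similarities to the in-distribution (ID) class prompts are softmaxed with a temperature, and the maximum probability serves as the ID confidence.

```python
import numpy as np

def mcm_score(image_feat, text_feats, temperature=0.07):
    """MCM-style zero-shot OOD score (illustrative sketch).

    image_feat: (d,) image embedding; text_feats: (k, d) ID class-prompt
    embeddings. Returns the max softmax probability over class prompts:
    ID images tend to match one prompt sharply, OOD images do not.
    """
    image_feat = image_feat / np.linalg.norm(image_feat)
    text_feats = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sims = text_feats @ image_feat / temperature  # scaled cosine similarities
    sims -= sims.max()                            # numerical stability
    probs = np.exp(sims)
    probs /= probs.sum()                          # softmax over class prompts
    return probs.max()                            # high => likely ID

# Toy deterministic example: class prompts are orthogonal unit vectors.
prompts = np.eye(3, 8)            # 3 class prompts in an 8-dim space
id_image = prompts[0]             # aligns exactly with class 0
ood_image = prompts.sum(axis=0)   # equally similar to every class
print(mcm_score(id_image, prompts))   # near 1.0 -> in-distribution
print(mcm_score(ood_image, prompts))  # ~0.33 -> low score, flagged as OOD
```

In practice the embeddings come from a CLIP image and text encoder, and a threshold on the score separates ID from OOD inputs; the temperature and prompt templates are important hyperparameters in the actual papers.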

CLIP-based AD

Zero-shot
  • WinCLIP
  • APRIL-GAN
  • AnoCLIP
  • AnomalyCLIP
  • RWDA
  • SDP
  • FiLo
Few-shot
  • WinCLIP+
  • APRIL-GAN
  • PromptAD
  • InCTRL
Others
  • LTAD
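The zero-shot AD methods above typically contrast text prompts describing a normal object with prompts describing a defective one. Below is a hedged NumPy sketch of that two-state prompt scoring, in the spirit of WinCLIP's image-level score; the prompt ensembles, shapes, and temperature here are illustrative assumptions, not any paper's exact recipe.

```python
import numpy as np

def anomaly_score(image_feat, normal_feats, anomalous_feats, temperature=0.07):
    """Softmax mass on the 'anomalous' prompt prototype (higher = more anomalous)."""
    def l2(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    image_feat = l2(image_feat)
    # Average each prompt ensemble (e.g. "a photo of a flawless <obj>" vs
    # "a photo of a damaged <obj>") into one prototype per state.
    normal_proto = l2(l2(normal_feats).mean(axis=0))
    anomalous_proto = l2(l2(anomalous_feats).mean(axis=0))
    sims = np.array([normal_proto @ image_feat,
                     anomalous_proto @ image_feat]) / temperature
    sims -= sims.max()             # numerical stability
    probs = np.exp(sims)
    return probs[1] / probs.sum()  # probability of the "anomalous" state

# Toy example with orthogonal prototype directions.
normal_prompts = np.array([[1.0, 0.0, 0.0, 0.0]])
anomalous_prompts = np.array([[0.0, 1.0, 0.0, 0.0]])
good_item = np.array([1.0, 0.0, 0.0, 0.0])
defect_item = np.array([0.0, 1.0, 0.0, 0.0])
print(anomaly_score(good_item, normal_prompts, anomalous_prompts))    # near 0
print(anomaly_score(defect_item, normal_prompts, anomalous_prompts))  # near 1
```

Methods such as WinCLIP additionally score windowed patches with the same contrast to localize defects; this sketch covers only the image-level decision.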

Early Advance in LVLM Era

![Evolution of OOD detection and related tasks in the LVLM era](evolution_lvlm.png)

In the LVLM Era, OOD detection and related topics have evolved as follows:

(i) Sensory Anomaly Detection ⇒ Sensory Anomaly Detection

AD
  • AnomalyGPT
  • Myriad
  • Generic AD

(ii) OOD Detection ⇒ Unsolvable Problem Detection

UPD
  • UPD

Acknowledgment

This repository is built upon the foundation of the following resources: generalized OOD detection v1, OpenOOD codebase.

Contact

If you have questions or find any mistakes, please open an issue and mention @AtsuMiyai.

Citation

If you find our survey paper helpful for your research, please consider citing the following paper:

@article{miyai2024generalized2,
  title={Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey},
  author={Miyai, Atsuyuki and Yang, Jingkang and Zhang, Jingyang and Ming, Yifei and Lin, Yueqian and Yu, Qing and Irie, Go and Joty, Shafiq and Li, Yixuan and Li, Hai and Liu, Ziwei and Yamasaki, Toshihiko and Aizawa, Kiyoharu},
  journal={arXiv preprint arXiv:2407.21794},
  year={2024}
}

Besides, please also consider citing our other projects that are closely related to this survey.

  • Our Directly Related Projects
# generalized OOD detection framework v1, survey
@article{yang2024generalized,
  title={Generalized out-of-distribution detection: A survey},
  author={Yang, Jingkang and Zhou, Kaiyang and Li, Yixuan and Liu, Ziwei},
  journal={IJCV},
  pages={1--28},
  year={2024},
}

# MCM (Zero-shot OOD detection)
@inproceedings{ming2022delving,
  title={Delving into out-of-distribution detection with vision-language representations},
  author={Ming, Yifei and Cai, Ziyang and Gu, Jiuxiang and Sun, Yiyou and Li, Wei and Li, Yixuan},
  booktitle={NeurIPS},
  year={2022}
}

# GL-MCM (Zero-shot OOD detection)
@article{miyai2023zero,
  title={Zero-Shot In-Distribution Detection in Multi-Object Settings Using Vision-Language Foundation Models},
  author={Miyai, Atsuyuki and Yu, Qing and Irie, Go and Aizawa, Kiyoharu},
  journal={arXiv preprint arXiv:2304.04521},
  year={2023}
}

# PEFT-MCM (Few-shot OOD detection, Concurrent work with LoCoOp)
@article{ming2024does,
  title={How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models?},
  author={Ming, Yifei and Li, Yixuan},
  journal={IJCV},
  volume={132},
  number={2},
  pages={596--609},
  year={2024},
}

# LoCoOp (Few-shot OOD detection, Concurrent work with PEFT-MCM)
@inproceedings{miyai2023locoop,
  title={LoCoOp: Few-Shot Out-of-Distribution Detection via Prompt Learning},
  author={Miyai, Atsuyuki and Yu, Qing and Irie, Go and Aizawa, Kiyoharu},
  booktitle={NeurIPS},
  year={2023}
}

# UPD
@article{miyai2024upd,
  title={Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models},
  author={Miyai, Atsuyuki and Yang, Jingkang and Zhang, Jingyang and Ming, Yifei and Yu, Qing and Irie, Go and Li, Yixuan and Li, Hai and Liu, Ziwei and Aizawa, Kiyoharu},
  journal={arXiv preprint arXiv:2403.20331},
  year={2024}
}
  • Our Other Projects
# OpenOOD 
@inproceedings{yang2022openood,
  title={Openood: Benchmarking generalized out-of-distribution detection},
  author={Yang, Jingkang and Wang, Pengyun and Zou, Dejian and Zhou, Zitang and Ding, Kunyuan and Peng, Wenxuan and Wang, Haoqi and Chen, Guangyao and Li, Bo and Sun, Yiyou and others},
  booktitle={NeurIPS Datasets and Benchmarks Track},
  year={2022}
}

# OpenOOD v1.5 report
@article{zhang2023openood,
  title={OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection},
  author={Zhang, Jingyang and Yang, Jingkang and Wang, Pengyun and Wang, Haoqi and Lin, Yueqian and Zhang, Haoran and Sun, Yiyou and Du, Xuefeng and Zhou, Kaiyang and Zhang, Wayne and Li, Yixuan and Liu, Ziwei and Chen, Yiran and Li, Hai},
  journal={arXiv preprint arXiv:2306.09301},
  year={2023}
}
