Highlights
👀 CV & VLM
[CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for Grounded Situation Recognition"
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
Collection of AWESOME vision-language models for vision tasks
Awesome-Remote-Sensing-Vision-Language-Models
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
KoLLaVA: Korean Large Language-and-Vision Assistant (feat. LLaVA)
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
Taming Transformers for High-Resolution Image Synthesis
[AAAI 2024 Oral] AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models
Paper list and datasets for industrial image anomaly/defect detection (continuously updated).
This repo contains annotated research papers that I found really good and useful
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).






