[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
Official Repository of Multi-Object Hallucination in Vision-Language Models
[CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering
[ICML 2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
This repository is the codebase of TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.
[NeurIPS 2024 D&B Track] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Official repo for Debiasing Large Visual Language Models, including a post-hoc debiasing method and a Visual Debias Decoding strategy.
Talk2BEV: Language-Enhanced Bird's Eye View Maps (Accepted to ICRA'24)
Gemini Pro is a versatile AI tool that translates languages, sparks creativity, and answers questions, running efficiently on devices ranging from phones to data centers so developers and businesses can put AI to work.
A benchmark for evaluating the capabilities of large vision-language models (LVLMs)