LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
An open-source implementation for training LLaVA-NeXT.
Contains code and documentation for our VANE-Bench paper.
The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.
"Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA"
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
[ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
🔥 Official Benchmark Toolkits for "Visual Haystacks: Answering Harder Questions About Sets of Images"
Open Platform for Embodied Agents
A Framework of Small-scale Large Multimodal Models
The official evaluation suite and dynamic data release for MixEval.
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
AI-First Process Automation with Large Language (LLMs), Action (LAMs), Multimodal (LMMs), and Visual Language (VLMs) Models