🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
Updated Dec 9, 2023 - Python
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs.
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
[Paper][Preprint 2023] Making Large Language Models Perform Better in Knowledge Graph Completion
A Gradio demo of MGIE
A Video Chat Agent with Temporal Prior
A PyTorch-based system for highly accurate drug-target interaction predictions utilizing multi-modal large language models to discern structural affinities in drug-target pairs.
An Easy-to-use Hallucination Detection Framework for LLMs.
This repository contains code to evaluate various multimodal large language models using different instructions across multiple multimodal content comprehension tasks.
Voice assistant built on multimodal LLMs: a fine-tuned LLaVA-NeXT (Mistral 7B) and PhoWhisper
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
Multimodal RAG and comparisons between language models. (Project for Deep Learning Module at the FHSWF)
Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs
[CVPR 2024] 🎬💭 Chat with over 10K frames of video!
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
[ACL 2024] An Easy-to-use Hallucination Detection Framework for LLMs.
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding