ModelScope-Agent: An agent framework connecting models in ModelScope with the world
Updated Jul 20, 2024 - Python
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
[CVPR 2024] 🎬💭 chat with over 10K frames of video!
Speech, Language, Audio, Music Processing with Large Language Model
A Gradio demo of MGIE
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs.
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
[Paper][Preprint 2023] Making Large Language Models Perform Better in Knowledge Graph Completion
[Preprint] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Moondream is a lightweight multimodal large language model
Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs
This is the official implementation of the paper "Needle In A Multimodal Haystack"
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"
EVE: Encoder-Free Vision-Language Models from BAAI
Matryoshka Multimodal Models