Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions
InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.
Official code for the paper "Mantis: Multi-Image Instruction Tuning"
Demo code for fine-tuning multimodal large language models with LLaMA-Factory
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
Grounded Multimodal Large Language Model with Localized Visual Tokenization
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
Large Chinese Language-and-Vision Assistant for BioMedicine (Chinese medical multimodal large model)
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection
A collection of visual instruction tuning datasets.
MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery