llava

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

chatbot llama clip mulit-modal vision-language vicuna gpt-4 vision-language-pretraining llava video-chatboat video-conversation

Updated May 20, 2024
Python

roboflow / multimodal-maestro

Star

Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥

object-detection cross-modal multimodality instance-segmentation lmm gpt-4 visual-prompting prompt-engineering vision-language-model llava segment-anything gpt-4-vision

Updated Feb 13, 2024
Python

unum-cloud / uform

Star

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Updated May 1, 2024
Python

mbzuai-oryx / LLaVA-pp

Star

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

conversation lmms vision-language llm llava llama3 phi3 llava-llama3 llava-phi3 llama3-llava phi3-llava llama-3-vision phi3-vision llama-3-llava phi-3-llava llama3-vision phi-3-vision

Updated May 3, 2024
Python

SkalskiP / awesome-foundation-and-multimodal-models

Sponsor

Star

👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]

nlp computer-vision image-captioning clip blip multimodal zero-shot-detection foundational-models llava segment-anything open-vocabulary-detection open-vocabulary-segmentation grounding-dino

Updated Feb 29, 2024
Python

open-compass / VLMEvalKit

Star

Open-source evaluation toolkit of large vision-language models (LVLMs), support GPT-4v, Gemini, QwenVLPlus, 40+ HF models, 20+ benchmarks

computer-vision evaluation pytorch gemini openai vqa vit gpt multi-modal clip claude openai-api gpt4 large-language-models llm chatgpt llava qwen gpt-4v

Updated May 23, 2024
Python

jhc13 / taggui

Star

Tag manager and captioner for image datasets

image-captioning image-tagging tag-manager pyside6 stable-diffusion llava moondream cogvlm

Updated May 20, 2024
Python

apocas / restai

Sponsor

Star

RestAI is an AIaaS (AI as a Service) open-source platform. Built on top of LlamaIndex, Ollama and HF Pipelines. Supports any public LLM supported by LlamaIndex and any local LLM suported by Ollama. Precise embeddings usage and tuning.

python transformers embeddings openai llama rag fastapi llm stable-diffusion langchain openaiapi llava llamaindex ollama

Updated May 17, 2024
Python

TinyLLaVA / TinyLLaVA_Factory

Star

A Framework of Small-scale Large Multimodal Models

nlp transformers llama vision-language llava large-multimodal-models tinyllama

Updated May 23, 2024
Python

SALT-NLP / LLaVAR

Star

Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"

ocr chatbot multimodal vision-and-language gpt-4 chatgpt instruction-tuning llava

Updated Sep 5, 2023
Python

gokayfem / ComfyUI_VLM_nodes

Star

Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation

image-captioning nodes vlm custom-nodes img2text llm mllm llava comfyui siglip phi15 joytag img2sfx

Updated May 23, 2024
Python

PaddlePaddle / PaddleMIX

Star

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

image-to-text clip text-to-image dit multimodal sora text-to-video aigc stable-diffusion controlnet llava blip2 minigpt4 sd-xl ppdiffusers eva-clip stablevideodiffusion qwen-vl

Updated May 23, 2024
Python

Improve this page

Add a description, image, and links to the llava topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the llava topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

llava

Here are 98 public repositories matching this topic...

ollama / ollama

haotian-liu / LLaVA

Fanghua-Yu / SUPIR

InternLM / xtuner

SciSharp / LLamaSharp

chenking2020 / FindTheChatGPTer

modelscope / data-juicer

modelscope / swift

mbzuai-oryx / Video-ChatGPT

roboflow / multimodal-maestro

unum-cloud / uform

mbzuai-oryx / LLaVA-pp

SkalskiP / awesome-foundation-and-multimodal-models

open-compass / VLMEvalKit

jhc13 / taggui

apocas / restai

TinyLLaVA / TinyLLaVA_Factory

SALT-NLP / LLaVAR

gokayfem / ComfyUI_VLM_nodes

PaddlePaddle / PaddleMIX

Improve this page

Add this topic to your repo