An open-source implementation for training LLaVA-NeXT.
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.
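As a rough illustration of what finetuning code for these models typically looks like (a minimal sketch, not taken from the repository above), the snippet below attaches LoRA adapters to a LLaVA-1.5 checkpoint with Hugging Face transformers and peft and runs one toy training step. The checkpoint name, target modules, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch: LoRA finetuning of a LLaVA-1.5 checkpoint (assumed model id,
# assumed hyperparameters); in practice you would load in lower precision on GPU
# and iterate over a real image-text dataset.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration
from peft import LoraConfig, get_peft_model

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed checkpoint; swap in your own
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# Attach LoRA adapters to the attention projections of the language model.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                  lora_dropout=0.05)
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# One toy training step on a single (image, prompt, answer) example.
image = Image.new("RGB", (336, 336))
text = "USER: <image>\nWhat is in the picture? ASSISTANT: Nothing, it is blank."
inputs = processor(images=image, text=text, return_tensors="pt")
inputs["labels"] = inputs["input_ids"].clone()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss = model(**inputs).loss
loss.backward()
optimizer.step()
```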
Matryoshka Multimodal Models
LLaVA-NeXT-Image-Llama3-Lora, modified from https://github.com/arielnlee/LLaVA-1.6-ft
[AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Vision-Language Models (e.g., LLaVA-Next) under a fixed token budget.
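To make the general idea concrete (this is an illustrative sketch of fixed-budget visual-token pruning, not HiRED's actual scoring rule), the helper below ranks patch tokens by an importance score and keeps only the top-k before they would be passed to the language model; the feature-norm score and the tensor sizes are assumptions.

```python
# Illustrative sketch: keep only the `budget` highest-scoring visual tokens.
import torch

def prune_visual_tokens(patch_tokens: torch.Tensor,
                        scores: torch.Tensor,
                        budget: int) -> torch.Tensor:
    """patch_tokens: (batch, num_patches, dim); scores: (batch, num_patches)."""
    budget = min(budget, patch_tokens.shape[1])
    top = scores.topk(budget, dim=1).indices                      # (batch, budget)
    top, _ = top.sort(dim=1)                                      # preserve spatial order
    idx = top.unsqueeze(-1).expand(-1, -1, patch_tokens.shape[-1])
    return patch_tokens.gather(1, idx)                            # (batch, budget, dim)

# Example: keep 128 of 576 patch tokens, scored here by feature norm
# (a stand-in for an attention-based importance score).
tokens = torch.randn(2, 576, 1024)
scores = tokens.norm(dim=-1)
pruned = prune_visual_tokens(tokens, scores, budget=128)
print(pruned.shape)  # torch.Size([2, 128, 1024])
```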
[CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering".
A vision-language model project focused on testing different techniques for parsing model-generated responses.
NoteMR enhances multimodal large language models for visual question answering by integrating structured notes. This implementation aims to reduce reasoning errors and improve visual feature perception. 🐙📚