multimodal-large-language-models

Here are 40 public repositories matching this topic...

Wang-ML-Lab / multimodal-needle-in-a-haystack

Code and data for the benchmark "Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models"

benchmark llm multimodal-large-language-models needle-in-a-haystack multimodal-needle-in-a-haystack

Updated Jun 18, 2024
Python

sitamgithub-MSIT / TechSage

chatbot artificial-intelligence gradio techbot gemini-api multimodal-data huggingface-spaces generative-ai multimodal-large-language-models gemini-pro-vision gemini-pro

Updated Jun 5, 2024
Python

pipixin321 / Arcana

Implementation of "Arcana: Improving Multi-modal Large Language Model through Boosting Vision Capabilities"

visual perception lora clip multimodal-large-language-models

Updated Jun 7, 2024
Python

DistilledCode / mmrl

Multi-Modal Representational Learning for Social Media Popularity Prediction

neural-network embeddings data-pipeline multimodal-deep-learning praw-reddit airflow-dags chromadb multimodal-large-language-models

Updated Jun 14, 2024
Python

rohit901 / VANE-Bench

Contains code and documentation for our VANE-Bench paper.

benchmark-datasets multimodal-deep-learning video-anomaly-detection large-language-models multimodal-large-language-models large-multimodal-models

Updated Jun 18, 2024
Python

patrick-tssn / VideoHallucer

VideoHallucer: the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)

multimodal-large-language-models hallucination-detection video-language-model video-hallucination

Updated Jun 19, 2024
Python

surakku / cadence-gemma

Giving RecurrentGemma sight.

multimodal-large-language-models

Updated Jun 19, 2024
Python

whwu95 / FreeVA

FreeVA: Offline MLLM as Training-Free Video Assistant

chatbot video-understanding zero-shot-video-captioning video-question-answering chatgpt vision-language-model llava training-free multimodal-large-language-models

Updated Jun 9, 2024
Python

CKeibel / FHSWF-deep-learning

Multimodal RAG and comparisons between language models. (Project for Deep Learning Module at the FHSWF)

machine-learning deep-learning multimodal rag multimodal-large-language-models multimodal-rag

Updated Jun 2, 2024
Python

Lzcstan / DrugLAMP

A PyTorch-based system for highly accurate drug-target interaction predictions utilizing multi-modal large language models to discern structural affinities in drug-target pairs.

attention-mechanism drug-target-interactions contrastive-learning multimodal-large-language-models

Updated Mar 26, 2024
Python

AIDC-AI / Ovis

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

chatbot multimodality multimodal vision-language-model multimodal-large-language-models vision-language-learning qwen llama3

Updated Jun 14, 2024
Python

bigai-nlco / LSTP-Chat

A Video Chat Agent with Temporal Prior

spatial-temporal video-language llm mllm visual-instruction-tuning multimodal-large-language-models

Updated Feb 28, 2024
Python

sitamgithub-MSIT / well-being

chatbot artificial-intelligence gradio gemini-api multimodal-data huggingface-spaces generative-ai multimodal-large-language-models gemini-pro-vision gemini-pro

Updated Jun 10, 2024
Python

MileBench / MileBench

This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"

benchmark machine-learning natural-language-processing deep-neural-networks computer-vision deep-learning evaluation multimodality visual-question-answering multimodal foundation-models large-language-models llm llms long-context-transformers multimodal-large-language-models large-multimodal-models long-context-modeling

Updated May 19, 2024
Python

declare-lab / MM-InstructEval

This repository contains code to evaluate various multimodal large language models using different instructions across multiple multimodal content comprehension tasks.

multimodal-large-language-models multimodal-content-comprehension-tasks