Composition of Multimodal Language Models From Scratch
This is the official implementation (code, data) of the paper "MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?"
[ECCV2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds
Code for the MultipanelVQA benchmark "Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA"
AUITestAgent is the first automatic, natural language-driven GUI testing tool for mobile apps, capable of fully automating the entire process of GUI interaction and function verification.
MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery
Awesome list for attacks on large language models.
Example code for fine-tuning multimodal large language models with LLaMA-Factory
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust)
Datasets, case studies and benchmarks for extracting structured information from PDFs, HTML files or images, created by the Parsee.ai team. Datasets also on Hugging Face: https://huggingface.co/parsee-ai
A Video Chat Agent with Temporal Prior
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
MOSSBench: A webpage for an oversensitivity benchmark
A Chinese medical multimodal large language model: Large Chinese Language-and-Vision Assistant for BioMedicine
Undergraduate dissertation from Guilin University of Electronic Technology