Stars
The first open-source platform for multimodal intent analysis
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA), built towards GPT-4V-level capabilities and beyond.
Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances (ACL 2024)
Data and code for the NeurIPS 2022 paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
[ACL 2024 SDT] OpenVNA is an open-source framework designed for analyzing the behavior of multimodal language understanding systems under noisy conditions.
MMSA is a unified framework for Multimodal Sentiment Analysis.