A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM).
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
Oscar and VinVL
An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.
Visual Question Answering in PyTorch
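Entries like the 2017 VQA Challenge winner above are scored with the standard VQA accuracy metric, which gives partial credit based on agreement with the ten human annotators. A simplified sketch of that metric (the official evaluation additionally averages over annotator subsets and normalizes answer strings; the helper name here is illustrative, not from any of the listed repos):

```python
def vqa_accuracy(predicted: str, human_answers: list[str]) -> float:
    """Simplified VQA Challenge accuracy.

    An answer counts as fully correct if at least 3 of the (typically 10)
    human annotators gave it; fewer matches earn proportional credit.
    """
    matches = sum(1 for answer in human_answers if answer == predicted)
    return min(matches / 3.0, 1.0)
```

For example, a prediction matching all ten annotators scores 1.0, while one matching only two of them scores 2/3.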
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 50+ Hugging Face models, and 20+ benchmarks
Implementation for the paper "Compositional Attention Networks for Machine Reasoning" (Hudson and Manning, ICLR 2018)
PyTorch implementation for the Neuro-Symbolic Concept Learner (NS-CL).
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
A lightweight, scalable, and general framework for visual question answering research
Strong baseline for visual question answering
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
OmniFusion, a multimodal model for communicating with text and images
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)