drmuskangarg/Multimodal-datasets

Multimodal datasets

This repository was built to accompany our position paper, "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers".

As part of this release, we share information about recent multimodal datasets that are available for research purposes.

We found that although more than 100 multimodal language resources are available in the literature for various NLP tasks, publicly available multimodal datasets remain under-explored for reuse in subsequent problem domains.

Multimodal datasets for NLP Applications

1. Sentiment Analysis

| Dataset | Title of the Paper | Link of the Paper | Link of the Dataset |
| --- | --- | --- | --- |
| EmoDB | A Database of German Emotional Speech | Paper | Dataset |
| VAM | The Vera am Mittag German Audio-Visual Emotional Speech Database | Paper | Dataset |
| IEMOCAP | IEMOCAP: Interactive Emotional Dyadic Motion Capture Database | Paper | Dataset |
| Mimicry | A Multimodal Database for Mimicry Analysis | Paper | Dataset |
| YouTube | Towards Multimodal Sentiment Analysis: Harvesting Opinions from the Web | Paper | Dataset |
| HUMAINE | The HUMAINE Database | Paper | Dataset |
| Large Movies | Sentiment Classification on Large Movie Review | Paper | Dataset |
| SEMAINE | The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent | Paper | Dataset |
| AFEW | Collecting Large, Richly Annotated Facial-Expression Databases from Movies | Paper | Dataset |
| SST | Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank | Paper | Dataset |
| ICT-MMMO | YouTube Movie Reviews: Sentiment Analysis in an Audio-Visual Context | Paper | Dataset |
| RECOLA | Introducing the RECOLA Multimodal Corpus of Remote Collaborative and Affective Interactions | Paper | Dataset |
| MOUD | Utterance-Level Multimodal Sentiment Analysis | Paper | - |
| CMU-MOSI | MOSI: Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis in Online Opinion Videos | Paper | Dataset |
| POM | Multimodal Analysis and Prediction of Persuasiveness in Online Social Multimedia | Paper | Dataset |
| MELD | MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations | Paper | Dataset |
| CMU-MOSEI | Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph | Paper | Dataset |
| AMMER | Towards Multimodal Emotion Recognition in German Speech Events in Cars using Transfer Learning | Paper | On Request |
| SEWA | SEWA DB: A Rich Database for Audio-Visual Emotion and Sentiment Research in the Wild | Paper | Dataset |
| Fakeddit | r/Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection | Paper | Dataset |
| CMU-MOSEAS | CMU-MOSEAS: A Multimodal Language Dataset for Spanish, Portuguese, German and French | Paper | Dataset |
| MultiOFF | Multimodal Meme Dataset (MultiOFF) for Identifying Offensive Content in Image and Text | Paper | Dataset |
| MEISD | MEISD: A Multimodal Multi-Label Emotion, Intensity and Sentiment Dialogue Dataset for Emotion Recognition and Sentiment Analysis in Conversations | Paper | Dataset |
| TASS | Overview of TASS 2020: Introducing Emotion | Paper | Dataset |
| CH-SIMS | CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality | Paper | Dataset |
| Creep-Image | A Multimodal Dataset of Images and Text | Paper | Dataset |
| Entheos | Entheos: A Multimodal Dataset for Studying Enthusiasm | Paper | Dataset |
2. Machine Translation

| Dataset | Title of the Paper | Link of the Paper | Link of the Dataset |
| --- | --- | --- | --- |
| Multi30K | Multi30K: Multilingual English-German Image Descriptions | Paper | Dataset |
| How2 | How2: A Large-scale Dataset for Multimodal Language Understanding | Paper | Dataset |
| MLT | Multimodal Lexical Translation | Paper | Dataset |
| IKEA | A Visual Attention Grounding Neural Model for Multimodal Machine Translation | Paper | Dataset |
| Flickr30K (EN→hi-IN) | Multimodal Neural Machine Translation for Low-resource Language Pairs using Synthetic Data | Paper | On Request |
| Hindi Visual Genome | Hindi Visual Genome: A Dataset for Multimodal English-to-Hindi Machine Translation | Paper | Dataset |
| HowTo100M | Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models | Paper | Dataset |
3. Information Retrieval

| Dataset | Title of the Paper | Link of the Paper | Link of the Dataset |
| --- | --- | --- | --- |
| MusiCLEF | MusiCLEF: A Benchmark Activity in Multimodal Music Information Retrieval | Paper | Dataset |
| Moodo | The Moodo Dataset: Integrating User Context with Emotional and Color Perception of Music for Affective Music Information Retrieval | Paper | Dataset |
| ALF-200k | ALF-200k: Towards Extensive Multimodal Analyses of Music Tracks and Playlists | Paper | Dataset |
| MQA | Can Image Captioning Help Passage Retrieval in Multimodal Question Answering? | Paper | Dataset |
| WAT2019 | WAT2019: English-Hindi Translation on Hindi Visual Genome Dataset | Paper | Dataset |
| ViTT | Multimodal Pretraining for Dense Video Captioning | Paper | Dataset |
| MTD | MTD: A Multimodal Dataset of Musical Themes for MIR Research | Paper | Dataset |
| MusiClef | A Professionally Annotated and Enriched Multimodal Data Set on Popular Music | Paper | Dataset |
| Schubert Winterreise | Schubert Winterreise Dataset: A Multimodal Scenario for Music Analysis | Paper | Dataset |
| WIT | WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning | Paper | Dataset |
4. Question Answering

| Dataset | Title of the Paper | Link of the Paper | Link of the Dataset |
| --- | --- | --- | --- |
| MQA | A Dataset for Multimodal Question Answering in the Cultural Heritage Domain | Paper | - |
| MovieQA | MovieQA: Understanding Stories in Movies through Question-Answering | Paper | Dataset |
| PororoQA | DeepStory: Video Story QA by Deep Embedded Memory Networks | Paper | Dataset |
| MemexQA | MemexQA: Visual Memex Question Answering | Paper | Dataset |
| VQA | Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering | Paper | Dataset |
| TDIUC | An Analysis of Visual Question Answering Algorithms | Paper | Dataset |
| TGIF-QA | TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering | Paper | Dataset |
| MSVD-QA, MSRVTT-QA | Video Question Answering via Attribute-Augmented Attention Network Learning | Paper | Dataset |
| YouTube2Text | Video Question Answering via Gradually Refined Attention over Appearance and Motion | Paper | Dataset |
| MovieFIB | A Dataset and Exploration of Models for Understanding Video Data through Fill-in-the-Blank Question-Answering | Paper | Dataset |
| Video Context QA | Uncovering the Temporal Context for Video Question Answering | Paper | Dataset |
| MarioQA | MarioQA: Answering Questions by Watching Gameplay Videos | Paper | Dataset |
| TVQA | TVQA: Localized, Compositional Video Question Answering | Paper | Dataset |
| VQA-CP v2 | Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering | Paper | Dataset |
| RecipeQA | RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes | Paper | Dataset |
| GQA | GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering | Paper | Dataset |
| Social-IQ | Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence | Paper | Dataset |
| MIMOQA | MIMOQA: Multimodal Input Multimodal Output Question Answering | Paper | - |
5. Summarization

| Dataset | Title of the Paper | Link of the Paper | Link of the Dataset |
| --- | --- | --- | --- |
| SumMe | Creating Summaries from User Videos | Paper | Dataset |
| TVSum | TVSum: Summarizing Web Videos Using Titles | Paper | Dataset |
| QFVS | Query-Focused Video Summarization: Dataset, Evaluation, and a Memory Network Based Approach | Paper | Dataset |
| MMSS | Multi-modal Sentence Summarization with Modality Attention and Image Filtering | Paper | - |
| MSMO | MSMO: Multimodal Summarization with Multimodal Output | Paper | - |
| Screen2Words | Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning | Paper | Dataset |
| AVIATE | See, Hear, Read: Leveraging Multimodality with Guided Attention for Abstractive Text Summarization | Paper | Dataset |
| Multimodal Microblog Summarization | On Multimodal Microblog Summarization | Paper | - |
6. Human-Computer Interaction

| Dataset | Title of the Paper | Link of the Paper | Link of the Dataset |
| --- | --- | --- | --- |
| CUAVE | CUAVE: A New Audio-Visual Database for Multimodal Human-Computer Interface Research | Paper | Dataset |
| MHAD | Berkeley MHAD: A Comprehensive Multimodal Human Action Database | Paper | Dataset |
| Multi-party interactions | A Multi-party Multi-modal Dataset for Focus of Visual Attention in Human-human and Human-robot Interaction | Paper | - |
| MHHRI | Multimodal Human-Human-Robot Interactions (MHHRI) Dataset for Studying Personality and Engagement | Paper | Dataset |
| Red Hen Lab | Red Hen Lab: Dataset and Tools for Multimodal Human Communication Research | Paper | - |
| EMRE | Generating a Novel Dataset of Multimodal Referring Expressions | Paper | Dataset |
| Chinese Whispers | Chinese Whispers: A Multimodal Dataset for Embodied Language Grounding | Paper | Dataset |
| uulmMAC | The uulmMAC Database: A Multimodal Affective Corpus for Affective Computing in Human-Computer Interaction | Paper | Dataset |
7. Semantic Analysis

| Dataset | Title of the Paper | Link of the Paper | Link of the Dataset |
| --- | --- | --- | --- |
| WN9-IMG | Image-embodied Knowledge Representation Learning | Paper | Dataset |
| Wikimedia Commons | A Dataset and Reranking Method for Multimodal MT of User-Generated Image Captions | Paper | Dataset |
| Starsem18-multimodalKB | A Multimodal Translation-Based Approach for Knowledge Graph Representation Learning | Paper | Dataset |
| MUStARD | Towards Multimodal Sarcasm Detection | Paper | Dataset |
| YouMakeup | YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension | Paper | Dataset |
| MDID | Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts | Paper | Dataset |
| Social media posts from Flickr (Mental Health) | Inferring Social Media Users' Mental Health Status from Multimodal Information | Paper | Dataset |
| Twitter-MEL | Building a Multimodal Entity Linking Dataset From Tweets | Paper | Dataset |
| MultiMET | MultiMET: A Multimodal Dataset for Metaphor Understanding | Paper | - |
| MSDS | Multimodal Sarcasm Detection in Spanish: A Dataset and a Baseline | Paper | Dataset |
8. Miscellaneous

| Dataset | Title of the Paper | Link of the Paper | Link of the Dataset |
| --- | --- | --- | --- |
| MS COCO | Microsoft COCO: Common Objects in Context | Paper | Dataset |
| ILSVRC | ImageNet Large Scale Visual Recognition Challenge | Paper | Dataset |
| YFCC100M | YFCC100M: The New Data in Multimedia Research | Paper | Dataset |
| COGNIMUSE | COGNIMUSE: A Multimodal Video Database Annotated with Saliency, Events, Semantics and Emotion with Application to Summarization | Paper | Dataset |
| SNAG | SNAG: Spoken Narratives and Gaze Dataset | Paper | Dataset |
| UR-FUNNY | UR-FUNNY: A Multimodal Language Dataset for Understanding Humor | Paper | Dataset |
| Bag-of-Lies | Bag-of-Lies: A Multimodal Dataset for Deception Detection | Paper | Dataset |
| MARC | A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks | Paper | Dataset |
| MuSE | MuSE: A Multimodal Dataset of Stressed Emotion | Paper | Dataset |
| BabelPic | Fatality Killed the Cat or: BabelPic, a Multimodal Dataset for Non-Concrete Concepts | Paper | Dataset |
| Eye4Ref | Eye4Ref: A Multimodal Eye Movement Dataset of Referentially Complex Situations | Paper | - |
| Troll Memes | A Dataset for Troll Classification of TamilMemes | Paper | Dataset |
| SEMD | EmoSen: Generating Sentiment and Emotion Controlled Responses in a Multimodal Dialogue System | Paper | - |
| Chat talk Corpus | Construction and Analysis of a Multimodal Chat-talk Corpus for Dialog Systems Considering Interpersonal Closeness | Paper | - |
| EMOTyDA | Towards Emotion-aided Multi-modal Dialogue Act Classification | Paper | Dataset |
| MELINDA | MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification | Paper | Dataset |
| NewsCLIPpings | NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media | Paper | Dataset |
| R2VQ | Designing Multimodal Datasets for NLP Challenges | Paper | Dataset |
| M2H2 | M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations | Paper | Dataset |
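Since the catalog above is maintained by hand, it can be convenient to mirror the tables as machine-readable metadata that scripts can filter by task or availability. The sketch below is illustrative only: the `DatasetEntry` record and `by_task` helper are hypothetical names, not part of this repository, and only a few rows from the tables are copied in.

```python
from dataclasses import dataclass

@dataclass
class DatasetEntry:
    """One row of the catalog: dataset name, source paper, NLP task, availability."""
    name: str
    paper_title: str
    task: str
    availability: str  # "Dataset", "On Request", or "-" as in the tables above

# A few example rows transcribed from the tables above.
CATALOG = [
    DatasetEntry("CMU-MOSI",
                 "MOSI: Multimodal Corpus of Sentiment Intensity and Subjectivity "
                 "Analysis in Online Opinion Videos",
                 "Sentiment Analysis", "Dataset"),
    DatasetEntry("Multi30K",
                 "Multi30K: Multilingual English-German Image Descriptions",
                 "Machine Translation", "Dataset"),
    DatasetEntry("MIMOQA",
                 "MIMOQA: Multimodal Input Multimodal Output Question Answering",
                 "Question Answering", "-"),
]

def by_task(task: str) -> list[DatasetEntry]:
    """Return all catalog entries recorded for the given NLP task."""
    return [entry for entry in CATALOG if entry.task == task]

print([entry.name for entry in by_task("Sentiment Analysis")])  # ['CMU-MOSI']
```

A full version of this structure would simply carry one `DatasetEntry` per table row, plus the paper and dataset URLs once they are filled in.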
