Description

This repo contains the source code for the INTERSPEECH 2021 paper "M3: MultiModal Masking applied to sentiment analysis".

Introduction

This paper presents $M^3$, a generic lightweight layer that can be embedded in multimodal architectures without any modifications and without any additional learnable parameters. $M^3$ takes as input representations from various modalities, e.g. text, audio, and visual. It then randomly either masks one of them or leaves the total representation unaffected. $M^3$ is applied at every time step of the multimodal sequence and acts as a form of regularization.
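The masking step described above can be illustrated with a minimal, framework-free sketch. It is not the code in this repository: the function name `m3_mask`, the `p_mask` parameter, the choice of zeroing as the masking operation, and the uniform choice of which modality to mask are all assumptions made for illustration. Each modality is represented as a plain list of per-time-step feature vectors.

```python
import random
from typing import Dict, List, Optional

Sequence = List[List[float]]  # one feature vector per time step


def m3_mask(
    modalities: Dict[str, Sequence],
    p_mask: float = 0.2,  # hypothetical masking probability
    rng: Optional[random.Random] = None,
) -> Dict[str, Sequence]:
    """At each time step, with probability p_mask, zero out one modality.

    Illustrative sketch only: the modality to mask is picked uniformly at
    random, and masking is implemented as replacing the vector with zeros.
    No learnable parameters are involved.
    """
    if rng is None:
        rng = random.Random()
    names = list(modalities)
    n_steps = len(modalities[names[0]])
    # Copy the inputs so the caller's data is left untouched.
    out = {name: [list(vec) for vec in seq] for name, seq in modalities.items()}
    for t in range(n_steps):
        if rng.random() < p_mask:
            masked = rng.choice(names)  # exactly one modality is masked
            out[masked][t] = [0.0] * len(out[masked][t])
        # Otherwise the representation at step t is left unaffected.
    return out


if __name__ == "__main__":
    features = {
        "text": [[1.0, 2.0]] * 4,
        "audio": [[3.0]] * 4,
        "visual": [[4.0, 5.0, 6.0]] * 4,
    }
    print(m3_mask(features, p_mask=0.5, rng=random.Random(0)))
```

Since the layer acts as a regularizer, a sketch like this would typically be applied only during training, with the inputs passed through unchanged at inference time; that choice is also an assumption here.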

Prerequisites

Dependencies

Python >= 3.7.3
PyTorch == 1.7.1
slp == 1.1.6

Setup

Clone repo with CMU Multimodal SDK submodule

# git version < 2.1.2
git clone --recursive https://github.com/efthymisgeo/multimodal-masking.git

# git version > 2.1.2
git clone --recurse-submodules https://github.com/efthymisgeo/multimodal-masking.git

Create virtualenv and install dependencies

# Ensure your python version is >= 3.7.3

pip install poetry
poetry install

Download data using CMU Multimodal SDK

mkdir -p data
python cmusdk.py data/

M³ Experiments

Optional

poetry shell
export PYTHONPATH=$PYTHONPATH:./CMU-MultimodalSDK

Reproduce the result in Table 1 of the paper

python experiments/main.py --config configs/m3-rnn-hard-0.2-before.yaml --m3_sequential --m3_masking --use-mmdrop-before --gpus 1 --offline

Reproduce the best results, illustrated in Table 2

python experiments/main.py --config configs/m3-rnn-drop-text-0.6-hard-0.2-before.yaml --m3_sequential --m3_masking --use-mmdrop-before --gpus 1 --offline

For further experimentation, we suggest creating custom .yaml config files under the configs folder and running:

python experiments/main.py --config configs/<myconf.yaml> --offline --gpus 1

Reference

If you find our work useful for your research, please include the following citation

@inproceedings{georgiou21_interspeech,
  author={Efthymios Georgiou and Georgios Paraskevopoulos and Alexandros Potamianos},
  title={{M3: MultiModal Masking Applied to Sentiment Analysis}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={2876--2880},
  doi={10.21437/Interspeech.2021-1739}
}

TODOs

Upload pickle with features
