# Welcome!

This example shows a very basic use case of MultiBench: how to use it with the affective computing dataset MOSI and a very simple fusion model.

While the example is simple, it shows off most of MultiBench's capabilities and most of the conventions at the heart of the system.

To begin, let's clone the repo and set up our interpreter to run commands inside the folder.

In [1]:
!git clone https://github.com/pliang279/MultiBench.git
%cd MultiBench

Cloning into 'MultiBench'...
remote: Enumerating objects: 6943, done.
remote: Counting objects: 100% (154/154), done.
remote: Compressing objects: 100% (94/94), done.
remote: Total 6943 (delta 72), reused 121 (delta 60), pack-reused 6789
Receiving objects: 100% (6943/6943), 51.07 MiB | 20.67 MiB/s, done.
Resolving deltas: 100% (4258/4258), done.
/content/MultiBench


In [12]:
!pip install -U memory_profiler

Collecting memory_profiler
  Downloading memory_profiler-0.61.0-py3-none-any.whl (31 kB)
Installing collected packages: memory_profiler
Successfully installed memory_profiler-0.61.0


Try to download the data folder, which holds the MOSI, MOSEI, humor, and sarcasm files, using the command below. If this does not work for you, please download the data files locally and upload them to the folder "/content/MultiBench/data/".

In [4]:
!pip install gdown && gdown https://drive.google.com/drive/folders/1drao0aXgS1tPCr55riqMLfo0pMPzDAR-?usp=sharing -O /content/MultiBench/ --folder

Retrieving folder contents
Processing file 1i8ePqY6HQ3m4joa5bjNRj5q98F9Bo4lF humor.pkl
Processing file 1uhUloaigNUMYU7Wq_EZj58pCGIoto34T mosei_raw.pkl
Processing file 13McsNwX1_kTnA3i2KyDWlLWnh52Kvz4V mosi_raw.pkl
Processing file 1dN7AT3ytEnCgL18FnKN-jXN4GkEQFKZS sarcasm.pkl
Retrieving folder contents completed
Building directory structure
Building directory structure completed
Downloading...
From (original): https://drive.google.com/uc?id=1i8ePqY6HQ3m4joa5bjNRj5q98F9Bo4lF
From (redirected): https://drive.google.com/uc?id=1i8ePqY6HQ3m4joa5bjNRj5q98F9Bo4lF&confirm=t&uuid=0bb7924c-18f5-476d-b9ed-f5f24c37b506
To: /content/MultiBench/data/humor.pkl
100% 1.22G/1.22G [00:13<00:00, 89.2MB/s]
Downloading...
From (original): https://drive.google.com/uc?id=1uhUloaigNUMYU7Wq_EZj58pCGIoto34T
From (redirected): https://drive.google.com/uc?id=1uhUloaigNUMYU7Wq_EZj58pCGIoto34T&confirm=t&uuid=2dd28ee5-af8a-492a-91bf-f05f37d14d59
To: /content/MultiBench/data/mosei_raw.pkl
100% 9.94G/9.94G [01:48<00:00,
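
Before moving on, it can help to confirm that the files actually landed in the data folder, since the rest of this example reads "mosi_raw.pkl" from it. The check below is a minimal sketch and not part of MultiBench: the helper name missing_data_files is hypothetical, and the expected file names are taken from the download log above.

```python
from pathlib import Path

# File names as they appear in the gdown log above.
EXPECTED_FILES = ["humor.pkl", "mosei_raw.pkl", "mosi_raw.pkl", "sarcasm.pkl"]


def missing_data_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    data_path = Path(data_dir)
    return [name for name in expected if not (data_path / name).is_file()]


# Path used throughout this notebook; adjust it when running outside Colab.
missing = missing_data_files("/content/MultiBench/data")
if missing:
    print("Missing files, upload them manually:", missing)
else:
    print("All data files found.")
```

If anything is reported missing, upload it to the data folder by hand as described above before creating the dataloaders.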

Because Colab handles Conda environment files poorly, we install the dependencies manually. Please note that other systems might require a longer list of dependencies to be installed.

From here, let's import some of MultiBench and get working:

In [5]:
import torch
import sys
import os

First, we'll import and create the dataloader for the MOSI dataset, which we're working with:

In [6]:
# Import the associated dataloader for affect datasets, which MOSI is a part of.
from datasets.affect.get_data import get_dataloader

# Create the training, validation, and test-set dataloaders.
traindata, validdata, testdata = get_dataloader(
    '/content/MultiBench/data/mosi_raw.pkl', robust_test=False, max_pad=True, data_type='mosi', max_seq_len=50)
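
The max_pad=True and max_seq_len=50 arguments ask for every sequence to come out with the same number of time steps. The sketch below illustrates the general idea in plain Python, assuming that longer sequences are truncated and shorter ones are zero-padded at the end. The helper pad_or_truncate is hypothetical, and the actual MultiBench dataloader may differ in its details.

```python
def pad_or_truncate(sequence, max_seq_len, feature_dim):
    """Return a copy of `sequence` with exactly `max_seq_len` time steps.

    `sequence` is a list of time steps, each a list of `feature_dim` floats.
    """
    # Keep at most max_seq_len steps (copying each step to avoid aliasing).
    clipped = [list(step) for step in sequence[:max_seq_len]]
    # Fill whatever is left with all-zero steps.
    padding = [[0.0] * feature_dim for _ in range(max_seq_len - len(clipped))]
    return clipped + padding


short_seq = [[1.0, 2.0], [3.0, 4.0]]            # 2 time steps, 2 features
long_seq = [[float(t), 0.5] for t in range(8)]  # 8 time steps, 2 features

print(len(pad_or_truncate(short_seq, 5, 2)))  # 5
print(len(pad_or_truncate(long_seq, 5, 2)))   # 5
print(pad_or_truncate(short_seq, 5, 2)[-1])   # [0.0, 0.0]
```

Fixed-length sequences are what allow a batch to be stacked into a single tensor, which is why the later cells can pass is_packed=False.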

Then, let's define the multimodal model to test. MultiBench divides models into three separate portions: unimodal encoders, a fusion module, and a head.

First, let's define the encoders for the raw modality information, which come from the "unimodals" section of MultiBench:

In [7]:
# Here, we'll import several common modules in case you want to experiment further.
from unimodals.common_models import GRU, MLP, Sequential, Identity

# As this example is meant to be simple and easy to train, we'll pass in identity
# functions for each of the modalities in MOSI:
encoders = [Identity().cuda(), Identity().cuda(), Identity().cuda()]

Then, let's define the fusion paradigm, which governs how the modalities are combined.

For this example, we'll use ConcatEarly fusion, which simply concatenates the inputs along dimension 2 (the feature dimension for batch-first inputs).

In [8]:
# Import a fusion paradigm, in this case early concatenation.
from fusions.common_fusions import ConcatEarly  # noqa

# Initialize the fusion module
fusion = ConcatEarly().cuda()
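
To make the shapes concrete, here is a plain-Python sketch of early concatenation for a single sample: at every time step, the feature vectors of all modalities are joined into one longer vector. The function concat_early_sketch is illustrative and is not MultiBench's implementation. The per-modality feature sizes (35, 74, and 300) are an assumption about the MOSI features; the notebook itself only confirms that the head defined next expects 409 input features, which matches their sum.

```python
def concat_early_sketch(modalities):
    """Join per-time-step feature vectors from several modalities.

    `modalities` is a list of sequences; each sequence is a list of time
    steps, and each time step is a list of floats. All sequences must have
    the same number of time steps.
    """
    lengths = {len(seq) for seq in modalities}
    if len(lengths) != 1:
        raise ValueError("all modalities need the same number of time steps")
    fused = []
    for steps in zip(*modalities):
        fused_step = []
        for features in steps:
            fused_step.extend(features)
        fused.append(fused_step)
    return fused


seq_len = 50                   # matches max_seq_len in the dataloader call
feature_sizes = [35, 74, 300]  # ASSUMED MOSI feature sizes, see note above
sample = [[[0.0] * size for _ in range(seq_len)] for size in feature_sizes]

fused = concat_early_sketch(sample)
print(len(fused), len(fused[0]))  # 50 409
```

The number of time steps is unchanged; only the feature axis grows, so whatever consumes the fused output must accept the summed feature size.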

Lastly, we'll define a 'head' module, which takes the output of the fusion module and transforms it into an output that corresponds to our problem: sentiment analysis on MOSI.

In [10]:
head = Sequential(GRU(409, 512, dropout=True, has_padding=False,
                  batch_first=True, last_only=True), MLP(512, 512, 1)).cuda()
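
At this point all three parts exist. The sketch below shows, with plain Python callables, how such parts are typically chained: each encoder sees its own modality, the fusion step combines the encoded outputs, and the head maps the fused result to a prediction. The class ThreePartModel and the toy callables are hypothetical stand-ins, not MultiBench classes.

```python
class ThreePartModel:
    """Chain per-modality encoders, a fusion step, and a head."""

    def __init__(self, encoders, fusion, head):
        self.encoders = encoders
        self.fusion = fusion
        self.head = head

    def __call__(self, inputs):
        if len(inputs) != len(self.encoders):
            raise ValueError("need exactly one input per encoder")
        # Encoder i only ever sees modality i.
        encoded = [enc(x) for enc, x in zip(self.encoders, inputs)]
        return self.head(self.fusion(encoded))


def identity(x):
    # Plays the role of an identity encoder: the input passes through as is.
    return x


def concat(vectors):
    # Toy fusion: join the feature vectors into one flat list.
    return [value for vector in vectors for value in vector]


def mean_head(fused):
    # Toy head: collapse the fused vector into a single number.
    return sum(fused) / len(fused)


model = ThreePartModel([identity, identity, identity], concat, mean_head)
print(model([[1.0, 2.0], [3.0], [4.0, 5.0]]))  # 3.0
```

Keeping the three parts separate means any one of them can be swapped, for example replacing an identity encoder with a GRU, without touching the other two.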

And with that, we're almost done! Now we just need to put them into one of MultiBench's training loops, and set it running:

In [13]:
# Standard supervised learning training loop
from training_structures.Supervised_Learning import train, test

# For more information regarding parameters for any system, feel free to check out the documentation
# at multibench.readthedocs.io!
train(encoders, fusion, head, traindata, validdata, 100, task="regression", optimtype=torch.optim.AdamW,
      is_packed=False, lr=1e-3, save='mosi_ef_r0.pt', weight_decay=0.01, objective=torch.nn.L1Loss())

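print("Testing:")
# Load the best model that train() saved to 'mosi_ef_r0.pt'.
# Note: on PyTorch 2.6 and later, torch.load defaults to weights_only=True, so
# loading a whole pickled model may require passing weights_only=False.
model = torch.load('mosi_ef_r0.pt').cuda()
# Training above is a regression task, while this test is run with
# task="posneg-classification" (positive vs. negative sentiment).
test(model, testdata, 'affect', is_packed=False,
     criterion=torch.nn.L1Loss(), task="posneg-classification", no_robust=True)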

Epoch 0 train loss: tensor(1.3242, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 0 valid loss: 1.3936829566955566
Saving Best
Epoch 1 train loss: tensor(1.3236, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 1 valid loss: 1.3801870346069336
Saving Best
Epoch 2 train loss: tensor(1.3197, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 2 valid loss: 1.3857842683792114
Epoch 3 train loss: tensor(1.3203, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 3 valid loss: 1.3853166103363037
Epoch 4 train loss: tensor(1.3208, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 4 valid loss: 1.3821818828582764
Epoch 5 train loss: tensor(1.3221, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 5 valid loss: 1.3856353759765625
Epoch 6 train loss: tensor(1.3179, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 6 valid loss: 1.3768967390060425
Saving Best
Epoch 7 train loss: tensor(1.3176, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 7 valid loss: 1.3840895891189575
Epoch 8 train loss: tensor(1.3198, device='c
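
The "Saving Best" lines in the log appear whenever the validation loss drops below every earlier value, which suggests the checkpoint file is rewritten only on improvement. The sketch below reproduces that pattern in plain Python using the validation losses printed above, rounded to four decimals. The function epochs_that_save is hypothetical, and the training loop's actual rule is an assumption inferred from the log.

```python
def epochs_that_save(valid_losses):
    """Return the epochs whose validation loss beats all earlier epochs."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(valid_losses):
        if loss < best:
            best = loss
            saved.append(epoch)
    return saved


# Validation losses for epochs 0 to 7, copied from the log above.
logged = [1.3937, 1.3802, 1.3858, 1.3853, 1.3822, 1.3856, 1.3769, 1.3841]
print(epochs_that_save(logged))  # [0, 1, 6]
```

Under that assumption, the model loaded for testing corresponds to the epoch with the lowest validation loss, which is not necessarily the last epoch.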

And with that, you've taken your first step into using MultiBench! We hope you find the library useful. Feel free to open an issue on GitHub if anything about using the package is unclear.