# Welcome!

This example shows a basic use of MultiBench. In particular, it demonstrates how to use MultiBench with the affective computing dataset MOSEI and a very simple fusion model.

Although the example is simple, it covers most of MultiBench's capabilities and most of the conventions at the heart of the system.

To begin, let's clone the repo and set up our interpreter to run commands inside the folder.

In [1]:
!git clone https://github.com/pliang279/MultiBench.git
%cd MultiBench

Cloning into 'MultiBench'...
remote: Enumerating objects: 6937, done.
remote: Counting objects: 100% (148/148), done.
remote: Compressing objects: 100% (88/88), done.
remote: Total 6937 (delta 68), reused 121 (delta 60), pack-reused 6789
Receiving objects: 100% (6937/6937), 51.07 MiB | 23.01 MiB/s, done.
Resolving deltas: 100% (4254/4254), done.
/content/MultiBench


Try downloading the MOSEI data file with the commands below. If this does not work for you, download the file to your machine and upload it to "/content/MultiBench/", which is where the dataloader call later in this notebook looks for it.

In [2]:
!mkdir data
!pip install gdown
# && gdown https://drive.google.com/u/0/uc?id=1szKIqO0t3Be_W91xvf6aYmsVVUa7wDHU



In [3]:
!gdown https://drive.google.com/uc?id=180l4pN6XAv8-OAYQ6OrMheFUMwtqUWbz

Downloading...
From: https://drive.google.com/uc?id=180l4pN6XAv8-OAYQ6OrMheFUMwtqUWbz
To: /content/MultiBench/mosei_senti_data.pkl
100% 3.73G/3.73G [00:42<00:00, 86.8MB/s]
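
Before moving on, it can help to confirm that the file landed where the later dataloader call expects it. The snippet below is a minimal sketch using only the standard library; the helper name `first_existing` is hypothetical, and the second candidate path is an assumption for the case where the file was uploaded into the `data` folder instead.

```python
import os


def first_existing(paths):
    """Return the first path in `paths` that is an existing file, else None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


# Candidate locations (adjust these for your own setup).
candidates = [
    "/content/MultiBench/mosei_senti_data.pkl",       # where gdown saved it above
    "/content/MultiBench/data/mosei_senti_data.pkl",  # assumed manual upload location
]

data_path = first_existing(candidates)
print("Data file:", data_path if data_path else "not found")
```

If the file turns up in a different location, pass that path to `get_dataloader` instead of the hard-coded one.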


Colab handles Conda environment files poorly, so we'll install the dependencies manually. Note that other systems might require a longer list of dependencies.

# Model

From here, let's import some of MultiBench and get working:

In [1]:
import torch
import sys
import os
%cd MultiBench

/content/MultiBench


First, we'll import and create the dataloaders for the MOSEI dataset, which we're working with:

In [2]:
# Import the associated dataloader for affect datasets, which MOSEI is a part of.
from datasets.affect.get_data import get_dataloader

# Create the training, validation, and test-set dataloaders.
traindata, validdata, testdata = get_dataloader(
    '/content/MultiBench/mosei_senti_data.pkl', robust_test=False, max_pad=True, data_type='mosei', max_seq_len=50)
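
The `max_pad=True` and `max_seq_len=50` arguments suggest that every sequence is brought to a fixed length of 50 time steps. The sketch below illustrates that idea in plain PyTorch. It is not MultiBench's implementation: the helper name `pad_or_truncate`, zero padding at the end, and truncation from the end are all assumptions made for illustration.

```python
import torch


def pad_or_truncate(seq, max_len=50):
    """Zero-pad or truncate a (time, features) tensor to exactly max_len steps."""
    steps, feat = seq.shape
    if steps >= max_len:
        # Keep only the first max_len time steps (assumed truncation policy).
        return seq[:max_len]
    # Append rows of zeros until the sequence has max_len time steps.
    padding = torch.zeros(max_len - steps, feat, dtype=seq.dtype)
    return torch.cat([seq, padding], dim=0)


short = pad_or_truncate(torch.ones(30, 5))
long = pad_or_truncate(torch.ones(80, 5))
print(short.shape, long.shape)
```

Fixed-length sequences let a batch be stacked into a single tensor, which is consistent with the `is_packed=False` setting used in training later on.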

Then, let's define the multimodal model to test. MultiBench divides models into three parts: encoders, a fusion module, and a head.

First, let's define the encoders for the raw modality information, which come from the "unimodals" section of MultiBench:

In [3]:
# Here, we'll import several common modules should you want to mess with this more.
from unimodals.common_models import GRU, MLP, Sequential, Identity

# As this example is meant to be simple and easy to train, we'll pass in identity
# functions for each of the three modalities in MOSEI:
encoders = [Identity().cuda(), Identity().cuda(), Identity().cuda()]
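
The `.cuda()` calls above assume a GPU runtime. As a hedged sketch for machines without one, the snippet below picks a device at run time. It uses `torch.nn.Identity` as a stand-in for MultiBench's `Identity` (an assumption: both are taken to be simple pass-through modules), and the name `encoders_sketch` and the input shape are illustrative only.

```python
import torch

# Fall back to the CPU when no CUDA device is visible.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# torch.nn.Identity stands in for MultiBench's Identity here (assumption).
encoders_sketch = [torch.nn.Identity().to(device) for _ in range(3)]

# An identity encoder returns its input unchanged.
x = torch.randn(2, 50, 35, device=device)
print(torch.equal(encoders_sketch[0](x), x))
```

An identity encoder contributes no trainable parameters, so all of the learning in this example happens in the head.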

Then, let's define the fusion paradigm, which governs how we combine the encoded modalities.

For this example, we'll use ConcatEarly fusion, which simply concatenates the inputs along the feature dimension (dimension 2 for tensors shaped batch, time, features).

In [4]:
# Import a fusion paradigm, in this case early concatenation.
from fusions.common_fusions import ConcatEarly  # noqa

# Initialize the fusion module
fusion = ConcatEarly().cuda()
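
To make the effect of early concatenation concrete, here is a small sketch in plain PyTorch. The per-modality feature sizes (35, 74, and 300) are assumptions chosen so that they sum to the 409 input features used by the head in this example; the batch size is arbitrary, and the sequence length of 50 matches `max_seq_len` above.

```python
import torch

batch, seq_len = 4, 50

# Three modalities shaped (batch, time, features); feature sizes are assumed.
vision = torch.randn(batch, seq_len, 35)
audio = torch.randn(batch, seq_len, 74)
text = torch.randn(batch, seq_len, 300)

# Early fusion: join the feature vectors at every time step.
fused = torch.cat([vision, audio, text], dim=2)
print(fused.shape)  # torch.Size([4, 50, 409])
```

Because the batch and time dimensions must match across modalities for this to work, early concatenation relies on the sequences having been aligned and padded to a common length.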

Lastly, we'll define a 'head' module, which takes the output of the fusion module and transforms it into an output that corresponds to our problem: sentiment regression.

In [5]:
head = Sequential(GRU(409, 512, dropout=True, has_padding=False,
                  batch_first=True, last_only=True), MLP(512, 512, 1)).cuda()
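
For readers who want to see the shapes involved, the following is a rough equivalent of such a head written with plain `torch.nn` modules. It is a sketch, not MultiBench's code: the class name `GRUHead` is hypothetical, dropout is omitted, and it assumes that `last_only=True` means only the final hidden state is passed on and that `MLP(512, 512, 1)` is a two-layer perceptron with a ReLU in between.

```python
import torch
from torch import nn


class GRUHead(nn.Module):
    """GRU over the fused sequence, followed by a small MLP on the last hidden state."""

    def __init__(self, in_dim=409, hidden=512, out_dim=1):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        # h_n has shape (num_layers, batch, hidden); take the last layer's state.
        _, h_n = self.gru(x)
        return self.mlp(h_n[-1])


head_sketch = GRUHead()
out = head_sketch(torch.randn(4, 50, 409))
print(out.shape)  # torch.Size([4, 1])
```

The single output per example is what makes this head suitable for a regression objective such as mean squared error.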

And with that, we're almost done! Now we just need to put them into one of MultiBench's training loops, and set it running:

In [6]:
# Standard supervised learning training loop
from training_structures.Supervised_Learning import train, test

# For more information regarding parameters for any system, feel free to check out the documentation
# at multibench.readthedocs.io!
train(encoders, fusion, head, traindata, validdata, 100, task="regression", optimtype=torch.optim.AdamW,
      is_packed=False, lr=1e-3, save='mosei_ef_r0.pt', weight_decay=0.01, objective=torch.nn.MSELoss(), track_complexity=False)

print("Testing:")
model = torch.load('mosei_ef_r0.pt').cuda()
test(model, testdata, 'affect', is_packed=False,
     criterion=torch.nn.MSELoss(), task="regression", no_robust=True)

hi
epoch  0
Epoch 0 train loss: tensor(1.2816, device='cuda:0', grad_fn=<DivBackward0>) train MAE: 0.8519967867423948
Epoch 0 valid loss: 1.0844184160232544 valid MAE: 0.7779296504850851
Saving Best
epoch  1
Epoch 1 train loss: tensor(1.2374, device='cuda:0', grad_fn=<DivBackward0>) train MAE: 0.8392663315018708
Epoch 1 valid loss: 0.9007430672645569 valid MAE: 0.7271371780669963
Saving Best
epoch  2
Epoch 2 train loss: tensor(1.0101, device='cuda:0', grad_fn=<DivBackward0>) train MAE: 0.7708404655208999
Epoch 2 valid loss: 0.80674147605896 valid MAE: 0.683269563633819
Saving Best
epoch  3
Epoch 3 train loss: tensor(0.8827, device='cuda:0', grad_fn=<DivBackward0>) train MAE: 0.7158025459229231
Epoch 3 valid loss: 0.7546402215957642 valid MAE: 0.6556885658203758
Saving Best
epoch  4
Epoch 4 train loss: tensor(0.8019, device='cuda:0', grad_fn=<DivBackward0>) train MAE: 0.680856250751451
Epoch 4 valid loss: 0.7231867909431458 valid MAE: 0.6444933073861259
Saving Best
epoch  5
Epoch 5 trai
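
The log reports two numbers per epoch: the loss, which here comes from `torch.nn.MSELoss()` (mean squared error), and the mean absolute error (MAE). As a small illustration on made-up values (not taken from the dataset), both metrics can be computed by hand. The function names `mse` and `mae` are chosen for this sketch only.

```python
def mse(preds, targets):
    """Mean squared error between two equal-length sequences of numbers."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def mae(preds, targets):
    """Mean absolute error between two equal-length sequences of numbers."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


# Made-up predictions and labels, for illustration only.
preds = [0.5, -1.0, 2.0]
targets = [1.0, -1.0, 0.0]

print("MSE:", mse(preds, targets))
print("MAE:", mae(preds, targets))
```

MSE squares each error, so a single large miss dominates it, while MAE weights all errors linearly. This is why the two columns in the log can move at different rates.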

And with that, you've taken your first step into using MultiBench! We hope you find the library useful. If anything about how to use the package is unclear, feel free to open an issue on GitHub.

In [10]:
!pip install memory_profiler

Collecting memory_profiler
  Downloading memory_profiler-0.61.0-py3-none-any.whl (31 kB)
Installing collected packages: memory_profiler
Successfully installed memory_profiler-0.61.0


In [12]:
!pip install -U scikit-learn

Collecting scikit-learn
  Downloading scikit_learn-1.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (10.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.8/10.8 MB 35.0 MB/s eta 0:00:00
Installing collected packages: scikit-learn
  Attempting uninstall: scikit-learn
    Found existing installation: scikit-learn 1.2.2
    Uninstalling scikit-learn-1.2.2:
      Successfully uninstalled scikit-learn-1.2.2
Successfully installed scikit-learn-1.3.2


In [7]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [8]:
!mv /content/MultiBench/mosei_ef_r0.pt /content/drive/MyDrive