# Welcome!

This example shows a very basic use case of MultiBench: training a very simple fusion model on the affective computing dataset MOSI.

Although the example is simple, it shows off most of MultiBench's capabilities and most of the conventions at the heart of the system.

To begin, let's clone the repo and set up our interpreter to run commands inside the folder.

In [1]:
!git clone https://github.com/pliang279/MultiBench.git
%cd MultiBench

Cloning into 'MultiBench'...
remote: Enumerating objects: 4890, done.
remote: Counting objects: 100% (1906/1906), done.
remote: Compressing objects: 100% (1019/1019), done.
remote: Total 4890 (delta 1289), reused 1369 (delta 884), pack-reused 2984
Receiving objects: 100% (4890/4890), 46.51 MiB | 17.39 MiB/s, done.
Resolving deltas: 100% (3349/3349), done.
/content/MultiBench


Try to download the data file for MOSI using the command below. If this does not work for you, download the data file in your browser and upload it to the folder "/content/MultiBench/data/". The dataloader used later expects to find it at "/content/MultiBench/data/mosi_raw.pkl".

In [8]:
!mkdir data
!pip install gdown && gdown https://drive.google.com/u/0/uc?id=1szKIqO0t3Be_W91xvf6aYmsVVUa7wDHU

mkdir: cannot create directory ‘data’: File exists
Access denied with the following error:

 	Cannot retrieve the public link of the file. You may need to change
	the permission to 'Anyone with the link', or have had many accesses. 

You may still be able to access the file from the browser:

	 https://drive.google.com/u/0/uc?id=1szKIqO0t3Be_W91xvf6aYmsVVUa7wDHU 



Because Colab handles Conda environment files poorly, we'll install the dependencies manually. Note that other systems might require installing a longer list of dependencies.

From here, let's import some of MultiBench and get working:

In [12]:
import torch
import sys
import os

First, we'll import and create the dataloaders for the MOSI dataset:

In [None]:
# Import the associated dataloader for affect datasets, which MOSI is a part of.
from datasets.affect.get_data import get_dataloader

# Create the training, validation, and test-set dataloaders. 
traindata, validdata, testdata = get_dataloader(
    '/content/MultiBench/data/mosi_raw.pkl', robust_test=False, max_pad=True, data_type='mosi', max_seq_len=50)
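
The `max_pad=True` and `max_seq_len=50` arguments suggest that every sequence is brought to a fixed length of 50 timesteps. The sketch below illustrates that idea in plain Python. The helper `pad_or_truncate` is hypothetical, and padding with zeros at the end is an assumption, not necessarily what MultiBench does internally.

```python
def pad_or_truncate(sequence, max_seq_len, feature_dim):
    """Return a copy of `sequence` with exactly `max_seq_len` timesteps.

    `sequence` is a list of per-timestep feature vectors (lists of floats).
    Assumption: extra timesteps are dropped from the end, and missing ones
    are filled with all-zero vectors appended at the end.
    """
    clipped = [list(step) for step in sequence[:max_seq_len]]
    missing = max_seq_len - len(clipped)
    padding = [[0.0] * feature_dim for _ in range(missing)]
    return clipped + padding


# A short clip (3 timesteps) and a long clip (60 timesteps), 2 features each.
short_clip = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
long_clip = [[float(t), float(t)] for t in range(60)]

padded_short = pad_or_truncate(short_clip, max_seq_len=50, feature_dim=2)
padded_long = pad_or_truncate(long_clip, max_seq_len=50, feature_dim=2)
print(len(padded_short), len(padded_long))
```

Fixing the length this way lets every example in a batch share one tensor shape, which is what allows the unpacked (`is_packed=False`) training call later in this example.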

Next, let's define the multimodal model to test. MultiBench divides models into three separate parts: per-modality encoders, a fusion module, and a head.

First, let's define the encoders for the raw modality inputs, which come from the "unimodals" section of MultiBench:

In [14]:
# Import several common modules in case you want to experiment with this further.
from unimodals.common_models import GRU, MLP, Sequential, Identity 

# As this example is meant to be simple and easy to train, we'll pass in identity
# functions for each of the modalities in MOSI:
encoders = [Identity().cuda(), Identity().cuda(), Identity().cuda()]

Next, let's define the fusion paradigm, which governs how the modalities are combined.

For this example, we'll use ConcatEarly fusion, which simply concatenates the inputs along the feature dimension (index 2, after the batch and time dimensions).

In [None]:
# Import a fusion paradigm, in this case early concatenation.
from fusions.common_fusions import ConcatEarly  # noqa

# Initialize the fusion module
fusion = ConcatEarly().cuda()
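
To make the effect of early concatenation concrete, the sketch below joins three dummy modality tensors with `torch.cat`. The feature sizes 35, 74, and 300 are assumptions chosen for illustration (they sum to the 409 input features that the head in the next cell expects), and concatenating along the last axis is assumed here rather than taken from the `ConcatEarly` source.

```python
import torch

batch_size, seq_len = 4, 50  # seq_len matches max_seq_len=50 from the dataloader

# Dummy stand-ins for three modalities, shaped (batch, time, features).
# The feature sizes are illustrative assumptions, not read from the data file.
modality_a = torch.zeros(batch_size, seq_len, 35)
modality_b = torch.zeros(batch_size, seq_len, 74)
modality_c = torch.zeros(batch_size, seq_len, 300)

# Early fusion by concatenation: keep batch and time, join the feature vectors.
fused = torch.cat([modality_a, modality_b, modality_c], dim=2)
print(fused.shape)
```

Because the batch and time dimensions must match for this to work, early concatenation relies on all modalities being aligned to the same sequence length.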

Lastly, we'll define a 'head' module, which takes the output of the fusion module and transforms it into an output that corresponds to our problem: sentiment prediction.

In [15]:
head = Sequential(GRU(409, 512, dropout=True, has_padding=False,
                  batch_first=True, last_only=True), MLP(512, 512, 1)).cuda()
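
The head maps a fused sequence to one number per example. The sketch below reproduces that shape flow with plain PyTorch modules. `LastStateGRU` is a hypothetical stand-in for `GRU(..., last_only=True)`, the two linear layers standing in for `MLP(512, 512, 1)` are an assumption about its structure, and dropout is left out for brevity.

```python
import torch
from torch import nn


class LastStateGRU(nn.Module):
    """Run a GRU over (batch, time, features) input and keep the final hidden state."""

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, batch_first=True)

    def forward(self, x):
        _, final_hidden = self.gru(x)  # final_hidden: (num_layers, batch, hidden_dim)
        return final_hidden[-1]        # (batch, hidden_dim)


sketch_head = nn.Sequential(
    LastStateGRU(409, 512),  # summarize the whole sequence in one 512-d vector
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 1),       # one regression output per example
)

fused = torch.zeros(4, 50, 409)  # dummy fused batch: (batch, time, features)
with torch.no_grad():
    prediction = sketch_head(fused)
print(prediction.shape)
```

The single output per example is what makes this head suitable for the regression objective used in the training cell.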

And with that, we're almost done! Now we just need to put them into one of MultiBench's training loops, and set it running:

In [None]:
# Standard supervised learning training loop
from training_structures.Supervised_Learning import train, test

# For more information regarding parameters for any system, feel free to check out the documentation
# at multibench.readthedocs.io!
train(encoders, fusion, head, traindata, validdata, 100, task="regression", optimtype=torch.optim.AdamW,
      is_packed=False, lr=1e-3, save='mosi_ef_r0.pt', weight_decay=0.01, objective=torch.nn.L1Loss())

print("Testing:")
model = torch.load('mosi_ef_r0.pt').cuda()
test(model, testdata, 'affect', is_packed=False,
     criterion=torch.nn.L1Loss(), task="posneg-classification", no_robust=True)

Epoch 0 train loss: tensor(1.3309, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 0 valid loss: 1.3881396055221558
Saving Best
Epoch 1 train loss: tensor(1.3193, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 1 valid loss: 1.385719895362854
Saving Best
Epoch 2 train loss: tensor(1.3250, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 2 valid loss: 1.3686903715133667
Saving Best
Epoch 3 train loss: tensor(1.3222, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 3 valid loss: 1.3763189315795898
Epoch 4 train loss: tensor(1.3209, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 4 valid loss: 1.3935108184814453
Epoch 5 train loss: tensor(1.3185, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 5 valid loss: 1.3834254741668701
Epoch 6 train loss: tensor(1.3160, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 6 valid loss: 1.391663670539856
Epoch 7 train loss: tensor(1.3205, device='cuda:0', grad_fn=<DivBackward0>)
Epoch 7 valid loss: 1.3926382064819336
Epoch 8 train loss: tensor(1.3181, device='cud
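
Two details of the run above are worth spelling out. The log prints "Saving Best" only when the validation loss reaches a new minimum, and the model is trained with L1 loss on continuous labels but tested with `task="posneg-classification"`. The plain-Python sketch below illustrates both ideas. The helper names are hypothetical, and the treatment of labels (sign of the score, with exact zeros optionally excluded as neutral) is an assumption rather than a description of MultiBench's internals.

```python
def best_epochs(valid_losses):
    """Return the epochs at which a new lowest validation loss is seen."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(valid_losses):
        if loss < best:
            best = loss
            saved.append(epoch)
    return saved


def l1_loss(predictions, targets):
    """Mean absolute error, the quantity torch.nn.L1Loss computes by default."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


def posneg_accuracy(predictions, targets, exclude_zero=True):
    """Fraction of examples whose predicted sign matches the label's sign."""
    pairs = [(p, t) for p, t in zip(predictions, targets)
             if t != 0 or not exclude_zero]
    return sum((p > 0) == (t > 0) for p, t in pairs) / len(pairs)


# Validation losses for epochs 0-7, copied from the log above.
logged = [1.3881396055221558, 1.385719895362854, 1.3686903715133667,
          1.3763189315795898, 1.3935108184814453, 1.3834254741668701,
          1.391663670539856, 1.3926382064819336]
print(best_epochs(logged))  # [0, 1, 2], matching the "Saving Best" lines

# Made-up predictions and labels, for illustration only.
predictions = [1.2, -0.4, 0.3, -2.0]
targets = [2.0, -1.0, -0.5, 0.0]
print(l1_loss(predictions, targets))
print(posneg_accuracy(predictions, targets))
```

Under this reading, the checkpoint loaded for testing would be the one from epoch 2 unless a later epoch beyond the visible log improves on it.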

And with that, you've taken your first step into using MultiBench! We hope you find the library useful. Feel free to open an issue on GitHub if anything about using the package is unclear.