# CoMM on MultiBench dataset

This notebook will show how to use CoMM on the [MultiBench dataset](https://github.com/pliang279/MultiBench) (see Table 1 in [our paper](https://arxiv.org/abs/2409.07402)). 

## Package installation and loading

We start by installing and loading the required packages for this notebook:

In [None]:
%pip install torch
%pip install omegaconf
%pip install hydra-core
%pip install pytorch-lightning
%pip install scikit-learn
%pip install torchvision
%pip install tensorboard
%pip install pandas
%pip install einops
%pip install matplotlib
%pip install gdown

In [7]:
import sys
sys.path.append("../")
import numpy as np
import torch
from sklearn.linear_model import LogisticRegressionCV
from dataset.multibench import MultiBenchDataModule
from pytorch_lightning import Trainer
from pl_modules.comm import CoMM
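from models.mmfusion import MMFusion
from models.mlp import MLP
from models.gru import GRU
# Also used below; these module paths are assumed from the repository layout, adjust if they differ
from models.input_adapters import FeaturesInputAdapter
from models.transformer import Transformer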
import warnings

In [2]:
torch.manual_seed(45) # for reproducibility
np.random.seed(45) 
warnings.filterwarnings("ignore", category=UserWarning) # avoids sklearn warnings

## Load the data 

MultiBench consists of 15 preprocessed datasets with predefined train/val/test splits.

In this notebook, we focus on MIMIC, a dataset that requires credentials before download (see https://mimic.mit.edu).

Don't forget to set the correct data path in **dataset/catalog.json**!
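Before building the data modules, it can help to check that the configured paths exist. The sketch below is only an illustration: `find_missing_paths` and `check_catalog` are hypothetical helpers, and the assumption that each data location is stored under a key named `path` is ours, not something the catalog format guarantees. Change `path_key` to match your file.

```python
import json
import os


def find_missing_paths(node, path_key="path", prefix=""):
    """Return (location, value) pairs for every `path_key` entry missing on disk."""
    missing = []
    if isinstance(node, dict):
        for key, value in node.items():
            location = f"{prefix}.{key}" if prefix else str(key)
            if key == path_key and isinstance(value, str):
                # Assumed convention: the value under `path_key` is a filesystem path
                if not os.path.exists(value):
                    missing.append((location, value))
            else:
                missing.extend(find_missing_paths(value, path_key, location))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            missing.extend(find_missing_paths(value, path_key, f"{prefix}[{index}]"))
    return missing


def check_catalog(catalog_file, path_key="path"):
    """Load a JSON catalog and list the entries whose data path does not exist."""
    with open(catalog_file) as handle:
        catalog = json.load(handle)
    return find_missing_paths(catalog, path_key)
```

Calling `check_catalog` on your catalog file returns an empty list when every configured path is present.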

In [5]:
# Load MIMIC 
data_module_mimic = MultiBenchDataModule("mimic", model="CoMM", 
                                        batch_size=64, num_workers=16, 
                                        modalities=["tabular", "timeseries"], 
                                        augmentations="drop+noise")

downstream_mimic = MultiBenchDataModule("mimic", model="Sup", 
                                        batch_size=64, num_workers=16, 
                                        modalities=["tabular", "timeseries"])
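The `augmentations="drop+noise"` option is used to build the augmented views for contrastive training. Its exact behavior is defined in the repository and is not reproduced here. The sketch below only illustrates what such a pair of augmentations commonly looks like on a time series, assuming "drop" zeroes out random time steps and "noise" adds Gaussian noise. `drop_and_noise`, `drop_prob`, and `noise_std` are hypothetical names.

```python
import random


def drop_and_noise(sequence, drop_prob=0.2, noise_std=0.1, rng=None):
    """Zero out random time steps, then add Gaussian noise to every value.

    `sequence` is a list of time steps, each a list of floats.
    This is an illustrative assumption, not the repository's implementation.
    """
    rng = rng or random.Random()
    augmented = []
    for step in sequence:
        if rng.random() < drop_prob:
            step = [0.0] * len(step)  # dropped time step
        augmented.append([value + rng.gauss(0.0, noise_std) for value in step])
    return augmented


# Two independent augmentations of the same sample give two views of it.
sample = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
rng = random.Random(45)
view_1 = drop_and_noise(sample, rng=rng)
view_2 = drop_and_noise(sample, rng=rng)
```

Both views keep the shape of the original sample, which is what lets the same encoder process them.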

## Evaluate CoMM on MultiBench data

In [6]:
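def classification_scoring(model, data_module, scoring="balanced_accuracy"):
    """Linear probe: fit a cross-validated logistic regression on the extracted features and score it on the test split."""
    # Extract representations and labels with the trained model
    Z_train, y_train = model.extract_features(data_module.train_dataloader())
    Z_test, y_test = model.extract_features(data_module.test_dataloader())
    linear_model = LogisticRegressionCV(Cs=5, n_jobs=10, scoring=scoring)
    linear_model.fit(Z_train.cpu().detach().numpy(), y_train.cpu().detach().numpy())
    # `score` reports the metric passed as `scoring` (balanced accuracy by default)
    return linear_model.score(Z_test.cpu().detach().numpy(), y_test.cpu().detach().numpy())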
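The default metric above is balanced accuracy, which is the mean of the per-class recalls. As a minimal, dependency-free sketch (the `balanced_accuracy` helper is ours, written only for illustration), the toy example below shows how it differs from plain accuracy when classes are imbalanced:

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of its size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Imbalanced toy labels: 8 negatives, 2 positives; the classifier always predicts 0.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                      # 0.8
print(balanced_accuracy(y_true, y_pred))   # 0.5
```

A classifier that ignores the minority class looks good under plain accuracy but scores at chance level under balanced accuracy.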

### MIMIC 

In [None]:
comm = CoMM(
    encoder=MMFusion(
        encoders=[ # Handles tabular and timeseries data  
            MLP(indim=5, hiddim=10, outdim=10, dropout=False), 
            GRU(indim=12, hiddim=512, dropout=False, batch_first=True), 
        ], 
        input_adapters=[FeaturesInputAdapter(n_features=10, dim_tokens=512), None], # Adapter for the tabular branch only; none for the timeseries branch
        embed_dim=512
    ),
    projection=CoMM._build_mlp(512, 512, 256),
    optim_kwargs=dict(lr=1e-3, weight_decay=1e-2),
    loss_kwargs=dict(temperature=0.1)
)
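To give some intuition for `temperature=0.1`, the sketch below computes a generic InfoNCE-style loss for one anchor from a list of similarity scores. It assumes `temperature` plays its usual role of dividing the similarities before the softmax. The loss that CoMM actually optimizes is defined in the repository and is not reproduced here, and `info_nce` is a hypothetical helper.

```python
import math


def info_nce(similarities, positive_index, temperature=0.1):
    """Cross-entropy of picking the positive among all candidates.

    `similarities` holds the similarity between one anchor and each candidate;
    `positive_index` marks the candidate that is the true positive.
    """
    logits = [s / temperature for s in similarities]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_norm = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_norm - logits[positive_index]


similarities = [0.9, 0.1, 0.0]  # the positive (index 0) is the most similar candidate
sharp = info_nce(similarities, positive_index=0, temperature=0.1)
smooth = info_nce(similarities, positive_index=0, temperature=1.0)
print(sharp < smooth)  # True
```

A lower temperature sharpens the softmax, so the same similarity gap between the positive and the negatives gives a smaller loss.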

In [10]:
trainer = Trainer(inference_mode=False, max_epochs=100)

Trainer will use only 1 of 2 GPUs because it is running inside an interactive / notebook environment. You may try to set `Trainer(devices=2)` but please note that multi-GPU inside interactive / notebook environments is considered experimental and unstable. Your mileage may vary.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


In [None]:
trainer.fit(comm, datamodule=data_module_mimic)

In [11]:
score = classification_scoring(comm, downstream_mimic)

IndexError: too many indices for tensor of dimension 2

In [None]:
print(f"CoMM accuracy on MIMIC={100 * score:.2f}")

CoMM accuracy on MOSI=65.06


### UR-FUNNY

In [5]:
comm = CoMM(
    encoder=MMFusion(
        encoders=[ # Handles vision and textual modalities
            Transformer(n_features=371, dim=40, max_seq_length=50, positional_encoding=False), 
            Transformer(n_features=300, dim=40, max_seq_length=50, positional_encoding=False), 
        ], 
        input_adapters=[None, None], # No adapters needed
        embed_dim=40
    ),
    projection=CoMM._build_mlp(40, 512, 256),
    optim_kwargs=dict(lr=1e-3, weight_decay=1e-2),
    loss_kwargs=dict(temperature=0.1)
)

In [6]:
trainer = Trainer(inference_mode=False, max_epochs=100)

Trainer will use only 1 of 2 GPUs because it is running inside an interactive / notebook environment. You may try to set `Trainer(devices=2)` but please note that multi-GPU inside interactive / notebook environments is considered experimental and unstable. Your mileage may vary.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


In [None]:
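# `data_module_humor` and `downstream_humor` must be created beforehand, in the same way as the MIMIC data modules above
trainer.fit(comm, datamodule=data_module_humor)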

In [None]:
score = classification_scoring(comm, downstream_humor)

In [9]:
print(f"CoMM accuracy on UR-FUNNY={100 * score:.2f}")

CoMM accuracy on UR-FUNNY=62.24


### MUsTARD

In [10]:
comm = CoMM(
    encoder=MMFusion(
        encoders=[ # Handles vision and textual modalities
            Transformer(n_features=371, dim=40, max_seq_length=50, positional_encoding=False), 
            Transformer(n_features=300, dim=40, max_seq_length=50, positional_encoding=False), 
        ], 
        input_adapters=[None, None], # No adapters needed
        embed_dim=40
    ),
    projection=CoMM._build_mlp(40, 512, 256),
    optim_kwargs=dict(lr=1e-3, weight_decay=1e-2),
    loss_kwargs=dict(temperature=0.1)
)

In [11]:
trainer = Trainer(inference_mode=False, max_epochs=100)

Trainer will use only 1 of 2 GPUs because it is running inside an interactive / notebook environment. You may try to set `Trainer(devices=2)` but please note that multi-GPU inside interactive / notebook environments is considered experimental and unstable. Your mileage may vary.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


In [None]:
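# `data_module_sarcasm` and `downstream_sarcasm` must be created beforehand, in the same way as the MIMIC data modules above
trainer.fit(comm, datamodule=data_module_sarcasm)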

In [None]:
score = classification_scoring(comm, downstream_sarcasm)

In [14]:
print(f"CoMM accuracy on MUsTARD={100 * score:.2f}")

CoMM accuracy on MUsTARD=64.91
