<a href="https://colab.research.google.com/github/akshatamadavi/data_mining/blob/main/autogluon/autogluon_multimodal_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# ───────────────────────────────────────────────────
# 🚀 Setup & Install
# ───────────────────────────────────────────────────
!pip install --upgrade pip
!pip install autogluon[all]
!nvidia-smi

# ───────────────────────────────────────────────────
# 📚 Imports
# ───────────────────────────────────────────────────
import os
import warnings
import numpy as np
import pandas as pd
warnings.filterwarnings('ignore')
np.random.seed(123)

from autogluon.multimodal import MultiModalPredictor
from autogluon.core.utils.loaders import load_zip

# ───────────────────────────────────────────────────
# 📂 Download & Prepare Dataset
# ───────────────────────────────────────────────────
download_dir = './ag_automm_tutorial'
zip_file = 'https://automl-mm-bench.s3.amazonaws.com/petfinder_for_tutorial.zip'
load_zip.unzip(zip_file, unzip_dir=download_dir)

dataset_path = os.path.join(download_dir, 'petfinder_for_tutorial')
train_data = pd.read_csv(os.path.join(dataset_path, 'train.csv'), index_col=0)
test_data  = pd.read_csv(os.path.join(dataset_path, 'test.csv'),  index_col=0)

label_col = 'AdoptionSpeed'
image_col = 'Images'

train_data[image_col] = train_data[image_col].apply(lambda ele: ele.split(';')[0])
test_data[image_col]  = test_data[image_col].apply(lambda ele: ele.split(';')[0])

def path_expander(path, base_folder):
    paths = path.split(';')
    return ';'.join([ os.path.abspath(os.path.join(base_folder, p)) for p in paths ])

train_data[image_col] = train_data[image_col].apply(lambda ele: path_expander(ele, dataset_path))
test_data[image_col]  = test_data[image_col].apply(lambda ele: path_expander(ele, dataset_path))

print("Train shape:", train_data.shape)
print("Test shape:", test_data.shape)
print("Label column:", label_col)

# ───────────────────────────────────────────────────
# 🧠 Train the multimodal model
# ───────────────────────────────────────────────────
predictor = MultiModalPredictor(label=label_col, path='./multimodal_model/')
predictor = predictor.fit(
    train_data=train_data,
    time_limit=120
)

# ───────────────────────────────────────────────────
# 📈 Evaluate
# ───────────────────────────────────────────────────
performance = predictor.evaluate(test_data)
print("Performance on test data:", performance)

# ───────────────────────────────────────────────────
# 🔮 Predict
# ───────────────────────────────────────────────────
preds = predictor.predict(test_data)
print("Sample predictions:", preds[:10])

# ───────────────────────────────────────────────────
# 💾 Save model
# ───────────────────────────────────────────────────
predictor.save()
print("Model saved at:", predictor.path)


Collecting pip
  Downloading pip-25.3-py3-none-any.whl.metadata (4.7 kB)
Downloading pip-25.3-py3-none-any.whl (1.8 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.8/1.8 MB[0m [31m22.7 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 24.1.2
    Uninstalling pip-24.1.2:
      Successfully uninstalled pip-24.1.2
Successfully installed pip-25.3
Collecting autogluon[all]
  Downloading autogluon-1.4.0-py3-none-any.whl.metadata (11 kB)
[0mCollecting autogluon.core==1.4.0 (from autogluon.core[all]==1.4.0->autogluon[all])
  Downloading autogluon.core-1.4.0-py3-none-any.whl.metadata (12 kB)
Collecting autogluon.features==1.4.0 (from autogluon[all])
  Downloading autogluon.features-1.4.0-py3-none-any.whl.metadata (11 kB)
Collecting autogluon.tabular==1.4.0 (from autogluon.tabular[all]==1.4.0->autogluon[all])
  Downloading autogluon.tabular-1.4.0-py3-none-any.whl.metadata (16 kB)
Collecti

100%|██████████| 18.8M/18.8M [00:00<00:00, 57.0MiB/s]
AutoGluon Version:  1.4.0
Python Version:     3.12.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Thu Oct  2 10:42:05 UTC 2025
CPU Count:          8
Pytorch Version:    2.7.1+cu126
CUDA Version:       12.6
GPU Count:          1
Memory Avail:       48.67 GB / 50.99 GB (95.4%)
Disk Space Avail:   190.04 GB / 235.68 GB (80.6%)
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
	2 unique label values:  [np.int64(0), np.int64(1)]
	If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])


Train shape: (600, 25)
Test shape: (100, 25)
Label column: AdoptionSpeed



AutoMM starts to create your model. ✨✨✨

To track the learning progress, you can open a terminal and launch Tensorboard:
    ```shell
    # Assume you have installed tensorboard
    tensorboard --logdir /content/multimodal_model
    ```

INFO: Seed set to 0


config.json:   0%|          | 0.00/666 [00:00<?, ?B/s]

pytorch_model.bin:   0%|          | 0.00/440M [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/440M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/48.0 [00:00<?, ?B/s]

vocab.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

model.safetensors:   0%|          | 0.00/395M [00:00<?, ?B/s]

GPU Count: 1
GPU Count to be Used: 1

INFO: Using 16bit Automatic Mixed Precision (AMP)
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
INFO: HPU available: False, using: 0 HPUs
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO: 
  | Name              | Type                | Params | Mode 
------------------------------------------------------------------
0 | model             | MultimodalFusionMLP | 207 M  | train
1 | validation_metric | BinaryAUROC         | 0      | train
2 | loss_func         | CrossEntropyLoss    | 0      | train
------------------------------------------------------------------
207 M     Trainable params
0         Non-trainable params
207 M     Total params
828.307   Total estimated model params size (MB)
1171      Modules in train mode
0         Modules in eval mode


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

INFO: Epoch 0, global step 1: 'val_roc_auc' reached 0.56167 (best 0.56167), saving model to '/content/multimodal_model/epoch=0-step=1.ckpt' as top 3


Validation: |          | 0/? [00:00<?, ?it/s]

INFO: Epoch 0, global step 4: 'val_roc_auc' reached 0.74583 (best 0.74583), saving model to '/content/multimodal_model/epoch=0-step=4.ckpt' as top 3


Validation: |          | 0/? [00:00<?, ?it/s]

INFO: Epoch 1, global step 5: 'val_roc_auc' reached 0.77167 (best 0.77167), saving model to '/content/multimodal_model/epoch=1-step=5.ckpt' as top 3


Validation: |          | 0/? [00:00<?, ?it/s]

INFO: Epoch 1, global step 8: 'val_roc_auc' reached 0.78500 (best 0.78500), saving model to '/content/multimodal_model/epoch=1-step=8.ckpt' as top 3
INFO: Time limit reached. Elapsed time is 0:02:03. Signaling Trainer to stop.


Validation: |          | 0/? [00:00<?, ?it/s]

Start to fuse 3 checkpoints via the greedy soup algorithm.
INFO: 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.


Predicting: |          | 0/? [00:00<?, ?it/s]

INFO: 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.


Predicting: |          | 0/? [00:00<?, ?it/s]

INFO: 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.


Predicting: |          | 0/? [00:00<?, ?it/s]

AutoMM has created your model. 🎉🎉🎉

To load the model, use the code below:
    ```python
    from autogluon.multimodal import MultiModalPredictor
    predictor = MultiModalPredictor.load("/content/multimodal_model")
    ```

If you are not satisfied with the model, try to increase the training time, 
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).


INFO: 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.


Predicting: |          | 0/? [00:00<?, ?it/s]

INFO: 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.


Performance on test data: {'roc_auc': np.float64(0.914)}


Predicting: |          | 0/? [00:00<?, ?it/s]

Sample predictions: 8     1
70    1
82    1
28    0
63    1
0     0
5     0
50    1
81    1
4     1
Name: AdoptionSpeed, dtype: int64


TypeError: MultiModalPredictor.save() missing 1 required positional argument: 'path'