# AutoGluon Multimodal - Quick Start

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/autogluon/autogluon/blob/master/docs/tutorials/multimodal/multimodal_prediction/multimodal-quick-start.ipynb)
[![Open In SageMaker Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/autogluon/autogluon/blob/master/docs/tutorials/multimodal/multimodal_prediction/multimodal-quick-start.ipynb)

AutoGluon's `MultiModalPredictor` is a deep learning model zoo of model zoos that can automatically build state-of-the-art deep learning models for inputs including images, text, and tabular data. Convert your data into AutoGluon's multimodal dataframe format, and `MultiModalPredictor` can predict the values of one column based on the other features.

Begin by making sure AutoGluon is installed, and then import the required modules.

In [None]:
# !python -m pip install --upgrade pip
# !python -m pip install autogluon

In [1]:
import os
import warnings

import numpy as np

warnings.filterwarnings('ignore')
np.random.seed(123)

## Example Data

For this tutorial we use a simplified and subsampled version of the [PetFinder dataset](https://www.kaggle.com/c/petfinder-adoption-prediction). The goal is to predict how quickly a pet is adopted based on its adoption profile. In this simplified version, the adoption speed is grouped into two categories: 0 (slow) and 1 (fast). We begin by downloading a zip file containing the PetFinder datasets and unzipping it into a folder under the current working directory.

In [2]:
from autogluon.core.utils.loaders import load_zip

download_dir = './ag_multimodal_tutorial'
zip_file = 'https://automl-mm-bench.s3.amazonaws.com/petfinder_for_tutorial.zip'

load_zip.unzip(zip_file, unzip_dir=download_dir)

Downloading ./ag_multimodal_tutorial/file.zip from https://automl-mm-bench.s3.amazonaws.com/petfinder_for_tutorial.zip...


100%|██████████| 18.8M/18.8M [00:04<00:00, 3.89MiB/s]
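For readers curious about what a download-and-extract helper does, here is a minimal standard-library sketch. It is not AutoGluon's implementation: the function name `download_and_unzip` and the `archive_name` parameter are hypothetical, and the only detail taken from the output above is that the archive is saved as `file.zip` inside the target directory.

```python
import os
import urllib.request
import zipfile


def download_and_unzip(url: str, unzip_dir: str, archive_name: str = "file.zip") -> list[str]:
    """Fetch a zip archive into unzip_dir and extract it there.

    Returns the names of the archive members that were extracted.
    """
    os.makedirs(unzip_dir, exist_ok=True)
    archive_path = os.path.join(unzip_dir, archive_name)
    # urlretrieve accepts http(s):// URLs as well as file:// URLs.
    urllib.request.urlretrieve(url, archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(unzip_dir)
        return archive.namelist()
```

Calling `download_and_unzip(zip_file, download_dir)` with the variables defined above should leave the same folder layout as `load_zip.unzip`, although the library helper may do additional checks that this sketch omits.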


Next, we use pandas to read the dataset's CSV files into `DataFrames`, noting that the column we are interested in learning to predict is "AdoptionSpeed".

In [3]:
import pandas as pd

dataset_path = f'{download_dir}/petfinder_for_tutorial'

train_data = pd.read_csv(f'{dataset_path}/train.csv', index_col=0)
test_data = pd.read_csv(f'{dataset_path}/test.csv', index_col=0)

label_col = 'AdoptionSpeed'

In [6]:
train_data.head()

| | Type | Name | Age | Breed1 | Breed2 | Gender | Color1 | Color2 | Color3 | MaturitySize | ... | Quantity | Fee | State | RescuerID | VideoAmt | Description | PetID | PhotoAmt | AdoptionSpeed | Images |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | Yumi Hamasaki | 4 | 292 | 265 | 2 | 1 | 5 | 7 | 2 | ... | 1 | 0 | 41326 | bcc4e1b9557a8b3aaf545ea8e6e86991 | 0 | I rescued Yumi Hamasaki at a food stall far aw... | 7d7a39d71 | 3.0 | 0 | /Users/elnath/004_deep_learning/AutoGloun-Offi... |
| 1 | 2 | Nene/ Kimie | 12 | 285 | 0 | 2 | 5 | 6 | 7 | 2 | ... | 1 | 0 | 41326 | f0450bf0efe0fa3ff9321d0b827b1237 | 0 | Has adopted by a friend with new pet name Kimie | 0e107c82f | 3.0 | 0 | /Users/elnath/004_deep_learning/AutoGloun-Offi... |
| 2 | 2 | Mattie | 12 | 266 | 0 | 2 | 1 | 7 | 0 | 2 | ... | 1 | 0 | 41401 | 9b52af6d48a4521fd01d4028eb5879a3 | 0 | I rescued Mattie with a broken leg. After surg... | 1a8fd6707 | 5.0 | 0 | /Users/elnath/004_deep_learning/AutoGloun-Offi... |
| 3 | 1 | | 1 | 189 | 307 | 2 | 1 | 2 | 0 | 2 | ... | 1 | 0 | 41401 | 88da1210e021a5cf43480b074778f3bc | 0 | She born on 30 September . I really hope the a... | bca8b44ae | 3.0 | 0 | /Users/elnath/004_deep_learning/AutoGloun-Offi... |
| 4 | 2 | Coco | 6 | 276 | 285 | 2 | 2 | 4 | 7 | 2 | ... | 1 | 100 | 41326 | 227d7b1bcfaffb5f9882bf57b5ee8fab | 0 | Calico Tame and easy going Diet RC Kitten Supp... | 2def67952 | 1.0 | 0 | /Users/elnath/004_deep_learning/AutoGloun-Offi... |


The PetFinder dataset comes with a directory of images, and some records in the data have multiple images associated with them. AutoGluon's multimodal dataframe format requires that image columns contain a string whose value is a path to a single image file. For this example, we limit the image feature column to only the first image and do some path manipulation so that every path resolves correctly against the current directory structure.

In [5]:
image_col = 'Images'

train_data[image_col] = train_data[image_col].apply(lambda ele: ele.split(';')[0])
test_data[image_col] = test_data[image_col].apply(lambda ele: ele.split(';')[0])

def path_expander(path, base_folder):
    path_l = path.split(';')
    return ';'.join([os.path.abspath(os.path.join(base_folder, path)) for path in path_l])

train_data[image_col] = train_data[image_col].apply(lambda ele: path_expander(ele, base_folder=dataset_path))
test_data[image_col] = test_data[image_col].apply(lambda ele: path_expander(ele, base_folder=dataset_path))
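To make the effect of the two steps above concrete, the following sketch applies them to a single made-up cell value. The helper name `first_image_abspath` and the sample file names are hypothetical; only the `;` separator and the dataset folder come from the tutorial code.

```python
import os


def first_image_abspath(cell: str, base_folder: str) -> str:
    """Keep the first ';'-separated path and resolve it against base_folder."""
    first = cell.split(";")[0]
    return os.path.abspath(os.path.join(base_folder, first))


# Hypothetical cell value listing two relative image paths.
cell = "images/abc-1.jpg;images/abc-2.jpg"
resolved = first_image_abspath(
    cell, base_folder="./ag_multimodal_tutorial/petfinder_for_tutorial"
)
print(resolved)  # an absolute path ending in .../petfinder_for_tutorial/images/abc-1.jpg
```

After this transformation each cell holds exactly one absolute path, which is the shape the image column needs to have.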

Each animal's adoption profile includes pictures, a text description, and various tabular features such as age, breed, name, color, and more. Let's look at a picture and description for an example row of data.

In [7]:
example_row = train_data.iloc[0]
example_image = example_row[image_col]

from IPython.display import Image, display
pil_img = Image(filename=example_image)
display(pil_img)

example_row['Description']

<IPython.core.display.Image object>

"I rescued Yumi Hamasaki at a food stall far away in Kelantan. At that time i was on my way back to KL, she was suffer from stomach problem and looking very2 sick.. I send her to vet & get the treatment + vaccinated and right now she's very2 healthy.. About yumi : - love to sleep with ppl - she will keep on meowing if she's hugry - very2 active, always seeking for people to accompany her playing - well trained (poo+pee in her own potty) - easy to bathing - I only feed her with these brands : IAMS, Kittenbites, Pro-formance Reason why i need someone to adopt Yumi: I just married and need to move to a new house where no pets are allowed :( As Yumi is very2 special to me, i will only give her to ppl that i think could take care of her just like i did (especially on her foods things).."

## Training

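Now that the data is in a suitable format, we can fit `MultiModalPredictor` on the training data. The `time_limit` argument sets the training time budget in seconds. The run shown here used a one-hour budget on a machine without CUDA; for a quicker demo, a tighter budget such as the commented-out `time_limit=120` also works. More training time generally leads to better prediction performance, but we can get surprisingly good performance in a short amount of time.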

In [16]:
from autogluon.multimodal import MultiModalPredictor

predictor = MultiModalPredictor(label=label_col).fit(
    train_data=train_data,
    # time_limit=120  # tighter budget for a quicker demo
    time_limit=60 * 60,  # seconds; the run logged below used one hour
)

No path specified. Models will be saved in: "AutogluonModels/ag-20231227_104605"
AutoGluon Version:  1.0.0
Python Version:     3.10.13
Operating System:   Darwin
Platform Machine:   x86_64
Platform Version:   Darwin Kernel Version 23.2.0: Wed Nov 15 21:54:10 PST 2023; root:xnu-10002.61.3~2/RELEASE_X86_64
CPU Count:          16
Pytorch Version:    2.0.0.post104
CUDA Version:       CUDA is not available
Memory Avail:       39.67 GB / 64.00 GB (62.0%)
Disk Space Avail:   469.97 GB / 931.55 GB (50.5%)
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
	2 unique label values:  [0, 1]
	If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])

AutoMM starts to create your model. ✨✨✨

To track the learning progress, you can open a terminal and launch Tensorboard:
    ```shell
    # Assume you have insta

Epoch 0:  50%|█████     | 30/60 [06:45<06:45, 13.52s/it]                   

INFO: Epoch 0, global step 1: 'val_roc_auc' reached 0.51250 (best 0.51250), saving model to '/Users/elnath/004_deep_learning/AutoGloun-Official/v1_0_0/docs/tutorials/multimodal/multimodal_prediction/AutogluonModels/ag-20231227_104605/epoch=0-step=1.ckpt' as top 3


Epoch 0: 100%|██████████| 60/60 [14:24<00:00, 14.41s/it]

INFO: Epoch 0, global step 4: 'val_roc_auc' reached 0.63611 (best 0.63611), saving model to '/Users/elnath/004_deep_learning/AutoGloun-Official/v1_0_0/docs/tutorials/multimodal/multimodal_prediction/AutogluonModels/ag-20231227_104605/epoch=0-step=4.ckpt' as top 3


Epoch 1:  50%|█████     | 30/60 [06:29<06:29, 12.98s/it]

INFO: Epoch 1, global step 5: 'val_roc_auc' reached 0.67306 (best 0.67306), saving model to '/Users/elnath/004_deep_learning/AutoGloun-Official/v1_0_0/docs/tutorials/multimodal/multimodal_prediction/AutogluonModels/ag-20231227_104605/epoch=1-step=5.ckpt' as top 3


Epoch 1: 100%|██████████| 60/60 [14:20<00:00, 14.35s/it]

INFO: Epoch 1, global step 8: 'val_roc_auc' reached 0.68833 (best 0.68833), saving model to '/Users/elnath/004_deep_learning/AutoGloun-Official/v1_0_0/docs/tutorials/multimodal/multimodal_prediction/AutogluonModels/ag-20231227_104605/epoch=1-step=8.ckpt' as top 3


Epoch 2:  50%|█████     | 30/60 [05:57<05:57, 11.92s/it]

INFO: Epoch 2, global step 9: 'val_roc_auc' reached 0.69778 (best 0.69778), saving model to '/Users/elnath/004_deep_learning/AutoGloun-Official/v1_0_0/docs/tutorials/multimodal/multimodal_prediction/AutogluonModels/ag-20231227_104605/epoch=2-step=9.ckpt' as top 3


Epoch 2: 100%|██████████| 60/60 [13:20<00:00, 13.34s/it]

INFO: Epoch 2, global step 12: 'val_roc_auc' reached 0.69472 (best 0.69778), saving model to '/Users/elnath/004_deep_learning/AutoGloun-Official/v1_0_0/docs/tutorials/multimodal/multimodal_prediction/AutogluonModels/ag-20231227_104605/epoch=2-step=12.ckpt' as top 3


Epoch 3:  50%|█████     | 30/60 [05:24<05:24, 10.81s/it]

INFO: Epoch 3, global step 13: 'val_roc_auc' reached 0.69972 (best 0.69972), saving model to '/Users/elnath/004_deep_learning/AutoGloun-Official/v1_0_0/docs/tutorials/multimodal/multimodal_prediction/AutogluonModels/ag-20231227_104605/epoch=3-step=13.ckpt' as top 3


Epoch 3: 100%|██████████| 60/60 [11:52<00:00, 11.88s/it]

INFO: Epoch 3, global step 16: 'val_roc_auc' reached 0.73889 (best 0.73889), saving model to '/Users/elnath/004_deep_learning/AutoGloun-Official/v1_0_0/docs/tutorials/multimodal/multimodal_prediction/AutogluonModels/ag-20231227_104605/epoch=3-step=16.ckpt' as top 3


Epoch 4:  17%|█▋        | 10/60 [01:38<08:14,  9.88s/it]

INFO: Time limit reached. Elapsed time is 1:00:04. Signaling Trainer to stop.


Epoch 4:  18%|█▊        | 11/60 [01:47<07:57,  9.74s/it]

Start to fuse 3 checkpoints via the greedy soup algorithm.


Predicting DataLoader 0: 100%|██████████| 4/4 [00:54<00:00, 13.73s/it]
Predicting DataLoader 0: 100%|██████████| 4/4 [00:54<00:00, 13.66s/it]
Predicting DataLoader 0: 100%|██████████| 4/4 [00:55<00:00, 13.78s/it]


AutoMM has created your model. 🎉🎉🎉

To load the model, use the code below:
    ```python
    from autogluon.multimodal import MultiModalPredictor
    predictor = MultiModalPredictor.load("/Users/elnath/004_deep_learning/AutoGloun-Official/v1_0_0/docs/tutorials/multimodal/multimodal_prediction/AutogluonModels/ag-20231227_104605")
    ```

If you are not satisfied with the model, try to increase the training time, 
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).
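The log above mentions fusing the top 3 checkpoints "via the greedy soup algorithm". As a rough illustration, here is a toy sketch of the commonly described greedy soup recipe: rank checkpoints by validation score, then add each one to a running average only if the averaged weights score at least as well. Weights are plain lists of floats and `toy_score` is a made-up stand-in for a validation metric; AutoGluon's actual implementation may differ in its details.

```python
from typing import Callable, Sequence

Weights = list[float]


def average(weight_sets: Sequence[Weights]) -> Weights:
    """Element-wise mean of several equally shaped weight vectors."""
    return [sum(column) / len(weight_sets) for column in zip(*weight_sets)]


def greedy_soup(checkpoints: Sequence[Weights], score: Callable[[Weights], float]) -> Weights:
    """Average checkpoints greedily, keeping one only if the soup does not get worse."""
    ranked = sorted(checkpoints, key=score, reverse=True)
    ingredients = [ranked[0]]
    best = score(ranked[0])
    for candidate in ranked[1:]:
        trial = score(average(ingredients + [candidate]))
        if trial >= best:
            ingredients.append(candidate)
            best = trial
    return average(ingredients)


def toy_score(w: Weights) -> float:
    """Made-up validation metric: higher is better, with its peak at (1.0, 2.0)."""
    return -((w[0] - 1.0) ** 2 + (w[1] - 2.0) ** 2)


# Three hypothetical checkpoints; the third is far from the optimum.
checkpoints = [[0.8, 2.0], [1.2, 2.0], [3.0, 5.0]]
soup = greedy_soup(checkpoints, toy_score)
print(soup)
```

In this toy case the first two checkpoints average to a better point than either alone, while the third is rejected because adding it would lower the score.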




Under the hood, `MultiModalPredictor` automatically infers the problem type (classification or regression), detects feature modalities, selects models from the multimodal model pools, and trains the selected models. If multiple backbones are used, `MultiModalPredictor` appends a late-fusion model (MLP or transformer) on top of them.
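To give a feel for problem-type inference, here is a deliberately simplified heuristic. The only rule taken from the training log is that two unique label values imply `'binary'`; the regression and multiclass rules, and the function name `infer_problem_type`, are assumptions made for illustration and are not AutoGluon's actual logic.

```python
from typing import Sequence


def infer_problem_type(labels: Sequence) -> str:
    """Guess the task from label values (simplified heuristic for illustration)."""
    unique = set(labels)
    if len(unique) == 2:
        # Matches the log message: only two unique label values observed.
        return "binary"
    if all(isinstance(v, float) for v in unique) and any(not v.is_integer() for v in unique):
        # Assumption: non-integer floating-point labels suggest a continuous target.
        return "regression"
    # Assumption: everything else is treated as a set of discrete classes.
    return "multiclass"


print(infer_problem_type([0, 1, 1, 0]))            # binary
print(infer_problem_type([0.5, 1.7, 3.2]))         # regression
print(infer_problem_type(["cat", "dog", "bird"]))  # multiclass
```

When a heuristic like this would guess wrong, the log's advice applies: pass `problem_type` explicitly when constructing the predictor.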

## Prediction

After fitting the model, we want to use it to predict the labels in the withheld test dataset.

In [17]:
predictions = predictor.predict(test_data.drop(columns=label_col))
predictions[:5]

Predicting DataLoader 0: 100%|██████████| 4/4 [00:39<00:00,  9.82s/it]


8     0
70    1
82    1
28    0
63    1
Name: AdoptionSpeed, dtype: int64

For classification tasks, we can just as easily get the prediction probabilities for each output class.

In [18]:
probs = predictor.predict_proba(test_data.drop(columns=label_col))
probs[:5]

Predicting DataLoader 0: 100%|██████████| 4/4 [00:38<00:00,  9.66s/it]


| | 0 | 1 |
|---|---|---|
| 8 | 0.943792 | 0.056208 |
| 70 | 0.102629 | 0.897371 |
| 82 | 0.194576 | 0.805424 |
| 28 | 0.972388 | 0.027612 |
| 63 | 0.084784 | 0.915216 |
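The probabilities and the predicted labels are two views of the same output. The sketch below copies the five probability rows shown above and converts them to labels, assuming the predicted label is the class whose probability reaches a 0.5 threshold. The helper `to_label` is hypothetical, and the stricter threshold is only an illustration of how one might trade recall for precision.

```python
# Class probabilities copied from the output above: row index -> (P(class 0), P(class 1)).
probs_by_row = {
    8: (0.943792, 0.056208),
    70: (0.102629, 0.897371),
    82: (0.194576, 0.805424),
    28: (0.972388, 0.027612),
    63: (0.084784, 0.915216),
}


def to_label(row, threshold=0.5):
    """Return 1 when P(class 1) reaches the threshold, else 0."""
    return int(row[1] >= threshold)


labels = {idx: to_label(row) for idx, row in probs_by_row.items()}
print(labels)  # {8: 0, 70: 1, 82: 1, 28: 0, 63: 1}

# A stricter, hypothetical threshold only flags the most confident "fast" adoptions.
strict = {idx: to_label(row, threshold=0.9) for idx, row in probs_by_row.items()}
print(strict)  # {8: 0, 70: 0, 82: 0, 28: 0, 63: 1}
```

With the default threshold the labels agree with the `predict` output shown earlier for these five rows.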


## Evaluation

Finally, we can evaluate the predictor on the withheld test dataset using other performance metrics, in this case [roc_auc](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html).

In [19]:
scores = predictor.evaluate(test_data, metrics=["roc_auc"])
scores

Predicting DataLoader 0: 100%|██████████| 4/4 [00:39<00:00,  9.84s/it]


{'roc_auc': 0.9231999999999999}
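As a reminder of what this number measures, ROC AUC can be read as the fraction of (positive, negative) example pairs in which the positive example receives the higher score, with ties counted as half. The sketch below computes it that way on a tiny made-up example; it is a quadratic-time illustration of the definition, not how `evaluate` computes the metric internally.

```python
def roc_auc(y_true, y_score):
    """Pairwise ROC AUC: share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores: three of the four positive/negative pairs are ordered correctly.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, y_score))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so the score reported above means most fast-adoption pets were ranked above slow-adoption ones.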

## Conclusion

In this quick start tutorial we saw the basic fit and predict functionality of AutoGluon's `MultiModalPredictor`, but we only scratched the surface of what it can do. Check out the in-depth tutorials to learn about other features of AutoGluon's `MultiModalPredictor` like embedding extraction, distillation, model fine-tuning, text or image prediction, and semantic matching.