# AutoGluon Tabular - Foundational Models

This tutorial introduces AutoGluon's support for tabular foundation models, which combine pre-training with in-context learning to achieve state-of-the-art performance on tabular datasets.

We'll explore three foundational tabular models:

1. **Mitra** - AutoGluon's new state-of-the-art tabular foundation model
2. **TabICL** - In-context learning for large tabular datasets
3. **TabPFNv2** - Prior-fitted networks for accurate predictions on small data

These models excel particularly on small to medium-sized datasets and can run in both zero-shot and fine-tuning modes.
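
For Mitra, the cells later in this tutorial choose between the two modes through the `hyperparameters` argument of `TabularPredictor.fit()`. The sketch below wraps that choice in a small helper. The function name `mitra_hyperparameters` is hypothetical; only the `'MITRA'` key and the `fine_tune` and `fine_tune_steps` options are taken from this tutorial.

```python
def mitra_hyperparameters(fine_tune=False, fine_tune_steps=None):
    """Build a hyperparameters dict for TabularPredictor.fit() (hypothetical helper)."""
    options = {"fine_tune": fine_tune}
    # Only attach a step count when fine-tuning is actually requested.
    if fine_tune and fine_tune_steps is not None:
        options["fine_tune_steps"] = fine_tune_steps
    return {"MITRA": options}


zero_shot = mitra_hyperparameters()
fine_tuned = mitra_hyperparameters(fine_tune=True, fine_tune_steps=10)
print(zero_shot)
print(fine_tuned)
```

Either dict can then be passed as `hyperparameters=...` in the same way as the literal dicts used in the Mitra cells.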

## Installation

First, let's install AutoGluon with support for foundational models:

In [1]:
# Individual model installations:
!pip install uv
!uv pip install "autogluon.tabular[mitra]"   # For Mitra
!uv pip install "autogluon.tabular[tabicl]"   # For TabICL
!uv pip install "autogluon.tabular[tabpfn]"   # For TabPFNv2


Collecting uv
  Downloading uv-0.8.15-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Downloading uv-0.8.15-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (21.0 MB)
   21.0/21.0 MB 50.2 MB/s eta 0:00:00
Installing collected packages: uv
Successfully installed uv-0.8.15
Using Python 3.12.11 environment at: /usr
Resolved 71 packages in 655ms
Prepared 15 packages in 32.26s
Uninstalled 5 packages in 647ms
Installed 15 packages in 283ms
 + autogluon-common==1.4.0
 + autogluon-core==1.4.0
 + autogluon-features==1.4.0
 + autogluon-tabular==1.4.0
 + boto3==1.40.26
 + botocore==1.40.26
 + einx

In [2]:
import pandas as pd
from autogluon.tabular import TabularDataset, TabularPredictor
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_wine, fetch_california_housing

## Example Data

For this tutorial, we'll demonstrate the foundational models on two different datasets to showcase their versatility:

1. **Wine Dataset** (Multi-class Classification) - Small dataset (178 rows) for comparing model performance
2. **California Housing** (Regression) - Larger dataset (20,640 rows) for the regression example

Let's load and prepare these datasets:

In [3]:
# Load datasets

# 1. Wine (Multi-class Classification)
wine_data = load_wine()
wine_df = pd.DataFrame(wine_data.data, columns=wine_data.feature_names)
wine_df['target'] = wine_data.target

# 2. California Housing (Regression)
housing_data = fetch_california_housing()
housing_df = pd.DataFrame(housing_data.data, columns=housing_data.feature_names)
housing_df['target'] = housing_data.target

print("Dataset shapes:")
print(f"Wine: {wine_df.shape}")
print(f"California Housing: {housing_df.shape}")

Dataset shapes:
Wine: (178, 14)
California Housing: (20640, 9)


## Create Train/Test Splits

Let's create train/test splits for our datasets:

In [4]:
# Create train/test splits (80/20)
wine_train, wine_test = train_test_split(wine_df, test_size=0.2, random_state=42, stratify=wine_df['target'])
housing_train, housing_test = train_test_split(housing_df, test_size=0.2, random_state=42)

print("Training set sizes:")
print(f"Wine: {len(wine_train)} samples")
print(f"Housing: {len(housing_train)} samples")

# Convert to TabularDataset
wine_train_data = TabularDataset(wine_train)
wine_test_data = TabularDataset(wine_test)
housing_train_data = TabularDataset(housing_train)
housing_test_data = TabularDataset(housing_test)

Training set sizes:
Wine: 142 samples
Housing: 16512 samples
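
The training sizes printed above can be reproduced with a little arithmetic. The sketch below assumes the test share is rounded up to a whole number of rows and the remainder goes to training, which is consistent with the sizes reported here. The helper name `split_sizes` is hypothetical.

```python
import math


def split_sizes(n_rows, test_size):
    """Return (n_train, n_test), assuming the test count is rounded up."""
    n_test = math.ceil(test_size * n_rows)
    return n_rows - n_test, n_test


wine_sizes = split_sizes(178, 0.2)
housing_sizes = split_sizes(20640, 0.2)
print(f"Wine: train={wine_sizes[0]}, test={wine_sizes[1]}")
print(f"Housing: train={housing_sizes[0]}, test={housing_sizes[1]}")
```

Under that assumption the Wine test set holds 36 rows, which matters when reading the test scores later: a single misclassified row moves accuracy by almost three percentage points.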


## 1. Mitra: AutoGluon's Tabular Foundation Model

[Mitra](https://huggingface.co/autogluon/mitra-classifier) is a new state-of-the-art tabular foundation model developed by the AutoGluon team. It is natively supported in AutoGluon and can be trained in just three lines of code via `predictor.fit()`.

### Using Mitra for Classification

In [5]:
# Create predictor with Mitra
print("Training Mitra classifier on classification dataset...")
mitra_predictor = TabularPredictor(label='target')
mitra_predictor.fit(
    wine_train_data,
    hyperparameters={
        'MITRA': {'fine_tune': False}
    },
)

print("\nMitra training completed!")

No path specified. Models will be saved in: "AutogluonModels/ag-20250909_143626"
Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.4.0
Python Version:     3.12.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Sun Mar 30 16:01:29 UTC 2025
CPU Count:          2
Memory Avail:       11.51 GB / 12.67 GB (90.8%)
Disk Space Avail:   65.08 GB / 107.72 GB (60.4%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='extreme' : New in v1.4: Massively better than 'best' on datasets <30000 samples by using new models meta-learned on https://tabarena.ai: TabPFNv2, TabICL, Mitra, and TabM. Absolute best accuracy. Requires a GPU. Recommended 64 GB CPU memory and 32+ GB GPU memory.
	presets='best'    : Maximize accuracy. Recomm

Training Mitra classifier on classification dataset...


Beginning AutoGluon training ...
AutoGluon will save models to "/content/AutogluonModels/ag-20250909_143626"
Train Data Rows:    142
Train Data Columns: 13
Label Column:       target
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	3 unique label values:  [np.int64(0), np.int64(2), np.int64(1)]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type:       multiclass
Preprocessing data ...
Train Data Class Count: 3
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    11757.64 MB
	Train Data (Original)  Memory Usage: 0.01 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specif

config.json:   0%|          | 0.00/86.0 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/303M [00:00<?, ?B/s]

	1.0	 = Validation score   (accuracy)
	66.83s	 = Training   runtime
	39.98s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'Mitra': 1.0}
	1.0	 = Validation score   (accuracy)
	0.0s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 113.11s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 0.7 rows/s (29 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/content/AutogluonModels/ag-20250909_143626")



Mitra training completed!


### Evaluating Mitra Performance

In [6]:
# Make predictions
mitra_predictions = mitra_predictor.predict(wine_test_data)
print("Sample Mitra predictions:")
print(mitra_predictions.head(10))

# Show prediction probabilities for the first few samples
# (stored under a separate name so the class predictions above are not overwritten)
mitra_proba = mitra_predictor.predict_proba(wine_test_data)
print(mitra_proba.head())

# Show model leaderboard
print("\nMitra Model Leaderboard:")
mitra_predictor.leaderboard(wine_test_data)




Sample Mitra predictions:
10     0
134    2
28     0
121    0
62     1
51     0
7      0
66     1
129    1
166    2
Name: target, dtype: int64




            0         1         2
10   0.997044  0.002800  0.000156
134  0.001029  0.106581  0.892390
28   0.962575  0.037323  0.000102
121  0.496672  0.496672  0.006655
62   0.089949  0.908454  0.001597

Mitra Model Leaderboard:




|   | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Mitra | 0.972222 | 1.0 | accuracy | 40.591352 | 39.979676 | 66.827397 | 40.591352 | 39.979676 | 66.827397 | 1 | True | 1 |
| 1 | WeightedEnsemble_L2 | 0.972222 | 1.0 | accuracy | 40.594836 | 39.980493 | 66.831488 | 0.003485 | 0.000817 | 0.004091 | 2 | True | 2 |
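
To put the test score in context, it helps to convert it back into a row count. The sketch below assumes the Wine test split holds 36 rows (the 178 rows in the dataset minus the 142 training rows) and uses the `score_test` value from the leaderboard above.

```python
n_test = 178 - 142          # rows held out for testing
score_test = 0.972222       # accuracy reported in the leaderboard

# Recover how many test rows were classified correctly.
n_correct = round(score_test * n_test)
n_wrong = n_test - n_correct

print(f"{n_correct} of {n_test} test rows correct, {n_wrong} misclassified")
```

With so few test rows, differences between models on this dataset amount to one or two samples, so the scores here are illustrative rather than a benchmark.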


### Fine-tuning with Mitra

In [7]:
mitra_predictor_ft = TabularPredictor(label='target')
mitra_predictor_ft.fit(
    wine_train_data,
    hyperparameters={
        'MITRA': {'fine_tune': True, 'fine_tune_steps': 10}
    },
    time_limit=120,  # 2 minutes
)

print("\nMitra fine-tuning completed!")

No path specified. Models will be saved in: "AutogluonModels/ag-20250909_144024"
Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.4.0
Python Version:     3.12.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Sun Mar 30 16:01:29 UTC 2025
CPU Count:          2
Memory Avail:       10.68 GB / 12.67 GB (84.2%)
Disk Space Avail:   64.19 GB / 107.72 GB (59.6%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='extreme' : New in v1.4: Massively better than 'best' on datasets <30000 samples by using new models meta-learned on https://tabarena.ai: TabPFNv2, TabICL, Mitra, and TabM. Absolute best accuracy. Requires a GPU. Recommended 64 GB CPU memory and 32+ GB GPU memory.
	presets='best'    : Maximize accuracy. Recomm


Mitra fine-tuning completed!


### Evaluating Fine-tuned Mitra Performance

In [8]:

# Show model leaderboard
print("\nMitra Model Leaderboard:")
mitra_predictor_ft.leaderboard(wine_test_data)





Mitra Model Leaderboard:


|   | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Mitra | 0.972222 | 1.0 | accuracy | 40.801136 | 40.068762 | 484.405354 | 40.801136 | 40.068762 | 484.405354 | 1 | True | 1 |
| 1 | WeightedEnsemble_L2 | 0.972222 | 1.0 | accuracy | 40.847684 | 40.069679 | 484.410124 | 0.046548 | 0.000917 | 0.00477 | 2 | True | 2 |
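
The fine-tuned run reaches the same test accuracy as the zero-shot run on this dataset, so the main visible difference is training cost. The sketch below compares the two `fit_time` values from the Mitra leaderboards in this tutorial. These timings come from a single run on a 2-CPU machine and will differ on other hardware.

```python
zero_shot_fit_s = 66.827397     # fit_time with fine_tune=False
fine_tuned_fit_s = 484.405354   # fit_time with fine_tune=True, fine_tune_steps=10

slowdown = fine_tuned_fit_s / zero_shot_fit_s
extra_minutes = (fine_tuned_fit_s - zero_shot_fit_s) / 60

print(f"Fine-tuning took {slowdown:.1f}x as long ({extra_minutes:.1f} extra minutes)")
```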


### Using Mitra for Regression

In [9]:

# Create predictor with Mitra for regression
print("Training Mitra regressor on California Housing dataset...")
mitra_reg_predictor = TabularPredictor(
    label='target',
    path='./mitra_regressor_model',
    problem_type='regression'
)
mitra_reg_predictor.fit(
    housing_train_data.sample(1000), # sample 1000 rows
    hyperparameters={
        'MITRA': {'fine_tune': False}
    },
)

# Evaluate regression performance
mitra_reg_predictor.leaderboard(housing_test_data)


Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.4.0
Python Version:     3.12.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Sun Mar 30 16:01:29 UTC 2025
CPU Count:          2
Memory Avail:       10.19 GB / 12.67 GB (80.4%)
Disk Space Avail:   63.91 GB / 107.72 GB (59.3%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='extreme' : New in v1.4: Massively better than 'best' on datasets <30000 samples by using new models meta-learned on https://tabarena.ai: TabPFNv2, TabICL, Mitra, and TabM. Absolute best accuracy. Requires a GPU. Recommended 64 GB CPU memory and 32+ GB GPU memory.
	presets='best'    : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'    : St

Training Mitra regressor on California Housing dataset...


Beginning AutoGluon training ...
AutoGluon will save models to "/content/mitra_regressor_model"
Train Data Rows:    1000
Train Data Columns: 8
Label Column:       target
Problem Type:       regression
Preprocessing data ...
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    10431.74 MB
	Train Data (Original)  Memory Usage: 0.06 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('float', []) : 8 | ['MedInc', 'HouseAge', 'AveRooms', '

config.json:   0%|          | 0.00/81.0 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/303M [00:00<?, ?B/s]

	-0.5738	 = Validation score   (-root_mean_squared_error)
	217.99s	 = Training   runtime
	214.65s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'Mitra': 1.0}
	-0.5738	 = Validation score   (-root_mean_squared_error)
	0.0s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 439.48s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 0.9 rows/s (200 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/content/mitra_regressor_model")


|   | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Mitra | -0.5418 | -0.573832 | root_mean_squared_error | 1772.328377 | 214.645845 | 217.98661 | 1772.328377 | 214.645845 | 217.98661 | 1 | True | 1 |
| 1 | WeightedEnsemble_L2 | -0.5418 | -0.573832 | root_mean_squared_error | 1772.337462 | 214.646283 | 217.990494 | 0.009085 | 0.000437 | 0.003884 | 2 | True | 2 |
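
The regression scores are negative because AutoGluon reports metrics in higher-is-better form; the training log labels the score as `-root_mean_squared_error`. The sketch below converts the leaderboard value back to a plain RMSE and then to dollars, relying on the fact that the California Housing target is expressed in units of $100,000. It also derives the test-set prediction speed from the numbers above.

```python
score_test = -0.5418                 # leaderboard value (sign flipped RMSE)
rmse = -score_test                   # plain RMSE in target units
rmse_dollars = rmse * 100_000        # target unit is $100,000

n_test_rows = 20640 - 16512          # rows in the housing test split
pred_time_test_s = 1772.328377       # pred_time_test from the leaderboard
rows_per_second = n_test_rows / pred_time_test_s

print(f"RMSE: {rmse:.4f} (about ${rmse_dollars:,.0f})")
print(f"Test inference: {rows_per_second:.2f} rows/s over {n_test_rows} rows")
```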


## 2. TabICL: In-Context Learning for Tabular Data

**TabICL** ("**Tab**ular **I**n-**C**ontext **L**earning") is a foundational model designed specifically for in-context learning on large tabular datasets.

**Paper**: ["TabICL: A Tabular Foundation Model for In-Context Learning on Large Data"](https://arxiv.org/abs/2502.05564)  
**Authors**: Jingang Qu, David Holzmüller, Gaël Varoquaux, Marine Le Morvan  
**GitHub**: https://github.com/soda-inria/tabicl

TabICL uses a transformer architecture with in-context learning: the labeled training rows are supplied as context at prediction time, so the model can be applied to a new dataset without gradient-based training on it.

In [10]:
# Train TabICL on dataset
print("Training TabICL on wine dataset...")
tabicl_predictor = TabularPredictor(
    label='target',
    path='./tabicl_model'
)
tabicl_predictor.fit(
    wine_train_data,
    hyperparameters={
        'TABICL': {},
    },
)

# Show prediction probabilities for first few samples
tabicl_predictions = tabicl_predictor.predict_proba(wine_test_data)
print(tabicl_predictions.head())

# Show TabICL leaderboard
print("\nTabICL Model Details:")
tabicl_predictor.leaderboard(wine_test_data)

Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.4.0
Python Version:     3.12.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Sun Mar 30 16:01:29 UTC 2025
CPU Count:          2
Memory Avail:       10.24 GB / 12.67 GB (80.8%)
Disk Space Avail:   63.07 GB / 107.72 GB (58.5%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='extreme' : New in v1.4: Massively better than 'best' on datasets <30000 samples by using new models meta-learned on https://tabarena.ai: TabPFNv2, TabICL, Mitra, and TabM. Absolute best accuracy. Requires a GPU. Recommended 64 GB CPU memory and 32+ GB GPU memory.
	presets='best'    : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'    : St

Training TabICL on wine dataset...


Beginning AutoGluon training ...
AutoGluon will save models to "/content/tabicl_model"
Train Data Rows:    142
Train Data Columns: 13
Label Column:       target
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	3 unique label values:  [np.int64(0), np.int64(2), np.int64(1)]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type:       multiclass
Preprocessing data ...
Train Data Class Count: 3
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    10482.18 MB
	Train Data (Original)  Memory Usage: 0.01 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of th

INFO: You are downloading 'tabicl-classifier-v1.1-0506.ckpt', the latest best-performing version of TabICL.
To reproduce results from the original paper, please use 'tabicl-classifier-v1-0208.ckpt'.

Checkpoint 'tabicl-classifier-v1.1-0506.ckpt' not cached.
 Downloading from Hugging Face Hub (jingang/TabICL-clf).



tabicl-classifier-v1.1-0506.ckpt:   0%|          | 0.00/108M [00:00<?, ?B/s]

	1.0	 = Validation score   (accuracy)
	1.44s	 = Training   runtime
	7.46s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'TabICL': 1.0}
	1.0	 = Validation score   (accuracy)
	0.01s	 = Training   runtime
	0.01s	 = Validation runtime
AutoGluon training complete, total runtime = 9.37s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 3.9 rows/s (29 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/content/tabicl_model")


            0         1         2
10   0.999189  0.000718  0.000093
134  0.002011  0.255421  0.742569
28   0.992100  0.007716  0.000184
121  0.585101  0.405147  0.009752
62   0.009680  0.985753  0.004567

TabICL Model Details:


|   | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | TabICL | 0.972222 | 1.0 | accuracy | 7.926368 | 7.456331 | 1.435242 | 7.926368 | 7.456331 | 1.435242 | 1 | True | 1 |
| 1 | WeightedEnsemble_L2 | 0.972222 | 1.0 | accuracy | 7.929867 | 7.462854 | 1.440316 | 0.003499 | 0.006524 | 0.005075 | 2 | True | 2 |
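
The "Estimated inference throughput" line in the training logs can be cross-checked against the leaderboard. In the runs shown so far, dividing the reported batch size by `pred_time_val` reproduces the logged figure. This is an observation about the numbers in this tutorial, not a documented description of how AutoGluon computes the estimate.

```python
def rows_per_second(batch_rows, seconds):
    """Throughput implied by scoring `batch_rows` rows in `seconds`."""
    return batch_rows / seconds


# (batch size from the log, pred_time_val from the leaderboard, logged rows/s)
runs = {
    "Mitra (wine)": (29, 39.979676, 0.7),
    "Mitra (housing)": (200, 214.645845, 0.9),
    "TabICL (wine)": (29, 7.456331, 3.9),
}

implied = {name: round(rows_per_second(n, t), 1) for name, (n, t, _) in runs.items()}
for name, (_, _, logged) in runs.items():
    print(f"{name}: implied {implied[name]} rows/s, logged {logged} rows/s")
```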


## 3. TabPFNv2: Prior-Fitted Networks

**TabPFNv2** ("**Tab**ular **P**rior-**F**itted **N**etworks **v2**") is designed for accurate predictions on small tabular datasets by using prior-fitted network architectures.

**Paper**: ["Accurate predictions on small data with a tabular foundation model"](https://www.nature.com/articles/s41586-024-08328-6)  
**Authors**: Noah Hollmann, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister & Frank Hutter  
**GitHub**: https://github.com/PriorLabs/TabPFN

TabPFNv2 excels on small datasets (< 10,000 samples) by leveraging prior knowledge that the network acquires during pre-training.

In [11]:
# Train TabPFNv2 on Wine dataset (perfect size for TabPFNv2)
print("Training TabPFNv2 on Wine dataset...")
tabpfnv2_predictor = TabularPredictor(
    label='target',
    path='./tabpfnv2_model'
)
tabpfnv2_predictor.fit(
    wine_train_data,
    hyperparameters={
        'TABPFNV2': {
            # TabPFNv2 works best with default parameters on small datasets
        },
    },
)

# Show prediction probabilities for first few samples
tabpfnv2_predictions = tabpfnv2_predictor.predict_proba(wine_test_data)
print(tabpfnv2_predictions.head())


tabpfnv2_predictor.leaderboard(wine_test_data)

Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.4.0
Python Version:     3.12.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Sun Mar 30 16:01:29 UTC 2025
CPU Count:          2
Memory Avail:       10.20 GB / 12.67 GB (80.5%)
Disk Space Avail:   62.86 GB / 107.72 GB (58.4%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='extreme' : New in v1.4: Massively better than 'best' on datasets <30000 samples by using new models meta-learned on https://tabarena.ai: TabPFNv2, TabICL, Mitra, and TabM. Absolute best accuracy. Requires a GPU. Recommended 64 GB CPU memory and 32+ GB GPU memory.
	presets='best'    : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'    : St

Training TabPFNv2 on Wine dataset...


Beginning AutoGluon training ...
AutoGluon will save models to "/content/tabpfnv2_model"
Train Data Rows:    142
Train Data Columns: 13
Label Column:       target
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	3 unique label values:  [np.int64(0), np.int64(2), np.int64(1)]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type:       multiclass
Preprocessing data ...
Train Data Class Count: 3
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    10441.14 MB
	Train Data (Original)  Memory Usage: 0.01 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of 

tabpfn-v2-classifier-finetuned-zk73skhh.(…):   0%|          | 0.00/29.0M [00:00<?, ?B/s]

config.json:   0%|          | 0.00/37.0 [00:00<?, ?B/s]

	1.0	 = Validation score   (accuracy)
	2.91s	 = Training   runtime
	8.4s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'TabPFNv2': 1.0}
	1.0	 = Validation score   (accuracy)
	0.0s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 11.5s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 3.5 rows/s (29 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/content/tabpfnv2_model")


            0         1         2
10   0.999961  0.000035  0.000004
134  0.000083  0.060002  0.939916
28   0.999560  0.000438  0.000002
121  0.145505  0.830807  0.023688
62   0.023755  0.975990  0.000255


|   | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | TabPFNv2 | 0.972222 | 1.0 | accuracy | 8.440603 | 8.396841 | 2.909948 | 8.440603 | 8.396841 | 2.909948 | 1 | True | 1 |
| 1 | WeightedEnsemble_L2 | 0.972222 | 1.0 | accuracy | 8.443532 | 8.397723 | 2.914865 | 0.002929 | 0.000883 | 0.004917 | 2 | True | 2 |
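
Although all three models report the same test accuracy, their probability tables differ for individual rows. The sketch below takes test sample 121 from the three `predict_proba` outputs printed in this tutorial and computes each model's top class and its margin over the runner-up. The helper name `top_class_and_margin` is hypothetical, and the printed probabilities are rounded to six decimals, so an apparent tie may not be exact at full precision.

```python
# Probabilities for test sample 121, copied from the printed outputs.
proba_row_121 = {
    "Mitra": [0.496672, 0.496672, 0.006655],
    "TabICL": [0.585101, 0.405147, 0.009752],
    "TabPFNv2": [0.145505, 0.830807, 0.023688],
}


def top_class_and_margin(probs):
    """Return (most likely class, gap to the second most likely class)."""
    # Stable sort: on ties the lower class index stays first.
    ranked = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    return best, probs[best] - probs[runner_up]


results = {name: top_class_and_margin(p) for name, p in proba_row_121.items()}
for name, (cls, margin) in results.items():
    print(f"{name}: class {cls}, margin {margin:.6f}")
```

A small margin flags a row where the model is uncertain. Inspecting such rows is often more informative than the aggregate accuracy on a test set this small.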


## Advanced Usage: Combining Multiple Foundational Models

AutoGluon allows you to combine multiple foundational models in a single predictor for enhanced performance through model stacking and ensembling:

In [12]:
# Configure multiple foundational models together
multi_foundation_config = {
    'MITRA': {
        'fine_tune': True,
        'fine_tune_steps': 10
    },
    'TABPFNV2': {},
    'TABICL': {},
}

print("Training ensemble of foundational models...")
ensemble_predictor = TabularPredictor(
    label='target',
    path='./ensemble_foundation_model'
).fit(
    wine_train_data,
    hyperparameters=multi_foundation_config,
    time_limit=300,  # More time for multiple models
)

# Evaluate ensemble performance
ensemble_predictor.leaderboard(wine_test_data)


Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.4.0
Python Version:     3.12.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Sun Mar 30 16:01:29 UTC 2025
CPU Count:          2
Memory Avail:       10.13 GB / 12.67 GB (80.0%)
Disk Space Avail:   62.78 GB / 107.72 GB (58.3%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='extreme' : New in v1.4: Massively better than 'best' on datasets <30000 samples by using new models meta-learned on https://tabarena.ai: TabPFNv2, TabICL, Mitra, and TabM. Absolute best accuracy. Requires a GPU. Recommended 64 GB CPU memory and 32+ GB GPU memory.
	presets='best'    : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'    : St

Training ensemble of foundational models...


Beginning AutoGluon training ... Time limit = 300s
AutoGluon will save models to "/content/ensemble_foundation_model"
Train Data Rows:    142
Train Data Columns: 13
Label Column:       target
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	3 unique label values:  [np.int64(0), np.int64(2), np.int64(1)]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type:       multiclass
Preprocessing data ...
Train Data Class Count: 3
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    10375.98 MB
	Train Data (Original)  Memory Usage: 0.01 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manual

|   | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | TabICL | 0.972222 | 1.0 | accuracy | 7.99522 | 6.73331 | 0.448481 | 7.99522 | 6.73331 | 0.448481 | 1 | True | 2 |
| 1 | TabPFNv2 | 0.972222 | 1.0 | accuracy | 8.15953 | 8.309897 | 0.175188 | 8.15953 | 8.309897 | 0.175188 | 1 | True | 1 |
| 2 | WeightedEnsemble_L2 | 0.972222 | 1.0 | accuracy | 8.163413 | 8.311131 | 0.27502 | 0.003882 | 0.001234 | 0.099832 | 2 | True | 4 |
| 3 | Mitra | 0.972222 | 1.0 | accuracy | 40.900537 | 39.999872 | 485.800996 | 40.900537 | 39.999872 | 485.800996 | 1 | True | 3 |
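
When every model ties on test score, as in the leaderboard above, inference time is a reasonable tie-breaker. The sketch below copies three columns from that leaderboard into plain dictionaries and ranks the models by score first and prediction time second. It is an illustration of one possible selection rule, not part of the AutoGluon API.

```python
leaderboard = [
    {"model": "TabICL", "score_test": 0.972222, "pred_time_test": 7.99522},
    {"model": "TabPFNv2", "score_test": 0.972222, "pred_time_test": 8.15953},
    {"model": "WeightedEnsemble_L2", "score_test": 0.972222, "pred_time_test": 8.163413},
    {"model": "Mitra", "score_test": 0.972222, "pred_time_test": 40.900537},
]

# Higher score first; among equal scores, faster prediction first.
ranked = sorted(leaderboard, key=lambda row: (-row["score_test"], row["pred_time_test"]))
fastest_best = ranked[0]["model"]

for row in ranked:
    print(f'{row["model"]}: score={row["score_test"]}, pred_time_test={row["pred_time_test"]:.2f}s')
```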
