# TFM Tutorial Notebook 2: Tabular Foundation Models

In this tutorial, we introduce a few cutting-edge foundational tabular models that leverage pretraining and in-context learning to achieve state-of-the-art performance on tabular datasets. These models represent a significant advancement in automated machine learning for structured data.

If needed, install AutoGluon with the following commands:

``!python -m pip install --upgrade pip``

``!python -m pip install autogluon``

Specifically, we will explore three foundational tabular models:

1. **Mitra** - AutoGluon's new state-of-the-art tabular foundation model
2. **TabPFN v2** - Prior-fitted networks for accurate predictions on small data
3. **TabICL** - In-context learning for large tabular datasets

*Note: The lecture introduces TabPFN. The Prior Labs team has since released TabPFN v2, so we use TabPFN v2 here.*

AutoGluon's own foundation models are **Mitra** for tabular data and **Chronos** for time series (introduced in the next module). To use TabPFN v2 and TabICL, you need to install them separately through AutoGluon as follows:

``!pip install uv``

``!uv pip install autogluon.tabular[tabicl]``

``!uv pip install autogluon.tabular[tabpfn]``
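Before running the later cells, it can help to check which optional backends are importable. The sketch below uses only the standard library. The import names `tabicl` and `tabpfn` are assumptions based on the PyPI package names and may differ from what the AutoGluon extras actually install; `is_importable` is a hypothetical helper.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the top-level module can be located without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for malformed names or broken parent packages.
        return False


# Assumed import names; adjust them if your installation differs.
optional_backends = ["autogluon", "tabicl", "tabpfn"]
status = {name: is_importable(name) for name in optional_backends}
for name, found in status.items():
    print(f"{name}: {'available' if found else 'not found'}")
```

If a backend is reported as not found, rerun the matching install command above and restart the kernel.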

In [1]:
import pandas as pd
from autogluon.tabular import TabularDataset, TabularPredictor
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_wine, fetch_california_housing



## Example Data

For this tutorial, we demonstrate the foundation models on two datasets to showcase their versatility:

1. Wine Dataset (Multi-class Classification) - Small dataset (178 rows) for comparing model performance
2. California Housing (Regression) - Larger dataset (20,640 rows) for regression

In [2]:
# Load datasets

# 1. Wine (Multi-class Classification)
wine_data = load_wine()
wine_df = pd.DataFrame(wine_data.data, columns=wine_data.feature_names)
wine_df['target'] = wine_data.target

# 2. California Housing (Regression)
housing_data = fetch_california_housing()
housing_df = pd.DataFrame(housing_data.data, columns=housing_data.feature_names)
housing_df['target'] = housing_data.target

print("Dataset shapes:")
print(f"Wine: {wine_df.shape}")
print(f"California Housing: {housing_df.shape}")

Dataset shapes:
Wine: (178, 14)
California Housing: (20640, 9)


## Create Train/Test Splits

In [3]:
# Create train/test splits (80/20)
wine_train, wine_test = train_test_split(wine_df, test_size=0.2, random_state=42, stratify=wine_df['target'])
housing_train, housing_test = train_test_split(housing_df, test_size=0.2, random_state=42)

print("Training set sizes:")
print(f"Wine: {len(wine_train)} samples")
print(f"Housing: {len(housing_train)} samples")

# Convert to TabularDataset
wine_train_data = TabularDataset(wine_train)
wine_test_data = TabularDataset(wine_test)
housing_train_data = TabularDataset(housing_train)
housing_test_data = TabularDataset(housing_test)

Training set sizes:
Wine: 142 samples
Housing: 16512 samples
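The `stratify` argument keeps the class proportions roughly equal in the train and test parts, which matters for a dataset as small as Wine. The following is a simplified standard-library illustration of that idea, not scikit-learn's implementation; `stratified_split` is a hypothetical helper and its exact per-class allocation may differ from `train_test_split`.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_size=0.2, seed=42):
    """Split row indices so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the test part.
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# The Wine dataset has 59, 71, and 48 rows in its three classes.
labels = [0] * 59 + [1] * 71 + [2] * 48
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))
print(Counter(labels[i] for i in test_idx))
```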


## 1. Mitra: AutoGluon's Tabular Foundation Model

You can fit the model on your data with three lines of code. Built on the in-context learning paradigm and pretrained exclusively on synthetic data, Mitra introduces a principled pretraining approach by carefully selecting and mixing diverse synthetic priors to promote robust generalization across a wide range of real-world tabular datasets.

**Mitra achieves state-of-the-art performance** on major benchmarks including TabRepo, TabZilla, AMLB, and TabArena, especially excelling on small tabular datasets with fewer than 5,000 samples and 100 features, for both classification and regression tasks.

**Mitra supports both zero-shot and fine-tuning** modes and runs seamlessly on both GPU and CPU. Its weights are fully open-sourced under the Apache-2.0 license, making it a privacy-conscious and production-ready solution for enterprises concerned about data sharing and hosting.

**Make sure you have enough compute (CPU or GPU) to fit the model. With `fine_tune=False`, fitting performs no gradient updates; predictions come from in-context learning alone.**
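In-context learning means the labeled training rows are handed to the model as context at prediction time and no weights are updated. The toy below illustrates only that interface, using a distance-weighted vote as a stand-in for the pretrained transformer. It is not Mitra's architecture, and `predict_in_context` is a hypothetical name.

```python
import math
from collections import defaultdict


def predict_in_context(context_X, context_y, query, temperature=1.0):
    """Predict a label for `query` by weighting labeled context rows.

    No parameters are learned: a fixed rule is applied to whatever
    context it is handed.
    """
    scores = defaultdict(float)
    for row, label in zip(context_X, context_y):
        distance = math.dist(row, query)
        # Closer context rows receive exponentially larger weight.
        scores[label] += math.exp(-distance / temperature)
    total = sum(scores.values())
    probabilities = {label: score / total for label, score in scores.items()}
    prediction = max(probabilities, key=probabilities.get)
    return prediction, probabilities


# Made-up context rows with two features and two classes.
context_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
context_y = ["a", "a", "b", "b"]

label, proba = predict_in_context(context_X, context_y, query=(0.2, 0.1))
print(label, proba)
```

Swapping in a different context changes the predictions without any retraining, which is the property the zero-shot mode relies on.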

In [4]:
# Create predictor with Mitra
print("Training Mitra classifier on classification dataset...")
mitra_predictor = TabularPredictor(label='target')
mitra_predictor.fit(
    wine_train_data,
    hyperparameters={
        'MITRA': {'fine_tune': False}
    },
)

print("\nMitra training completed!")

No path specified. Models will be saved in: "AutogluonModels\ag-20251024_162506"
Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.4.0
Python Version:     3.11.13
Operating System:   Windows
Platform Machine:   AMD64
Platform Version:   10.0.19045
CPU Count:          8
Memory Avail:       2.81 GB / 15.93 GB (17.6%)
Disk Space Avail:   230.58 GB / 476.33 GB (48.4%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='extreme' : New in v1.4: Massively better than 'best' on datasets <30000 samples by using new models meta-learned on https://tabarena.ai: TabPFNv2, TabICL, Mitra, and TabM. Absolute best accuracy. Requires a GPU. Recommended 64 GB CPU memory and 32+ GB GPU memory.
	presets='best'    : Maximize accuracy. Recommended for most users. Use in competition

Training Mitra classifier on classification dataset...


Beginning AutoGluon training ...
AutoGluon will save models to "C:\Users\zhjiang\Documents\tabfm_tutorial\AutogluonModels\ag-20251024_162506"
Train Data Rows:    142
Train Data Columns: 13
Label Column:       target
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	3 unique label values:  [np.int64(0), np.int64(2), np.int64(1)]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type:       multiclass
Preprocessing data ...
Train Data Class Count: 3
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    2864.88 MB
	Train Data (Original)  Memory Usage: 0.01 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set featur

RuntimeError: No models were trained successfully during fit(). Inspect the log output or increase verbosity to determine why no models were fit. Alternatively, set `raise_on_no_models_fitted` to False during the fit call.

## Evaluate Mitra Performance

In [None]:
# Make predictions
mitra_predictions = mitra_predictor.predict(wine_test_data)
print("Sample Mitra predictions:")
print(mitra_predictions.head(10))

# Show prediction probabilities for the first few samples
# (stored under a separate name so the class predictions above are not overwritten)
mitra_proba = mitra_predictor.predict_proba(wine_test_data)
print(mitra_proba.head())

# Show model leaderboard
print("\nMitra Model Leaderboard:")
mitra_predictor.leaderboard(wine_test_data)
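`predict_proba` returns one probability per class for each row, and for multiclass problems the hard prediction typically corresponds to the most probable class. A standard-library sketch of that relationship, using made-up probabilities rather than output from the model above:

```python
# Illustrative, made-up class probabilities for three rows (not model output).
probabilities = [
    {0: 0.85, 1: 0.10, 2: 0.05},
    {0: 0.20, 1: 0.35, 2: 0.45},
    {0: 0.05, 1: 0.90, 2: 0.05},
]
true_labels = [0, 1, 1]

# The hard prediction is the class with the highest probability in each row.
predicted = [max(row, key=row.get) for row in probabilities]

# Accuracy is the fraction of rows where prediction and true label agree.
accuracy = sum(p == t for p, t in zip(predicted, true_labels)) / len(true_labels)

print(predicted)
print(f"accuracy = {accuracy:.3f}")
```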

## Fine-tuning with Mitra

Tabular foundation models (TFMs) can also be fine-tuned on your data when zero-shot performance is not sufficient.

In [None]:
mitra_predictor_ft = TabularPredictor(label='target')
mitra_predictor_ft.fit(
    wine_train_data,
    hyperparameters={
        'MITRA': {'fine_tune': True, 'fine_tune_steps': 10}
    },
    time_limit=120,  # 2 minutes
)

print("\nMitra fine-tuning completed!")
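The zero-shot and fine-tuning runs differ only in the `hyperparameters` dictionary. A small helper can make that switch explicit. The keys `'MITRA'`, `'fine_tune'`, and `'fine_tune_steps'` are the ones used in the cells above; the helper `mitra_hyperparameters` itself is hypothetical.

```python
def mitra_hyperparameters(fine_tune=False, fine_tune_steps=10):
    """Build the `hyperparameters` dict for a Mitra-only fit."""
    options = {"fine_tune": fine_tune}
    if fine_tune:
        # The step count is only relevant when fine-tuning is enabled.
        options["fine_tune_steps"] = fine_tune_steps
    return {"MITRA": options}


zero_shot = mitra_hyperparameters()
fine_tuned = mitra_hyperparameters(fine_tune=True, fine_tune_steps=10)

print(zero_shot)
print(fine_tuned)
```

Either dictionary can then be passed as `hyperparameters=` to `TabularPredictor.fit`.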

## Evaluating Fine-tuned Mitra Performance

In [None]:
# Show model leaderboard
print("\nMitra Model Leaderboard:")
mitra_predictor_ft.leaderboard(wine_test_data)

## Using Mitra for Regression

In [None]:
# Create predictor with Mitra for regression
print("Training Mitra regressor on California Housing dataset...")
mitra_reg_predictor = TabularPredictor(
    label='target',
    path='./mitra_regressor_model',
    problem_type='regression'
)
mitra_reg_predictor.fit(
    housing_train_data.sample(100, random_state=42),  # sample 100 rows, seeded for reproducibility
    hyperparameters={
        'MITRA': {'fine_tune': False}
    },
)

# Evaluate regression performance
mitra_reg_predictor.leaderboard(housing_test_data)
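For regression, AutoGluon's default evaluation metric is root mean squared error (RMSE), and the leaderboard reports scores in higher-is-better form, so RMSE appears negated. The sketch below shows the computation with made-up numbers; `rmse` is a hypothetical helper, not an AutoGluon function.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up targets and predictions, for illustration only.
y_true = [2.0, 3.0, 5.0, 1.0]
y_pred = [2.5, 2.0, 4.0, 1.5]

error = rmse(y_true, y_pred)
# Negated so that a larger value means a better model.
leaderboard_style_score = -error

print(f"RMSE = {error:.4f}, leaderboard-style score = {leaderboard_style_score:.4f}")
```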

## 2. TabICL: In-Context Learning for Large-Scale Tabular Data

TabICL is a foundation model dedicated to in-context learning on large tabular datasets.

TabICL uses a transformer architecture with in-context learning: the training rows serve as context at prediction time, so the model adapts to a new dataset without dataset-specific gradient updates.

In [None]:
!pip install uv
!uv pip install autogluon.tabular[tabicl]   # For TabICL
# Train TabICL on dataset
print("Training TabICL on wine dataset...")
tabicl_predictor = TabularPredictor(
    label='target',
    path='./tabicl_model'
)
tabicl_predictor.fit(
    wine_train_data,
    hyperparameters={
        'TABICL': {},
    },
)

# Show prediction probabilities for first few samples
tabicl_predictions = tabicl_predictor.predict_proba(wine_test_data)
print(tabicl_predictions.head())

# Show TabICL leaderboard
print("\nTabICL Model Details:")
tabicl_predictor.leaderboard(wine_test_data)

## 3. TabPFN v2: Prior-Fitted Networks

TabPFNv2 is a prior-fitted network: a transformer pretrained on synthetic datasets so that it can make accurate predictions on small tabular datasets.

TabPFNv2 excels on small datasets (< 10,000 samples) because knowledge about typical tabular problems is already encoded in its pretrained weights.

In [None]:
!uv pip install autogluon.tabular[tabpfn]   # For TabPFNv2
# Train TabPFNv2 on Wine dataset (perfect size for TabPFNv2)
print("Training TabPFNv2 on Wine dataset...")
tabpfnv2_predictor = TabularPredictor(
    label='target',
    path='./tabpfnv2_model'
)
tabpfnv2_predictor.fit(
    wine_train_data,
    hyperparameters={
        'TABPFNV2': {
            # TabPFNv2 works best with default parameters on small datasets
        },
    },
)

# Show prediction probabilities for first few samples
tabpfnv2_predictions = tabpfnv2_predictor.predict_proba(wine_test_data)
print(tabpfnv2_predictions.head())


tabpfnv2_predictor.leaderboard(wine_test_data)

## Advanced Usage: Combining Multiple Foundational Models

AutoGluon allows you to combine multiple foundational models in a single predictor for enhanced performance through model stacking and ensembling.
We also compare TFMs with traditional ML models.

In [None]:
# Configure multiple foundational models together
multi_foundation_config = {
    'MITRA': {
        'fine_tune': False,
        # 'fine_tune_steps': 10
    },
    'TABPFNV2': {},
    'TABICL': {},
    'GBM': {}, # Gradient Boosting Machine
    'NN_TORCH': {}, # Neural Network
    "CAT": {}, # CatBoost
    "RF": {}, # Random Forest
    "XGB": {}, # XGBoost
}

print("Training ensemble of foundational models...")
ensemble_predictor = TabularPredictor(
    label='target',
    path='./ensemble_foundation_model',  # For classification
    # path='./mitra_regressor_model',    # For regression
    # problem_type='regression',         # Remove 'TABICL' from the config first; it does not support regression.
).fit(
    wine_train_data,
    hyperparameters=multi_foundation_config,
    # time_limit=300,  # More time for multiple models
)

# Evaluate ensemble performance
ensemble_predictor.leaderboard(wine_test_data)
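When several models are fit, AutoGluon by default also fits a weighted ensemble on top of them, which shows up in the leaderboard as `WeightedEnsemble_L2`. The core idea, averaging class probabilities with per-model weights, can be sketched as follows. The weights and probabilities here are made up, and `weighted_average` is a hypothetical helper; AutoGluon learns its weights from validation data.

```python
def weighted_average(model_probas, weights):
    """Combine per-model class probabilities for one row using model weights."""
    total_weight = sum(weights[name] for name in model_probas)
    classes = next(iter(model_probas.values())).keys()
    return {
        cls: sum(weights[name] * probas[cls] for name, probas in model_probas.items())
        / total_weight
        for cls in classes
    }


# Made-up probabilities for a single row from three models.
model_probas = {
    "MITRA": {0: 0.70, 1: 0.20, 2: 0.10},
    "TABPFNV2": {0: 0.60, 1: 0.30, 2: 0.10},
    "GBM": {0: 0.30, 1: 0.60, 2: 0.10},
}
# Made-up ensemble weights.
weights = {"MITRA": 0.5, "TABPFNV2": 0.3, "GBM": 0.2}

combined = weighted_average(model_probas, weights)
print(combined)
```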

**Exercise: Use ``load_iris`` for classification and ``load_diabetes`` for regression, and repeat the evaluation.**

Note that TabICL does not currently support regression in AutoGluon.
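For the exercise, one way to respect the TabICL limitation is to derive the model configuration from the problem type. This is a minimal sketch: `BASE_CONFIG`, `CLASSIFICATION_ONLY`, and `config_for` are hypothetical names, and the model keys are the ones used earlier in this notebook.

```python
BASE_CONFIG = {
    "MITRA": {"fine_tune": False},
    "TABPFNV2": {},
    "TABICL": {},
    "GBM": {},
}

# Per the note above, TabICL is treated as classification-only.
CLASSIFICATION_ONLY = {"TABICL"}


def config_for(problem_type):
    """Return a hyperparameters dict suited to the given problem type."""
    if problem_type == "regression":
        return {
            name: options
            for name, options in BASE_CONFIG.items()
            if name not in CLASSIFICATION_ONLY
        }
    return dict(BASE_CONFIG)


print(sorted(config_for("multiclass")))
print(sorted(config_for("regression")))
```

The returned dictionary can be passed as `hyperparameters=` when fitting on the iris or diabetes data.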