# Demand Forecasting with TabPFN Time Series

This notebook demonstrates time series forecasting for demand planning using TabPFN's native time series capabilities.

**Use Case:** Demand Planning - Forecast product demand by category and region

**Business Context:** Accurate demand forecasting is the foundation of supply chain planning. It drives:
- Inventory planning and safety stock optimization
- Production scheduling and capacity planning
- Distribution requirements planning (DRP)
- Procurement and supplier management

**What you will learn:**
- How to use TabPFN's TimeSeriesDataFrame for time series data
- How to apply automatic feature engineering with FeatureTransformer
- How to use TabPFNTimeSeriesPredictor for forecasting
- How to evaluate forecast accuracy with industry-standard metrics
- How to forecast across multiple series and aggregate for planning

**Prerequisites:** Run the `00_data_preparation` notebook first.

## Compute Setup

We recommend running this notebook on a cluster with **DBR 17.4 LTS for ML** or above for the best experience. This runtime includes optimized ML libraries and better compatibility with TabPFN Time Series.

## 1. Installation

In [None]:
%pip install tabpfn-client tabpfn-time-series scikit-learn pandas matplotlib mlflow --quiet

In [None]:
dbutils.library.restartPython()

## 2. Authentication

In [None]:
import tabpfn_client

token = dbutils.secrets.get(scope="tabpfn-client", key="token")
tabpfn_client.set_access_token(token)

## 3. Configuration

In [None]:
CATALOG = "tabpfn_databricks"
SCHEMA = "default"

# MLflow experiment configuration (shared across all TabPFN notebooks)
# Default uses user namespace, but can be customized
current_user = spark.sql("SELECT current_user()").collect()[0][0]
MLFLOW_EXPERIMENT_NAME = f"/Users/{current_user}/tabpfn-databricks"

spark.sql(f"USE CATALOG {CATALOG}")
spark.sql(f"USE SCHEMA {SCHEMA}")

## 4. Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_absolute_error, mean_squared_error
import mlflow

# TabPFN Time Series imports
from tabpfn_time_series import (
    TimeSeriesDataFrame,
    FeatureTransformer,
    TabPFNTimeSeriesPredictor,
    TabPFNMode,
)
from tabpfn_time_series.data_preparation import generate_test_X
from tabpfn_time_series.features import RunningIndexFeature, CalendarFeature, AutoSeasonalFeature

# Set MLflow experiment
mlflow.set_experiment(MLFLOW_EXPERIMENT_NAME)
print(f"MLflow experiment set to: {MLFLOW_EXPERIMENT_NAME}")

## 5. Load Demand Forecast Data

The demand forecast dataset contains monthly demand data across:
- Multiple product categories (Beverages, Snacks, Dairy, Frozen, Personal Care)
- Multiple regions (Northeast, Southeast, Midwest, West)

Each series exhibits realistic patterns including:
- Seasonality (annual patterns)
- Trend (growth or decline)
- Holiday effects
- Random noise
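
To make these components concrete, here is a minimal sketch that builds one synthetic monthly series from a trend, an annual cycle, a December lift, and noise. It is an illustration only: the function name, parameters, and values are hypothetical and are not how the `00_data_preparation` notebook generates the dataset.

```python
import math
import random


def simulate_monthly_demand(n_months=36, base=1000.0, trend_per_month=5.0,
                            seasonal_amplitude=150.0, holiday_lift=200.0,
                            noise_sd=30.0, seed=0):
    """Return a list of synthetic monthly demand values (hypothetical parameters)."""
    rng = random.Random(seed)
    series = []
    for t in range(n_months):
        month = t % 12 + 1                       # calendar month 1..12, starting in January
        trend = trend_per_month * t              # linear growth (negative slope = decline)
        seasonal = seasonal_amplitude * math.sin(2 * math.pi * t / 12)  # annual cycle
        holiday = holiday_lift if month == 12 else 0.0                  # December lift
        noise = rng.gauss(0.0, noise_sd)         # random noise
        series.append(max(0.0, base + trend + seasonal + holiday + noise))
    return series


demand = simulate_monthly_demand()
print(len(demand), [round(v) for v in demand[:3]])
```

Setting `noise_sd=0.0` isolates the deterministic components, which makes it easier to see how much of the variation a forecaster could in principle explain.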

In [None]:
# Load demand forecast training data
df_demand = spark.table("demand_forecast_train").toPandas()
df_demand['date'] = pd.to_datetime(df_demand['date'])

print(f"Dataset shape: {df_demand.shape}")
print(f"Number of time series: {df_demand['series_id'].nunique()}")
print(f"Time range: {df_demand['date'].min().strftime('%Y-%m')} to {df_demand['date'].max().strftime('%Y-%m')}")
print("\nCategory distribution:")
print(df_demand['category'].value_counts())

display(df_demand.head(10))

In [None]:
# Visualize sample time series
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Get one series from each of the first four categories (one per subplot)
categories = df_demand['category'].unique()[:4]

for ax, cat in zip(axes.flatten(), categories):
    series_id = df_demand[df_demand['category'] == cat]['series_id'].iloc[0]
    df_s = df_demand[df_demand['series_id'] == series_id].sort_values('date')
    
    ax.plot(df_s['date'], df_s['demand_units'], 'b-', linewidth=1.5, marker='o', markersize=3)
    ax.set_title(f'{cat} ({df_s["region"].iloc[0]})')
    ax.set_xlabel('Date')
    ax.set_ylabel('Demand (Units)')
    ax.grid(True, alpha=0.3)
    ax.tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

## 6. Convert to TimeSeriesDataFrame

TabPFN Time Series uses a specialized `TimeSeriesDataFrame` format that indexes data by `item_id` (series identifier) and `timestamp`. This enables efficient handling of multiple time series.
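
As a rough mental model of this layout, the plain-Python sketch below stores long-format records keyed by `(item_id, timestamp)`. The records and values are made up, and the dictionaries only mimic a two-level index; this is not how `TimeSeriesDataFrame` is implemented.

```python
from datetime import date

# Hypothetical long-format records: one row per (series, month).
rows = [
    {"item_id": "B", "timestamp": date(2024, 2, 1), "target": 90.0},
    {"item_id": "A", "timestamp": date(2024, 1, 1), "target": 120.0},
    {"item_id": "B", "timestamp": date(2024, 1, 1), "target": 80.0},
    {"item_id": "A", "timestamp": date(2024, 2, 1), "target": 130.0},
]

# Sort by series, then time, mirroring the (item_id, timestamp) index order.
rows.sort(key=lambda r: (r["item_id"], r["timestamp"]))

# A two-level key gives direct lookup of a single observation ...
by_key = {(r["item_id"], r["timestamp"]): r["target"] for r in rows}

# ... and grouping on the first level recovers each series in time order.
by_series = {}
for r in rows:
    by_series.setdefault(r["item_id"], []).append(r["target"])

print(by_key[("A", date(2024, 2, 1))])
print(by_series)
```

Keeping every series in one sorted, two-level structure is what lets the same split and feature steps run over one series or many without changing the code.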

In [None]:
def pandas_to_time_series_dataframe(df, item_id_col, timestamp_col, target_col):
    """
    Convert a pandas DataFrame to a TimeSeriesDataFrame.
    
    Args:
        df: pandas DataFrame with time series data
        item_id_col: Column name for series identifier
        timestamp_col: Column name for timestamps
        target_col: Column name for target values
    
    Returns:
        TimeSeriesDataFrame
    """
    # Prepare the data with proper index
    df_ts = df[[item_id_col, timestamp_col, target_col]].copy()
    df_ts = df_ts.rename(columns={
        item_id_col: 'item_id',
        timestamp_col: 'timestamp',
        target_col: 'target'
    })
    df_ts = df_ts.sort_values(['item_id', 'timestamp'])
    df_ts = df_ts.set_index(['item_id', 'timestamp'])
    
    return TimeSeriesDataFrame(df_ts)

In [None]:
# Convert to TimeSeriesDataFrame
tsdf = pandas_to_time_series_dataframe(
    df_demand,
    item_id_col='series_id',
    timestamp_col='date',
    target_col='demand_units'
)

print(f"TimeSeriesDataFrame created with {len(tsdf.item_ids)} series")
print(f"Item IDs: {tsdf.item_ids[:5]}...")
print(f"\nDataFrame shape: {tsdf.shape}")
display(tsdf.head(10))

## 7. Single Series Forecasting Demo

Let's first demonstrate forecasting on a single series to understand the workflow.
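
The evaluation in this section holds out the final `pred_len` months of the series and forecasts them from the earlier months. The sketch below shows that split on a plain list; `holdout_split` and the sample values are hypothetical and stand in for the library's `train_test_split`, whose second return value in this notebook is the full series rather than only the holdout.

```python
def holdout_split(values, prediction_length):
    """Split one time-ordered series into (history, holdout); never shuffle."""
    if not 0 < prediction_length < len(values):
        raise ValueError("prediction_length must be between 1 and len(values) - 1")
    return values[:-prediction_length], values[-prediction_length:]


monthly_demand = [100, 104, 98, 110, 115, 120, 118, 125, 130, 128, 135, 140]
history, holdout = holdout_split(monthly_demand, prediction_length=6)
print(len(history), len(holdout))
```

Splitting by time instead of at random keeps future observations out of the training portion, which is what makes the holdout error a fair proxy for real forecast error.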

In [None]:
# Select a single series for demonstration
selected_series = tsdf.item_ids[0]
tsdf_single = tsdf[tsdf.index.get_level_values('item_id') == selected_series]

# Get series metadata from original dataframe
series_info = df_demand[df_demand['series_id'] == selected_series].iloc[0]

print(f"Selected series: {selected_series}")
print(f"Category: {series_info['category']}")
print(f"Region: {series_info['region']}")
print(f"Series length: {len(tsdf_single)} months")

In [None]:
# Define forecast horizon
pred_len = 6  # Forecast 6 months ahead

# Split into training and test portions
train_single, test_single = tsdf_single.train_test_split(prediction_length=pred_len)

# Generate test features (X) for the forecast horizon
test_X_single = generate_test_X(train_single, pred_len)

print(f"Training samples: {len(train_single)}")
print(f"Test samples: {len(test_single)} (forecast horizon: {pred_len} months)")

In [None]:
# Add temporal and seasonal features using FeatureTransformer
features = [
    RunningIndexFeature(),  # Sequential index feature
    CalendarFeature(),      # Month, day of week, etc.
    AutoSeasonalFeature()   # Automatic seasonality detection
]

transformer = FeatureTransformer(features)
train_single_feat, test_X_single_feat = transformer.transform(train_single, test_X_single)

print(f"Training features shape: {train_single_feat.shape}")
print(f"Test features shape: {test_X_single_feat.shape}")
print(f"\nFeature columns: {list(train_single_feat.columns)}")

## 8. Train TabPFN Time Series Predictor
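
The forecast in this section is scored with MAE, RMSE, MAPE, and bias. For reference, here is a minimal pure-Python sketch of those four metrics. The helper name `forecast_metrics` is hypothetical, and skipping zero-demand periods in MAPE is one possible convention, chosen here as an assumption; the notebook cells compute MAPE directly with NumPy.

```python
import math


def forecast_metrics(actual, forecast):
    """Return MAE, RMSE, MAPE (%) and bias for two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    errors = [f - a for a, f in zip(actual, forecast)]  # positive = over-forecast
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    bias = sum(errors) / n
    # MAPE is undefined where actual demand is zero, so those periods are skipped.
    pct = [abs(e) / abs(a) for a, e in zip(actual, errors) if a != 0]
    mape = 100 * sum(pct) / len(pct) if pct else float("nan")
    return {"mae": mae, "rmse": rmse, "mape": mape, "bias": bias}


m = forecast_metrics([100, 200, 400], [110, 180, 400])
print({k: round(v, 2) for k, v in m.items()})
```

MAE and RMSE are in demand units, MAPE is scale-free, and bias keeps the sign of the error, so a negative value indicates systematic under-forecasting.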

In [None]:
# Initialize the predictor (CLIENT mode runs inference through the hosted TabPFN API; no local training step)
predictor = TabPFNTimeSeriesPredictor(tabpfn_mode=TabPFNMode.CLIENT)

# Generate forecasts with MLflow logging
with mlflow.start_run(run_name="demand_forecast_single_series"):
    # Log parameters
    mlflow.log_param("model_type", "TabPFNTimeSeriesPredictor")
    mlflow.log_param("tabpfn_mode", "CLIENT")
    mlflow.log_param("task", "demand_forecasting")
    mlflow.log_param("series_id", selected_series)
    mlflow.log_param("category", series_info['category'])
    mlflow.log_param("region", series_info['region'])
    mlflow.log_param("forecast_horizon", pred_len)
    mlflow.log_param("n_features", train_single_feat.shape[1])
    mlflow.log_param("train_samples", len(train_single_feat))
    
    # Make predictions
    pred_single = predictor.predict(train_single_feat, test_X_single_feat)
    
    # Extract actual values for evaluation
    # Note: train_test_split returns the full series as the second element,
    # so we extract only the last pred_len values from the original series
    y_test = tsdf_single['target'].values[-pred_len:]
    y_pred = pred_single['mean'].values if 'mean' in pred_single.columns else pred_single.iloc[:, 0].values
    
    # Calculate forecast metrics
    mae = mean_absolute_error(y_test, y_pred)
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    mape = np.mean(np.abs((y_test - y_pred) / y_test)) * 100
    
    # Calculate bias
    bias = np.mean(y_pred - y_test)
    bias_pct = (bias / np.mean(y_test)) * 100
    
    # Log metrics
    mlflow.log_metric("mae", mae)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("mape", mape)
    mlflow.log_metric("bias", bias)
    mlflow.log_metric("bias_pct", bias_pct)
    
    print(f"Forecast Metrics ({pred_len}-month horizon):")
    print(f"  MAE:  {mae:,.0f} units")
    print(f"  RMSE: {rmse:,.0f} units")
    print(f"  MAPE: {mape:.1f}%")
    print(f"  Bias: {bias:,.0f} units ({bias_pct:+.1f}%)")
    print(f"  MLflow Run ID: {mlflow.active_run().info.run_id}")

In [None]:
# Visualize forecast
fig, ax = plt.subplots(figsize=(14, 6))

# Get dates for plotting
train_dates = train_single.index.get_level_values('timestamp')
# Get only the last pred_len dates for the test period
test_dates = tsdf_single.index.get_level_values('timestamp')[-pred_len:]
train_values = train_single['target'].values

# Plot training data
ax.plot(train_dates, train_values, 'b-', linewidth=1.5, label='Training Data')

# Plot actual vs forecast
ax.plot(test_dates, y_test, 'g-', linewidth=2, marker='o', markersize=8, label='Actual')
ax.plot(test_dates, y_pred, 'r--', linewidth=2, marker='s', markersize=8, label='Forecast')

# Highlight forecast region
ax.axvspan(test_dates[0], test_dates[-1], alpha=0.1, color='yellow', label='Forecast Horizon')

ax.set_xlabel('Date')
ax.set_ylabel('Demand (Units)')
ax.set_title(f'Demand Forecast - {series_info["category"]} ({series_info["region"]}) | MAPE: {mape:.1f}%')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## 9. Forecast with Uncertainty Quantification

TabPFN can provide prediction intervals, which are crucial for inventory planning because they indicate how much demand variability safety stock needs to cover.
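
Two simple checks help judge an interval forecast: how often actuals fall inside the band (coverage) and how wide the band is. The sketch below computes both on plain lists; the function names and sample numbers are hypothetical and do not come from the notebook's data.

```python
def interval_coverage(actual, lower, upper):
    """Fraction of actuals that fall inside [lower, upper], bounds inclusive."""
    hits = [lo <= a <= hi for a, lo, hi in zip(actual, lower, upper)]
    return sum(hits) / len(hits)


def mean_interval_width(lower, upper):
    """Average band width; at equal coverage, a narrower band is more useful."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


actual = [100, 120, 90, 150]
lower = [90, 100, 95, 120]
upper = [115, 125, 110, 160]

print(interval_coverage(actual, lower, upper))
print(mean_interval_width(lower, upper))
```

A nominal 90% interval should contain roughly 90% of actuals over many forecasts. With a six-month holdout, each miss moves coverage by about 17 percentage points, so single-series coverage is only a rough check.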

In [None]:
# Check if prediction contains quantiles
pred_columns = pred_single.columns.tolist()
print(f"Prediction columns: {pred_columns}")

# Extract prediction intervals if available
if 'q0.05' in pred_columns and 'q0.95' in pred_columns:
    y_lower = pred_single['q0.05'].values
    y_upper = pred_single['q0.95'].values
    y_median = pred_single['q0.5'].values if 'q0.5' in pred_columns else y_pred
    
    # Calculate coverage
    coverage = np.mean((y_test >= y_lower) & (y_test <= y_upper))
    print(f"90% Prediction Interval Coverage: {coverage:.1%}")
else:
    print("Quantile predictions not available in output. Using point predictions only.")
    # Placeholder +/-10% band for plotting only; this is not a statistical 90% prediction interval
    y_lower = y_pred * 0.9
    y_upper = y_pred * 1.1
    y_median = y_pred

In [None]:
# Visualize forecast with uncertainty
fig, ax = plt.subplots(figsize=(14, 6))

# Training data
ax.plot(train_dates, train_values, 'b-', linewidth=1.5, label='Training Data', alpha=0.7)

# Forecast with uncertainty band
ax.fill_between(test_dates, y_lower, y_upper, alpha=0.3, color='red', label='90% Prediction Interval')
ax.plot(test_dates, y_median, 'r-', linewidth=2, label='Forecast (Median)')
ax.scatter(test_dates, y_test, color='green', s=100, zorder=5, label='Actual', edgecolors='black')

ax.set_xlabel('Date')
ax.set_ylabel('Demand (Units)')
ax.set_title(f'Demand Forecast with Uncertainty - {series_info["category"]}')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## 10. Batch Forecasting Across Multiple Series

In practice, demand planners need to forecast hundreds or thousands of SKU-location combinations. Let's demonstrate batch forecasting on a subset of series from the TimeSeriesDataFrame.
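
Conceptually, batch evaluation applies the same holdout split to every series independently. The following sketch does this for plain `(series_id, period, value)` tuples; `split_by_series` and the sample records are hypothetical, and the notebook itself relies on `train_test_split` for this step.

```python
from collections import defaultdict


def split_by_series(records, horizon):
    """Group (series_id, period, value) records; hold out the last `horizon` points of each series."""
    grouped = defaultdict(list)
    for series_id, period, value in sorted(records):  # sorts by series, then period
        grouped[series_id].append(value)
    train, holdout = {}, {}
    for series_id, values in grouped.items():
        if len(values) <= horizon:
            continue                                   # too short to evaluate, so skipped
        train[series_id] = values[:-horizon]
        holdout[series_id] = values[-horizon:]
    return train, holdout


records = [("s2", 3, 33), ("s1", 1, 10), ("s2", 1, 31), ("s1", 2, 11),
           ("s1", 3, 12), ("s2", 2, 32), ("s3", 1, 5)]
train, holdout = split_by_series(records, horizon=1)
print(train)
print(holdout)
```

Series that are shorter than the horizon are dropped here rather than padded, so the number of evaluated series can be smaller than the number requested.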

In [None]:
# Forecast multiple series
n_series_to_forecast = 10  # Forecast 10 series for demonstration
series_ids = tsdf.item_ids[:n_series_to_forecast]

# Filter to selected series
tsdf_subset = tsdf[tsdf.index.get_level_values('item_id').isin(series_ids)]

# Define forecast horizon
forecast_horizon = 6

# Split all series
train_batch, test_batch = tsdf_subset.train_test_split(prediction_length=forecast_horizon)
test_X_batch = generate_test_X(train_batch, forecast_horizon)

# Apply feature transformation
train_batch_feat, test_X_batch_feat = transformer.transform(train_batch, test_X_batch)

print(f"Forecasting {len(series_ids)} series with {forecast_horizon}-month horizon...")
print(f"Training samples: {len(train_batch_feat)}")
print(f"Test samples: {len(test_X_batch_feat)}")

In [None]:
# Initialize predictor and make batch predictions
predictor_batch = TabPFNTimeSeriesPredictor(tabpfn_mode=TabPFNMode.CLIENT)

print("Generating batch forecasts...")
pred_batch = predictor_batch.predict(train_batch_feat, test_X_batch_feat)

print(f"Batch predictions shape: {pred_batch.shape}")
display(pred_batch.head(10))

In [None]:
# Calculate metrics for each series
results = []

for sid in series_ids:
    # Get actual values from the original series (last forecast_horizon values)
    # Note: test_batch from train_test_split returns the full series, not just the test portion
    y_actual = tsdf_subset[tsdf_subset.index.get_level_values('item_id') == sid]['target'].values[-forecast_horizon:]
    
    # Get predictions
    pred_col = 'mean' if 'mean' in pred_batch.columns else pred_batch.columns[0]
    y_forecast = pred_batch[pred_batch.index.get_level_values('item_id') == sid][pred_col].values
    
    # Ensure arrays have same length (should both be forecast_horizon)
    min_len = min(len(y_actual), len(y_forecast))
    y_actual = y_actual[:min_len]
    y_forecast = y_forecast[:min_len]
    
    # Calculate metrics
    mae_s = mean_absolute_error(y_actual, y_forecast)
    mape_s = np.mean(np.abs((y_actual - y_forecast) / (y_actual + 1e-8))) * 100
    rmse_s = np.sqrt(mean_squared_error(y_actual, y_forecast))
    
    # Get series metadata
    series_meta = df_demand[df_demand['series_id'] == sid].iloc[0]
    
    # Log to MLflow
    with mlflow.start_run(run_name=f"demand_forecast_{sid}"):
        mlflow.log_param("model_type", "TabPFNTimeSeriesPredictor")
        mlflow.log_param("tabpfn_mode", "CLIENT")
        mlflow.log_param("task", "demand_forecasting")
        mlflow.log_param("evaluation_type", "batch_forecasting")
        mlflow.log_param("series_id", sid)
        mlflow.log_param("category", series_meta['category'])
        mlflow.log_param("region", series_meta['region'])
        mlflow.log_param("forecast_horizon", forecast_horizon)
        
        mlflow.log_metric("mae", mae_s)
        mlflow.log_metric("mape", mape_s)
        mlflow.log_metric("rmse", rmse_s)
        mlflow.log_metric("actual_total", y_actual.sum())
        mlflow.log_metric("forecast_total", y_forecast.sum())
    
    results.append({
        'series_id': sid,
        'category': series_meta['category'],
        'region': series_meta['region'],
        'MAE': mae_s,
        'MAPE': mape_s,
        'actual_total': y_actual.sum(),
        'forecast_total': y_forecast.sum()
    })
    print(f"{sid}: {series_meta['category']:15s} | MAE={mae_s:,.0f}, MAPE={mape_s:.1f}%")

df_results = pd.DataFrame(results)

In [None]:
# Summary statistics with MLflow logging
print("\n" + "="*70)
print("FORECAST ACCURACY SUMMARY")
print("="*70)
print("\nOverall Performance:")
print(f"  Average MAPE: {df_results['MAPE'].mean():.1f}%")
print(f"  Median MAPE: {df_results['MAPE'].median():.1f}%")
print(f"  Best MAPE: {df_results['MAPE'].min():.1f}%")
print(f"  Worst MAPE: {df_results['MAPE'].max():.1f}%")

# Log summary metrics to MLflow
with mlflow.start_run(run_name="demand_forecast_batch_summary"):
    mlflow.log_param("model_type", "TabPFNTimeSeriesPredictor")
    mlflow.log_param("tabpfn_mode", "CLIENT")
    mlflow.log_param("task", "demand_forecasting")
    mlflow.log_param("evaluation_type", "batch_summary")
    mlflow.log_param("n_series", len(df_results))  # series actually evaluated
    mlflow.log_param("forecast_horizon", forecast_horizon)
    
    mlflow.log_metric("mape_mean", df_results['MAPE'].mean())
    mlflow.log_metric("mape_median", df_results['MAPE'].median())
    mlflow.log_metric("mape_min", df_results['MAPE'].min())
    mlflow.log_metric("mape_max", df_results['MAPE'].max())
    mlflow.log_metric("mae_mean", df_results['MAE'].mean())
    
    print(f"\n  MLflow Summary Run ID: {mlflow.active_run().info.run_id}")

print("\nPerformance by Category:")
category_summary = df_results.groupby('category').agg({
    'MAPE': 'mean',
    'series_id': 'count'
}).rename(columns={'series_id': 'n_series'})
print(category_summary.round(1))

In [None]:
# Visualize MAPE distribution
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# MAPE histogram
axes[0].hist(df_results['MAPE'], bins=15, edgecolor='black', alpha=0.7)
axes[0].axvline(df_results['MAPE'].mean(), color='red', linestyle='--', 
                label=f"Mean: {df_results['MAPE'].mean():.1f}%")
axes[0].set_xlabel('MAPE (%)')
axes[0].set_ylabel('Frequency')
axes[0].set_title('Distribution of Forecast Accuracy (MAPE)')
axes[0].legend()

# Actual vs Forecast scatter
axes[1].scatter(df_results['actual_total'], df_results['forecast_total'], 
                alpha=0.7, s=100, c=df_results['category'].astype('category').cat.codes)
max_val = max(df_results['actual_total'].max(), df_results['forecast_total'].max())
axes[1].plot([0, max_val], [0, max_val], 'r--', label='Perfect Forecast')
axes[1].set_xlabel('Actual Total Demand')
axes[1].set_ylabel('Forecast Total Demand')
axes[1].set_title(f'Forecast vs Actual ({forecast_horizon}-month total)')
axes[1].legend()

plt.tight_layout()
plt.show()

## 11. Aggregate Forecasts for Planning

Demand planners often need aggregated forecasts at different levels (category, region, total) for different planning decisions.
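
One property of aggregation is worth keeping in mind: over- and under-forecasts on individual series can offset each other, so the error at a higher level can look smaller than the series-level errors. The sketch below illustrates this with made-up totals; `rollup_error` is a hypothetical helper and assumes positive actual totals.

```python
from collections import defaultdict


def rollup_error(results, level):
    """Sum actual and forecast totals per `level`; return signed percent error per group."""
    totals = defaultdict(lambda: [0.0, 0.0])           # group -> [actual, forecast]
    for row in results:
        totals[row[level]][0] += row["actual_total"]
        totals[row[level]][1] += row["forecast_total"]
    # Assumes every group has a positive actual total (no division by zero).
    return {key: 100 * (f - a) / a for key, (a, f) in totals.items()}


results = [
    {"category": "Snacks", "region": "West", "actual_total": 1000, "forecast_total": 1100},
    {"category": "Snacks", "region": "Midwest", "actual_total": 1000, "forecast_total": 900},
    {"category": "Dairy", "region": "West", "actual_total": 500, "forecast_total": 550},
]
by_category = rollup_error(results, "category")
by_region = rollup_error(results, "region")
print(by_category)
print(by_region)
```

In this example the two Snacks series are each off by 10% in opposite directions, so the category-level error nets to zero while the regional view still shows the misses. Aggregate error is therefore best read alongside series-level MAPE, not instead of it.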

In [None]:
# Aggregate forecasts by category
print("Aggregate Forecast Summary by Category:")
print("="*70)

category_forecast = df_results.groupby('category').agg({
    'actual_total': 'sum',
    'forecast_total': 'sum',
    'series_id': 'count'
}).rename(columns={'series_id': 'n_series'})

category_forecast['forecast_error'] = category_forecast['forecast_total'] - category_forecast['actual_total']
category_forecast['error_pct'] = (category_forecast['forecast_error'] / category_forecast['actual_total']) * 100

print(category_forecast.round(1))  # keep one decimal so error_pct is not rounded to whole percents

# Total across the forecasted series (a subset of the full portfolio)
total_actual = df_results['actual_total'].sum()
total_forecast = df_results['forecast_total'].sum()
total_error_pct = ((total_forecast - total_actual) / total_actual) * 100

print("\nTotal Across Forecasted Series:")
print(f"  Actual: {total_actual:,.0f} units")
print(f"  Forecast: {total_forecast:,.0f} units")
print(f"  Error: {total_error_pct:+.1f}%")

## Summary

In this notebook, we demonstrated:

- **TimeSeriesDataFrame** - Native time series data structure for TabPFN
- **FeatureTransformer** - Automatic feature engineering with calendar and seasonal features
- **TabPFNTimeSeriesPredictor** - The foundation model for time series forecasting
- **Single Series Forecasting** - End-to-end workflow for one series
- **Uncertainty Quantification** - Prediction intervals for risk assessment
- **Batch Forecasting** - Forecasting multiple series efficiently
- **Aggregate Analysis** - Rolling up forecasts for planning

**Key Takeaways:**
1. TabPFN Time Series provides a dedicated API for time series forecasting
2. Automatic feature engineering reduces manual preprocessing
3. Built-in uncertainty quantification enables risk-aware planning
4. Batch forecasting handles many series in a single `predict` call, which is the pattern needed to scale toward larger SKU counts

**Typical Demand Planning MAPE Benchmarks (rules of thumb; they vary by industry and aggregation level):**
- Excellent: < 15%
- Good: 15-25%
- Acceptable: 25-35%
- Needs Improvement: > 35%
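
These bands can be applied programmatically when reviewing many series. The helper below is a hypothetical sketch; the list does not say which band the boundary values 15 and 25 belong to, so assigning them to the higher-error band is an assumption.

```python
import math


def mape_band(mape_pct):
    """Map a MAPE percentage to one of the benchmark labels."""
    if math.isnan(mape_pct) or mape_pct < 0:
        raise ValueError("MAPE must be a non-negative number")
    if mape_pct < 15:
        return "Excellent"
    if mape_pct < 25:          # assumption: exactly 15 counts as Good
        return "Good"
    if mape_pct <= 35:         # assumption: exactly 25 counts as Acceptable
        return "Acceptable"
    return "Needs Improvement"


for value in (8.2, 15.0, 30.0, 41.5):
    print(value, mape_band(value))
```

Applied to the `MAPE` column of a results table, such a label makes it easy to count how many series fall into each band.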

**Next Steps:**
- Integrate forecasts with inventory planning systems
- Add external features (promotions, holidays, weather)
- Implement hierarchical forecast reconciliation
- Deploy as a scheduled Databricks workflow