# Comprehensive Time Series Forecasting: Hyperparameter Tuning & Workflows

This notebook demonstrates advanced forecasting workflows using the Preem dataset:

1. **Hyperparameter Tuning**: Grid search optimization for multiple model types
2. **Workflow Composition**: Building preprocessing + model pipelines
3. **Cross-Validation**: Time series CV for robust evaluation
4. **Model Comparison**: Systematic comparison of tuned models
5. **Visualization**: Interactive plots of tuning results and forecasts

## Setup

In [1]:
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')

# py-tidymodels imports
from py_workflows import workflow
from py_parsnip import (
    linear_reg, prophet_reg, arima_reg, boost_tree, 
    rand_forest, decision_tree, exp_smoothing,
    prophet_boost, arima_boost
)
from py_rsample import initial_time_split, time_series_cv, vfold_cv
from py_yardstick import metric_set, rmse, mae, r_squared, smape, mape
from py_recipes import recipe
from py_tune import tune, grid_regular, tune_grid, fit_resamples, finalize_workflow
from py_visualize import plot_forecast, plot_tune_results, plot_model_comparison

print("✓ All imports successful")

✓ All imports successful


## Load and Prepare Data

In [2]:
# Load data
df = pd.read_csv('__data/preem.csv')
df['date'] = pd.to_datetime(df['date'])

print(f"Dataset shape: {df.shape}")
print(f"Date range: {df['date'].min()} to {df['date'].max()}")
print(f"\nColumns: {list(df.columns)}")

# Display summary
df.head()

Dataset shape: (57, 10)
Date range: 2020-04-01 00:00:00 to 2024-12-01 00:00:00

Columns: ['date', 'mean_med_diesel_crack_input1_trade_month_lag2', 'mean_nwe_hsfo_crack_trade_month_lag1', 'mean_nwe_lsfo_crack_trade_month', 'mean_nwe_ulsfo_crack_trade_month_lag3', 'mean_sing_gasoline_vs_vlsfo_trade_month', 'mean_sing_vlsfo_crack_trade_month_lag3', 'new_sweet_sr_margin', 'target', 'totaltar']


| | date | mean_med_diesel_crack_input1_trade_month_lag2 | mean_nwe_hsfo_crack_trade_month_lag1 | mean_nwe_lsfo_crack_trade_month | mean_nwe_ulsfo_crack_trade_month_lag3 | mean_sing_gasoline_vs_vlsfo_trade_month | mean_sing_vlsfo_crack_trade_month_lag3 | new_sweet_sr_margin | target | totaltar |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-04-01 | -56.18 | -11.43 | -3.07 | 28.58 | -16.09 | 26.91 | 0.47 | 137.65 | 0.0 |
| 1 | 2020-05-01 | -42.36 | -9.17 | 3.89 | 22.64 | -8.75 | 20.36 | 0.57 | 113.53 | 0.0 |
| 2 | 2020-06-01 | -30.21 | -8.58 | -3.44 | 13.43 | -4.28 | 11.32 | 6.61 | 43.31 | 0.0 |
| 3 | 2020-07-01 | -28.86 | -6.86 | -2.71 | 10.74 | -3.2 | 8.53 | -1.55 | 79.77 | 0.0 |
| 4 | 2020-08-01 | -38.8 | -5.86 | -2.63 | 4.58 | -5.35 | 6.26 | -2.7 | 48.27 | 0.0 |


In [3]:
# Define formula (exclude date from predictors)
FORMULA_STR = "target ~ totaltar + mean_med_diesel_crack_input1_trade_month_lag2 + mean_nwe_hsfo_crack_trade_month_lag1"

print(f"Formula: {FORMULA_STR}")

Formula: target ~ totaltar + mean_med_diesel_crack_input1_trade_month_lag2 + mean_nwe_hsfo_crack_trade_month_lag1


In [4]:
# Create train/test split
split = initial_time_split(df, date_col='date', prop=0.8)
train_data = split.training()
test_data = split.testing()

print(f"Training: {train_data.shape[0]} rows | {train_data['date'].min()} to {train_data['date'].max()}")
print(f"Testing:  {test_data.shape[0]} rows | {test_data['date'].min()} to {test_data['date'].max()}")

Training: 45 rows | 2020-04-01 00:00:00 to 2023-12-01 00:00:00
Testing:  12 rows | 2024-01-01 00:00:00 to 2024-12-01 00:00:00
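
The 45/12 row counts above are what a chronological 80/20 split of 57 rows produces when the training size is truncated to an integer. The sketch below illustrates that arithmetic with a hypothetical helper, `chronological_split`; it is not the `py_rsample` implementation, and the truncation rule is an assumption inferred from the printed counts.

```python
def chronological_split(n_rows: int, prop: float) -> tuple[range, range]:
    """Hypothetical helper: the first int(n_rows * prop) rows train, the rest test."""
    n_train = int(n_rows * prop)  # assumption: the training size is truncated
    return range(0, n_train), range(n_train, n_rows)


train_idx, test_idx = chronological_split(57, 0.8)
print(len(train_idx), len(test_idx))

# Every training row precedes every test row, so no future data reaches the model
print(max(train_idx) < min(test_idx))
```

Because the rows are sorted by date, splitting by position keeps the test period strictly after the training period.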


## 1. Hyperparameter Tuning with Linear Models

### 1.1 Ridge Regression (L2 Regularization)

In [5]:
# Create workflow with tunable parameters
wf_ridge = (
    workflow()
    .add_formula(FORMULA_STR)
    .add_model(
        linear_reg(
            penalty=tune(),      # Regularization strength to tune
            mixture=0.0          # 0 = ridge (L2)
        )
    )
)

print("✓ Ridge workflow created with tunable penalty parameter")

✓ Ridge workflow created with tunable penalty parameter


In [6]:
# Create parameter grid
ridge_grid = grid_regular(
    {
        "penalty": {"range": (0.001, 10.0), "trans": "log"}
    },
    levels=10  # Test 10 different penalty values
)

print(f"Ridge grid: {len(ridge_grid)} parameter combinations")
print(ridge_grid.head())

Ridge grid: 10 parameter combinations
    penalty     .config
0  0.001000  config_001
1  0.002783  config_002
2  0.007743  config_003
3  0.021544  config_004
4  0.059948  config_005
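
The penalty values printed above are evenly spaced on a log10 scale between the two range endpoints. A minimal sketch of that spacing follows; `log_grid` is a hypothetical helper written for illustration, not part of `py_tune`.

```python
import math


def log_grid(lower: float, upper: float, levels: int) -> list[float]:
    """Hypothetical helper: `levels` values evenly spaced in log10 between the bounds."""
    lo, hi = math.log10(lower), math.log10(upper)
    step = (hi - lo) / (levels - 1)
    return [10 ** (lo + i * step) for i in range(levels)]


penalties = log_grid(0.001, 10.0, levels=10)
print([round(p, 6) for p in penalties])
```

Log spacing places as many candidates between 0.001 and 0.01 as between 1 and 10, which suits a penalty whose effect changes by orders of magnitude.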


In [7]:
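# Create cross-validation folds
# Note: k-fold ignores time order, so some folds train on months later than
# the ones they assess; Section 3 uses time_series_cv for the time series model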
cv_folds = vfold_cv(train_data, v=5)

print(f"Created {len(cv_folds)} cross-validation folds")

Created 5 cross-validation folds
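
Standard k-fold CV does not respect time order: when a fold's assessment rows come from the middle of the series, its training rows include later months. The sketch below counts such folds for 45 time-ordered rows. `kfold_indices` is a hypothetical stand-in, and the shuffling it does is an assumption about how `vfold_cv` assigns rows.

```python
import random


def kfold_indices(n_rows: int, v: int, seed: int = 0) -> list[list[int]]:
    """Hypothetical stand-in: shuffle row positions and deal them into v assessment sets."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::v]) for i in range(v)]


n_rows = 45  # rows are assumed to be sorted by date
folds = kfold_indices(n_rows, v=5)

leaky = 0
for assess in folds:
    assess_set = set(assess)
    train = [i for i in range(n_rows) if i not in assess_set]
    # A fold "leaks" if any training row is later than its earliest assessment row
    if max(train) > min(assess):
        leaky += 1

print(f"{leaky} of {len(folds)} folds train on rows later than their earliest assessment row")
```

At most one fold can avoid this (the one whose assessment set is the tail of the series), so k-fold scores on time-ordered data tend to be optimistic compared with a forward-looking test split.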


In [8]:
# Perform grid search
ridge_results = tune_grid(
    wf_ridge,
    resamples=cv_folds,
    grid=ridge_grid,
    metrics=metric_set(rmse, mae, r_squared)
)

print("✓ Ridge tuning complete")
print(f"Results shape: {ridge_results.metrics.shape}")

✓ Ridge tuning complete
Results shape: (150, 4)


In [9]:
# Show best results
best_ridge = ridge_results.show_best(metric="rmse", n=5, maximize=False)
print("Top 5 Ridge configurations by RMSE:")
display(best_ridge)

Top 5 Ridge configurations by RMSE:


| | .config | mean | penalty |
|---|---|---|---|
| 0 | config_010 | 32.405476 | 10.0 |
| 1 | config_009 | 32.408876 | 3.593814 |
| 2 | config_008 | 32.410103 | 1.29155 |
| 3 | config_007 | 32.410545 | 0.464159 |
| 4 | config_006 | 32.410703 | 0.16681 |


In [10]:
# Visualize tuning results
fig = plot_tune_results(ridge_results, metric="rmse", show_best=3)
fig.update_layout(title="Ridge Regression: Penalty vs RMSE")
fig.show()

### 1.2 Lasso Regression (L1 Regularization)

In [11]:
# Lasso workflow
wf_lasso = (
    workflow()
    .add_formula(FORMULA_STR)
    .add_model(
        linear_reg(
            penalty=tune(),
            mixture=1.0  # 1 = lasso (L1)
        )
    )
)

# Use same grid as ridge
lasso_results = tune_grid(
    wf_lasso,
    resamples=cv_folds,
    grid=ridge_grid,
    metrics=metric_set(rmse, mae, r_squared)
)

print("✓ Lasso tuning complete")
print("\nTop 5 Lasso configurations:")
display(lasso_results.show_best(metric="rmse", n=5, maximize=False))

✓ Lasso tuning complete

Top 5 Lasso configurations:


| | .config | mean | penalty |
|---|---|---|---|
| 0 | config_010 | 32.219647 | 10.0 |
| 1 | config_009 | 32.33955 | 3.593814 |
| 2 | config_008 | 32.391791 | 1.29155 |
| 3 | config_007 | 32.408612 | 0.464159 |
| 4 | config_006 | 32.410015 | 0.16681 |


In [12]:
# Visualize Lasso tuning
fig = plot_tune_results(lasso_results, metric="rmse", show_best=3)
fig.update_layout(title="Lasso Regression: Penalty vs RMSE")
fig.show()

### 1.3 Elastic Net (Combined L1 + L2)

In [13]:
# Elastic Net with 2D tuning
wf_elasticnet = (
    workflow()
    .add_formula(FORMULA_STR)
    .add_model(
        linear_reg(
            penalty=tune(),
            mixture=tune()  # Tune the L1/L2 mix
        )
    )
)

# 2D grid
elasticnet_grid = grid_regular(
    {
        "penalty": {"range": (0.001, 10.0), "trans": "log"},
        "mixture": {"range": (0.0, 1.0)}  # 0=ridge, 1=lasso
    },
    levels=5  # 5x5 = 25 combinations
)

print(f"Elastic Net grid: {len(elasticnet_grid)} combinations")
print(elasticnet_grid.head())

Elastic Net grid: 25 combinations
   penalty  mixture     .config
0    0.001     0.00  config_001
1    0.001     0.25  config_002
2    0.001     0.50  config_003
3    0.001     0.75  config_004
4    0.001     1.00  config_005
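
To make the roles of `penalty` and `mixture` concrete, the sketch below evaluates the elastic net penalty term in the glmnet-style parameterization, `penalty * (mixture * L1 + (1 - mixture) / 2 * L2)`. Whether the engine behind `linear_reg` scales the terms exactly this way is an assumption, and the coefficient values are made up for illustration.

```python
def elastic_net_penalty(coefs: list[float], penalty: float, mixture: float) -> float:
    """Penalty term added to the loss (glmnet-style scaling, assumed here)."""
    l1 = sum(abs(b) for b in coefs)   # lasso part: sum of absolute values
    l2 = sum(b * b for b in coefs)    # ridge part: sum of squares
    return penalty * (mixture * l1 + 0.5 * (1.0 - mixture) * l2)


coefs = [2.0, -1.0, 0.5]  # illustrative coefficients, not fitted values
for mixture in (0.0, 0.5, 1.0):
    value = elastic_net_penalty(coefs, penalty=1.0, mixture=mixture)
    print(f"mixture={mixture:.1f} -> penalty term {value:.4f}")
```

With `mixture=0.0` only the squared term remains (ridge), with `mixture=1.0` only the absolute-value term remains (lasso), and values in between blend the two.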


In [14]:
# Tune Elastic Net
elasticnet_results = tune_grid(
    wf_elasticnet,
    resamples=cv_folds,
    grid=elasticnet_grid,
    metrics=metric_set(rmse, mae, r_squared)
)

print("✓ Elastic Net tuning complete")
print("\nTop 5 Elastic Net configurations:")
display(elasticnet_results.show_best(metric="rmse", n=5, maximize=False))

✓ Elastic Net tuning complete

Top 5 Elastic Net configurations:


| | .config | mean | penalty | mixture |
|---|---|---|---|---|
| 0 | config_025 | 32.219647 | 10.0 | 1.0 |
| 1 | config_024 | 32.232716 | 10.0 | 0.75 |
| 2 | config_023 | 32.247367 | 10.0 | 0.5 |
| 3 | config_022 | 32.263461 | 10.0 | 0.25 |
| 4 | config_017 | 32.395454 | 1.0 | 0.25 |


In [15]:
# Visualize 2D tuning results (heatmap)
fig = plot_tune_results(elasticnet_results, metric="rmse")
fig.update_layout(title="Elastic Net: Penalty × Mixture Heatmap")
fig.show()

## 2. Tree-Based Model Tuning

### 2.1 Random Forest

In [16]:
# Random Forest workflow
wf_rf = (
    workflow()
    .add_formula(FORMULA_STR + " - date")  # Exclude date column
    .add_model(
        rand_forest(
            mtry=tune(),        # Number of variables to sample
            trees=tune(),       # Number of trees
            min_n=tune()        # Minimum samples a node needs to be split further
        ).set_mode("regression")
    )
)

# Create grid
rf_grid = grid_regular(
    {
        "mtry": {"range": (1, 3)},      # 1-3 variables (we have 3 predictors)
        "trees": {"range": (50, 500)},
        "min_n": {"range": (2, 20)}
    },
    levels=3  # 3x3x3 = 27 combinations
)

print(f"Random Forest grid: {len(rf_grid)} combinations")
print(rf_grid.head())

Random Forest grid: 27 combinations
   mtry  trees  min_n     .config
0     1     50      2  config_001
1     1     50     11  config_002
2     1     50     20  config_003
3     1    275      2  config_004
4     1    275     11  config_005
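
A regular grid is the Cartesian product of evenly spaced levels per parameter, which is why three parameters at three levels give 27 rows. The sketch below reproduces the ordering printed above with `itertools.product`. `linear_levels` is a hypothetical helper, and rounding to integers is an assumption that happens to match the printed values.

```python
from itertools import product


def linear_levels(lower: float, upper: float, levels: int) -> list[int]:
    """Hypothetical helper: evenly spaced integer levels between the bounds."""
    step = (upper - lower) / (levels - 1)
    return [round(lower + i * step) for i in range(levels)]


mtry = linear_levels(1, 3, 3)       # [1, 2, 3]
trees = linear_levels(50, 500, 3)   # [50, 275, 500]
min_n = linear_levels(2, 20, 3)     # [2, 11, 20]

# product() varies the last parameter fastest, matching the config order above
grid = list(product(mtry, trees, min_n))
print(len(grid))
for row in grid[:5]:
    print(row)
```

The grid grows multiplicatively, so adding a fourth parameter at three levels would already mean 81 fits per fold.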


In [17]:
# Tune Random Forest
rf_results = tune_grid(
    wf_rf,
    resamples=cv_folds,
    grid=rf_grid,
    metrics=metric_set(rmse, mae, r_squared)
)

print("✓ Random Forest tuning complete")
print("\nTop 5 Random Forest configurations:")
display(rf_results.show_best(metric="rmse", n=5, maximize=False))

✓ Random Forest tuning complete

Top 5 Random Forest configurations:


| | .config | mean | mtry | trees | min_n |
|---|---|---|---|---|---|
| 0 | config_009 | 33.361387 | 1 | 500 | 20 |
| 1 | config_006 | 33.536816 | 1 | 275 | 20 |
| 2 | config_008 | 33.731528 | 1 | 500 | 11 |
| 3 | config_003 | 33.744663 | 1 | 50 | 20 |
| 4 | config_005 | 33.824203 | 1 | 275 | 11 |


In [18]:
# Visualize RF tuning (parallel coordinates for 3+ parameters)
fig = plot_tune_results(rf_results, metric="rmse", show_best=10)
fig.update_layout(title="Random Forest: Parameter Exploration")
fig.show()

### 2.2 XGBoost (Gradient Boosting)

In [19]:
# XGBoost workflow
wf_xgb = (
    workflow()
    .add_formula(FORMULA_STR + " - date")
    .add_model(
        boost_tree(
            trees=tune(),
            tree_depth=tune(),
            learn_rate=tune()
        ).set_mode("regression").set_engine("xgboost")
    )
)

# XGBoost grid
xgb_grid = grid_regular(
    {
        "trees": {"range": (50, 500)},
        "tree_depth": {"range": (3, 10)},
        "learn_rate": {"range": (0.01, 0.3), "trans": "log"}
    },
    levels=3  # 3x3x3 = 27 combinations
)

print(f"XGBoost grid: {len(xgb_grid)} combinations")

XGBoost grid: 27 combinations


In [20]:
# Tune XGBoost
xgb_results = tune_grid(
    wf_xgb,
    resamples=cv_folds,
    grid=xgb_grid,
    metrics=metric_set(rmse, mae, r_squared)
)

print("✓ XGBoost tuning complete")
print("\nTop 5 XGBoost configurations:")
display(xgb_results.show_best(metric="rmse", n=5, maximize=False))

✓ XGBoost tuning complete

Top 5 XGBoost configurations:


| | .config | mean | trees | tree_depth | learn_rate |
|---|---|---|---|---|---|
| 0 | config_001 | 34.25651 | 50 | 3 | 0.01 |
| 1 | config_004 | 35.979325 | 50 | 6 | 0.01 |
| 2 | config_007 | 36.550279 | 50 | 10 | 0.01 |
| 3 | config_010 | 38.813393 | 275 | 3 | 0.01 |
| 4 | config_002 | 38.94654 | 50 | 3 | 0.054772 |


In [21]:
# Visualize XGBoost tuning
fig = plot_tune_results(xgb_results, metric="rmse", show_best=10)
fig.update_layout(title="XGBoost: Parameter Exploration")
fig.show()

## 3. Time Series Model Tuning

### 3.1 Prophet with Hyperparameters

In [22]:
# Prophet workflow with tunable parameters
wf_prophet = (
    workflow()
    .add_formula("target ~ date + totaltar")
    .add_model(
        prophet_reg(
            changepoint_prior_scale=tune(),
            seasonality_prior_scale=tune()
        )
    )
)

# Prophet grid
prophet_grid = grid_regular(
    {
        "changepoint_prior_scale": {"range": (0.001, 0.5), "trans": "log"},
        "seasonality_prior_scale": {"range": (0.01, 10.0), "trans": "log"}
    },
    levels=5  # 5x5 = 25 combinations
)

print(f"Prophet grid: {len(prophet_grid)} combinations")

Prophet grid: 25 combinations


In [23]:
# Time series CV for Prophet
ts_cv_folds = time_series_cv(
    train_data,
    date_column='date',
    initial='12 months',
    assess='3 months',
    skip='3 months',
    cumulative=True
)

print(f"Created {len(ts_cv_folds)} time series CV folds")

Created 9 time series CV folds
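
With `cumulative=True` the training window grows from a fixed start while the assessment window slides forward. The sketch below shows that expanding-window idea on row positions. `expanding_window_splits` is a hypothetical helper, and its `step` argument is an assumption: how `time_series_cv` interprets `skip` determines the real fold count, so this sketch is not expected to reproduce the 9 folds above.

```python
def expanding_window_splits(n_rows: int, initial: int, assess: int, step: int):
    """Hypothetical helper: (train, assess) position ranges with a growing training window."""
    splits = []
    train_end = initial
    while train_end + assess <= n_rows:
        train = range(0, train_end)                    # always starts at the first row
        holdout = range(train_end, train_end + assess)  # the rows right after training
        splits.append((train, holdout))
        train_end += step
    return splits


for train, holdout in expanding_window_splits(n_rows=10, initial=4, assess=2, step=2):
    print(f"train rows 0-{train[-1]} | assess rows {holdout[0]}-{holdout[-1]}")
```

Each assessment window lies strictly after its training window, so every fold mimics a genuine forecast.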


In [24]:
# Tune Prophet
prophet_results = tune_grid(
    wf_prophet,
    resamples=ts_cv_folds,
    grid=prophet_grid,
    metrics=metric_set(rmse, mae, mape)
)

print("✓ Prophet tuning complete")
print("\nTop 5 Prophet configurations:")
display(prophet_results.show_best(metric="rmse", n=5, maximize=False))

✓ Prophet tuning complete

Top 5 Prophet configurations:


| | .config | mean | changepoint_prior_scale | seasonality_prior_scale |
|---|---|---|---|---|
| 0 | config_001 | 26.648282 | 0.001 | 0.01 |
| 1 | config_002 | 26.802119 | 0.001 | 0.056234 |
| 2 | config_003 | 26.819105 | 0.001 | 0.316228 |
| 3 | config_017 | 26.885436 | 0.105737 | 0.056234 |
| 4 | config_012 | 26.901875 | 0.022361 | 0.056234 |


In [25]:
# Visualize Prophet tuning (heatmap)
fig = plot_tune_results(prophet_results, metric="rmse")
fig.update_layout(title="Prophet: Changepoint vs Seasonality Prior Scales")
fig.show()

## 4. Finalize and Compare Best Models

In [26]:
# Select best parameters for each model
best_ridge_params = ridge_results.select_best(metric="rmse", maximize=False)
best_lasso_params = lasso_results.select_best(metric="rmse", maximize=False)
best_elasticnet_params = elasticnet_results.select_best(metric="rmse", maximize=False)
best_rf_params = rf_results.select_best(metric="rmse", maximize=False)
best_xgb_params = xgb_results.select_best(metric="rmse", maximize=False)
best_prophet_params = prophet_results.select_best(metric="rmse", maximize=False)

print("✓ Best parameters selected for all models")
print("\nBest Ridge penalty:", best_ridge_params['penalty'])
print("Best Lasso penalty:", best_lasso_params['penalty'])
print("Best Elastic Net:", f"penalty={best_elasticnet_params['penalty']:.4f}, mixture={best_elasticnet_params['mixture']:.4f}")

✓ Best parameters selected for all models

Best Ridge penalty: 10.0
Best Lasso penalty: 10.0
Best Elastic Net: penalty=10.0000, mixture=1.0000
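
All three linear models select `penalty=10.0`, the upper end of the searched range, which suggests the optimum may lie outside the grid. The sketch below is a small check for that situation. `on_grid_edge` is a hypothetical helper, and the grid values are copied from the outputs above.

```python
import math


def on_grid_edge(best: float, grid_values: list[float]) -> bool:
    """Hypothetical helper: True if the selected value is the smallest or largest candidate."""
    lo, hi = min(grid_values), max(grid_values)
    return math.isclose(best, lo) or math.isclose(best, hi)


penalty_grid = [0.001, 0.002783, 0.007743, 0.021544, 0.059948,
                0.16681, 0.464159, 1.29155, 3.593814, 10.0]

if on_grid_edge(10.0, penalty_grid):
    print("Best penalty sits on the grid boundary; consider widening the range")
```

When the check fires, extending the range beyond 10.0 and re-tuning would show whether the CV error keeps falling.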


In [27]:
# Finalize workflows with best parameters
final_ridge = finalize_workflow(wf_ridge, best_ridge_params)
final_lasso = finalize_workflow(wf_lasso, best_lasso_params)
final_elasticnet = finalize_workflow(wf_elasticnet, best_elasticnet_params)
final_rf = finalize_workflow(wf_rf, best_rf_params)
final_xgb = finalize_workflow(wf_xgb, best_xgb_params)
final_prophet = finalize_workflow(wf_prophet, best_prophet_params)

print("✓ Workflows finalized with best parameters")

✓ Workflows finalized with best parameters


In [28]:
# Fit final models and evaluate on test data
fit_ridge = final_ridge.fit(train_data).evaluate(test_data)
fit_lasso = final_lasso.fit(train_data).evaluate(test_data)
fit_elasticnet = final_elasticnet.fit(train_data).evaluate(test_data)
fit_rf = final_rf.fit(train_data).evaluate(test_data)
fit_xgb = final_xgb.fit(train_data).evaluate(test_data)
fit_prophet = final_prophet.fit(train_data).evaluate(test_data)

print("✓ All models fitted and evaluated on test data")

✓ All models fitted and evaluated on test data


In [29]:
# Extract stats for comparison
stats_list = [
    fit_ridge.extract_outputs()[2],
    fit_lasso.extract_outputs()[2],
    fit_elasticnet.extract_outputs()[2],
    fit_rf.extract_outputs()[2],
    fit_xgb.extract_outputs()[2],
    fit_prophet.extract_outputs()[2]
]

model_names = ["Ridge", "Lasso", "Elastic Net", "Random Forest", "XGBoost", "Prophet"]

# Compare test set performance
fig = plot_model_comparison(
    stats_list,
    model_names=model_names,
    metrics=["rmse", "mae", "r_squared"],
    plot_type="bar"
)
fig.update_layout(title="Model Comparison: Test Set Performance")
fig.show()

In [30]:
# Create summary table
summary = []
for name, stats_df in zip(model_names, stats_list):
    test_stats = stats_df[stats_df['split'] == 'test']
    summary.append({
        'Model': name,
        'RMSE': test_stats[test_stats['metric'] == 'rmse']['value'].values[0],
        'MAE': test_stats[test_stats['metric'] == 'mae']['value'].values[0],
        'R²': test_stats[test_stats['metric'] == 'r_squared']['value'].values[0]
    })

summary_df = pd.DataFrame(summary).sort_values('RMSE')
print("\nTest Set Performance Summary (sorted by RMSE):")
display(summary_df)


Test Set Performance Summary (sorted by RMSE):


| | Model | RMSE | MAE | R² |
|---|---|---|---|---|
| 3 | Random Forest | 29.35348 | 23.753643 | -0.987013 |
| 5 | Prophet | 30.831311 | 22.953873 | -1.192126 |
| 4 | XGBoost | 31.933695 | 24.643882 | -1.351689 |
| 1 | Lasso | 43.004872 | 36.037723 | -3.264979 |
| 2 | Elastic Net | 43.004872 | 36.037723 | -3.264979 |
| 0 | Ridge | 43.490826 | 36.447385 | -3.361912 |
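
Every model above has a negative test R². Under the traditional definition, `1 - SS_res / SS_tot`, that means the forecasts have a larger squared error than a constant forecast equal to the test-period mean. The sketch below uses made-up numbers to show the effect, and assumes `py_yardstick`'s `r_squared` follows this definition (the negative values are consistent with it).

```python
def r_squared(actual: list[float], predicted: list[float]) -> float:
    """Traditional R²: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


actual = [10.0, 12.0, 14.0, 16.0]           # illustrative values only
mean_forecast = [13.0, 13.0, 13.0, 13.0]    # constant forecast at the mean of `actual`
biased_forecast = [20.0, 22.0, 24.0, 26.0]  # right shape, but offset by +10

print(r_squared(actual, mean_forecast))
print(r_squared(actual, biased_forecast))
```

The biased forecast tracks the shape of the series perfectly yet scores far below zero, so a level shift between the training and test periods is enough to produce negative R² values.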


## 5. Visualize Best Model Forecasts

In [31]:
# Plot forecast for best model (lowest RMSE)
best_model_name = summary_df.iloc[0]['Model']
best_fit = [fit_ridge, fit_lasso, fit_elasticnet, fit_rf, fit_xgb, fit_prophet][model_names.index(best_model_name)]

fig = plot_forecast(best_fit, prediction_intervals=False)
fig.update_layout(title=f"Best Model Forecast: {best_model_name}")
fig.show()

In [32]:
# Compare top 3 models
top3_models = summary_df.head(3)['Model'].tolist()
top3_fits = []
for name in top3_models:
    idx = model_names.index(name)
    top3_fits.append([fit_ridge, fit_lasso, fit_elasticnet, fit_rf, fit_xgb, fit_prophet][idx])

print(f"\nTop 3 models by RMSE: {', '.join(top3_models)}")


Top 3 models by RMSE: Random Forest, Prophet, XGBoost


## 6. Advanced Workflows with Preprocessing

### 6.1 Recipe + Model Workflow

In [33]:
# Create recipe with a preprocessing step
# The recipe is attached to the workflow with add_recipe() in the next cell
rec = (
    recipe()  # Create empty recipe
    .step_normalize()  # Z-score normalization; no columns given = all numeric columns
)

print("✓ Recipe created with normalization step")

✓ Recipe created with normalization step


In [34]:
# Workflow with recipe + tunable model
wf_recipe_xgb = (
    workflow()
    .add_recipe(rec)
    .add_model(
        boost_tree(
            trees=tune(),
            tree_depth=tune(),
            learn_rate=tune()
        ).set_mode("regression").set_engine("xgboost")
    )
)

print("✓ Workflow with recipe + XGBoost created")

✓ Workflow with recipe + XGBoost created


In [35]:
# Tune the recipe+model workflow
recipe_xgb_results = tune_grid(
    wf_recipe_xgb,
    resamples=cv_folds,
    grid=xgb_grid,
    metrics=metric_set(rmse, mae, r_squared)
)

print("✓ Recipe + XGBoost tuning complete")
print("\nTop 5 configurations:")
display(recipe_xgb_results.show_best(metric="rmse", n=5, maximize=False))

✓ Recipe + XGBoost tuning complete

Top 5 configurations:


| | .config | mean | trees | tree_depth | learn_rate |
|---|---|---|---|---|---|


In [36]:
# Compare with/without preprocessing
# The tuning run above returned no rows, so guard before select_best(),
# which otherwise raises IndexError on an empty results table.
recipe_best = recipe_xgb_results.show_best(metric="rmse", n=1, maximize=False)

if recipe_best.empty:
    print("Recipe + XGBoost tuning produced no results; skipping the comparison.")
else:
    best_recipe_xgb_params = recipe_xgb_results.select_best(metric="rmse", maximize=False)
    final_recipe_xgb = finalize_workflow(wf_recipe_xgb, best_recipe_xgb_params)
    fit_recipe_xgb = final_recipe_xgb.fit(train_data).evaluate(test_data)

    # Pull the test-set RMSE for both variants
    stats_recipe_xgb = fit_recipe_xgb.extract_outputs()[2]
    is_test_rmse = (stats_recipe_xgb['split'] == 'test') & (stats_recipe_xgb['metric'] == 'rmse')
    test_rmse_recipe = stats_recipe_xgb.loc[is_test_rmse, 'value'].values[0]
    test_rmse_plain = summary_df.loc[summary_df['Model'] == 'XGBoost', 'RMSE'].values[0]

    print(f"\nXGBoost without preprocessing: RMSE = {test_rmse_plain:.4f}")
    print(f"XGBoost with preprocessing:    RMSE = {test_rmse_recipe:.4f}")
    print(f"Improvement: {((test_rmse_plain - test_rmse_recipe) / test_rmse_plain * 100):.2f}%")

Recipe + XGBoost tuning produced no results; skipping the comparison.

## Summary

This notebook demonstrated:

1. **Hyperparameter Tuning**:
   - Ridge, Lasso, Elastic Net (1D and 2D grids)
   - Random Forest and XGBoost (3D grids)
   - Prophet (2D grid with time series CV)

2. **Workflow Composition**:
   - Formula-based workflows
   - Recipe-based workflows with preprocessing (the recipe + XGBoost tuning run returned no results)
   - Finalizing workflows with best parameters

3. **Cross-Validation**:
   - Standard k-fold CV for ML models
   - Time series CV for time series models

4. **Model Comparison**:
   - Systematic evaluation on test set
   - Visual comparison with plot_model_comparison
   - Summary tables

5. **Visualization**:
   - Tuning result plots (1D, 2D, parallel coordinates)
   - Forecast plots
   - Model comparison plots

**Next Steps**: See the companion workflowsets notebook for multi-model comparison across different preprocessing strategies.