# Advanced Tuning: Racing with ANOVA (tune_race_anova)

This notebook demonstrates **statistical racing** for hyperparameter optimization using repeated measures ANOVA.

## Key Benefits:
- **Typically 50-80% less tuning time** than exhaustive grid search (roughly 2-5x faster)
- **Statistically rigorous**: Uses repeated measures ANOVA to filter poor configs
- **Early elimination**: Drops unpromising parameter combinations after a short burn-in
- **Efficient**: Spends the remaining evaluations on promising configs

## Racing Algorithm:
1. Evaluate all configs on the first `burn_in` resamples
2. After each subsequent resample, run an ANOVA test
3. Eliminate configs that are significantly worse than the current best
4. Continue with the survivors only
5. Stop when a single winner remains or the resamples are exhausted
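
The steps above can be sketched in plain Python. This is a simplified illustration, not the `py_tune` implementation: it swaps the repeated measures ANOVA for a paired one-sided t-test against the current best config at α = 0.05, and the names `race`, `significantly_worse`, and `toy_loss` are hypothetical. The critical values are standard one-sided Student's t table entries.

```python
import math
import statistics

# One-sided critical values of Student's t at alpha = 0.05, keyed by
# degrees of freedom (folds seen minus one). Covers up to 10 folds.
T_CRIT_05 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
             6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833}


def significantly_worse(candidate, best):
    """Paired one-sided t-test: is the candidate's loss higher than the best's?"""
    diffs = [c - b for c, b in zip(candidate, best)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:  # identical gap on every fold
        return mean_diff > 0
    t = mean_diff / (sd / math.sqrt(len(diffs)))
    return t > T_CRIT_05[len(diffs) - 1]


def race(score_fn, configs, n_folds, burn_in=3):
    """Return (survivors, n_fits); score_fn(config, fold) is a loss (lower is better)."""
    scores = {c: [] for c in configs}
    alive = list(configs)
    n_fits = 0
    for fold in range(n_folds):
        for c in alive:                      # fit only the surviving configs
            scores[c].append(score_fn(c, fold))
            n_fits += 1
        if fold + 1 < burn_in:               # no elimination during burn-in
            continue
        best = min(alive, key=lambda c: statistics.mean(scores[c]))
        alive = [c for c in alive
                 if c == best or not significantly_worse(scores[c], scores[best])]
        if len(alive) == 1:                  # a single winner remains
            break
    return alive, n_fits


# Toy, fully deterministic losses (assumed for illustration only).
BASE = {"good": 1.00, "close": 1.01, "bad": 2.00}


def toy_loss(config, fold):
    noise = {"good": 0.0,
             "close": 0.05 * (fold % 3 - 1),
             "bad": 0.02 * (fold % 2)}[config]
    return BASE[config] + 0.1 * fold + noise   # 0.1 * fold is a shared fold effect


survivors, n_fits = race(toy_loss, ["good", "close", "bad"], n_folds=10)
print(survivors, n_fits)
```

In this toy run `bad` is dropped right after the burn-in, while `good` and `close` cannot be separated and both run on all 10 folds, for 23 fits instead of 30. Pairing the scores by fold is what lets the shared fold effect cancel out of the comparison.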

## Setup

In [None]:
import pandas as pd
import numpy as np
import warnings
import time
warnings.filterwarnings('ignore')

# py-tidymodels imports
from py_workflows import workflow
from py_parsnip import rand_forest, boost_tree, linear_reg
from py_rsample import vfold_cv
from py_yardstick import metric_set, rmse, mae, r_squared
from py_tune import (
    tune, grid_regular, tune_grid,
    tune_race_anova, control_race
)

print("✓ All imports successful")

## Load and Prepare Data

In [None]:
# Load data
df = pd.read_csv('../_md/__data/preem.csv')
df['date'] = pd.to_datetime(df['date'])

# Record the date range before the column is dropped
date_min, date_max = df['date'].min(), df['date'].max()
df = df.drop(columns=['date'])  # Drop date to avoid patsy categorical issues

print(f"Dataset shape: {df.shape}")
print(f"Date range: {date_min} to {date_max}")
print(f"\nColumns: {list(df.columns)[:5]}...")

# Display the first rows
df.head()

In [None]:
# Define formula (exclude date from predictors)
FORMULA = "target ~ totaltar + mean_med_diesel_crack_input1_trade_month_lag2 + mean_nwe_hsfo_crack_trade_month_lag1"

print(f"Formula: {FORMULA}")

## 1. Random Forest Tuning with ANOVA Racing

### 1.1 Setup: Create Large Grid (up to 125 combinations)

In [None]:
# Workflow with tunable Random Forest
wf_rf = (
    workflow()
    .add_formula(FORMULA)
    .add_model(
        rand_forest(
            mtry=tune(),        # Number of variables to sample
            trees=tune(),       # Number of trees
            min_n=tune()        # Minimum samples per leaf
        ).set_mode("regression")
    )
)

# Create large 3D grid
rf_grid = grid_regular(
    {
        "mtry": {"range": (1, 3)},       # 1-3 variables
        "trees": {"range": (50, 500)},   # Trees
        "min_n": {"range": (2, 40)}      # Min samples
    },
    levels=5  # 5x5x5 = 125 combinations
)

print(f"Grid size: {len(rf_grid)} parameter combinations")
print(f"\nFirst 5 configurations:")
print(rf_grid.head())

In [None]:
# Create cross-validation folds
cv_folds = vfold_cv(df, v=10)

print(f"Created {len(cv_folds)} CV folds")
print(f"Total evaluations without racing: {len(rf_grid)} configs × {len(cv_folds)} folds = {len(rf_grid) * len(cv_folds)} model fits")

### 1.2 Baseline: Standard Grid Search (for comparison)

In [None]:
# Time standard grid search
print("Running standard tune_grid (exhaustive search)...")
start_time = time.time()

grid_results = tune_grid(
    wf_rf,
    resamples=cv_folds,
    grid=rf_grid,
    metrics=metric_set(rmse, mae)
)

grid_time = time.time() - start_time

print(f"\n✓ Grid search complete in {grid_time:.1f} seconds")
print(f"Evaluated: {len(rf_grid)} configs × {len(cv_folds)} folds = {len(rf_grid) * len(cv_folds)} fits")

# Show best
print("\nTop 5 configurations:")
grid_results.show_best(metric="rmse", n=5, maximize=False)

### 1.3 Racing with ANOVA

Now let's use `tune_race_anova()` with the same grid.

In [None]:
# Configure racing
race_ctrl = control_race(
    burn_in=3,          # Evaluate all configs on first 3 folds
    num_ties=5,         # Keep top 5 if tied
    alpha=0.05,         # Significance level for ANOVA
    verbose_elim=True,       # Show progress
    save_pred=False
)

print("Racing configuration:")
print(f"  Burn-in: {race_ctrl.burn_in} folds (all configs evaluated)")
print(f"  Significance: α = {race_ctrl.alpha}")
print(f"  Ties kept: {race_ctrl.num_ties}")
print(f"\nThis will eliminate poor configs early!")
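
Because every config is fit on all burn-in folds, `burn_in` puts a floor on the work a race can save. The sketch below works through that arithmetic, assuming a grid of 125 configs (the 5x5x5 figure from the grid cell) and 10 folds; `race_fit_bounds` is a hypothetical helper, not part of `py_tune`.

```python
def race_fit_bounds(n_configs, n_folds, burn_in):
    """Bounds on model fits for a race that fits every config during burn-in."""
    max_fits = n_configs * n_folds            # nothing is ever eliminated
    min_fits = n_configs * burn_in            # burn-in alone, the unavoidable cost
    max_reduction = 1 - min_fits / max_fits   # equals 1 - burn_in / n_folds
    return min_fits, max_fits, max_reduction


min_fits, max_fits, max_reduction = race_fit_bounds(n_configs=125, n_folds=10, burn_in=3)
print(f"Fits: between {min_fits} and {max_fits}")
print(f"Largest possible reduction in fits: {max_reduction:.0%}")
```

With these numbers the race needs at least 375 of the 1250 fits, so the reduction in fits cannot exceed 70% however clear the winner is. A shorter burn-in raises that ceiling but gives the test fewer observations per config.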

In [None]:
# Run ANOVA racing
print("Running tune_race_anova (efficient search)...\n")
start_time = time.time()

race_results = tune_race_anova(
    wf_rf,
    resamples=cv_folds,
    grid=rf_grid,
    metrics=metric_set(rmse, mae),
    control=race_ctrl
)

race_time = time.time() - start_time

print(f"\n✓ Racing complete in {race_time:.1f} seconds")

### 1.4 Performance Comparison

In [None]:
# Compare timing
speedup = grid_time / race_time
reduction_pct = (1 - race_time / grid_time) * 100

print("=" * 60)
print("PERFORMANCE COMPARISON")
print("=" * 60)
print(f"\nStandard grid search:  {grid_time:.1f} seconds")
print(f"ANOVA racing:          {race_time:.1f} seconds")
print(f"\nSpeedup:               {speedup:.2f}x faster")
print(f"Time reduction:        {reduction_pct:.1f}%")

# Count actual evaluations in racing
race_metrics = race_results.metrics
n_race_evals = len(race_metrics[race_metrics['metric'] == 'rmse'])
n_grid_evals = len(grid_results.metrics[grid_results.metrics['metric'] == 'rmse'])

print(f"\nModel fits (grid):     {n_grid_evals}")
print(f"Model fits (racing):   {n_race_evals}")
print(f"Evaluation reduction:  {(1 - n_race_evals/n_grid_evals)*100:.1f}%")
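
To see where the eliminations happened, the same long-format metrics can be counted per fold. This is a minimal sketch on plain dictionaries: the `metric` key mirrors the column used above, while the `fold` and `config` keys and the `survivors_per_fold` helper are assumptions that may not match the real column names.

```python
from collections import Counter


def survivors_per_fold(rows, metric="rmse"):
    """Count how many configs were evaluated on each fold for one metric."""
    counts = Counter(row["fold"] for row in rows if row["metric"] == metric)
    return dict(sorted(counts.items()))


# Toy rows: three configs on folds 1-2, a single survivor on fold 3.
rows = [
    {"config": c, "fold": f, "metric": m, "value": 1.0}
    for f, configs in [(1, "abc"), (2, "abc"), (3, "a")]
    for c in configs
    for m in ("rmse", "mae")
]

print(survivors_per_fold(rows))
```

A pandas DataFrame can be turned into such rows with `DataFrame.to_dict("records")`, once the actual fold column name is known.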

In [None]:
# Compare best results
grid_best = grid_results.select_best(metric="rmse", maximize=False)
race_best = race_results.select_best(metric="rmse", maximize=False)

print("Best parameters found:")
print("\nGrid search:")
for param, value in grid_best.items():
    print(f"  {param}: {value}")

print("\nRacing:")
for param, value in race_best.items():
    print(f"  {param}: {value}")

# Check if same winner
if grid_best == race_best:
    print("\n✓ Both methods found the SAME best configuration!")
else:
    print("\n⚠ Different winners (both should be close in performance)")

In [None]:
# Show top 10 from racing
print("Top 10 configurations from racing:")
race_results.show_best(metric="rmse", n=10, maximize=False)

## 2. XGBoost Tuning with ANOVA Racing

Let's test racing on a different model type with continuous parameters.

In [None]:
# XGBoost workflow
wf_xgb = (
    workflow()
    .add_formula(FORMULA)
    .add_model(
        boost_tree(
            trees=tune(),
            tree_depth=tune(),
            learn_rate=tune(),
            min_n=tune()
        ).set_mode("regression").set_engine("xgboost")
    )
)

# 4D grid
xgb_grid = grid_regular(
    {
        "trees": {"range": (50, 300)},
        "tree_depth": {"range": (3, 10)},
        "learn_rate": {"range": (0.001, 0.3), "trans": "log"},
        "min_n": {"range": (2, 40)}
    },
    levels=4  # 4^4 = 256 combinations
)
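
The `"trans": "log"` entry matters for `learn_rate`, whose useful values span orders of magnitude. The sketch below shows what evenly spaced levels look like on a log10 scale compared with a linear one. It assumes that this is how a regular grid spaces its levels; `regular_levels` is a hypothetical helper and the exact values produced by `grid_regular` may differ.

```python
import itertools
import math


def regular_levels(low, high, levels, trans=None):
    """Evenly spaced levels (levels >= 2) on the raw scale, or on log10 if trans == 'log'."""
    if trans == "log":
        lo, hi = math.log10(low), math.log10(high)
        return [10 ** (lo + i * (hi - lo) / (levels - 1)) for i in range(levels)]
    return [low + i * (high - low) / (levels - 1) for i in range(levels)]


log_rates = regular_levels(0.001, 0.3, levels=4, trans="log")
linear_rates = regular_levels(0.001, 0.3, levels=4)

print("log-spaced:   ", [f"{r:.4f}" for r in log_rates])
print("linear-spaced:", [f"{r:.4f}" for r in linear_rates])

# Four parameters with four levels each give 4^4 combinations.
grid_size = len(list(itertools.product(range(4), repeat=4)))
print("grid size:", grid_size)
```

On the log scale consecutive levels differ by a constant ratio, so small learning rates get as much coverage as large ones. Linear spacing would place three of the four levels above 0.1.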

print(f"XGBoost grid: {len(xgb_grid)} combinations")
print(f"Full grid search would require {len(xgb_grid) * len(cv_folds)} model fits")

In [None]:
# Run racing on XGBoost
print("Running tune_race_anova on XGBoost...\n")
start_time = time.time()

xgb_race_results = tune_race_anova(
    wf_xgb,
    resamples=cv_folds,
    grid=xgb_grid,
    metrics=metric_set(rmse, mae, r_squared),
    control=race_ctrl
)

xgb_time = time.time() - start_time

print(f"\n✓ XGBoost racing complete in {xgb_time:.1f} seconds")

In [None]:
# Show best XGBoost configurations
print("Top 5 XGBoost configurations:")
xgb_race_results.show_best(metric="rmse", n=5, maximize=False)

In [None]:
# Count evaluations
xgb_evals = len(xgb_race_results.metrics[xgb_race_results.metrics['metric'] == 'rmse'])
xgb_full = len(xgb_grid) * len(cv_folds)

print(f"\nXGBoost evaluations:")
print(f"  Racing: {xgb_evals} fits")
print(f"  Full grid: {xgb_full} fits")
print(f"  Saved: {xgb_full - xgb_evals} fits ({(1 - xgb_evals/xgb_full)*100:.1f}% reduction)")

## 3. Controlling Racing Behavior

### 3.1 Aggressive Racing (eliminate quickly)

In [None]:
# Aggressive racing: higher alpha, fewer ties
aggressive_ctrl = control_race(
    burn_in=2,          # Start eliminating after 2 folds
    num_ties=3,         # Keep only top 3
    alpha=0.10,         # Less stringent (eliminate more)
    verbose_elim=True
)

print("Running aggressive racing...\n")
start_time = time.time()

aggressive_results = tune_race_anova(
    wf_rf,
    resamples=cv_folds,
    grid=rf_grid,
    metrics=metric_set(rmse),
    control=aggressive_ctrl
)

aggressive_time = time.time() - start_time
print(f"\n✓ Aggressive racing: {aggressive_time:.1f} seconds")

### 3.2 Conservative Racing (keep more candidates)

In [None]:
# Conservative racing: lower alpha, more ties
conservative_ctrl = control_race(
    burn_in=4,          # Longer burn-in
    num_ties=10,        # Keep top 10
    alpha=0.01,         # Very stringent (keep more)
    verbose_elim=True
)

print("Running conservative racing...\n")
start_time = time.time()

conservative_results = tune_race_anova(
    wf_rf,
    resamples=cv_folds,
    grid=rf_grid,
    metrics=metric_set(rmse),
    control=conservative_ctrl
)

conservative_time = time.time() - start_time
print(f"\n✓ Conservative racing: {conservative_time:.1f} seconds")

In [None]:
# Compare racing strategies
strategies = {
    'Standard': race_time,
    'Aggressive': aggressive_time,
    'Conservative': conservative_time
}

print("Racing strategy comparison:")
print("=" * 50)
for name, t in strategies.items():
    # Speedup of each strategy relative to the exhaustive grid search
    print(f"{name:15s}: {t:6.1f} seconds ({grid_time / t:.2f}x speedup)")

print("\n✓ More aggressive = faster but might miss good configs")
print("✓ More conservative = slower but safer")

## 4. Summary and Best Practices

### When to use ANOVA racing:
- ✓ **Large parameter grids** (50+ configurations)
- ✓ **Many resamples** (5-10 CV folds)
- ✓ **Expensive models** (slow to train)
- ✓ **Clear winners** (some configs much better)

### Configuration guidelines:
- **burn_in**: 2-4 folds (need baseline for ANOVA)
- **alpha**: 0.05 standard, 0.10 aggressive, 0.01 conservative
- **num_ties**: 5-10 (keep enough for robust selection)
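
The effect of `alpha` can be illustrated with a small worked example. As a stand-in for the ANOVA, it uses a paired one-sided t-test on the per-fold RMSE gap between a candidate and the current best after three folds (2 degrees of freedom). The critical values are standard t-table entries; the gaps and the `elimination_by_alpha` helper are made up for illustration.

```python
import math
import statistics

# One-sided Student's t critical values for 2 degrees of freedom (three folds seen).
T_CRIT_DF2 = {0.10: 1.886, 0.05: 2.920, 0.01: 6.965}


def elimination_by_alpha(diffs):
    """Map each alpha to True if the candidate would be eliminated at that level."""
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))
    return {alpha: t > crit for alpha, crit in T_CRIT_DF2.items()}


# Candidate RMSE minus best RMSE on each of three folds (hypothetical values).
clear_gap = [0.30, 0.10, 0.35]   # t is about 3.27
noisy_gap = [0.30, 0.05, 0.40]   # t is about 2.40

print(elimination_by_alpha(clear_gap))
print(elimination_by_alpha(noisy_gap))
```

The clearer gap is eliminated at α = 0.10 and 0.05 but survives at 0.01, and the noisier gap is eliminated only at 0.10. A larger `alpha` lowers the evidence needed to drop a config, which is why it makes the race more aggressive.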

### Expected speedup:
- **Typical**: 2-5x faster than grid search
- **Best case**: 10x faster with clear winner
- **Worst case**: Similar to grid if all configs equally good
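
Speedup factors and percentage time reductions describe the same quantity, related by reduction = 1 - 1/speedup. The helpers below are a small illustrative sketch of that conversion and are not part of any library.

```python
def time_reduction(speedup):
    """Fraction of time saved for a given speedup factor (2.0 means twice as fast)."""
    return 1 - 1 / speedup


def speedup_from_reduction(reduction):
    """Speedup factor implied by a fractional time reduction (0.5 means half the time)."""
    return 1 / (1 - reduction)


for s in (2, 5, 10):
    print(f"{s:>2}x faster -> {time_reduction(s):.0%} less time")
```

A 2-5x speedup therefore corresponds to a 50-80% reduction in tuning time, and a 10x speedup to 90%.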

### Trade-offs:
- ✓ Much faster than exhaustive search
- ✓ Statistically principled elimination
- ⚠ Might eliminate close competitors early
- ⚠ Requires multiple resamples (doesn't help with single holdout)

In [None]:
# Final summary
print("\n" + "=" * 70)
print("FINAL SUMMARY: tune_race_anova()")
print("=" * 70)
print(f"\nDataset: {df.shape[0]} observations, {df.shape[1]} columns (including target)")
print(f"Cross-validation: {len(cv_folds)} folds")
print(f"\nRandom Forest tuning:")
print(f"  Grid size: {len(rf_grid)} configurations")
print(f"  Grid search: {grid_time:.1f}s ({len(rf_grid) * len(cv_folds)} fits)")
print(f"  Racing: {race_time:.1f}s ({n_race_evals} fits)")
print(f"  Speedup: {speedup:.2f}x")
print(f"\nXGBoost tuning:")
print(f"  Grid size: {len(xgb_grid)} configurations")
print(f"  Racing: {xgb_time:.1f}s ({xgb_evals} fits)")
print(f"  Saved: {xgb_full - xgb_evals} evaluations")
print("\n✓ Racing typically provides a large speedup with little loss in accuracy")
print("✓ Perfect for initial hyperparameter screening")
print("✓ Combine with Bayesian optimization for best results")