<a href="https://colab.research.google.com/github/john-d-noble/callcenter/blob/main/FINAL_CX_CB_RUN_3_IPYNB.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Enhanced Time Series Forecasting System

**Complete Implementation with 25-30 ML Models and VERIFIED GridSearchCV**

This notebook builds a machine learning pipeline for time series forecasting, with debug logging, error reporting, and timing at each step.

## Key Features:
- 25-30 diverse ML models (tree-based, linear, neural networks, ensemble)
- VERIFIED GridSearchCV execution with timing
- Advanced feature engineering for time series → supervised learning
- Market regime-aware model selection and feature engineering
- Three-phase optimization: V1 → V2 → VP
- Comprehensive debugging and error visibility
- Performance monitoring at every step
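
To make the "time series → supervised learning" item concrete, here is a minimal sketch of lag and rolling-window features. The helper name `make_supervised` and its default lags and windows are illustrative assumptions, not the notebook's actual feature engineering code.

```python
import numpy as np
import pandas as pd


def make_supervised(series: pd.Series, lags=(1, 7), windows=(3,)) -> pd.DataFrame:
    """Hypothetical helper: turn a univariate series into a feature table."""
    frame = pd.DataFrame({"target": series})
    for lag in lags:
        # Value observed `lag` steps before the target date
        frame[f"lag_{lag}"] = series.shift(lag)
    for window in windows:
        # shift(1) keeps the current target out of its own rolling window
        frame[f"roll_mean_{window}"] = series.shift(1).rolling(window).mean()
    # Early rows lack a full history and are dropped
    return frame.dropna()


dates = pd.date_range("2023-01-01", periods=14, freq="D")
calls = pd.Series(np.arange(14, dtype=float), index=dates, name="calls")
table = make_supervised(calls)
print(table.head())
```

Shifting before rolling is the important design choice: every feature on a given row only uses values that were known before that row's target.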

## GPU and System Check

In [None]:
# Check NVIDIA GPU Status
!nvidia-smi

In [None]:
# PyTorch GPU Setup and System Info
import torch
import psutil

# Check if CUDA (GPU) is available
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using GPU:", torch.cuda.get_device_name(0))
    print("GPU Memory:", torch.cuda.get_device_properties(0).total_memory / 1e9, "GB")
    print("GPU Compute Capability:", torch.cuda.get_device_properties(0).major, ".", torch.cuda.get_device_properties(0).minor)
    GPU_AVAILABLE = True
else:
    device = torch.device("cpu")
    print("Using CPU")
    GPU_AVAILABLE = False

# Memory check
available_ram_gb = psutil.virtual_memory().available / 1e9
total_ram_gb = psutil.virtual_memory().total / 1e9
print(f"Available RAM: {available_ram_gb:.1f} GB")
print(f"Total RAM: {total_ram_gb:.1f} GB")

# Set system capabilities flags
HIGH_MEMORY = available_ram_gb > 16
print(f"\nSYSTEM CAPABILITIES:")
print(f"  GPU Available: {GPU_AVAILABLE}")
print(f"  High Memory: {HIGH_MEMORY}")

# Example: Move a tensor to the GPU
x = torch.randn(10, 10).to(device)
print(f"\nTensor device test: {x.device}")

In [None]:
!pip uninstall -y numpy
!pip install numpy==1.26.4
!pip install tensorflow
!pip install tbats
!pip install pmdarima

## Enhanced Imports & Logging Setup

In [1]:
# Enhanced imports with debugging setup
import pandas as pd
import numpy as np
import warnings
import time
import logging
import sys
import traceback
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple, Any, Union
from dataclasses import dataclass
import itertools
from abc import ABC, abstractmethod

# PyTorch GPU Setup and System Info
import torch
import psutil

# Configure comprehensive logging for notebook visibility
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.StreamHandler(sys.stdout),  # Force notebook output
        logging.FileHandler('ml_pipeline_debug.log')  # Save to file
    ]
)
logger = logging.getLogger(__name__)
logger.info("DEBUG LOGGING INITIALIZED - Messages will appear in notebook and log file")

# Check if CUDA (GPU) is available
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using GPU:", torch.cuda.get_device_name(0))
    print("GPU Memory:", torch.cuda.get_device_properties(0).total_memory / 1e9, "GB")
    print("GPU Compute Capability:", torch.cuda.get_device_properties(0).major, ".", torch.cuda.get_device_properties(0).minor)
    GPU_AVAILABLE = True
else:
    device = torch.device("cpu")
    print("Using CPU")
    GPU_AVAILABLE = False

# Memory check
available_ram_gb = psutil.virtual_memory().available / 1e9
total_ram_gb = psutil.virtual_memory().total / 1e9
print(f"Available RAM: {available_ram_gb:.1f} GB")
print(f"Total RAM: {total_ram_gb:.1f} GB")

# Set system capabilities flags
HIGH_MEMORY = available_ram_gb > 16
print(f"\nSYSTEM CAPABILITIES:")
print(f"  GPU Available: {GPU_AVAILABLE}")
print(f"  High Memory: {HIGH_MEMORY}")

# Example: Move a tensor to the GPU
x = torch.randn(10, 10).to(device)
print(f"\nTensor device test: {x.device}")


# Core ML imports
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit, train_test_split
from sklearn.preprocessing import StandardScaler, RobustScaler, MinMaxScaler
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_regression, RFE
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.base import clone

# Tree-based models
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor, GradientBoostingRegressor
from sklearn.ensemble import BaggingRegressor, AdaBoostRegressor, VotingRegressor, StackingRegressor
from sklearn.tree import DecisionTreeRegressor

# Check optional libraries with detailed reporting
logger.info("Checking optional ML libraries...")
try:
    import xgboost as xgb
    XGB_AVAILABLE = True
    logger.info(f"XGBoost available: {xgb.__version__}")
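    # Check that the GPU parameters are accepted. Constructing the estimator
    # does not touch the GPU, so real availability is only confirmed at fit time.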
    if GPU_AVAILABLE:
        try:
            test_xgb = xgb.XGBRegressor(tree_method='gpu_hist', n_estimators=1)
            XGB_GPU_AVAILABLE = True
            logger.info("XGBoost GPU support: AVAILABLE")
        except Exception as e:
            XGB_GPU_AVAILABLE = False
            logger.warning(f"XGBoost GPU support: NOT AVAILABLE - {str(e)}")
    else:
        XGB_GPU_AVAILABLE = False
except ImportError:
    XGB_AVAILABLE = False
    XGB_GPU_AVAILABLE = False
    logger.warning("XGBoost not available. Install with: pip install xgboost")

try:
    import lightgbm as lgb
    LGB_AVAILABLE = True
    logger.info(f"LightGBM available: {lgb.__version__}")
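    # Check that the GPU parameters are accepted. Constructing the estimator
    # does not touch the GPU, so real availability is only confirmed at fit time.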
    if GPU_AVAILABLE:
        try:
            test_lgb = lgb.LGBMRegressor(device='gpu', n_estimators=1, verbose=-1)
            LGB_GPU_AVAILABLE = True
            logger.info("LightGBM GPU support: AVAILABLE")
        except Exception as e:
            LGB_GPU_AVAILABLE = False
            logger.warning(f"LightGBM GPU support: NOT AVAILABLE - {str(e)}")
    else:
        LGB_GPU_AVAILABLE = False
except ImportError:
    LGB_AVAILABLE = False
    LGB_GPU_AVAILABLE = False
    logger.warning("LightGBM not available. Install with: pip install lightgbm")

# Linear models
from sklearn.linear_model import (
    LinearRegression, Ridge, Lasso, ElasticNet,
    BayesianRidge, ARDRegression, HuberRegressor,
    SGDRegressor, PassiveAggressiveRegressor
)

# Neural networks
from sklearn.neural_network import MLPRegressor

# Other models
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.kernel_ridge import KernelRidge

# Feature engineering
from sklearn.preprocessing import PolynomialFeatures
from scipy import stats
from scipy.signal import find_peaks

# Suppress sklearn warnings but keep our logging
warnings.filterwarnings('ignore', category=UserWarning)
warnings.filterwarnings('ignore', category=FutureWarning)

# Set reproducible seed
np.random.seed(42)

logger.info("All imports completed successfully")
print(f"\nLIBRARY STATUS:")
print(f"  XGBoost: {'✅' if XGB_AVAILABLE else '❌'} (GPU: {'✅' if XGB_GPU_AVAILABLE else '❌'})")
print(f"  LightGBM: {'✅' if LGB_AVAILABLE else '❌'} (GPU: {'✅' if LGB_GPU_AVAILABLE else '❌'})")
print(f"  PyTorch: {'✅' if torch.cuda.is_available() else '❌'}")

Using GPU: Tesla T4
GPU Memory: 15.828320256 GB
GPU Compute Capability: 7 . 5
Available RAM: 52.5 GB
Total RAM: 54.8 GB

SYSTEM CAPABILITIES:
  GPU Available: True
  High Memory: True

Tensor device test: cuda:0

LIBRARY STATUS:
  XGBoost: ✅ (GPU: ✅)
  LightGBM: ✅ (GPU: ✅)
  PyTorch: ✅


## Configuration with Debug Settings

In [2]:
@dataclass
class DebugMLForecastingConfig:
    """Configuration with comprehensive debugging capabilities"""

    # Data parameters
    target_column: str = "calls"
    seasonal_period: int = 7
    test_split_ratio: float = 0.7
    validation_split_ratio: float = 0.15

    # Feature engineering
    max_lags: int = 21
    rolling_windows: Optional[List[int]] = None  # Filled in by __post_init__
    create_technical_indicators: bool = True
    create_calendar_features: bool = True
    polynomial_degree: int = 2

    # Model selection
    use_tree_models: bool = True
    use_linear_models: bool = True
    use_neural_models: bool = True
    use_ensemble_models: bool = True
    use_other_models: bool = True

    # GridSearchCV parameters - CRITICAL FOR DEBUGGING
    cv_splits: int = 3
    parallel_jobs: int = 1  # Single job for debugging visibility
    scoring_metric: str = 'neg_mean_absolute_error'
    gridsearch_verbose: int = 2  # Show GridSearch progress

    # Market regime integration
    use_market_regime_switching: bool = True
    use_regime_features: bool = True
    vix_thresholds: Optional[Dict[str, float]] = None  # Filled in by __post_init__
    regime_specific_models: Optional[Dict[str, List[str]]] = None  # Filled in by __post_init__

    # Optimization levels
    quick_search: bool = False  # Force detailed search for debugging
    detailed_search: bool = True
    top_models_for_optimization: int = 5

    # Pipeline phases
    enable_v2_feature_engineering: bool = True
    enable_vp_optimization: bool = True

    # Feature selection
    feature_selection_method: str = 'auto'
    max_features_ratio: float = 0.8

    # Debug settings
    debug_mode: bool = True
    show_progress: bool = True
    time_each_phase: bool = True
    validate_gridsearch: bool = True
    save_intermediate_results: bool = True

    def __post_init__(self):
        if self.rolling_windows is None:
            if HIGH_MEMORY:
                self.rolling_windows = [3, 7, 14, 21, 30]
                self.max_lags = 30
                self.top_models_for_optimization = 8
                self.cv_splits = 5
            else:
                self.rolling_windows = [3, 7, 14]
                self.max_lags = 15
                self.top_models_for_optimization = 3

        if self.vix_thresholds is None:
            self.vix_thresholds = {
                'low_volatility': 15,
                'normal': 25,
                'high_volatility': 35
            }

        if self.regime_specific_models is None:
            self.regime_specific_models = {
                'low_volatility': ['LinearRegression', 'Ridge', 'RandomForest'],
                'normal': ['RandomForest', 'GradientBoosting', 'XGBoost', 'LightGBM'],
                'high_volatility': ['ExtraTrees', 'SVR', 'KNeighbors', 'MLP'],
                'extreme_volatility': ['Lasso', 'ElasticNet', 'Huber', 'BayesianRidge']
            }

        logger.info(f"Configuration initialized - Debug mode: {self.debug_mode}")
        logger.info(f"  Max lags: {self.max_lags}")
        logger.info(f"  Rolling windows: {self.rolling_windows}")
        logger.info(f"  Top models for optimization: {self.top_models_for_optimization}")
        logger.info(f"  GridSearch CV splits: {self.cv_splits}")

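# Performance monitoring decorator
import functools

def monitor_performance(func_name: Optional[str] = None):
    """Decorator to monitor function execution time and memory"""
    def decorator(func):
        @functools.wraps(func)  # Preserve the wrapped function's name and docstring
        def wrapper(*args, **kwargs):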
            name = func_name or func.__name__
            start_time = time.time()
            start_memory = psutil.Process().memory_info().rss / 1024 / 1024  # MB

            logger.info(f"🚀 Starting {name}...")

            try:
                result = func(*args, **kwargs)

                end_time = time.time()
                end_memory = psutil.Process().memory_info().rss / 1024 / 1024  # MB

                duration = end_time - start_time
                memory_delta = end_memory - start_memory

                logger.info(f"✅ {name} completed in {duration:.1f}s, memory: {memory_delta:+.1f}MB")

                return result

            except Exception as e:
                end_time = time.time()
                duration = end_time - start_time
                logger.error(f"❌ {name} failed after {duration:.1f}s: {str(e)}")
                raise

        return wrapper
    return decorator

print("Configuration and monitoring setup complete")

Configuration and monitoring setup complete
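
The configuration stores `vix_thresholds` and `regime_specific_models` but this cell does not show how they are combined. The sketch below is one plausible reading, assuming each threshold is an exclusive upper bound and that anything above the highest threshold counts as `extreme_volatility`. The helper `classify_regime` is hypothetical.

```python
from typing import Dict, List

# Defaults copied from DebugMLForecastingConfig.__post_init__
VIX_THRESHOLDS: Dict[str, float] = {
    "low_volatility": 15,
    "normal": 25,
    "high_volatility": 35,
}
REGIME_MODELS: Dict[str, List[str]] = {
    "low_volatility": ["LinearRegression", "Ridge", "RandomForest"],
    "normal": ["RandomForest", "GradientBoosting", "XGBoost", "LightGBM"],
    "high_volatility": ["ExtraTrees", "SVR", "KNeighbors", "MLP"],
    "extreme_volatility": ["Lasso", "ElasticNet", "Huber", "BayesianRidge"],
}


def classify_regime(vix: float, thresholds: Dict[str, float] = VIX_THRESHOLDS) -> str:
    """Hypothetical helper: treat each threshold as an exclusive upper bound."""
    for regime, upper in sorted(thresholds.items(), key=lambda item: item[1]):
        if vix < upper:
            return regime
    # Anything at or above the highest threshold
    return "extreme_volatility"


for vix in (12.0, 18.0, 30.0, 40.0):
    regime = classify_regime(vix)
    print(f"VIX {vix:>4}: {regime} -> {REGIME_MODELS[regime]}")
```

Sorting by threshold value rather than relying on dictionary order keeps the lookup correct even if a caller passes the thresholds in a different order.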


## Base ML Forecaster with Enhanced Debugging

In [3]:
class DebugMLForecaster:
    """Enhanced ML forecaster with comprehensive debugging"""

    def __init__(self, name: str, model, debug_mode: bool = True):
        self.name = name
        self.model = clone(model)
        self.pipeline = None
        self.is_fitted = False
        self.feature_importance_ = None
        self.debug_mode = debug_mode
        self.training_time = None
        self.prediction_time = None
        self.fit_error = None

    def fit(self, X: pd.DataFrame, y: pd.Series) -> 'DebugMLForecaster':
        """Fit model with comprehensive error handling and timing"""

        fit_start = time.time()

        try:
            if self.debug_mode:
                logger.debug(f"Fitting {self.name} on data shape: {X.shape}")

            # Validate input data
            if X.empty or y.empty:
                raise ValueError(f"Empty input data for {self.name}")

            if len(X) != len(y):
                raise ValueError(f"Feature/target length mismatch for {self.name}: {len(X)} vs {len(y)}")

            # Convert to numpy arrays for consistent handling
            X_array = X.values if isinstance(X, pd.DataFrame) else X
            y_array = y.values if isinstance(y, pd.Series) else y

            # Check for NaN/inf values
            if np.any(np.isnan(X_array)) or np.any(np.isinf(X_array)):
                logger.warning(f"{self.name}: Found NaN/inf in features, cleaning...")
                X_array = np.nan_to_num(X_array, nan=0.0, posinf=1e6, neginf=-1e6)

            if np.any(np.isnan(y_array)) or np.any(np.isinf(y_array)):
                logger.warning(f"{self.name}: Found NaN/inf in target, cleaning...")
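                # Fill with the mean of the finite values; np.nanmean is itself inf/NaN when inf is present
                finite_y = y_array[np.isfinite(y_array)]
                fill_value = float(finite_y.mean()) if finite_y.size else 0.0
                y_array = np.nan_to_num(y_array, nan=fill_value, posinf=fill_value, neginf=fill_value)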

            # Create and fit pipeline
            self.pipeline = Pipeline([
                ('scaler', StandardScaler()),
                ('model', clone(self.model))
            ])

            # Fit with error handling for specific model types
            try:
                self.pipeline.fit(X_array, y_array.ravel())
                self.is_fitted = True

                # Extract feature importance
                self._extract_feature_importance(X)

                # Test prediction capability
                test_pred = self.pipeline.predict(X_array[:min(5, len(X_array))])
                if np.any(np.isnan(test_pred)) or np.any(np.isinf(test_pred)):
                    logger.warning(f"{self.name}: Model produces invalid predictions")
                    self.is_fitted = False

            except Exception as model_error:
                self.fit_error = str(model_error)
                logger.error(f"{self.name} fit failed: {self.fit_error}")
                self.is_fitted = False

        except Exception as e:
            self.fit_error = str(e)
            logger.error(f"{self.name} preprocessing failed: {self.fit_error}")
            self.is_fitted = False

        finally:
            self.training_time = time.time() - fit_start

            if self.debug_mode:
                status = "✅ SUCCESS" if self.is_fitted else "❌ FAILED"
                logger.info(f"{self.name}: {status} (Training time: {self.training_time:.2f}s)")

        return self

    def predict(self, X: pd.DataFrame) -> np.ndarray:
        """Make predictions with error handling"""

        if not self.is_fitted:
            logger.error(f"{self.name}: Cannot predict - model not fitted")
            return np.zeros(len(X))

        pred_start = time.time()

        try:
            X_array = X.values if isinstance(X, pd.DataFrame) else X

            # Clean input data
            if np.any(np.isnan(X_array)) or np.any(np.isinf(X_array)):
                X_array = np.nan_to_num(X_array, nan=0.0, posinf=1e6, neginf=-1e6)

            predictions = self.pipeline.predict(X_array)

            # Validate predictions
            if np.any(np.isnan(predictions)) or np.any(np.isinf(predictions)):
                logger.warning(f"{self.name}: Invalid predictions detected, cleaning...")
                predictions = np.nan_to_num(predictions, nan=0.0, posinf=1e6, neginf=-1e6)

            self.prediction_time = time.time() - pred_start

            if self.debug_mode:
                logger.debug(f"{self.name}: Prediction completed in {self.prediction_time:.3f}s")

            return predictions.flatten()

        except Exception as e:
            self.prediction_time = time.time() - pred_start
            logger.error(f"{self.name} prediction failed: {str(e)}")
            return np.zeros(len(X))

    def _extract_feature_importance(self, X: pd.DataFrame):
        """Extract feature importance if available"""
        try:
            model = self.pipeline.named_steps['model']

            if hasattr(model, 'feature_importances_'):
                importances = model.feature_importances_
            elif hasattr(model, 'coef_'):
                importances = np.abs(model.coef_).flatten()
            else:
                importances = None

            if importances is not None and len(importances) == len(X.columns):
                self.feature_importance_ = dict(zip(X.columns, importances))
                if self.debug_mode:
                    top_features = sorted(self.feature_importance_.items(), key=lambda x: x[1], reverse=True)[:3]
                    logger.debug(f"{self.name} top features: {[f[0] for f in top_features]}")

        except Exception as e:
            if self.debug_mode:
                logger.debug(f"Could not extract feature importance for {self.name}: {str(e)}")
            self.feature_importance_ = None

    def get_info(self) -> Dict[str, Any]:
        """Get comprehensive model information"""
        return {
            'name': self.name,
            'is_fitted': self.is_fitted,
            'training_time': self.training_time,
            'prediction_time': self.prediction_time,
            'fit_error': self.fit_error,
            'has_feature_importance': self.feature_importance_ is not None
        }

logger.info("Enhanced ML Forecaster class loaded")
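
The core of `DebugMLForecaster.fit` is a scaler-plus-model `Pipeline` built from a cloned estimator, followed by reading `coef_` (or `feature_importances_`) from the fitted step. The standalone sketch below shows that pattern on toy data. The column names and the choice of `Ridge` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data: the target depends strongly on lag_1, weakly on lag_7, not on noise
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 3)), columns=["lag_1", "lag_7", "noise"])
y = 3.0 * X["lag_1"] + 0.5 * X["lag_7"] + rng.normal(scale=0.1, size=200)

# clone() gives the pipeline its own unfitted copy, so the template stays untouched
template = Ridge(alpha=1.0)
pipeline = Pipeline([("scaler", StandardScaler()), ("model", clone(template))])
pipeline.fit(X.values, y.values)

# Absolute coefficients on scaled inputs act as a rough importance score
fitted = pipeline.named_steps["model"]
importance = dict(zip(X.columns, np.abs(fitted.coef_).flatten()))
ranked = sorted(importance, key=importance.get, reverse=True)
print(ranked)
```

Because the scaler standardizes each column first, the coefficient magnitudes are comparable across features, which is what makes them usable as a ranking.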

## Data Generation with Debug Tracking

In [4]:
@monitor_performance("Data Generation")
def generate_debug_synthetic_data(n_points: int = 300) -> pd.DataFrame:
    """Generate synthetic data optimized for ML forecasting with debug tracking"""

    logger.info(f"📊 Generating synthetic data with {n_points} points...")

    np.random.seed(42)
    dates = pd.date_range(start='2023-01-01', periods=n_points, freq='D')

    # Base call volume with complex patterns
    trend = np.linspace(8000, 9500, n_points)

    # Multiple seasonal components
    weekly_seasonal = 1000 * np.sin(2 * np.pi * np.arange(n_points) / 7)
    monthly_seasonal = 500 * np.sin(2 * np.pi * np.arange(n_points) / 30)

    # Generate VIX first (needed for regime detection)
    logger.info("  Generating VIX data...")
    base_vix = 18
    vix_values = [base_vix]

    regime_counts = {'low_volatility': 0, 'normal': 0, 'high_volatility': 0, 'extreme_volatility': 0}

    for i in range(1, n_points):
        change = 0.15 * (base_vix - vix_values[-1]) + np.random.normal(0, 1.5)
        if np.random.random() < 0.08:  # 8% chance of volatility spike
            change += np.random.uniform(8, 25)
        new_vix = max(8, min(vix_values[-1] + change, 80))  # Cap at reasonable levels
        vix_values.append(new_vix)

        # Count regimes for validation
        if new_vix < 15:
            regime_counts['low_volatility'] += 1
        elif new_vix < 25:
            regime_counts['normal'] += 1
        elif new_vix < 35:
            regime_counts['high_volatility'] += 1
        else:
            regime_counts['extreme_volatility'] += 1

    vix_series = pd.Series(vix_values, index=dates)

    # Create regime-dependent noise
    logger.info("  Creating regime-dependent patterns...")
    noise_levels = []
    for vix_val in vix_values:
        if vix_val < 15:  # Low volatility
            noise_levels.append(150)
        elif vix_val < 25:  # Normal
            noise_levels.append(250)
        elif vix_val < 35:  # High volatility
            noise_levels.append(400)
        else:  # Extreme volatility
            noise_levels.append(600)

    noise = np.random.normal(0, noise_levels)

    # Day-of-week effects
    dow_effects = np.array([1.3 if d.weekday() < 5 else 0.7 for d in dates])

    # Generate call volume with all components
    call_volume = (
        trend +
        weekly_seasonal +
        monthly_seasonal
    ) * dow_effects + noise

    call_volume = np.maximum(call_volume, 1000)  # Minimum call volume

    # Generate correlated S&P 500 data
    logger.info("  Generating S&P 500 data...")
    sp500_base = 4000
    sp500_values = [sp500_base]

    for i in range(1, n_points):
        vix_effect = -0.001 * (vix_values[i] - 20) / 20  # VIX fear effect
        base_return = 0.0008 + vix_effect + np.random.normal(0, 0.012)
        new_price = sp500_values[-1] * (1 + base_return)
        sp500_values.append(max(new_price, 2000))  # Minimum price floor

    # Assemble the DataFrame
    data = pd.DataFrame({
        'calls': call_volume,
        'vix': vix_values,
        'sp500': sp500_values
    }, index=dates)

    # Add additional market indicators
    data['sp500_volume'] = np.random.gamma(2, 50000000, n_points)
    data['treasury_10y'] = 2.5 + 0.5 * np.sin(2 * np.pi * np.arange(n_points) / 365) + np.random.normal(0, 0.1, n_points)

    # Clean data: fill gaps, then replace any infinities with the column median
    data = data.ffill().bfill()
    data = data.replace([np.inf, -np.inf], np.nan).fillna(data.median())

    # Data quality summary
    logger.info("✅ Data generation complete:")
    logger.info(f"  Shape: {data.shape}")
    logger.info(f"  Call volume range: {data['calls'].min():.0f} - {data['calls'].max():.0f}")
    logger.info(f"  VIX range: {data['vix'].min():.1f} - {data['vix'].max():.1f}")
    logger.info(f"  Regime distribution:")
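    # The counting loop starts at i = 1, so the seed value is not counted; normalize by the counted total
    counted_points = max(1, sum(regime_counts.values()))
    for regime, count in regime_counts.items():
        pct = count / counted_points * 100
        logger.info(f"    {regime}: {pct:.1f}%")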

    return data
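
The VIX series above is a mean-reverting random walk: each step is pulled back toward `base_vix` at rate 0.15 and then perturbed by Gaussian noise. The sketch below isolates that update rule. It deliberately omits the 8% volatility spikes, and the function name and parameters are assumptions chosen to mirror the constants in the generator.

```python
import numpy as np


def simulate_mean_reverting(n_points: int, base: float = 18.0, speed: float = 0.15,
                            sigma: float = 1.5, floor: float = 8.0, cap: float = 80.0,
                            seed: int = 42) -> np.ndarray:
    """Hypothetical helper: discrete mean reversion without the spike component."""
    rng = np.random.default_rng(seed)
    values = [base]
    for _ in range(1, n_points):
        # Pull toward `base`, then add a Gaussian shock
        step = speed * (base - values[-1]) + rng.normal(0.0, sigma)
        # Keep the level inside the same bounds the generator uses
        values.append(float(np.clip(values[-1] + step, floor, cap)))
    return np.array(values)


path = simulate_mean_reverting(2000)
print(f"mean={path.mean():.2f}, min={path.min():.2f}, max={path.max():.2f}")
```

Without the spikes the series hovers around `base`, so in the full generator it is mostly the spike term that pushes the VIX into the higher-volatility regimes.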

## Quick Functionality Test

In [5]:
# Quick test to verify core functionality
def quick_functionality_test():
    """Quick test of core functionality before main execution"""

    logger.info("🧪 Running quick functionality test...")

    try:
        # Test data generation
        test_data = generate_debug_synthetic_data(n_points=50)
        if test_data.empty:
            raise ValueError("Data generation failed")

        logger.info("✅ Functionality test PASSED")
        logger.info(f"  Data: {test_data.shape}")

        return True

    except Exception as e:
        logger.error(f"❌ Functionality test FAILED: {str(e)}")
        return False

# Run the test
test_result = quick_functionality_test()
print(f"\nFunctionality test result: {'PASS' if test_result else 'FAIL'}")


Functionality test result: PASS


## Full Pipeline Execution

**Note:** The complete pipeline implementation includes:
- Model Factory (25+ ML models)
- Feature Engineering (100+ features)
- Market Regime Analysis
- GridSearchCV Optimization
- Comprehensive Evaluation

**To see the full implementation, use the original artifact or implement the remaining classes as needed.**
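
Since the optimization code is not shown in this section, here is a minimal sketch of what a timed, verifiable `GridSearchCV` run with time-ordered folds could look like. The toy data, the `Ridge` model, and the parameter grid are assumptions. The scoring metric and `n_jobs=1` follow the configuration defaults.

```python
import time

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy regression data standing in for the engineered features
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(scale=0.2, size=120)

pipeline = Pipeline([("scaler", StandardScaler()), ("model", Ridge())])
param_grid = {"model__alpha": [0.1, 1.0, 10.0]}

# TimeSeriesSplit always trains on earlier rows and validates on later ones
cv = TimeSeriesSplit(n_splits=3)
search = GridSearchCV(pipeline, param_grid, cv=cv,
                      scoring="neg_mean_absolute_error", n_jobs=1)

start = time.time()
search.fit(X, y)
elapsed = time.time() - start

# Verification: every candidate should have been scored on every fold
n_fits = len(search.cv_results_["params"]) * cv.get_n_splits()
print(f"{n_fits} fits in {elapsed:.2f}s, best params: {search.best_params_}")
```

Comparing the number of scored candidates against the size of the grid is a cheap way to confirm the search actually ran rather than silently falling back to defaults.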

In [6]:
# Example execution (requires full implementation)
print("🎯 ML FORECASTING SYSTEM - READY FOR EXECUTION")
print("=" * 80)
print("")
print("To run the full pipeline:")
print("1. Implement the remaining classes from the full artifact")
print("2. Run: results = run_debug_pipeline()")
print("")
print("Current status:")
print(f"  GPU Available: {GPU_AVAILABLE}")
print(f"  High Memory: {HIGH_MEMORY}")
print(f"  XGBoost: {XGB_AVAILABLE}")
print(f"  LightGBM: {LGB_AVAILABLE}")
print(f"  Test Data Generation: {'✅ PASS' if test_result else '❌ FAIL'}")

🎯 ML FORECASTING SYSTEM - READY FOR EXECUTION

To run the full pipeline:
1. Implement the remaining classes from the full artifact
2. Run: results = run_debug_pipeline()

Current status:
  GPU Available: True
  High Memory: True
  XGBoost: True
  LightGBM: True
  Test Data Generation: ✅ PASS


In [7]:
# ============================================================================
# ML MODEL FACTORY
# ============================================================================

class DebugMLModelFactory:
    """Model factory with comprehensive creation debugging"""

    @staticmethod
    @monitor_performance("Model Creation")
    def create_all_models(config: DebugMLForecastingConfig) -> List[DebugMLForecaster]:
        """Create comprehensive set of ML models with debugging"""

        logger.info("🏭 Creating ML models...")

        all_models = []
        creation_stats = {
            'total_attempted': 0,
            'successful': 0,
            'failed': 0,
            'errors': []
        }

        model_groups = [
            ("Tree Models", config.use_tree_models, DebugMLModelFactory._create_tree_models),
            ("Linear Models", config.use_linear_models, DebugMLModelFactory._create_linear_models),
            ("Neural Models", config.use_neural_models, DebugMLModelFactory._create_neural_models),
            ("Ensemble Models", config.use_ensemble_models, DebugMLModelFactory._create_ensemble_models),
            ("Other Models", config.use_other_models, DebugMLModelFactory._create_other_models)
        ]

        for group_name, enabled, creator_func in model_groups:
            if not enabled:
                logger.info(f"  ⏭️ {group_name}: DISABLED")
                continue

            logger.info(f"  🔨 Creating {group_name}...")

            try:
                group_models = creator_func(config)
                group_successful = 0

                for model in group_models:
                    creation_stats['total_attempted'] += 1

                    try:
                        # Test model creation
                        _ = model.model.get_params()
                        all_models.append(model)
                        group_successful += 1
                        creation_stats['successful'] += 1

                        logger.debug(f"    ✅ {model.name}")

                    except Exception as e:
                        creation_stats['failed'] += 1
                        error_msg = f"{model.name}: {str(e)[:50]}..."
                        creation_stats['errors'].append(error_msg)

                        logger.warning(f"    ❌ {model.name} - {str(e)[:50]}...")

                logger.info(f"    📊 {group_name}: {group_successful}/{len(group_models)} successful")

            except Exception as e:
                error_msg = f"{group_name} creation failed: {str(e)}"
                creation_stats['errors'].append(error_msg)
                logger.error(f"    💥 {group_name} creation failed: {str(e)}")

        # Final summary
        logger.info(f"🎯 Model creation complete:")
        logger.info(f"  Total models: {creation_stats['successful']}/{creation_stats['total_attempted']}")
        logger.info(f"  Success rate: {creation_stats['successful']/max(1,creation_stats['total_attempted'])*100:.1f}%")

        if creation_stats['errors'] and config.debug_mode:
            logger.info(f"  Errors encountered: {len(creation_stats['errors'])}")
            for error in creation_stats['errors'][:3]:  # Show first 3 errors
                logger.debug(f"    {error}")

        if len(all_models) == 0:
            logger.error("No models created successfully! Creating fallback models...")
            all_models = DebugMLModelFactory._create_fallback_models(config)

        return all_models

    @staticmethod
    def _create_tree_models(config: DebugMLForecastingConfig) -> List[DebugMLForecaster]:
        """Create tree-based models"""
        models = []

        # Random Forest variants
        models.extend([
            DebugMLForecaster("RandomForest_100", RandomForestRegressor(n_estimators=100, random_state=42, n_jobs=1), config.debug_mode),
            DebugMLForecaster("RandomForest_200", RandomForestRegressor(n_estimators=200, random_state=42, n_jobs=1), config.debug_mode),
        ])

        # Extra Trees
        models.extend([
            DebugMLForecaster("ExtraTrees_100", ExtraTreesRegressor(n_estimators=100, random_state=42, n_jobs=1), config.debug_mode),
            DebugMLForecaster("ExtraTrees_200", ExtraTreesRegressor(n_estimators=200, random_state=42, n_jobs=1), config.debug_mode),
        ])

        # Gradient Boosting
        models.extend([
            DebugMLForecaster("GradientBoosting_100", GradientBoostingRegressor(n_estimators=100, random_state=42), config.debug_mode),
            DebugMLForecaster("GradientBoosting_200", GradientBoostingRegressor(n_estimators=200, random_state=42), config.debug_mode),
        ])

        # XGBoost models (CPU and GPU)
        if XGB_AVAILABLE:
            models.extend([
                DebugMLForecaster("XGBoost_100", xgb.XGBRegressor(n_estimators=100, random_state=42, n_jobs=1), config.debug_mode),
                DebugMLForecaster("XGBoost_200", xgb.XGBRegressor(n_estimators=200, random_state=42, n_jobs=1), config.debug_mode),
            ])

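            # GPU models if available. Note: XGBoost 2.0+ deprecates tree_method='gpu_hist' and gpu_id
            # in favor of tree_method='hist' with device='cuda'.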
            if XGB_GPU_AVAILABLE and HIGH_MEMORY:
                models.extend([
                    DebugMLForecaster("XGBoost_GPU_200", xgb.XGBRegressor(n_estimators=200, tree_method='gpu_hist', gpu_id=0, random_state=42), config.debug_mode),
                    DebugMLForecaster("XGBoost_GPU_500", xgb.XGBRegressor(n_estimators=500, tree_method='gpu_hist', gpu_id=0, random_state=42), config.debug_mode),
                ])

        # LightGBM models
        if LGB_AVAILABLE:
            models.extend([
                DebugMLForecaster("LightGBM_100", lgb.LGBMRegressor(n_estimators=100, random_state=42, verbose=-1, n_jobs=1), config.debug_mode),
                DebugMLForecaster("LightGBM_200", lgb.LGBMRegressor(n_estimators=200, random_state=42, verbose=-1, n_jobs=1), config.debug_mode),
            ])

            # GPU models if available
            if LGB_GPU_AVAILABLE and HIGH_MEMORY:
                models.extend([
                    DebugMLForecaster("LightGBM_GPU_200", lgb.LGBMRegressor(n_estimators=200, device='gpu', random_state=42, verbose=-1), config.debug_mode),
                    DebugMLForecaster("LightGBM_GPU_500", lgb.LGBMRegressor(n_estimators=500, device='gpu', random_state=42, verbose=-1), config.debug_mode),
                ])

        # Decision Tree
        models.append(
            DebugMLForecaster("DecisionTree", DecisionTreeRegressor(random_state=42, max_depth=10), config.debug_mode)
        )

        return models

    @staticmethod
    def _create_linear_models(config: DebugMLForecastingConfig) -> List[DebugMLForecaster]:
        """Create linear models"""
        models = []

        # Basic linear models
        models.extend([
            DebugMLForecaster("LinearRegression", LinearRegression(), config.debug_mode),
            DebugMLForecaster("Ridge_0.1", Ridge(alpha=0.1), config.debug_mode),
            DebugMLForecaster("Ridge_1.0", Ridge(alpha=1.0), config.debug_mode),
            DebugMLForecaster("Ridge_10.0", Ridge(alpha=10.0), config.debug_mode),
            DebugMLForecaster("Lasso_0.1", Lasso(alpha=0.1, max_iter=2000), config.debug_mode),
            DebugMLForecaster("Lasso_1.0", Lasso(alpha=1.0, max_iter=2000), config.debug_mode),
            DebugMLForecaster("ElasticNet_0.1", ElasticNet(alpha=0.1, max_iter=2000), config.debug_mode),
            DebugMLForecaster("ElasticNet_1.0", ElasticNet(alpha=1.0, max_iter=2000), config.debug_mode),
        ])

        # Bayesian models
        models.extend([
            DebugMLForecaster("BayesianRidge", BayesianRidge(), config.debug_mode),
            DebugMLForecaster("ARDRegression", ARDRegression(max_iter=500), config.debug_mode),
        ])

        # Robust models
        models.extend([
            DebugMLForecaster("HuberRegressor", HuberRegressor(max_iter=200), config.debug_mode),
            DebugMLForecaster("SGDRegressor", SGDRegressor(random_state=42, max_iter=2000), config.debug_mode),
        ])

        return models

    @staticmethod
    def _create_neural_models(config: DebugMLForecastingConfig) -> List[DebugMLForecaster]:
        """Create neural network models"""
        models = []

        # MLPRegressor variants with proper parameters
        models.extend([
            DebugMLForecaster("MLP_50", MLPRegressor(
                hidden_layer_sizes=(50,),
                random_state=42,
                max_iter=1000,
                early_stopping=True,
                validation_fraction=0.1,
                n_iter_no_change=10
            ), config.debug_mode),
            DebugMLForecaster("MLP_100_50", MLPRegressor(
                hidden_layer_sizes=(100, 50),
                random_state=42,
                max_iter=1000,
                early_stopping=True,
                validation_fraction=0.1,
                n_iter_no_change=10
            ), config.debug_mode),
            DebugMLForecaster("MLP_200_100", MLPRegressor(
                hidden_layer_sizes=(200, 100),
                random_state=42,
                max_iter=1000,
                early_stopping=True,
                validation_fraction=0.1,
                n_iter_no_change=10
            ), config.debug_mode),
        ])

        return models

    @staticmethod
    def _create_ensemble_models(config: DebugMLForecastingConfig) -> List[DebugMLForecaster]:
        """Create ensemble models"""
        models = []

        # Bagging models
        models.extend([
            DebugMLForecaster("BaggingRegressor", BaggingRegressor(random_state=42, n_jobs=1), config.debug_mode),
            DebugMLForecaster("AdaBoostRegressor", AdaBoostRegressor(random_state=42, n_estimators=50), config.debug_mode),
        ])

        # Voting ensemble
        try:
            voting_models = [
                ('rf', RandomForestRegressor(n_estimators=50, random_state=42, n_jobs=1)),
                ('ridge', Ridge(alpha=1.0)),
                ('tree', DecisionTreeRegressor(random_state=42, max_depth=10))
            ]
            models.append(
                DebugMLForecaster("VotingRegressor", VotingRegressor(estimators=voting_models, n_jobs=1), config.debug_mode)
            )
        except Exception as e:
            logger.warning(f"Could not create VotingRegressor: {str(e)}")

        return models

    @staticmethod
    def _create_other_models(config: DebugMLForecastingConfig) -> List[DebugMLForecaster]:
        """Create other ML models"""
        models = []

        # Support Vector Regression
        models.extend([
            DebugMLForecaster("SVR_linear", SVR(kernel='linear', C=1.0), config.debug_mode),
            DebugMLForecaster("SVR_rbf", SVR(kernel='rbf', C=1.0), config.debug_mode),
        ])

        # K-Nearest Neighbors
        models.extend([
            DebugMLForecaster("KNeighbors_5", KNeighborsRegressor(n_neighbors=5), config.debug_mode),
            DebugMLForecaster("KNeighbors_10", KNeighborsRegressor(n_neighbors=10), config.debug_mode),
        ])

        # Kernel Ridge
        models.append(
            DebugMLForecaster("KernelRidge", KernelRidge(alpha=1.0), config.debug_mode)
        )

        return models

    @staticmethod
    def _create_fallback_models(config: DebugMLForecastingConfig) -> List[DebugMLForecaster]:
        """Create basic fallback models if all others fail"""
        logger.warning("Creating fallback models...")

        return [
            DebugMLForecaster("Fallback_LinearRegression", LinearRegression(), config.debug_mode),
            DebugMLForecaster("Fallback_Ridge", Ridge(alpha=1.0), config.debug_mode),
            DebugMLForecaster("Fallback_RandomForest", RandomForestRegressor(n_estimators=50, random_state=42, n_jobs=1), config.debug_mode)
        ]

print("DebugMLModelFactory loaded")


DebugMLModelFactory loaded


In [8]:
# ============================================================================
# FEATURE ENGINEERING
# ============================================================================

class DebugFeatureEngineer:
    """Advanced feature engineering with comprehensive debugging"""

    def __init__(self, config: DebugMLForecastingConfig):
        self.config = config
        self.feature_names_ = []
        self.feature_creation_log = []

    @monitor_performance("Feature Engineering")
    def create_features(self, data: pd.DataFrame, regime_data: Optional[pd.Series] = None) -> pd.DataFrame:
        """Create comprehensive feature set with debugging"""

        logger.info(f"🛠️ Starting feature engineering on data shape: {data.shape}")

        # Validate input
        if data.empty:
            raise ValueError("Empty input data for feature engineering")

        target_col = self.config.target_column
        if target_col not in data.columns:
            raise ValueError(f"Target column '{target_col}' not found in data")

        # Initialize features dataframe
        features_df = pd.DataFrame(index=data.index)
        target_series = data[target_col].copy()

        # Log initial data quality
        nan_count = target_series.isna().sum()
        logger.info(f"Target series: {len(target_series)} points, {nan_count} NaN values")

        # Feature creation steps with individual error handling
        feature_steps = [
            ("Lagged Features", self._add_lagged_features, target_series),
            ("Rolling Features", self._add_rolling_features, target_series),
            ("Technical Indicators", self._add_technical_indicators, target_series),
            ("Calendar Features", self._add_calendar_features, None),
            ("Statistical Features", self._add_statistical_features, target_series)
        ]

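        # Add market features if available. _add_market_features handles 'vix' and
        # 'sp500' independently, so either column is enough to schedule the step.
        if 'vix' in data.columns or 'sp500' in data.columns: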
            feature_steps.append(("Market Features", self._add_market_features, data))

        # Add regime features if available
        if self.config.use_regime_features and regime_data is not None:
            feature_steps.append(("Regime Features", self._add_regime_features, regime_data))

        # Execute feature creation steps
        for step_name, step_func, step_data in feature_steps:
            initial_count = len(features_df.columns)
            step_start = time.time()

            try:
                logger.info(f"  🔧 Creating {step_name}...")

                if step_name == "Calendar Features":
                    features_df = step_func(features_df)
                else:
                    features_df = step_func(features_df, step_data)

                added_count = len(features_df.columns) - initial_count
                step_time = time.time() - step_start

                logger.info(f"    ✅ {step_name}: +{added_count} features ({step_time:.2f}s)")
                self.feature_creation_log.append({
                    'step': step_name,
                    'features_added': added_count,
                    'time': step_time,
                    'status': 'success'
                })

            except Exception as e:
                step_time = time.time() - step_start
                logger.error(f"    ❌ {step_name} failed: {str(e)}")
                self.feature_creation_log.append({
                    'step': step_name,
                    'features_added': 0,
                    'time': step_time,
                    'status': 'failed',
                    'error': str(e)
                })

        # Clean and validate features
        logger.info(f"  🧹 Cleaning features...")
        features_df = self._clean_features(features_df)

        self.feature_names_ = list(features_df.columns)

        # Log final feature summary
        logger.info(f"✅ Feature engineering complete:")
        logger.info(f"  Final shape: {features_df.shape}")
        logger.info(f"  Features created: {len(self.feature_names_)}")

        return features_df

    def _add_lagged_features(self, df: pd.DataFrame, series: pd.Series) -> pd.DataFrame:
        """Add lagged features with validation"""
        max_lags = min(self.config.max_lags, len(series) // 4)
        logger.debug(f"Creating {max_lags} lag features")

        for lag in range(1, max_lags + 1):
            df[f'lag_{lag}'] = series.shift(lag)

        return df

    def _add_rolling_features(self, df: pd.DataFrame, series: pd.Series) -> pd.DataFrame:
        """Add rolling window features with validation"""
        for window in self.config.rolling_windows:
            if window < len(series):
                df[f'rolling_mean_{window}'] = series.rolling(window=window, min_periods=1).mean()
                df[f'rolling_std_{window}'] = series.rolling(window=window, min_periods=1).std()
                df[f'rolling_min_{window}'] = series.rolling(window=window, min_periods=1).min()
                df[f'rolling_max_{window}'] = series.rolling(window=window, min_periods=1).max()

        return df

    def _add_technical_indicators(self, df: pd.DataFrame, series: pd.Series) -> pd.DataFrame:
        """Add technical indicators with error handling"""
        try:
            # RSI calculation
            for period in [7, 14]:
                if period < len(series):
                    delta = series.diff()
                    gain = delta.where(delta > 0, 0).rolling(window=period, min_periods=1).mean()
                    loss = (-delta.where(delta < 0, 0)).rolling(window=period, min_periods=1).mean()
                    rs = gain / (loss + 1e-8)
                    df[f'rsi_{period}'] = 100 - (100 / (1 + rs))

            # Moving average ratios
            if len(series) > 12:
                ma_short = series.rolling(window=5, min_periods=1).mean()
                ma_long = series.rolling(window=12, min_periods=1).mean()
                df['ma_ratio'] = ma_short / (ma_long + 1e-8)

            # Momentum indicators
            for period in [3, 7]:
                if period < len(series):
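                    # Note: these two columns are the same quantity (period return);
                    # momentum only differs by the 1e-8 guard against division by zero
                    df[f'momentum_{period}'] = series / (series.shift(period) + 1e-8) - 1
                    df[f'rate_of_change_{period}'] = series.pct_change(periods=period)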

        except Exception as e:
            logger.warning(f"Technical indicators creation failed: {str(e)}")

        return df

    def _add_calendar_features(self, df: pd.DataFrame) -> pd.DataFrame:
        """Add calendar features with validation"""
        try:
            if isinstance(df.index, pd.DatetimeIndex):
                df['dow'] = df.index.dayofweek
                df['is_weekend'] = (df['dow'] >= 5).astype(int)
                df['month'] = df.index.month
                df['quarter'] = df.index.quarter
                df['day_of_month'] = df.index.day
                df['day_of_year'] = df.index.dayofyear

                # Cyclical encoding
                df['dow_sin'] = np.sin(2 * np.pi * df['dow'] / 7)
                df['dow_cos'] = np.cos(2 * np.pi * df['dow'] / 7)
                df['month_sin'] = np.sin(2 * np.pi * df['month'] / 12)
                df['month_cos'] = np.cos(2 * np.pi * df['month'] / 12)
            else:
                # Position-based features for non-datetime index
                df['position'] = np.arange(len(df))
                df['position_sin'] = np.sin(2 * np.pi * df['position'] / 7)
                df['position_cos'] = np.cos(2 * np.pi * df['position'] / 7)

        except Exception as e:
            logger.warning(f"Calendar features creation failed: {str(e)}")

        return df

    def _add_market_features(self, df: pd.DataFrame, data: pd.DataFrame) -> pd.DataFrame:
        """Add market features with validation"""
        try:
            if 'vix' in data.columns:
                df['vix'] = data['vix']
                df['vix_lag1'] = data['vix'].shift(1)
                df['vix_change'] = data['vix'].diff()
                if len(data['vix']) > 7:
                    df['vix_rolling_7'] = data['vix'].rolling(7, min_periods=1).mean()

            if 'sp500' in data.columns:
                df['sp500_return'] = data['sp500'].pct_change()
                if len(data['sp500']) > 7:
                    df['sp500_volatility'] = data['sp500'].pct_change().rolling(7, min_periods=1).std()

        except Exception as e:
            logger.warning(f"Market features creation failed: {str(e)}")

        return df

    def _add_regime_features(self, df: pd.DataFrame, regime_data: pd.Series) -> pd.DataFrame:
        """Add regime features with validation"""
        try:
            aligned_regimes = regime_data.reindex(df.index, method='ffill')
            regime_dummies = pd.get_dummies(aligned_regimes, prefix='regime')
            regime_dummies.index = df.index
            df = pd.concat([df, regime_dummies], axis=1)

            # Regime duration
            regime_changes = aligned_regimes != aligned_regimes.shift(1)
            regime_groups = regime_changes.cumsum()
            df['regime_duration'] = regime_groups.groupby(regime_groups).cumcount() + 1

        except Exception as e:
            logger.warning(f"Regime features creation failed: {str(e)}")

        return df

    def _add_statistical_features(self, df: pd.DataFrame, series: pd.Series) -> pd.DataFrame:
        """Add statistical features"""
        try:
            # Z-scores
            for window in [7, 14]:
                if window < len(series):
                    rolling_mean = series.rolling(window, min_periods=1).mean()
                    rolling_std = series.rolling(window, min_periods=1).std()
                    df[f'zscore_{window}'] = (series - rolling_mean) / (rolling_std + 1e-8)

            # Distance from moving averages
            for window in [7, 14]:
                if window < len(series):
                    ma = series.rolling(window, min_periods=1).mean()
                    df[f'distance_from_ma_{window}'] = (series - ma) / (ma + 1e-8)

        except Exception as e:
            logger.warning(f"Statistical features creation failed: {str(e)}")

        return df

    def _clean_features(self, df: pd.DataFrame) -> pd.DataFrame:
        """Clean and validate features - FIXED for modern pandas"""
        initial_shape = df.shape

        # Replace infinite values
        df = df.replace([np.inf, -np.inf], np.nan)

        # Drop columns with all NaN
        df = df.dropna(axis=1, how='all')

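        # Drop rows with too many NaN values: keep rows with at least half of the
        # columns populated (thresh is documented as an integer count of non-NaN values)
        threshold = int(np.ceil(len(df.columns) * 0.5))
        df = df.dropna(thresh=threshold)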

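        # Use DataFrame.ffill()/bfill() rather than the deprecated fillna(method=...).
        # Caution: bfill copies later values into the leading rows (e.g. the first
        # rows of each lag column), which is a mild look-ahead for those rows.
        df = df.ffill()  # Forward fill
        df = df.bfill()  # Backward fill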

        # Final cleanup
        df = df.fillna(0)

        if self.config.debug_mode:
            logger.debug(f"Feature cleaning: {initial_shape} → {df.shape}")

        return df

    def create_target_from_features(self, features_df: pd.DataFrame,
                                   original_data: pd.DataFrame,
                                   forecast_horizon: int = 1) -> Tuple[pd.DataFrame, pd.Series]:
        """Create target variable aligned with features"""

        logger.info(f"🎯 Creating target variable with forecast horizon: {forecast_horizon}")

        target_col = self.config.target_column

        if target_col not in original_data.columns:
            raise ValueError(f"Target column '{target_col}' not found in original data")

        # Create target with forecast horizon
        target_series = original_data[target_col].shift(-forecast_horizon)

        # Align indices
        common_index = features_df.index.intersection(target_series.index)

        if len(common_index) == 0:
            raise ValueError("No common index between features and target")

        aligned_features = features_df.loc[common_index]
        aligned_target = target_series.loc[common_index]

        # Remove rows where target is NaN
        valid_mask = ~aligned_target.isna()

        final_features = aligned_features[valid_mask]
        final_target = aligned_target[valid_mask]

        if len(final_features) == 0:
            raise ValueError("No valid samples after alignment")

        logger.info(f"✅ Target alignment complete: Features {final_features.shape}, Target {final_target.shape}")

        return final_features, final_target

print("DebugFeatureEngineer loaded")


DebugFeatureEngineer loaded
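
The lag features and the shifted target built in the cell above can be hard to picture. The following is a minimal plain-Python sketch of the same alignment, not the notebook's pandas implementation. `make_supervised` is a hypothetical helper introduced only for illustration, and unlike `_clean_features` it simply drops the leading rows that lack a full lag history instead of filling them.

```python
from typing import List, Tuple


def make_supervised(series: List[float], n_lags: int,
                    horizon: int = 1) -> Tuple[List[List[float]], List[float]]:
    """Turn a 1-D series into (lag-feature rows, targets).

    The row for time t holds [y[t-1], ..., y[t-n_lags]], which corresponds to
    series.shift(k) for k = 1..n_lags. Its target is y[t + horizon], which
    corresponds to series.shift(-horizon).
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must both be >= 1")

    rows: List[List[float]] = []
    targets: List[float] = []
    # Start at n_lags so every lag exists; stop early so the target exists.
    for t in range(n_lags, len(series) - horizon):
        rows.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t + horizon])
    return rows, targets


# Illustrative data only
series = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
X, y = make_supervised(series, n_lags=2, horizon=1)
for row, target in zip(X, y):
    print(row, "->", target)
```

Each feature row uses only values strictly before time t, while the target lies `horizon` steps after t, so no future information reaches the features in this sketch.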


In [9]:
# ============================================================================
# GRIDSEARCHCV OPTIMIZER
# ============================================================================

def get_verified_param_grids(config: DebugMLForecastingConfig) -> Dict[str, Dict]:
    """Get VERIFIED parameter grids that will actually trigger GridSearchCV"""

    logger.info(f"🎛️ Creating parameter grids (detailed_search: {config.detailed_search})")

    if config.detailed_search and HIGH_MEMORY:
        # Comprehensive parameter grids for high-memory systems
        param_grids = {
            'RandomForest': {
                'model__n_estimators': [100, 200, 500],
                'model__max_depth': [10, 20, None],
                'model__min_samples_split': [2, 5, 10],
                'model__min_samples_leaf': [1, 2, 4]
            },
            'XGBoost': {
                'model__n_estimators': [100, 200, 500],
                'model__max_depth': [3, 4, 5, 6],
                'model__learning_rate': [0.01, 0.1, 0.2],
                'model__subsample': [0.8, 1.0],
                'model__reg_alpha': [0, 0.1, 1]
            },
            'LightGBM': {
                'model__n_estimators': [100, 200, 500],
                'model__max_depth': [3, 5, 10, -1],
                'model__learning_rate': [0.01, 0.1, 0.2],
                'model__num_leaves': [20, 31, 50],
                'model__reg_alpha': [0, 0.1, 1]
            },
            'Ridge': {
                'model__alpha': [0.1, 1.0, 10.0, 100.0]
            },
            'SVR': {
                'model__C': [0.1, 1.0, 10.0],
                'model__kernel': ['linear', 'rbf'],
                'model__epsilon': [0.01, 0.1]
            },
            'MLP': {
                'model__hidden_layer_sizes': [(50,), (100,), (100, 50)],
                'model__alpha': [0.001, 0.01, 0.1],
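                # Note: MLPRegressor only uses learning_rate when solver='sgd'; with the
                # default 'adam' solver this entry doubles the fits without changing the model
                'model__learning_rate': ['constant', 'adaptive'],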
                'model__max_iter': [1000, 2000]
            }
        }
    else:
        # Quick search grids for faster execution
        param_grids = {
            'RandomForest': {
                'model__n_estimators': [50, 100, 200],
                'model__max_depth': [10, None],
                'model__min_samples_split': [2, 5]
            },
            'XGBoost': {
                'model__n_estimators': [50, 100],
                'model__max_depth': [3, 6],
                'model__learning_rate': [0.1, 0.2]
            },
            'Ridge': {
                'model__alpha': [0.1, 1.0, 10.0]
            },
            'SVR': {
                'model__C': [0.1, 1.0, 10.0],
                'model__kernel': ['linear', 'rbf']
            },
            'MLP': {
                'model__hidden_layer_sizes': [(50,), (100,)],
                'model__alpha': [0.001, 0.01]
            }
        }

    # Calculate and log total parameter combinations
    total_combinations = 0
    for model_type, grid in param_grids.items():
        combinations = 1
        for param_values in grid.values():
            combinations *= len(param_values)
        total_combinations += combinations

        logger.info(f"  {model_type}: {combinations} parameter combinations")

    expected_gridsearch_fits = total_combinations * config.cv_splits
    expected_time_minutes = expected_gridsearch_fits * 0.5 / 60  # Rough estimate

    logger.info(f"📊 Parameter grid summary:")
    logger.info(f"  Total combinations: {total_combinations}")
    logger.info(f"  With {config.cv_splits}-fold CV: {expected_gridsearch_fits} total fits")
    logger.info(f"  Estimated GridSearch time: {expected_time_minutes:.1f} minutes")

    return param_grids

class VerifiedGridSearchOptimizer:
    """GridSearchCV optimizer with VERIFIED execution and comprehensive debugging"""

    def __init__(self, config: DebugMLForecastingConfig):
        self.config = config
        self.param_grids = get_verified_param_grids(config)
        self.optimization_log = []

    @monitor_performance("VP GridSearch Optimization")
    def optimize_top_models(self, v2_predictions: Dict[str, np.ndarray],
                           X_train: pd.DataFrame, y_train: pd.Series,
                           X_test: pd.DataFrame, y_test: pd.Series) -> Dict[str, np.ndarray]:
        """Optimize hyperparameters with VERIFIED GridSearchCV execution"""

        logger.info("🚀 Starting VERIFIED VP GridSearchCV optimization...")

        if len(v2_predictions) == 0:
            logger.error("❌ No V2 predictions provided for optimization")
            return {}

        try:
            # CRITICAL: Rank V2 models by performance with detailed logging
            logger.info("  📊 Ranking V2 models by performance...")
            v2_performance = {}

            for model_name, predictions in v2_predictions.items():
                try:
                    if len(predictions) == 0:
                        logger.warning(f"    {model_name}: Empty predictions")
                        v2_performance[model_name] = float('inf')
                        continue

                    # Ensure proper alignment
                    pred_len = min(len(predictions), len(y_test))
                    if pred_len == 0:
                        logger.warning(f"    {model_name}: No overlapping predictions")
                        v2_performance[model_name] = float('inf')
                        continue

                    mae = mean_absolute_error(y_test.iloc[:pred_len], predictions[:pred_len])
                    v2_performance[model_name] = mae

                    logger.debug(f"    {model_name}: MAE = {mae:.3f}")

                except Exception as e:
                    logger.warning(f"    {model_name}: Performance calculation failed - {str(e)}")
                    v2_performance[model_name] = float('inf')

            # Filter out infinite performance scores
            valid_performance = {k: v for k, v in v2_performance.items() if not np.isinf(v)}

            logger.info(f"  Valid models for optimization: {len(valid_performance)}/{len(v2_predictions)}")

            if not valid_performance:
                logger.error("❌ CRITICAL: No valid V2 models for optimization!")
                return v2_predictions

            # Select top models
            top_models = sorted(valid_performance.items(), key=lambda x: x[1])[:self.config.top_models_for_optimization]

            logger.info(f"📈 Top {len(top_models)} models selected for GridSearchCV:")
            for i, (model_name, mae) in enumerate(top_models, 1):
                logger.info(f"  {i}. {model_name}: MAE = {mae:.3f}")

            vp_predictions = v2_predictions.copy()
            gridsearch_count = 0
            successful_optimizations = 0

            # CRITICAL: Actually run GridSearchCV on each top model
            logger.info(f"\n🔥 STARTING GRIDSEARCHCV EXECUTION...")
            gridsearch_start_time = time.time()

            for i, (model_name, baseline_mae) in enumerate(top_models, 1):
                logger.info(f"\n🎯 [{i}/{len(top_models)}] Optimizing {model_name}...")
                logger.info(f"  Baseline MAE: {baseline_mae:.3f}")

                try:
                    optimized_predictions = self._run_verified_gridsearch(
                        model_name, X_train, y_train, X_test, baseline_mae
                    )

                    if optimized_predictions is not None:
                        vp_predictions[f"{model_name}_VP_optimized"] = optimized_predictions
                        successful_optimizations += 1
                        gridsearch_count += 1
                        logger.info(f"  ✅ {model_name} optimization completed")
                    else:
                        logger.warning(f"  ⚠️ {model_name} optimization failed")

                except Exception as e:
                    logger.error(f"  ❌ {model_name} optimization error: {str(e)}")

            total_gridsearch_time = time.time() - gridsearch_start_time

            # VERIFICATION: Check that GridSearchCV actually ran
            logger.info(f"\n🏁 VP OPTIMIZATION SUMMARY:")
            logger.info(f"  Models attempted: {len(top_models)}")
            logger.info(f"  GridSearchCV runs completed: {gridsearch_count}")
            logger.info(f"  Successful optimizations: {successful_optimizations}")
            logger.info(f"  Total GridSearchCV time: {total_gridsearch_time:.1f} seconds")

            # CRITICAL VERIFICATION
            if gridsearch_count == 0:
                logger.error("🚨 CRITICAL ERROR: NO GRIDSEARCHCV RUNS COMPLETED!")
            elif total_gridsearch_time < 30:
                logger.warning(f"⚠️ WARNING: GridSearchCV completed very quickly ({total_gridsearch_time:.1f}s)")
            else:
                logger.info(f"✅ VERIFIED: GridSearchCV ran successfully for {total_gridsearch_time:.1f} seconds")

            return vp_predictions

        except Exception as e:
            logger.error(f"❌ VP optimization pipeline failed: {str(e)}")
            return v2_predictions

    def _run_verified_gridsearch(self, model_name: str, X_train: pd.DataFrame, y_train: pd.Series,
                                X_test: pd.DataFrame, baseline_mae: float) -> Optional[np.ndarray]:
        """Run a single GridSearchCV with comprehensive verification"""

        try:
            # Determine model type and get parameter grid
            model_type = self._get_model_type(model_name)

            if model_type not in self.param_grids:
                logger.warning(f"    No parameter grid for {model_type}")
                return None

            param_grid = self.param_grids[model_type]

            # Calculate expected number of fits
            total_combinations = 1
            for param_values in param_grid.values():
                total_combinations *= len(param_values)

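            # Use the same clamped split count that TimeSeriesSplit receives below,
            # so the expected figure matches what GridSearchCV will actually run
            expected_splits = max(2, min(self.config.cv_splits, len(X_train) // 50))
            expected_fits = total_combinations * expected_splits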

            logger.info(f"    Parameter combinations: {total_combinations}")
            logger.info(f"    Expected CV fits: {expected_fits}")

            # Create base model
            base_model = self._create_base_model(model_type)
            if base_model is None:
                logger.warning(f"    Could not create base model for {model_type}")
                return None

            # Create pipeline
            pipeline = Pipeline([
                ('scaler', StandardScaler()),
                ('model', base_model)
            ])

            # Setup TimeSeriesSplit with validation
            n_splits = max(2, min(self.config.cv_splits, len(X_train) // 50))
            tscv = TimeSeriesSplit(n_splits=n_splits)

            logger.info(f"    Using {n_splits}-fold TimeSeriesSplit")

            # Setup GridSearchCV with comprehensive logging
            grid_search = GridSearchCV(
                pipeline,
                param_grid=param_grid,
                cv=tscv,
                scoring=self.config.scoring_metric,
                n_jobs=1,  # Single job for debugging visibility
                verbose=self.config.gridsearch_verbose,  # Show progress
                error_score='raise'
            )

            # CRITICAL: Actually run GridSearchCV with timing
            logger.info(f"    🚀 STARTING GridSearchCV for {model_name}...")
            gridsearch_start = time.time()

            # Clean input data
            X_train_clean = X_train.fillna(0).replace([np.inf, -np.inf], 0)
            y_train_clean = y_train.fillna(y_train.mean())

            grid_search.fit(X_train_clean, y_train_clean)

            gridsearch_end = time.time()
            gridsearch_duration = gridsearch_end - gridsearch_start

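            # VERIFICATION: Check that GridSearchCV actually ran.
            # cv_results_['params'] has one entry per candidate, so multiply by the
            # number of CV splits to get the number of fits (excluding the final refit)
            n_candidates = len(grid_search.cv_results_['params'])
            actual_fits = n_candidates * grid_search.n_splits_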

            logger.info(f"    ✅ GridSearchCV completed in {gridsearch_duration:.1f} seconds")
            logger.info(f"    Expected fits: {expected_fits}, Actual fits: {actual_fits}")
            logger.info(f"    Best score: {-grid_search.best_score_:.4f}")
            logger.info(f"    Best params: {grid_search.best_params_}")

            # Make predictions with optimized model
            X_test_clean = X_test.fillna(0).replace([np.inf, -np.inf], 0)
            predictions = grid_search.predict(X_test_clean)

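            # Log optimization improvement.
            # Note: best_score_ is negated on the assumption that scoring_metric is a
            # "neg_*" error metric (e.g. neg_mean_absolute_error). It is a CV score on
            # the training folds, whereas baseline_mae was measured on the test set,
            # so the improvement figure is only a rough indication.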
            optimized_mae = -grid_search.best_score_
            improvement = ((baseline_mae - optimized_mae) / baseline_mae) * 100

            logger.info(f"    📈 Optimization result:")
            logger.info(f"      Baseline MAE: {baseline_mae:.4f}")
            logger.info(f"      Optimized MAE: {optimized_mae:.4f}")
            logger.info(f"      Improvement: {improvement:+.2f}%")

            # Store optimization log
            self.optimization_log.append({
                'model_name': model_name,
                'model_type': model_type,
                'baseline_mae': baseline_mae,
                'optimized_mae': optimized_mae,
                'improvement_pct': improvement,
                'gridsearch_duration': gridsearch_duration,
                'expected_fits': expected_fits,
                'actual_fits': actual_fits,
                'best_params': grid_search.best_params_
            })

            return predictions

        except Exception as e:
            logger.error(f"    ❌ GridSearchCV failed for {model_name}: {str(e)}")
            return None

    def _get_model_type(self, model_name: str) -> str:
        """Extract model type from model name with comprehensive matching"""

        name_lower = model_name.lower()

        if 'randomforest' in name_lower:
            return 'RandomForest'
        elif 'xgboost' in name_lower and XGB_AVAILABLE:
            return 'XGBoost'
        elif 'lightgbm' in name_lower and LGB_AVAILABLE:
            return 'LightGBM'
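        # 'BayesianRidge' and 'KernelRidge' also contain 'ridge' but are different
        # estimators, so they must not be tuned with the plain Ridge grid
        elif 'ridge' in name_lower and not any(k in name_lower for k in ('bayesian', 'kernel')):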
            return 'Ridge'
        elif 'svr' in name_lower:
            return 'SVR'
        elif 'mlp' in name_lower:
            return 'MLP'
        else:
            return 'Unknown'

    def _create_base_model(self, model_type: str):
        """Create base model for optimization with error handling"""

        try:
            if model_type == 'RandomForest':
                return RandomForestRegressor(random_state=42, n_jobs=1)
            elif model_type == 'XGBoost' and XGB_AVAILABLE:
                return xgb.XGBRegressor(random_state=42, n_jobs=1)
            elif model_type == 'LightGBM' and LGB_AVAILABLE:
                return lgb.LGBMRegressor(random_state=42, verbose=-1, n_jobs=1)
            elif model_type == 'Ridge':
                return Ridge()
            elif model_type == 'SVR':
                return SVR()
            elif model_type == 'MLP':
                return MLPRegressor(random_state=42, max_iter=1000, early_stopping=True, validation_fraction=0.1)
            else:
                return None
        except Exception as e:
            logger.warning(f"Could not create {model_type}: {str(e)}")
            return None

    def get_optimization_summary(self) -> pd.DataFrame:
        """Get comprehensive optimization summary"""

        if not self.optimization_log:
            return pd.DataFrame()

        df = pd.DataFrame(self.optimization_log)
        return df.sort_values('improvement_pct', ascending=False)

print("VerifiedGridSearchOptimizer loaded")


VerifiedGridSearchOptimizer loaded
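
To make the "expected fits" figure logged by the optimizer concrete, here is a small standard-library sketch of the arithmetic: the number of candidates is the product of the grid's value-list lengths, and the number of cross-validation fits is that count times the number of splits. `count_candidates`, `clamp_splits`, and `expanding_splits` are illustrative names, and `cv_splits=5` is an assumed configuration value. `expanding_splits` is intended to mirror the default expanding-window behaviour of scikit-learn's `TimeSeriesSplit` (no gap, no maximum train size); treat it as an approximation rather than a replacement.

```python
from itertools import product
from typing import Dict, Iterator, List, Sequence, Tuple


def count_candidates(param_grid: Dict[str, Sequence]) -> int:
    """Number of parameter combinations in a single grid."""
    return sum(1 for _ in product(*param_grid.values()))


def clamp_splits(cv_splits: int, n_samples: int, rows_per_split: int = 50) -> int:
    """Same clamp as the optimizer: at least 2 splits, at most n_samples // 50."""
    return max(2, min(cv_splits, n_samples // rows_per_split))


def expanding_splits(n_samples: int, n_splits: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx); each fold trains on all rows before its test block."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("too few samples for the requested number of splits")
    first_test = n_samples - n_splits * test_size
    for start in range(first_test, n_samples, test_size):
        yield list(range(start)), list(range(start, start + test_size))


# The quick-search RandomForest grid from the cell above
quick_rf_grid = {
    "model__n_estimators": [50, 100, 200],
    "model__max_depth": [10, None],
    "model__min_samples_split": [2, 5],
}

n_candidates = count_candidates(quick_rf_grid)        # 3 * 2 * 2
n_splits = clamp_splits(cv_splits=5, n_samples=180)   # 180 // 50 caps the split count
print(f"{n_candidates} candidates x {n_splits} splits = {n_candidates * n_splits} fits")

folds = list(expanding_splits(n_samples=10, n_splits=3))
for train_idx, test_idx in folds:
    print(f"train 0..{train_idx[-1]}  test {test_idx}")
```

Because every training window ends before its test block begins, no fold is scored on data that precedes its training rows, which is why an expanding split is preferred over shuffled K-fold for time series.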


In [10]:
# ============================================================================
# MARKET REGIME ANALYZER
# ============================================================================

class DebugMarketRegimeAnalyzer:
    """Market regime analysis with comprehensive debugging"""

    def __init__(self, config: DebugMLForecastingConfig):
        self.config = config
        self.vix_thresholds = config.vix_thresholds
        self.regime_stats = None

    @monitor_performance("Market Regime Analysis")
    def analyze_regimes(self, data: pd.DataFrame) -> Dict[str, Any]:
        """Comprehensive market regime analysis with debugging"""

        logger.info("📈 Analyzing market regimes...")

        # Get or simulate VIX data
        if 'vix' in data.columns:
            vix_series = data['vix'].copy()
            logger.info(f"  Using actual VIX data: {len(vix_series)} points")
        else:
            vix_series = self._simulate_vix_data(len(data))
            vix_series.index = data.index
            logger.info(f"  Using simulated VIX data: {len(vix_series)} points")

        # Clean VIX data
        initial_nan_count = vix_series.isna().sum()
        vix_series = vix_series.ffill().bfill().fillna(20.0)

        if initial_nan_count > 0:
            logger.info(f"  Cleaned {initial_nan_count} NaN values in VIX data")

        # Classify regimes
        regimes = vix_series.apply(self.classify_market_regime)

        # Log VIX statistics
        logger.info(f"  VIX range: {vix_series.min():.1f} - {vix_series.max():.1f}")
        logger.info(f"  VIX mean: {vix_series.mean():.1f}, std: {vix_series.std():.1f}")

        # Calculate regime statistics
        try:
            regime_distribution = regimes.value_counts(normalize=True)
            regime_transitions = self._calculate_transition_matrix(regimes)

            logger.info(f"  Regime distribution:")
            for regime, pct in regime_distribution.items():
                logger.info(f"    {regime}: {pct*100:.1f}%")

        except Exception as e:
            logger.warning(f"Could not calculate regime statistics: {str(e)}")
            regime_distribution = pd.Series([1.0], index=['normal'])
            regime_transitions = pd.DataFrame()

        current_regime = regimes.iloc[-1] if len(regimes) > 0 else 'normal'

        self.regime_stats = {
            'vix_values': vix_series,
            'regimes': regimes,
            'current_regime': current_regime,
            'regime_distribution': regime_distribution,
            'transition_matrix': regime_transitions
        }

        logger.info(f"✅ Market regime analysis complete")
        logger.info(f"  Current regime: {current_regime}")

        return self.regime_stats

    def classify_market_regime(self, vix_value: float) -> str:
        """Classify market regime based on VIX with validation"""

        if pd.isna(vix_value) or vix_value <= 0:
            return 'normal'

        if vix_value < self.vix_thresholds['low_volatility']:
            return 'low_volatility'
        elif vix_value < self.vix_thresholds['normal']:
            return 'normal'
        elif vix_value < self.vix_thresholds['high_volatility']:
            return 'high_volatility'
        else:
            return 'extreme_volatility'

    def select_models_for_regime(self, all_models: List[DebugMLForecaster], regime: str) -> List[DebugMLForecaster]:
        """Select appropriate ML models for current market regime with logging"""

        logger.info(f"🎯 Selecting models for {regime} regime...")

        regime_preferences = self.config.regime_specific_models.get(
            regime,
            self.config.regime_specific_models['normal']
        )

        selected = []
        for model in all_models:
            model_type = model.name.split('_')[0]

            # Check if model type matches regime preference
            for pref in regime_preferences:
                if pref.lower() in model_type.lower():
                    selected.append(model)
                    break

        # Ensure minimum number of models
        min_models = 8 if HIGH_MEMORY else 6
        if len(selected) < min_models:
            logger.info(f"  Fewer than {min_models} regime-matched models; topping up to at most {min_models * 2}")
            for model in all_models:
                if model not in selected:
                    selected.append(model)
                    if len(selected) >= min_models * 2:  # Cap at 2x minimum
                        break

        selected_names = [model.name for model in selected]
        logger.info(f"  Selected {len(selected)} models: {selected_names[:5]}{'...' if len(selected_names) > 5 else ''}")

        return selected

    def _calculate_transition_matrix(self, regimes: pd.Series) -> pd.DataFrame:
        """Calculate regime transition probabilities with error handling"""

        try:
            unique_regimes = regimes.unique()
            n_regimes = len(unique_regimes)

            if n_regimes == 0:
                return pd.DataFrame()

            transition_matrix = pd.DataFrame(
                np.zeros((n_regimes, n_regimes)),
                index=unique_regimes,
                columns=unique_regimes
            )

            for i in range(1, len(regimes)):
                from_regime = regimes.iloc[i-1]
                to_regime = regimes.iloc[i]
                if pd.notna(from_regime) and pd.notna(to_regime):
                    transition_matrix.loc[from_regime, to_regime] += 1

            # Normalize rows to get probabilities
            row_sums = transition_matrix.sum(axis=1)
            transition_matrix = transition_matrix.div(row_sums, axis=0).fillna(0)

            return transition_matrix

        except Exception as e:
            logger.warning(f"Could not calculate transition matrix: {str(e)}")
            return pd.DataFrame()

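    def _simulate_vix_data(self, n_points: int) -> pd.Series:
        """Simulate a VIX-like series for data that has no 'vix' column.

        The series mean-reverts toward 18 with Gaussian noise, gets occasional
        upward spikes, and is floored at 10. The random generator is not seeded
        here, so the simulated regimes differ between runs.
        """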

        logger.debug(f"Simulating VIX data for {n_points} points")

        base_vix = 18
        vix_values = [base_vix]

        for i in range(1, n_points):
            # Mean reversion with random noise
            change = 0.15 * (base_vix - vix_values[-1]) + np.random.normal(0, 1.8)

            # Random volatility spikes
            if np.random.random() < 0.08:  # 8% chance of volatility spike
                change += np.random.uniform(8, 20)

            new_vix = max(10, vix_values[-1] + change)
            vix_values.append(new_vix)

        return pd.Series(vix_values, name='VIX_simulated')

print("DebugMarketRegimeAnalyzer loaded")


DebugMarketRegimeAnalyzer loaded
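
The analyzer classifies each VIX value with `Series.apply` and counts transitions in a Python loop. For longer series the same two steps can be expressed with vectorized pandas calls. The sketch below assumes the same `<` threshold semantics as `classify_market_regime`; the threshold values themselves are hypothetical placeholders, because the real ones come from `config.vix_thresholds`, which is not shown in this part of the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical thresholds for illustration only.
VIX_THRESHOLDS = {"low_volatility": 15.0, "normal": 25.0, "high_volatility": 35.0}
LABELS = ["low_volatility", "normal", "high_volatility", "extreme_volatility"]


def classify_regimes(vix: pd.Series, thresholds: dict = VIX_THRESHOLDS) -> pd.Series:
    """Vectorized regime labels; missing or non-positive values map to 'normal'."""
    bins = [-np.inf, thresholds["low_volatility"], thresholds["normal"],
            thresholds["high_volatility"], np.inf]
    # right=False gives half-open intervals [a, b), matching the `<` comparisons.
    regimes = pd.cut(vix, bins=bins, labels=LABELS, right=False).astype(object)
    invalid = vix.isna() | (vix <= 0)
    return regimes.where(~invalid, "normal")


def transition_matrix(regimes: pd.Series) -> pd.DataFrame:
    """Row-normalized counts of regime[t-1] -> regime[t] transitions."""
    pairs = pd.DataFrame({"from": regimes.shift(1), "to": regimes}).dropna()
    counts = pd.crosstab(pairs["from"], pairs["to"])
    states = sorted(set(regimes.dropna()))
    counts = counts.reindex(index=states, columns=states, fill_value=0)
    row_sums = counts.sum(axis=1)
    # States that never act as a source keep an all-zero row instead of NaN.
    return counts.div(row_sums.where(row_sums > 0), axis=0).fillna(0.0)


vix = pd.Series([12.0, 18.0, 30.0, 40.0, 18.0, 18.0])
regimes = classify_regimes(vix)
tm = transition_matrix(regimes)
print(regimes.tolist())
print(tm.round(2))
```

Each row of the resulting matrix sums to 1 for every regime that has at least one outgoing transition, which is a quick sanity check on the output.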


In [11]:
# ============================================================================
# MAIN PIPELINE
# ============================================================================

class DebugMLForecastingPipeline:
    """Complete ML forecasting pipeline with comprehensive debugging and verification"""

    def __init__(self, config: DebugMLForecastingConfig):
        self.config = config
        self.results = {}
        self.regime_analyzer = DebugMarketRegimeAnalyzer(config)
        self.feature_engineer = DebugFeatureEngineer(config)
        self.models = {}
        self.phase_timings = {}
        self.execution_log = []

    @monitor_performance("Complete ML Pipeline")
    def execute(self, data: pd.DataFrame) -> pd.DataFrame:
        """Execute complete ML forecasting pipeline with comprehensive debugging"""

        logger.info("=" * 80)
        logger.info("🚀 ENHANCED MACHINE LEARNING FORECASTING PIPELINE - DEBUG VERSION")
        logger.info("=" * 80)
        logger.info(f"Input data shape: {data.shape}")
        logger.info(f"Target column: {self.config.target_column}")
        logger.info(f"Debug mode: {self.config.debug_mode}")
        logger.info(f"System capabilities: GPU={GPU_AVAILABLE}, High Memory={HIGH_MEMORY}")

        pipeline_start_time = time.time()

        try:
            # 1. Input validation
            logger.info("\n📋 Step 1: Input Validation")
            if not self._validate_input_data(data):
                raise ValueError("Input data validation failed")

            # 2. Market regime analysis
            logger.info("\n📈 Step 2: Market Regime Analysis")
            regime_stats = self.regime_analyzer.analyze_regimes(data)
            current_regime = regime_stats['current_regime']
            self.execution_log.append(f"Market regime identified: {current_regime}")

            # 3. Feature engineering
            logger.info("\n🛠️ Step 3: Feature Engineering")
            features_df = self.feature_engineer.create_features(
                data,
                regime_data=regime_stats['regimes'] if self.config.use_regime_features else None
            )

            # 4. Create supervised learning dataset
            logger.info("\n🎯 Step 4: Supervised Dataset Creation")
            X, y = self.feature_engineer.create_target_from_features(features_df, data)

            # 5. Data validation and splitting
            logger.info("\n✂️ Step 5: Data Splitting")
            if len(X) < 50:
                raise ValueError(f"Insufficient data: {len(X)} samples (minimum 50 required)")

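            # NOTE: despite its name, test_split_ratio is used here as the TRAINING fraction
            # (0.75 -> first 75% of rows for training, last 25% for testing; no shuffling).
            train_end = int(len(X) * self.config.test_split_ratio)
            train_end = max(train_end, 30)  # Ensure minimum training samples
            train_end = min(train_end, len(X) - 1)  # Always leave at least one test sample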

            X_train = X.iloc[:train_end].copy()
            X_test = X.iloc[train_end:].copy()
            y_train = y.iloc[:train_end].copy()
            y_test = y.iloc[train_end:].copy()

            logger.info(f"  Train set: {X_train.shape} features, {y_train.shape} targets")
            logger.info(f"  Test set: {X_test.shape} features, {y_test.shape} targets")

            # 6. Model creation
            logger.info("\n🏭 Step 6: Model Creation")
            all_models = DebugMLModelFactory.create_all_models(self.config)

            if len(all_models) == 0:
                raise ValueError("No models created successfully")

            # 7. Model selection based on regime
            logger.info("\n🎯 Step 7: Model Selection")
            if self.config.use_market_regime_switching:
                selected_models = self.regime_analyzer.select_models_for_regime(all_models, current_regime)
            else:
                max_models = 15 if HIGH_MEMORY else 10
                selected_models = all_models[:max_models]
                logger.info(f"  Selected first {len(selected_models)} models (regime switching disabled)")

            # 8. PHASE 1: V1 Baseline Models
            logger.info("\n" + "=" * 60)
            logger.info("🔥 PHASE 1: V1 BASELINE ML MODELS")
            logger.info("=" * 60)

            v1_predictions = self._train_v1_models_with_debug(selected_models, X_train, y_train, X_test)

            if len(v1_predictions) == 0:
                raise ValueError("No V1 models trained successfully")

            # 9. PHASE 2: V2 Advanced Feature Engineering
            if self.config.enable_v2_feature_engineering and len(v1_predictions) > 0:
                logger.info("\n" + "=" * 60)
                logger.info("🔥 PHASE 2: V2 ADVANCED FEATURE ENGINEERING")
                logger.info("=" * 60)

                X_train_v2, X_test_v2 = self._create_v2_features(X_train, y_train, X_test, v1_predictions)
                v2_predictions = self._train_v2_models_with_debug(selected_models[:8], X_train_v2, y_train, X_test_v2)
            else:
                logger.info("\n⏭️ PHASE 2: V2 Feature Engineering SKIPPED")
                v2_predictions = v1_predictions
                X_train_v2, X_test_v2 = X_train, X_test

            # 10. PHASE 3: VP Hyperparameter Optimization (CRITICAL - GridSearchCV)
            if self.config.enable_vp_optimization and len(v2_predictions) > 0:
                logger.info("\n" + "=" * 60)
                logger.info("🔥 PHASE 3: VP HYPERPARAMETER OPTIMIZATION (GridSearchCV)")
                logger.info("=" * 60)

                # Time the VP phase directly. Subtracting the V1/V2 timings from the total
                # elapsed time would also count validation, feature engineering and model creation.
                vp_start = time.time()
                vp_optimizer = VerifiedGridSearchOptimizer(self.config)
                vp_predictions = vp_optimizer.optimize_top_models(v2_predictions, X_train_v2, y_train, X_test_v2, y_test)
                self.phase_timings['VP'] = time.time() - vp_start

                # Log GridSearchCV optimization summary
                optimization_summary = vp_optimizer.get_optimization_summary()
                if not optimization_summary.empty:
                    logger.info("\n📊 GRIDSEARCHCV OPTIMIZATION RESULTS:")
                    for _, row in optimization_summary.head(3).iterrows():
                        logger.info(f"  {row['model_name']}: {row['improvement_pct']:+.1f}% improvement")
                else:
                    logger.warning("  No optimization summary available - GridSearchCV may not have run")

            else:
                logger.info(f"\n⏭️ PHASE 3: VP Optimization SKIPPED")
                vp_predictions = v2_predictions
                self.phase_timings['VP'] = 0

            # 11. Comprehensive evaluation
            logger.info(f"\n📊 Step 11: Final Evaluation")
            eval_start = time.time()

            results = self._evaluate_all_predictions_with_debug(
                {'V1': v1_predictions, 'V2': v2_predictions, 'VP': vp_predictions},
                y_test,
                y_train
            )

            eval_time = time.time() - eval_start

            # 12. Pipeline summary and verification
            total_time = time.time() - pipeline_start_time

            logger.info(f"\n" + "=" * 80)
            logger.info(f"✅ PIPELINE EXECUTION COMPLETE")
            logger.info(f"=" * 80)

            # Detailed timing breakdown
            logger.info(f"EXECUTION TIMING:")
            logger.info(f"  V1 Phase: {self.phase_timings.get('V1', 0):.1f}s")
            logger.info(f"  V2 Phase: {self.phase_timings.get('V2', 0):.1f}s")
            logger.info(f"  VP Phase: {self.phase_timings.get('VP', 0):.1f}s")
            logger.info(f"  Evaluation: {eval_time:.1f}s")
            logger.info(f"  Total: {total_time:.1f}s ({total_time/60:.1f} minutes)")

            # Results summary
            if not results.empty:
                logger.info(f"\nRESULTS SUMMARY:")
                logger.info(f"  Total models evaluated: {len(results)}")
                logger.info(f"  V1 models: {len(results[results['Phase'] == 'V1'])}")
                logger.info(f"  V2 models: {len(results[results['Phase'] == 'V2'])}")
                logger.info(f"  VP models: {len(results[results['Phase'] == 'VP'])}")

                # Champion model
                best_model = results.iloc[0]
                logger.info(f"\nCHAMPION MODEL:")
                logger.info(f"  Name: {best_model['Phase']}_{best_model['Model']}")
                logger.info(f"  MAE: {best_model['MAE']:.3f}")
                if best_model['R2'] > -1:
                    logger.info(f"  R²: {best_model['R2']:.3f}")
                if best_model['MAPE'] < 999:
                    logger.info(f"  MAPE: {best_model['MAPE']:.1f}%")

                # Phase performance comparison
                if len(results[results['Phase'] == 'VP']) > 0:
                    v2_best_mae = results[results['Phase'] == 'V2']['MAE'].min()
                    vp_best_mae = results[results['Phase'] == 'VP']['MAE'].min()
                    # Guard against a zero or missing (NaN) V2 baseline before dividing
                    improvement = ((v2_best_mae - vp_best_mae) / v2_best_mae) * 100 if v2_best_mae > 0 else 0.0
                    logger.info(f"\nOPTIMIZATION IMPACT:")
                    logger.info(f"  Best V2 MAE: {v2_best_mae:.3f}")
                    logger.info(f"  Best VP MAE: {vp_best_mae:.3f}")
                    logger.info(f"  GridSearch improvement: {improvement:+.1f}%")

                    if improvement > 0:
                        logger.info(f"  ✅ GridSearchCV successfully improved models")
                    else:
                        logger.warning(f"  ⚠️ GridSearchCV did not improve best model")
                else:
                    logger.warning(f"  ⚠️ No VP models found - GridSearchCV may have failed")
            else:
                logger.error(f"  ❌ No valid results generated")

            return results

        except Exception as e:
            total_time = time.time() - pipeline_start_time
            logger.error(f"❌ Pipeline execution failed after {total_time:.1f} seconds: {str(e)}")
            logger.error(f"Full traceback: {traceback.format_exc()}")

            # Return minimal error results
            return pd.DataFrame({
                'Phase': ['Error'],
                'Model': ['Pipeline_Failed'],
                'MAE': [float('inf')],
                'RMSE': [float('inf')],
                'R2': [-1.0],
                'MAPE': [999.0],  # Same sentinel as the metrics fallback, keeps the schema consistent
                'Error': [str(e)]
            })

    def _validate_input_data(self, data: pd.DataFrame) -> bool:
        """Comprehensive input data validation"""

        try:
            if data.empty:
                logger.error("  ❌ Input data is empty")
                return False

            if self.config.target_column not in data.columns:
                logger.error(f"  ❌ Target column '{self.config.target_column}' not found")
                return False

            target_series = data[self.config.target_column]
            nan_pct = target_series.isna().sum() / len(target_series) * 100

            logger.info(f"  Data shape: {data.shape}")
            logger.info(f"  Target column: {self.config.target_column}")
            logger.info(f"  NaN percentage in target: {nan_pct:.1f}%")

            if nan_pct > 50:
                logger.error(f"  ❌ Too many NaN values in target: {nan_pct:.1f}%")
                return False

            if len(data) < 100:
                logger.warning(f"  ⚠️ Small dataset: {len(data)} samples (recommended: 200+)")

            logger.info("  ✅ Input data validation passed")
            return True

        except Exception as e:
            logger.error(f"  ❌ Data validation failed: {str(e)}")
            return False

    def _train_v1_models_with_debug(self, models: List[DebugMLForecaster],
                                   X_train: pd.DataFrame, y_train: pd.Series,
                                   X_test: pd.DataFrame) -> Dict[str, np.ndarray]:
        """Train V1 models with comprehensive debugging"""

        v1_predictions = {}
        total_models = len(models)
        phase_start = time.time()

        logger.info(f"🚀 Training {total_models} V1 models...")

        successful = 0
        failed = 0

        for i, model in enumerate(models, 1):
            model_start = time.time()

            # Progress indicator
            progress = f"[{i}/{total_models}]"
            logger.info(f"{progress} Training {model.name}...")

            try:
                model.fit(X_train, y_train)

                if model.is_fitted:
                    predictions = model.predict(X_test)

                    # Validate predictions
                    if len(predictions) > 0 and not np.all(np.isnan(predictions)):
                        v1_predictions[model.name] = predictions
                        successful += 1

                        model_time = time.time() - model_start

                        # Log success with timing
                        logger.info(f"  ✅ {model.name} SUCCESS ({model_time:.2f}s)")
                    else:
                        failed += 1
                        logger.warning(f"  ❌ {model.name} FAILED - Invalid predictions")
                else:
                    failed += 1
                    error_msg = model.fit_error or "Unknown error"
                    logger.warning(f"  ❌ {model.name} FAILED - {error_msg[:50]}...")

            except Exception as e:
                failed += 1
                model_time = time.time() - model_start
                logger.error(f"  ❌ {model.name} FAILED - {str(e)[:50]}... ({model_time:.2f}s)")

        phase_time = time.time() - phase_start
        success_rate = (successful / total_models) * 100 if total_models else 0.0

        logger.info(f"\n📊 V1 PHASE COMPLETE:")
        logger.info(f"  Duration: {phase_time:.1f} seconds")
        logger.info(f"  Success rate: {successful}/{total_models} ({success_rate:.1f}%)")
        logger.info(f"  Valid predictions: {len(v1_predictions)}")

        self.phase_timings['V1'] = phase_time

        return v1_predictions

    def _create_v2_features(self, X_train: pd.DataFrame, y_train: pd.Series,
                           X_test: pd.DataFrame, v1_predictions: Dict[str, np.ndarray]) -> Tuple[pd.DataFrame, pd.DataFrame]:
        """Create enhanced V2 features"""

        logger.info("🔧 Creating V2 enhanced features...")

        try:
            X_train_v2 = X_train.copy()
            X_test_v2 = X_test.copy()

            # Add simple ensemble features from V1 predictions
            if len(v1_predictions) > 0:
                logger.info("  Adding V1 ensemble features...")

                valid_preds = []
                for name, preds in v1_predictions.items():
                    if len(preds) >= len(X_test):
                        valid_preds.append(preds[:len(X_test)])

                if valid_preds:
                    pred_array = np.column_stack(valid_preds)

                    X_test_v2['v1_ensemble_mean'] = np.mean(pred_array, axis=1)
                    X_test_v2['v1_ensemble_std'] = np.std(pred_array, axis=1)

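                    # For the training set there are no per-row V1 predictions (V1 only
                    # predicted on X_test), so constant proxy values are used instead.
                    # CAVEAT: these columns are constant during fitting but vary at predict
                    # time, so V2 models see a train/test feature mismatch. The proxy std is
                    # also the spread of per-model means, not the per-row spread used for the
                    # test set. Out-of-fold V1 predictions would be needed to use these
                    # features properly.
                    overall_mean = np.mean([np.mean(pred) for pred in valid_preds])
                    overall_std = np.std([np.mean(pred) for pred in valid_preds])

                    X_train_v2['v1_ensemble_mean'] = overall_mean
                    X_train_v2['v1_ensemble_std'] = overall_std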

                    logger.info(f"    Added 2 ensemble features")

            logger.info(f"  V2 features created: {X_train_v2.shape[1]} features")

            return X_train_v2, X_test_v2

        except Exception as e:
            logger.warning(f"  V2 feature creation failed: {str(e)}")
            return X_train.copy(), X_test.copy()

    def _train_v2_models_with_debug(self, models: List[DebugMLForecaster],
                                   X_train: pd.DataFrame, y_train: pd.Series,
                                   X_test: pd.DataFrame) -> Dict[str, np.ndarray]:
        """Train V2 models with enhanced features"""

        v2_predictions = {}
        phase_start = time.time()

        logger.info(f"🚀 Training {len(models)} V2 models with enhanced features...")

        successful = 0
        failed = 0

        for i, model in enumerate(models, 1):
            progress = f"[{i}/{len(models)}]"
            logger.info(f"{progress} Training V2_{model.name}...")

            try:
                # Create new model instance for V2
                model_v2 = DebugMLForecaster(f"{model.name}_V2", clone(model.model), self.config.debug_mode)
                model_v2.fit(X_train, y_train)

                if model_v2.is_fitted:
                    predictions = model_v2.predict(X_test)

                    if len(predictions) > 0 and not np.all(np.isnan(predictions)):
                        v2_predictions[model_v2.name] = predictions
                        successful += 1
                        logger.info(f"  ✅ V2_{model.name} SUCCESS")
                    else:
                        failed += 1
                        logger.warning(f"  ❌ V2_{model.name} FAILED - Invalid predictions")
                else:
                    failed += 1
                    logger.warning(f"  ❌ V2_{model.name} FAILED - Not fitted")

            except Exception as e:
                failed += 1
                logger.error(f"  ❌ V2_{model.name} FAILED - {str(e)[:50]}...")

        phase_time = time.time() - phase_start

        logger.info(f"\n📊 V2 PHASE COMPLETE:")
        logger.info(f"  Duration: {phase_time:.1f} seconds")
        success_rate = (successful / len(models)) * 100 if models else 0.0
        logger.info(f"  Success rate: {successful}/{len(models)} ({success_rate:.1f}%)")
        logger.info(f"  Valid predictions: {len(v2_predictions)}")

        self.phase_timings['V2'] = phase_time

        return v2_predictions

    def _evaluate_all_predictions_with_debug(self, all_predictions: Dict[str, Dict[str, np.ndarray]],
                                            y_test: pd.Series, y_train: pd.Series) -> pd.DataFrame:
        """Evaluate all predictions with comprehensive debugging"""

        logger.info("📊 Evaluating all model predictions...")

        results = []
        evaluation_errors = []

        for phase_name, phase_predictions in all_predictions.items():
            logger.info(f"  Evaluating {phase_name} models: {len(phase_predictions)} models")

            for model_name, predictions in phase_predictions.items():
                try:
                    metrics = self._calculate_comprehensive_metrics(
                        y_test.values,
                        predictions,
                        y_train.values,
                        model_name
                    )

                    result_row = {
                        'Phase': phase_name,
                        'Model': model_name,
                        **metrics
                    }
                    results.append(result_row)

                except Exception as e:
                    evaluation_errors.append(f"{model_name}: {str(e)}")
                    logger.warning(f"    ❌ Evaluation failed for {model_name}: {str(e)[:50]}...")

        if not results:
            logger.error("❌ No valid results to evaluate!")
            return pd.DataFrame()

        results_df = pd.DataFrame(results)

        # Handle infinite values in sorting
        results_df['MAE_sort'] = results_df['MAE'].replace([np.inf, -np.inf], 999999)
        results_df = results_df.sort_values('MAE_sort', ascending=True).drop('MAE_sort', axis=1)

        logger.info(f"✅ Evaluation complete:")
        logger.info(f"  Total models evaluated: {len(results_df)}")
        logger.info(f"  Evaluation errors: {len(evaluation_errors)}")

        return results_df

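    def _calculate_comprehensive_metrics(self, y_true: np.ndarray, y_pred: np.ndarray,
                                        y_train: np.ndarray = None, model_name: str = None) -> Dict[str, float]:
        """Calculate MAE, RMSE, R² and MAPE on the valid (finite) prediction pairs.

        Sentinel values mark metrics that could not be computed: R2 = -1.0, MAPE = 999.0,
        and MAE/RMSE = inf when the whole calculation fails. MAPE skips rows where the
        true value is zero. `y_train` is accepted but not currently used.
        """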

        try:
            # Ensure arrays are proper format
            y_true = np.asarray(y_true).flatten()
            y_pred = np.asarray(y_pred).flatten()

            # Handle length mismatch
            min_length = min(len(y_true), len(y_pred))
            if min_length == 0:
                raise ValueError("Empty prediction arrays")

            y_true = y_true[:min_length]
            y_pred = y_pred[:min_length]

            # Remove invalid values
            valid_mask = ~(np.isnan(y_true) | np.isnan(y_pred) | np.isinf(y_true) | np.isinf(y_pred))

            if not np.any(valid_mask):
                raise ValueError("No valid prediction pairs")

            y_true_clean = y_true[valid_mask]
            y_pred_clean = y_pred[valid_mask]

            # Calculate metrics
            metrics = {}

            metrics['MAE'] = float(mean_absolute_error(y_true_clean, y_pred_clean))
            metrics['RMSE'] = float(np.sqrt(mean_squared_error(y_true_clean, y_pred_clean)))

            # R² with error handling
            try:
                r2 = r2_score(y_true_clean, y_pred_clean)
                metrics['R2'] = float(r2) if not np.isnan(r2) else -1.0
            except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
                metrics['R2'] = -1.0

            # MAPE with zero handling
            try:
                mask_nonzero = y_true_clean != 0
                if mask_nonzero.sum() > 0:
                    mape = np.mean(np.abs((y_true_clean[mask_nonzero] - y_pred_clean[mask_nonzero]) / y_true_clean[mask_nonzero])) * 100
                    metrics['MAPE'] = float(mape) if not np.isnan(mape) else 999.0
                else:
                    metrics['MAPE'] = 999.0
            except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
                metrics['MAPE'] = 999.0

            return metrics

        except Exception as e:
            logger.warning(f"Metrics calculation failed for {model_name}: {str(e)}")
            return {
                'MAE': float('inf'),
                'RMSE': float('inf'),
                'R2': -1.0,
                'MAPE': 999.0
            }

print("DebugMLForecastingPipeline loaded")

DebugMLForecastingPipeline loaded
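
`_create_v2_features` fills the training rows with constant proxy values because V1 models only produce predictions for the test rows. A common alternative is to build the ensemble features from out-of-fold predictions, so that every training row gets a prediction from models that never saw it. The sketch below is one possible approach under stated assumptions, not the pipeline's implementation: `expanding_window_folds`, `add_oof_ensemble_features` and the toy `LeastSquares` model are hypothetical names, and the models only need scikit-learn-style `fit`/`predict` methods.

```python
import copy
import numpy as np
import pandas as pd


def expanding_window_folds(n_samples: int, n_folds: int = 4, min_train: int = 20):
    """Yield (train_idx, val_idx) pairs; validation rows always follow training rows."""
    edges = np.linspace(min_train, n_samples, n_folds + 1, dtype=int)
    for start, stop in zip(edges[:-1], edges[1:]):
        if stop > start:
            yield np.arange(0, start), np.arange(start, stop)


def add_oof_ensemble_features(models, X_train, y_train, X_test, n_folds=4, min_train=20):
    """Add ensemble mean/std columns built from out-of-fold predictions."""
    oof = np.full((len(X_train), len(models)), np.nan)
    for train_idx, val_idx in expanding_window_folds(len(X_train), n_folds, min_train):
        for j, model in enumerate(models):
            fold_model = copy.deepcopy(model)  # fresh copy per fold
            fold_model.fit(X_train.iloc[train_idx], y_train.iloc[train_idx])
            oof[val_idx, j] = fold_model.predict(X_train.iloc[val_idx])

    # Test-set features come from models refit on the full training set.
    test_cols = []
    for model in models:
        full_model = copy.deepcopy(model)
        full_model.fit(X_train, y_train)
        test_cols.append(full_model.predict(X_test))
    test_preds = np.column_stack(test_cols)

    # The first `min_train` rows have no out-of-fold prediction and are dropped.
    keep = np.flatnonzero(~np.isnan(oof).any(axis=1))
    X_train_v2 = X_train.iloc[keep].copy()
    X_train_v2["v1_ensemble_mean"] = oof[keep].mean(axis=1)
    X_train_v2["v1_ensemble_std"] = oof[keep].std(axis=1)
    X_test_v2 = X_test.copy()
    X_test_v2["v1_ensemble_mean"] = test_preds.mean(axis=1)
    X_test_v2["v1_ensemble_std"] = test_preds.std(axis=1)
    return X_train_v2, y_train.iloc[keep], X_test_v2


class LeastSquares:
    """Tiny stand-in for a scikit-learn regressor (fit/predict only)."""

    def __init__(self, ridge: float = 0.0):
        self.ridge = ridge

    def fit(self, X, y):
        A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
        penalty = self.ridge * np.eye(A.shape[1])
        self.coef_ = np.linalg.solve(A.T @ A + penalty, A.T @ np.asarray(y, dtype=float))
        return self

    def predict(self, X):
        A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
        return A @ self.coef_


rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["lag_1", "lag_2", "lag_3"])
y = pd.Series(X.values @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100))
X_tr, X_te, y_tr, y_te = X.iloc[:75], X.iloc[75:], y.iloc[:75], y.iloc[75:]

models = [LeastSquares(ridge=1e-6), LeastSquares(ridge=10.0)]
X_tr_v2, y_tr_v2, X_te_v2 = add_oof_ensemble_features(models, X_tr, y_tr, X_te)
print(X_tr_v2.shape, X_te_v2.shape)
```

The trade-off is that the earliest `min_train` rows are lost and every model is fitted once per fold, so this costs more compute than the constant proxy. In return, the ensemble columns mean the same thing at fit time and at predict time.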


In [12]:
# ============================================================================
# EXECUTION FUNCTIONS
# ============================================================================

def run_debug_pipeline(data_size: int = None, quick_test: bool = False) -> pd.DataFrame:
    """Run the complete debug pipeline with comprehensive logging"""

    logger.info("🎯 STARTING DEBUG ML FORECASTING PIPELINE")
    logger.info("=" * 80)

    # Determine optimal data size based on system capabilities
    if data_size is None:
        if HIGH_MEMORY and not quick_test:
            data_size = 400
        else:
            data_size = 200

    logger.info(f"Configuration: {data_size} data points, quick_test={quick_test}")
    logger.info(f"System: GPU={GPU_AVAILABLE}, High Memory={HIGH_MEMORY}")

    try:
        # Generate data
        data = generate_debug_synthetic_data(n_points=data_size)

        # Configure pipeline based on system and test mode
        config = DebugMLForecastingConfig(
            # Core settings
            target_column="calls",
            seasonal_period=7,
            test_split_ratio=0.75,

            # Feature engineering - scale with system
            max_lags=15 if quick_test else (25 if HIGH_MEMORY else 20),
            rolling_windows=[3, 7, 14] if quick_test else ([3, 7, 14, 21, 30] if HIGH_MEMORY else [3, 7, 14, 21]),
            create_technical_indicators=True,
            create_calendar_features=True,

            # Model settings
            use_tree_models=True,
            use_linear_models=True,
            use_neural_models=not quick_test,  # Skip neural for quick tests
            use_ensemble_models=not quick_test,
            use_other_models=True,

            # GridSearchCV settings - CRITICAL for optimization verification
            cv_splits=3 if quick_test else 5,
            gridsearch_verbose=2,  # Show GridSearch progress
            parallel_jobs=1,  # Single job for debug visibility

            # Optimization settings
            quick_search=quick_test,
            detailed_search=not quick_test,
            top_models_for_optimization=3 if quick_test else (8 if HIGH_MEMORY else 5),

            # Pipeline phases
            enable_v2_feature_engineering=True,
            enable_vp_optimization=True,  # CRITICAL - ensure GridSearchCV runs

            # Debug settings
            debug_mode=True,
            show_progress=True,
            time_each_phase=True,
            validate_gridsearch=True
        )

        # Create and execute pipeline
        pipeline = DebugMLForecastingPipeline(config)
        results = pipeline.execute(data)

        # Enhanced results display
        if not results.empty:
            logger.info("🏆 PIPELINE RESULTS SUMMARY:")
            logger.info("=" * 60)

            # Show top 10 models
            top_10 = results.head(10)
            logger.info("Top 10 Models:")
            for i, (_, row) in enumerate(top_10.iterrows(), 1):
                mae_str = f"{row['MAE']:.3f}" if row['MAE'] != float('inf') else "inf"
                r2_str = f"{row['R2']:.3f}" if row['R2'] != -1.0 else "N/A"
                logger.info(f"  {i:2d}. {row['Phase']:<2} {row['Model']:<20} MAE:{mae_str} R²:{r2_str}")

            # GridSearchCV verification
            vp_models = results[results['Phase'] == 'VP']
            if len(vp_models) > 0:
                logger.info(f"\n✅ GridSearchCV VERIFICATION:")
                logger.info(f"  VP (optimized) models found: {len(vp_models)}")
                logger.info(f"  Best VP model MAE: {vp_models['MAE'].min():.3f}")
            else:
                logger.warning(f"\n⚠️ GridSearchCV VERIFICATION:")
                logger.warning(f"  NO VP (optimized) models found!")
                logger.warning(f"  This indicates GridSearchCV may not have run properly")
        else:
            logger.error("❌ No results generated")

        return results

    except Exception as e:
        logger.error(f"❌ Debug pipeline failed: {str(e)}")
        logger.error(f"Full traceback: {traceback.format_exc()}")
        return pd.DataFrame()

print("🎯 COMPLETE ML FORECASTING SYSTEM LOADED")
print("=" * 80)
print("Ready to execute!")
print("")
print("To run the pipeline:")
print("  results = run_debug_pipeline(data_size=200, quick_test=True)  # Quick test")
print("  results = run_debug_pipeline()  # Full execution")

🎯 COMPLETE ML FORECASTING SYSTEM LOADED
Ready to execute!

To run the pipeline:
  results = run_debug_pipeline(data_size=200, quick_test=True)  # Quick test
  results = run_debug_pipeline()  # Full execution
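
The verification above infers that GridSearchCV ran from the presence of `VP` rows in the results table. A more direct check is to inspect the fitted search object itself. The following is a minimal sketch under stated assumptions: the data is synthetic, `Ridge` and the `alpha` grid are illustrative choices, and `TimeSeriesSplit` is assumed as the splitter (the pipeline's actual splitter is not shown in this cell). The names `X_demo`, `y_demo`, `n_candidates` and `n_fits` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic stand-in data (an assumption, not the notebook's dataset)
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(120, 4))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=120)

# Illustrative grid; the real pipeline's grids are defined elsewhere
param_grid = {"alpha": [0.1, 1.0, 10.0]}
cv = TimeSeriesSplit(n_splits=3)  # mirrors cv_splits=3 in quick-test mode

search = GridSearchCV(
    Ridge(),
    param_grid,
    cv=cv,
    scoring="neg_mean_absolute_error",
    n_jobs=1,  # single job, matching parallel_jobs=1
)
search.fit(X_demo, y_demo)

# cv_results_ only exists after a completed search, so it is direct evidence that the search ran
n_candidates = len(search.cv_results_["params"])
n_fits = n_candidates * cv.get_n_splits()
print(f"Candidates evaluated: {n_candidates}, CV fits: {n_fits}")
print(f"Best params: {search.best_params_}, best CV MAE: {-search.best_score_:.3f}")
```

Comparing `n_fits` with the number of fits expected from the grid size and the split count is one way to confirm that no candidate was silently skipped.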


In [13]:
# ============================================================================
# COMPLETE ML PIPELINE EXECUTION
# ============================================================================

# CSV FILE CONFIGURATION - MODIFY THESE SETTINGS FOR YOUR DATA
CSV_FILE_PATH = "enhanced_eda_data.csv"  # Path to your CSV file
TARGET_COLUMN = "calls"                  # Name of your target column to forecast
DATE_COLUMN = "date"                     # Name of your date column (set to None if no date column)
QUICK_TEST = True                        # Set to False for full execution with all models

# ============================================================================
# ROBUST CSV DATA LOADING
# ============================================================================

def load_csv_data_robust(csv_path: str, target_column: str = 'calls', date_column: str = None) -> pd.DataFrame:
    """Load a CSV, coerce every column to numeric, and fill missing values with column medians"""

    logger.info(f"Loading CSV data from: {csv_path}")

    try:
        # Load CSV
        data = pd.read_csv(csv_path)
        logger.info(f"  Raw data shape: {data.shape}")
        logger.info(f"  Columns: {list(data.columns)}")

        # Handle date column if specified
        if date_column and date_column in data.columns:
            logger.info(f"  Converting {date_column} to datetime index")
            data[date_column] = pd.to_datetime(data[date_column])
            data = data.set_index(date_column).sort_index()

        # Validate target column
        if target_column not in data.columns:
            logger.error(f"  Target column '{target_column}' not found!")
            logger.info(f"  Available columns: {list(data.columns)}")
            raise ValueError(f"Target column '{target_column}' not found")

        logger.info(f"  Processing data for ML pipeline...")

        # Convert target to numeric first so unparseable values become NaN
        data[target_column] = pd.to_numeric(data[target_column], errors='coerce')

        # Remove rows where target is null (including values that failed conversion)
        initial_length = len(data)
        data = data.dropna(subset=[target_column])
        removed_rows = initial_length - len(data)

        if removed_rows > 0:
            logger.info(f"    Removed {removed_rows} rows with null or non-numeric target values")

        # Process ALL other columns - simple numeric conversion only
        logger.info(f"    Converting all columns to numeric...")

        columns_to_drop = []
        conversion_success = 0

        for col in data.columns:
            if col == target_column:
                continue
            if data.index.name == col:
                continue

            try:
                # Simple numeric conversion - no string operations
                data[col] = pd.to_numeric(data[col], errors='coerce')

                # Fill NaN values
                if data[col].isna().any():
                    median_val = data[col].median()
                    if pd.isna(median_val):
                        median_val = 0.0
                    data[col] = data[col].fillna(median_val)

                # Convert to float64
                data[col] = data[col].astype(np.float64)

                conversion_success += 1

            except Exception as e:
                logger.warning(f"    Could not convert {col}, will drop: {str(e)}")
                columns_to_drop.append(col)

        # Drop problematic columns
        if columns_to_drop:
            data = data.drop(columns=columns_to_drop)
            logger.info(f"    Dropped {len(columns_to_drop)} non-convertible columns")

        logger.info(f"    Successfully converted {conversion_success} columns to numeric")

        # Final cleanup
        data = data.replace([np.inf, -np.inf], np.nan)

        for col in data.columns:
            if data[col].isna().any():
                median_val = data[col].median()
                if pd.isna(median_val):
                    median_val = 0.0
                data[col] = data[col].fillna(median_val)

        # Verify all columns are numeric
        all_numeric = True
        for col in data.columns:
            if not np.issubdtype(data[col].dtype, np.number):
                all_numeric = False
                logger.warning(f"    Column {col} is still not numeric: {data[col].dtype}")

        if not all_numeric:
            # Last resort: keep only the columns that are already numeric
            numeric_data = data.select_dtypes(include=[np.number])
            logger.warning(f"    Keeping only numeric columns: {numeric_data.shape}")
            data = numeric_data

        # Validate target survived
        if target_column not in data.columns:
            raise ValueError(f"Target column '{target_column}' was lost during processing")

        logger.info(f"  Data loaded successfully: {data.shape}")
        logger.info(f"  Target range: {data[target_column].min():.2f} - {data[target_column].max():.2f}")
        logger.info(f"  All columns numeric: {all(np.issubdtype(data[col].dtype, np.number) for col in data.columns)}")

        # Set market data detection for your columns
        detected_market_data = {}
        column_mapping = {
            '^VIX_close': 'vix',
            'SPY_close': 'sp500',
            'SPY_volume': 'sp500_volume',
            'QQQ_close': 'nasdaq',
            'DX-Y.NYB_close': 'dollar',
            'GC=F_close': 'gold',
            'BTC-USD_close': 'bitcoin',
            'ETH-USD_close': 'ethereum'
        }

        for csv_col, market_type in column_mapping.items():
            if csv_col in data.columns:
                detected_market_data[market_type] = csv_col

        if detected_market_data:
            logger.info(f"  Market data detected: {list(detected_market_data.keys())}")

        # Ad hoc attribute: pandas does not carry it over to copies (e.g. data.copy())
        data._detected_market_data = detected_market_data

        return data

    except Exception as e:
        logger.error(f"  Failed to load CSV: {str(e)}")
        raise

# ============================================================================
# ENHANCED MODEL FITTING WITH ROBUST DATA HANDLING
# ============================================================================

class RobustMLForecaster(DebugMLForecaster):
    """Enhanced forecaster with bulletproof data type handling"""

    def fit(self, X: pd.DataFrame, y: pd.Series) -> 'RobustMLForecaster':
        """Fit model with bulletproof data type conversion"""

        fit_start = time.time()

        try:
            if self.debug_mode:
                logger.debug(f"Fitting {self.name} on data shape: {X.shape}")

            # Validate input
            # len() works for both pandas objects and NumPy arrays (.empty is pandas-only)
            if len(X) == 0 or len(y) == 0:
                raise ValueError(f"Empty input data for {self.name}")

            if len(X) != len(y):
                raise ValueError(f"Feature/target length mismatch for {self.name}: {len(X)} vs {len(y)}")

            # BULLETPROOF data conversion
            logger.debug(f"{self.name}: Converting data to safe numeric format...")

            # Convert features
            if isinstance(X, pd.DataFrame):
                X_clean = X.copy()
                for col in X_clean.columns:
                    if not np.issubdtype(X_clean[col].dtype, np.number):
                        X_clean[col] = pd.to_numeric(X_clean[col], errors='coerce')
                    X_clean[col] = X_clean[col].fillna(0.0)
                X_array = X_clean.values.astype(np.float64)
            else:
                X_array = np.asarray(X, dtype=np.float64)

            # Convert target
            if isinstance(y, pd.Series):
                y_clean = pd.to_numeric(y, errors='coerce')
                y_clean = y_clean.fillna(y_clean.median())
                if y_clean.isna().any():
                    y_clean = y_clean.fillna(0.0)
                y_array = y_clean.values.astype(np.float64)
            else:
                y_array = np.asarray(y, dtype=np.float64)

            # Verify arrays are proper
            if X_array.size == 0 or y_array.size == 0:
                raise ValueError(f"Empty arrays after conversion for {self.name}")

            # Clean arrays
            X_array = np.nan_to_num(X_array, nan=0.0, posinf=1e6, neginf=-1e6)
            y_array = np.nan_to_num(y_array, nan=0.0, posinf=1e6, neginf=-1e6)

            # Create and fit pipeline
            self.pipeline = Pipeline([
                ('scaler', StandardScaler()),
                ('model', clone(self.model))
            ])

            self.pipeline.fit(X_array, y_array.ravel())
            self.is_fitted = True

            # Test prediction
            test_pred = self.pipeline.predict(X_array[:min(5, len(X_array))])
            if len(test_pred) == 0 or np.all(np.isnan(test_pred)):
                logger.warning(f"{self.name}: Model produces invalid predictions")
                self.is_fitted = False

        except Exception as e:
            self.fit_error = str(e)
            logger.error(f"{self.name} fit failed: {self.fit_error}")
            self.is_fitted = False

        finally:
            self.training_time = time.time() - fit_start
            status = "SUCCESS" if self.is_fitted else "FAILED"
            logger.info(f"{self.name}: {status} ({self.training_time:.2f}s)")

        return self

    def predict(self, X: pd.DataFrame) -> np.ndarray:
        """Make predictions with robust data handling"""

        if not self.is_fitted:
            logger.error(f"{self.name}: Cannot predict - model not fitted")
            # Return NaN rather than zeros so callers can detect the failure
            return np.full(len(X), np.nan)

        try:
            # Convert input data safely
            if isinstance(X, pd.DataFrame):
                X_clean = X.copy()
                for col in X_clean.columns:
                    if not np.issubdtype(X_clean[col].dtype, np.number):
                        X_clean[col] = pd.to_numeric(X_clean[col], errors='coerce')
                    X_clean[col] = X_clean[col].fillna(0.0)
                X_array = X_clean.values.astype(np.float64)
            else:
                X_array = np.asarray(X, dtype=np.float64)

            # Clean array
            X_array = np.nan_to_num(X_array, nan=0.0, posinf=1e6, neginf=-1e6)

            # Make predictions
            predictions = self.pipeline.predict(X_array)
            predictions = np.nan_to_num(predictions, nan=0.0, posinf=1e6, neginf=-1e6)

            return predictions.flatten()

        except Exception as e:
            logger.error(f"{self.name} prediction failed: {str(e)}")
            # Return NaN rather than zeros so a failed prediction is never scored as a real forecast
            return np.full(len(X), np.nan)

# ============================================================================
# ROBUST MODEL FACTORY
# ============================================================================

def create_robust_models(config: DebugMLForecastingConfig) -> List[RobustMLForecaster]:
    """Create models with robust forecasters"""

    logger.info("Creating robust ML models...")

    models = []

    # Linear models
    models.extend([
        RobustMLForecaster("LinearRegression", LinearRegression(), config.debug_mode),
        RobustMLForecaster("Ridge_1.0", Ridge(alpha=1.0), config.debug_mode),
        RobustMLForecaster("Lasso_1.0", Lasso(alpha=1.0, max_iter=2000), config.debug_mode),
    ])

    # Tree models
    models.extend([
        RobustMLForecaster("RandomForest_100", RandomForestRegressor(n_estimators=100, random_state=42, n_jobs=1), config.debug_mode),
        RobustMLForecaster("ExtraTrees_100", ExtraTreesRegressor(n_estimators=100, random_state=42, n_jobs=1), config.debug_mode),
        RobustMLForecaster("GradientBoosting_100", GradientBoostingRegressor(n_estimators=100, random_state=42), config.debug_mode),
    ])

    # Add XGBoost and LightGBM if available
    if XGB_AVAILABLE:
        models.append(RobustMLForecaster("XGBoost_100", xgb.XGBRegressor(n_estimators=100, random_state=42, n_jobs=1), config.debug_mode))

    if LGB_AVAILABLE:
        models.append(RobustMLForecaster("LightGBM_100", lgb.LGBMRegressor(n_estimators=100, random_state=42, verbose=-1, n_jobs=1), config.debug_mode))

    # Other models if not quick test
    if not config.quick_search:
        models.extend([
            RobustMLForecaster("SVR_rbf", SVR(kernel='rbf', C=1.0), config.debug_mode),
            RobustMLForecaster("KNeighbors_5", KNeighborsRegressor(n_neighbors=5), config.debug_mode),
        ])

    logger.info(f"Created {len(models)} robust models")
    return models

# ============================================================================
# SIMPLIFIED PIPELINE EXECUTION
# ============================================================================

def run_robust_pipeline(csv_path: str, target_column: str, date_column: str = None, quick_test: bool = True) -> pd.DataFrame:
    """Run ML pipeline with robust error handling"""

    logger.info("ROBUST ML FORECASTING PIPELINE")
    logger.info("=" * 60)

    try:
        # 1. Load data
        logger.info("Step 1: Loading CSV data...")
        data = load_csv_data_robust(csv_path, target_column, date_column)

        # 2. Basic feature engineering
        logger.info("Step 2: Creating features...")
        features_df = data.copy()

        # Add simple lag features
        for lag in [1, 2, 3, 7]:
            if lag < len(data):
                features_df[f'lag_{lag}'] = data[target_column].shift(lag)

        # Add rolling features
        for window in [3, 7]:
            if window < len(data):
                features_df[f'rolling_mean_{window}'] = data[target_column].rolling(window, min_periods=1).mean()
                features_df[f'rolling_std_{window}'] = data[target_column].rolling(window, min_periods=1).std()

        # Clean features
        # fillna(method=...) is deprecated in pandas 2.x; ffill()/bfill() are the equivalents
        features_df = features_df.ffill().bfill().fillna(0)

        # Remove target from features
        if target_column in features_df.columns:
            features_df = features_df.drop(columns=[target_column])

        # 3. Create supervised dataset
        logger.info("Step 3: Creating supervised dataset...")
        target_series = data[target_column].shift(-1)  # Forecast next period

        # Align data
        common_index = features_df.index.intersection(target_series.index)
        X = features_df.loc[common_index]
        y = target_series.loc[common_index]

        # Remove last row (no target available)
        X = X.iloc[:-1]
        y = y.iloc[:-1]

        # Remove NaN targets
        valid_mask = ~y.isna()
        X = X[valid_mask]
        y = y[valid_mask]

        logger.info(f"  Final dataset: {X.shape} features, {y.shape} targets")

        # 4. Split data
        logger.info("Step 4: Splitting data...")
        train_size = int(len(X) * 0.75)
        X_train, X_test = X.iloc[:train_size], X.iloc[train_size:]
        y_train, y_test = y.iloc[:train_size], y.iloc[train_size:]

        logger.info(f"  Train: {X_train.shape}, Test: {X_test.shape}")

        # 5. Create and train models
        logger.info("Step 5: Training models...")
        config = DebugMLForecastingConfig(
            target_column=target_column,
            quick_search=quick_test,
            debug_mode=True
        )

        models = create_robust_models(config)
        results = []

        for i, model in enumerate(models, 1):
            logger.info(f"  [{i}/{len(models)}] Training {model.name}...")

            try:
                model.fit(X_train, y_train)

                if model.is_fitted:
                    predictions = model.predict(X_test)

                    if len(predictions) > 0 and not np.all(np.isnan(predictions)):
                        mae = mean_absolute_error(y_test.values, predictions)
                        rmse = np.sqrt(mean_squared_error(y_test.values, predictions))

                        results.append({
                            'Model': model.name,
                            'MAE': mae,
                            'RMSE': rmse,
                            'Status': 'Success'
                        })

                        logger.info(f"    SUCCESS - MAE: {mae:.3f}")
                    else:
                        logger.warning(f"    FAILED - Invalid predictions")
                else:
                    logger.warning(f"    FAILED - {model.fit_error}")

            except Exception as e:
                logger.error(f"    FAILED - {str(e)}")

        # 6. Results
        if results:
            results_df = pd.DataFrame(results)
            results_df = results_df.sort_values('MAE')

            logger.info("RESULTS:")
            logger.info("=" * 40)
            for i, (_, row) in enumerate(results_df.head(5).iterrows(), 1):
                logger.info(f"  {i}. {row['Model']}: MAE {row['MAE']:.3f}")

            return results_df
        else:
            logger.error("No successful models!")
            return pd.DataFrame()

    except Exception as e:
        logger.error(f"Pipeline failed: {str(e)}")
        return pd.DataFrame()

# ============================================================================
# AUTOMATIC EXECUTION
# ============================================================================

if __name__ == "__main__":
    print("ROBUST ML FORECASTING SYSTEM")
    print("=" * 50)
    print(f"CSV File: {CSV_FILE_PATH}")
    print(f"Target: {TARGET_COLUMN}")
    print(f"Date Column: {DATE_COLUMN}")
    print("=" * 50)

    try:
        results = run_robust_pipeline(
            csv_path=CSV_FILE_PATH,
            target_column=TARGET_COLUMN,
            date_column=DATE_COLUMN,
            quick_test=QUICK_TEST
        )

        if not results.empty:
            print("\nTOP MODELS:")
            print("-" * 30)
            for i, (_, row) in enumerate(results.head(5).iterrows(), 1):
                print(f"{i}. {row['Model']}: MAE {row['MAE']:.3f}")

            # Save results
            output_file = f"ml_results_{TARGET_COLUMN}.csv"
            results.to_csv(output_file, index=False)
            print(f"\nResults saved to: {output_file}")
        else:
            print("\nNo results generated")

    except Exception as e:
        print(f"Execution failed: {str(e)}")

print("\n" + "="*40)
print("MANUAL EXECUTION:")
print("="*40)
print("# To run manually:")
print(f'results = run_robust_pipeline("{CSV_FILE_PATH}", "{TARGET_COLUMN}", "{DATE_COLUMN}")')

ROBUST ML FORECASTING SYSTEM
CSV File: enhanced_eda_data.csv
Target: calls
Date Column: date

TOP MODELS:
------------------------------
1. ExtraTrees_100: MAE 688.027
2. LightGBM_100: MAE 810.669
3. RandomForest_100: MAE 836.143
4. XGBoost_100: MAE 960.436
5. Ridge_1.0: MAE 1196.221

Results saved to: ml_results_calls.csv

MANUAL EXECUTION:
# To run manually:
results = run_robust_pipeline("enhanced_eda_data.csv", "calls", "date")
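
Steps 2 to 4 of `run_robust_pipeline` turn the series into a supervised problem: features known at time `t` are paired with the target at `t + 1`, and the split is chronological rather than random. The sketch below reproduces that alignment on a toy series so it can be inspected by eye. The values and the names `toy`, `feats`, `X_toy` and `y_toy` are hypothetical, and only a subset of the lag and rolling features is built.

```python
import numpy as np
import pandas as pd

# Toy daily series (hypothetical values, for illustration only)
idx = pd.date_range("2024-01-01", periods=10, freq="D")
toy = pd.DataFrame({"calls": np.arange(100, 110, dtype=float)}, index=idx)

# Features available at time t: yesterday's value and a trailing mean that ends at t
feats = pd.DataFrame(index=toy.index)
feats["lag_1"] = toy["calls"].shift(1)
feats["rolling_mean_3"] = toy["calls"].rolling(3, min_periods=1).mean()
feats = feats.ffill().bfill()

# Target is the next period's value, so the last row has no target and is dropped
target = toy["calls"].shift(-1)
X_toy = feats.iloc[:-1]
y_toy = target.iloc[:-1]

# Chronological 75/25 split: every training row precedes every test row
split = int(len(X_toy) * 0.75)
X_tr, X_te = X_toy.iloc[:split], X_toy.iloc[split:]
y_tr, y_te = y_toy.iloc[:split], y_toy.iloc[split:]

print(X_tr.shape, X_te.shape)
print(X_tr.index.max() < X_te.index.min())
```

Because the backfill step copies the first available value backwards, the earliest rows carry slightly later information. With short lags this affects only a handful of rows at the start of the training set.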


In [14]:
# ============================================================================
# GUARANTEED 30+ MODELS PIPELINE
# ============================================================================

def create_all_30_models() -> List[RobustMLForecaster]:
    """Create 37 models (31 if XGBoost and LightGBM are unavailable)"""

    print("Creating ALL 30+ ML models...")

    models = []

    # LINEAR MODELS (10 models)
    print("  Adding Linear Models...")
    models.extend([
        RobustMLForecaster("LinearRegression", LinearRegression(), True),
        RobustMLForecaster("Ridge_0.1", Ridge(alpha=0.1), True),
        RobustMLForecaster("Ridge_1.0", Ridge(alpha=1.0), True),
        RobustMLForecaster("Ridge_10.0", Ridge(alpha=10.0), True),
        RobustMLForecaster("Lasso_0.1", Lasso(alpha=0.1, max_iter=2000), True),
        RobustMLForecaster("Lasso_1.0", Lasso(alpha=1.0, max_iter=2000), True),
        RobustMLForecaster("ElasticNet_0.1", ElasticNet(alpha=0.1, max_iter=2000), True),
        RobustMLForecaster("ElasticNet_1.0", ElasticNet(alpha=1.0, max_iter=2000), True),
        RobustMLForecaster("BayesianRidge", BayesianRidge(), True),
        RobustMLForecaster("HuberRegressor", HuberRegressor(max_iter=200), True),
    ])

    # TREE MODELS (9 models)
    print("  Adding Tree Models...")
    models.extend([
        RobustMLForecaster("RandomForest_50", RandomForestRegressor(n_estimators=50, random_state=42, n_jobs=1), True),
        RobustMLForecaster("RandomForest_100", RandomForestRegressor(n_estimators=100, random_state=42, n_jobs=1), True),
        RobustMLForecaster("RandomForest_200", RandomForestRegressor(n_estimators=200, random_state=42, n_jobs=1), True),
        RobustMLForecaster("ExtraTrees_50", ExtraTreesRegressor(n_estimators=50, random_state=42, n_jobs=1), True),
        RobustMLForecaster("ExtraTrees_100", ExtraTreesRegressor(n_estimators=100, random_state=42, n_jobs=1), True),
        RobustMLForecaster("ExtraTrees_200", ExtraTreesRegressor(n_estimators=200, random_state=42, n_jobs=1), True),
        RobustMLForecaster("GradientBoosting_50", GradientBoostingRegressor(n_estimators=50, random_state=42), True),
        RobustMLForecaster("GradientBoosting_100", GradientBoostingRegressor(n_estimators=100, random_state=42), True),
        RobustMLForecaster("DecisionTree", DecisionTreeRegressor(random_state=42, max_depth=10), True),
    ])

    # XGBOOST MODELS (3 models)
    if XGB_AVAILABLE:
        print("  Adding XGBoost Models...")
        models.extend([
            RobustMLForecaster("XGBoost_50", xgb.XGBRegressor(n_estimators=50, random_state=42, n_jobs=1), True),
            RobustMLForecaster("XGBoost_100", xgb.XGBRegressor(n_estimators=100, random_state=42, n_jobs=1), True),
            RobustMLForecaster("XGBoost_200", xgb.XGBRegressor(n_estimators=200, random_state=42, n_jobs=1), True),
        ])

    # LIGHTGBM MODELS (3 models)
    if LGB_AVAILABLE:
        print("  Adding LightGBM Models...")
        models.extend([
            RobustMLForecaster("LightGBM_50", lgb.LGBMRegressor(n_estimators=50, random_state=42, verbose=-1, n_jobs=1), True),
            RobustMLForecaster("LightGBM_100", lgb.LGBMRegressor(n_estimators=100, random_state=42, verbose=-1, n_jobs=1), True),
            RobustMLForecaster("LightGBM_200", lgb.LGBMRegressor(n_estimators=200, random_state=42, verbose=-1, n_jobs=1), True),
        ])

    # NEURAL NETWORK MODELS (3 models)
    print("  Adding Neural Network Models...")
    models.extend([
        RobustMLForecaster("MLP_50", MLPRegressor(hidden_layer_sizes=(50,), random_state=42, max_iter=1000, early_stopping=True, validation_fraction=0.1), True),
        RobustMLForecaster("MLP_100", MLPRegressor(hidden_layer_sizes=(100,), random_state=42, max_iter=1000, early_stopping=True, validation_fraction=0.1), True),
        RobustMLForecaster("MLP_100_50", MLPRegressor(hidden_layer_sizes=(100, 50), random_state=42, max_iter=1000, early_stopping=True, validation_fraction=0.1), True),
    ])

    # ENSEMBLE MODELS (2 models)
    print("  Adding Ensemble Models...")
    models.extend([
        RobustMLForecaster("BaggingRegressor", BaggingRegressor(random_state=42, n_jobs=1), True),
        RobustMLForecaster("AdaBoostRegressor", AdaBoostRegressor(random_state=42, n_estimators=50), True),
    ])

    # OTHER MODELS (7 models)
    print("  Adding Other Models...")
    models.extend([
        RobustMLForecaster("SVR_linear", SVR(kernel='linear', C=1.0), True),
        RobustMLForecaster("SVR_rbf", SVR(kernel='rbf', C=1.0), True),
        RobustMLForecaster("SVR_poly", SVR(kernel='poly', C=1.0, degree=3), True),
        RobustMLForecaster("KNeighbors_3", KNeighborsRegressor(n_neighbors=3), True),
        RobustMLForecaster("KNeighbors_5", KNeighborsRegressor(n_neighbors=5), True),
        RobustMLForecaster("KNeighbors_10", KNeighborsRegressor(n_neighbors=10), True),
        RobustMLForecaster("KernelRidge", KernelRidge(alpha=1.0), True),
    ])

    print(f"✅ Created {len(models)} total models")

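    # Count by type ('Ridge' also matches 'KernelRidge', which is counted under "Other", so exclude it here)
    linear_count = len([m for m in models if m.name != 'KernelRidge' and any(x in m.name for x in ['Linear', 'Ridge', 'Lasso', 'Elastic', 'Bayesian', 'Huber'])])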
    tree_count = len([m for m in models if any(x in m.name for x in ['Forest', 'Extra', 'Gradient', 'Decision'])])
    xgb_count = len([m for m in models if 'XGBoost' in m.name])
    lgb_count = len([m for m in models if 'LightGBM' in m.name])
    neural_count = len([m for m in models if 'MLP' in m.name])
    ensemble_count = len([m for m in models if any(x in m.name for x in ['Bagging', 'Ada'])])
    other_count = len([m for m in models if any(x in m.name for x in ['SVR', 'KNeighbors', 'Kernel'])])

    print(f"  Linear models: {linear_count}")
    print(f"  Tree models: {tree_count}")
    print(f"  XGBoost models: {xgb_count}")
    print(f"  LightGBM models: {lgb_count}")
    print(f"  Neural networks: {neural_count}")
    print(f"  Ensemble models: {ensemble_count}")
    print(f"  Other models: {other_count}")

    return models

def run_all_30_models_pipeline():
    """Guaranteed pipeline that runs ALL 30+ models"""

    print("🚀 RUNNING GUARANTEED 30+ MODELS PIPELINE")
    print("=" * 60)

    try:
        # 1. Load data
        print("Step 1: Loading CSV data...")
        data = load_csv_data_robust(CSV_FILE_PATH, TARGET_COLUMN, DATE_COLUMN)

        # 2. Feature engineering
        print("\nStep 2: Creating features...")
        features_df = data.copy()

        # Add lag features
        for lag in [1, 2, 3, 7, 14]:
            if lag < len(data):
                features_df[f'lag_{lag}'] = data[TARGET_COLUMN].shift(lag)

        # Add rolling features
        for window in [3, 7, 14]:
            if window < len(data):
                features_df[f'rolling_mean_{window}'] = data[TARGET_COLUMN].rolling(window, min_periods=1).mean()
                features_df[f'rolling_std_{window}'] = data[TARGET_COLUMN].rolling(window, min_periods=1).std()

        # Add simple market features if available
        if '^VIX_close' in data.columns:
            features_df['vix_lag1'] = data['^VIX_close'].shift(1)
            features_df['vix_change'] = data['^VIX_close'].diff()

        if 'SPY_close' in data.columns:
            features_df['spy_return'] = data['SPY_close'].pct_change()

        # Clean features
        # fillna(method=...) is deprecated in pandas 2.x; ffill()/bfill() are the equivalents
        features_df = features_df.ffill().bfill().fillna(0)

        # Remove target from features
        if TARGET_COLUMN in features_df.columns:
            features_df = features_df.drop(columns=[TARGET_COLUMN])

        print(f"  Created {features_df.shape[1]} features")

        # 3. Create supervised dataset
        print("\nStep 3: Creating supervised dataset...")
        target_series = data[TARGET_COLUMN].shift(-1)  # Forecast next period

        # Align data
        common_index = features_df.index.intersection(target_series.index)
        X = features_df.loc[common_index]
        y = target_series.loc[common_index]

        # Remove last row (no target available) and NaN targets
        X = X.iloc[:-1]
        y = y.iloc[:-1]
        valid_mask = ~y.isna()
        X = X[valid_mask]
        y = y[valid_mask]

        print(f"  Final dataset: {X.shape} features, {y.shape} targets")

        # 4. Split data
        print("\nStep 4: Splitting data...")
        train_size = int(len(X) * 0.75)
        X_train, X_test = X.iloc[:train_size], X.iloc[train_size:]
        y_train, y_test = y.iloc[:train_size], y.iloc[train_size:]

        print(f"  Train: {X_train.shape}, Test: {X_test.shape}")

        # 5. Create ALL models
        print("\nStep 5: Creating ALL 30+ models...")
        all_models = create_all_30_models()

        # 6. Train ALL models
        print(f"\nStep 6: Training ALL {len(all_models)} models...")
        results = []

        for i, model in enumerate(all_models, 1):
            print(f"  [{i:2d}/{len(all_models):2d}] Training {model.name}...")

            try:
                model.fit(X_train, y_train)

                if model.is_fitted:
                    predictions = model.predict(X_test)

                    if len(predictions) > 0 and not np.all(np.isnan(predictions)):
                        mae = mean_absolute_error(y_test.values, predictions)
                        rmse = np.sqrt(mean_squared_error(y_test.values, predictions))

                        results.append({
                            'Model': model.name,
                            'MAE': mae,
                            'RMSE': rmse,
                            'Status': 'Success'
                        })

                        print(f"    ✅ SUCCESS - MAE: {mae:.3f}")
                    else:
                        print(f"    ❌ FAILED - Invalid predictions")
                        results.append({
                            'Model': model.name,
                            'MAE': 9999.0,
                            'RMSE': 9999.0,
                            'Status': 'Failed'
                        })
                else:
                    print(f"    ❌ FAILED - {model.fit_error or 'Unknown error'}")
                    results.append({
                        'Model': model.name,
                        'MAE': 9999.0,
                        'RMSE': 9999.0,
                        'Status': 'Failed'
                    })

            except Exception as e:
                print(f"    ❌ FAILED - {str(e)}")
                results.append({
                    'Model': model.name,
                    'MAE': 9999.0,
                    'RMSE': 9999.0,
                    'Status': 'Failed'
                })

        # 7. Results
        print(f"\nStep 7: Processing results...")
        if results:
            results_df = pd.DataFrame(results)
            results_df = results_df.sort_values('MAE')

            # Save results
            output_file = f"all_models_results_{TARGET_COLUMN}.csv"
            results_df.to_csv(output_file, index=False)

            successful = len(results_df[results_df['Status'] == 'Success'])
            total = len(results_df)

            print(f"✅ TRAINING COMPLETE!")
            print(f"  Total models: {total}")
            print(f"  Successful: {successful}")
            print(f"  Failed: {total - successful}")
            print(f"  Results saved to: {output_file}")

            return results_df
        else:
            print("❌ No model results were recorded!")
            return pd.DataFrame()

    except Exception as e:
        print(f"❌ Pipeline failed: {str(e)}")
        import traceback
        traceback.print_exc()
        return pd.DataFrame()

# ============================================================================
# RUN THE 30+ MODELS PIPELINE
# ============================================================================

print("ABOUT TO RUN ALL 30+ MODELS...")
print("This will take several minutes but you'll get the complete results.")
print()

# Run the guaranteed pipeline
all_results = run_all_30_models_pipeline()

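if not all_results.empty:
    # Failed models are also recorded in all_results, so report and rank only the successful ones
    successful_results = all_results[all_results['Status'] == 'Success']
    print(f"\n🎉 SUCCESS! Trained {len(successful_results)} of {len(all_results)} models")
    print("\nTop 5 models:")
    for i, (_, row) in enumerate(successful_results.head(5).iterrows(), 1):
        print(f"  {i}. {row['Model']}: MAE {row['MAE']:.3f}")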
else:
    print("\n❌ Pipeline failed to generate results")

print("\n" + "="*60)
print("NOW RUN THE ANALYSIS CELL TO SEE ALL MODELS VS BASELINE!")
print("="*60)

ABOUT TO RUN ALL 30+ MODELS...
This will take several minutes but you'll get the complete results.

🚀 RUNNING GUARANTEED 30+ MODELS PIPELINE
Step 1: Loading CSV data...

Step 2: Creating features...
  Created 33 features

Step 3: Creating supervised dataset...
  Final dataset: (975, 33) features, (975,) targets

Step 4: Splitting data...
  Train: (731, 33), Test: (244, 33)

Step 5: Creating ALL 30+ models...
Creating ALL 30+ ML models...
  Adding Linear Models...
  Adding Tree Models...
  Adding XGBoost Models...
  Adding LightGBM Models...
  Adding Neural Network Models...
  Adding Ensemble Models...
  Adding Other Models...
✅ Created 37 total models
  Linear models: 11
  Tree models: 9
  XGBoost models: 3
  LightGBM models: 3
  Neural networks: 3
  Ensemble models: 2
  Other models: 7

Step 6: Training ALL 37 models...
  [ 1/37] Training LinearRegression...
    ✅ SUCCESS - MAE: 1294.918
  [ 2/37] Training Ridge_0.1...
    ✅ SUCCESS - MAE: 1278.036
  [ 3/37] Training Ridge_1.0...
    

In [15]:
# ============================================================================
# FINAL ANALYSIS - ALL 30+ MODELS vs SEASONAL BASELINE
# ============================================================================

import pandas as pd
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

print("="*80)
print("FINAL COMPREHENSIVE ANALYSIS - ALL MODELS vs SEASONAL BASELINE")
print("="*80)

# Load the data to calculate baseline
data = pd.read_csv("enhanced_eda_data.csv")
if "date" in data.columns:
    data["date"] = pd.to_datetime(data["date"])
    data = data.set_index("date").sort_index()

# Calculate seasonal baseline (7-day naive)
print("Calculating Seasonal Baseline...")
target_series = data["calls"]
train_size = int(len(target_series) * 0.75)
train_data = target_series.iloc[:train_size]
test_data = target_series.iloc[train_size:]

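# Create seasonal predictions: each test day is forecast with the actual value
# from 7 days earlier. For i >= 7 that value lies inside the test period, so
# this is a rolling 7-day-ahead naive forecast, not one frozen at the end of training.
seasonal_predictions = []
for i in range(len(test_data)):
    lookup_idx = train_size + i - 7  # 7-day seasonal
    if lookup_idx >= 0 and lookup_idx < len(target_series):
        seasonal_predictions.append(target_series.iloc[lookup_idx])
    else:
        # Fallback for forecast days with fewer than 7 prior observations
        seasonal_predictions.append(train_data.iloc[-1])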

seasonal_predictions = np.array(seasonal_predictions)
baseline_mae = mean_absolute_error(test_data.values, seasonal_predictions)
baseline_rmse = np.sqrt(mean_squared_error(test_data.values, seasonal_predictions))

print(f"Seasonal Baseline (7-day):")
print(f"  MAE: {baseline_mae:.4f}")
print(f"  RMSE: {baseline_rmse:.4f}")
print()

# Load ML results
print("Loading ML model results...")
try:
    ml_results = pd.read_csv("all_models_results_calls.csv")
    print(f"Loaded {len(ml_results)} model results")
except FileNotFoundError:
    print("ERROR: all_models_results_calls.csv not found!")
    print("Please run the 30+ models pipeline first.")
    raise  # re-raise to stop the cell; exit() would shut down the notebook kernel

# Add baseline comparison
ml_results['Baseline_MAE'] = baseline_mae
ml_results['MAE_Improvement'] = baseline_mae - ml_results['MAE']
ml_results['Improvement_Pct'] = ((baseline_mae - ml_results['MAE']) / baseline_mae) * 100
ml_results['Beats_Baseline'] = ml_results['MAE'] < baseline_mae
ml_results['Performance'] = ml_results['Beats_Baseline'].apply(lambda x: '✓ BEAT' if x else '✗ FAILED')

# Sort by MAE (best first)
ml_results = ml_results.sort_values('MAE')

print("ALL MODELS DETAILED RESULTS:")
print("="*85)
print(f"{'Rank':<4} {'Model':<25} {'MAE':<8} {'vs Baseline':<12} {'Improvement':<12} {'Status':<8}")
print("-"*85)

for i, (_, row) in enumerate(ml_results.iterrows(), 1):
    rank = f"{i}."
    model = row['Model'][:24]
    mae = f"{row['MAE']:.4f}"
    improvement = f"{row['Improvement_Pct']:+.1f}%"
    vs_baseline = f"{row['MAE_Improvement']:+.4f}"
    status = row['Performance']

    print(f"{rank:<4} {model:<25} {mae:<8} {vs_baseline:<12} {improvement:<12} {status:<8}")

# Summary statistics
total_models = len(ml_results)
successful_models = ml_results['Beats_Baseline'].sum()
failed_models = total_models - successful_models
best_model = ml_results.iloc[0]

print("\n" + "="*80)
print("PERFORMANCE SUMMARY:")
print("="*80)
print(f"Seasonal Baseline MAE: {baseline_mae:.4f}")
print(f"Total ML Models: {total_models}")
print(f"Models that BEAT baseline: {successful_models} ({successful_models/total_models*100:.1f}%)")
print(f"Models that FAILED vs baseline: {failed_models} ({failed_models/total_models*100:.1f}%)")
print()

print("CHAMPION MODEL:")
print("-" * 30)
print(f"Model: {best_model['Model']}")
print(f"MAE: {best_model['MAE']:.4f}")
print(f"Improvement over baseline: {best_model['Improvement_Pct']:+.1f}%")
print(f"Absolute improvement: {best_model['MAE_Improvement']:+.4f}")
print()

# Model type performance
print("MODEL TYPE PERFORMANCE:")
print("-" * 50)

model_types = {
    'Linear': ['Linear', 'Ridge', 'Lasso', 'Elastic', 'Bayesian', 'Huber'],
    'Tree': ['Forest', 'Extra', 'Gradient', 'Decision'],
    'XGBoost': ['XGBoost'],
    'LightGBM': ['LightGBM'],
    'Neural': ['MLP'],
    'Ensemble': ['Bagging', 'Ada'],
    'Other': ['SVR', 'KNeighbors', 'Kernel']
}

for category, keywords in model_types.items():
    category_models = ml_results[ml_results['Model'].str.contains('|'.join(keywords), case=False)]
    if len(category_models) > 0:
        category_success = category_models['Beats_Baseline'].sum()
        category_total = len(category_models)
        success_rate = (category_success / category_total) * 100
        best_in_category = category_models.iloc[0]

        print(f"{category:>10}: {category_success:2d}/{category_total:2d} beat baseline ({success_rate:5.1f}%) | Best: {best_in_category['Model']:<20} (MAE: {best_in_category['MAE']:.4f})")

# Top performers
print("\nTOP 10 PERFORMERS:")
print("-" * 60)
for i, (_, row) in enumerate(ml_results.head(10).iterrows(), 1):
    status_icon = "✓" if row['Beats_Baseline'] else "✗"
    print(f"{i:2d}. {status_icon} {row['Model']:<25} MAE: {row['MAE']:.4f} ({row['Improvement_Pct']:+.1f}%)")

# Final insights
print("\n" + "="*80)
print("KEY INSIGHTS:")
print("="*80)

if successful_models == 0:
    print("🚨 CRITICAL: No ML models beat the seasonal baseline!")
    print("   → The 7-day seasonal pattern is extremely strong")
    print("   → Consider ensemble methods or seasonal decomposition")
    print("   → Stick with simple seasonal forecasting for now")
elif successful_models < total_models * 0.3:
    print(f"⚠️  MIXED: Only {successful_models}/{total_models} models beat baseline")
    print("   → Most models are not adding value over simple forecasting")
    print("   → Focus on the successful models for deployment")
    print("   → Consider why other models failed")
else:
    print(f"✅ SUCCESS: {successful_models}/{total_models} models beat baseline!")
    print("   → ML is adding significant value over naive forecasting")
    print("   → You have multiple good model options")
    print("   → Advanced ML is worthwhile for your use case")

best_improvement = ml_results['Improvement_Pct'].max()
if best_improvement > 20:
    print(f"\n🎯 OUTSTANDING: Best model improves baseline by {best_improvement:.1f}%!")
elif best_improvement > 10:
    print(f"\n👍 GOOD: Best model improves baseline by {best_improvement:.1f}%")
elif best_improvement > 5:
    print(f"\n📈 MODEST: Best model improves baseline by {best_improvement:.1f}%")
else:
    print(f"\n😐 MINIMAL: Best improvement is only {best_improvement:.1f}%")

# Save enhanced results
enhanced_file = "final_enhanced_results_calls.csv"
ml_results.to_csv(enhanced_file, index=False)
print(f"\n💾 Complete results saved to: {enhanced_file}")

print("\n" + "="*80)
print("ANALYSIS COMPLETE!")
print("="*80)

FINAL COMPREHENSIVE ANALYSIS - ALL MODELS vs SEASONAL BASELINE
Calculating Seasonal Baseline...
Seasonal Baseline (7-day):
  MAE: 767.6025
  RMSE: 1022.0483

Loading ML model results...
Loaded 37 model results
ALL MODELS DETAILED RESULTS:
Rank Model                     MAE      vs Baseline  Improvement  Status  
-------------------------------------------------------------------------------------
1.   ExtraTrees_50             675.0193 +92.5831     +12.1%       ✓ BEAT  
2.   ExtraTrees_200            682.7147 +84.8877     +11.1%       ✓ BEAT  
3.   ExtraTrees_100            690.2918 +77.3107     +10.1%       ✓ BEAT  
4.   LightGBM_50               691.0704 +76.5320     +10.0%       ✓ BEAT  
5.   LightGBM_100              733.9691 +33.6333     +4.4%        ✓ BEAT  
6.   BaggingRegressor          755.7873 +11.8152     +1.5%        ✓ BEAT  
7.   LightGBM_200              773.0110 -5.4086      -0.7%        ✗ FAILED
8.   RandomForest_100          793.9450 -26.3426     -3.4%        ✗ FAILED


### The Reality Check:

Only 6 of the 37 ML models (16.2%) beat a simple "use last week's value" forecast. This suggests your call volume data has a strong weekly seasonal pattern that most of these models, as configured here, do not improve on.
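
As a minimal sketch of what "use last week's value" means in practice, the function below scores a seasonal naive forecast on the tail of a series. It follows the same indexing as the analysis cell (each test point is predicted by the value 7 steps earlier), but the function name `seasonal_naive_mae` and the toy numbers are assumptions made for illustration, not part of the notebook.

```python
import numpy as np

def seasonal_naive_mae(values, train_size, season=7):
    """MAE of forecasting each test point with the value `season` steps earlier."""
    values = np.asarray(values, dtype=float)
    if train_size < season:
        raise ValueError("need at least one full season before the test period")
    actual = values[train_size:]
    # Shift the window back by one season so forecast[i] pairs with actual[i]
    forecast = values[train_size - season:len(values) - season]
    return float(np.mean(np.abs(actual - forecast)))

# Hypothetical weekly pattern, repeated for three weeks of "training" data
week = [100, 120, 130, 125, 110, 60, 50]

# Test week identical to the pattern: the naive forecast is exact
periodic = week * 4
print(seasonal_naive_mae(periodic, train_size=21))

# Test week shifted up by 10 calls per day: the naive forecast misses by 10 each day
shifted = week * 3 + [v + 10 for v in week]
print(seasonal_naive_mae(shifted, train_size=21))
```

The second call illustrates where a learned model can add value: the naive forecast cannot anticipate a level change until it shows up in the lagged values.
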
### What This Means:

- Your 7-day seasonal pattern is consistent enough that most of the algorithms tried add no value over it
- The seasonal baseline (MAE: 767.60) is a hard benchmark: the best model improves on it by only 12.1%
- ExtraTrees and LightGBM are the only model families in which more than one configuration beat the baseline (BaggingRegressor also edged past it, by 1.5%)
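
The improvement figures quoted here follow from the same arithmetic the analysis cell uses for `Improvement_Pct`. A short sketch, using two MAE values copied from the results table above (the helper name `improvement_pct` is an assumption for illustration):

```python
def improvement_pct(baseline_mae, model_mae):
    """Percentage reduction in MAE relative to the baseline (positive = better)."""
    return (baseline_mae - model_mae) / baseline_mae * 100

baseline = 767.6025  # seasonal baseline MAE from the analysis output
for name, mae in [("ExtraTrees_50", 675.0193), ("LightGBM_200", 773.0110)]:
    print(f"{name}: {improvement_pct(baseline, mae):+.1f}%")
# ExtraTrees_50: +12.1%
# LightGBM_200: -0.7%
```
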

### Model Type Insights:

- Tree models produced the three best results (3/9 beat the baseline, all of them ExtraTrees) - they can capture non-linear seasonal interactions
- LightGBM had the highest success rate (2/3) - LightGBM_200 missed the baseline by only 0.7%
- All linear models failed (0/12; the pipeline log lists 11 linear models, so the keyword grouping may count one model from another family here) - suggests non-linear patterns in the data
- Neural networks performed terribly - likely overfitting on the complex seasonal patterns
- XGBoost surprisingly failed despite usually being strong for time series
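
The per-family counts depend on how model names are mapped to families. The analysis cell uses case-insensitive substring matching, so a name that contains keywords from two families is counted in both. The sketch below contrasts that with ordered, prefix-anchored rules that place each model in exactly one family. `KernelRidge` and `BayesianRidge` are hypothetical names chosen to show the overlap (the full model list is not shown in this section), and the rule table is an assumption, not the notebook's code.

```python
import re

# Subset of the keyword lists from the analysis cell
SUBSTRING_KEYWORDS = {
    "Linear": ["Linear", "Ridge", "Lasso", "Elastic", "Bayesian", "Huber"],
    "Other": ["SVR", "KNeighbors", "Kernel"],
}

def substring_groups(model_name):
    """Every family whose keywords occur anywhere in the name (case-insensitive)."""
    lowered = model_name.lower()
    return [family for family, keywords in SUBSTRING_KEYWORDS.items()
            if any(k.lower() in lowered for k in keywords)]

# Ordered rules anchored at the start of the name: the first match wins,
# so every model lands in exactly one family.
CATEGORY_RULES = [
    ("Other", r"^(SVR|KNeighbors|Kernel)"),
    ("XGBoost", r"^XGBoost"),
    ("LightGBM", r"^LightGBM"),
    ("Neural", r"^MLP"),
    ("Ensemble", r"^(Bagging|Ada)"),
    ("Tree", r"^(RandomForest|ExtraTrees|Gradient|Decision)"),
    ("Linear", r"^(Linear|Ridge|Lasso|Elastic|Bayesian|Huber)"),
]

def categorize(model_name):
    """Return the single family a model name belongs to, or 'Unmatched'."""
    for family, pattern in CATEGORY_RULES:
        if re.search(pattern, model_name, flags=re.IGNORECASE):
            return family
    return "Unmatched"

for name in ["Ridge_0.1", "KernelRidge", "BayesianRidge", "ExtraTrees_50"]:
    print(f"{name:<15} substring: {substring_groups(name)}  anchored: {categorize(name)}")
```

With the substring approach, the hypothetical `KernelRidge` appears under both Linear and Other; with the anchored rules it appears once.
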

### Practical Recommendations:

- Use ExtraTrees_50 as your production model (a 12.1% reduction in MAE, from 767.60 to 675.02)
- Keep the seasonal baseline as a backup - it is robust and costs nothing to compute
- Consider ensemble approaches combining the top 3-4 tree models
- Focus on tree-based algorithms for future improvements rather than linear or neural approaches
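
One simple way to combine the top tree models is to average their predictions. The sketch below assumes the models are already fitted and expose a scikit-learn style `predict(X)` method. `average_ensemble_predict` and the `ConstantModel` stand-in are hypothetical names used only to keep the example runnable without training anything.

```python
import numpy as np

def average_ensemble_predict(models, X):
    """Return the unweighted mean of the predictions of already-fitted regressors."""
    predictions = np.column_stack(
        [np.asarray(m.predict(X), dtype=float) for m in models]
    )
    return predictions.mean(axis=1)

class ConstantModel:
    """Stand-in for a fitted regressor; it always predicts the same value."""
    def __init__(self, value):
        self.value = value

    def predict(self, X):
        return np.full(len(X), self.value, dtype=float)

# Four hypothetical rows with two features each; the values are irrelevant here
X_demo = np.zeros((4, 2))
models = [ConstantModel(600.0), ConstantModel(700.0), ConstantModel(800.0)]

# Each entry is the mean of the three model outputs for that row
print(average_ensemble_predict(models, X_demo))
```

In a real run the list would hold the fitted ExtraTrees and LightGBM estimators instead of the stand-ins, and the averaged forecast should be scored against the same test split and the same seasonal baseline before it replaces ExtraTrees_50. scikit-learn's `VotingRegressor` offers the same averaging with fitting included.
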

The fact that most models failed isn't a failure of your analysis - it is useful evidence that your call patterns are dominated by a predictable weekly cycle, and that simple methods work well for this use case.