---
title: "Stock Forecasting using ARIMA, Prophet & LSTM (INSTRUCTOR VERSION)"
week: 4
author: "Praveen Kumar"
date: 2025-10-07
version: v1.0
instructor_only: true
---

# Week 4: Stock Forecasting using ARIMA, Prophet & LSTM
## **INSTRUCTOR VERSION** 🎓

This notebook contains **complete solutions** for all exercises and additional teaching notes.

⚠️ **CONFIDENTIAL**: Do not share this version with students.

In [None]:
# Parameters
SEED = 42
SAMPLE_MODE = True  # Use subset for faster execution
DATA_PATH = "data/synthetic/stock_prices.csv"
SYMBOL = "AAPL"  # Stock symbol to forecast

# INSTRUCTOR ONLY: Additional parameters for advanced models
ADVANCED_MODELS = True

In [None]:
# INSTRUCTOR ONLY: Setup with additional imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

# Time series specific
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.seasonal import seasonal_decompose

# Prophet
try:
    from prophet import Prophet
    PROPHET_AVAILABLE = True
except ImportError:
    print("Prophet not available. Install it with `pip install prophet` to run the Prophet sections.")
    PROPHET_AVAILABLE = False

# Scikit-learn utilities (the metrics are used by every model, so import them unconditionally)
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Deep Learning
try:
    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense, Dropout
    TF_AVAILABLE = True
except ImportError:
    print("TensorFlow not available for LSTM modeling")
    TF_AVAILABLE = False

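# Financial data (optional: the loader falls back to synthetic data if this is missing)
try:
    import yfinance as yf
except ImportError:
    print("yfinance not available. Synthetic data will be used.")
    yf = None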

# Set random seeds
np.random.seed(SEED)
if TF_AVAILABLE:
    tf.random.set_seed(SEED)

print("🎓 INSTRUCTOR VERSION - Setup complete!")

## INSTRUCTOR NOTE: Data Loading Strategy

**Teaching Point**: Always provide robust data loading with multiple fallback options for classroom environments.
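
The fallback idea can be sketched independently of any particular data source. The helper below is a hypothetical illustration (`load_with_fallbacks` and the two toy loaders are not part of this notebook): it tries each loader in order and reports which one succeeded, so a local CSV such as `DATA_PATH` could sit between the live download and the synthetic generator.

```python
def load_with_fallbacks(loaders):
    """Try (name, loader) pairs in order; return (data, name) from the first that works.

    Hypothetical helper for illustration; each loader is a zero-argument callable.
    """
    errors = []
    for name, loader in loaders:
        try:
            data = loader()
            if data is None or len(data) == 0:
                raise ValueError("loader returned no data")
            return data, name
        except Exception as exc:  # any failure simply moves on to the next source
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All data sources failed: " + "; ".join(errors))


def failing_download():
    # Stand-in for a network call that fails (e.g. no internet in the classroom)
    raise ConnectionError("network unavailable")


def synthetic_prices():
    # Stand-in for a synthetic generator that always succeeds
    return [100.0, 101.5, 99.8]


data, source = load_with_fallbacks([
    ("live", failing_download),
    ("synthetic", synthetic_prices),
])
print(source, data)
```

Returning the source name alongside the data mirrors the `(df, 'real')` / `(df, 'synthetic')` convention used in the cell below, so students can always see which path was taken.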

In [None]:
# INSTRUCTOR ONLY: Enhanced data loading with multiple fallbacks
def load_stock_data_instructor(symbol="AAPL", period="5y", sample_mode=True):
    """Enhanced data loading with comprehensive error handling"""
    try:
        print(f"🔄 Loading {symbol} data...")
        ticker = yf.Ticker(symbol)
        df = ticker.history(period=period)
        
        if df.empty:
            raise ValueError("No data returned from yfinance")
            
        # Convert to lowercase columns for consistency
        df.columns = df.columns.str.lower()
        
        # INSTRUCTOR: Quality checks
        print(f"✅ Data quality check:")
        print(f"   - Date range: {df.index.min().date()} to {df.index.max().date()}")
        print(f"   - Missing values: {df.isnull().sum().sum()}")
        print(f"   - Price range: ${df['close'].min():.2f} - ${df['close'].max():.2f}")
        
        # Subset for sample mode
        if sample_mode and len(df) > 1000:
            df = df.tail(1000)
            print(f"📊 Using last 1000 days for sample mode")
            
        print(f"✅ Successfully loaded {len(df)} days of {symbol} data")
        return df, 'real'
        
    except Exception as e:
        print(f"❌ Error loading {symbol}: {e}")
        print("🔄 Falling back to synthetic data...")
        return generate_synthetic_stock_data_instructor(sample_mode), 'synthetic'

def generate_synthetic_stock_data_instructor(sample_mode=True):
    """INSTRUCTOR: Generate realistic synthetic financial time series"""
    np.random.seed(SEED)
    
    n_days = 1000 if sample_mode else 2000
    dates = pd.date_range(start='2020-01-01', periods=n_days, freq='D')
    
    # INSTRUCTOR: More sophisticated synthetic data generation
    # 1. Generate returns with volatility clustering (GARCH-like)
    # Note: `volatility` stores the conditional variance; its square root is the daily std
    returns = np.zeros(n_days)
    volatility = np.zeros(n_days)
    volatility[0] = 0.02 ** 2  # Initial variance (2% daily standard deviation)
    
    for i in range(1, n_days):
        # GARCH(1,1)-type variance recursion: omega + alpha * r[t-1]^2 + beta * var[t-1]
        volatility[i] = 0.00001 + 0.05 * returns[i-1]**2 + 0.94 * volatility[i-1]
        returns[i] = np.random.normal(0.0005, np.sqrt(volatility[i]))
    
    # 2. Convert to prices
    prices = 100 * np.exp(np.cumsum(returns))
    
    # 3. Add weekly seasonality (higher returns on Fridays)
    weekday_effect = np.sin(2 * np.pi * np.arange(n_days) / 7) * 0.001
    prices *= np.exp(np.cumsum(weekday_effect))
    
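    # 4. Generate OHLC data
    daily_vol = np.random.uniform(0.005, 0.025, n_days)
    open_prices = np.roll(prices, 1)  # Open at the previous day's close
    open_prices[0] = prices[0]
    # Keep the high/low range wide enough to contain both open and close
    high_prices = np.maximum(prices, open_prices) * (1 + daily_vol)
    low_prices = np.minimum(prices, open_prices) * (1 - daily_vol)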
    
    df = pd.DataFrame({
        'close': prices,
        'open': open_prices,
        'high': high_prices,
        'low': low_prices,
        'volume': np.random.lognormal(15, 0.5, n_days).astype(int)
    }, index=dates)
    
    print(f"✅ Generated {len(df)} days of sophisticated synthetic data")
    print(f"   - Includes volatility clustering and weekly seasonality")
    return df

# Load data with enhanced function
df, data_source = load_stock_data_instructor(SYMBOL, period="5y", sample_mode=SAMPLE_MODE)

print(f"\n📊 Dataset Summary:")
print(f"Source: {data_source}")
print(f"Shape: {df.shape}")
print(f"Date range: {df.index.min().date()} to {df.index.max().date()}")
print(f"Missing values: {df.isnull().sum().sum()}")

## INSTRUCTOR NOTE: Advanced EDA

**Teaching Points:**
- Always decompose time series into trend, seasonal, and residual components
- Check for volatility clustering in financial data
- Demonstrate autocorrelation analysis

In [None]:
# INSTRUCTOR ONLY: Advanced Exploratory Data Analysis
def advanced_eda_instructor(df):
    """Comprehensive EDA for financial time series"""
    
    # Calculate returns
    returns = df['close'].pct_change().dropna()
    log_returns = np.log(df['close'] / df['close'].shift(1)).dropna()
    
    # Create comprehensive visualization
    fig, axes = plt.subplots(3, 3, figsize=(18, 15))
    
    # 1. Price and volume
    axes[0, 0].plot(df.index, df['close'], linewidth=1.5, color='blue')
    axes[0, 0].set_title('Stock Price Over Time')
    axes[0, 0].set_ylabel('Price ($)')
    axes[0, 0].grid(True, alpha=0.3)
    
    # 2. Volume
    axes[0, 1].bar(df.index, df['volume'], width=1, alpha=0.7, color='green')
    axes[0, 1].set_title('Trading Volume')
    axes[0, 1].set_ylabel('Volume')
    axes[0, 1].grid(True, alpha=0.3)
    
    # 3. Returns distribution
    axes[0, 2].hist(returns, bins=50, alpha=0.7, color='red', edgecolor='black')
    axes[0, 2].axvline(returns.mean(), color='blue', linestyle='--', 
                      label=f'Mean: {returns.mean():.4f}')
    axes[0, 2].set_title('Returns Distribution')
    axes[0, 2].set_xlabel('Daily Returns')
    axes[0, 2].legend()
    axes[0, 2].grid(True, alpha=0.3)
    
    # 4. Autocorrelation of returns (plot_acf is already imported in the setup cell)
    plot_acf(returns.dropna(), lags=40, ax=axes[1, 0], alpha=0.05)
    axes[1, 0].set_title('Returns Autocorrelation')
    
    # 5. Autocorrelation of squared returns (volatility clustering)
    plot_acf(returns.dropna()**2, lags=40, ax=axes[1, 1], alpha=0.05)
    axes[1, 1].set_title('Squared Returns ACF (Volatility Clustering)')
    
    # 6. Rolling volatility
    rolling_vol = returns.rolling(window=30).std() * np.sqrt(252)  # Annualized
    axes[1, 2].plot(rolling_vol.index, rolling_vol, color='purple', linewidth=1.5)
    axes[1, 2].set_title('30-day Rolling Volatility (Annualized)')
    axes[1, 2].set_ylabel('Volatility')
    axes[1, 2].grid(True, alpha=0.3)
    
    # 7. Q-Q plot for normality check
    from scipy import stats
    stats.probplot(returns.dropna(), dist="norm", plot=axes[2, 0])
    axes[2, 0].set_title('Q-Q Plot (Normality Check)')
    
    # 8. Seasonal decomposition (needs more than 2 years of weekly data)
    # Resample to weekly for seasonal decomposition
    weekly_prices = df['close'].resample('W').last()
    if len(df) > 365 and len(weekly_prices) > 104:
        decomposition = seasonal_decompose(weekly_prices, model='multiplicative', period=52)
        axes[2, 1].plot(decomposition.trend.dropna())
        axes[2, 1].set_title('Trend Component (Weekly)')
        axes[2, 1].grid(True, alpha=0.3)
        
        axes[2, 2].plot(decomposition.seasonal.iloc[:52])  # First year
        axes[2, 2].set_title('Seasonal Component (First Year)')
        axes[2, 2].grid(True, alpha=0.3)
    else:
        # Not enough history: label both panels instead of leaving them blank
        for ax in (axes[2, 1], axes[2, 2]):
            ax.text(0.5, 0.5, 'Insufficient data\nfor decomposition',
                    ha='center', va='center', transform=ax.transAxes)
    
    plt.tight_layout()
    plt.show()
    
    # Print statistical summary
    print("📊 STATISTICAL SUMMARY:")
    print(f"Returns - Mean: {returns.mean():.6f}, Std: {returns.std():.6f}")
    print(f"Annualized Volatility: {returns.std() * np.sqrt(252):.4f}")
    print(f"Skewness: {returns.skew():.4f}")
    print(f"Excess kurtosis (normal = 0): {returns.kurtosis():.4f}")
    
    # Test for normality
    from scipy.stats import jarque_bera
    jb_stat, jb_pvalue = jarque_bera(returns.dropna())
    print(f"Jarque-Bera test p-value: {jb_pvalue:.6f}")
    if jb_pvalue < 0.05:
        print("❌ Normality rejected at the 5% level: returns are NOT normally distributed")
    else:
        print("✅ Normality not rejected at the 5% level")
    
    return returns, log_returns

# Perform advanced EDA
returns_series, log_returns_series = advanced_eda_instructor(df)

## INSTRUCTOR SOLUTION: Exercise 1 - Exponential Smoothing Baseline

**Teaching Point**: Always include simple baselines to validate that complex models add value.
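
Before fitting anything, it helps to see how cheap a baseline can be. The sketch below uses toy numbers and two naive forecasts (repeat the last training value, repeat the training mean); the `rmse` helper and variable names are assumptions for illustration. Any model in this notebook should be compared against scores like these.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# Toy data, chosen only to keep the arithmetic easy to follow
train = np.array([1.0, 2.0, 3.0])
test = np.array([3.0, 3.0])

# Naive baseline 1: repeat the last observed training value
last_value_forecast = np.full(len(test), train[-1])
# Naive baseline 2: repeat the training mean
mean_forecast = np.full(len(test), train.mean())

baseline_scores = {
    "last_value": rmse(test, last_value_forecast),
    "train_mean": rmse(test, mean_forecast),
}
print(baseline_scores)
```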

In [None]:
# INSTRUCTOR ONLY: Complete solution for Exercise 1
def build_exponential_smoothing_instructor(train_data, seasonal_periods=None):
    """Build Holt-Winters exponential smoothing model"""
    
    try:
        # Determine if we have enough data for seasonality
        if seasonal_periods and len(train_data) >= 2 * seasonal_periods:
            model = ExponentialSmoothing(
                train_data,
                trend='add',
                seasonal='add',
                seasonal_periods=seasonal_periods
            )
            print(f"Using Holt-Winters with seasonality (period={seasonal_periods})")
        else:
            model = ExponentialSmoothing(train_data, trend='add')
            print("Using Holt's method (trend only)")
        
        fitted_model = model.fit(optimized=True)
        return fitted_model
        
    except Exception as e:
        print(f"Error with trend model: {e}")
        # Fall back to simple exponential smoothing
        model = ExponentialSmoothing(train_data)
        fitted_model = model.fit(optimized=True)
        print("Using Simple Exponential Smoothing")
        return fitted_model

# INSTRUCTOR: Implement the complete exercise solution
print("🎓 INSTRUCTOR SOLUTION: Exercise 1 - Exponential Smoothing")
print("="*60)

# Use the same data preparation as main notebook
target_series = returns_series
split_point = int(len(target_series) * 0.8)
train_data = target_series[:split_point].copy()
test_data = target_series[split_point:].copy()

# Build exponential smoothing model
exp_smooth_model = build_exponential_smoothing_instructor(train_data, seasonal_periods=5)

# Generate forecasts
exp_smooth_forecast = exp_smooth_model.forecast(steps=len(test_data))

# Calculate metrics
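def calculate_metrics_instructor(actual, predicted, model_name):
    """Enhanced metrics calculation with additional measures"""
    # Compare by position: forecast and test indexes do not always share labels,
    # and pandas would otherwise align on the index and produce NaNs or errors
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    
    mae = mean_absolute_error(actual, predicted)
    rmse = np.sqrt(mean_squared_error(actual, predicted))
    # MAPE is undefined where actual == 0 and unstable for near-zero returns; skip exact zeros
    nonzero = actual != 0
    mape = np.mean(np.abs((actual[nonzero] - predicted[nonzero]) / actual[nonzero])) * 100
    
    # Additional metric for instructor version: share of periods where the forecast
    # has the same sign as the realized return
    directional_accuracy = np.mean(np.sign(actual) == np.sign(predicted)) * 100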
    
    metrics = {
        'MAE': mae,
        'RMSE': rmse, 
        'MAPE': mape,
        'Directional_Accuracy': directional_accuracy
    }
    
    print(f"\n{model_name} Metrics:")
    for metric, value in metrics.items():
        print(f"  {metric}: {value:.6f}")
    
    return metrics

exp_smooth_metrics = calculate_metrics_instructor(test_data, exp_smooth_forecast, "Exponential Smoothing")

# Visualization comparison
plt.figure(figsize=(12, 6))
plt.plot(test_data.index, test_data.values, label='Actual', linewidth=2, color='black')
plt.plot(test_data.index, exp_smooth_forecast, label='Exp. Smoothing', linewidth=2, color='orange')
plt.title('Exponential Smoothing Forecast')
plt.ylabel('Returns')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

print("✅ Exercise 1 Complete: Exponential Smoothing implemented as baseline")

## INSTRUCTOR SOLUTION: Exercise 2 - Walk-Forward Validation

**Teaching Point**: Demonstrate realistic model evaluation using walk-forward validation to avoid look-ahead bias.
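
The splitting logic is easier to discuss when separated from model fitting. The generator below is a hypothetical helper (not part of the notebook) that yields expanding-window train/test index ranges; training always ends where testing begins, which is what rules out look-ahead bias.

```python
def walk_forward_splits(n_obs, initial_train, step):
    """Yield ((train_start, train_end), (test_start, test_end)) half-open index pairs.

    Hypothetical helper: the training window expands by `step` each iteration.
    """
    if initial_train <= 0 or step <= 0:
        raise ValueError("initial_train and step must be positive")
    start = initial_train
    while start < n_obs:
        end = min(start + step, n_obs)  # the final test window may be shorter
        yield (0, start), (start, end)
        start += step


splits = list(walk_forward_splits(n_obs=11, initial_train=4, step=3))
for (tr_start, tr_end), (te_start, te_end) in splits:
    print(f"train [{tr_start}:{tr_end}) -> test [{te_start}:{te_end})")
```

With 11 observations, an initial window of 4, and a step of 3, this produces three folds, the last of which tests on a single observation.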

In [None]:
# INSTRUCTOR ONLY: Complete solution for Exercise 2  
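def walk_forward_validation_instructor(data, model_type='arima', window_size=200, step_size=20):
    """Walk-forward validation with an expanding training window.

    window_size is the initial training length; each step adds step_size
    observations to the training set and forecasts the next step_size.
    """
    model_type = model_type.lower()
    if model_type not in ('arima', 'exp_smooth'):
        raise ValueError(f"Unsupported model_type: {model_type!r}")
    
    print("🎓 INSTRUCTOR SOLUTION: Exercise 2 - Walk-Forward Validation")
    print(f"Model: {model_type.upper()}, Initial window: {window_size}, Step: {step_size}")
    print("="*60)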
    
    all_forecasts = []
    all_actuals = []
    all_dates = []
    validation_metrics = []
    
    # Start once the initial training window is available
    start_idx = window_size
    
    # Continue until the data runs out; the final test window may be shorter than step_size
    while start_idx < len(data):
        # Expanding training window: all observations before start_idx
        train_end_idx = start_idx
        test_start_idx = start_idx
        test_end_idx = min(start_idx + step_size, len(data))
        
        # Get training and test data
        train_window = data.iloc[:train_end_idx]
        test_window = data.iloc[test_start_idx:test_end_idx]
        
        try:
            if model_type.lower() == 'arima':
                # Fit ARIMA model
                model = ARIMA(train_window, order=(1, 0, 1))  # Simple order for demo
                fitted_model = model.fit()
                forecast = fitted_model.forecast(steps=len(test_window))
                
            elif model_type.lower() == 'exp_smooth':
                # Fit exponential smoothing
                model = ExponentialSmoothing(train_window, trend='add')
                fitted_model = model.fit(optimized=True)
                forecast = fitted_model.forecast(steps=len(test_window))
                
            # Store results
            all_forecasts.extend(forecast)
            all_actuals.extend(test_window.values)
            all_dates.extend(test_window.index)
            
            # Calculate window metrics
            window_rmse = np.sqrt(mean_squared_error(test_window, forecast))
            window_mae = mean_absolute_error(test_window, forecast)
            
            validation_metrics.append({
                'window_end': train_window.index[-1],
                'forecast_start': test_window.index[0],
                'rmse': window_rmse,
                'mae': window_mae,
                'train_size': len(train_window),
                'test_size': len(test_window)
            })
            
            print(f"Window {len(validation_metrics):2d}: Train to {train_window.index[-1].date()}, "
                  f"Forecast {len(test_window)} days, RMSE: {window_rmse:.6f}")
            
        except Exception as e:
            print(f"Error in window ending {train_window.index[-1].date()}: {e}")
        
        # Move to next window
        start_idx += step_size
    
    # Convert to DataFrames for analysis
    results_df = pd.DataFrame({
        'Date': all_dates,
        'Actual': all_actuals, 
        'Forecast': all_forecasts
    })
    
    metrics_df = pd.DataFrame(validation_metrics)
    
    # Overall performance
    overall_rmse = np.sqrt(mean_squared_error(all_actuals, all_forecasts))
    overall_mae = mean_absolute_error(all_actuals, all_forecasts)
    
    print(f"\n📊 WALK-FORWARD VALIDATION RESULTS:")
    print(f"Total windows: {len(validation_metrics)}")
    print(f"Overall RMSE: {overall_rmse:.6f}")
    print(f"Overall MAE: {overall_mae:.6f}")
    print(f"Average window RMSE: {metrics_df['rmse'].mean():.6f}")
    print(f"RMSE std deviation: {metrics_df['rmse'].std():.6f}")
    
    # Visualization
    fig, axes = plt.subplots(2, 2, figsize=(16, 10))
    
    # Forecasts vs actuals
    axes[0, 0].plot(results_df['Date'], results_df['Actual'], label='Actual', alpha=0.7)
    axes[0, 0].plot(results_df['Date'], results_df['Forecast'], label='Forecast', alpha=0.7)
    axes[0, 0].set_title('Walk-Forward Forecasts vs Actual')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    
    # RMSE over time
    axes[0, 1].plot(metrics_df['window_end'], metrics_df['rmse'], marker='o')
    axes[0, 1].set_title('RMSE Over Time')
    axes[0, 1].set_ylabel('RMSE')
    axes[0, 1].grid(True, alpha=0.3)
    axes[0, 1].tick_params(axis='x', rotation=45)
    
    # Residuals
    residuals = results_df['Actual'] - results_df['Forecast']
    axes[1, 0].plot(results_df['Date'], residuals, alpha=0.7)
    axes[1, 0].axhline(y=0, color='red', linestyle='--')
    axes[1, 0].set_title('Forecast Residuals')
    axes[1, 0].set_ylabel('Residuals')
    axes[1, 0].grid(True, alpha=0.3)
    
    # Residuals distribution
    axes[1, 1].hist(residuals, bins=30, alpha=0.7, edgecolor='black')
    axes[1, 1].axvline(x=0, color='red', linestyle='--')
    axes[1, 1].set_title('Residuals Distribution')
    axes[1, 1].set_xlabel('Residuals')
    axes[1, 1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    return results_df, metrics_df

# Execute walk-forward validation
wf_results, wf_metrics = walk_forward_validation_instructor(
    target_series, 
    model_type='arima', 
    window_size=200, 
    step_size=30
)

print("✅ Exercise 2 Complete: Walk-forward validation implemented")

## INSTRUCTOR SOLUTION: Exercise 3 - LSTM Hyperparameter Tuning

**Teaching Point**: Show systematic hyperparameter optimization and the impact of lookback windows.
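
To make the lookback trade-off concrete, the sketch below turns a toy series into supervised samples and shows how the number of training samples shrinks as the lookback grows. `make_windows` is an illustrative helper built on NumPy's `sliding_window_view`; it is an assumption for this example, not the function used in the solution cell.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(series, lookback):
    """Return (X, y) where each row of X holds `lookback` past values and y the next value."""
    series = np.asarray(series, dtype=float)
    if lookback >= len(series):
        raise ValueError("lookback must be shorter than the series")
    windows = sliding_window_view(series, lookback)  # shape: (n - lookback + 1, lookback)
    X = windows[:-1]                                 # the last window has no target after it
    y = series[lookback:]
    return X[..., np.newaxis], y                     # X shape: (samples, lookback, 1)


X, y = make_windows(np.arange(6), lookback=3)
print(X[:, :, 0])
print(y)

# Longer lookbacks leave fewer samples to train on
series = np.arange(100)
sample_counts = {lb: make_windows(series, lb)[0].shape[0] for lb in (10, 20, 30, 50)}
print(sample_counts)
```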

In [None]:
# INSTRUCTOR ONLY: Complete solution for Exercise 3
def lstm_hyperparameter_tuning_instructor(train_data, test_data, lookback_values=(10, 20, 30, 50)):
    """Systematic LSTM hyperparameter tuning"""
    
    print("🎓 INSTRUCTOR SOLUTION: Exercise 3 - LSTM Hyperparameter Tuning")
    print("="*60)
    
    if not TF_AVAILABLE:
        print("❌ TensorFlow not available. Skipping LSTM tuning.")
        return [], None  # Same shape as the normal return value so callers can unpack it
    
    def create_lstm_dataset_instructor(data, lookback):
        """Create supervised dataset for LSTM"""
        X, y = [], []
        for i in range(lookback, len(data)):
            X.append(data[i-lookback:i])
            y.append(data[i])
        return np.array(X), np.array(y)
    
    def build_and_evaluate_lstm(train_data, test_data, lookback, epochs=30):
        """Build LSTM with specific lookback and evaluate"""
        
        # Scale data
        scaler = MinMaxScaler(feature_range=(0, 1))
        train_scaled = scaler.fit_transform(train_data.values.reshape(-1, 1)).flatten()
        
        # Create supervised dataset
        X_train, y_train = create_lstm_dataset_instructor(train_scaled, lookback)
        X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
        
        # Build model with early stopping
        model = Sequential([
            LSTM(50, return_sequences=True, input_shape=(lookback, 1)),
            Dropout(0.2),
            LSTM(50, return_sequences=False),
            Dropout(0.2),
            Dense(1)
        ])
        
        model.compile(optimizer='adam', loss='mse')
        
        # Early stopping callback
        early_stopping = tf.keras.callbacks.EarlyStopping(
            monitor='val_loss', patience=10, restore_best_weights=True
        )
        
        # Train model
        history = model.fit(
            X_train, y_train,
            epochs=epochs,
            batch_size=32,
            validation_split=0.2,
            verbose=0,
            shuffle=False,
            callbacks=[early_stopping]
        )
        
        # Generate one-step-ahead predictions; each window contains actual past values
        full_data = pd.concat([train_data, test_data])
        full_scaled = scaler.transform(full_data.values.reshape(-1, 1)).flatten()
        
        # Batch all test windows into a single predict call instead of one call per day
        X_test = np.array([
            full_scaled[i-lookback:i] for i in range(len(train_data), len(full_data))
        ]).reshape(-1, lookback, 1)
        pred_scaled = model.predict(X_test, verbose=0)
        predictions = scaler.inverse_transform(pred_scaled.reshape(-1, 1)).flatten()
        
        # Calculate metrics
        rmse = np.sqrt(mean_squared_error(test_data, predictions))
        mae = mean_absolute_error(test_data, predictions)
        
        return {
            'lookback': lookback,
            'rmse': rmse,
            'mae': mae,
            'predictions': predictions,
            'history': history,
            'final_val_loss': min(history.history['val_loss'])  # Best (lowest) validation loss, despite the key name
        }
    
    # Test different lookback values
    results = []
    
    for lookback in lookback_values:
        print(f"Testing lookback window: {lookback} days...")
        
        try:
            result = build_and_evaluate_lstm(train_data, test_data, lookback)
            results.append(result)
            print(f"  RMSE: {result['rmse']:.6f}, MAE: {result['mae']:.6f}")
            
        except Exception as e:
            print(f"  Error with lookback {lookback}: {e}")
    
    if not results:
        print("❌ No successful LSTM runs")
        return [], None  # Keep the (results, best_result) shape so callers can unpack it
    
    # Find best configuration
    best_result = min(results, key=lambda x: x['rmse'])
    print(f"\n🏆 Best Configuration:")
    print(f"Lookback: {best_result['lookback']} days")
    print(f"RMSE: {best_result['rmse']:.6f}")
    print(f"MAE: {best_result['mae']:.6f}")
    
    # Comprehensive visualization
    fig, axes = plt.subplots(2, 3, figsize=(18, 10))
    
    # Performance comparison
    lookbacks = [r['lookback'] for r in results]
    rmses = [r['rmse'] for r in results]
    maes = [r['mae'] for r in results]
    
    axes[0, 0].plot(lookbacks, rmses, 'bo-', linewidth=2, markersize=8)
    axes[0, 0].set_title('RMSE vs Lookback Window')
    axes[0, 0].set_xlabel('Lookback Window (days)')
    axes[0, 0].set_ylabel('RMSE')
    axes[0, 0].grid(True, alpha=0.3)
    
    axes[0, 1].plot(lookbacks, maes, 'ro-', linewidth=2, markersize=8)
    axes[0, 1].set_title('MAE vs Lookback Window')
    axes[0, 1].set_xlabel('Lookback Window (days)')
    axes[0, 1].set_ylabel('MAE')
    axes[0, 1].grid(True, alpha=0.3)
    
    # Best model forecast
    axes[0, 2].plot(test_data.index, test_data.values, label='Actual', linewidth=2)
    axes[0, 2].plot(test_data.index, best_result['predictions'], 
                   label=f'LSTM (lookback={best_result["lookback"]})', linewidth=2)
    axes[0, 2].set_title('Best LSTM Forecast')
    axes[0, 2].legend()
    axes[0, 2].grid(True, alpha=0.3)
    
    # Training histories
    colors = ['blue', 'red', 'green', 'orange', 'purple']
    for i, result in enumerate(results[:3]):  # Show first 3 for clarity
        axes[1, 0].plot(result['history'].history['loss'], 
                       color=colors[i], label=f'Lookback {result["lookback"]}')
    axes[1, 0].set_title('Training Loss Comparison')
    axes[1, 0].set_xlabel('Epoch')
    axes[1, 0].set_ylabel('Loss')
    axes[1, 0].legend()
    axes[1, 0].grid(True, alpha=0.3)
    
    # Validation loss
    for i, result in enumerate(results[:3]):
        axes[1, 1].plot(result['history'].history['val_loss'], 
                       color=colors[i], label=f'Lookback {result["lookback"]}')
    axes[1, 1].set_title('Validation Loss Comparison')
    axes[1, 1].set_xlabel('Epoch')
    axes[1, 1].set_ylabel('Validation Loss')
    axes[1, 1].legend()
    axes[1, 1].grid(True, alpha=0.3)
    
    # Residuals analysis for best model
    residuals = test_data.values - best_result['predictions']
    axes[1, 2].scatter(best_result['predictions'], residuals, alpha=0.6)
    axes[1, 2].axhline(y=0, color='red', linestyle='--')
    axes[1, 2].set_title('Residuals vs Predictions (Best Model)')
    axes[1, 2].set_xlabel('Predictions')
    axes[1, 2].set_ylabel('Residuals')
    axes[1, 2].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Summary table
    results_df = pd.DataFrame([
        {
            'Lookback': r['lookback'],
            'RMSE': r['rmse'],
            'MAE': r['mae'],
            'Final_Val_Loss': r['final_val_loss']
        }
        for r in results
    ])
    
    print(f"\n📊 HYPERPARAMETER TUNING RESULTS:")
    print(results_df.to_string(index=False, float_format='%.6f'))
    
    return results, best_result

# Execute LSTM hyperparameter tuning
if TF_AVAILABLE:
    lstm_results, best_lstm = lstm_hyperparameter_tuning_instructor(
        train_data, test_data, [10, 20, 30, 50]
    )
    # Only report success if at least one configuration trained
    if best_lstm is not None:
        print("✅ Exercise 3 Complete: LSTM hyperparameter tuning completed")
else:
    print("❌ TensorFlow not available - skipping Exercise 3")

## INSTRUCTOR NOTE: Teaching Summary

**Key Teaching Points for Week 4:**

### 1. Model Selection Guidance
- **ARIMA**: Best for series that are stationary, or become stationary after differencing, and show clear autocorrelation patterns
- **Prophet**: Excellent for business data with seasonality and missing values  
- **LSTM**: Powerful for non-linear patterns but typically needs far more data and tuning than ARIMA or Prophet

### 2. Common Student Mistakes
- Not checking stationarity before ARIMA modeling
- Using future information in time series splits (look-ahead bias)
- Not scaling data properly for LSTM
- Ignoring validation methodology (using random splits instead of temporal)
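
Two of these mistakes (look-ahead bias and improper scaling) often occur together. The sketch below, using hypothetical helper names and a toy trending series, splits by time and fits min-max scaling on the training portion only. Test values can then fall outside [0, 1], which is expected and confirms that no test information leaked into the scaler.

```python
import numpy as np


def temporal_split(series, train_fraction=0.8):
    """Split by position: the earliest observations train, the latest test."""
    series = np.asarray(series, dtype=float)
    split = int(len(series) * train_fraction)
    return series[:split], series[split:]


def fit_minmax(train):
    """Learn scaling bounds from the training data only."""
    return float(train.min()), float(train.max())


def apply_minmax(values, lo, hi):
    """Apply previously learned bounds to any data (train or test)."""
    return (values - lo) / (hi - lo)


series = np.arange(10, dtype=float)  # upward-trending toy series
train, test = temporal_split(series)
lo, hi = fit_minmax(train)

train_scaled = apply_minmax(train, lo, hi)
test_scaled = apply_minmax(test, lo, hi)

print("train range:", train_scaled.min(), train_scaled.max())
print("test range: ", test_scaled.min(), test_scaled.max())
```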

### 3. Practical Implementation Tips
- Always implement walk-forward validation for realistic performance assessment
- Use multiple metrics (RMSE, MAE, directional accuracy) for comprehensive evaluation
- Consider computational costs in real-world applications
- Ensembles (for example, averaging ARIMA and LSTM forecasts) often outperform individual models when the models' errors are not strongly correlated
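
As a rough illustration of the ensemble tip, the toy example below averages two forecasts whose errors have opposite signs. The numbers are contrived so the cancellation is visible; with real models the gain depends on how correlated their errors are, so it should be checked with walk-forward validation.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


actual = np.array([1.0, 2.0, 3.0, 4.0])
forecast_a = actual + 1.0   # toy model that always forecasts too high
forecast_b = actual - 0.5   # toy model that always forecasts too low

# Equal-weight ensemble: average the two forecasts point by point
ensemble = np.mean([forecast_a, forecast_b], axis=0)

scores = {
    "model_a": rmse(actual, forecast_a),
    "model_b": rmse(actual, forecast_b),
    "ensemble": rmse(actual, ensemble),
}
print(scores)
```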

### 4. Business Applications
- Short-term forecasting: ARIMA often sufficient
- Medium-term with seasonality: Prophet recommended
- Complex patterns with multiple features: LSTM
- Risk management: Focus on tail predictions and uncertainty quantification
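
One simple way to attach uncertainty to a point forecast is an empirical interval built from past forecast residuals (actual minus forecast). The sketch below is a minimal illustration with made-up residuals; `empirical_interval` is a hypothetical helper, and the approach assumes future errors resemble past ones, which is a strong assumption for financial series.

```python
import numpy as np


def empirical_interval(point_forecast, residuals, coverage=0.9):
    """Prediction interval from residual quantiles (residual = actual - forecast)."""
    residuals = np.asarray(residuals, dtype=float)
    alpha = 1.0 - coverage
    lower_q, upper_q = np.quantile(residuals, [alpha / 2, 1 - alpha / 2])
    return float(point_forecast + lower_q), float(point_forecast + upper_q)


# Made-up historical residuals, for illustration only
residuals = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
lower, upper = empirical_interval(10.0, residuals, coverage=0.5)
print(f"50% interval around 10.0: [{lower}, {upper}]")
```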