[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/danpele/Time-Series-Analysis/blob/main/chapter9_seminar_notebook.ipynb)

---

# Chapter 9 Seminar: Prophet and TBATS - Practice

**Course:** Time Series Analysis and Forecasting  
**Program:** Bachelor program, Faculty of Cybernetics, Statistics and Economic Informatics, Bucharest University of Economic Studies, Romania  
**Academic Year:** 2025-2026

---

## Seminar Objectives

In this practical seminar, you will:
1. Identify multiple seasonal patterns in time series data
2. Apply TBATS for high-frequency data with multiple seasonalities
3. Build and customize Prophet models
4. Add holiday effects and external regressors to Prophet
5. Compare TBATS vs Prophet performance

## Setup

In [None]:
# Install required packages (for Colab)
import sys
if 'google.colab' in sys.modules:
    !pip install prophet tbats -q

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

from sklearn.metrics import mean_squared_error, mean_absolute_error

# Check for Prophet
try:
    from prophet import Prophet
    HAS_PROPHET = True
except ImportError:
    HAS_PROPHET = False
    print("Prophet not installed. Install with: pip install prophet")

# Check for TBATS
try:
    from tbats import TBATS, BATS
    HAS_TBATS = True
except ImportError:
    HAS_TBATS = False
    print("TBATS not installed. Install with: pip install tbats")

plt.rcParams['figure.figsize'] = (12, 5)
plt.rcParams['font.size'] = 11
plt.rcParams['axes.facecolor'] = 'none'
plt.rcParams['figure.facecolor'] = 'none'
plt.rcParams['axes.grid'] = False
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False
plt.rcParams['legend.frameon'] = False

COLORS = {'blue': '#1A3A6E', 'red': '#DC3545', 'green': '#2E7D32', 'orange': '#E67E22', 'gray': '#666666'}

print(f"Prophet available: {HAS_PROPHET}")
print(f"TBATS available: {HAS_TBATS}")
print("Setup complete!")

## Exercise 1: Generate Data with Multiple Seasonalities

**Task:** Create synthetic data with daily and weekly seasonal patterns to understand multiple seasonality.

In [None]:
# SYNTHETIC DATA: Simulated electricity demand with stylized ERCOT/PJM-like patterns
# The hourly shape below is hand-built for illustration, not taken from grid operator records

np.random.seed(42)

# 4 weeks of hourly data with realistic electricity patterns
n_hours = 24 * 7 * 4  # 672 hours (4 weeks)
start_date = pd.Timestamp('2024-01-01')
dates = pd.date_range(start=start_date, periods=n_hours, freq='h')  # 'H' is deprecated in recent pandas

# Base load (MW) - typical for a medium-sized utility
base_load = 35000

# Hour of day effect (realistic daily pattern from ERCOT data)
hours = np.array([d.hour for d in dates])
# 24-hour cosine cycle plus a morning ramp (centered at 8am) and an evening peak (centered at 5pm)
daily_pattern = (
    -3000 * np.cos(2 * np.pi * hours / 24) +  # Base daily cycle
    2000 * np.exp(-((hours - 8) ** 2) / 10) +   # Morning ramp
    4000 * np.exp(-((hours - 17) ** 2) / 15)    # Evening peak
)

# Day of week effect (lower on weekends - real pattern)
day_of_week = np.array([d.dayofweek for d in dates])
weekly_pattern = np.where(day_of_week >= 5, -4000, 0)  # Weekend reduction

# Temperature effect (January = winter heating load)
day_of_year = np.array([d.dayofyear for d in dates])
temp_effect = 2000 * np.cos(2 * np.pi * day_of_year / 365)  # Winter higher

# Random variation (realistic noise level ~3%)
noise = np.random.normal(0, 1000, n_hours)

# Combined series
y = base_load + daily_pattern + weekly_pattern + temp_effect + noise

# Create DataFrame
df = pd.DataFrame({'ds': dates, 'y': y})

print("SYNTHETIC DATA: Electricity Demand (stylized ERCOT/PJM-like patterns)")
print("="*60)
print(f"Period: {df['ds'].min()} to {df['ds'].max()}")
print(f"Observations: {len(df)} hours (4 weeks)")
print(f"\nSeasonalities built into the series:")
print(f"  - Daily (s=24): Morning ramp + evening peak")
print(f"  - Weekly (s=168): 4,000 MW (roughly 10-15%) lower on weekends")
print(f"  - Annual: Winter-high cosine term (almost constant over 4 weeks)")
print(f"\nAverage load: {y.mean():.0f} MW")
print(f"Peak load: {y.max():.0f} MW")
print(f"Min load: {y.min():.0f} MW")

In [None]:
# Visualize the electricity demand patterns
fig, axes = plt.subplots(3, 1, figsize=(14, 10))

# Full series
axes[0].plot(df['ds'], df['y'], color=COLORS['blue'], linewidth=0.5, alpha=0.7)
axes[0].set_title('Electricity Demand: 4 Weeks of Hourly Data (ERCOT-style patterns)', fontweight='bold')
axes[0].set_ylabel('Demand (MW)')

# One week detail
one_week = df.iloc[:168]  # First week
axes[1].plot(one_week['ds'], one_week['y'], color=COLORS['green'], linewidth=1)
axes[1].set_title('One Week: Daily Pattern + Weekend Effect Visible', fontweight='bold')
axes[1].set_ylabel('Demand (MW)')

# Mark weekend
weekend_mask = (one_week['ds'].dt.dayofweek >= 5)
axes[1].fill_between(one_week['ds'], one_week['y'].min(), one_week['y'].max(),
                     where=weekend_mask, alpha=0.2, color='red', label='Weekend (lower demand)')
axes[1].legend(loc='upper right')

# Average daily pattern
df['hour'] = df['ds'].dt.hour
hourly_avg = df.groupby('hour')['y'].mean()
axes[2].bar(hourly_avg.index, hourly_avg.values, color=COLORS['orange'], alpha=0.7)
axes[2].set_title('Average Daily Pattern: Morning Ramp + Evening Peak', fontweight='bold')
axes[2].set_xlabel('Hour of Day')
axes[2].set_ylabel('Average Demand (MW)')
axes[2].set_xticks(range(0, 24, 2))

# Mark peak hours
axes[2].axvspan(16, 20, alpha=0.2, color='red', label='Peak hours')
axes[2].legend()

plt.tight_layout()
plt.show()

print("\nStylized patterns built into the synthetic series:")
print("- Night (0-5am): Lowest demand - base load only")
print("- Morning (6-9am): Ramp up as businesses open")
print("- Afternoon (12-4pm): Moderate - commercial load")
print("- Evening (5-8pm): PEAK - residential + remaining commercial")
print("- Weekend: 10-15% lower - reduced commercial activity")
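
The plots suggest a daily and a weekly cycle. One way to check such a reading numerically is a periodogram: the squared magnitude of the Fourier transform, whose largest peaks point to the dominant periods. The sketch below is a minimal illustration on a hypothetical toy series (`toy`, `dominant_periods` are names introduced here, not part of the seminar code). On the seminar's `df['y']`, the non-sinusoidal daily shape and the weekend step would also produce harmonics such as 12 or 84 hours, so expect more than two peaks there.

```python
import numpy as np

def dominant_periods(y, n_peaks=2):
    """Return the n_peaks periods (in samples) with the largest periodogram power."""
    y = np.asarray(y, dtype=float)
    power = np.abs(np.fft.rfft(y - y.mean())) ** 2
    freqs = np.fft.rfftfreq(len(y), d=1.0)  # cycles per sample
    # Skip the zero-frequency bin, then pick the strongest remaining bins
    top = np.argsort(power[1:])[::-1][:n_peaks] + 1
    return sorted(float(1.0 / f) for f in freqs[top])

# Hypothetical toy series: 8 weeks of hourly data with a daily and a weekly sine
rng = np.random.default_rng(0)
t = np.arange(24 * 7 * 8)
toy = (3 * np.sin(2 * np.pi * t / 24)
       + 2 * np.sin(2 * np.pi * t / 168)
       + rng.normal(0, 0.5, t.size))

periods = dominant_periods(toy)
print([round(p, 1) for p in periods])
```

The recovered periods are what you would pass as `seasonal_periods` to TBATS, or use to decide which seasonalities to switch on in Prophet.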

## Exercise 2: TBATS Model

**Task:** Apply TBATS to handle multiple seasonal periods.

In [None]:
# Prepare data for modeling (shared by TBATS and Prophet)
# Split data
train_size = int(len(df) * 0.8)
train = df.iloc[:train_size]
test = df.iloc[train_size:]

print("Data Preparation")
print("="*40)
print(f"Training samples: {len(train)}")
print(f"Test samples: {len(test)}")

if HAS_TBATS:
    print(f"\nFitting TBATS with seasonal periods [24, 168]...")
    
    # Fit TBATS
    estimator = TBATS(seasonal_periods=[24, 168])
    tbats_model = estimator.fit(train['y'].values)
    
    print("\nTBATS Model Summary:")
    print(tbats_model.summary())
else:
    print("\nTBATS not available. Please install: pip install tbats")

In [None]:
if HAS_TBATS:
    # Forecast
    forecast_tbats = tbats_model.forecast(steps=len(test))
    
    # Evaluate
    rmse_tbats = np.sqrt(mean_squared_error(test['y'], forecast_tbats))
    mae_tbats = mean_absolute_error(test['y'], forecast_tbats)
    mape_tbats = np.mean(np.abs((test['y'].values - forecast_tbats) / test['y'].values)) * 100
    
    print("TBATS Results")
    print("=" * 40)
    print(f"RMSE: {rmse_tbats:.2f}")
    print(f"MAE: {mae_tbats:.2f}")
    print(f"MAPE: {mape_tbats:.2f}%")
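
The same three metrics are recomputed for every model in this seminar. A small helper keeps the definitions in one place; the sketch below is one possible version (`forecast_metrics` is a name introduced here). It assumes the actual values are never zero, which holds for the load data but would make MAPE undefined in general.

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """Return RMSE, MAE and MAPE (in percent) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    return {
        'rmse': float(np.sqrt(np.mean(err ** 2))),
        'mae': float(np.mean(np.abs(err))),
        'mape': float(np.mean(np.abs(err / y_true)) * 100),  # assumes no zeros in y_true
    }

# Tiny illustrative example with made-up numbers
m = forecast_metrics([100, 200, 400], [110, 190, 400])
print(m)
```

In the notebook this could be called as `forecast_metrics(test['y'].values, forecast_tbats)`.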

In [None]:
if HAS_TBATS:
    # Plot TBATS forecast
    fig, axes = plt.subplots(2, 1, figsize=(14, 8))
    
    # Full forecast
    axes[0].plot(train['ds'], train['y'], color=COLORS['blue'], label='Training', linewidth=0.5)
    axes[0].plot(test['ds'], test['y'], color=COLORS['green'], label='Actual', linewidth=1)
    axes[0].plot(test['ds'], forecast_tbats, color=COLORS['red'], label='TBATS Forecast', 
                 linewidth=1, linestyle='--')
    axes[0].axvline(x=test['ds'].iloc[0], color='black', linestyle='--', alpha=0.5)
    axes[0].set_title('TBATS: Full Forecast', fontweight='bold')
    axes[0].set_ylabel('Value')
    axes[0].legend(loc='upper left')
    
    # Zoomed view (first 3 days of test)
    n_zoom = 72  # 3 days
    axes[1].plot(test['ds'].iloc[:n_zoom], test['y'].iloc[:n_zoom], 
                 color=COLORS['green'], label='Actual', linewidth=1.5)
    axes[1].plot(test['ds'].iloc[:n_zoom], forecast_tbats[:n_zoom], 
                 color=COLORS['red'], label='TBATS Forecast', linewidth=1.5, linestyle='--')
    axes[1].set_title('TBATS: Zoomed View (First 3 Days)', fontweight='bold')
    axes[1].set_xlabel('Date')
    axes[1].set_ylabel('Value')
    axes[1].legend(loc='upper left')
    
    plt.tight_layout()
    plt.show()

## Exercise 3: Prophet Basic Model

**Task:** Build a Prophet model with automatic seasonality detection.

In [None]:
if HAS_PROPHET:
    # Prophet requires specific column names: 'ds' for datetime, 'y' for values
    train_prophet = train[['ds', 'y']].copy()
    test_prophet = test[['ds', 'y']].copy()
    
    print("Fitting Prophet model...")
    
    # Basic Prophet model
    prophet_model = Prophet(
        daily_seasonality=True,
        weekly_seasonality=True,
        yearly_seasonality=False,  # We don't have a full year
        changepoint_prior_scale=0.05,  # Default
        seasonality_mode='additive'
    )
    
    # Fit on the training split
    prophet_model.fit(train_prophet)
    
    print("Prophet model fitted!")
else:
    print("Prophet not available. Please install: pip install prophet")

In [None]:
if HAS_PROPHET:
    # Create future dataframe
    future = prophet_model.make_future_dataframe(periods=len(test), freq='h')  # 'H' is deprecated in recent pandas
    print(f"Future dataframe shape: {future.shape}")
    
    # Predict
    forecast_prophet_df = prophet_model.predict(future)
    
    # Extract test period forecast
    forecast_prophet = forecast_prophet_df['yhat'].iloc[-len(test):].values
    
    # Evaluate
    rmse_prophet = np.sqrt(mean_squared_error(test['y'], forecast_prophet))
    mae_prophet = mean_absolute_error(test['y'], forecast_prophet)
    mape_prophet = np.mean(np.abs((test['y'].values - forecast_prophet) / test['y'].values)) * 100
    
    print("\nProphet Results")
    print("=" * 40)
    print(f"RMSE: {rmse_prophet:.2f}")
    print(f"MAE: {mae_prophet:.2f}")
    print(f"MAPE: {mape_prophet:.2f}%")

In [None]:
if HAS_PROPHET:
    # Plot Prophet components
    fig = prophet_model.plot_components(forecast_prophet_df)
    plt.suptitle('Prophet: Decomposition of Components', fontweight='bold', y=1.02)
    plt.tight_layout()
    plt.show()

In [None]:
if HAS_PROPHET:
    # Plot Prophet forecast
    fig, axes = plt.subplots(2, 1, figsize=(14, 8))
    
    # Full forecast with uncertainty
    test_forecast = forecast_prophet_df.iloc[-len(test):]
    
    axes[0].plot(train['ds'], train['y'], color=COLORS['blue'], label='Training', linewidth=0.5)
    axes[0].plot(test['ds'], test['y'], color=COLORS['green'], label='Actual', linewidth=1)
    axes[0].plot(test_forecast['ds'], test_forecast['yhat'], color=COLORS['red'], 
                 label='Prophet Forecast', linewidth=1, linestyle='--')
    # Prophet's default interval_width is 0.80, so yhat_lower/yhat_upper form an 80% interval
    axes[0].fill_between(test_forecast['ds'], test_forecast['yhat_lower'], test_forecast['yhat_upper'],
                         alpha=0.2, color='red', label='80% Interval')
    axes[0].axvline(x=test['ds'].iloc[0], color='black', linestyle='--', alpha=0.5)
    axes[0].set_title('Prophet: Forecast with Uncertainty', fontweight='bold')
    axes[0].set_ylabel('Value')
    axes[0].legend(loc='upper left')
    
    # Zoomed view
    n_zoom = 72
    axes[1].plot(test['ds'].iloc[:n_zoom], test['y'].iloc[:n_zoom],
                 color=COLORS['green'], label='Actual', linewidth=1.5)
    axes[1].plot(test_forecast['ds'].iloc[:n_zoom], test_forecast['yhat'].iloc[:n_zoom],
                 color=COLORS['red'], label='Prophet Forecast', linewidth=1.5, linestyle='--')
    axes[1].fill_between(test_forecast['ds'].iloc[:n_zoom],
                         test_forecast['yhat_lower'].iloc[:n_zoom],
                         test_forecast['yhat_upper'].iloc[:n_zoom],
                         alpha=0.2, color='red')
    axes[1].set_title('Prophet: Zoomed View (First 3 Days)', fontweight='bold')
    axes[1].set_xlabel('Date')
    axes[1].set_ylabel('Value')
    axes[1].legend(loc='upper left')
    
    plt.tight_layout()
    plt.show()
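
Prophet's `yhat_lower` and `yhat_upper` are governed by `interval_width`, which defaults to 0.80. A quick sanity check on any interval forecast is its empirical coverage: the share of held-out actuals that fall inside the band. The helper below is a sketch with made-up numbers (`interval_coverage` is a name introduced here).

```python
import numpy as np

def interval_coverage(y_true, lower, upper):
    """Fraction of observations lying inside [lower, upper] (bounds inclusive)."""
    y_true, lower, upper = (np.asarray(a, dtype=float) for a in (y_true, lower, upper))
    inside = (y_true >= lower) & (y_true <= upper)
    return float(inside.mean())

# Made-up example: three of the four actuals fall inside their interval
cov = interval_coverage([10, 12, 15, 20], [9, 11, 16, 18], [11, 13, 18, 22])
print(f"Empirical coverage: {cov:.2f}")
```

With the notebook's variables the call would be `interval_coverage(test['y'], test_forecast['yhat_lower'], test_forecast['yhat_upper'])`. A value far below the nominal level suggests the intervals are too narrow for this horizon.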

## Exercise 4: Prophet with Custom Seasonality

**Task:** Add custom monthly seasonality and tune Prophet parameters.

In [None]:
if HAS_PROPHET:
    # REAL DATA: US Retail Sales from FRED (2018-2023)
    # Monthly data in billions of dollars
    retail_sales_data = [
        457.6, 459.1, 468.2, 469.2, 473.9, 477.6, 482.1, 483.0, 473.7, 476.2, 477.9, 502.7,  # 2018
        455.6, 459.8, 472.0, 470.5, 479.3, 480.7, 485.9, 488.6, 479.9, 483.6, 481.7, 516.0,  # 2019
        461.2, 461.5, 414.7, 384.9, 476.4, 509.3, 516.1, 521.7, 527.0, 524.7, 519.6, 553.3,  # 2020
        510.6, 507.4, 560.1, 561.1, 567.0, 574.0, 582.0, 585.0, 581.0, 596.1, 595.6, 630.1,  # 2021
        581.9, 587.8, 631.5, 613.8, 629.3, 633.0, 631.8, 638.7, 625.5, 641.0, 633.7, 671.9,  # 2022
        620.6, 624.0, 670.2, 656.5, 666.3, 670.1, 673.2, 679.3, 668.6, 686.1, 672.3, 724.5   # 2023
    ]
    
    # Note: despite the "_daily" suffix in the variable names, this series is monthly (freq='MS')
    dates_daily = pd.date_range(start='2018-01-01', periods=len(retail_sales_data), freq='MS')
    df_daily = pd.DataFrame({'ds': dates_daily, 'y': retail_sales_data})
    
    # Split (use last 12 months for testing)
    train_daily = df_daily.iloc[:-12]
    test_daily = df_daily.iloc[-12:]
    
    print("REAL DATA: US Retail Sales (FRED, 2018-2023)")
    print("="*55)
    print(f"Total observations: {len(df_daily)} months")
    print(f"Train: {len(train_daily)} months, Test: {len(test_daily)} months")
    print(f"\nData characteristics:")
    print(f"  - Strong December peaks (holiday shopping)")
    print(f"  - COVID-19 impact visible in March-April 2020")
    print(f"  - Post-COVID recovery and growth")
    
    # Visualize
    fig, ax = plt.subplots(figsize=(14, 5))
    ax.plot(df_daily['ds'], df_daily['y'], color=COLORS['blue'], linewidth=1.5)
    ax.axvline(x=pd.Timestamp('2020-03-01'), color='red', linestyle='--', alpha=0.7, label='COVID-19 Start')
    ax.axvline(x=train_daily['ds'].iloc[-1], color='black', linestyle=':', alpha=0.7, label='Train/Test Split')
    ax.set_title('US Retail Sales: Real FRED Data (2018-2023)', fontweight='bold')
    ax.set_xlabel('Date')
    ax.set_ylabel('Sales ($ billions)')
    ax.legend()
    plt.tight_layout()
    plt.show()

In [None]:
if HAS_PROPHET:
    # Prophet with custom seasonality on US Retail Sales
    prophet_custom = Prophet(
        daily_seasonality=False,
        weekly_seasonality=False,  # Monthly data
        yearly_seasonality=True,
        changepoint_prior_scale=0.1,  # More flexible for COVID impact
        seasonality_prior_scale=10,
        seasonality_mode='multiplicative'  # Retail sales have multiplicative seasonality
    )
    
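    # Add a custom "monthly" seasonality to demonstrate the add_seasonality API.
    # Caution: with one observation per month, a 30.5-day cycle is sampled about once
    # per period, so this term is aliased and cannot capture genuine within-month patterns.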
    prophet_custom.add_seasonality(
        name='monthly',
        period=30.5,
        fourier_order=3  # Simple monthly pattern
    )
    
    print("Fitting Prophet with custom seasonality on US Retail Sales...")
    prophet_custom.fit(train_daily)
    
    # Predict
    future_daily = prophet_custom.make_future_dataframe(periods=12, freq='MS')
    forecast_custom_df = prophet_custom.predict(future_daily)
    
    print("Model fitted!")
    print("\nSeasonality configuration:")
    print("  - Yearly: Fourier order 10 (default)")
    print("  - Monthly: Fourier order 3 (custom)")
    print("  - Mode: Multiplicative")

In [None]:
if HAS_PROPHET:
    # Plot custom seasonality components
    fig = prophet_custom.plot_components(forecast_custom_df)
    plt.suptitle('Prophet: US Retail Sales Components', fontweight='bold', y=1.02)
    plt.tight_layout()
    plt.show()
    
    # Evaluate
    forecast_custom = forecast_custom_df['yhat'].iloc[-12:].values
    
    rmse_custom = np.sqrt(mean_squared_error(test_daily['y'], forecast_custom))
    mae_custom = mean_absolute_error(test_daily['y'], forecast_custom)
    mape_custom = np.mean(np.abs((test_daily['y'].values - forecast_custom) / test_daily['y'].values)) * 100
    
    print("\nProphet Results on US Retail Sales")
    print("=" * 45)
    print(f"RMSE: {rmse_custom:.2f} ($ billions)")
    print(f"MAE: {mae_custom:.2f} ($ billions)")
    print(f"MAPE: {mape_custom:.2f}%")
    
    # Plot forecast
    fig, ax = plt.subplots(figsize=(14, 5))
    ax.plot(train_daily['ds'], train_daily['y'], color=COLORS['blue'], label='Train', linewidth=1)
    ax.plot(test_daily['ds'], test_daily['y'], color=COLORS['green'], label='Actual', linewidth=1.5)
    ax.plot(test_daily['ds'], forecast_custom, color=COLORS['red'], label='Forecast', linewidth=1.5, linestyle='--')
    ax.fill_between(forecast_custom_df['ds'].iloc[-12:], 
                    forecast_custom_df['yhat_lower'].iloc[-12:],
                    forecast_custom_df['yhat_upper'].iloc[-12:],
                    color=COLORS['red'], alpha=0.2)
    ax.axvline(x=train_daily['ds'].iloc[-1], color='black', linestyle=':', alpha=0.5)
    ax.set_title('Prophet Forecast: US Retail Sales (2023)', fontweight='bold')
    ax.set_xlabel('Date')
    ax.set_ylabel('Sales ($ billions)')
    ax.legend(loc='upper left')
    plt.tight_layout()
    plt.show()

## Exercise 5: Prophet with Holiday Effects

**Task:** Add holiday effects to Prophet model.

In [None]:
if HAS_PROPHET:
    # Define US retail holidays (2018-2024) - these significantly impact retail sales
    holidays_list = []
    for year in range(2018, 2025):
        holidays_list.extend([
            {'holiday': 'new_year', 'ds': f'{year}-01-01'},
            {'holiday': 'valentines', 'ds': f'{year}-02-14'},
            {'holiday': 'memorial_day', 'ds': f'{year}-05-28'},  # Approximate
            {'holiday': 'independence_day', 'ds': f'{year}-07-04'},
            {'holiday': 'labor_day', 'ds': f'{year}-09-02'},  # Approximate
            {'holiday': 'black_friday', 'ds': f'{year}-11-25'},  # Approximate
            {'holiday': 'christmas', 'ds': f'{year}-12-25'},
        ])
    
    holidays = pd.DataFrame(holidays_list)
    holidays['ds'] = pd.to_datetime(holidays['ds'])
    holidays['lower_window'] = -2  # Days before
    holidays['upper_window'] = 2   # Days after
    
    print("US Retail Holidays Defined:")
    print(holidays[holidays['ds'].dt.year == 2023].to_string(index=False))
    
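    # Note: the series has one observation per month, stamped on the 1st.
    # A holiday only enters the fit if its window (2 days before to 2 days after)
    # covers the 1st of a month: here New Year (Jan 1) and Labor Day (Sep 2).
    print("\nNote: With monthly data, only holidays whose window covers the 1st of a month")
    print("(here New Year and Labor Day) can affect the fit.")
    print("Holiday modeling is more effective with daily retail data.")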
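
Three of the holidays in the list are marked approximate because they float: Memorial Day is the last Monday of May, Labor Day the first Monday of September, and Black Friday the day after Thanksgiving (the fourth Thursday of November). The sketch below computes the exact dates with the standard library only; the function names are introduced here for illustration.

```python
from datetime import date, timedelta

def nth_weekday(year, month, weekday, n):
    """Return the n-th occurrence of a weekday (Monday=0) in the given month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))

def last_weekday(year, month, weekday):
    """Return the last occurrence of a weekday (Monday=0) in the given month."""
    next_month = date(year + (month == 12), month % 12 + 1, 1)
    last_day = next_month - timedelta(days=1)
    offset = (last_day.weekday() - weekday) % 7
    return last_day - timedelta(days=offset)

def us_retail_floating_holidays(year):
    """Exact dates of the floating holidays used in this exercise."""
    thanksgiving = nth_weekday(year, 11, 3, 4)  # 4th Thursday of November
    return {
        'memorial_day': last_weekday(year, 5, 0),
        'labor_day': nth_weekday(year, 9, 0, 1),
        'black_friday': thanksgiving + timedelta(days=1),
    }

for name, day in us_retail_floating_holidays(2023).items():
    print(f"{name:<15} {day}")
```

These dates could replace the fixed `f'{year}-05-28'` style strings when the model is fitted on daily data, where the exact day matters.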

In [None]:
if HAS_PROPHET:
    # Compare models with and without holiday effects on US Retail Sales
    # Model WITHOUT holidays (already fitted as prophet_custom)
    pred_no_hol = forecast_custom
    
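    # Model WITH holidays: same settings as prophet_custom, so that
    # the holidays are the only difference between the two models
    prophet_with_holiday = Prophet(
        holidays=holidays,
        daily_seasonality=False,
        weekly_seasonality=False,
        yearly_seasonality=True,
        changepoint_prior_scale=0.1,
        seasonality_prior_scale=10,
        seasonality_mode='multiplicative',
        holidays_prior_scale=10
    )
    prophet_with_holiday.add_seasonality(name='monthly', period=30.5, fourier_order=3)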
    prophet_with_holiday.fit(train_daily)
    forecast_with_hol_df = prophet_with_holiday.predict(
        prophet_with_holiday.make_future_dataframe(periods=12, freq='MS')
    )
    pred_with_hol = forecast_with_hol_df['yhat'].iloc[-12:].values
    
    # Compare results
    rmse_no_hol = np.sqrt(mean_squared_error(test_daily['y'], pred_no_hol))
    rmse_with_hol = np.sqrt(mean_squared_error(test_daily['y'], pred_with_hol))
    
    print("Holiday Effects Comparison (US Retail Sales)")
    print("=" * 50)
    print(f"{'Model':<25} {'RMSE ($ billions)':>20}")
    print("-" * 50)
    print(f"{'Without holidays':<25} {rmse_no_hol:>20.2f}")
    print(f"{'With holidays':<25} {rmse_with_hol:>20.2f}")
    print("-" * 50)
    
    if rmse_with_hol < rmse_no_hol:
        print(f"Improvement: {(rmse_no_hol - rmse_with_hol)/rmse_no_hol * 100:.1f}%")
    else:
        print("Note: Holiday effects minimal with monthly aggregated data")
        print("For retail, daily data shows stronger holiday effects")

In [None]:
if HAS_PROPHET:
    # Visualize holiday effect from Prophet model
    fig, axes = plt.subplots(2, 1, figsize=(14, 8))
    
    # Top: Forecast comparison
    axes[0].plot(test_daily['ds'], test_daily['y'], 
                 color=COLORS['blue'], label='Actual', linewidth=2)
    axes[0].plot(test_daily['ds'], pred_no_hol, 
                 color=COLORS['orange'], label='Without Holidays', linewidth=1.5, linestyle='--')
    axes[0].plot(test_daily['ds'], pred_with_hol, 
                 color=COLORS['green'], label='With Holidays', linewidth=1.5, linestyle=':')
    axes[0].set_title('US Retail Sales: Model Comparison (2023)', fontweight='bold')
    axes[0].set_ylabel('Sales ($ billions)')
    axes[0].legend(loc='upper left')
    
    # Bottom: Show the extracted holiday component
    holiday_effect = forecast_with_hol_df[['ds', 'holidays']].copy()
    holiday_effect = holiday_effect[holiday_effect['ds'] >= '2022-01-01']
    
    axes[1].bar(holiday_effect['ds'], holiday_effect['holidays'], width=20, 
                color=COLORS['red'], alpha=0.7)
    axes[1].axhline(y=0, color='black', linestyle='-', alpha=0.3)
    axes[1].set_title('Prophet: Extracted Holiday Effect (Multiplicative)', fontweight='bold')
    axes[1].set_xlabel('Date')
    axes[1].set_ylabel('Holiday Effect (fraction of trend)')
    
    plt.tight_layout()
    plt.show()
    
    print("\nNote: In multiplicative mode, the holiday component is a fraction of the trend:")
    print("  - Value > 0: Sales above trend on/around the holiday")
    print("  - Value < 0: Sales below trend on/around the holiday")
    print("  - yhat = trend × (1 + seasonality + holidays)")
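
In multiplicative mode Prophet assembles the forecast as trend times one plus the sum of the multiplicative components. A short worked example with hypothetical numbers (none of them come from the fitted model) shows how to read a holiday value:

```python
# Hypothetical component values for a single month
trend = 600.0     # trend level, $ billions
yearly = 0.08     # yearly seasonal term: +8% of trend
holiday = -0.02   # holiday term: -2% of trend

# Multiplicative composition: yhat = trend * (1 + sum of multiplicative terms)
yhat = trend * (1 + yearly + holiday)
holiday_contribution = trend * holiday

print(f"Forecast: {yhat:.1f}")
print(f"Holiday contribution: {holiday_contribution:.1f} ($ billions)")
```

A holiday value of -0.02 therefore means sales about 2% of the trend level lower, not a multiplier of -0.02.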

In [None]:
if HAS_PROPHET:
    # Visualize holiday effect over full period
    fig, ax = plt.subplots(figsize=(14, 5))
    
    # Show holiday effect component
    holiday_effect = forecast_with_hol_df[['ds', 'holidays']].copy()
    holiday_effect = holiday_effect.set_index('ds')
    
    ax.bar(holiday_effect.index, holiday_effect['holidays'], width=15, 
           color=COLORS['red'], alpha=0.7)
    ax.axhline(y=0, color='black', linestyle='-', alpha=0.3)
    ax.set_title('Prophet: Extracted Holiday Effect (Full Period)', fontweight='bold')
    ax.set_xlabel('Date')
    ax.set_ylabel('Holiday Effect (Multiplicative)')
    
    # Mark some actual holidays
    for holiday_date in holidays['ds']:
        if holiday_date >= holiday_effect.index.min() and holiday_date <= holiday_effect.index.max():
            ax.axvline(x=holiday_date, color='green', linestyle='--', alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    print("Green lines mark holiday dates defined in the model")

## Exercise 6: Model Comparison - TBATS vs Prophet

**Task:** Compare TBATS and Prophet on the same dataset.

In [None]:
# Use the original hourly data for comparison
print("Model Comparison: TBATS vs Prophet")
print("=" * 55)

results = []

if HAS_TBATS:
    results.append(('TBATS', rmse_tbats, mae_tbats, mape_tbats))

if HAS_PROPHET:
    results.append(('Prophet', rmse_prophet, mae_prophet, mape_prophet))

if results:
    print(f"{'Model':<15} {'RMSE':>10} {'MAE':>10} {'MAPE (%)':>10}")
    print("-" * 55)
    for name, rmse, mae, mape in results:
        print(f"{name:<15} {rmse:>10.2f} {mae:>10.2f} {mape:>10.2f}")
    print("-" * 55)
    
    # Determine best
    best_idx = np.argmin([r[1] for r in results])  # By RMSE
    print(f"\nBest model (by RMSE): {results[best_idx][0]}")
else:
    print("No models available for comparison.")

In [None]:
# Visual comparison
if HAS_TBATS and HAS_PROPHET:
    fig, ax = plt.subplots(figsize=(14, 5))
    
    n_plot = 72  # 3 days
    
    ax.plot(test['ds'].iloc[:n_plot], test['y'].iloc[:n_plot],
            color=COLORS['blue'], label='Actual', linewidth=2)
    ax.plot(test['ds'].iloc[:n_plot], forecast_tbats[:n_plot],
            color=COLORS['red'], label=f'TBATS (RMSE={rmse_tbats:.2f})', 
            linewidth=1.5, linestyle='--')
    ax.plot(test['ds'].iloc[:n_plot], forecast_prophet[:n_plot],
            color=COLORS['green'], label=f'Prophet (RMSE={rmse_prophet:.2f})',
            linewidth=1.5, linestyle=':')
    
    ax.set_title('TBATS vs Prophet: 3-Day Forecast Comparison', fontweight='bold')
    ax.set_xlabel('Date')
    ax.set_ylabel('Value')
    ax.legend(loc='upper right')
    
    plt.tight_layout()
    plt.show()
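
A comparison between two sophisticated models is easier to interpret next to a simple benchmark. A common choice for seasonal data is the seasonal naive forecast, which repeats the last observed season. The sketch below is a minimal version (`seasonal_naive` is a name introduced here, and the example numbers are made up).

```python
import numpy as np

def seasonal_naive(history, horizon, period):
    """Forecast by repeating the last full season of the history."""
    history = np.asarray(history, dtype=float)
    if len(history) < period:
        raise ValueError("need at least one full season of history")
    last_season = history[-period:]
    reps = int(np.ceil(horizon / period))
    return np.tile(last_season, reps)[:horizon]

# Made-up example: period 3, forecast 5 steps ahead
fc = seasonal_naive([1, 2, 3, 4, 5, 6], horizon=5, period=3)
print(fc)
```

For the hourly data, `seasonal_naive(train['y'].values, len(test), period=168)` repeats the last training week. If TBATS or Prophet cannot beat it on RMSE, the extra complexity is not paying off on this series.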

## Practice Problems

### Problem 1: Fourier Terms

For yearly seasonality with daily data (period = 365.25 days), Prophet uses a Fourier order of 10 by default.

**Question:** How many parameters does this add to the model?

In [None]:
print("Problem 1 Solution")
print("=" * 50)

fourier_order = 10
print(f"Fourier order: {fourier_order}")
print(f"\nEach harmonic n = 1, ..., {fourier_order} adds 2 parameters:")
print(f"  - a_n for cos(2πnt/P)")
print(f"  - b_n for sin(2πnt/P)")
print(f"\nTotal parameters: {fourier_order} × 2 = {fourier_order * 2}")
print(f"\nFormula: s(t) = Σ_(n=1..N) [a_n cos(2πnt/P) + b_n sin(2πnt/P)], with N = {fourier_order}, P = 365.25")
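
The parameter count can be checked by building the Fourier design matrix directly. The function below is a sketch of the idea, not Prophet's internal code (`fourier_features` is a name introduced here, and the column order may differ from Prophet's): each harmonic contributes one cosine and one sine column, and the model estimates one coefficient per column.

```python
import numpy as np

def fourier_features(t, period, order):
    """Return an array of shape (len(t), 2 * order) with cos/sin pairs per harmonic."""
    t = np.asarray(t, dtype=float)
    cols = []
    for n in range(1, order + 1):
        angle = 2 * np.pi * n * t / period
        cols.append(np.cos(angle))
        cols.append(np.sin(angle))
    return np.column_stack(cols)

# Two years of daily time steps, yearly period, Fourier order 10
X = fourier_features(np.arange(730), period=365.25, order=10)
print(X.shape)  # (730, 20)
```

A higher order lets the seasonal curve change faster but adds two coefficients per step, which is why a low order is usually enough for smooth patterns.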

### Problem 2: Seasonality Mode

You observe that seasonal amplitude grows as the trend increases.

**Question:** Should you use additive or multiplicative seasonality?

In [None]:
print("Problem 2 Solution")
print("=" * 50)

print("If seasonal amplitude GROWS with trend level:")
print("  → Use MULTIPLICATIVE seasonality")
print("")
print("Additive: Y = T + S + ε")
print("  → Seasonal amplitude is constant")
print("")
print("Multiplicative: Y = T × S × ε")
print("  → Seasonal amplitude scales with trend")
print("")
print("In Prophet: seasonality_mode='multiplicative', i.e. y = g(t) × (1 + s(t)) + ε")
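
The difference between the two modes can be made concrete with a toy series. The sketch below builds an additive and a multiplicative version of the same trend and cycle (all numbers are hypothetical), then compares the seasonal swing in the last year with the swing in the first year. For simplicity it subtracts the known trend; with real data the trend would have to be estimated first.

```python
import numpy as np

t = np.arange(48)                      # 4 years of monthly steps
trend = 100 + 5 * t
cycle = np.sin(2 * np.pi * t / 12)

y_add = trend + 10 * cycle             # constant seasonal swing
y_mult = trend * (1 + 0.10 * cycle)    # swing proportional to the trend

def swing(y, trend, window):
    """Peak-to-trough range of the detrended series inside a window."""
    resid = (y - trend)[window]
    return resid.max() - resid.min()

first_year, last_year = slice(0, 12), slice(36, 48)
ratio_add = swing(y_add, trend, last_year) / swing(y_add, trend, first_year)
ratio_mult = swing(y_mult, trend, last_year) / swing(y_mult, trend, first_year)

print(f"Additive swing ratio (last/first year):       {ratio_add:.2f}")
print(f"Multiplicative swing ratio (last/first year): {ratio_mult:.2f}")
```

A ratio close to 1 points to additive seasonality; a ratio that tracks the growth of the trend points to multiplicative seasonality.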

### Problem 3: Model Selection

You have:
- Hourly electricity demand data
- Daily, weekly, and yearly patterns
- Important holiday effects
- Temperature as external regressor

**Question:** TBATS or Prophet? Why?

In [None]:
print("Problem 3 Solution")
print("=" * 50)

print("Answer: Prophet")
print("")
print("Reasons:")
print("  1. Holiday effects are important → Prophet has built-in support")
print("  2. External regressor (temperature) → Prophet supports this")
print("  3. Both handle multiple seasonalities")
print("")
print("TBATS limitations:")
print("  - No holiday effects")
print("  - No external regressors")
print("  - Would need separate preprocessing")
print("")
print("Prophet code:")
print("  model = Prophet(holidays=holidays_df)")
print("  model.add_regressor('temperature')")

### Problem 4: TBATS Interpretation

TBATS selects: Box-Cox λ=0.5, ARMA(1,0), with 3 harmonics for daily and 2 for weekly.

**Question:** What does λ=0.5 mean?

In [None]:
print("Problem 4 Solution")
print("=" * 50)

print("Box-Cox transformation with λ = 0.5:")
print("")
print("  y^(λ) = (y^λ - 1) / λ")
print("")
print("  For λ = 0.5:")
print("  y^(0.5) = (√y - 1) / 0.5 = 2(√y - 1)")
print("")
print("This is a shifted and scaled SQUARE ROOT transformation!")
print("")
print("Interpretation:")
print("  - Stabilizes variance")
print("  - Variance was increasing with level")
print("  - Common values:")
print("    λ = 1: No transformation")
print("    λ = 0.5: Square root")
print("    λ = 0: Log transformation")
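
The transformation and its inverse are short enough to write out. The sketch below follows the formula in the solution, with the log case for λ = 0 (`box_cox` and `inv_box_cox` are names introduced here). It assumes strictly positive data, which the Box-Cox transformation requires.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform: log(y) if lam == 0, else (y**lam - 1) / lam."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1) / lam

def inv_box_cox(z, lam):
    """Inverse transform, used to bring forecasts back to the original scale."""
    z = np.asarray(z, dtype=float)
    if lam == 0:
        return np.exp(z)
    return (lam * z + 1) ** (1 / lam)

y = np.array([1.0, 4.0, 9.0, 100.0])
z = box_cox(y, 0.5)            # equals 2 * (sqrt(y) - 1)
print(z)
print(inv_box_cox(z, 0.5))     # recovers y
```

Large values are compressed more than small ones, which is how the transformation damps a variance that grows with the level.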

## Summary

### Key Takeaways

1. **Multiple Seasonalities**
   - Standard SARIMA handles only one seasonal period
   - TBATS and Prophet can model several seasonal periods in a single model
   - Common periods: daily (24 for hourly data), weekly (7 for daily data, 168 for hourly data), yearly (365.25 for daily data)

2. **TBATS**
   - T: Trigonometric seasonality (Fourier terms)
   - B: Box-Cox transformation (variance stabilization)
   - A: ARMA errors (autocorrelation)
   - T: Trend (level + slope)
   - S: Seasonal (multiple periods)
   - Best for: High-frequency, automatic selection, no external regressors

3. **Prophet**
   - Additive decomposition: y(t) = g(t) + s(t) + h(t) + ε (trend, seasonality, holidays, noise); a multiplicative mode is also available
   - Automatic changepoint detection
   - Built-in holiday effects
   - External regressors supported
   - Best for: Business forecasting, interpretability, holidays important

4. **Model Selection**
   - TBATS: Technical applications, high-frequency
   - Prophet: Business applications, holidays, external factors

### Practical Workflow

1. Identify seasonal periods in your data
2. Check if holidays/external factors matter
3. Choose TBATS (automatic) or Prophet (interpretable)
4. Tune parameters (Fourier order, changepoint scale)
5. Compare candidate models on data they were not fitted on
6. Validate with time series cross-validation (rolling origin), never a random shuffle
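
Time series cross-validation keeps the temporal order: each fold trains on everything up to a cutoff and tests on the block that follows. The sketch below generates such rolling-origin splits as index arrays (`rolling_origin_splits` and its parameter names are introduced here for illustration). Prophet ships its own version of this idea in `prophet.diagnostics.cross_validation`.

```python
import numpy as np

def rolling_origin_splits(n_obs, initial, horizon, step):
    """Return (train_idx, test_idx) pairs with an expanding training window."""
    splits = []
    train_end = initial
    while train_end + horizon <= n_obs:
        train_idx = np.arange(0, train_end)
        test_idx = np.arange(train_end, train_end + horizon)
        splits.append((train_idx, test_idx))
        train_end += step
    return splits

# 4 weeks of hourly data: start with 2 weeks, forecast 1 day, move forward 1 day
splits = rolling_origin_splits(n_obs=672, initial=336, horizon=24, step=24)
print(f"Number of folds: {len(splits)}")
print(f"First fold: train 0..{splits[0][0][-1]}, test {splits[0][1][0]}..{splits[0][1][-1]}")
```

Averaging the error over all folds gives a more stable comparison than the single 80/20 split used in the exercises.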