# ETS (Exponential Smoothing)

ETS models represent a series by **Error**, **Trend**, and **Seasonality** components. They are strong for series with stable seasonal patterns.

ETS(A, A, A): additive error, additive trend, additive seasonality. The system updates components recursively, weighting recent observations more.


## Mathematical Foundation

ETS models track up to three state components (level, trend, and seasonal) and update them recursively using **smoothing parameters**. The key insight is that recent observations receive exponentially more weight than older ones.

---

### 1. Simple Exponential Smoothing (SES)

For series with **no trend or seasonality**, we only track the **level** $\ell_t$:

$$\ell_t = \alpha y_t + (1-\alpha)\ell_{t-1}$$

**Intuition**: The new level is a weighted average of the current observation $y_t$ and the previous level $\ell_{t-1}$. Higher $\alpha$ means faster adaptation to recent changes.

**Forecast**: $\hat{y}_{t+h} = \ell_t$ (flat forecast)
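
Unrolling the recursion shows where the name "exponential" comes from: the level is a weighted average of all past observations, with weights that shrink geometrically with age. The sketch below checks this numerically on a made-up five-point series; the helper names `ses_level` and `ses_weights` are illustrative, and the level is assumed to be initialized at the first observation.

```python
import numpy as np

def ses_level(obs, alpha):
    """Run the SES recursion, starting the level at the first observation."""
    level = obs[0]
    for value in obs[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

def ses_weights(n, alpha):
    """Weights on y_t, y_{t-1}, ..., y_1 implied by the recursion."""
    weights = alpha * (1 - alpha) ** np.arange(n)
    # The oldest observation doubles as the initial level,
    # so it absorbs all the remaining weight.
    weights[-1] = (1 - alpha) ** (n - 1)
    return weights

obs_demo = np.array([10.0, 12.0, 11.0, 15.0, 14.0])
alpha_demo = 0.3

w_demo = ses_weights(len(obs_demo), alpha_demo)
level_recursive = ses_level(obs_demo, alpha_demo)
level_weighted = np.dot(w_demo, obs_demo[::-1])  # newest observation first

print(np.round(w_demo, 4))
print(level_recursive, level_weighted)
```

Both routes give the same level, and the weights sum to one.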

---

### 2. Holt's Linear Method (Trend)

For series with a **trend**, we add a trend component $b_t$:

$$\ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1})$$

$$b_t = \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1}$$

**Intuition**: 
- The level equation now accounts for the expected growth from the previous period
- The trend equation smooths the change in level over time

**Forecast**: $\hat{y}_{t+h} = \ell_t + h \cdot b_t$
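
A quick sanity check on these equations: if the data lie exactly on a straight line, Holt's method should reproduce that line for any choice of $\alpha$ and $\beta$. The sketch below assumes the level is initialized at the first observation and the trend at the first difference; `holt_demo` is an illustrative helper.

```python
import numpy as np

def holt_demo(obs, alpha, beta, h):
    """Holt's linear method; returns the h-step-ahead forecasts."""
    level, trend = obs[0], obs[1] - obs[0]
    for value in obs[1:]:
        prev_level = level
        level = alpha * value + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + np.arange(1, h + 1) * trend

line_demo = 5.0 + 2.0 * np.arange(10)  # 5, 7, ..., 23
forecast_demo = holt_demo(line_demo, alpha=0.5, beta=0.2, h=3)
print(forecast_demo)  # continues the line: 25, 27, 29
```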

---

### 3. Holt-Winters Method (Trend + Seasonality)

For series with **trend and seasonality** (period $m$), we add a seasonal component $s_t$:

**Additive Seasonality:**
$$\ell_t = \alpha(y_t - s_{t-m}) + (1-\alpha)(\ell_{t-1} + b_{t-1})$$

$$b_t = \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1}$$

$$s_t = \gamma(y_t - \ell_t) + (1-\gamma)s_{t-m}$$

**Forecast Equation:**
$$\hat{y}_{t+h} = \ell_t + h \cdot b_t + s_{t+h-m(k+1)}$$

where $k = \lfloor (h-1)/m \rfloor$ ensures that the seasonal index always comes from the last observed cycle (the final $m$ periods of the sample), even when $h > m$.
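
To see how the subscript $t+h-m(k+1)$ cycles, the short sketch below lists the offset relative to $t$ for a quarterly series ($m = 4$). The function name `seasonal_offset` is illustrative.

```python
def seasonal_offset(h, m):
    """Offset (relative to t) of the seasonal index used for horizon h."""
    k = (h - 1) // m
    return h - m * (k + 1)  # always between -(m - 1) and 0

offsets_demo = [seasonal_offset(h, 4) for h in range(1, 10)]
print(offsets_demo)  # [-3, -2, -1, 0, -3, -2, -1, 0, -3]
```

An offset of $-3$ means the forecast reuses $s_{t-3}$, and an offset of $0$ means it reuses $s_t$.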

---

### 4. Smoothing Parameters

| Parameter | Range | Controls | Low Value | High Value |
|-----------|-------|----------|-----------|------------|
| $\alpha$ | $[0,1]$ | Level responsiveness | Smooth, slow adaptation | Reactive, fast adaptation |
| $\beta$ | $[0,1]$ | Trend responsiveness | Stable trend | Volatile trend |
| $\gamma$ | $[0,1]$ | Seasonal responsiveness | Fixed seasonal pattern | Evolving seasonality |
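
In practice these parameters are usually estimated from the data rather than set by hand. One common approach is to minimize the sum of squared one-step-ahead errors. The sketch below does this for the SES case on a simulated series; `ses_sse` is an illustrative helper and the simulated data are an assumption made only for the demonstration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ses_sse(alpha, obs):
    """Sum of squared one-step-ahead errors for SES with a given alpha."""
    level = obs[0]
    sse = 0.0
    for value in obs[1:]:
        sse += (value - level) ** 2  # forecast for this step is the previous level
        level = alpha * value + (1 - alpha) * level
    return sse

# Simulated series: a slowly drifting level observed with noise
rng_demo = np.random.default_rng(0)
series_demo = 50 + np.cumsum(rng_demo.normal(0, 1, 200)) + rng_demo.normal(0, 2, 200)

result_demo = minimize_scalar(
    ses_sse, bounds=(0.0, 1.0), args=(series_demo,), method="bounded"
)
alpha_hat = result_demo.x
print(f"Estimated alpha: {alpha_hat:.3f}")
```

The same idea extends to $\beta$ and $\gamma$ with a multivariate optimizer.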

---

### 5. Error Types (E in ETS)

The **E** in ETS refers to how errors are incorporated:

- **Additive errors**: $y_t = \mu_t + \epsilon_t$ where $\epsilon_t \sim N(0, \sigma^2)$
- **Multiplicative errors**: $y_t = \mu_t(1 + \epsilon_t)$ where $\epsilon_t \sim N(0, \sigma^2)$

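Here $\mu_t$ is the one-step-ahead forecast built from the previous states (for example, $\mu_t = \ell_{t-1} + b_{t-1} + s_{t-m}$ in the additive Holt-Winters case). For the same parameter values both error types give the same point forecasts; they differ in the likelihood, and therefore in prediction intervals and in model selection via information criteria (AIC/BIC).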
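
The practical difference shows up in the one-step prediction interval. With additive errors the interval width is the same at every level of the series. With multiplicative errors the width grows in proportion to $\mu_t$. A minimal sketch, assuming normal errors and an illustrative helper `interval_95`:

```python
def interval_95(mu, sigma, error="add"):
    """One-step 95% interval around forecast mu.

    For error="add", sigma is in the units of the series.
    For error="mul", sigma is relative (e.g. 0.05 means 5% of mu).
    """
    z = 1.96
    half_width = z * sigma if error == "add" else z * sigma * mu
    return mu - half_width, mu + half_width

for mu_demo in (100.0, 400.0):
    lo_a, hi_a = interval_95(mu_demo, sigma=5.0, error="add")
    lo_m, hi_m = interval_95(mu_demo, sigma=0.05, error="mul")
    print(f"mu={mu_demo:.0f}  additive width={hi_a - lo_a:.1f}  "
          f"multiplicative width={hi_m - lo_m:.1f}")
```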

In [None]:
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from sktime.datasets import load_airline

# Reproducibility
np.random.seed(42)

y = load_airline()
y.name = "Passengers"



## Visualizing the Components

Let's visualize the level, trend, and seasonal components to build intuition about how ETS decomposes a time series.

In [None]:
from plotly.subplots import make_subplots
from statsmodels.tsa.holtwinters import ExponentialSmoothing as HW_statsmodels

# Fit Holt-Winters to extract components
hw_model = HW_statsmodels(y.values, trend='add', seasonal='add', seasonal_periods=12)
hw_fit = hw_model.fit()

# Extract components
level = hw_fit.level
trend = hw_fit.trend
seasonal = hw_fit.season
residuals = y.values - hw_fit.fittedvalues

# Create subplot with components
fig = make_subplots(rows=4, cols=1, shared_xaxes=True, vertical_spacing=0.05,
                    subplot_titles=('Original Series', 'Level Component (ℓₜ)', 
                                   'Trend Component (bₜ)', 'Seasonal Component (sₜ)'))

timestamps = y.index.to_timestamp()

fig.add_trace(go.Scatter(x=timestamps, y=y.values, name='Observed', line=dict(color='#1f77b4')), row=1, col=1)
fig.add_trace(go.Scatter(x=timestamps, y=level, name='Level', line=dict(color='#ff7f0e')), row=2, col=1)
fig.add_trace(go.Scatter(x=timestamps, y=trend, name='Trend', line=dict(color='#2ca02c')), row=3, col=1)
fig.add_trace(go.Scatter(x=timestamps, y=seasonal, name='Seasonal', line=dict(color='#d62728')), row=4, col=1)

fig.update_layout(height=700, title_text="ETS Component Decomposition",
                  showlegend=False, template='plotly_white')
fig

### Effect of Alpha (α) on Smoothing

The smoothing parameter $\alpha$ controls how quickly the model adapts to recent observations. Let's visualize different alpha values to build intuition.

In [None]:
def simple_exponential_smoothing_demo(y, alpha):
    """Simple exponential smoothing for demonstration."""
    n = len(y)
    level = np.zeros(n)
    level[0] = y[0]  # Initialize with first observation
    
    for t in range(1, n):
        level[t] = alpha * y[t] + (1 - alpha) * level[t-1]
    
    return level

# Compare different alpha values
alphas = [0.1, 0.3, 0.6, 0.9]
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728']

fig = go.Figure()
fig.add_trace(go.Scatter(x=timestamps, y=y.values, name='Original', 
                         line=dict(color='gray', width=1), opacity=0.7))

for alpha, color in zip(alphas, colors):
    smoothed = simple_exponential_smoothing_demo(y.values, alpha)
    fig.add_trace(go.Scatter(x=timestamps, y=smoothed, name=f'α = {alpha}',
                            line=dict(color=color, width=2)))

fig.update_layout(
    title="Effect of α on Simple Exponential Smoothing",
    xaxis_title="Date",
    yaxis_title="Passengers",
    template='plotly_white',
    legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01),
    annotations=[
        dict(x=0.5, y=-0.15, xref='paper', yref='paper', showarrow=False,
             text="<b>Intuition:</b> Low α → smooth, slow to adapt | High α → reactive, tracks data closely",
             font=dict(size=12))
    ]
)
fig

## Fit ETS


In [None]:
from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.model_selection import temporal_train_test_split

model = ExponentialSmoothing(trend="add", seasonal="add", sp=12)

y_train, y_test = temporal_train_test_split(y, test_size=24)
fh = ForecastingHorizon(y_test.index, is_relative=False)

model.fit(y_train)
pred = model.predict(fh)



## Forecast Plot


In [None]:
fig = go.Figure()
fig.add_trace(go.Scatter(x=y_train.index.to_timestamp(), y=y_train, name="Train"))
fig.add_trace(go.Scatter(x=y_test.index.to_timestamp(), y=y_test, name="Test"))
fig.add_trace(go.Scatter(x=pred.index.to_timestamp(), y=pred, name="Forecast"))
fig.update_layout(title="ETS forecast vs actual")
fig

## Interpretation

ETS is excellent when seasonal patterns are regular and stable. Unlike ARIMA, it does not explicitly model autocorrelation but can outperform ARIMA on some seasonal series.
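
Claims about forecast quality are easier to judge against a simple baseline. A common one for seasonal data is the seasonal naive forecast, which repeats the last observed cycle. The sketch below uses illustrative helpers `seasonal_naive` and `rmse_demo` on a toy array; the same functions could be applied to a train/test split of any series.

```python
import numpy as np

def seasonal_naive(train, h, m):
    """Repeat the last m training values to cover h future steps."""
    last_cycle = np.asarray(train, dtype=float)[-m:]
    return np.array([last_cycle[i % m] for i in range(h)])

def rmse_demo(actual, predicted):
    """Root mean squared error between two equal-length arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.sqrt(np.mean((actual - predicted) ** 2))

train_demo = np.arange(8.0)  # two cycles of length 4
baseline_demo = seasonal_naive(train_demo, h=5, m=4)
print(baseline_demo)  # [4. 5. 6. 7. 4.]
```

A model is only worth its extra complexity if its test-set error is lower than this baseline's.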


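### Holt-Winters Method (with Trend and Seasonality)

The complete Holt-Winters method adds a seasonal component. We implement the **additive** version, where seasonal effects are constant in magnitude.

In [None]: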
def holt_winters(y: np.ndarray, alpha: float, beta: float, gamma: float, 
                  seasonal_periods: int = 12) -> tuple:
    """
    Holt-Winters Additive Method - for series with trend AND seasonality.
    
    Update equations:
        ℓₜ = α·(yₜ - sₜ₋ₘ) + (1-α)·(ℓₜ₋₁ + bₜ₋₁)
        bₜ = β·(ℓₜ - ℓₜ₋₁) + (1-β)·bₜ₋₁
        sₜ = γ·(yₜ - ℓₜ) + (1-γ)·sₜ₋ₘ
    
    Forecast: ŷₜ₊ₕ = ℓₜ + h·bₜ + sₜ₊ₕ₋ₘ₍ₖ₊₁₎, with k = ⌊(h-1)/m⌋
    
    Parameters
    ----------
    y : np.ndarray
        Time series values
    alpha : float
        Smoothing parameter for level (0 < α ≤ 1)
    beta : float
        Smoothing parameter for trend (0 < β ≤ 1)
    gamma : float
        Smoothing parameter for seasonality (0 < γ ≤ 1)
    seasonal_periods : int
        Number of periods in a complete seasonal cycle (e.g., 12 for monthly)
    
    Returns
    -------
    tuple : (fitted_values, level, trend, seasonal, residuals)
    """
    n = len(y)
    m = seasonal_periods
    
    level = np.zeros(n)
    trend = np.zeros(n)
    seasonal = np.zeros(n + m)  # Extra space for forecasting
    fitted = np.zeros(n)
    
    # === Initialization ===
    # Use first full seasonal cycle for initialization
    
    # Initial level: average of first season
    level[0] = np.mean(y[:m])
    
    # Initial trend: average slope between first two seasons
    if n >= 2 * m:
        trend[0] = (np.mean(y[m:2*m]) - np.mean(y[:m])) / m
    else:
        trend[0] = (y[m] - y[0]) / m if n > m else 0
    
    # Initial seasonal indices: deviation from level in first season
    for i in range(m):
        seasonal[i] = y[i] - level[0]
    
    fitted[0] = level[0] + seasonal[0]
    
    # === Recursive Updates ===
    for t in range(1, n):
        # Get seasonal index from m periods ago
        s_prev = seasonal[t - m] if t >= m else seasonal[t % m]
        
        # One-step-ahead forecast
        fitted[t] = level[t-1] + trend[t-1] + s_prev
        
        # Update level (de-seasonalized observation)
        level[t] = alpha * (y[t] - s_prev) + (1 - alpha) * (level[t-1] + trend[t-1])
        
        # Update trend
        trend[t] = beta * (level[t] - level[t-1]) + (1 - beta) * trend[t-1]
        
        # Update seasonal
        seasonal[t] = gamma * (y[t] - level[t]) + (1 - gamma) * s_prev
    
    residuals = y - fitted
    return fitted, level, trend, seasonal[:n], residuals


def holt_winters_forecast(level_final: float, trend_final: float, 
                          seasonal: np.ndarray, h: int, m: int,
                          sigma: float = None) -> tuple:
    """
    Generate h-step ahead forecast with prediction intervals.
    
    Parameters
    ----------
    level_final : float
        Final level estimate
    trend_final : float
        Final trend estimate
    seasonal : np.ndarray
        Seasonal components (last m values)
    h : int
        Forecast horizon
    m : int
        Seasonal period
    sigma : float, optional
        Residual standard deviation
    
    Returns
    -------
    tuple : (forecast, lower_95, upper_95)
    """
    forecast = np.zeros(h)
    
    for i in range(h):
        # Cycle through the last m seasonal indices: step i+1 reuses the index
        # from the same position in the final observed cycle.
        forecast[i] = level_final + (i + 1) * trend_final + seasonal[-(m - (i % m))]
    
    if sigma is not None:
        # Approximate prediction intervals
        z = 1.96
        # Heuristic widening with horizon (not the analytical ETS variance)
        horizon_factor = np.sqrt(1 + np.arange(1, h + 1) * 0.1)
        lower = forecast - z * sigma * horizon_factor
        upper = forecast + z * sigma * horizon_factor
        return forecast, lower, upper
    
    return forecast, None, None


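# Test Holt-Winters implementation
y_array = y.values.astype(float)  # float copy of the series, defined here so this cell runs on its own
fitted_hw, level_hw, trend_hw, seasonal_hw, resid_hw = holt_winters(
    y_array, alpha=0.4, beta=0.1, gamma=0.3, seasonal_periods=12
)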

print(f"Holt-Winters Additive with α=0.4, β=0.1, γ=0.3")
print(f"Final level: {level_hw[-1]:.2f}")
print(f"Final trend: {trend_hw[-1]:.2f}")
print(f"RMSE: {np.sqrt(np.mean(resid_hw**2)):.2f}")

### Holt's Linear Method (with Trend)

When the series has a trend, we need to track both the **level** and the **slope** of the trend.

In [None]:
def holt_linear(y: np.ndarray, alpha: float, beta: float) -> tuple:
    """
    Holt's Linear Method - for series with trend but no seasonality.
    
    Update equations:
        ℓₜ = α·yₜ + (1-α)·(ℓₜ₋₁ + bₜ₋₁)
        bₜ = β·(ℓₜ - ℓₜ₋₁) + (1-β)·bₜ₋₁
    
    Forecast: ŷₜ₊ₕ = ℓₜ + h·bₜ
    
    Parameters
    ----------
    y : np.ndarray
        Time series values
    alpha : float
        Smoothing parameter for level (0 < α ≤ 1)
    beta : float
        Smoothing parameter for trend (0 < β ≤ 1)
    
    Returns
    -------
    tuple : (fitted_values, level, trend, residuals)
    """
    n = len(y)
    level = np.zeros(n)
    trend = np.zeros(n)
    fitted = np.zeros(n)
    
    # Initialize using first few observations
    level[0] = y[0]
    trend[0] = y[1] - y[0] if n > 1 else 0  # Initial trend estimate
    fitted[0] = y[0]
    
    for t in range(1, n):
        # One-step-ahead forecast
        fitted[t] = level[t-1] + trend[t-1]
        
        # Update level
        level[t] = alpha * y[t] + (1 - alpha) * (level[t-1] + trend[t-1])
        
        # Update trend
        trend[t] = beta * (level[t] - level[t-1]) + (1 - beta) * trend[t-1]
    
    residuals = y - fitted
    return fitted, level, trend, residuals


def holt_forecast(level_final: float, trend_final: float, h: int) -> np.ndarray:
    """Generate h-step ahead forecast from Holt's method."""
    return np.array([level_final + i * trend_final for i in range(1, h + 1)])


# Test Holt's Linear implementation
fitted_holt, level_holt, trend_holt, resid_holt = holt_linear(y_array, alpha=0.3, beta=0.1)

print(f"Holt's Linear with α=0.3, β=0.1")
print(f"Final level: {level_holt[-1]:.2f}")
print(f"Final trend: {trend_holt[-1]:.2f}")
print(f"RMSE: {np.sqrt(np.mean(resid_holt**2)):.2f}")

### Simple Exponential Smoothing (Level Only)

With no trend or seasonality, we track only the **level** and forecast it forward as a flat line.

In [None]:
def simple_exponential_smoothing(y: np.ndarray, alpha: float) -> tuple:
    """
    Simple Exponential Smoothing (SES) - for series with no trend or seasonality.
    
    Update equation: ℓₜ = α·yₜ + (1-α)·ℓₜ₋₁
    Forecast: ŷₜ₊ₕ = ℓₜ (flat forecast)
    
    Parameters
    ----------
    y : np.ndarray
        Time series values
    alpha : float
        Smoothing parameter for level (0 < α ≤ 1)
    
    Returns
    -------
    tuple : (fitted_values, level, residuals)
    """
    n = len(y)
    level = np.zeros(n)
    fitted = np.zeros(n)
    
    # Initialize: use first observation as initial level
    level[0] = y[0]
    fitted[0] = y[0]
    
    # Recursive update
    for t in range(1, n):
        # Forecast is the previous level
        fitted[t] = level[t-1]
        # Update level with new observation
        level[t] = alpha * y[t] + (1 - alpha) * level[t-1]
    
    residuals = y - fitted
    return fitted, level, residuals


def ses_forecast(level_final: float, h: int, sigma: float = None,
                 alpha: float = None) -> tuple:
    """
    Generate h-step ahead forecast from SES.
    
    Parameters
    ----------
    level_final : float
        Final level estimate
    h : int
        Forecast horizon
    sigma : float, optional
        Residual standard deviation for prediction intervals
    alpha : float, optional
        Smoothing parameter used to fit the level. If given, the interval
        widens with the horizon; if omitted, the width is constant.
    
    Returns
    -------
    tuple : (point_forecast, lower_95, upper_95)
    """
    # SES produces flat forecasts
    forecast = np.full(h, level_final)
    
    if sigma is not None:
        z = 1.96  # 95% coverage under normal errors
        steps = np.arange(1, h + 1)
        # Var(yₜ₊ₕ | t) = σ² · (1 + (h-1)·α²) for SES with additive errors
        if alpha is not None:
            scale = np.sqrt(1 + (steps - 1) * alpha**2)
        else:
            scale = np.ones(h)
        lower = forecast - z * sigma * scale
        upper = forecast + z * sigma * scale
        return forecast, lower, upper
    
    return forecast, None, None


# Test SES implementation
y_array = y.values.astype(float)
fitted_ses, level_ses, resid_ses = simple_exponential_smoothing(y_array, alpha=0.3)

print(f"SES with α=0.3")
print(f"Final level: {level_ses[-1]:.2f}")
print(f"RMSE: {np.sqrt(np.mean(resid_ses**2)):.2f}")

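### Comparing NumPy vs statsmodels Implementations

Let's check our NumPy Holt-Winters implementation against the statsmodels implementation (which sktime wraps). The NumPy version uses hand-picked smoothing parameters while statsmodels optimizes them, so the two fits are not expected to match exactly.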

In [None]:
# Compare our NumPy implementation with statsmodels (which sktime wraps)
comparison_fig = go.Figure()

comparison_fig.add_trace(go.Scatter(
    x=timestamps, y=y.values,
    name='Original',
    line=dict(color='#1f77b4', width=2)
))

comparison_fig.add_trace(go.Scatter(
    x=timestamps, y=fitted_hw,
    name='NumPy Implementation',
    line=dict(color='#ff7f0e', width=2)
))

comparison_fig.add_trace(go.Scatter(
    x=timestamps, y=hw_fit.fittedvalues,
    name='statsmodels Implementation',
    line=dict(color='#2ca02c', width=2, dash='dash')
))

comparison_fig.update_layout(
    title="NumPy vs statsmodels Holt-Winters Comparison",
    xaxis_title="Date",
    yaxis_title="Passengers",
    template='plotly_white',
    legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01)
)
comparison_fig.show()

# Compute error metrics
rmse_numpy = np.sqrt(np.mean((y.values - fitted_hw)**2))
rmse_statsmodels = np.sqrt(np.mean((y.values - hw_fit.fittedvalues)**2))
mae_numpy = np.mean(np.abs(y.values - fitted_hw))
mae_statsmodels = np.mean(np.abs(y.values - hw_fit.fittedvalues))

print("Performance Comparison:")
print(f"{'Metric':<15} {'NumPy':>12} {'statsmodels':>12}")
print("-" * 40)
print(f"{'RMSE':<15} {rmse_numpy:>12.2f} {rmse_statsmodels:>12.2f}")
print(f"{'MAE':<15} {mae_numpy:>12.2f} {mae_statsmodels:>12.2f}")

### Residual Diagnostics

Good forecasting models should have residuals that are:
1. **Uncorrelated** (no remaining autocorrelation)
2. **Zero mean** (no systematic bias)
3. **Constant variance** (homoscedastic)
4. **Normally distributed** (for valid prediction intervals)
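
The first property can also be checked with a formal test. The Ljung-Box statistic pools the squared autocorrelations over several lags; a small p-value suggests that autocorrelation remains in the residuals. The sketch below is a hand-rolled version for illustration (the function name `ljung_box` and the `n_params` degrees-of-freedom adjustment are assumptions of this sketch), demonstrated on simulated data rather than on the model residuals.

```python
import numpy as np
from scipy.stats import chi2

def ljung_box(resid, nlags=10, n_params=0):
    """Ljung-Box Q statistic and p-value for the first nlags autocorrelations."""
    x = np.asarray(resid, dtype=float)
    n = len(x)
    x = x - x.mean()
    denom = np.sum(x ** 2)
    lags = np.arange(1, nlags + 1)
    # Sample autocorrelation at each lag
    r = np.array([np.sum(x[:-k] * x[k:]) / denom for k in lags])
    q = n * (n + 2) * np.sum(r ** 2 / (n - lags))
    p_value = chi2.sf(q, df=nlags - n_params)
    return q, p_value

rng_lb = np.random.default_rng(1)
noise_demo = rng_lb.normal(size=200)   # no autocorrelation by construction
walk_demo = np.cumsum(noise_demo)      # strongly autocorrelated

q_noise, p_noise = ljung_box(noise_demo)
q_walk, p_walk = ljung_box(walk_demo)
print(f"White noise: Q={q_noise:.1f}, p={p_noise:.3f}")
print(f"Random walk: Q={q_walk:.1f}, p={p_walk:.3g}")
```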

In [None]:
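# Residual diagnostics for Holt-Winters
sigma_hw = np.std(resid_hw)  # residual standard deviation, used for the normal overlay and Q-Q line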
fig = make_subplots(rows=2, cols=2, 
                    subplot_titles=('Residuals Over Time', 'Residual Distribution',
                                   'ACF of Residuals', 'Q-Q Plot'),
                    vertical_spacing=0.12, horizontal_spacing=0.1)

# 1. Residuals over time
fig.add_trace(go.Scatter(
    x=timestamps, y=resid_hw,
    mode='lines+markers',
    marker=dict(size=3),
    line=dict(color='#1f77b4', width=1),
    name='Residuals'
), row=1, col=1)
fig.add_hline(y=0, line_dash="dash", line_color="red", row=1, col=1)

# 2. Residual distribution (histogram)
fig.add_trace(go.Histogram(
    x=resid_hw, 
    nbinsx=20,
    name='Residual Distribution',
    marker_color='#2ca02c',
    opacity=0.7
), row=1, col=2)

# Add normal curve overlay
x_range = np.linspace(resid_hw.min(), resid_hw.max(), 100)
normal_curve = (1 / (sigma_hw * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x_range - resid_hw.mean()) / sigma_hw) ** 2)
# Scale to match histogram
normal_curve = normal_curve * len(resid_hw) * (resid_hw.max() - resid_hw.min()) / 20
fig.add_trace(go.Scatter(
    x=x_range, y=normal_curve,
    mode='lines',
    line=dict(color='red', width=2),
    name='Normal'
), row=1, col=2)

# 3. ACF of residuals (simplified)
def compute_acf(x, nlags=20):
    """Compute autocorrelation function."""
    n = len(x)
    x_centered = x - np.mean(x)
    acf = np.zeros(nlags + 1)
    var = np.sum(x_centered ** 2)
    for lag in range(nlags + 1):
        acf[lag] = np.sum(x_centered[:n-lag] * x_centered[lag:]) / var
    return acf

acf_values = compute_acf(resid_hw, nlags=20)
lags = np.arange(len(acf_values))
confidence_bound = 1.96 / np.sqrt(len(resid_hw))

fig.add_trace(go.Bar(
    x=lags, y=acf_values,
    marker_color='#1f77b4',
    name='ACF'
), row=2, col=1)
fig.add_hline(y=confidence_bound, line_dash="dash", line_color="red", row=2, col=1)
fig.add_hline(y=-confidence_bound, line_dash="dash", line_color="red", row=2, col=1)

# 4. Q-Q Plot
from scipy.stats import norm

sorted_residuals = np.sort(resid_hw)
n = len(sorted_residuals)
# Exact standard-normal quantiles at plotting positions (i - 0.5) / n
theoretical_quantiles = norm.ppf((np.arange(1, n + 1) - 0.5) / n)

fig.add_trace(go.Scatter(
    x=theoretical_quantiles, y=sorted_residuals,
    mode='markers',
    marker=dict(color='#1f77b4', size=5),
    name='Q-Q'
), row=2, col=2)

# Add reference line
qq_min, qq_max = theoretical_quantiles.min(), theoretical_quantiles.max()
fig.add_trace(go.Scatter(
    x=[qq_min, qq_max],
    y=[resid_hw.mean() + qq_min * sigma_hw, resid_hw.mean() + qq_max * sigma_hw],
    mode='lines',
    line=dict(color='red', dash='dash'),
    name='Reference'
), row=2, col=2)

fig.update_layout(
    height=600, 
    title_text="Holt-Winters Residual Diagnostics",
    showlegend=False,
    template='plotly_white'
)

fig.update_xaxes(title_text="Date", row=1, col=1)
fig.update_xaxes(title_text="Residual Value", row=1, col=2)
fig.update_xaxes(title_text="Lag", row=2, col=1)
fig.update_xaxes(title_text="Theoretical Quantiles", row=2, col=2)

fig.update_yaxes(title_text="Residual", row=1, col=1)
fig.update_yaxes(title_text="Frequency", row=1, col=2)
fig.update_yaxes(title_text="ACF", row=2, col=1)
fig.update_yaxes(title_text="Sample Quantiles", row=2, col=2)

fig.show()

# Print diagnostic statistics
print("Residual Diagnostics Summary:")
print(f"  Mean: {resid_hw.mean():.4f} (should be ≈ 0)")
print(f"  Std Dev: {sigma_hw:.4f}")
print(f"  Skewness: {((resid_hw - resid_hw.mean())**3).mean() / sigma_hw**3:.4f} (should be ≈ 0)")
print(f"  Kurtosis: {((resid_hw - resid_hw.mean())**4).mean() / sigma_hw**4 - 3:.4f} (should be ≈ 0 for normal)")

---

## Forecast with Prediction Intervals

Using our NumPy Holt-Winters implementation, let's generate forecasts with prediction intervals. The intervals widen over time as uncertainty grows.

In [None]:
# Generate forecast using our NumPy implementation
h = 24  # Forecast 24 months ahead
sigma_hw = np.std(resid_hw)

# Get forecast with prediction intervals
forecast_hw, lower_hw, upper_hw = holt_winters_forecast(
    level_final=level_hw[-1],
    trend_final=trend_hw[-1],
    seasonal=seasonal_hw,
    h=h,
    m=12,
    sigma=sigma_hw
)

# Create forecast dates
last_date = y.index[-1].to_timestamp()
# Month-start frequency matches the month-start `timestamps` of the historical data
forecast_dates = pd.date_range(start=last_date + pd.DateOffset(months=1), periods=h, freq='MS')

# Plot with prediction intervals
fig = go.Figure()

# Historical data
fig.add_trace(go.Scatter(
    x=timestamps, y=y.values, 
    name='Historical', 
    line=dict(color='#1f77b4', width=2)
))

# Fitted values
fig.add_trace(go.Scatter(
    x=timestamps, y=fitted_hw,
    name='Fitted (NumPy HW)',
    line=dict(color='#ff7f0e', width=1.5, dash='dash')
))

# Prediction interval (shaded)
fig.add_trace(go.Scatter(
    x=np.concatenate([forecast_dates, forecast_dates[::-1]]),
    y=np.concatenate([upper_hw, lower_hw[::-1]]),
    fill='toself',
    fillcolor='rgba(44, 160, 44, 0.2)',
    line=dict(color='rgba(255,255,255,0)'),
    name='95% Prediction Interval'
))

# Point forecast
fig.add_trace(go.Scatter(
    x=forecast_dates, y=forecast_hw,
    name='Forecast',
    line=dict(color='#2ca02c', width=2)
))

fig.update_layout(
    title="Holt-Winters Forecast with 95% Prediction Intervals (NumPy Implementation)",
    xaxis_title="Date",
    yaxis_title="Passengers",
    template='plotly_white',
    legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01),
    hovermode='x unified'
)
fig

---

## Key Takeaways

| Aspect | Description |
|--------|-------------|
| **When to Use** | Series with stable seasonal patterns; no need for complex autocorrelation modeling |
| **Advantages** | Interpretable components, fast computation, few parameters to estimate |
| **Limitations** | Assumes a single fixed seasonal period and a simple level/trend/seasonal structure; may underperform on complex dynamics |
| **Parameter Selection** | Use AIC/BIC for automatic selection, or cross-validation |
| **vs ARIMA** | ETS explicitly models level/trend/seasonality; ARIMA models autocorrelation structure |

**Best Practices:**
1. Always visualize components to verify model appropriateness
2. Check residual diagnostics before trusting forecasts
3. Use prediction intervals to quantify uncertainty
4. Consider both additive and multiplicative specifications
5. For seasonal periods > 24, consider Fourier terms instead
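
On point 4, one quick way to tell which specification fits is to compare the seasonal swing across years. If the swing grows with the level, seasonality is multiplicative, and a log transform turns it into an additive pattern. The sketch below illustrates this on a synthetic series (the data and the helper `yearly_swing` are assumptions for the demonstration).

```python
import numpy as np

t_demo = np.arange(48)
trend_part = 100.0 * 1.02 ** t_demo                       # 2% growth per month
seasonal_factor = 1.0 + 0.2 * np.sin(2 * np.pi * t_demo / 12)
y_mult = trend_part * seasonal_factor                     # multiplicative seasonality

def yearly_swing(x, m=12):
    """Peak-to-trough range within each complete cycle of length m."""
    cycles = x[: len(x) // m * m].reshape(-1, m)
    return cycles.max(axis=1) - cycles.min(axis=1)

swing_raw = yearly_swing(y_mult)
swing_log = yearly_swing(np.log(y_mult))

print("Raw swing per year:", np.round(swing_raw, 1))   # grows with the level
print("Log swing per year:", np.round(swing_log, 3))   # constant
```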