[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/danpele/Time-Series-Analysis/blob/main/chapter4_seminar_notebook.ipynb)

---

# Chapter 4 Seminar: SARIMA Models - Practice Exercises

**Course:** Time Series Analysis and Forecasting  
**Program:** Bachelor program, Faculty of Cybernetics, Statistics and Economic Informatics, Bucharest University of Economic Studies, Romania  
**Academic Year:** 2025-2026

---

## Seminar Objectives

1. Identify and visualize seasonal patterns in time series
2. Practice seasonal decomposition
3. Apply seasonal differencing
4. Fit and diagnose SARIMA models
5. Generate seasonal forecasts
6. Work with real, seasonal economic data

## Setup

In [None]:
# Core libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Time series
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller, kpss, acf, pacf
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.stats.diagnostic import acorr_ljungbox
from scipy import stats

# Auto-ARIMA
try:
    import pmdarima as pm
except ImportError:
    !pip install pmdarima -q
    import pmdarima as pm

# Plotting style
plt.rcParams['figure.figsize'] = (12, 5)
plt.rcParams['font.size'] = 11
plt.rcParams['axes.facecolor'] = 'none'
plt.rcParams['figure.facecolor'] = 'none'
plt.rcParams['axes.grid'] = False
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False

COLORS = {'blue': '#1A3A6E', 'red': '#DC3545', 'green': '#2E7D32', 'orange': '#E67E22'}

print("Setup complete!")

## Exercise 1: Identifying Seasonality

### Task
Load and visualize the classic airline passengers dataset. Identify the seasonal pattern.

In [None]:
# Load airline passengers data
try:
    url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
    airline = pd.read_csv(url, index_col=0, parse_dates=True)
    airline.columns = ['Passengers']
    airline.index = pd.date_range('1949-01', periods=len(airline), freq='ME')
except Exception:
    # Fallback (e.g. no network access): generate synthetic data
    np.random.seed(42)
    n = 144
    t = np.arange(n)
    trend = 100 + 2.5 * t
    noise = np.random.randn(n) * 10
    passengers = trend * (1 + 0.3 * np.sin(2 * np.pi * t / 12)) + noise
    airline = pd.DataFrame({'Passengers': passengers},
                          index=pd.date_range('1949-01', periods=n, freq='ME'))

print(f"Data: {len(airline)} monthly observations")
print(f"Period: {airline.index[0].date()} to {airline.index[-1].date()}")
airline.head()

In [None]:
# Visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Time series plot
axes[0, 0].plot(airline.index, airline['Passengers'], color=COLORS['blue'], linewidth=1, label='Passengers')
axes[0, 0].set_title('Airline Passengers (1949-1960)', fontweight='bold')
axes[0, 0].set_xlabel('Date')
axes[0, 0].set_ylabel('Passengers (thousands)')
axes[0, 0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), frameon=False)

# Monthly boxplot
monthly_data = airline.copy()
monthly_data['Month'] = monthly_data.index.month
monthly_data.boxplot(column='Passengers', by='Month', ax=axes[0, 1])
axes[0, 1].set_title('Passengers by Month', fontweight='bold')
axes[0, 1].set_xlabel('Month')
axes[0, 1].set_ylabel('Passengers')
plt.suptitle('')  # Remove automatic title

# Yearly plot (multiple years overlaid)
for year in airline.index.year.unique():
    yearly_data = airline[airline.index.year == year]
    axes[1, 0].plot(range(1, 13), yearly_data['Passengers'].values, 
                    alpha=0.5, linewidth=1, label=str(year) if year % 3 == 0 else '')
axes[1, 0].set_title('Seasonal Pattern by Year', fontweight='bold')
axes[1, 0].set_xlabel('Month')
axes[1, 0].set_ylabel('Passengers')
axes[1, 0].set_xticks(range(1, 13))
axes[1, 0].set_xticklabels(['J', 'F', 'M', 'A', 'M', 'J', 'J', 'A', 'S', 'O', 'N', 'D'])
axes[1, 0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=4, frameon=False)

# Average by month
monthly_avg = airline.groupby(airline.index.month).mean()
axes[1, 1].bar(range(1, 13), monthly_avg['Passengers'].values, 
               color=COLORS['blue'], alpha=0.7, label='Average')
axes[1, 1].set_title('Average Passengers by Month', fontweight='bold')
axes[1, 1].set_xlabel('Month')
axes[1, 1].set_ylabel('Average Passengers')
axes[1, 1].set_xticks(range(1, 13))
axes[1, 1].set_xticklabels(['J', 'F', 'M', 'A', 'M', 'J', 'J', 'A', 'S', 'O', 'N', 'D'])
axes[1, 1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), frameon=False)

plt.tight_layout()
plt.show()

print("\nKey observations:")
print(f"- Peak month: {monthly_avg['Passengers'].idxmax()} (summer travel)")
print(f"- Trough month: {monthly_avg['Passengers'].idxmin()} (winter)")
print(f"- Seasonal amplitude grows over time (multiplicative seasonality)")

### Exercise 1 Questions

1. What is the seasonal period for this data?
2. Is the seasonality additive or multiplicative? How can you tell?
3. Why might summer months have higher passenger counts?
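
One way to approach question 2 numerically is to check whether the within-year spread grows with the level of the series. The sketch below is only an illustration on a synthetic series (the `demo` series and the `swing_growth` helper are hypothetical, not part of the seminar code): under multiplicative seasonality the raw spread grows markedly over the sample, while on the log scale it stays much closer to constant.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series with multiplicative seasonality
# (assumption: a stand-in for the passenger data, not the real dataset)
rng = np.random.default_rng(0)
t = np.arange(144)
demo = pd.Series(
    (100 + 2.5 * t) * (1 + 0.2 * np.sin(2 * np.pi * t / 12)) + rng.normal(0, 3, t.size),
    index=pd.date_range("1949-01-01", periods=t.size, freq="MS"),
)

def swing_growth(series):
    """Ratio of the last year's within-year std to the first year's."""
    yearly_std = series.groupby(series.index.year).std()
    return yearly_std.iloc[-1] / yearly_std.iloc[0]

raw_growth = swing_growth(demo)          # well above 1: swing scales with the level
log_growth = swing_growth(np.log(demo))  # much closer to 1 after the log transform

print(f"Swing growth, raw scale: {raw_growth:.2f}")
print(f"Swing growth, log scale: {log_growth:.2f}")
```

The same helper can be applied to `airline['Passengers']` once the data is loaded.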

## Exercise 2: Seasonal Decomposition

### Task
Decompose the airline passengers series into trend, seasonal, and residual components.

In [None]:
# Compare additive vs multiplicative decomposition
decomp_add = seasonal_decompose(airline['Passengers'], model='additive', period=12)
decomp_mult = seasonal_decompose(airline['Passengers'], model='multiplicative', period=12)

fig, axes = plt.subplots(4, 2, figsize=(14, 12))
fig.suptitle('Additive vs Multiplicative Decomposition', fontsize=14, fontweight='bold')

# Additive (left column)
axes[0, 0].plot(decomp_add.observed, color=COLORS['blue'], linewidth=1)
axes[0, 0].set_title('Observed (Additive)', fontweight='bold')

axes[1, 0].plot(decomp_add.trend, color=COLORS['green'], linewidth=1)
axes[1, 0].set_title('Trend (Additive)', fontweight='bold')

axes[2, 0].plot(decomp_add.seasonal, color=COLORS['orange'], linewidth=1)
axes[2, 0].set_title('Seasonal (Additive) - constant amplitude', fontweight='bold')

axes[3, 0].plot(decomp_add.resid, color=COLORS['red'], linewidth=1)
axes[3, 0].set_title('Residual (Additive)', fontweight='bold')

# Multiplicative (right column)
axes[0, 1].plot(decomp_mult.observed, color=COLORS['blue'], linewidth=1)
axes[0, 1].set_title('Observed (Multiplicative)', fontweight='bold')

axes[1, 1].plot(decomp_mult.trend, color=COLORS['green'], linewidth=1)
axes[1, 1].set_title('Trend (Multiplicative)', fontweight='bold')

axes[2, 1].plot(decomp_mult.seasonal, color=COLORS['orange'], linewidth=1)
axes[2, 1].axhline(y=1, color='black', linestyle='--', alpha=0.3)
axes[2, 1].set_title('Seasonal (Multiplicative) - factors around 1', fontweight='bold')

axes[3, 1].plot(decomp_mult.resid, color=COLORS['red'], linewidth=1)
axes[3, 1].axhline(y=1, color='black', linestyle='--', alpha=0.3)
axes[3, 1].set_title('Residual (Multiplicative)', fontweight='bold')

plt.tight_layout()
plt.show()

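# Compare residual spread on a common (relative) scale.
# Additive residuals are in passenger units, multiplicative residuals are
# ratios around 1, so their raw variances are not directly comparable.
add_rel_resid = decomp_add.resid / (decomp_add.trend + decomp_add.seasonal)
mult_rel_resid = decomp_mult.resid - 1

add_rel_std = np.nanstd(add_rel_resid)
mult_rel_std = np.nanstd(mult_rel_resid)

print(f"\nRelative Residual Spread (std of residual / fitted value):")
print(f"  Additive: {add_rel_std:.4f}")
print(f"  Multiplicative: {mult_rel_std:.4f}")
better = "Multiplicative" if mult_rel_std < add_rel_std else "Additive"
print(f"\n{better} model has the smaller relative residual spread")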

In [None]:
# Extract and display seasonal factors
seasonal_factors = decomp_mult.seasonal.iloc[:12]

print("Monthly Seasonal Factors (Multiplicative):")
print("="*50)
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
for month, factor in zip(months, seasonal_factors):
    pct_change = (factor - 1) * 100
    direction = "↑" if pct_change > 0 else "↓"
    print(f"  {month}: {factor:.3f} ({pct_change:+.1f}% {direction})")
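
To make the seasonal factors less of a black box, the following sketch reproduces the classical multiplicative procedure by hand: a centered 2x12 moving average for the trend, division to detrend, month-by-month averaging, and normalization to mean 1. It runs on a synthetic series with known factors (all names here are illustrative assumptions), and it is meant as an approximation of what `seasonal_decompose` does rather than a copy of its implementation.

```python
import numpy as np
import pandas as pd

# Synthetic multiplicative series with known seasonal factors (assumption)
rng = np.random.default_rng(0)
n_years, period = 10, 12
true_factors = 1 + 0.2 * np.sin(2 * np.pi * np.arange(period) / period)
t = np.arange(n_years * period)
series = pd.Series(
    (100 + 2 * t) * np.tile(true_factors, n_years) * rng.normal(1, 0.01, t.size),
    index=pd.date_range("2000-01-01", periods=t.size, freq="MS"),
)

# 1. Trend: centered 2x12 moving average (half weight on the two end points)
weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
trend = series.rolling(period + 1, center=True).apply(lambda w: w @ weights, raw=True)

# 2. Detrend by division; 3. average the ratios month by month (NaNs are skipped)
ratios = series / trend
factors = ratios.groupby(ratios.index.month).mean()

# 4. Normalize so that the factors average to 1
factors = factors / factors.mean()

print(pd.DataFrame({"estimated": factors.values, "true": true_factors}).round(3))
```

The estimated factors should land close to the planted ones; small gaps come from noise and from the moving average only approximating the trend.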

## Exercise 3: Seasonal Differencing

### Task
Apply different differencing operators and test for stationarity.

In [None]:
# Apply log transformation and differencing
y = airline['Passengers'].values
log_y = np.log(y)

# Different differencing operations
diff1 = np.diff(log_y)                    # (1-L)
diff12 = log_y[12:] - log_y[:-12]         # (1-L^12)
diff1_12 = np.diff(diff12)                # (1-L)(1-L^12)

print("Number of observations after each transformation:")
print(f"  Original: {len(log_y)}")
print(f"  After (1-L): {len(diff1)}")
print(f"  After (1-L^12): {len(diff12)}")
print(f"  After (1-L)(1-L^12): {len(diff1_12)}")

In [None]:
# Visualize each transformation
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Original log series
axes[0, 0].plot(log_y, color=COLORS['blue'], linewidth=1)
axes[0, 0].set_title('log(Y_t): Trend + Seasonality', fontweight='bold')

# First difference only
axes[0, 1].plot(diff1, color=COLORS['green'], linewidth=1)
axes[0, 1].axhline(y=0, color='red', linestyle='--', alpha=0.5)
axes[0, 1].set_title('(1-L)log(Y): Seasonality remains', fontweight='bold')

# Seasonal difference only
axes[1, 0].plot(diff12, color=COLORS['orange'], linewidth=1)
axes[1, 0].axhline(y=0, color='red', linestyle='--', alpha=0.5)
axes[1, 0].set_title('(1-L^12)log(Y): Trend remains', fontweight='bold')

# Both differences
axes[1, 1].plot(diff1_12, color=COLORS['red'], linewidth=1)
axes[1, 1].axhline(y=0, color='black', linestyle='--', alpha=0.5)
axes[1, 1].set_title('(1-L)(1-L^12)log(Y): Stationary!', fontweight='bold')

plt.tight_layout()
plt.show()

In [None]:
# Unit root tests on each transformation
def adf_summary(series, name):
    result = adfuller(series, autolag='AIC')
    status = "STATIONARY" if result[1] < 0.05 else "NON-STATIONARY"
    print(f"{name:<30} ADF={result[0]:>8.3f}  p={result[1]:.4f}  → {status}")

print("ADF Test Results:")
print("="*75)
adf_summary(log_y, "log(Y)")
adf_summary(diff1, "(1-L)log(Y)")
adf_summary(diff12, "(1-L^12)log(Y)")
adf_summary(diff1_12, "(1-L)(1-L^12)log(Y)")
print("="*75)
print("\nConclusion: Need BOTH d=1 AND D=1 for stationarity")
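
The combined operator $(1-L)(1-L^{12})$ can be applied in either order, and it expands to $y_t - y_{t-1} - y_{t-12} + y_{t-13}$. The short check below illustrates this with pandas on a synthetic log-scale series (the series itself is an assumption made for the demonstration); it also shows that 13 observations are lost.

```python
import numpy as np
import pandas as pd

# Synthetic log-scale monthly series (assumption: stands in for log passengers)
rng = np.random.default_rng(1)
t = np.arange(60)
log_series = pd.Series(
    4.5 + 0.01 * t + 0.1 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.02, t.size),
    index=pd.date_range("2000-01-01", periods=t.size, freq="MS"),
)

# (1-L)(1-L^12): seasonal difference first, then regular difference
seasonal_then_regular = log_series.diff(12).diff(1).dropna()

# (1-L^12)(1-L): regular difference first, then seasonal difference
regular_then_seasonal = log_series.diff(1).diff(12).dropna()

# Expanded form: y_t - y_{t-1} - y_{t-12} + y_{t-13}
expanded = (log_series - log_series.shift(1)
            - log_series.shift(12) + log_series.shift(13)).dropna()

print(len(log_series) - len(seasonal_then_regular))                 # 13
print(np.allclose(seasonal_then_regular, regular_then_seasonal))    # True
print(np.allclose(seasonal_then_regular, expanded))                 # True
```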

## Exercise 4: ACF/PACF for Seasonal Model Identification

### Task
Analyze ACF and PACF patterns to identify SARIMA orders.

In [None]:
# ACF and PACF of the fully differenced series
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Full ACF
plot_acf(diff1_12, ax=axes[0, 0], lags=48, color=COLORS['blue'])
axes[0, 0].set_title('ACF of (1-L)(1-L^12)log(Y)', fontweight='bold')
for lag in [12, 24, 36, 48]:
    axes[0, 0].axvline(x=lag, color='red', linestyle=':', alpha=0.5)

# Full PACF
plot_pacf(diff1_12, ax=axes[0, 1], lags=48, color=COLORS['blue'])
axes[0, 1].set_title('PACF of (1-L)(1-L^12)log(Y)', fontweight='bold')
for lag in [12, 24, 36, 48]:
    axes[0, 1].axvline(x=lag, color='red', linestyle=':', alpha=0.5)

# Non-seasonal lags only (1-15)
plot_acf(diff1_12, ax=axes[1, 0], lags=15, color=COLORS['green'])
axes[1, 0].set_title('ACF: Non-seasonal lags', fontweight='bold')

plot_pacf(diff1_12, ax=axes[1, 1], lags=15, color=COLORS['green'])
axes[1, 1].set_title('PACF: Non-seasonal lags', fontweight='bold')

plt.tight_layout()
plt.show()

print("\nACF/PACF Analysis:")
print("- Significant spike at lag 1 in ACF → q=1 (MA component)")
print("- Significant spike at lag 12 in ACF → Q=1 (Seasonal MA)")
print("- Suggested model: SARIMA(0,1,1)(0,1,1)[12] - the airline model")
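
For comparison with the sample ACF, the theoretical ACF implied by the airline model can be computed directly. The sketch below assumes the sign convention $(1+\theta L)(1+\Theta L^{12})$ and uses hypothetical coefficient values, not estimates from the data. Besides lags 1 and 12, the product of the two MA polynomials also produces smaller, equal spikes at lags 11 and 13, and zero autocorrelation everywhere else.

```python
import numpy as np

theta, Theta = -0.4, -0.6   # hypothetical values, not estimates from the data

# MA polynomial (1 + theta*L)(1 + Theta*L^12) as coefficients in powers of L
nonseasonal = np.array([1.0, theta])
seasonal = np.zeros(13)
seasonal[0], seasonal[12] = 1.0, Theta
psi = np.convolve(nonseasonal, seasonal)      # coefficients for lags 0..13

# Autocovariance of an MA process with unit noise variance:
# gamma_k = sum_j psi_j * psi_{j+k}
gamma = np.array([psi[: len(psi) - k] @ psi[k:] for k in range(len(psi))])
rho = gamma / gamma[0]

nonzero_lags = [k for k in range(1, len(rho)) if abs(rho[k]) > 1e-12]
print(nonzero_lags)   # [1, 11, 12, 13]
for k in nonzero_lags:
    print(f"rho[{k}] = {rho[k]:+.3f}")
```

In closed form, $\rho_1 = \theta/(1+\theta^2)$ and $\rho_{12} = \Theta/(1+\Theta^2)$, which is why both spikes are negative when both coefficients are negative.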

## Exercise 5: Fitting SARIMA Models

### Task
Fit and compare different SARIMA specifications.

In [None]:
# Compare different SARIMA models
print("SARIMA Model Comparison:")
print("="*70)
print(f"{'Model':<35} {'AIC':>12} {'BIC':>12}")
print("-"*70)

models_to_try = [
    ((0, 1, 1), (0, 1, 1, 12)),  # Airline model
    ((1, 1, 0), (1, 1, 0, 12)),  # Pure AR
    ((1, 1, 1), (0, 1, 1, 12)),  # Mixed non-seasonal
    ((0, 1, 1), (1, 1, 0, 12)),  # Mixed seasonal
    ((1, 1, 1), (1, 1, 1, 12)),  # Full model
    ((2, 1, 0), (0, 1, 1, 12)),  # AR(2) + seasonal MA
]

results_dict = {}
for order, seasonal_order in models_to_try:
    try:
        model = SARIMAX(log_y, order=order, seasonal_order=seasonal_order,
                        enforce_stationarity=False, enforce_invertibility=False)
        res = model.fit(disp=False)
        name = f"SARIMA{order}x{seasonal_order[:3]}[{seasonal_order[3]}]"
        results_dict[name] = res
        print(f"{name:<35} {res.aic:>12.2f} {res.bic:>12.2f}")
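    except Exception as e:
        # Report failed fits instead of silently skipping them
        print(f"SARIMA{order}x{seasonal_order[:3]}[{seasonal_order[3]}] failed: {e}")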

# Find best model
best_model_name = min(results_dict.keys(), key=lambda x: results_dict[x].aic)
print("-"*70)
print(f"Best model by AIC: {best_model_name}")

In [None]:
# Fit the airline model and show detailed results
airline_model = SARIMAX(log_y, order=(0, 1, 1), seasonal_order=(0, 1, 1, 12),
                        enforce_stationarity=False, enforce_invertibility=False)
airline_results = airline_model.fit(disp=False)

print("Airline Model: SARIMA(0,1,1)(0,1,1)[12]")
print("="*60)
print(airline_results.summary())

In [None]:
# Use auto_arima for automatic selection
print("\nAutomatic Model Selection:")
print("="*60)

auto_model = pm.auto_arima(
    log_y,
    start_p=0, start_q=0, max_p=2, max_q=2,
    d=1,
    start_P=0, start_Q=0, max_P=2, max_Q=2,
    D=1, m=12,
    seasonal=True,
    stepwise=True,
    suppress_warnings=True,
    trace=True
)

print(f"\nAuto-selected: SARIMA{auto_model.order}x{auto_model.seasonal_order[:3]}[{auto_model.seasonal_order[3]}]")

## Exercise 6: Model Diagnostics

### Task
Validate the fitted model using residual analysis.

In [None]:
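# Diagnostic plots
# Drop the first d + D*s = 1 + 12 = 13 residuals: they are start-up artifacts
# of the differencing initialization, not genuine one-step-ahead errors
residuals = airline_results.resid[13:]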

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Residuals over time
axes[0, 0].plot(residuals, color=COLORS['blue'], linewidth=0.5)
axes[0, 0].axhline(y=0, color='red', linestyle='--')
axes[0, 0].set_title('Residuals Over Time', fontweight='bold')
axes[0, 0].set_xlabel('Time')

# Histogram
axes[0, 1].hist(residuals, bins=20, color=COLORS['blue'], edgecolor='black', 
                alpha=0.7, density=True)
x = np.linspace(residuals.min(), residuals.max(), 100)
axes[0, 1].plot(x, stats.norm.pdf(x, residuals.mean(), residuals.std()), 
                color=COLORS['red'], linewidth=2, label='Normal')
axes[0, 1].set_title('Residual Distribution', fontweight='bold')
axes[0, 1].legend(loc='upper right')

# ACF of residuals
plot_acf(residuals, ax=axes[1, 0], lags=36, color=COLORS['blue'])
axes[1, 0].set_title('ACF of Residuals', fontweight='bold')
for lag in [12, 24, 36]:
    axes[1, 0].axvline(x=lag, color='red', linestyle=':', alpha=0.3)

# Q-Q plot
(osm, osr), (slope, intercept, r) = stats.probplot(residuals, dist="norm")
axes[1, 1].scatter(osm, osr, color=COLORS['blue'], s=20, alpha=0.5)
axes[1, 1].plot(osm, slope*osm + intercept, color=COLORS['red'], linewidth=2)
axes[1, 1].set_title('Q-Q Plot', fontweight='bold')
axes[1, 1].set_xlabel('Theoretical Quantiles')
axes[1, 1].set_ylabel('Sample Quantiles')

plt.tight_layout()
plt.show()

In [None]:
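# Ljung-Box test (model_df=2 adjusts the degrees of freedom for the two
# estimated MA parameters of the airline model)
lb_test = acorr_ljungbox(residuals, lags=[12, 24, 36], model_df=2, return_df=True)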
print("Ljung-Box Test for Residual Autocorrelation:")
print("="*50)
print(lb_test)

all_pass = all(lb_test['lb_pvalue'] > 0.05)
print(f"\nConclusion: {'Residuals are white noise ✓' if all_pass else 'Model may be inadequate ✗'}")

## Exercise 7: Forecasting

### Task
Generate and evaluate forecasts for the airline passengers data.

In [None]:
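# Generate forecasts
forecast_steps = 24  # 2 years
forecast = airline_results.get_forecast(steps=forecast_steps)
# The model was fitted on a NumPy array, so the results may come back as plain
# arrays; np.asarray handles both the array and the pandas case
forecast_mean = np.asarray(forecast.predicted_mean)
forecast_ci = np.asarray(forecast.conf_int())

# Convert from log scale (wrapped in Series so that .iloc works below)
passengers_forecast = pd.Series(np.exp(forecast_mean))
ci_lower = pd.Series(np.exp(forecast_ci[:, 0]))
ci_upper = pd.Series(np.exp(forecast_ci[:, 1]))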

# Create forecast dates
forecast_dates = pd.date_range(start=airline.index[-1] + pd.DateOffset(months=1), 
                               periods=forecast_steps, freq='ME')

# Plot
fig, ax = plt.subplots(figsize=(14, 6))

ax.plot(airline.index, airline['Passengers'], color=COLORS['blue'], linewidth=1.5, label='Historical')
ax.plot(forecast_dates, passengers_forecast, color=COLORS['red'], linewidth=2, label='Forecast')
ax.fill_between(forecast_dates, ci_lower, ci_upper, color=COLORS['red'], alpha=0.2, label='95% CI')

ax.axvline(x=airline.index[-1], color='black', linestyle='-', alpha=0.3)
ax.set_title('Airline Passengers: 24-Month Forecast', fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('Passengers (thousands)')
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.1), ncol=3, frameon=False)

plt.tight_layout()
plt.show()

# Show forecast table
print("\nForecast Summary (next 12 months):")
print("="*60)
print(f"{'Date':<15} {'Forecast':>12} {'Lower 95%':>12} {'Upper 95%':>12}")
print("-"*60)
for i in range(12):
    print(f"{str(forecast_dates[i].date()):<15} {passengers_forecast.iloc[i]:>12.0f} "
          f"{ci_lower.iloc[i]:>12.0f} {ci_upper.iloc[i]:>12.0f}")
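
A note on the back-transformation: if the log-scale forecast errors are roughly normal, exponentiating the log forecast gives the median of the forecast distribution on the original scale, not its mean. The simulation below illustrates the gap with hypothetical values for the log-scale mean and standard error (`mu` and `sigma` are assumptions, not numbers from the fitted model). Whether to apply the mean adjustment depends on which summary the application needs.

```python
import numpy as np

mu, sigma = 6.0, 0.2    # hypothetical log-scale forecast mean and standard error
rng = np.random.default_rng(0)
draws = np.exp(rng.normal(mu, sigma, 200_000))   # simulated outcomes, original scale

naive = np.exp(mu)                      # exp of the log forecast: the median
adjusted = np.exp(mu + sigma**2 / 2)    # mean of a lognormal distribution

print(f"exp(mu)             = {naive:.1f}   simulated median = {np.median(draws):.1f}")
print(f"exp(mu + sigma^2/2) = {adjusted:.1f}   simulated mean   = {draws.mean():.1f}")
```

The confidence bounds are unaffected, because quantiles are preserved under the monotone `exp` transformation.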

## Exercise 8: Real Data - US Retail Sales

### Task
Apply SARIMA to real US retail sales data with seasonality.

In [None]:
# Load retail sales data from FRED
try:
    import pandas_datareader as pdr
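    retail = pdr.get_data_fred('RSXFSN', start='2010-01-01', end='2024-12-31')
    # RSXFSN is reported in millions of dollars; convert to billions to match the axis labels
    retail = retail.dropna() / 1000
    retail.columns = ['Sales']
    print(f"US Retail Sales: {len(retail)} monthly observations")
except Exception: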
    # Generate synthetic retail-like data
    np.random.seed(123)
    n = 180  # 15 years monthly
    t = np.arange(n)
    trend = 400 + 2 * t
    seasonal = 50 * np.sin(2 * np.pi * t / 12) + 30 * np.cos(4 * np.pi * t / 12)
    # December spike
    december = np.zeros(n)
    december[11::12] = 80
    noise = np.random.randn(n) * 15
    sales = trend + seasonal + december + noise
    retail = pd.DataFrame({'Sales': sales},
                         index=pd.date_range('2010-01', periods=n, freq='ME'))
    print(f"Simulated Retail Sales: {len(retail)} observations")

# Plot
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

axes[0].plot(retail.index, retail['Sales'], color=COLORS['blue'], linewidth=1)
axes[0].set_title('US Retail Sales', fontweight='bold')
axes[0].set_ylabel('Sales (Billions $)')

# Monthly pattern
monthly_avg = retail.groupby(retail.index.month).mean()
axes[1].bar(range(1, 13), monthly_avg['Sales'].values, color=COLORS['blue'], alpha=0.7)
axes[1].set_title('Average Sales by Month', fontweight='bold')
axes[1].set_xticks(range(1, 13))
axes[1].set_xticklabels(['J', 'F', 'M', 'A', 'M', 'J', 'J', 'A', 'S', 'O', 'N', 'D'])

plt.tight_layout()
plt.show()

In [None]:
# Split data and fit SARIMA
retail_values = retail['Sales'].values
train_end = len(retail_values) - 24  # Hold out last 2 years

train = retail_values[:train_end]
test = retail_values[train_end:]

# Log transform
log_train = np.log(train)
log_test = np.log(test)

print(f"Training: {len(train)} obs, Test: {len(test)} obs")

# Auto SARIMA
retail_auto = pm.auto_arima(
    log_train,
    start_p=0, start_q=0, max_p=2, max_q=2,
    d=1,
    start_P=0, start_Q=0, max_P=1, max_Q=1,
    D=1, m=12,
    seasonal=True,
    stepwise=True,
    suppress_warnings=True,
    trace=True
)

print(f"\nSelected: SARIMA{retail_auto.order}x{retail_auto.seasonal_order[:3]}[{retail_auto.seasonal_order[3]}]")

In [None]:
# Forecast and evaluate
fc_log, conf_log = retail_auto.predict(n_periods=len(test), return_conf_int=True)

# Convert back from log
fc = np.exp(fc_log)
conf_lower = np.exp(conf_log[:, 0])
conf_upper = np.exp(conf_log[:, 1])

# Plot
fig, ax = plt.subplots(figsize=(14, 6))

ax.plot(retail.index[:train_end][-48:], train[-48:], color=COLORS['blue'], linewidth=1.5, label='Training')
ax.plot(retail.index[train_end:], test, color=COLORS['orange'], linewidth=1.5, label='Actual')
ax.plot(retail.index[train_end:], fc, color=COLORS['red'], linewidth=2, linestyle='--', label='Forecast')
ax.fill_between(retail.index[train_end:], conf_lower, conf_upper, color=COLORS['red'], alpha=0.2)

ax.axvline(x=retail.index[train_end], color='black', linestyle='-', alpha=0.3)
ax.set_title('Retail Sales: Out-of-Sample Forecast', fontweight='bold')
ax.set_ylabel('Sales (Billions $)')
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.1), ncol=3, frameon=False)

plt.tight_layout()
plt.show()

# Accuracy metrics
mape = np.mean(np.abs((test - fc) / test)) * 100
rmse = np.sqrt(np.mean((test - fc)**2))

print(f"\nForecast Accuracy:")
print(f"  MAPE: {mape:.2f}%")
print(f"  RMSE: {rmse:.2f}")
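
A MAPE or RMSE value is easier to interpret next to a simple benchmark. A common choice for seasonal data is the seasonal naive forecast, which repeats the last observed cycle. The sketch below is a minimal illustration on synthetic data; the helper names `seasonal_naive` and `mape`, and the `train_demo`/`test_demo` split, are hypothetical and not part of the seminar code.

```python
import numpy as np

def seasonal_naive(train, horizon, period=12):
    """Repeat the last observed seasonal cycle for `horizon` steps."""
    last_cycle = np.asarray(train)[-period:]
    reps = int(np.ceil(horizon / period))
    return np.tile(last_cycle, reps)[:horizon]

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.mean(np.abs((actual - predicted) / actual)) * 100

# Synthetic trending, seasonal series (assumption: stands in for retail sales)
rng = np.random.default_rng(123)
t = np.arange(120)
series = 400 + 2 * t + 50 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 10, t.size)
train_demo, test_demo = series[:-24], series[-24:]

benchmark = seasonal_naive(train_demo, horizon=len(test_demo))
print(f"Seasonal naive MAPE: {mape(test_demo, benchmark):.2f}%")
```

Applied to the retail data, `seasonal_naive(train, len(test))` gives the number a SARIMA forecast should beat to justify its extra complexity.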

## Exercise 9: Practice Questions

Answer the following questions based on your analysis:

1. **Why do we use log transformation before fitting SARIMA to airline data?**

2. **What is the difference between seasonal differencing $(1-L^{12})$ and regular differencing $(1-L)$?**

3. **The airline model has only 2 parameters. Why is it so effective?**

4. **How would you modify the SARIMA model for quarterly data?**

5. **Why do seasonal forecast intervals grow more rapidly than non-seasonal ones?**

## Summary

### What We Practiced

1. **Identifying Seasonality**: Visual analysis and seasonal decomposition
2. **Seasonal Differencing**: $(1-L^s)$ operator to remove seasonal patterns
3. **ACF/PACF Analysis**: Pattern recognition at seasonal lags
4. **SARIMA Modeling**: Fitting and comparing different specifications
5. **Diagnostics**: Residual analysis for seasonal models
6. **Forecasting**: Generating seasonal forecasts with confidence intervals

### Key Takeaways

- Multiplicative seasonality → use log transformation
- Often need both d=1 and D=1 for stationarity
- The airline model SARIMA(0,1,1)(0,1,1)[12] is remarkably robust
- ACF/PACF show patterns at seasonal lags (12, 24, 36, ...)
- Always validate with residual diagnostics before forecasting