# BSAD 8310: Business Forecasting
## Lab 01 — Introduction to Forecasting
**University of Nebraska at Omaha**

---

### Learning Objectives

By the end of this lab, you will be able to:

1. Load and visualize a real-world business time series
2. Decompose a series into trend, seasonal, and residual components
3. Implement four benchmark forecasting models in Python
4. Evaluate forecast accuracy using RMSE, MAE, and MAPE
5. Produce a publication-ready comparison plot

### Dataset

**US Advance Monthly Retail Trade Survey — Total Retail Sales (RSXFS)**  
Source: Federal Reserve Bank of St. Louis (FRED)  
Frequency: Monthly  
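Units: Millions of dollars, not seasonally adjusted  

We use the **not seasonally adjusted** series so that seasonality is visible. On FRED, the not seasonally adjusted counterpart of `RSXFS` is listed under the ID `RSXFSN`, so confirm on the series page that the ID you load matches this description.  
If FRED access is unavailable, the notebook falls back to the AirPassengers dataset automatically. The fallback is itself downloaded (through `statsmodels`), so it also needs a network connection.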

### Notation (matches Lecture 1 slides)

| Symbol | Python variable | Meaning |
|--------|----------------|---------|
| $y_t$ | `y` | Time series observation |
| $\hat{y}_{t+h\|t}$ | `y_hat` | Point forecast |
| $e_t$ | `e` | Forecast error = $y_t - \hat{y}_{t\|t-1}$ |
| $m$ | `m` | Seasonal period (12 for monthly) |

---
## Section 1: Setup and Imports

In [None]:
# Standard imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
import warnings
from pathlib import Path

# Statsmodels
from statsmodels.tsa.seasonal import STL

# Reproducibility
np.random.seed(42)

# Silence library warnings to keep notebook output readable
warnings.filterwarnings('ignore')

# ---- UNO Maverick color palette ----
UNO_BLUE  = '#005CA9'
UNO_RED   = '#E41C38'
GRAY      = '#525252'
GREEN     = '#15803d'
LIGHT_BLUE = '#E8F0FA'

# ---- Matplotlib defaults ----
mpl.rcParams.update({
    'font.family': 'sans-serif',
    'axes.titlesize': 14,
    'axes.titleweight': 'bold',
    'axes.titlecolor': UNO_BLUE,
    'axes.labelsize': 12,
    'axes.spines.top': False,
    'axes.spines.right': False,
    'legend.frameon': False,
    'figure.dpi': 150,
    'lines.linewidth': 1.8,
})

# ---- Paths ----
ROOT = Path('.').resolve().parent  # repo root, assuming the notebook runs from scripts/
FIG_DIR = ROOT / 'Figures'
FIG_DIR.mkdir(parents=True, exist_ok=True)

print('Setup complete.')
print(f'Figures will be saved to: {FIG_DIR}')

---
## Section 2: Load and Explore the Data

We attempt to load US Retail Sales (RSXFS) from FRED.  
If unavailable, we fall back to the classic **AirPassengers** dataset
(monthly international airline passenger totals, 1949 to 1960),
which has the same structural features (trend + strong seasonality).
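
Whichever source loads, the rest of the lab assumes a regular monthly index with no gaps. The sketch below shows one way to check that on a synthetic series. The helper name `check_monthly_index` is hypothetical and is not part of the lab code.

```python
import numpy as np
import pandas as pd


def check_monthly_index(series: pd.Series) -> bool:
    """Return True if the index is a gap-free month-start DatetimeIndex."""
    idx = series.index
    if not isinstance(idx, pd.DatetimeIndex) or len(idx) < 3:
        return False
    # Build the full month-start range between the first and last dates;
    # a gap-free monthly series must match it exactly.
    expected = pd.date_range(idx[0], idx[-1], freq='MS')
    return len(idx) == len(expected) and bool((idx == expected).all())


# Synthetic example: 36 month-start observations
demo = pd.Series(np.arange(36, dtype=float),
                 index=pd.date_range('2020-01-01', periods=36, freq='MS'))
gapped = demo.drop(demo.index[5])   # same series with one month removed

print(check_monthly_index(demo))
print(check_monthly_index(gapped))
```

A missing month matters here because the seasonal benchmarks index observations by position, so a gap would silently shift the seasonal pattern.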

In [None]:
# ---- Load data: FRED RSXFS or fallback ----
m = 12  # seasonal period (monthly data)

try:
    import pandas_datareader.data as web
    import datetime

    start = datetime.datetime(2000, 1, 1)
    end   = datetime.datetime(2023, 12, 1)

    raw = web.DataReader('RSXFS', 'fred', start, end)
    y = raw['RSXFS'].dropna()
    series_name = 'US Retail Sales (RSXFS, millions USD)'
    print(f'Loaded FRED data: {len(y)} monthly observations.')

except Exception as exc:
    # Fallback: AirPassengers via statsmodels (downloaded from Rdatasets; needs network access)
    import statsmodels.api as sm
    air = sm.datasets.get_rdataset('AirPassengers', 'datasets').data
    # Convert to monthly DatetimeIndex
    y = pd.Series(
        air['value'].values,
        index=pd.date_range('1949-01', periods=len(air), freq='MS'),
        name='Passengers'
    )
    series_name = 'Monthly Airline Passengers (thousands)'
    print(f'FRED unavailable ({exc.__class__.__name__}). '
          f'Using AirPassengers fallback: {len(y)} observations.')

# Summary
print(f'\nSeries: {series_name}')
print(f'Start:  {y.index[0].strftime("%Y-%m")}')
print(f'End:    {y.index[-1].strftime("%Y-%m")}')
print(f'Mean:   {y.mean():.1f}  |  Std: {y.std():.1f}')
print(f'Min:    {y.min():.1f}  |  Max: {y.max():.1f}')

In [None]:
# ---- Plot the raw time series ----
fig, ax = plt.subplots(figsize=(12, 4))

ax.plot(y.index, y.values, color=UNO_BLUE, linewidth=1.5, label=series_name)
ax.set_title('Raw Time Series', fontsize=14, fontweight='bold', color=UNO_BLUE)
ax.set_xlabel('Date')
ax.set_ylabel('Value')
ax.legend()

plt.tight_layout()
fig.savefig(FIG_DIR / 'lecture01_raw_series.png', dpi=150, bbox_inches='tight',
            facecolor='white')
plt.show()
print('Figure saved: lecture01_raw_series.png')

---
## Section 3: STL Decomposition

**STL** (Seasonal-Trend decomposition using LOESS) decomposes $y_t$ into:
$$y_t = T_t + S_t + R_t$$
where $T_t$ is the trend, $S_t$ is the seasonal component, and $R_t$ is the remainder.

This gives us a visual intuition for what a good forecast model must capture.
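
To make the additive identity concrete, the sketch below builds a synthetic series from a known trend and seasonal pattern, then recovers them with a classical moving-average decomposition. This is a deliberately simpler stand-in for STL (no LOESS smoothing, no robustness weights), and all variable names are illustrative.

```python
import numpy as np

m = 12                      # seasonal period
n = 8 * m                   # eight years of monthly data
t = np.arange(n)

# Known components of a synthetic series
trend_true = 100 + 0.5 * t
season_true = 10 * np.sin(2 * np.pi * t / m)
rng = np.random.default_rng(0)
y = trend_true + season_true + rng.normal(0, 1, n)

# Trend: centered 2x12 moving average (half weight on the two end points)
w = np.r_[0.5, np.ones(m - 1), 0.5] / m
trend = np.full(n, np.nan)              # undefined for the first and last 6 months
trend[m // 2: n - m // 2] = np.convolve(y, w, mode='valid')

# Seasonal: average detrended value per calendar position, centered at zero
detrended = y - trend
season_means = np.array([np.nanmean(detrended[k::m]) for k in range(m)])
season_means -= season_means.mean()
seasonal = np.tile(season_means, n // m)

# Remainder is whatever is left, so y = T + S + R holds by construction
resid = y - trend - seasonal

valid = ~np.isnan(trend)
print(np.max(np.abs(y[valid] - (trend + seasonal + resid)[valid])))
```

The remainder is defined as the leftover after removing trend and seasonality, so the identity holds exactly wherever the trend is defined. The methods differ in how they estimate $T_t$ and $S_t$.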

In [None]:
# ---- STL decomposition ----
stl = STL(y, period=m, robust=True)
result = stl.fit()

fig, axes = plt.subplots(4, 1, figsize=(12, 10), sharex=True)

components = [
    (y.values,             'Observed $y_t$',    UNO_BLUE),
    (result.trend,         'Trend $T_t$',        UNO_BLUE),
    (result.seasonal,      'Seasonal $S_t$',     UNO_RED),
    (result.resid,         'Remainder $R_t$',    GRAY),
]

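for ax, (data, label, color) in zip(axes, components):
    ax.plot(y.index, data, color=color, linewidth=1.5)
    ax.set_ylabel(label, fontsize=10)

# Zero reference line only where zero is meaningful (seasonal and remainder);
# on the observed and trend panels it would stretch the y-axis down to 0.
for ax in axes[2:]:
    ax.axhline(0, color=GRAY, linewidth=0.5, linestyle='--')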

axes[0].set_title('STL Decomposition', fontsize=14, fontweight='bold', color=UNO_BLUE)
axes[-1].set_xlabel('Date')

plt.tight_layout()
fig.savefig(FIG_DIR / 'lecture01_stl_decomposition.png', dpi=150,
            bbox_inches='tight', facecolor='white')
plt.show()
print('Figure saved: lecture01_stl_decomposition.png')

---
## Section 4: Train / Test Split

We hold out the **last 24 months** as a test set.  
All models are trained on the remaining history.

> **Key rule:** Time order is never violated. We never randomly shuffle.
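
A minimal sketch of this rule on a toy series: the split is a slice by position, so every training date precedes every test date. The function name `chronological_split` is hypothetical.

```python
import pandas as pd


def chronological_split(series: pd.Series, horizon: int):
    """Split a time-ordered series into (train, test); test is the last `horizon` points."""
    if not 0 < horizon < len(series):
        raise ValueError('horizon must be between 1 and len(series) - 1')
    if not series.index.is_monotonic_increasing:
        raise ValueError('series must be sorted by time before splitting')
    return series.iloc[:-horizon], series.iloc[-horizon:]


toy = pd.Series(range(10),
                index=pd.date_range('2022-01-01', periods=10, freq='MS'))
train, test = chronological_split(toy, horizon=3)

# Every training date is strictly earlier than every test date
print(train.index.max() < test.index.min())
```

A random shuffle would let the model see observations dated after the ones it is asked to predict, which overstates accuracy.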

In [None]:
# ---- Train/test split ----
H = 24  # forecast horizon (months)

y_train = y.iloc[:-H]
y_test  = y.iloc[-H:]

print(f'Training set: {y_train.index[0].strftime("%Y-%m")} to '
      f'{y_train.index[-1].strftime("%Y-%m")}  ({len(y_train)} obs)')
print(f'Test set:     {y_test.index[0].strftime("%Y-%m")} to '
      f'{y_test.index[-1].strftime("%Y-%m")}  ({len(y_test)} obs, H={H})')

---
## Section 5: Implement Benchmark Forecasts

We implement the four benchmarks from Lecture 1:

| Model | Formula | Python |
|-------|---------|--------|
| Naïve | $\hat{y}_{t+h\|t} = y_t$ | Last train value |
| Seasonal Naïve | $\hat{y}_{t+h\|t} = y_{t+h-m(k+1)}$, $k = \lfloor (h-1)/m \rfloor$ | Repeat last 12 train values |
| Historical Mean | $\hat{y}_{t+h\|t} = \bar{y}$ | `y_train.mean()` |
| RW with Drift | $\hat{y}_{t+h\|t} = y_t + h\hat{c}$ | Linear extrapolation |
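
Before applying these to the real series, it can help to check the formulas by hand on a toy example. The sketch below uses made-up numbers with a seasonal period of 4 and a horizon of 6, so the seasonal naïve forecast has to wrap around once the horizon exceeds one season. The values and variable names are illustrative only.

```python
import numpy as np

y_train = np.array([10, 20, 30, 40, 14, 24, 34, 44], dtype=float)
m, H = 4, 6                 # toy values: period of 4, six-step horizon
h = np.arange(1, H + 1)     # horizons 1..H

# Naive: repeat the last observed value
naive = np.full(H, y_train[-1])

# Seasonal naive: index into the last full season, wrapping when h > m
snaive = y_train[-m:][(h - 1) % m]

# Historical mean: average of all training values
mean = np.full(H, y_train.mean())

# Drift: average one-step change, which equals (last - first) / (T - 1)
c_hat = (y_train[-1] - y_train[0]) / (len(y_train) - 1)
drift = y_train[-1] + h * c_hat

print('naive :', naive)
print('snaive:', snaive)
print('mean  :', mean)
print('drift :', np.round(drift, 2))
```

Here the seasonal naïve forecast is `[14, 24, 34, 44, 14, 24]`: the last four training values, then the first two again. The mean forecast is 27 and the drift slope is 34/7 per period.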

In [None]:
# ---- Benchmark 1: Naïve forecast ----
last_value = y_train.iloc[-1]
y_hat_naive = pd.Series(
    np.full(H, last_value),
    index=y_test.index
)

# ---- Benchmark 2: Seasonal Naïve ----
# Use the last m observations from train as the seasonal pattern
last_season = y_train.iloc[-m:].values   # shape (12,)
# Tile to cover H periods
reps = int(np.ceil(H / m))
seasonal_pattern = np.tile(last_season, reps)[:H]
y_hat_snaive = pd.Series(seasonal_pattern, index=y_test.index)

# ---- Benchmark 3: Historical Mean ----
mean_value = y_train.mean()
y_hat_mean = pd.Series(
    np.full(H, mean_value),
    index=y_test.index
)

# ---- Benchmark 4: Random Walk with Drift ----
# Drift = average period-to-period change over training set
drift = (y_train.iloc[-1] - y_train.iloc[0]) / (len(y_train) - 1)
y_hat_drift = pd.Series(
    [last_value + (h + 1) * drift for h in range(H)],
    index=y_test.index
)

print('Benchmark forecasts computed.')
print(f'  Naïve:          {y_hat_naive.iloc[0]:.1f} (flat)')
print(f'  Seasonal Naïve: {y_hat_snaive.iloc[:3].values}  ... (seasonal pattern)')
print(f'  Mean:           {y_hat_mean.iloc[0]:.1f} (flat)')
print(f'  Drift:          {y_hat_drift.iloc[0]:.1f} → {y_hat_drift.iloc[-1]:.1f}')

---
## Section 6: Evaluate Forecast Accuracy

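Compute RMSE, MAE, and MAPE for each benchmark on the test set. Here $T$ is the last training period, $H$ is the number of test observations, and $e_{T+h} = y_{T+h} - \hat{y}_{T+h|T}$ is the $h$-step-ahead forecast error.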

$$
\text{RMSE} = \sqrt{\frac{1}{H}\sum_{h=1}^{H}(y_{T+h} - \hat{y}_{T+h|T})^2}
\qquad
\text{MAE} = \frac{1}{H}\sum_{h=1}^{H}|y_{T+h} - \hat{y}_{T+h|T}|
\qquad
\text{MAPE} = \frac{1}{H}\sum_{h=1}^{H}\left|\frac{e_{T+h}}{y_{T+h}}\right| \times 100
$$
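
A small worked example with three made-up observations shows how the three metrics come out of the same errors. The numbers are chosen so the arithmetic can be checked by hand.

```python
import numpy as np

actual = np.array([100.0, 200.0, 400.0])
forecast = np.array([110.0, 180.0, 400.0])

e = actual - forecast                      # errors: [-10, 20, 0]
rmse = np.sqrt(np.mean(e ** 2))            # sqrt((100 + 400 + 0) / 3)
mae = np.mean(np.abs(e))                   # (10 + 20 + 0) / 3
mape = np.mean(np.abs(e / actual)) * 100   # mean(0.10, 0.10, 0.00) * 100

print(f'RMSE={rmse:.2f}  MAE={mae:.2f}  MAPE={mape:.2f}%')
# prints: RMSE=12.91  MAE=10.00  MAPE=6.67%
```

RMSE exceeds MAE here because squaring gives the larger error (20) more weight. The errors of 10 and 20 contribute the same 10% to MAPE because each is scaled by its own actual value. MAPE is undefined when any actual value is zero.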

In [None]:
def compute_metrics(y_actual: pd.Series, y_forecast: pd.Series) -> dict:
    """Compute RMSE, MAE, and MAPE for point forecasts.

    Parameters
    ----------
    y_actual : pd.Series
        Realized values (the test set).
    y_forecast : pd.Series
        Point forecasts aligned to y_actual.

    Returns
    -------
    dict with keys 'RMSE', 'MAE', 'MAPE'.
    """
    e = y_actual.values - y_forecast.values  # forecast errors e_t = y_t - y_hat
    rmse = np.sqrt(np.mean(e ** 2))
    mae  = np.mean(np.abs(e))
    mape = np.mean(np.abs(e / y_actual.values)) * 100
    return {'RMSE': rmse, 'MAE': mae, 'MAPE': mape}


# ---- Compute metrics for all four benchmarks ----
benchmarks = {
    'Naïve':           y_hat_naive,
    'Seasonal Naïve':  y_hat_snaive,
    'Historical Mean': y_hat_mean,
    'RW + Drift':      y_hat_drift,
}

results = {}
for name, y_hat in benchmarks.items():
    results[name] = compute_metrics(y_test, y_hat)

# ---- Display as a formatted table ----
results_df = pd.DataFrame(results).T
results_df = results_df[['RMSE', 'MAE', 'MAPE']].round(2)
results_df['MAPE'] = results_df['MAPE'].apply(lambda x: f'{x:.2f}%')

print(f'\n=== Forecast Accuracy: {H}-Month Out-of-Sample ===')
print(results_df.to_string())
print('\nLower is better for all metrics.')

---
## Section 7: Visualization — Actual vs. Forecasts

A publication-ready comparison plot using the UNO color palette.

In [None]:
# ---- Publication-ready forecast comparison plot ----
fig, ax = plt.subplots(figsize=(14, 5))

# Show last 36 months of training + all test
n_context = 36
y_context = y_train.iloc[-n_context:]

# Plot training context
ax.plot(y_context.index, y_context.values,
        color=GRAY, linewidth=1.5, label='History (last 3 years)', alpha=0.7)

# Plot actual test values
ax.plot(y_test.index, y_test.values,
        color='black', linewidth=2.0, label='Actual (test set)', zorder=5)

# Plot each benchmark
colors = [UNO_BLUE, UNO_RED, GREEN, '#FF8C00']
linestyles = ['--', '--', ':', '-.']
for (name, y_hat), color, ls in zip(benchmarks.items(), colors, linestyles):
    ax.plot(y_hat.index, y_hat.values,
            color=color, linewidth=1.5, linestyle=ls, label=name)

# Vertical line at the forecast origin (last training observation, time T)
ax.axvline(y_train.index[-1], color=UNO_RED, linewidth=1.0,
           linestyle=':', alpha=0.6, label='Forecast origin')

# Shading for test period
ax.axvspan(y_test.index[0], y_test.index[-1],
           alpha=0.05, color=UNO_RED)

ax.set_title('Benchmark Forecasts vs. Actual', fontsize=14,
             fontweight='bold', color=UNO_BLUE)
ax.set_xlabel('Date')
ax.set_ylabel('Value')
ax.legend(loc='upper left', fontsize=9, ncol=2)

plt.tight_layout()
fig.savefig(FIG_DIR / 'lecture01_benchmark_forecasts.png',
            dpi=150, bbox_inches='tight', facecolor='white')
plt.show()
print('Figure saved: lecture01_benchmark_forecasts.png')

In [None]:
# ---- Bar chart: RMSE comparison ----
rmse_values = {name: compute_metrics(y_test, y_hat)['RMSE']
               for name, y_hat in benchmarks.items()}

fig, ax = plt.subplots(figsize=(8, 4))

bar_colors = [UNO_BLUE if v == min(rmse_values.values()) else GRAY
              for v in rmse_values.values()]

bars = ax.bar(rmse_values.keys(), rmse_values.values(),
              color=bar_colors, edgecolor='white', linewidth=0.5)

# Label each bar
for bar, val in zip(bars, rmse_values.values()):
    ax.text(bar.get_x() + bar.get_width() / 2,
            bar.get_height() + max(rmse_values.values()) * 0.01,
            f'{val:.0f}', ha='center', va='bottom', fontsize=10, color=GRAY)

ax.set_title('RMSE by Benchmark Model (lower is better)',
             fontsize=14, fontweight='bold', color=UNO_BLUE)
ax.set_ylabel('RMSE')
ax.set_ylim(0, max(rmse_values.values()) * 1.15)

plt.tight_layout()
fig.savefig(FIG_DIR / 'lecture01_rmse_comparison.png',
            dpi=150, bbox_inches='tight', facecolor='white')
plt.show()
print('Figure saved: lecture01_rmse_comparison.png')

---
## Discussion Questions

Answer these questions based on your results above.

1. Which benchmark performed best on RMSE? On MAPE? Are they the same model?

2. The Seasonal Naïve model does not account for trend. How does this limitation show up visually in the forecast plot?

3. The Historical Mean benchmark uses all available training data equally. Is this always a good idea for a series with trend?

4. Look at the RMSE bar chart. By what percentage does the best benchmark's RMSE improve on the worst? Is this a large or small difference in practical terms?

5. **Challenge:** The benchmark principle says a useful model must beat the best benchmark. Based on what you see in the decomposition plot, what features would a better model need to capture?

---
## Summary

In this lab we:

- Loaded and explored a real business time series with trend and seasonality
- Used STL to decompose the series into interpretable components
- Implemented four standard benchmarks: Naïve, Seasonal Naïve, Mean, RW+Drift
- Evaluated accuracy using RMSE, MAE, and MAPE on a held-out test set
- Produced publication-ready visualizations

**Next lecture:** Regression-Based Forecasting — using predictor variables to improve on these benchmarks.