[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/danpele/Time-Series-Analysis/blob/main/chapter2_lecture_notebook.ipynb)

---

# Chapter 2: ARMA Models

**Course:** Time Series Analysis and Forecasting  
**Program:** Master in Statistics and Data Science  
**Academic Year:** 2025-2026

---

## Learning Objectives

By the end of this notebook, you will be able to:
1. Understand and simulate AR, MA, and ARMA processes
2. Identify model orders using ACF and PACF
3. Estimate ARMA parameters using Maximum Likelihood
4. Perform model diagnostics (residual analysis, Ljung-Box test)
5. Generate forecasts with confidence intervals
6. Compare models using information criteria (AIC, BIC)

## Setup and Imports

In [None]:
# Core libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Time series specific
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import adfuller, kpss, acf, pacf
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.stats.diagnostic import acorr_ljungbox
from scipy import stats

# Plotting style - clean, professional
plt.rcParams['figure.figsize'] = (12, 5)
plt.rcParams['font.size'] = 11
plt.rcParams['axes.facecolor'] = 'none'
plt.rcParams['figure.facecolor'] = 'none'
plt.rcParams['savefig.facecolor'] = 'none'
plt.rcParams['axes.grid'] = False
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False

# Colors (IDA color scheme)
COLORS = {
    'blue': '#1A3A6E',
    'red': '#DC3545',
    'green': '#2E7D32',
    'orange': '#E67E22',
    'gray': '#666666'
}

print("All libraries loaded successfully!")

## 1. The Lag Operator

The **lag operator** (or backshift operator) $L$ shifts a time series back by one period:

$$L X_t = X_{t-1}$$

**Properties:**
- $L^k X_t = X_{t-k}$
- $(1-L)X_t = X_t - X_{t-1} = \Delta X_t$ (first difference)

In [None]:
# Demonstrate lag operator
np.random.seed(42)
X = pd.Series([10, 12, 15, 14, 18, 20, 19, 22], name='X_t')

# Create lagged versions
df = pd.DataFrame({
    'X_t': X,
    'L(X_t) = X_{t-1}': X.shift(1),
    'L^2(X_t) = X_{t-2}': X.shift(2),
    '(1-L)X_t = ΔX_t': X.diff(1)
})

print("Lag Operator Examples:")
print(df.to_string())

## 2. Autoregressive (AR) Models

### AR(1) Model

$$X_t = c + \phi X_{t-1} + \varepsilon_t$$

where $\varepsilon_t \sim WN(0, \sigma^2)$ is white noise with mean 0 and variance $\sigma^2$.

**Stationarity condition:** $|\phi| < 1$

**Properties** (valid when $|\phi| < 1$):
- Mean: $\mu = \frac{c}{1-\phi}$
- Variance: $\gamma(0) = \frac{\sigma^2}{1-\phi^2}$
- ACF: $\rho(h) = \phi^h$ (exponential decay)
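
These formulas can be checked numerically. The sketch below simulates one long AR(1) path and compares sample moments with the theoretical ones. The values `c = 2`, `phi = 0.6`, `sigma = 1`, the seed, and the helper `sample_acf` are assumptions chosen for this illustration; they are not used elsewhere in the notebook.

```python
import numpy as np

# Illustrative parameters (assumed for this example only)
c, phi, sigma = 2.0, 0.6, 1.0
n, burn_in = 50_000, 500

rng = np.random.default_rng(0)
eps = rng.normal(0.0, sigma, size=n + burn_in)

# Simulate X_t = c + phi * X_{t-1} + eps_t, starting at the theoretical mean
x = np.empty(n + burn_in)
x[0] = c / (1 - phi)
for t in range(1, n + burn_in):
    x[t] = c + phi * x[t - 1] + eps[t]
x = x[burn_in:]  # drop the burn-in so the start value has no influence

# Theoretical moments of a stationary AR(1)
mean_theory = c / (1 - phi)
var_theory = sigma**2 / (1 - phi**2)

def sample_acf(series, h):
    """Sample autocorrelation at lag h (h >= 1)."""
    d = series - series.mean()
    return np.sum(d[h:] * d[:-h]) / np.sum(d * d)

print(f"Mean:     theory {mean_theory:.3f}, sample {x.mean():.3f}")
print(f"Variance: theory {var_theory:.3f}, sample {x.var():.3f}")
for h in (1, 2, 3):
    print(f"rho({h}):   theory {phi**h:.3f}, sample {sample_acf(x, h):.3f}")
```

With a path this long, the sample values should land close to the theoretical ones; shorter paths will show more sampling noise.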

In [None]:
# Simulate AR(1) processes with different phi values
np.random.seed(42)
n = 300
phi_values = [0.9, 0.5, -0.7]

fig, axes = plt.subplots(len(phi_values), 3, figsize=(15, 10))

for i, phi in enumerate(phi_values):
    # Simulate AR(1)
    ar1 = np.zeros(n)
    for t in range(1, n):
        ar1[t] = phi * ar1[t-1] + np.random.randn()
    
    # Time series plot
    axes[i, 0].plot(ar1, color=COLORS['blue'], linewidth=0.8, label=f'AR(1), φ={phi}')
    axes[i, 0].axhline(y=0, color='red', linestyle='--', alpha=0.5)
    axes[i, 0].set_title(f'AR(1) with φ = {phi}', fontweight='bold')
    axes[i, 0].set_xlabel('Time')
    axes[i, 0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), frameon=False)
    
    # ACF
    plot_acf(ar1, ax=axes[i, 1], lags=20, color=COLORS['blue'])
    axes[i, 1].set_title('ACF (Decays)', fontweight='bold')
    
    # PACF
    plot_pacf(ar1, ax=axes[i, 2], lags=20, color=COLORS['blue'], method='ywm')
    axes[i, 2].set_title('PACF (Cuts off at lag 1)', fontweight='bold')

plt.tight_layout()
plt.show()

print("\nKey Pattern: AR(1) has exponentially decaying ACF and PACF that cuts off after lag 1")

### AR(2) Model

$$X_t = c + \phi_1 X_{t-1} + \phi_2 X_{t-2} + \varepsilon_t$$

**Stationarity conditions:**
1. $\phi_1 + \phi_2 < 1$
2. $\phi_2 - \phi_1 < 1$
3. $|\phi_2| < 1$

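AR(2) can exhibit **pseudo-cyclical behavior** when the roots of the characteristic polynomial $1 - \phi_1 z - \phi_2 z^2$ are complex, that is, when $\phi_1^2 + 4\phi_2 < 0$.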
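
The stationarity conditions are equivalent to requiring that both roots of $1 - \phi_1 z - \phi_2 z^2$ lie outside the unit circle. A minimal sketch of that check, using the coefficients from this section's simulation ($\phi_1 = 1.0$, $\phi_2 = -0.5$); the variable names are illustrative:

```python
import numpy as np

phi1, phi2 = 1.0, -0.5  # coefficients used in this section's AR(2) simulation

# Characteristic polynomial: 1 - phi1*z - phi2*z^2
# np.roots expects coefficients ordered from the highest power down.
roots = np.roots([-phi2, -phi1, 1.0])
moduli = np.abs(roots)

is_stationary = bool(np.all(moduli > 1))    # all roots outside the unit circle
has_complex_roots = phi1**2 + 4 * phi2 < 0  # negative discriminant

print("Roots:", roots)
print("Moduli:", moduli)
print("Stationary:", is_stationary)

if has_complex_roots:
    # The angle of the complex roots sets the frequency of the damped oscillation
    omega = np.abs(np.angle(roots[0]))
    period = 2 * np.pi / omega
    print(f"Pseudo-cycle period: about {period:.1f} time steps")
```

For these coefficients the roots are $1 \pm i$ with modulus $\sqrt{2}$, so the process is stationary and the ACF oscillates with a period of roughly 8 time steps.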

In [None]:
# Simulate AR(2) with complex roots (pseudo-cycles)
np.random.seed(42)
n = 300
phi1, phi2 = 1.0, -0.5  # Complex roots -> oscillations

ar2 = np.zeros(n)
for t in range(2, n):
    ar2[t] = phi1 * ar2[t-1] + phi2 * ar2[t-2] + np.random.randn()

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Time series
axes[0].plot(ar2, color=COLORS['blue'], linewidth=0.8, label='AR(2)')
axes[0].set_title(f'AR(2): φ₁={phi1}, φ₂={phi2}', fontweight='bold')
axes[0].set_xlabel('Time')
axes[0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.15), frameon=False)

# ACF
plot_acf(ar2, ax=axes[1], lags=25, color=COLORS['blue'])
axes[1].set_title('ACF (Damped oscillations)', fontweight='bold')

# PACF
plot_pacf(ar2, ax=axes[2], lags=25, color=COLORS['blue'], method='ywm')
axes[2].set_title('PACF (Cuts off at lag 2)', fontweight='bold')

plt.tight_layout()
plt.show()

# Check stationarity conditions
print(f"\nStationarity check for AR(2) with φ₁={phi1}, φ₂={phi2}:")
print(f"  φ₁ + φ₂ = {phi1 + phi2} < 1? {phi1 + phi2 < 1}")
print(f"  φ₂ - φ₁ = {phi2 - phi1} < 1? {phi2 - phi1 < 1}")
print(f"  |φ₂| = {abs(phi2)} < 1? {abs(phi2) < 1}")

## 3. Moving Average (MA) Models

### MA(1) Model

$$X_t = \mu + \varepsilon_t + \theta \varepsilon_{t-1}$$

**Properties:**
- Always stationary (for finite $\theta$)
- Invertible if $|\theta| < 1$
- ACF: $\rho(1) = \frac{\theta}{1+\theta^2}$, $\rho(h) = 0$ for $h > 1$
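
The invertibility condition matters for identification: $\theta$ and $1/\theta$ produce exactly the same $\rho(1)$, so the ACF alone cannot tell them apart, and restricting to $|\theta| < 1$ picks one representative. A small sketch of this (the helper `ma1_rho1` and the grid are assumptions made for illustration):

```python
import numpy as np

def ma1_rho1(theta):
    """Theoretical lag-1 autocorrelation of an MA(1) process."""
    return theta / (1 + theta**2)

# theta and 1/theta give the same rho(1)
for theta in (0.5, 0.8, -0.8):
    mirrored = 1 / theta
    print(f"theta = {theta:5.2f}: rho(1) = {ma1_rho1(theta):.4f} | "
          f"theta = {mirrored:5.2f}: rho(1) = {ma1_rho1(mirrored):.4f}")

# |rho(1)| never exceeds 0.5; the bound is reached at theta = +/-1
grid = np.linspace(-5, 5, 1001)
max_abs_rho1 = np.max(np.abs(ma1_rho1(grid)))
print(f"max |rho(1)| on the grid: {max_abs_rho1:.4f}")
```

A practical consequence: a sample lag-1 autocorrelation well above 0.5 in absolute value is hard to reconcile with a pure MA(1) model.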

In [None]:
# Simulate MA(1) processes
np.random.seed(42)
n = 300
theta_values = [0.8, -0.8]

fig, axes = plt.subplots(len(theta_values), 3, figsize=(15, 7))

for i, theta in enumerate(theta_values):
    # Simulate MA(1): X_t = eps_t + theta * eps_{t-1}
    # Draw one extra shock so the first observation also has a lagged shock
    eps = np.random.randn(n + 1)
    ma1 = eps[1:] + theta * eps[:-1]
    
    # Time series plot
    axes[i, 0].plot(ma1, color=COLORS['green'], linewidth=0.8, label=f'MA(1), θ={theta}')
    axes[i, 0].axhline(y=0, color='red', linestyle='--', alpha=0.5)
    axes[i, 0].set_title(f'MA(1) with θ = {theta}', fontweight='bold')
    axes[i, 0].set_xlabel('Time')
    axes[i, 0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.15), frameon=False)
    
    # ACF
    plot_acf(ma1, ax=axes[i, 1], lags=20, color=COLORS['green'])
    axes[i, 1].set_title('ACF (Cuts off at lag 1)', fontweight='bold')
    
    # PACF
    plot_pacf(ma1, ax=axes[i, 2], lags=20, color=COLORS['green'], method='ywm')
    axes[i, 2].set_title('PACF (Decays)', fontweight='bold')

plt.tight_layout()
plt.show()

# Theoretical ACF for MA(1)
for theta in theta_values:
    rho1 = theta / (1 + theta**2)
    print(f"MA(1) θ={theta}: Theoretical ρ(1) = {rho1:.4f}")

## 4. ARMA Models

### ARMA(p,q) Model

$$X_t = c + \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + \varepsilon_t + \theta_1\varepsilon_{t-1} + \cdots + \theta_q\varepsilon_{t-q}$$

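**Compact form:** $\phi(L)X_t = c + \theta(L)\varepsilon_t$, where $\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$ and $\theta(L) = 1 + \theta_1 L + \cdots + \theta_q L^q$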

**Key pattern:** Both ACF and PACF decay (neither cuts off cleanly)
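
One way to see why the ACF decays rather than cutting off is to divide the compact form by $\phi(L)$, which writes a stationary ARMA process as an infinite weighted sum of past shocks, $X_t = \mu + \sum_j \psi_j \varepsilon_{t-j}$. The sketch below does this for ARMA(1,1) and derives the ACF from the weights. The function names and the truncation at 200 weights are assumptions made for this illustration.

```python
import numpy as np

def arma11_psi_weights(phi, theta, n_weights=200):
    """Weights psi_j of (1 + theta*L) / (1 - phi*L), truncated at n_weights."""
    psi = np.empty(n_weights)
    psi[0] = 1.0
    psi[1] = phi + theta
    for j in range(2, n_weights):
        psi[j] = phi * psi[j - 1]
    return psi

def acf_from_psi(psi, max_lag):
    """rho(h) = sum_j psi_j * psi_{j+h} / sum_j psi_j^2 (truncated sums)."""
    gamma0 = np.sum(psi * psi)
    return np.array([np.sum(psi[: len(psi) - h] * psi[h:]) / gamma0
                     for h in range(max_lag + 1)])

phi, theta = 0.7, 0.4
rho = acf_from_psi(arma11_psi_weights(phi, theta), max_lag=5)

# Closed-form lag-1 autocorrelation of ARMA(1,1) for comparison
rho1_closed = (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta**2)

print("rho(0..5):", np.round(rho, 4))
print("closed-form rho(1):", round(rho1_closed, 4))
print("rho(2) / rho(1):", round(rho[2] / rho[1], 4))
```

From lag 2 onward each autocorrelation is $\phi$ times the previous one, so the ACF decays geometrically like an AR(1), but it starts from a value of $\rho(1)$ that also depends on $\theta$.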

In [None]:
# Simulate ARMA(1,1)
np.random.seed(42)
n = 300
phi, theta = 0.7, 0.4

# Use statsmodels ArmaProcess for correct simulation
ar_coef = np.array([1, -phi])  # AR polynomial: 1 - phi*L
ma_coef = np.array([1, theta])  # MA polynomial: 1 + theta*L
arma_process = ArmaProcess(ar_coef, ma_coef)
arma11 = arma_process.generate_sample(n)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Time series
axes[0].plot(arma11, color=COLORS['orange'], linewidth=0.8, label='ARMA(1,1)')
axes[0].axhline(y=0, color='red', linestyle='--', alpha=0.5)
axes[0].set_title(f'ARMA(1,1): φ={phi}, θ={theta}', fontweight='bold')
axes[0].set_xlabel('Time')
axes[0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.15), frameon=False)

# ACF
plot_acf(arma11, ax=axes[1], lags=20, color=COLORS['orange'])
axes[1].set_title('ACF (Decays)', fontweight='bold')

# PACF
plot_pacf(arma11, ax=axes[2], lags=20, color=COLORS['orange'], method='ywm')
axes[2].set_title('PACF (Decays)', fontweight='bold')

plt.tight_layout()
plt.show()

print("\nARMA(p,q) Identification: Both ACF and PACF decay gradually")

## 5. Model Identification Summary

| Model | ACF Pattern | PACF Pattern |
|-------|-------------|--------------|
| AR(p) | Decays (exponentially or as damped oscillations) | Cuts off after lag p |
| MA(q) | Cuts off after lag q | Decays (exponentially or as damped oscillations) |
| ARMA(p,q) | Decays | Decays |

In [None]:
# Visual comparison of AR, MA, ARMA patterns
np.random.seed(123)
n = 500

# Simulate different processes
# AR(2)
ar2 = np.zeros(n)
for t in range(2, n):
    ar2[t] = 0.5*ar2[t-1] + 0.3*ar2[t-2] + np.random.randn()

# MA(2): draw two extra shocks so every observation has both lagged shocks
eps = np.random.randn(n + 2)
ma2 = eps[2:] + 0.6*eps[1:-1] + 0.3*eps[:-2]

# ARMA(1,1)
ar_coef = np.array([1, -0.6])
ma_coef = np.array([1, 0.4])
arma = ArmaProcess(ar_coef, ma_coef).generate_sample(n)

fig, axes = plt.subplots(3, 2, figsize=(14, 10))

processes = [
    (ar2, 'AR(2)', COLORS['blue']),
    (ma2, 'MA(2)', COLORS['green']),
    (arma, 'ARMA(1,1)', COLORS['orange'])
]

for i, (data, name, color) in enumerate(processes):
    plot_acf(data, ax=axes[i, 0], lags=15, color=color)
    axes[i, 0].set_title(f'{name}: ACF', fontweight='bold')
    
    plot_pacf(data, ax=axes[i, 1], lags=15, color=color, method='ywm')
    axes[i, 1].set_title(f'{name}: PACF', fontweight='bold')

plt.tight_layout()
plt.show()

print("Identification Guide:")
print("- AR(2): ACF decays, PACF cuts off at lag 2")
print("- MA(2): ACF cuts off at lag 2, PACF decays")
print("- ARMA(1,1): Both ACF and PACF decay")

## 6. Model Estimation

We use **Maximum Likelihood Estimation (MLE)** to fit ARMA models.
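
To build intuition for what the likelihood is doing in the simplest case: for a zero-mean Gaussian AR(1), maximizing the likelihood conditional on the first observation reduces to a least-squares regression of $X_t$ on $X_{t-1}$. The sketch below implements that simplified version. It is an approximation for illustration and may differ slightly from the estimate statsmodels reports; the seed and sample size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n, true_phi = 2000, 0.7

# Simulate a zero-mean AR(1) with unit-variance Gaussian shocks
x = np.zeros(n)
eps = rng.standard_normal(n)
for t in range(1, n):
    x[t] = true_phi * x[t - 1] + eps[t]

y, y_lag = x[1:], x[:-1]

# Conditional MLE of phi: least-squares slope without intercept
phi_hat = np.sum(y * y_lag) / np.sum(y_lag**2)

# Conditional MLE of the innovation variance: mean squared residual
resid = y - phi_hat * y_lag
sigma2_hat = np.mean(resid**2)

# Large-sample standard error of phi_hat: sqrt((1 - phi^2) / n)
se_phi = np.sqrt((1 - phi_hat**2) / len(y))

print(f"phi_hat    = {phi_hat:.4f} (true {true_phi})")
print(f"sigma2_hat = {sigma2_hat:.4f} (true 1.0)")
print(f"approx. SE = {se_phi:.4f}")
```

The standard-error formula shows why persistent series (with $\phi$ close to 1) give tighter estimates of $\phi$ for the same sample size.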

In [None]:
# Fit ARMA model to simulated data
# True model: AR(1) with phi = 0.7
np.random.seed(42)
n = 500
true_phi = 0.7

ar1_data = np.zeros(n)
for t in range(1, n):
    ar1_data[t] = true_phi * ar1_data[t-1] + np.random.randn()

# Fit AR(1) model using ARIMA with d=0
model = ARIMA(ar1_data, order=(1, 0, 0))
results = model.fit()

print("=" * 60)
print("AR(1) Model Estimation Results")
print("=" * 60)
print(f"True φ = {true_phi}")
print(f"Estimated φ = {results.arparams[0]:.4f}")
# Parameter order for this fit is [const, ar.L1, sigma2], so the standard error of φ is at index 1
phi_se = results.bse[1]
print(f"Standard Error = {phi_se:.4f}")
print(f"95% CI: [{results.arparams[0] - 1.96*phi_se:.4f}, {results.arparams[0] + 1.96*phi_se:.4f}]")
print(f"\nAIC = {results.aic:.2f}")
print(f"BIC = {results.bic:.2f}")

In [None]:
# Model summary
print(results.summary())

## 7. Model Selection with Information Criteria

**AIC** (Akaike Information Criterion): $\text{AIC} = -2\ln(\hat{L}) + 2k$

**BIC** (Bayesian Information Criterion): $\text{BIC} = -2\ln(\hat{L}) + k\ln(n)$

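Here $\hat{L}$ is the maximized likelihood, $k$ is the number of estimated parameters, and $n$ is the sample size. Lower values are better. BIC penalizes complexity more strongly than AIC whenever $\ln(n) > 2$, that is, for $n \geq 8$.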
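
The two criteria can disagree, because their penalties grow at different rates. A small sketch with made-up log-likelihoods (the numbers, model labels, and helper names are hypothetical, chosen only to show the mechanics; here $k$ counts the constant, the ARMA coefficients, and the innovation variance):

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion."""
    return -2.0 * log_likelihood + 2.0 * k

def bic(log_likelihood, k, n):
    """Bayesian information criterion."""
    return -2.0 * log_likelihood + k * math.log(n)

# Hypothetical fits on n = 500 observations (log-likelihoods are invented)
n = 500
candidates = {
    "ARMA(1,0)": {"loglik": -705.0, "k": 3},
    "ARMA(2,1)": {"loglik": -702.5, "k": 5},
}

scores = {
    name: (aic(fit["loglik"], fit["k"]), bic(fit["loglik"], fit["k"], n))
    for name, fit in candidates.items()
}
for name, (a, b) in scores.items():
    print(f"{name}: AIC = {a:.2f}, BIC = {b:.2f}")

best_by_aic = min(scores, key=lambda m: scores[m][0])
best_by_bic = min(scores, key=lambda m: scores[m][1])
print("Best by AIC:", best_by_aic)
print("Best by BIC:", best_by_bic)
```

In this invented example the larger model gains 2.5 log-likelihood units for two extra parameters, which is enough to win on AIC but not on BIC.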

In [None]:
# Compare different model orders
orders = [(1,0,0), (2,0,0), (0,0,1), (0,0,2), (1,0,1), (2,0,1)]
results_dict = {}

print("Model Comparison:")
print("=" * 50)
print(f"{'Model':<15} {'AIC':>12} {'BIC':>12}")
print("-" * 50)

for order in orders:
    try:
        model = ARIMA(ar1_data, order=order)
        res = model.fit()
        model_name = f"ARMA({order[0]},{order[2]})"
        results_dict[model_name] = {'AIC': res.aic, 'BIC': res.bic}
        print(f"{model_name:<15} {res.aic:>12.2f} {res.bic:>12.2f}")
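    except Exception as exc:
        # Report the failure instead of silently skipping the model
        print(f"ARMA({order[0]},{order[2]}) failed: {exc}")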

print("-" * 50)
print("\nBest model by AIC:", min(results_dict, key=lambda x: results_dict[x]['AIC']))
print("Best model by BIC:", min(results_dict, key=lambda x: results_dict[x]['BIC']))

## 8. Model Diagnostics

After fitting a model, we check if residuals are white noise:
1. Plot residuals over time
2. Check ACF of residuals
3. Ljung-Box test for autocorrelation
4. Q-Q plot for normality
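
The Ljung-Box statistic in step 3 is simple enough to compute by hand: $Q = n(n+2)\sum_{k=1}^{m} \hat{\rho}_k^2/(n-k)$, compared against a chi-square distribution with $m$ degrees of freedom (reduced by $p+q$ when the series consists of residuals from a fitted ARMA(p,q) model). A minimal sketch follows; the function name, the seed, and the two test series are assumptions made for illustration.

```python
import numpy as np

def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n - k)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    denom = np.sum(d * d)
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = np.sum(d[k:] * d[:-k]) / denom  # sample autocorrelation at lag k
        q += r_k**2 / (n - k)
    return n * (n + 2) * q

rng = np.random.default_rng(1)
white_noise = rng.standard_normal(500)

# A strongly autocorrelated series for contrast: AR(1) with phi = 0.7
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.7 * ar1[t - 1] + white_noise[t]

CHI2_95_DF10 = 18.307  # 95% quantile of the chi-square distribution with 10 df

q_wn = ljung_box_q(white_noise, 10)
q_ar = ljung_box_q(ar1, 10)
print(f"White noise: Q(10) = {q_wn:.2f}")
print(f"AR(1):       Q(10) = {q_ar:.2f}")
print(f"5% critical value (10 df): {CHI2_95_DF10}")
```

A value of Q above the critical value indicates autocorrelation that the model has not captured; the AR(1) series here should exceed it by a wide margin.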

In [None]:
# Residual diagnostics
residuals = results.resid

fig, axes = plt.subplots(2, 2, figsize=(12, 8))

# Residuals over time
axes[0, 0].plot(residuals, color=COLORS['blue'], linewidth=0.5, label='Residuals')
axes[0, 0].axhline(y=0, color='red', linestyle='--')
axes[0, 0].set_title('Residuals Over Time', fontweight='bold')
axes[0, 0].set_xlabel('Time')
axes[0, 0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), frameon=False)

# Histogram
axes[0, 1].hist(residuals, bins=30, color=COLORS['blue'], edgecolor='black', alpha=0.7, density=True, label='Residuals')
x = np.linspace(residuals.min(), residuals.max(), 100)
axes[0, 1].plot(x, stats.norm.pdf(x, residuals.mean(), residuals.std()), 
                color=COLORS['red'], linewidth=2, label='Normal')
axes[0, 1].set_title('Residual Distribution', fontweight='bold')
axes[0, 1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=2, frameon=False)

# ACF of residuals
plot_acf(residuals, ax=axes[1, 0], lags=20, color=COLORS['blue'])
axes[1, 0].set_title('ACF of Residuals', fontweight='bold')

# Q-Q plot
(osm, osr), (slope, intercept, r) = stats.probplot(residuals, dist="norm")
axes[1, 1].scatter(osm, osr, color=COLORS['blue'], s=20, alpha=0.5, label='Sample')
axes[1, 1].plot(osm, slope*osm + intercept, color=COLORS['red'], linewidth=2, label='Theoretical')
axes[1, 1].set_title('Q-Q Plot', fontweight='bold')
axes[1, 1].set_xlabel('Theoretical Quantiles')
axes[1, 1].set_ylabel('Sample Quantiles')
axes[1, 1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=2, frameon=False)

plt.tight_layout()
plt.show()

In [None]:
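# Ljung-Box test on the AR(1) residuals
# model_df=1 adjusts the degrees of freedom for the one estimated AR coefficient
lb_test = acorr_ljungbox(residuals, lags=[10, 20, 30], model_df=1, return_df=True)
print("Ljung-Box Test for Residual Autocorrelation:")
print("="*50)
print(lb_test)
print("\nInterpretation:")
print("p-values > 0.05: no significant residual autocorrelation detected (consistent with white noise)")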

## 9. Forecasting with ARMA

For AR(1): $\hat{X}_{n+h|n} = \mu + \phi^h(X_n - \mu)$

- Point forecasts converge to the mean as $h \to \infty$
- Forecast uncertainty grows with the horizon and levels off at the unconditional variance $\sigma^2/(1-\phi^2)$
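
Both properties follow from the closed-form AR(1) results: the forecast mean given above and the $h$-step forecast error variance $\sigma^2(1-\phi^{2h})/(1-\phi^2)$. The sketch below evaluates them directly. It treats the parameters as known, so it ignores estimation uncertainty; the function name and the input values are assumptions chosen for illustration.

```python
import numpy as np

def ar1_forecast(x_n, mu, phi, sigma2, horizon):
    """Point forecasts and 95% intervals for a stationary AR(1) with known parameters."""
    h = np.arange(1, horizon + 1)
    mean = mu + phi**h * (x_n - mu)                    # decays toward mu
    var = sigma2 * (1 - phi**(2 * h)) / (1 - phi**2)   # h-step forecast error variance
    half_width = 1.96 * np.sqrt(var)
    return mean, mean - half_width, mean + half_width

# Illustrative values: last observation 2.0, mean 0, phi = 0.7, unit innovation variance
mean, lower, upper = ar1_forecast(x_n=2.0, mu=0.0, phi=0.7, sigma2=1.0, horizon=50)

print(f"h=1:  forecast {mean[0]:.4f}, interval width {upper[0] - lower[0]:.4f}")
print(f"h=50: forecast {mean[-1]:.4f}, interval width {upper[-1] - lower[-1]:.4f}")
print(f"Limiting width: {2 * 1.96 / np.sqrt(1 - 0.7**2):.4f}")
```

At $h = 1$ the interval reflects only one shock; by $h = 50$ the forecast has essentially returned to the mean and the interval width matches the limit implied by the unconditional variance.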

In [None]:
# Generate forecasts with confidence intervals
forecast_steps = 50
forecast = results.get_forecast(steps=forecast_steps)
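# Coerce to pandas so the .iloc indexing below works whether statsmodels
# returns NumPy arrays (NumPy input) or pandas objects (pandas input)
forecast_mean = pd.Series(np.asarray(forecast.predicted_mean))
forecast_ci = pd.DataFrame(np.asarray(forecast.conf_int()))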

# Plot
fig, ax = plt.subplots(figsize=(14, 5))

# Historical data (last 100 points)
ax.plot(range(400, 500), ar1_data[400:], color=COLORS['blue'], linewidth=1, label='Historical')

# Forecasts
forecast_index = range(500, 500 + forecast_steps)
ax.plot(forecast_index, forecast_mean, color=COLORS['red'], linewidth=2, label='Forecast')

# Confidence interval
ax.fill_between(forecast_index, 
                forecast_ci.iloc[:, 0], 
                forecast_ci.iloc[:, 1],
                color=COLORS['red'], alpha=0.2, label='95% CI')

# Mean line
ax.axhline(y=0, color='gray', linestyle='--', alpha=0.5, label='Mean')

ax.axvline(x=500, color='black', linestyle='-', alpha=0.3)
ax.set_xlabel('Time')
ax.set_ylabel('Value')
ax.set_title('AR(1) Forecasts with 95% Confidence Interval', fontweight='bold')
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=4, frameon=False)
plt.tight_layout()
plt.show()

print("\nForecast Properties:")
print(f"- Forecasts converge to mean = {forecast_mean.iloc[-1]:.4f}")
print(f"- CI width at h=1: {forecast_ci.iloc[0, 1] - forecast_ci.iloc[0, 0]:.4f}")
print(f"- CI width at h=50: {forecast_ci.iloc[-1, 1] - forecast_ci.iloc[-1, 0]:.4f}")

## 10. Real Data Example: Stock Returns

In [None]:
# Load real data
import yfinance as yf

# Download S&P 500 data
sp500 = yf.download('^GSPC', start='2020-01-01', end='2024-12-31', progress=False)
if isinstance(sp500.columns, pd.MultiIndex):
    sp500.columns = sp500.columns.droplevel(1)

# Calculate returns
returns = sp500['Close'].pct_change().dropna() * 100
print(f"S&P 500 Returns: {len(returns)} observations")
print(f"Mean: {returns.mean():.4f}%")
print(f"Std: {returns.std():.4f}%")

In [None]:
# Check stationarity
adf_result = adfuller(returns)
print("ADF Test for Stationarity:")
print(f"  Test Statistic: {adf_result[0]:.4f}")
print(f"  p-value: {adf_result[1]:.6f}")
print(f"  Conclusion: {'STATIONARY' if adf_result[1] < 0.05 else 'NON-STATIONARY'}")

In [None]:
# ACF/PACF of returns
fig, axes = plt.subplots(1, 2, figsize=(14, 4))

plot_acf(returns, ax=axes[0], lags=30, color=COLORS['blue'])
axes[0].set_title('ACF of S&P 500 Returns', fontweight='bold')

plot_pacf(returns, ax=axes[1], lags=30, color=COLORS['blue'], method='ywm')
axes[1].set_title('PACF of S&P 500 Returns', fontweight='bold')

plt.tight_layout()
plt.show()

print("Note: Daily stock returns typically show little linear autocorrelation, consistent with the weak-form efficient market hypothesis")

In [None]:
# Fit ARMA models and compare
orders = [(1,0,0), (0,0,1), (1,0,1), (2,0,1)]

print("Model Comparison for S&P 500 Returns:")
print("=" * 50)
print(f"{'Model':<15} {'AIC':>12} {'BIC':>12}")
print("-" * 50)

best_aic = float('inf')
best_model = None

for order in orders:
    try:
        model = ARIMA(returns, order=order)
        res = model.fit()
        model_name = f"ARMA({order[0]},{order[2]})"
        print(f"{model_name:<15} {res.aic:>12.2f} {res.bic:>12.2f}")
        if res.aic < best_aic:
            best_aic = res.aic
            best_model = res
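    except Exception as exc:
        # Report the failure instead of silently skipping the model
        print(f"ARMA({order[0]},{order[2]}) failed: {exc}")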

print("\nNote: Low autocorrelation in returns means simple models often suffice.")

## Summary

### Key Takeaways

1. **AR(p) models:** Current value depends on $p$ past values
   - Stationarity: roots of $\phi(z)$ outside unit circle
   - PACF cuts off at lag $p$

2. **MA(q) models:** Current value depends on $q$ past shocks
   - Always stationary; invertibility: roots of $\theta(z)$ outside unit circle
   - ACF cuts off at lag $q$

3. **ARMA(p,q):** Combines AR and MA
   - Both ACF and PACF decay

4. **Model selection:** Use AIC/BIC to compare models

5. **Diagnostics:** Residuals should behave like white noise (check the residual ACF and the Ljung-Box test)

6. **Forecasting:** Point forecasts converge to the mean; forecast uncertainty grows with the horizon and levels off at the unconditional variance

### Next Chapter: ARIMA and Seasonal Models
- ARIMA(p,d,q) for non-stationary data
- Seasonal ARIMA models