# Day 13: ARMA Models

## Combining AR and MA Components

ARMA (AutoRegressive Moving Average) models combine the strengths of both AR and MA approaches:
- **AR component**: Models dependency on past values
- **MA component**: Models dependency on past forecast errors
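- **ARMA(p,q)**: Combines p autoregressive lags with q moving-average lags, and can often capture the same dynamics with fewer parameters than a pure AR or MA model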
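
As a rough illustration of how the two components interact, the sketch below simulates an ARMA(1,1) process directly from the recursion y(t) = c + Σφᵢy(t-i) + ε(t) + Σθⱼε(t-j). The helper names `simulate_arma` and `lag1_autocorr`, the coefficients φ = 0.6 and θ = 0.3, and the burn-in length are illustrative assumptions, not values taken from the gold data.

```python
import numpy as np

def simulate_arma(phi, theta, n, c=0.0, sigma=1.0, burn_in=200, seed=0):
    """Simulate y(t) = c + sum(phi_i * y(t-i)) + eps(t) + sum(theta_j * eps(t-j))."""
    rng = np.random.default_rng(seed)
    p, q = len(phi), len(theta)
    total = n + burn_in
    eps = rng.normal(0.0, sigma, size=total)
    y = np.zeros(total)
    for t in range(total):
        # AR part: weighted past values; MA part: weighted past shocks
        ar_part = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma_part = sum(theta[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        y[t] = c + ar_part + eps[t] + ma_part
    # Drop the burn-in so the zero start-up values do not affect the sample
    return y[burn_in:]

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.sum(x[1:] * x[:-1]) / np.sum(x * x))

phi1, theta1 = 0.6, 0.3  # assumed coefficients for the illustration
y = simulate_arma(phi=[phi1], theta=[theta1], n=5000)

# Textbook lag-1 autocorrelation of a stationary ARMA(1,1)
rho1_theory = (1 + phi1 * theta1) * (phi1 + theta1) / (1 + 2 * phi1 * theta1 + theta1 ** 2)

print(f"sample lag-1 autocorrelation:      {lag1_autocorr(y):.3f}")
print(f"theoretical lag-1 autocorrelation: {rho1_theory:.3f}")
```

The sample and theoretical values should be close for a series this long; setting `theta=[]` or `phi=[]` reduces the same function to a pure AR or pure MA simulator.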

In this notebook, we'll explore:
- ACF/PACF patterns for ARMA identification
- Fitting ARMA models with different orders
- Model selection using AIC and BIC
- Parameter interpretation and diagnostics

In [14]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
from plotly.subplots import make_subplots

import yfinance as yf
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import acf, pacf, adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from sklearn.metrics import mean_squared_error, mean_absolute_error
import warnings
warnings.filterwarnings('ignore')

# Set style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (14, 6)

print("✓ Libraries imported successfully")

✓ Libraries imported successfully


## 1. Load and Prepare Data

In [15]:
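# Fetch GLD (a gold-backed ETF) as a proxy for the gold price
gold = yf.download('GLD', start='2015-01-01', end='2026-01-25', progress=False)
# Rescale the GLD close by a fixed factor of 10.8 and round to whole dollars
gold['Price'] = (gold['Close'] * 10.8).round(0)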
gold = gold[['Price']].copy()

print(f"Gold data shape: {gold.shape}")
print(f"Date range: {gold.index.min()} to {gold.index.max()}")
print(f"\nData summary:")
print(gold['Price'].describe())

# Plot original series
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=gold.index, y=gold['Price'],
    mode='lines',
    name='Gold Price',
    line=dict(color='goldenrod', width=1)
))
fig.update_layout(
    title="Gold Price Time Series",
    xaxis_title='Date',
    yaxis_title='Price ($)',
    template='plotly_white',
    height=500
)
fig.show()

print("✓ Data loaded")

Gold data shape: (2781, 1)
Date range: 2015-01-02 00:00:00 to 2026-01-23 00:00:00

Data summary:
count    2781.000000
mean     1805.171161
std       678.594784
min      1085.000000
25%      1306.000000
50%      1726.000000
75%      1956.000000
max      4946.000000
Name: Price, dtype: float64


✓ Data loaded


## 2. Stationarity Testing and Differencing

In [16]:
# First difference for stationarity
gold_diff = gold['Price'].diff().dropna()

# ADF Test
adf_result = adfuller(gold_diff, autolag='AIC')
print(f"ADF Test on Differenced Series:")
print(f"  Test Statistic: {adf_result[0]:.6f}")
print(f"  P-value: {adf_result[1]:.6f}")
print(f"  Result: {'STATIONARY' if adf_result[1] < 0.05 else 'NON-STATIONARY'}")

# Plot original vs differenced
fig = make_subplots(
    rows=2, cols=1,
    subplot_titles=('Original Series', 'First Differenced'),
    shared_xaxes=True,
    vertical_spacing=0.1
)

fig.add_trace(
    go.Scatter(x=gold.index, y=gold['Price'], mode='lines', name='Original', line=dict(color='blue')),
    row=1, col=1
)

fig.add_trace(
    go.Scatter(x=gold_diff.index, y=gold_diff.values, mode='lines', name='Differenced', line=dict(color='red')),
    row=2, col=1
)

fig.update_layout(height=600, showlegend=True, title_text="Stationarity Check")
fig.show()

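# Report the conclusion that the ADF test actually supports
if adf_result[1] < 0.05:
    print("✓ Series is stationary after differencing")
else:
    print("⚠ Differenced series still looks non-stationary; consider differencing again")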

ADF Test on Differenced Series:
  Test Statistic: -6.144134
  P-value: 0.000000
  Result: STATIONARY


✓ Series is stationary after differencing
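
Fitting ARMA(p,q) to the first differences amounts to fitting ARIMA(p,1,q) to the price levels, so forecasts of price changes can be mapped back to prices by cumulative summation from the last observed level. The sketch below shows that round trip on a handful of made-up prices; the numbers and the variable names are hypothetical and unrelated to the gold series.

```python
import numpy as np

# Hypothetical price levels (not the gold data)
prices = np.array([1200.0, 1204.0, 1199.0, 1210.0, 1215.0])

# First difference, the NumPy counterpart of Series.diff().dropna()
changes = np.diff(prices)

# Hypothetical forecasts of the next three price changes
change_forecast = np.array([2.0, -1.0, 3.0])

# Undo the differencing: last observed level plus cumulative forecast changes
level_forecast = prices[-1] + np.cumsum(change_forecast)

# Sanity check: cumulating the observed changes reproduces the original levels
reconstructed = prices[0] + np.cumsum(changes)

print("changes:         ", changes)
print("level forecast:  ", level_forecast)
print("round trip holds:", np.allclose(reconstructed, prices[1:]))
```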


## 3. ACF and PACF Analysis

In [17]:
# Calculate ACF and PACF
acf_vals = acf(gold_diff, nlags=40)
pacf_vals = pacf(gold_diff, nlags=40, method='ywm')

# Approximate 95% band under the white-noise null: ±1.96 / sqrt(n)
ci = 1.96 / np.sqrt(len(gold_diff))

# Create side-by-side plot
fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=('ACF', 'PACF')
)

# ACF
fig.add_trace(
    go.Bar(x=list(range(len(acf_vals))), y=acf_vals, name='ACF', marker=dict(color='blue')),
    row=1, col=1
)
fig.add_hline(y=ci, line_dash='dash', line_color='red', row=1, col=1)
fig.add_hline(y=-ci, line_dash='dash', line_color='red', row=1, col=1)

# PACF
fig.add_trace(
    go.Bar(x=list(range(len(pacf_vals))), y=pacf_vals, name='PACF', marker=dict(color='green')),
    row=1, col=2
)
fig.add_hline(y=ci, line_dash='dash', line_color='red', row=1, col=2)
fig.add_hline(y=-ci, line_dash='dash', line_color='red', row=1, col=2)

fig.update_layout(height=500, showlegend=True, title_text="ACF and PACF Analysis")
fig.show()

print(f"ACF values (1-5): {acf_vals[1:6]}")
print(f"PACF values (1-5): {pacf_vals[1:6]}")
print(f"95% CI: ±{ci:.4f}")

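# Identify significant lags. With 20 lags tested at the 5% level,
# roughly one is expected to cross the band by chance alone.
sig_acf = [i for i in range(1, 21) if abs(acf_vals[i]) > ci]
sig_pacf = [i for i in range(1, 21) if abs(pacf_vals[i]) > ci]

print(f"\nSignificant ACF lags: {sig_acf}")
print(f"Significant PACF lags: {sig_pacf}")
# Crude starting point only: the first significant lag of each function
print(f"\nSuggested:")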
print(f"  AR order (p): {sig_pacf[0] if sig_pacf else 0}")
print(f"  MA order (q): {sig_acf[0] if sig_acf else 0}")

ACF values (1-5): [-0.01262527  0.03511747 -0.0127346  -0.04307289  0.0102948 ]
PACF values (1-5): [-0.01262527  0.03496365 -0.01188173 -0.04466352  0.01013569]
95% CI: ±0.0372

Significant ACF lags: [4, 8, 13, 14, 16, 18]
Significant PACF lags: [4, 12, 13, 14, 16, 18]

Suggested:
  AR order (p): 4
  MA order (q): 4
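
The flagged lags above sit only slightly outside the ±0.0372 band (lag 4 is about -0.043), so it helps to see how often pure noise produces the same pattern. The sketch below computes the sample ACF by hand and counts band crossings on simulated white noise of the same length as the differenced series. The `sample_acf` helper and the random seed are assumptions made for illustration; the helper uses the estimator that divides every lag by the lag-0 sum of squares.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r_0..r_nlags, each divided by the lag-0 sum of squares."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x * x)
    return np.array([np.sum(x[k:] * x[:len(x) - k]) / denom for k in range(nlags + 1)])

rng = np.random.default_rng(42)
n = 2780                      # same length as the differenced gold series
noise = rng.normal(size=n)    # white noise: no true autocorrelation at any lag

r = sample_acf(noise, nlags=20)
band = 1.96 / np.sqrt(n)
flagged = [k for k in range(1, 21) if abs(r[k]) > band]

print(f"95% band: ±{band:.4f}")
print(f"Lags flagged on pure white noise: {flagged}")
```

Each lag has roughly a 5% chance of crossing the band under the null, so an isolated crossing at a higher lag is weak evidence for a large p or q.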


## 4. Train-Test Split

In [18]:
train_size = int(len(gold_diff) * 0.8)
train = gold_diff.iloc[:train_size]
test = gold_diff.iloc[train_size:]

print(f"Training set: {len(train)} observations ({len(train)/len(gold_diff)*100:.1f}%)")
print(f"Test set: {len(test)} observations ({len(test)/len(gold_diff)*100:.1f}%)")
print(f"Train period: {train.index.min()} to {train.index.max()}")
print(f"Test period: {test.index.min()} to {test.index.max()}")

Training set: 2224 observations (80.0%)
Test set: 556 observations (20.0%)
Train period: 2015-01-05 00:00:00 to 2023-11-02 00:00:00
Test period: 2023-11-03 00:00:00 to 2026-01-23 00:00:00


## 5. Fit ARMA Models with Different Orders

In [19]:
# Define ARMA orders to test
arma_orders = [
    (1, 0),  # AR(1)
    (2, 0),  # AR(2)
    (0, 1),  # MA(1)
    (0, 2),  # MA(2)
    (1, 1),  # ARMA(1,1)
    (2, 1),  # ARMA(2,1)
]

models = {}
results_table = []

print(f"\n{'Model':<12} {'AIC':<15} {'BIC':<15} {'RMSE':<15}")
print("-" * 57)

for p, q in arma_orders:
    try:
        # Fit ARIMA(p,0,q) which is ARMA(p,q)
        model = ARIMA(train, order=(p, 0, q), trend='c')
        fit = model.fit()
        
        # Multi-step forecast over the full test horizon; it decays toward
        # the series mean as the horizon grows
        forecast = fit.forecast(steps=len(test))
        forecast_series = pd.Series(forecast.values, index=test.index)
        
        # Calculate RMSE
        rmse = np.sqrt(mean_squared_error(test, forecast_series))
        
        models[(p, q)] = fit
        results_table.append({
            'order': (p, q),
            'aic': fit.aic,
            'bic': fit.bic,
            'rmse': rmse,
            'forecast': forecast_series
        })
        
        print(f"{f'ARMA({p},{q})':<12} {fit.aic:<15.2f} {fit.bic:<15.2f} {rmse:<15.6f}")
    except Exception as e:
        print(f"ARMA({p},{q}) Failed: {str(e)[:40]}")

# Find best by BIC
results_df = pd.DataFrame(results_table)
best_idx = results_df['bic'].idxmin()
best_order = results_df.loc[best_idx, 'order']
best_model = models[best_order]

print(f"\nOptimal: ARMA{best_order} (by BIC)")
print(f"  AIC: {best_model.aic:.2f}")
print(f"  BIC: {best_model.bic:.2f}")
print(f"  RMSE: {results_df.loc[best_idx, 'rmse']:.6f}")


Model        AIC             BIC             RMSE
---------------------------------------------------------
ARMA(1,0)    18118.06        18135.18        36.743841
ARMA(2,0)    18115.85        18138.68        36.743549
ARMA(0,1)    18117.91        18135.04        36.743831
ARMA(0,2)    18115.71        18138.54        36.743573
ARMA(1,1)    18117.31        18140.13        36.743482
ARMA(2,1)    18117.27        18145.80        36.743581

Optimal: ARMA(0, 1) (by BIC)
  AIC: 18117.91
  BIC: 18135.04
  RMSE: 36.743831


## 6. Examine Optimal Model

In [20]:
print(f"ARMA{best_order} Summary:")
print(best_model.summary())

ARMA(0, 1) Summary:
                               SARIMAX Results                                
Dep. Variable:                  Price   No. Observations:                 2224
Model:                 ARIMA(0, 0, 1)   Log Likelihood               -9055.957
Date:                Sun, 25 Jan 2026   AIC                          18117.914
Time:                        14:05:22   BIC                          18135.035
Sample:                             0   HQIC                         18124.167
                               - 2224                                         
Covariance Type:                  opg                                         
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.3399      0.313      1.085      0.278      -0.274       0.954
ma.L1          0.0289      0.017      1.692      0.091      -0.005       0.062
sigma2       201.6086      3.357
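
The AIC and BIC in the summary follow from the log-likelihood, the number of estimated parameters, and the sample size. The sketch below recomputes them from the reported figures using the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L; the parameter count k = 3 (const, ma.L1, sigma2) is read off the coefficient table.

```python
import math

log_likelihood = -9055.957   # "Log Likelihood" in the summary
n_obs = 2224                 # "No. Observations" in the summary
k = 3                        # estimated parameters: const, ma.L1, sigma2

# Both criteria penalise the fit (-2 ln L) by the number of parameters;
# BIC's penalty grows with the sample size, so it favours smaller models.
aic = 2 * k - 2 * log_likelihood
bic = k * math.log(n_obs) - 2 * log_likelihood

print(f"AIC: {aic:.3f}")
print(f"BIC: {bic:.3f}")
```

Up to rounding of the log-likelihood, these match the reported AIC of 18117.914 and BIC of 18135.035. With n = 2224, each extra parameter costs about 7.7 BIC points versus 2 AIC points, which is why AIC leans toward the two-lag models while BIC picks a one-lag model.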

## 7. Extract and Interpret Parameters

In [21]:
params = best_model.params
p, q = best_order

print(f"ARMA{best_order} Parameters:")
print(f"  Constant: {params.iloc[0]:.6f}")

# AR parameters
if p > 0:
    print(f"\n  AR Coefficients (φ):")
    for i in range(1, p + 1):
        param_name = f'ar.L{i}'
        if param_name in params.index:
            print(f"    φ_{i}: {params[param_name]:.6f}")

# MA parameters
if q > 0:
    print(f"\n  MA Coefficients (θ):")
    for i in range(1, q + 1):
        param_name = f'ma.L{i}'
        if param_name in params.index:
            print(f"    θ_{i}: {params[param_name]:.6f}")

print(f"\nFormula: y(t) = c + Σφᵢy(t-i) + εₜ + Σθⱼε(t-j)")
print(f"  AR: Models dependency on past values")
print(f"  MA: Models dependency on past forecast errors")

ARMA(0, 1) Parameters:
  Constant: 0.339931

  MA Coefficients (θ):
    θ_1: 0.028940

Formula: y(t) = c + Σφᵢy(t-i) + εₜ + Σθⱼε(t-j)
  AR: Models dependency on past values
  MA: Models dependency on past forecast errors
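
To make the fitted numbers concrete, the sketch below works out what an MA(1) with these coefficients implies: the lag-1 autocorrelation θ/(1 + θ²), and forecasts that use the last shock for one step only and then fall back to the constant. The function `ma1_forecast` and the value of `last_shock` are hypothetical, introduced only for this illustration.

```python
c = 0.339931       # fitted constant from the output above
theta = 0.028940   # fitted MA(1) coefficient from the output above

# Lag-1 autocorrelation implied by an MA(1) process
rho1 = theta / (1 + theta ** 2)

def ma1_forecast(c, theta, last_shock, steps):
    """h-step forecasts of y(t) = c + eps(t) + theta * eps(t-1)."""
    # Only the first step can use the last observed shock; later shocks are
    # unknown with expectation zero, so the forecast falls back to c.
    return [c + theta * last_shock if h == 1 else c for h in range(1, steps + 1)]

# `last_shock` is a hypothetical value chosen for illustration
forecasts = ma1_forecast(c, theta, last_shock=10.0, steps=5)

print(f"implied lag-1 autocorrelation: {rho1:.4f}")
print("forecasts:", [round(f, 4) for f in forecasts])
```

An implied lag-1 autocorrelation of about 0.03 means the model carries very little information from one day to the next, and the flat forecast beyond the first step is consistent with every candidate model producing almost the same RMSE over the long test horizon.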


## 8. Evaluate Forecasts

In [22]:
best_forecast = results_df.loc[best_idx, 'forecast']

# Calculate metrics
rmse = np.sqrt(mean_squared_error(test, best_forecast))
mae = mean_absolute_error(test, best_forecast)

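# Naive baseline: repeat the last observed training-set change across the test horizon
naive = pd.Series([train.iloc[-1]] * len(test), index=test.index)
naive_rmse = np.sqrt(mean_squared_error(test, naive))
# Positive means the ARMA forecast beats the baseline; negative means it is worse
improvement = ((naive_rmse - rmse) / naive_rmse * 100)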

print(f"ARMA{best_order} Performance:")
print(f"  RMSE: {rmse:.6f}")
print(f"  MAE: {mae:.6f}")
print(f"  Naive RMSE: {naive_rmse:.6f}")
print(f"  Improvement: {improvement:.2f}%")

# Plot forecast vs actual
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=train.index, y=train.values,
    mode='lines',
    name='Training Data',
    line=dict(color='blue', width=1)
))

fig.add_trace(go.Scatter(
    x=test.index, y=test.values,
    mode='lines',
    name='Test Data (Actual)',
    line=dict(color='green', width=2)
))

fig.add_trace(go.Scatter(
    x=test.index, y=best_forecast.values,
    mode='lines',
    name=f'ARMA{best_order} Forecast',
    line=dict(color='red', width=2, dash='dash')
))

fig.update_layout(
    title=f"ARMA{best_order}: Forecast vs Actual",
    xaxis_title='Date',
    yaxis_title='Price Change ($)',
    template='plotly_white',
    height=500
)
fig.show()

print("✓ Forecast evaluation complete")

ARMA(0, 1) Performance:
  RMSE: 36.743831
  MAE: 25.101864
  Naive RMSE: 36.411171
  Improvement: -0.91%


✓ Forecast evaluation complete
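
A single forecast spanning the whole test period mostly measures how well the series mean predicts future changes. One alternative is a walk-forward evaluation in which each prediction is made one step ahead and the model state is updated as every new observation arrives. The sketch below does this for an MA(1) with fixed, known parameters on synthetic data; the function `one_step_ma1_forecasts`, the parameters c = 0 and θ = 0.5, and the split point are all assumptions for illustration, and a real evaluation would use estimated parameters.

```python
import numpy as np

def one_step_ma1_forecasts(y, c, theta):
    """One-step-ahead MA(1) forecasts with fixed parameters, updated as each value arrives."""
    preds = np.empty(len(y))
    shock = 0.0  # the unknown pre-sample shock is set to zero
    for t, value in enumerate(y):
        preds[t] = c + theta * shock   # forecast made before seeing y[t]
        shock = value - preds[t]       # realised one-step error becomes the next shock
    return preds

# Synthetic MA(1) data with assumed parameters (not the gold series)
rng = np.random.default_rng(7)
eps = rng.normal(size=3001)
c_true, theta_true = 0.0, 0.5
y = c_true + eps[1:] + theta_true * eps[:-1]

split = 2000
test = y[split:]
preds = one_step_ma1_forecasts(y, c_true, theta_true)[split:]

rmse_one_step = np.sqrt(np.mean((test - preds) ** 2))
rmse_flat = np.sqrt(np.mean((test - c_true) ** 2))  # flat forecast at the constant

print(f"one-step RMSE:      {rmse_one_step:.3f}")
print(f"flat-forecast RMSE: {rmse_flat:.3f}")
```

With a fitted statsmodels results object, calling `append(new_obs, refit=False)` and then `forecast(steps=1)` is one way to get similar walk-forward behaviour; that variant is not shown here.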


## 9. Residual Diagnostics

In [23]:
# In-sample residuals
residuals_train = best_model.resid
print(f"In-Sample Residuals:")
print(f"  Mean: {residuals_train.mean():.6f}")
print(f"  Std: {residuals_train.std():.6f}")

# Out-of-sample residuals
residuals_test = test - best_forecast
print(f"\nOut-of-Sample Residuals:")
print(f"  Mean: {residuals_test.mean():.6f}")
print(f"  Std: {residuals_test.std():.6f}")

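# Check for remaining autocorrelation
# Note: `ci` was computed from len(gold_diff); a band matched to this shorter
# series would be 1.96 / np.sqrt(len(residuals_test)), which is wider.
# Because the multi-step forecast is nearly constant, these residuals largely
# mirror the autocorrelation of the test data itself.
residuals_acf = acf(residuals_test.dropna(), nlags=20)
sig_residuals = [i for i in range(1, 21) if abs(residuals_acf[i]) > ci]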

print(f"\nResidual ACF - Significant lags: {sig_residuals}")
if len(sig_residuals) == 0:
    print("  ✓ No significant autocorrelation")
else:
    print(f"  ⚠ Some autocorrelation at lags {sig_residuals[:5]}")

# Plot residuals
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=('Residuals Over Time', 'Histogram', 'ACF of Residuals', 'Q-Q Plot'),
)

# Residuals over time
fig.add_trace(
    go.Scatter(x=test.index, y=residuals_test, mode='lines+markers', name='Residuals'),
    row=1, col=1
)
fig.add_hline(y=0, line_dash='dash', line_color='red', row=1, col=1)

# Histogram
fig.add_trace(
    go.Histogram(x=residuals_test, name='Distribution', nbinsx=20),
    row=1, col=2
)

# ACF
fig.add_trace(
    go.Bar(x=list(range(len(residuals_acf))), y=residuals_acf, name='ACF'),
    row=2, col=1
)
fig.add_hline(y=ci, line_dash='dash', line_color='red', row=2, col=1)
fig.add_hline(y=-ci, line_dash='dash', line_color='red', row=2, col=1)

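# Q-Q plot (standardize the residuals so they are comparable to
# standard-normal quantiles and to the y = x reference line)
from scipy import stats
theoretical_quantiles = stats.norm.ppf(np.linspace(0.01, 0.99, len(residuals_test)))
standardized = (residuals_test - residuals_test.mean()) / residuals_test.std()
sample_quantiles = np.sort(standardized.values)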
fig.add_trace(
    go.Scatter(x=theoretical_quantiles, y=sample_quantiles, mode='markers', name='Q-Q'),
    row=2, col=2
)
fig.add_trace(
    go.Scatter(x=theoretical_quantiles, y=theoretical_quantiles, mode='lines', name='Ideal', line=dict(dash='dash', color='red')),
    row=2, col=2
)

fig.update_layout(height=700, showlegend=True, title_text="Residual Diagnostics")
fig.show()

print("✓ Residual diagnostics complete")

In-Sample Residuals:
  Mean: 0.000071
  Std: 14.199540

Out-of-Sample Residuals:
  Mean: 4.979909
  Std: 36.437584

Residual ACF - Significant lags: [1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 14, 16, 17, 18, 19]
  ⚠ Some autocorrelation at lags [1, 2, 3, 4, 6]


✓ Residual diagnostics complete
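
Two details matter when reading a residual ACF: the significance band depends on the length of the series being tested, and a joint test over many lags is usually more informative than counting individual crossings. The sketch below computes the band for both sample sizes used in this notebook and a hand-written Ljung-Box statistic, Q = n(n+2) Σ r_k² / (n-k), on synthetic series. The helpers `sample_acf` and `ljung_box_q` and the simulated data are assumptions for illustration, not the gold residuals.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r_0..r_nlags."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x * x)
    return np.array([np.sum(x[k:] * x[:len(x) - k]) / denom for k in range(nlags + 1)])

def ljung_box_q(x, nlags):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k)."""
    n = len(x)
    r = sample_acf(x, nlags)
    k = np.arange(1, nlags + 1)
    return float(n * (n + 2) * np.sum(r[1:] ** 2 / (n - k)))

# The band depends on the length of the series being tested
band_full = 1.96 / np.sqrt(2780)   # full differenced series
band_test = 1.96 / np.sqrt(556)    # test-set residuals

# Synthetic examples: white noise versus an AR(1) with coefficient 0.5
rng = np.random.default_rng(1)
white = rng.normal(size=556)
ar1 = np.zeros(556)
for t in range(1, 556):
    ar1[t] = 0.5 * ar1[t - 1] + white[t]

CHI2_95_DF20 = 31.41  # 95% critical value of chi-square with 20 degrees of freedom

print(f"band for n=2780: ±{band_full:.4f}, band for n=556: ±{band_test:.4f}")
print(f"Q(20) white noise: {ljung_box_q(white, 20):.1f}")
print(f"Q(20) AR(1):       {ljung_box_q(ar1, 20):.1f} (critical value {CHI2_95_DF20})")
```

When the statistic is applied to residuals from a fitted ARMA(p,q), the degrees of freedom are usually reduced to h - p - q. statsmodels also provides `acorr_ljungbox` in `statsmodels.stats.diagnostic` for the same test.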


## 10. Model Comparison

In [24]:
# Comparison table
comparison_data = []
for result in results_table:
    order = result['order']
    comparison_data.append({
        'Model': f"ARMA{order}",
        'AIC': result['aic'],
        'BIC': result['bic'],
        'RMSE': result['rmse']
    })

comparison_df = pd.DataFrame(comparison_data)
print("\nModel Comparison:")
print(comparison_df.to_string(index=False))

# Plot comparison
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=[f"ARMA{r['order']}" for r in results_table],
    y=[r['aic'] for r in results_table],
    mode='lines+markers',
    name='AIC',
    marker=dict(size=10)
))

fig.add_trace(go.Scatter(
    x=[f"ARMA{r['order']}" for r in results_table],
    y=[r['bic'] for r in results_table],
    mode='lines+markers',
    name='BIC',
    marker=dict(size=10)
))

fig.update_layout(
    title="Information Criteria Comparison",
    xaxis_title='Model',
    yaxis_title='Criterion Value',
    template='plotly_white',
    height=500
)
fig.show()

print(f"\n✓ Best model: ARMA{best_order} with BIC={best_model.bic:.2f}")


Model Comparison:
     Model          AIC          BIC      RMSE
ARMA(1, 0) 18118.062628 18135.183816 36.743841
ARMA(2, 0) 18115.850954 18138.679204 36.743549
ARMA(0, 1) 18117.914204 18135.035392 36.743831
ARMA(0, 2) 18115.707082 18138.535333 36.743573
ARMA(1, 1) 18117.305520 18140.133771 36.743482
ARMA(2, 1) 18117.267543 18145.802856 36.743581



✓ Best model: ARMA(0, 1) with BIC=18135.04
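
Information criteria are easier to compare as differences from the best model than as raw values. The sketch below computes ΔBIC from the table above. The interpretation that gaps below roughly 2 points are weak evidence is a common rule of thumb, not a result from this notebook.

```python
# BIC values copied from the comparison table
bic = {
    "ARMA(1, 0)": 18135.183816,
    "ARMA(2, 0)": 18138.679204,
    "ARMA(0, 1)": 18135.035392,
    "ARMA(0, 2)": 18138.535333,
    "ARMA(1, 1)": 18140.133771,
    "ARMA(2, 1)": 18145.802856,
}

# Difference from the lowest BIC; 0 marks the selected model
best = min(bic, key=bic.get)
delta = {name: value - bic[best] for name, value in bic.items()}

for name, d in sorted(delta.items(), key=lambda kv: kv[1]):
    print(f"{name}: ΔBIC = {d:.2f}")
```

ARMA(0, 1) and ARMA(1, 0) are separated by about 0.15 BIC points, so the data barely distinguish between them; only ARMA(2, 1) is clearly penalised.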


## 11. Key Insights

In [25]:
print("="*70)
print("KEY INSIGHTS: ARMA MODELS")
print("="*70)

print(f"\nARMA Model Structure:")
print(f"  - ARMA(p,q): Combines AR(p) and MA(q) components")
print(f"  - AR: Models dependency on past values")
print(f"  - MA: Models dependency on past forecast errors")
print(f"  - Formula: y(t) = c + Σφᵢy(t-i) + εₜ + Σθⱼε(t-j)")

print(f"\nOptimal Model Selection:")
print(f"  - Best: ARMA{best_order} (by BIC)")
print(f"  - BIC: {best_model.bic:.2f}")
print(f"  - RMSE: {rmse:.6f}")
print(f"  - Improvement: {improvement:.2f}% over naive")

print(f"\nACF/PACF Identification:")
print(f"  - AR(p): PACF cuts off at p, ACF decays")
print(f"  - MA(q): ACF cuts off at q, PACF decays")
print(f"  - ARMA(p,q): Both ACF and PACF decay")

print(f"\nWhen to Use ARMA:")
print(f"  ✓ Stationary series")
print(f"  ✓ Mixed autocorrelation patterns")
print(f"  ✓ When pure AR or MA insufficient")
print(f"  ✗ Trending data (use ARIMA)")
print(f"  ✗ Seasonal data (use SARIMA)")

print(f"\nProgression:")
print(f"  Day 12: AR - Past values only")
print(f"  Day 13: ARMA - Past values + errors ← HERE")
print(f"  Day 14: ARIMA - ARMA + differencing")

print("\n" + "="*70)
print("✓ Day 13 Analysis Complete!")
print("="*70)

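======================================================================
KEY INSIGHTS: ARMA MODELS
======================================================================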

ARMA Model Structure:
  - ARMA(p,q): Combines AR(p) and MA(q) components
  - AR: Models dependency on past values
  - MA: Models dependency on past forecast errors
  - Formula: y(t) = c + Σφᵢy(t-i) + εₜ + Σθⱼε(t-j)

Optimal Model Selection:
  - Best: ARMA(0, 1) (by BIC)
  - BIC: 18135.04
  - RMSE: 36.743831
  - Improvement: -0.91% over naive

ACF/PACF Identification:
  - AR(p): PACF cuts off at p, ACF decays
  - MA(q): ACF cuts off at q, PACF decays
  - ARMA(p,q): Both ACF and PACF decay

When to Use ARMA:
  ✓ Stationary series
  ✓ Mixed autocorrelation patterns
  ✓ When pure AR or MA insufficient
  ✗ Trending data (use ARIMA)
  ✗ Seasonal data (use SARIMA)

Progression:
  Day 12: AR - Past values only
  Day 13: ARMA - Past values + errors ← HERE
  Day 14: ARIMA - ARMA + differencing

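======================================================================
✓ Day 13 Analysis Complete!
======================================================================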
