# About Unit 7

Welcome to the Marquette University AIM time series analysis curriculum! In this unit, you will learn about **Seasonal ARIMA (SARIMA)** models and how to extend them with exogenous variables (**SARIMAX**). These models handle both seasonality and external predictors in time series forecasting.

By the end of this unit, you will:
- Understand SARIMA and SARIMAX models.
- Incorporate seasonality and external variables into time series models.
- Apply model selection and diagnostics.

# Getting Started

**Import Packages**

Run the following code to import the packages used in this unit. You need a Python 3 kernel with NumPy, pandas, Matplotlib, and statsmodels installed.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.stats.diagnostic import acorr_ljungbox

# SARIMA Models

### Seasonal ARIMA (SARIMA)
SARIMA models extend ARIMA by including seasonal components:
\[ \text{SARIMA}(p, d, q)(P, D, Q, s) \]
- \( p, d, q \): Non-seasonal ARIMA parameters.
- \( P, D, Q \): Seasonal ARIMA parameters.
- \( s \): Seasonal period.
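
For reference, the notation above is commonly expanded with the backshift operator \( B \), where \( B Y_t = Y_{t-1} \). The symbols \( \phi_p, \theta_q, \Phi_P, \Theta_Q \) and \( \varepsilon_t \) are introduced here for illustration and are not used elsewhere in this unit. Sign conventions for the polynomial coefficients vary between textbooks and software.
\[ \Phi_P(B^s)\, \phi_p(B)\, (1 - B)^d (1 - B^s)^D Y_t = \Theta_Q(B^s)\, \theta_q(B)\, \varepsilon_t \]
- \( \phi_p(B), \theta_q(B) \): Non-seasonal AR and MA polynomials of orders \( p \) and \( q \).
- \( \Phi_P(B^s), \Theta_Q(B^s) \): Seasonal AR and MA polynomials of orders \( P \) and \( Q \), acting at lags \( s, 2s, \dots \).
- \( (1 - B)^d \) and \( (1 - B^s)^D \): Non-seasonal and seasonal differencing.
- \( \varepsilon_t \): White-noise error term.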

### Key Components:
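- **Differencing (D)**: Seasonal differencing subtracts the value from one season earlier (\( Y_t - Y_{t-s} \)). This removes a stable repeating pattern and helps make the series stationary.
- **AR (P)** and **MA (Q)** terms: Model the dependence on past values and past forecast errors at seasonal lags (\( s, 2s, \dots \)).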
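
To make seasonal differencing concrete, here is a minimal sketch on a hypothetical monthly series (the generator seed and variable names are illustrative assumptions, not part of the unit's dataset). It applies one seasonal difference by hand and compares the spread of the series before and after.

```python
import numpy as np

# Hypothetical monthly series: level 10, a 12-period sine cycle, unit-variance noise
rng = np.random.default_rng(0)
s = 12
t = np.arange(1, 121)
y = 10 + 5 * np.sin(2 * np.pi * t / s) + rng.standard_normal(len(t))

# One seasonal difference (D = 1): subtract the value from one full season earlier
seasonal_diff = y[s:] - y[:-s]

# The sine cycle repeats every s steps, so it cancels and only differenced noise remains
print(f"std before differencing: {y.std():.2f}")
print(f"std after differencing:  {seasonal_diff.std():.2f}")
print(f"observations lost:       {len(y) - len(seasonal_diff)}")
```

Each seasonal difference costs \( s \) observations, which is one reason \( D \) is rarely set above 1.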

### Steps:
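1. Identify the seasonal period \( s \) (for example, 12 for monthly data with a yearly cycle).
2. Choose the differencing orders \( d, D \) and the AR/MA orders \( p, q, P, Q \), then fit the SARIMA model.
3. Run diagnostics on the residuals and refine the model if they show remaining structure.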

In [None]:
# Simulate Seasonal Data
np.random.seed(42)
time = np.arange(1, 121)
seasonal_data = 10 + 5 * np.sin(2 * np.pi * time / 12) + np.random.randn(len(time))

# Plot seasonal data
plt.plot(time, seasonal_data)
plt.title('Simulated Seasonal Data')
plt.show()

In [None]:
# Fit SARIMA Model
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Fit SARIMA(1,1,1)(1,1,1,12)
sarima_model = SARIMAX(seasonal_data, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
sarima_result = sarima_model.fit()

# Print summary
print(sarima_result.summary())

# Model Diagnostics

### Residual Analysis
Residuals from a well-fitted SARIMA model should resemble white noise. They should:
- Be uncorrelated.
- Have a mean close to zero.
- Have constant variance.

### Ljung-Box Test
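The Ljung-Box test checks for residual autocorrelation. Its null hypothesis is that the residuals are uncorrelated up to the tested lag, so a small p-value (for example, below 0.05) indicates leftover autocorrelation that the model has not captured.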

In [None]:
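# Residual Analysis
# Drop the first d + D*s = 1 + 1*12 = 13 residuals: they are dominated by the
# initialization of the differenced model rather than by how well the model fits.
residuals = sarima_result.resid[13:]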

# Plot residuals
plt.figure(figsize=(10, 6))
plt.plot(residuals)
plt.title('SARIMA Model Residuals')
plt.show()

# Perform Ljung-Box test
ljung_box_results = acorr_ljungbox(residuals, lags=[10], return_df=True)
print(ljung_box_results)

# SARIMAX Models

### SARIMAX: SARIMA with Exogenous Variables
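SARIMAX models add exogenous predictors to SARIMA models, allowing for external influences. In statsmodels, the model is a regression on the predictors whose error term follows a SARIMA process:
\[ Y_t = X_t \beta + u_t, \quad u_t \sim \text{SARIMA}(p, d, q)(P, D, Q, s) \]
- \( X_t \): Exogenous variables (predictors).
- \( \beta \): Coefficients for predictors.
- \( u_t \): Regression error, modeled as a SARIMA process.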

In [None]:
# Example: SARIMAX with Exogenous Variable
# Simulate an exogenous variable (pure noise, unrelated to seasonal_data,
# so its fitted coefficient should be close to zero)
exog = np.random.randn(len(seasonal_data))

# Fit SARIMAX(1,1,1)(1,1,1,12) with exogenous variable
sarimax_model = SARIMAX(seasonal_data, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12), exog=exog)
sarimax_result = sarimax_model.fit()

# Print summary
print(sarimax_result.summary())

# Forecasting with SARIMAX

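Forecasting with SARIMAX uses both the history of the series and the exogenous variables. Because the model needs a predictor value at every forecast step, you must supply future values of the exogenous variables for the whole forecast horizon.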

In [None]:
# Forecasting with SARIMAX
forecast_steps = 12
future_exog = np.random.randn(forecast_steps)  # Simulate future exogenous variable
forecast = sarimax_result.get_forecast(steps=forecast_steps, exog=future_exog)
forecast_mean = forecast.predicted_mean
forecast_ci = forecast.conf_int()

# Plot forecast
plt.plot(seasonal_data, label='Original Data')
plt.plot(range(len(seasonal_data), len(seasonal_data) + forecast_steps), forecast_mean,
         label='Forecast', color='red')
plt.fill_between(range(len(seasonal_data), len(seasonal_data) + forecast_steps),
                 forecast_ci.iloc[:, 0], forecast_ci.iloc[:, 1],
                 color='pink', alpha=0.3, label='95% CI')
plt.legend()
plt.title('SARIMAX Forecast')
plt.show()

# Summary

In this unit, you learned about **Seasonal ARIMA (SARIMA)** and **SARIMAX** models:
- SARIMA models handle seasonality in time series.
- SARIMAX models incorporate exogenous predictors.
- Residual diagnostics check whether a fitted model is adequate, and forecasts with confidence intervals quantify the uncertainty of its predictions.

These models are powerful tools for time series analysis and forecasting in complex scenarios.