# ARIMA (package [`statsmodels`](https://www.statsmodels.org/stable/index.html))

Specification of ARIMA(p,d,q) (ARIMA = AutoRegressive Integrated Moving Average)

$$
\begin{aligned}
	\Delta ^d y_t &= (\alpha_0+\alpha_1 t)+\sum_{j=1}^p\phi_j\Delta^d y_{t-j}+u_t+\sum_{s=1}^q\theta_s u_{t-s} &
	u_t&\sim WN(0,\sigma^2)
\end{aligned}
$$
where
* p is the order of the autoregressive part
* d is the order of integration
* q is the order of the moving-average part
* $\alpha_0$ is the intercept (constant) if d=0, or the drift if d>0
* $\alpha_1$ is the coefficient of the linear time trend

Specification using the lag operator

$$
\begin{aligned}
	\phi_p(L)(1-L)^dy_t&=(\alpha_0+\alpha_1t)+\theta_q(L)u_t & u_t&\sim WN(0,\sigma^2)
\end{aligned}
$$
with polynomials
$$
\begin{aligned}
	\phi_p(z)&=1-\phi_1z-\cdots-\phi_pz^p & \theta_q(z)&=1+\theta_1z+\cdots+\theta_qz^q
\end{aligned}
$$
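
To see how the lag-operator form maps back to a difference equation, the product $\phi_p(L)(1-L)^d$ can be expanded numerically. The sketch below uses assumed, purely illustrative coefficients ($\phi_1=0.5$, $\phi_2=0.2$, $d=1$) and only `numpy`; it is not part of the fitted model later in this notebook.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Illustrative (assumed) AR coefficients: phi_1 = 0.5, phi_2 = 0.2
phi = [1.0, -0.5, -0.2]   # phi_p(z) = 1 - 0.5 z - 0.2 z^2, ascending powers of z
diff = [1.0, -1.0]        # the differencing polynomial (1 - z)
d = 1                     # integration order

# Multiply phi_p(z) by (1 - z) d times
lhs = np.array(phi)
for _ in range(d):
    lhs = P.polymul(lhs, diff)

# Coefficients of 1, z, z^2, z^3 in phi_p(z) (1 - z)^d
print(lhs)
```

The resulting coefficients are those applied to $y_t, y_{t-1}, y_{t-2}, y_{t-3}$ on the left-hand side of the model.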

In [2]:
import numpy as np
import pandas as pd

from statsmodels.tsa.api import ARIMA
from statsmodels.stats.api import het_arch, acorr_ljungbox
from statsmodels.graphics.tsaplots import plot_predict

import pandas_datareader.data as web

# visualization settings
import matplotlib.pyplot as plt

# suppress all warnings
import warnings
warnings.simplefilter(action='ignore', category=Warning)
# suppress ValueWarning and ConvergenceWarning from statsmodels
from statsmodels.tools.sm_exceptions import ValueWarning, ConvergenceWarning
warnings.simplefilter('ignore', category=ValueWarning)
warnings.simplefilter('ignore', category=ConvergenceWarning)

We use the following classes
* [ARIMA](https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima.model.ARIMA.html#statsmodels.tsa.arima.model.ARIMA) (specified model)
* [ARIMAResults](https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima.model.ARIMAResults.html#statsmodels.tsa.arima.model.ARIMAResults) (fitted model)

We need to set the parameters `order` (the model order `(p, d, q)`) and `trend`:

|d|parameter|`trend`|
|-|-|-|
|0|$\alpha_0=\alpha_1=0$|'n'|
|0|$\alpha_0\ne0,\alpha_1=0$|'c'|
|0|$\alpha_0,\alpha_1\ne0$|'ct'|
|>0|$\alpha_0=\alpha_1=0$|'n'|
|>0|$\alpha_0\ne0,\alpha_1=0$|'t'|

## Fitting of ARIMA of given order

Let's import weekly data on the Market Yield on U.S. Treasury Securities at 10-Year Constant Maturity (symbol [`WGS10YR`](https://fred.stlouisfed.org/series/WGS10YR)) from [`FRED`](https://fred.stlouisfed.org/) for the period 2000-01-01 to 2023-12-31 into the DataFrame `y`

In [3]:
y = web.DataReader(name='WGS10YR', data_source='fred', start='2000-01-01', end='2023-12-31')

Consider an ARIMA(2,1,1) model without drift for the series `y`

Model specification

$$
	(1-\phi_1L-\phi_2 L^2)(1-L) y_t=u_t+\theta u_{t-1}
$$
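
Expanding the lag polynomials gives the equivalent difference-equation form, first in differences and then in levels (a worked expansion of the specification above):

$$
\begin{aligned}
	\Delta y_t&=\phi_1\Delta y_{t-1}+\phi_2\Delta y_{t-2}+u_t+\theta u_{t-1}\\
	y_t&=(1+\phi_1)y_{t-1}+(\phi_2-\phi_1)y_{t-2}-\phi_2y_{t-3}+u_t+\theta u_{t-1}
\end{aligned}
$$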

In [None]:
# specify the model
mod = ARIMA(y, order=(2,1,1), trend='n', missing='drop')
# fit the model
res = mod.fit()
# estimation report
res.summary(alpha=0.05)

## Diagnostics of the fitted model

Diagnostic plots

In [None]:
res.plot_diagnostics(lags=15)

plt.show()

Portmanteau test for serial correlation (Ljung-Box test). Let the number of lags be 7.

In [None]:
# degrees-of-freedom correction: number of estimated coefficients = number of parameters - 1 (sigma2 is excluded)
model_df = mod.k_params-1
# the first d residuals are dropped before testing (d = mod.k_diff)
acorr_ljungbox(res.resid.iloc[mod.k_diff:], lags=[7], model_df=model_df)

Test for heteroskedasticity (ARCH effects)

In [None]:
# degrees-of-freedom correction: number of estimated coefficients = number of parameters - 1 (sigma2 is excluded)
model_df = mod.k_params-1
# the first d residuals are dropped before testing (d = mod.k_diff)
lm_stat, lm_pval, f_stat, f_pval = het_arch(res.resid.iloc[mod.k_diff:], nlags=7, ddof=model_df)

lm_stat, lm_pval
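
To get a feel for how the test behaves, it may help to apply `het_arch` to two simulated series: homoskedastic noise and an ARCH(1) process. This is a sketch under assumed parameters (`omega`, `a`, and the sample size are illustrative, not estimated from the data above).

```python
import numpy as np
from statsmodels.stats.api import het_arch

rng = np.random.default_rng(7)
n = 2000
z = rng.standard_normal(n)     # homoskedastic white noise

# ARCH(1): e_t = z_t * sqrt(omega + a * e_{t-1}^2), with assumed omega and a
omega, a = 0.5, 0.5
e_arch = np.zeros(n)
for t in range(1, n):
    e_arch[t] = z[t] * np.sqrt(omega + a * e_arch[t - 1] ** 2)

# H0: no ARCH effects up to lag 7
lm_iid, p_iid, _, _ = het_arch(z, nlags=7)
lm_arch, p_arch, _, _ = het_arch(e_arch, nlags=7)

print('white noise: LM =', lm_iid, 'p-value =', p_iid)
print('ARCH(1):     LM =', lm_arch, 'p-value =', p_arch)
```

A small p-value, as expected for the ARCH(1) series, leads to rejecting the null hypothesis of no ARCH effects.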

## Forecasting

Let's forecast 10 periods ahead

In [None]:
forecasts = res.forecast(steps=10)
forecasts

Visualization of forecasts

In [None]:
# start and end are inclusive, so len(y) .. len(y)+9 covers 10 out-of-sample periods
plot_predict(res, start=len(y), end=len(y)+9, alpha=0.05)

plt.show()

In [None]:
plt.plot(y.tail(30))
plt.plot(forecasts)

plt.show()