## Moving Average (MA Models)

(Mainly from *Introduction to Time Series Forecasting with Python* by Jason Brownlee.)

A moving average process, or the moving average (MA) model, states that the current value is linearly dependent on the current and past error terms. The error terms are assumed to be mutually independent and normally distributed, just like white noise.

The residual errors from forecasts on a time series provide another source of information that we can model. Residual errors themselves form a time series that can have temporal structure. A simple autoregression model of this structure can be used to predict the forecast error, which in turn can be used to correct forecasts.

**Residual Error:** 
The difference between what was expected and what was predicted is called the residual error. It is calculated as:
$residual\_error = expected-predicted$
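As a quick illustration of the formula, the sketch below subtracts forecasts from observations element by element. The numbers are made up for the example and do not come from any dataset.

```python
import numpy as np

# Hypothetical observed values and the forecasts that were made for them.
expected = np.array([35.0, 32.0, 30.0, 31.0, 44.0])
predicted = np.array([33.0, 35.0, 32.0, 30.0, 31.0])

# residual_error = expected - predicted, computed per time step
residual_error = expected - predicted
print(residual_error.tolist())  # [2.0, -3.0, -2.0, 1.0, 13.0]
```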

A moving average model is denoted as $MA(q)$, where $q$ is the order. The order $q$ determines the number of past error terms that affect the present value. The model expresses the present value as a linear combination of the mean of the series $\mu$, the present error term $\epsilon_t$, and the past error terms $\epsilon_{t-1}, \ldots, \epsilon_{t-q}$:

$y_t = \mu + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} +\ldots+ \theta_q \epsilon_{t-q}$
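To make the equation concrete, here is a minimal sketch that simulates an $MA(q)$ series with NumPy. The helper name `simulate_ma` and the parameter values are assumptions chosen for illustration; they are not part of the original notebook.

```python
import numpy as np

def simulate_ma(mu, thetas, n, sigma=1.0, seed=0):
    """Simulate y_t = mu + eps_t + theta_1*eps_{t-1} + ... + theta_q*eps_{t-q}."""
    rng = np.random.default_rng(seed)
    q = len(thetas)
    # Draw q extra errors so the first observation already has q past errors.
    eps = rng.normal(0.0, sigma, size=n + q)
    # Weight 1 on eps_t, theta_k on eps_{t-k}.
    weights = np.concatenate(([1.0], thetas))
    # mode="valid" yields, for each t, the sum over k of weights[k] * eps[t-k].
    return mu + np.convolve(eps, weights, mode="valid")

# Illustrative MA(2) with mu = 10, theta_1 = 0.6, theta_2 = 0.3
y = simulate_ma(mu=10.0, thetas=[0.6, 0.3], n=1000)
print(y.shape)
print(round(float(y.mean()), 2))  # should sit close to mu
```

Because every error term has mean zero, the sample mean of the simulated series should land near $\mu$.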

```python
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.stattools import adfuller
```

```python
# calculate residual errors for a persistence forecast model
from pandas import read_csv
from pandas import DataFrame
from pandas import concat
from sklearn.metrics import mean_squared_error
from math import sqrt
```

**Load the data and plot it**

**Get the ADF test result, plot the acf and pacf**

**Create a Persistence Model for Births Dataset**
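A persistence model predicts that the next value equals the current one. The sketch below shows one way to set it up with a lagged frame. The short `series` is a stand-in with illustrative values (an assumption, since the data file is not shown here); replace it with the loaded series.

```python
import numpy as np
import pandas as pd

# Stand-in values; swap in the real series once it is loaded.
series = pd.Series([35, 32, 30, 31, 44, 29, 45, 43, 38, 27], dtype=float)

# Lagged frame: column "t" holds the previous observation, "t+1" the value to predict.
frame = pd.concat([series.shift(1), series], axis=1)
frame.columns = ["t", "t+1"]
frame = frame.dropna()

# Persistence forecast: the prediction for t+1 is simply the value at t.
predicted = frame["t"].to_numpy()
expected = frame["t+1"].to_numpy()

# Residual errors of the persistence forecast and their RMSE.
residuals = expected - predicted
rmse = float(np.sqrt(np.mean(residuals ** 2)))
print(residuals.tolist())
print(round(rmse, 3))
```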

**Autoregression of Residual Error**
We can model the residual error time series using an autoregression model. This is a linear regression model that creates a weighted linear sum of lagged residual error terms. For example:
$error(t+1) = b_0 + b_1 \cdot error(t) + b_2 \cdot error(t-1) + \ldots + b_n \cdot error(t-n+1)$
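The weighted sum of lagged errors can be fitted by ordinary least squares. The sketch below does this with NumPy alone; `fit_ar` and `predict_next` are hypothetical helper names, and the residual series is simulated for illustration rather than taken from a real forecast.

```python
import numpy as np

def fit_ar(errors, n_lags):
    """Least-squares fit of error(t+1) = b_0 + b_1*error(t) + b_2*error(t-1) + ..."""
    errors = np.asarray(errors, dtype=float)
    # Each row holds the n_lags most recent errors, newest first.
    rows = [errors[i - n_lags:i][::-1] for i in range(n_lags, len(errors))]
    X = np.column_stack([np.ones(len(rows)), np.array(rows)])  # intercept + lags
    y = errors[n_lags:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [b_0, b_1, ..., b_n]

def predict_next(errors, coef):
    """One-step-ahead error prediction from the most recent len(coef) - 1 errors."""
    n_lags = len(coef) - 1
    recent = np.asarray(errors[-n_lags:], dtype=float)[::-1]  # newest first
    return float(coef[0] + recent @ coef[1:])

# Simulated residual errors with negative lag-1 dependence (illustrative only).
rng = np.random.default_rng(1)
errors = np.zeros(300)
for t in range(1, 300):
    errors[t] = -0.5 * errors[t - 1] + rng.normal()

coef = fit_ar(errors, n_lags=2)
print(coef.round(2))
print(round(predict_next(errors, coef), 3))
```

In practice a library estimator could replace the hand-rolled fit; the point here is only to show how the lagged design matrix lines up with the formula.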

As usual, the first step is to gather the data. Next, test for stationarity; if the series is not stationary, apply transformations such as differencing until it is. Then plot the ACF and look for significant autocorrelation coefficients. If the stationary series is white noise (for example, a random walk after first differencing), there are no significant coefficients after lag 0. If there are significant coefficients, check whether they become abruptly non-significant after some lag $q$. If so, the series can be modeled as a moving average process of order $q$.

**Use the persistence Model, find the residuals, and create an MA model for the residuals**

**Correct Predictions with a Model of Residuals**
We can add the expected forecast error to a prediction to correct it and, in turn, improve the skill of the model.

$improved\_forecast = forecast + estimated\_error$
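The sketch below walks through the correction end to end on synthetic data: a persistence forecast, an AR(1) model of its residuals fitted on a training split, and the corrected forecast on the test split. The series, the split point, and the choice of a single lag are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic stand-in series: noisy values around a constant level.
series = 40.0 + rng.normal(0.0, 7.0, size=3000)

# Persistence forecast for each step is the previous observation.
forecast = series[:-1]
actual = series[1:]
residual = actual - forecast

split = 1000
train_res, test_res = residual[:split], residual[split:]
test_forecast, test_actual = forecast[split:], actual[split:]

# AR(1) model of the residuals, fitted on the training part only:
# error(t+1) = b_0 + b_1 * error(t)
b1, b0 = np.polyfit(train_res[:-1], train_res[1:], deg=1)

# The estimated error for each test step uses the previously observed residual.
prev_res = np.concatenate(([train_res[-1]], test_res[:-1]))
estimated_error = b0 + b1 * prev_res

# improved_forecast = forecast + estimated_error
improved_forecast = test_forecast + estimated_error

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

print("persistence RMSE:", round(rmse(test_actual, test_forecast), 3))
print("corrected RMSE:  ", round(rmse(test_actual, improved_forecast), 3))
```

On this kind of series the persistence residuals are negatively autocorrelated, so the fitted `b1` comes out negative and the correction pulls each forecast back toward the level of the series.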

**Use `from statsmodels.tsa.arima.model import ARIMA` to train an ARIMA model**

**Use `from statsmodels.tsa.statespace.sarimax import SARIMAX` to train a SARIMAX model**

**Try seasonality adjustment on your SARIMAX model**