## Fitted values and residuals
***
## Fitted values
Each observation in a time series can be forecast using all previous observations. We call these fitted values, and they are denoted by $\hat{y}_{t|t-1}$, meaning the forecast of $y_{t}$ based on observations $y_{1},\dots,y_{t-1}$. We use these so often that we sometimes drop part of the subscript and just write $\hat{y}_{t}$ instead of $\hat{y}_{t|t-1}$. Fitted values almost always involve one-step forecasts.

Actually, fitted values are often not true forecasts because any parameters involved in the forecasting method are estimated using all available observations in the time series, including future observations. For example, if we use the mean method, the fitted values are given by:

\begin{gather*} 
\hat{y}_{t}=\hat{c}
\end{gather*}

where $\hat{c}$ is the average computed over all available observations, including those at times after $t$. Similarly, for the drift method, the drift parameter is estimated using all available observations. In this case, the fitted values are given by

\begin{gather*} 
\hat{y}_{t}=y_{t-1}+\hat{c}
\end{gather*}

where $\hat{c}=(y_{T}-y_{1})/(T-1)$. In both cases, there is a parameter to be estimated from the data. The “hat” above the $c$ reminds us that this is an estimate. When the estimate of $c$ involves observations after time $t$, the fitted values are not true forecasts. On the other hand, naïve or seasonal naïve forecasts do not involve any parameters, and so fitted values are true forecasts in such cases.
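To make the distinction concrete, the following is a minimal sketch that computes fitted values by hand for the mean, drift and naïve methods. The toy series `y` and all variable names are hypothetical and serve only as an illustration. The drift fitted values use $\hat{y}_{t}=y_{t-1}+\hat{c}$ with $\hat{c}=(y_{T}-y_{1})/(T-1)$, so the first observation has no fitted value.

```python
import numpy as np

# Hypothetical toy series, used only for illustration.
y = np.array([10.0, 12.0, 11.0, 15.0, 14.0])
T = len(y)

# Mean method: every fitted value is the average of ALL observations,
# including those that come after time t.
c_mean = y.mean()
fitted_mean = np.full(T, c_mean)

# Drift method: the drift estimate uses the first and last observations,
# so it also depends on data after time t.
c_drift = (y[-1] - y[0]) / (T - 1)
fitted_drift = np.full(T, np.nan)   # no fitted value for the first observation
fitted_drift[1:] = y[:-1] + c_drift

# Naive method: no parameter is estimated, so each fitted value
# is a true one-step forecast.
fitted_naive = np.full(T, np.nan)
fitted_naive[1:] = y[:-1]

print(fitted_mean)
print(fitted_drift)
print(fitted_naive)
```

Only the naïve fitted values could have been produced in real time: both `c_mean` and `c_drift` would change if later observations were different.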

## Residuals

The “residuals” in a time series model are what is left over after fitting a model. The residuals are equal to the difference between the observations and the corresponding fitted values:

\begin{gather*} 
e_{t}=y_{t}-\hat{y}_{t}.
\end{gather*}

If a transformation has been used in the model, then it is often useful to look at residuals on the transformed scale. We call these **“innovation residuals”**. For example, suppose we modelled the logarithms of the data, $w_{t}=\log(y_{t})$. Then the innovation residuals are given by $w_{t}-\hat{w}_{t}$, whereas the regular residuals are given by $y_{t}-\hat{y}_{t}$. If no transformation has been used, then the innovation residuals are identical to the regular residuals, and in such cases we will simply call them “residuals”.
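The difference between the two kinds of residuals can be sketched with a small example. Here the mean method is applied to the logarithms of a hypothetical toy series; the data and variable names are assumptions made for illustration, and the fitted values are back-transformed with a plain `np.exp`, with no bias adjustment.

```python
import numpy as np

# Hypothetical toy series, used only for illustration.
y = np.array([100.0, 110.0, 121.0, 133.1])

# Model the logarithms of the data: w_t = log(y_t).
w = np.log(y)

# Mean method on the transformed scale.
w_hat = np.full_like(w, w.mean())

# Innovation residuals live on the transformed (log) scale.
innovation_resid = w - w_hat

# Regular residuals live on the original scale, so the fitted values
# are back-transformed first (simple back-transform, an assumption here).
y_hat = np.exp(w_hat)
regular_resid = y - y_hat

print(innovation_resid)
print(regular_resid)
```

The two sets of residuals differ in scale and in shape: the innovation residuals sum to zero because the mean was fitted on the log scale, while the regular residuals generally do not.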

The fitted values and residuals from a model can be obtained using the `augment()` function. In the beer production example in Section 2, we saved the fitted models as `beer_fit`. So we can simply apply `augment()` to this object to compute the fitted values and residuals for all models.

In [1]:
# Import the libraries that we are going to use for the analysis:
import pandas as pd
import numpy as np

from statsforecast import StatsForecast
from statsforecast.models import HistoricAverage

In [2]:
# Create a dataframe from a csv file:
df = pd.read_csv("Assets/aus-production.csv", sep=";")

# Create a dataframe with beer production: 
beer = df[["Quarter","Beer"]]

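# Lower bound: keep quarters from 1992 Q1 onwards.
# The "YYYY Qn" strings sort in chronological order, so string comparison works.
beer_mask = beer["Quarter"] >= "1992 Q1"
filtered_beer = beer[beer_mask]

# Upper bound: keep quarters up to and including 2006 Q4.
beer_mask = filtered_beer["Quarter"] <= "2006 Q4"
beer = filtered_beer[beer_mask]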

# Create an array with the observed values
y_beer = beer["Beer"].values

# Mean method:
model = HistoricAverage()
model = model.fit(y=y_beer)
mean = model.predict(h=14)

In [3]:
def augment(model_name, time_var, obs_values):
    '''
    Return a DataFrame with five columns (model, time_var, obs_values,
    fitted_values and residuals). The function takes 3 arguments:

        * model_name: String. The kind of model being used (Mean, Naive, SNaive, ETS, etc.).

        * time_var: Column of the input dataframe or numpy array. The time index of the observed values.

        * obs_values: Column of the input dataframe or numpy array. The observed values.

    Note: the fitted values are taken from the global `model` object fitted in the previous cell.
    '''
    # In-sample (fitted) values produced by the fitted model.
    fitted_values = model.predict_in_sample()["mean"]

    # Residuals: observed values minus fitted values.
    residuals = obs_values - fitted_values

    augment_df = pd.DataFrame({'model': [model_name] * len(obs_values),
                               'time_var': time_var,
                               'obs_values': obs_values,
                               'fitted_values': fitted_values,
                               'residuals': residuals})

    return augment_df

In [4]:
# augment(model_name, time_var, obs_values)
augment_df = augment("Mean",beer["Quarter"],beer["Beer"])

augment_df.tail()

| | model | time_var | obs_values | fitted_values | residuals |
|---|---|---|---|---|---|
| 199 | Mean | 2005 Q4 | 482 | 436.450012 | 45.549988 |
| 200 | Mean | 2006 Q1 | 438 | 436.450012 | 1.549988 |
| 201 | Mean | 2006 Q2 | 386 | 436.450012 | -50.450012 |
| 202 | Mean | 2006 Q3 | 405 | 436.450012 | -31.450012 |
| 203 | Mean | 2006 Q4 | 491 | 436.450012 | 54.549988 |


There are two new columns added to the original data:

+ `fitted_values` contains the fitted values;
+ `residuals` contains the residuals.

If patterns are observable in the residuals, the model can probably be improved. We will look at some tools for exploring patterns in residuals in the next section.