# Tutorial 4: Autoregression

Autoregression is a time series model that uses observations from previous time steps as input to a regression equation to predict the value at the next time step.
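In its classical linear form, an autoregressive model of order $p$, written AR($p$), can be expressed as below. This is the textbook formulation, given here only for orientation. The symbols $c$, $\varphi_i$ and $\varepsilon_t$ are notation introduced for this sketch, and NeuralProphet's own autoregression module may differ in detail.

$$
y_t = c + \sum_{i=1}^{p} \varphi_i \, y_{t-i} + \varepsilon_t
$$

Here $y_{t-i}$ is the observation $i$ steps in the past, $\varphi_i$ is the weight given to that lag, $c$ is a constant offset, and $\varepsilon_t$ is the part of $y_t$ that the past values do not explain. The `n_lags` parameter used later in this tutorial roughly plays the role of $p$.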

We start with the same model as in the previous tutorial.

In [None]:
import pandas as pd
from neuralprophet import NeuralProphet, set_log_level

# Disable logging messages unless there is an error
set_log_level("ERROR")

# Load the dataset from the CSV file using pandas
df = pd.read_csv("https://github.com/ourownstory/neuralprophet-data/raw/main/kaggle-energy/datasets/tutorial01.csv")

# Limit training to 10 epochs for faster training
EPOCHS = 10

# Model and prediction
m = NeuralProphet(
    epochs=EPOCHS,
    # Allow 10 trend changepoints
    n_changepoints=10,
    # Enable yearly, weekly, and daily seasonality
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True,
)
metrics = m.fit(df)
forecast = m.predict(df)
m.plot(forecast)

To better understand the remaining mismatch between our model and the real data, we can look at the residuals. The residuals are the differences between the real data and the model's predictions (here, `y - yhat1`). For a perfect model, every residual would be zero.

In [None]:
df_residuals = pd.DataFrame({
    "ds": df["ds"],
    "residuals": df["y"] - forecast["yhat1"]
})
df_residuals.plot(x="ds", y="residuals", backend="plotly")
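Besides plotting the residuals, it can help to condense them into a few numbers. The following is a minimal sketch using NumPy on made-up values that stand in for `df["y"]` and `forecast["yhat1"]`; the arrays and the variable names `mae`, `rmse` and `bias` are assumptions for illustration, not part of the tutorial's dataset.

```python
import numpy as np

# Toy stand-ins for df["y"] and forecast["yhat1"] (values invented for illustration)
y_true = np.array([10.0, 12.0, 11.0, 13.0, 12.0])
y_pred = np.array([9.0, 12.5, 11.0, 12.0, 13.0])

# Same sign convention as in the cell above: actual minus predicted
residuals = y_true - y_pred

mae = np.mean(np.abs(residuals))        # mean absolute error
rmse = np.sqrt(np.mean(residuals**2))   # root mean squared error
bias = np.mean(residuals)               # positive: the model under-predicts on average

print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  bias={bias:.3f}")
```

A bias close to zero combined with a large RMSE suggests the model is right on average but misses short-term structure, which is the kind of pattern autoregression is meant to capture.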

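Let us explore what a good number of lags for the autoregression would be. To do so, we create an autocorrelation chart of the residuals. Lags at which the autocorrelation is clearly different from zero point to past values that still carry information the model has not captured.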

In [None]:
from statsmodels.graphics.tsaplots import plot_acf

# Assign the returned matplotlib Figure so the notebook does not render the chart twice
fig = plot_acf(df_residuals["residuals"], lags=50)
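To make clear what the chart displays, the sample autocorrelation can also be computed by hand. The sketch below uses NumPy on a synthetic series (an AR(1) process with coefficient 0.8, chosen purely for illustration) rather than the tutorial's residuals. The helper `autocorrelation` and the simple white-noise threshold `1.96 / sqrt(N)` are assumptions of this sketch; the shaded band drawn by `plot_acf` is computed differently by default and will not match it exactly.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x**2)
    return np.array(
        [np.sum(x[lag:] * x[: len(x) - lag]) / denom for lag in range(max_lag + 1)]
    )


# Synthetic residual-like series: each value depends on the previous one
rng = np.random.default_rng(0)
noise = rng.normal(size=500)
series = np.zeros(500)
for t in range(1, 500):
    series[t] = 0.8 * series[t - 1] + noise[t]

acf_values = autocorrelation(series, max_lag=20)

# Rough 95% threshold for a white-noise series of this length
threshold = 1.96 / np.sqrt(len(series))
significant_lags = [lag for lag in range(1, 21) if abs(acf_values[lag]) > threshold]
print(significant_lags)
```

Lags that exceed the threshold are candidates for inclusion in an autoregressive model; the largest such lag gives a first guess for how many lags to try.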

Now we add autoregression to our model with the `n_lags` parameter.

In [None]:

# Model and prediction
m = NeuralProphet(
    epochs=EPOCHS,
    # Allow 10 trend changepoints
    n_changepoints=10,
    # Enable yearly, weekly, and daily seasonality
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True,
    # Add autoregression over the last 10 observations
    n_lags=10,
)
metrics = m.fit(df)
forecast = m.predict(df)
m.plot(forecast)

As we can see, the forecasting model now fits the data much better. Feel free to explore how different numbers of lags (`n_lags`) affect the model.
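The effect of the number of lags can also be illustrated outside NeuralProphet. The sketch below fits a plain linear autoregression by least squares with NumPy on a synthetic AR(2) series and reports the in-sample one-step error for several lag counts. It is a stand-in under stated assumptions, not NeuralProphet's training procedure: the helper `fit_ar_rmse`, the synthetic coefficients 0.5 and 0.4, and the chosen lag counts are all invented for illustration.

```python
import numpy as np


def fit_ar_rmse(y, n_lags):
    """Fit a linear AR(n_lags) model by least squares; return in-sample one-step RMSE."""
    y = np.asarray(y, dtype=float)
    # Each row holds [1, y[t-1], ..., y[t-n_lags]] as predictors for y[t]
    X = np.column_stack(
        [np.ones(len(y) - n_lags)]
        + [y[n_lags - i : len(y) - i] for i in range(1, n_lags + 1)]
    )
    target = y[n_lags:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return float(np.sqrt(np.mean((target - X @ coef) ** 2)))


# Synthetic series in which each value depends on the two previous values
rng = np.random.default_rng(42)
noise = rng.normal(size=1000)
y = np.zeros(1000)
for t in range(2, 1000):
    y[t] = 0.5 * y[t - 1] + 0.4 * y[t - 2] + noise[t]

results = {n_lags: fit_ar_rmse(y, n_lags) for n_lags in (1, 2, 5, 10)}
for n_lags, rmse in results.items():
    print(f"n_lags={n_lags:2d}  RMSE={rmse:.3f}")
```

On data like this, the error typically drops when going from one lag to two and then changes little, since further lags add parameters without adding information. To run the same experiment with NeuralProphet, change `n_lags` in the model cell above and compare the resulting plots.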

In [None]:
# Plot the learned autoregression weight for each lag
m.plot_parameters(components=["autoregression"])

In [None]:
# Plot how much the autoregression component contributes to the forecast
m.plot_components(forecast, components=["autoregression"])