# Tutorial 5: Lagged regressors

Lagged regressors relate other observed variables to our target time series. For example, the temperature of the previous day might be a good predictor of the temperature of the next day.

They are often referred to as covariates. Unlike future regressors, the future values of lagged regressors are unknown to us.

At forecasting time $t$, we only have access to their observed past values up to and including $t-1$.

$$\text{Lagged regressor}(t) = L(t) = \sum_{x \in X} L_x(x_{t-1}, x_{t-2}, \dots, x_{t-p})$$
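
To make the formula concrete, the following minimal sketch builds the window of past values $x_{t-1}, \dots, x_{t-p}$ for a single regressor and combines them with a plain weighted sum. The helper names `lagged_window` and `linear_lagged_effect`, the linear form chosen for $L_x$, and the weights are assumptions made for illustration only; they are not part of NeuralProphet's API. The temperature values are the first five rows of the dataset used in this tutorial.

```python
def lagged_window(series, t, p):
    """Return [x_{t-1}, x_{t-2}, ..., x_{t-p}] for a 0-indexed series."""
    if t < p:
        raise ValueError("need at least p past observations before t")
    return [series[t - k] for k in range(1, p + 1)]


def linear_lagged_effect(weights, window):
    """Weighted sum of lagged values; weights[k - 1] multiplies x_{t-k}."""
    return sum(w * x for w, x in zip(weights, window))


# First five temperature values of the tutorial dataset
temperature = [277.0, 277.95, 278.83, 279.64, 279.05]

# At t = 4 only the values up to and including t - 1 are visible
window = lagged_window(temperature, t=4, p=3)
print(window)  # [279.64, 278.83, 277.95]

# Hypothetical weights, one per lag (lag 1 first)
effect = linear_lagged_effect([0.5, 0.3, 0.2], window)
```

Note that the value at `t` itself (279.05) never enters the window, which mirrors the constraint that a lagged regressor is only known up to $t-1$.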

First, we load a new dataset that also contains a temperature column.

In [6]:
import pandas as pd

# Load the dataset for tutorial 4 with the extra temperature column
df = pd.read_csv("https://github.com/ourownstory/neuralprophet-data/raw/f9d4b8303fffb57b4a8304484c27baef1d134cfc/kaggle-energy/datasets/tutorial04.csv")
df.head()

| | ds | y | temperature |
|---|---|---|---|
| 0 | 2015-01-01 | 64.92 | 277.0 |
| 1 | 2015-01-02 | 58.46 | 277.95 |
| 2 | 2015-01-03 | 63.35 | 278.83 |
| 3 | 2015-01-04 | 50.54 | 279.64 |
| 4 | 2015-01-05 | 64.89 | 279.05 |


In [13]:
df.plot(backend="plotly", x="ds", y=["y", "temperature"])

[Figure: interactive Plotly line chart of `y` and `temperature` over `ds`, daily data from 2015-01-01 to 2018-12-31.]

After viewing the additional data, we add it as a lagged regressor to our model. We start with the model from the previous tutorial and then add the temperature as a lagged regressor to get a better energy price prediction.

In [15]:
from neuralprophet import NeuralProphet, set_log_level

# Disable logging messages unless there is an error
set_log_level("ERROR")

 
# Use a small number of epochs (10) for faster training
EPOCHS = 10

# Model and prediction
m = NeuralProphet(
    epochs=EPOCHS,
    # Use 10 trend changepoints
    n_changepoints=10,
    # Enable the seasonality components
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True,
    # Add the autoregression
    n_lags=10,
)

# Add the new lagged regressor
m.add_lagged_regressor("temperature")

# Fit the model and make a prediction
metrics = m.fit(df)
forecast = m.predict(df)
m.plot(forecast)

[Figure: forecast plot with the prediction `yhat1` drawn as a line and a second series drawn as black markers, from 2015-01-11 to 2018-12-31.]
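
The forecast output starts on 2015-01-11 although the data starts on 2015-01-01. A plausible reading, sketched below, is that the first `n_lags=10` observations are consumed as inputs before the first prediction can be made. The snippet only does the date arithmetic under that assumption, and it also assumes daily data without gaps at the start; it does not call NeuralProphet.

```python
from datetime import date, timedelta

N_LAGS = 10  # same value as n_lags in the model above
first_observation = date(2015, 1, 1)  # first `ds` value in the dataset

# The first forecast needs N_LAGS past observations,
# so the earliest forecastable day is N_LAGS days after the first one.
first_forecast = first_observation + timedelta(days=N_LAGS)
print(first_forecast)  # 2015-01-11
```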

In [20]:
m.plot_components(forecast, components=["lagged_regressors"])

[Figure: bar chart of the component `Lagged Regressor "temperature"` over time, from 2015-01-11 to 2018-12-31.]

In [19]:
m.plot_parameters(components=["lagged_regressors"])

[Figure: bar chart `Lagged Regressor "temperature"` showing one weight per lag, from lag 10 down to lag 1. The largest weight (about 0.26) is at lag 4.]

Let us explore how our model improved after adding the lagged regressor.

In [25]:
metrics

| | MAE | RMSE | Loss | RegLoss | epoch |
|---|---|---|---|---|---|
| 0 | 56.707123 | 66.461082 | 0.410679 | 0.0 | 0 |
| 1 | 36.053833 | 45.058567 | 0.209306 | 0.0 | 1 |
| 2 | 24.99967 | 31.965115 | 0.112717 | 0.0 | 2 |
| 3 | 18.371998 | 23.810799 | 0.062862 | 0.0 | 3 |
| 4 | 15.520124 | 20.278652 | 0.045442 | 0.0 | 4 |
| 5 | 14.13405 | 18.557184 | 0.038628 | 0.0 | 5 |
| 6 | 13.565326 | 17.680754 | 0.035644 | 0.0 | 6 |
| 7 | 13.152133 | 17.312595 | 0.034045 | 0.0 | 7 |
| 8 | 13.115194 | 17.306236 | 0.033544 | 0.0 | 8 |
| 9 | 12.967725 | 16.949823 | 0.033046 | 0.0 | 9 |
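
As a rough way to read the metrics table, the sketch below takes the MAE column from this run and computes how much the training error dropped overall and per epoch. The values are copied from the table; the variable names are illustrative. This only describes training progress within this run and is not a comparison against the model from the previous tutorial.

```python
# MAE per epoch, copied from the metrics table above
mae_by_epoch = [
    56.707123, 36.053833, 24.99967, 18.371998, 15.520124,
    14.13405, 13.565326, 13.152133, 13.115194, 12.967725,
]

# Relative reduction between the first and the last epoch
total_reduction = (mae_by_epoch[0] - mae_by_epoch[-1]) / mae_by_epoch[0]
print(f"{total_reduction:.1%}")  # 77.1%

# Improvement from one epoch to the next
step_gains = [prev - cur for prev, cur in zip(mae_by_epoch, mae_by_epoch[1:])]
print(round(step_gains[0], 2), round(step_gains[-1], 2))  # 20.65 0.15
```

The shrinking per-epoch gains suggest that training is flattening out by epoch 9, although more epochs might still help.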


In [24]:
df_residuals = pd.DataFrame({
    "ds": df["ds"],
    "residuals": df["y"] - forecast["yhat1"]
})
df_residuals.plot(x="ds", y="residuals", backend="plotly")

[Figure: line plot of the residuals over `ds`, from 2015-01-11 to 2018-12-31.]
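
The residuals can be summarised with the same error measures reported during training. The following minimal sketch computes MAE and RMSE from a list of residuals and skips missing values, since the first rows have no prediction. The helper `mae_rmse` and the toy residuals are hypothetical and not taken from the tutorial output.

```python
import math


def mae_rmse(residuals):
    """Return (MAE, RMSE) over the non-missing residuals."""
    valid = [r for r in residuals if not math.isnan(r)]
    if not valid:
        raise ValueError("no valid residuals")
    mae = sum(abs(r) for r in valid) / len(valid)
    rmse = math.sqrt(sum(r * r for r in valid) / len(valid))
    return mae, rmse


# Toy residuals: two missing values followed by two errors
toy_residuals = [math.nan, math.nan, 3.0, -4.0]
mae, rmse = mae_rmse(toy_residuals)
print(mae)  # 3.5
```

With a pandas column such as `df_residuals["residuals"]`, the same function could be applied after converting the column to a list.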