A common use case requires the forecaster to regularly update with new data and make forecasts on a rolling basis. This is especially useful if the same kind of forecast has to be made at regular time points, e.g., daily or weekly. sktime forecasters support this type of deployment workflow via the update and update_predict methods.

The update method can be called when a forecaster is already fitted, to ingest new data and make updated forecasts - this is referred to as an “update step”.

After the update, the forecaster’s internal “now” state (the cutoff) is set to the latest time stamp seen in the update batch (assumed to be later than previously seen data).

The general pattern is as follows:

1. specify a forecasting strategy

2. specify a relative forecasting horizon

3. fit the forecaster to an initial batch of data using fit

4. make forecasts for the relative forecasting horizon, using predict

5. obtain new data; use update to ingest new data

6. make forecasts using predict for the updated data

7. repeat 5 and 6 as often as required
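
Before the worked example, the cycle above can be illustrated with a toy stand-in. This is a minimal sketch, not sktime code: the class `ToyLastValueForecaster`, the integer time points, and all variable names are assumptions made for illustration. It only shows how a relative horizon maps to later time points once the cutoff moves.

```python
# Toy stand-in for the fit / update / predict cycle (hypothetical, not sktime code).
# Time points are plain integers; the "cutoff" is the latest time point seen.


class ToyLastValueForecaster:
    """Forecasts the last observed value for every step of the horizon."""

    def fit(self, y):
        # y maps integer time points to observed values
        self._y = dict(y)
        self.cutoff = max(self._y)
        return self

    def update(self, y_new):
        # ingest new observations and move the cutoff forward
        self._y.update(y_new)
        self.cutoff = max(self._y)
        return self

    def predict(self, fh):
        # fh is relative: step h refers to time point cutoff + h
        last = self._y[self.cutoff]
        return {self.cutoff + h: last for h in fh}


toy_fh = [1, 2, 3]
toy = ToyLastValueForecaster().fit({0: 10.0, 1: 12.0, 2: 11.0})
first = toy.predict(toy_fh)   # covers time points 3, 4, 5

toy.update({3: 15.0})         # one new observation arrives
second = toy.predict(toy_fh)  # same relative horizon, now covers 4, 5, 6
```

The same `toy_fh` is reused after the update; only the cutoff changes, which is why the second batch of forecasts starts one step later.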

Example: suppose that, in the airline example, we want to make forecasts a year ahead, but every month, starting December 1957. The first few months, forecasts would be made as follows:

In [None]:
import pandas as pd
from sktime.forecasting.ets import AutoETS
from sktime.utils.plotting import plot_series
import numpy as np

In [None]:
# we prepare the full data set for convenience
# note that in the scenario we will "know" only part of this at certain time points
df = pd.read_csv("../../data/later/profile_growth.csv")

#df.columns

# copy the slice so that the Date column can be reassigned without a SettingWithCopyWarning
followers = df[['Date', 'Followers']].copy()

followers['Date'] = pd.PeriodIndex(pd.DatetimeIndex(followers['Date']), freq='D') 

y = followers.set_index('Date').sort_index()

In [None]:
from sktime.forecasting.naive import NaiveForecaster
forecaster = NaiveForecaster(strategy="last")
# December 1957

# this is the data known in December 1957
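# hold out the last 37 observations, so that every later update step
# (y[:-36], y[:-35], ...) adds exactly one new observation
y_1957Dec = y[:-37]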

# step 1: specifying the forecasting strategy
# (the NaiveForecaster defined above is used; AutoETS is a possible alternative)
#forecaster = AutoETS(auto=True, sp=7, n_jobs=-1)

# step 2: specifying the forecasting horizon: one year ahead, all months
fh = np.arange(1, 13)

# step 3: this is the first time we use the model, so we fit it
forecaster.fit(y_1957Dec)

# step 4: obtaining the first batch of forecasts for Jan 1958 - Dec 1958
y_pred_1957Dec = forecaster.predict(fh)

In [None]:
# plotting predictions and past data
plot_series(y_1957Dec, y_pred_1957Dec, labels=["y_1957Dec", "y_pred_1957Dec"])

In [None]:
# January 1958

# new data is observed:
y_1958Jan = y[:-36]

# step 5: we update the forecaster with the new data
forecaster.update(y_1958Jan)

# step 6: making forecasts with the updated data
y_pred_1958Jan = forecaster.predict(fh)

In [None]:
# note that the fh is relative, so forecasts are automatically for 1 month later
#  i.e., from Feb 1958 to Jan 1959
y_pred_1958Jan

In [None]:
# plotting predictions and past data
plot_series(
    y[:-35],
    y_pred_1957Dec,
    y_pred_1958Jan,
    labels=["y_1957Dec", "y_pred_1957Dec", "y_pred_1958Jan"],
)

In [None]:
# February 1958

# new data is observed:
y_1958Feb = y[:-35]

# step 5: we update the forecaster with the new data
forecaster.update(y_1958Feb)

# step 6: making forecasts with the updated data
y_pred_1958Feb = forecaster.predict(fh)

In [None]:
# plotting predictions and past data
plot_series(
    y[:-35],
    y_pred_1957Dec,
    y_pred_1958Jan,
    y_pred_1958Feb,
    labels=["y_1957Dec", "y_pred_1957Dec", "y_pred_1958Jan", "y_pred_1958Feb"],
)

… and so on.

A shorthand for running first update and then predict is update_predict_single - for some algorithms, this may be more efficient than the separate calls to update and predict:

In [None]:
# March 1958

# new data is observed:
y_1958Mar = y[:-34]

# step 5&6: update/predict in one step
forecaster.update_predict_single(y_1958Mar, fh=fh)

In the rolling deployment mode, it may be useful to move the estimator’s “now” state (the cutoff) to a later time point, for example if no new data was observed but time has progressed, or if computations take too long and forecasts have to be queried.

The update interface provides an option for this, via the update_params argument of update and other update functions.

If update_params is set to False, no model update computations are performed; only data is stored, and the internal “now” state (the cutoff) is set to the most recent date.

In [None]:
# April 1958

# new data is observed:
y_1958Apr = y[:-33]

# step 5: perform an update without re-computing the model parameters
forecaster.update(y_1958Apr, update_params=False)
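
The effect of update_params=False can be sketched with a toy mean forecaster. This is an illustration under stated assumptions, not sktime's implementation: `ToyMeanForecaster` and its attribute `mean_` are hypothetical names. The point is that the cutoff moves and the data is stored, while the fitted parameter stays as it was.

```python
# Toy illustration of update_params=False (hypothetical class, not sktime code).


class ToyMeanForecaster:
    """Forecasts the mean of the data that the parameter was last estimated on."""

    def fit(self, y):
        self._y = dict(y)
        self.cutoff = max(self._y)
        self.mean_ = sum(self._y.values()) / len(self._y)
        return self

    def update(self, y_new, update_params=True):
        # the data is always stored and the cutoff always moves forward
        self._y.update(y_new)
        self.cutoff = max(self._y)
        # the fitted parameter is only re-estimated if requested
        if update_params:
            self.mean_ = sum(self._y.values()) / len(self._y)
        return self

    def predict(self, fh):
        return {self.cutoff + h: self.mean_ for h in fh}


toy_mean = ToyMeanForecaster().fit({0: 2.0, 1: 4.0})  # mean_ is 3.0

toy_mean.update({2: 9.0}, update_params=False)  # cutoff moves to 2, mean_ stays 3.0
stale = toy_mean.predict([1])

toy_mean.update({3: 5.0})                       # mean_ re-estimated on all four values
fresh = toy_mean.predict([1])
```

After the first update, the forecast is issued for time point 3 but still uses the old mean; the second update re-estimates the mean on everything stored so far, including the observation that was ingested without a parameter update.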

sktime can also simulate the update/predict deployment mode with a full batch of data.

This is not useful in deployment, as it requires all data to be available in advance; however, it is useful in playback, such as for simulations or model evaluation.

The update/predict playback mode can be called using update_predict and a re-sampling constructor which encodes the precise walk-forward scheme.
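
To make the walk-forward scheme concrete, the following sketch generates expanding-window splits by hand. It is written in the spirit of an expanding window splitter and is not the library's code: the function `expanding_window_splits` is a hypothetical helper, and the library's own splitters may differ in return types and edge cases.

```python
# Hand-rolled expanding-window splits (hypothetical helper, not sktime code).


def expanding_window_splits(n_timepoints, initial_window, step_length, fh):
    """Yield (train_indices, test_indices) pairs for an expanding window."""
    train_end = initial_window  # exclusive end of the training window
    # stop as soon as the furthest forecast step would fall outside the series
    while train_end - 1 + max(fh) < n_timepoints:
        cutoff = train_end - 1                   # last index of the training window
        train = list(range(train_end))           # window always starts at index 0
        test = [cutoff + h for h in fh]          # relative horizon made absolute
        yield train, test
        train_end += step_length                 # grow the window


splits = list(
    expanding_window_splits(n_timepoints=10, initial_window=4, step_length=2, fh=[1, 2])
)
for train, test in splits:
    print("train:", train, "test:", test)
```

Each split corresponds to one update step in playback: the training window ends at the cutoff, and the test indices are the time points the relative horizon refers to at that cutoff.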

To evaluate forecasters with respect to their performance in rolling forecasting, the forecaster needs to be tested in a set-up mimicking rolling forecasting, usually on past data. Note that the batch back-testing as in Section 1.3 would not be an appropriate evaluation set-up for rolling deployment, as that tests only a single forecast batch.

The advanced evaluation workflow can be carried out using the evaluate benchmarking function. evaluate takes as arguments:

- a forecaster to be evaluated

- a scikit-learn-style re-sampling strategy for temporal splitting (cv below), e.g., ExpandingWindowSplitter or SlidingWindowSplitter

- a strategy (string): whether the forecaster should always be refitted, or just fitted once and then updated
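
The role of the strategy argument can be illustrated with a toy backtest. This is a rough sketch under stated assumptions, not how evaluate is implemented: `toy_backtest` is a hypothetical function, the strategy labels `"refit"` and `"fit_once"` are toy names chosen here, and the "model" is just the mean of the training window.

```python
# Toy walk-forward backtest contrasting two strategies (hypothetical, not sktime code).


def toy_backtest(y, initial_window, step_length, fh, strategy="refit"):
    """Return one mean absolute error per walk-forward fold.

    strategy="refit":    re-estimate the mean on every training window
    strategy="fit_once": estimate the mean on the first window only
    """
    errors = []
    mean = None
    train_end = initial_window  # exclusive end of the training window
    while train_end - 1 + max(fh) < len(y):
        cutoff = train_end - 1
        if mean is None or strategy == "refit":
            mean = sum(y[:train_end]) / train_end
        abs_errors = [abs(y[cutoff + h] - mean) for h in fh]
        errors.append(sum(abs_errors) / len(abs_errors))
        train_end += step_length
    return errors


y_toy = [1.0, 1.0, 3.0, 3.0, 5.0, 5.0]
refit_errors = toy_backtest(y_toy, initial_window=2, step_length=2, fh=[1, 2])
frozen_errors = toy_backtest(
    y_toy, initial_window=2, step_length=2, fh=[1, 2], strategy="fit_once"
)
```

On this trending toy series, both strategies agree on the first fold, and the model that is never re-estimated falls further behind on the second fold. This is the kind of difference that a rolling evaluation is meant to expose.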

In [None]:
#from sktime.forecasting.arima import AutoARIMA
from sktime.forecasting.ets import AutoETS
from sktime.forecasting.model_evaluation import evaluate
from sktime.forecasting.model_selection import ExpandingWindowSplitter

In [None]:
#forecaster = AutoARIMA(sp=12, suppress_warnings=True)
forecaster = AutoETS(auto=True, sp=12, n_jobs=-1)
#forecaster = NaiveForecaster(strategy="last")

cv = ExpandingWindowSplitter(
    step_length=12, fh=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], initial_window=72
)

df = evaluate(forecaster=forecaster, y=y, cv=cv, strategy="refit", return_data=True)

df.iloc[:, :5]

In [None]:
# visualization of a forecaster evaluation
fig, ax = plot_series(
    y,
    df["y_pred"].iloc[0],
    df["y_pred"].iloc[1],
    df["y_pred"].iloc[2],
    df["y_pred"].iloc[3],
    df["y_pred"].iloc[4],
    df["y_pred"].iloc[5],
    markers=["o", "", "", "", "", "", ""],
    labels=["y_true"] + ["y_pred (Backtest " + str(x) + ")" for x in range(6)],
)
ax.legend();

In [None]:
df["y_pred"].iloc[0]

In [None]:
df["y_pred"].iloc[1]