Recursive Strategy Bug #4

ThiagoCM · 2021-04-14T19:15:20Z

Hello.
First, I would like to thank you for this excellent example of applying the Direct and Recursive strategies to N-step-ahead forecasting. While looking at the recursive strategy, I had a doubt about its implementation, and I think there is a bug.

If you take a look at the picture above (you can find the math in this article), the recursive strategy is basically the 1-step-ahead direct strategy with "feedback": the value predicted at each iteration is appended to the target array.
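
As a minimal sketch of that feedback loop, the plain-Python example below uses a hypothetical one-step model (`one_step_model`, the mean of the two most recent values) rather than the tutorial's linear or XGBoost models. Each prediction is appended to the history before the next step is predicted.

```python
def one_step_model(history):
    # Hypothetical stand-in for a trained 1-step-ahead model:
    # predicts the mean of the two most recent values.
    return (history[-1] + history[-2]) / 2.0


def recursive_forecast(y, n_steps):
    history = list(y)  # copy so the caller's data is left untouched
    forecasts = []
    for _ in range(n_steps):
        prediction = one_step_model(history)
        forecasts.append(prediction)
        # Feedback: the prediction becomes a lag for the next step.
        history.append(prediction)
    return forecasts


forecasts = recursive_forecast([2.0, 4.0], n_steps=3)
print(forecasts)  # [3.0, 3.5, 3.25]
```

The key point is the order of operations: predict first, then store that same prediction so the next step can use it as a lag.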

When you run this piece of code:

new_point = fcasted_values[-1] if len(fcasted_values) > 0 else 0.0
target = target.append(pd.Series(index=[date], data=new_point))

You're actually inserting 0.0 as the first prediction (N=1) of the recursive strategy, instead of the value the model predicts for N=1. This affects the lags used in the feature matrix, since one lag holds an incorrect value in all prediction steps.

Below you can see the target and feature values for 3 iterations after inserting 0.0 as the first prediction.

Iteration 1

Features
                     hour  weekday  dayofyear  ...      lag_1      lag_8     lag_25
2020-01-12 20:00:00    20        6         12  ...  65.919495  80.427320  72.718000
2020-01-12 21:00:00    21        6         12  ...  34.952133  57.917430  33.341960
2020-01-12 22:00:00    22        6         12  ...  33.911217  56.941563  33.081734
2020-01-12 23:00:00    23        6         12  ...  33.244377  56.193405  33.683514
2020-01-13 00:00:00     0        0         13  ...  33.390755  53.786278  33.244377

Target
2020-01-12 20:00:00    34.952133
2020-01-12 21:00:00    33.911217
2020-01-12 22:00:00    33.244377
2020-01-12 23:00:00    33.390755
2020-01-13 00:00:00     0.000000

Iteration 2

Features
                     hour  weekday  dayofyear  ...      lag_1      lag_8     lag_25
2020-01-12 21:00:00    21        6         12  ...  34.952133  57.917430  33.341960
2020-01-12 22:00:00    22        6         12  ...  33.911217  56.941563  33.081734
2020-01-12 23:00:00    23        6         12  ...  33.244377  56.193405  33.683514
2020-01-13 00:00:00     0        0         13  ...  33.390755  53.786278  33.244377
2020-01-13 01:00:00     1        0         13  ...   0.000000  59.202316  33.407020

Target
2020-01-12 21:00:00    33.911217
2020-01-12 22:00:00    33.244377
2020-01-12 23:00:00    33.390755
2020-01-13 00:00:00     0.000000
2020-01-13 01:00:00    34.342800

Iteration 3

Features
                     hour  weekday  dayofyear  ...      lag_1      lag_8     lag_25
2020-01-12 22:00:00    22        6         12  ...  33.911217  56.941563  33.081734
2020-01-12 23:00:00    23        6         12  ...  33.244377  56.193405  33.683514
2020-01-13 00:00:00     0        0         13  ...  33.390755  53.786278  33.244377
2020-01-13 01:00:00     1        0         13  ...   0.000000  59.202316  33.407020
2020-01-13 02:00:00     2        0         13  ...  34.342800  68.944670  32.057076

Target
2020-01-12 22:00:00    33.244377
2020-01-12 23:00:00    33.390755
2020-01-13 00:00:00     0.000000
2020-01-13 01:00:00    34.342800
2020-01-13 02:00:00     2.395295
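
The pattern in these tables can be reproduced on a toy series. The sketch below mimics the loop in plain Python, assuming a hypothetical persistence model (`predict_from_lag1`, which returns `lag_1` unchanged) and a single `lag_1` feature; neither is taken from the tutorial.

```python
def predict_from_lag1(lag_1):
    # Hypothetical persistence model: the forecast equals lag_1.
    return lag_1


def buggy_recursive(y, n_steps):
    target = list(y)
    fcasted_values = []
    lag1_seen = []
    for _ in range(n_steps):
        # Same rule as the quoted snippet: 0.0 on the first step,
        # afterwards the previous step's prediction.
        new_point = fcasted_values[-1] if len(fcasted_values) > 0 else 0.0
        target.append(new_point)
        # lag_1 of the newest row is the value stored one step earlier.
        lag_1 = target[-2]
        lag1_seen.append(lag_1)
        fcasted_values.append(predict_from_lag1(lag_1))
    return fcasted_values, lag1_seen


forecasts, lags = buggy_recursive([1.0, 2.0, 3.0], n_steps=4)
print(lags)       # [3.0, 0.0, 3.0, 0.0]
print(forecasts)  # [3.0, 0.0, 3.0, 0.0]
```

A persistence model would be expected to repeat 3.0 at every step. The zeros appear because the 0.0 placeholder is read back as `lag_1` in the second step, just like the `lag_1` of 0.000000 at 01:00 in the tables.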

Also, I didn't understand why the recursive strategy uses the trained model (which is returned by either the linear_model or xgboost_model function) instead of the 1-step-ahead model (which is used in the Direct Strategy).

Does this make sense, or have I misunderstood something?


JamesLarkinWhite · 2022-04-22T15:09:04Z

I just found this tutorial and had the same thought while reading the implementation of the recursive forecast.

What I wrote before seems to be nonsense to me now...

I guess you would have to make a prediction before entering the loop and, for the first prediction (N=1), append the last value of the resulting array instead of 0.0.

At least, I think I have seen this in a few entries for the M4 competition.

Edit: I tried to implement this idea. The two variables initial_features and initial_prediction are not really needed, but I thought they might help to explain my general idea. It would be really nice if somebody could give me feedback on whether this is a viable solution:

def forecast_multi_recursive_fix(y, model, lags, n_steps=FCAST_STEPS, step="1H"):

	"""Multi-step recursive forecasting using the input time 
	series data and a pre-trained machine learning model
	
	Parameters
	----------
	y: pd.Series holding the input time-series to forecast
	model: an already trained machine learning model implementing the scikit-learn interface
	lags: list of lags used for training the model
	n_steps: number of time periods in the forecasting horizon
	step: forecasting time period given as Pandas time series frequencies
	
	Returns
	-------
	fcast_values: pd.Series with forecasted values indexed by forecast horizon dates 
	"""

	def create_recursive_features(target, lags):
		rec_target = target.copy()
		# forecast: create ts features
		ts_features = create_ts_features(rec_target)
		# forecast: create lag features
		if len(lags) > 0:
			lags_features = create_lag_features(rec_target, lags=lags)
			rec_features = ts_features.join(lags_features, how="outer").dropna()
		else:
			rec_features = ts_features

		return rec_features


	# get the dates to forecast
	last_date = y.index[-1] + pd.Timedelta(hours=1)
	fcast_range = pd.date_range(last_date, periods=n_steps, freq=step)

	fcasted_values = []
	target = y.copy()

	# initial Prediction for first step:
	initial_features = create_recursive_features(target, lags)
	initial_prediction = model.predict(initial_features)  # in-sample predictions on the original target; the last one seeds step 1

	for date in fcast_range:

		new_point = fcasted_values[-1] if len(fcasted_values) > 0 else initial_prediction[-1]

		# Series.append was removed in pandas 2.0; pd.concat does the same job
		target = pd.concat([target, pd.Series(index=[date], data=new_point)])
		# forecast: create recursive features
		features = create_recursive_features(target,lags)

		# forecast: Predict
		predictions = model.predict(features)
		# forecast: append predictions to fcasted_values List
		fcasted_values.append(predictions[-1])


	return pd.Series(index=fcast_range, data=fcasted_values)
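
One way to check this idea is a toy comparison in plain Python. The sketch below assumes a hypothetical persistence model (`predict_from_lag1`) and a single `lag_1` feature, neither of which comes from the tutorial. `seeded_recursive` follows the function above by seeding the first appended value with the last in-sample prediction. `write_back_recursive` is an alternative, offered only as an assumption about what the loop could do: append a placeholder, predict, then overwrite the placeholder with that prediction.

```python
def predict_from_lag1(lag_1):
    # Hypothetical persistence model: the forecast equals lag_1.
    return lag_1


def seeded_recursive(y, n_steps):
    target = list(y)
    # In-sample prediction for the last observed point (its lag_1 is y[-2]).
    initial_prediction = predict_from_lag1(target[-2])
    fcasted_values = []
    for _ in range(n_steps):
        new_point = fcasted_values[-1] if fcasted_values else initial_prediction
        target.append(new_point)
        fcasted_values.append(predict_from_lag1(target[-2]))
    return fcasted_values


def write_back_recursive(y, n_steps):
    target = list(y)
    fcasted_values = []
    for _ in range(n_steps):
        target.append(float("nan"))  # placeholder so the new row exists
        prediction = predict_from_lag1(target[-2])
        target[-1] = prediction  # store the prediction at its own date
        fcasted_values.append(prediction)
    return fcasted_values


seeded = seeded_recursive([1.0, 2.0, 3.0], n_steps=4)
written_back = write_back_recursive([1.0, 2.0, 3.0], n_steps=4)
print(seeded)        # [3.0, 2.0, 3.0, 2.0]
print(written_back)  # [3.0, 3.0, 3.0, 3.0]
```

Under these assumptions the seeded variant still alternates, because the value stored at each forecast date is the prediction for the previous date. In the pandas version, the write-back step would probably correspond to something like `target.loc[date] = predictions[-1]` placed after `model.predict(features)`.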

JamesLarkinWhite · 2023-01-12T13:36:51Z

It would be nice if you could review this issue.
