# Window and custom features

When forecasting time series data, it may be useful to consider additional characteristics beyond just the lagged values. For example, the moving average of the previous *n* values may help to capture the trend in the series. The `window_features` argument allows the inclusion of additional predictors created with the previous values of the series.

## Libraries and data

In [1]:
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from skforecast.datasets import fetch_dataset
from skforecast.recursive import ForecasterRecursive
from skforecast.preprocessing import RollingFeatures
from lightgbm import LGBMRegressor

In [2]:
# Data
# ==============================================================================
data = fetch_dataset(name="h2o", raw=False)
data.index.name = 'datetime'
data = data.rename(columns={'x': 'y'})
y = data['y']
y

h2o
---
Monthly expenditure ($AUD) on corticosteroid drugs that the Australian health
system had between 1991 and 2008.
Hyndman R (2023). fpp3: Data for Forecasting: Principles and Practice(3rd
Edition). http://pkg.robjhyndman.com/fpp3package/,https://github.com/robjhyndman
/fpp3package, http://OTexts.com/fpp3.
Shape of the dataset: (204, 1)


datetime
1991-07-01    0.429795
1991-08-01    0.400906
1991-09-01    0.432159
1991-10-01    0.492543
1991-11-01    0.502369
                ...   
2008-02-01    0.761822
2008-03-01    0.649435
2008-04-01    0.827887
2008-05-01    0.816255
2008-06-01    0.762137
Freq: MS, Name: y, Length: 204, dtype: float64

## RollingFeatures

The <code>RollingFeatures</code> class available in skforecast allows the creation of some of the most commonly used predictors:

+ 'mean': the mean of the previous *n* values.
+ 'std': the standard deviation of the previous *n* values.
+ 'min': the minimum of the previous *n* values.
+ 'max': the maximum of the previous *n* values.
+ 'sum': the sum of the previous *n* values.
+ 'median': the median of the previous *n* values.
+ 'ratio_min_max': the ratio between the minimum and maximum of the previous *n* values.
+ 'coef_variation': the coefficient of variation of the previous *n* values.

The user can specify a different window size for each of them or the same for all of them.
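
To make the definitions above concrete, the sketch below computes the same eight statistics with plain pandas on the last *n* values of a toy series. It does not call skforecast. Using the sample standard deviation (`ddof=1`, the pandas default) for 'std' and 'coef_variation' is an assumption about the convention, and the toy values and variable names are illustrative only.

```python
import pandas as pd

# Toy series and window size (illustrative values, not from the h2o dataset)
values = pd.Series([2.0, 4.0, 4.0, 6.0, 8.0])
n = 4
window = values.iloc[-n:]  # the previous n values: 4, 4, 6, 8

stats = {
    "mean": window.mean(),
    "std": window.std(),  # assumption: sample standard deviation (ddof=1)
    "min": window.min(),
    "max": window.max(),
    "sum": window.sum(),
    "median": window.median(),
    "ratio_min_max": window.min() / window.max(),
    "coef_variation": window.std() / window.mean(),  # std relative to the mean
}
print(pd.Series(stats).round(4))
```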

The following example demonstrates how to use the <code>RollingFeatures</code> class to calculate rolling statistics, including the mean, minimum, and maximum values. Here, the rolling mean is computed with a window size of 20, while the minimum and maximum values use a window size of 10.

In [3]:
# Window features
# ==============================================================================
window_features = RollingFeatures(
                    stats        = ['mean', 'min', 'max'],
                    window_sizes = [20, 10, 10]
                  )

In [4]:
# Create and fit forecaster
# ==============================================================================
forecaster = ForecasterRecursive(
                regressor       = LGBMRegressor(random_state=123, n_jobs=-1, verbose=-1),
                lags            = 3,
                window_features = window_features,
             )
forecaster.fit(y=y)
forecaster

In [5]:
# Predict
# ==============================================================================
steps = 36
predictions = forecaster.predict(steps=steps)
predictions.head(3)

2008-07-01    0.880138
2008-08-01    1.020947
2008-09-01    1.061728
Freq: MS, Name: pred, dtype: float64

By default, windowed feature names follow the pattern <code>roll_<stat>_<window_size></code>. For instance, a rolling mean with a window size of 20 is named *roll_mean_20*. Users can also assign custom names to each feature using the <code>features_names</code> argument. Additionally, the <code>min_periods</code> argument allows specifying the minimum number of observations required to compute the statistics, while the <code>fill_na</code> argument defines the strategy for handling missing values.
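
As a small illustration of the default naming pattern, the sketch below rebuilds the names with a plain list comprehension for the statistics and window sizes used in this example. It only mimics the pattern described above and is not the library's own code.

```python
# Statistics and window sizes used when creating the RollingFeatures object
stats = ["mean", "min", "max"]
window_sizes = [20, 10, 10]

# Default pattern: roll_<stat>_<window_size>
default_names = [f"roll_{stat}_{size}" for stat, size in zip(stats, window_sizes)]
print(default_names)  # ['roll_mean_20', 'roll_min_10', 'roll_max_10']
```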

By inspecting the training matrices, it is possible to check that the rolling features have been correctly included in the design matrix.

In [6]:
# Training matrices used internally to fit the regressor
# ==============================================================================
X_train, y_train = forecaster.create_train_X_y(y=y)
X_train

datetime,lag_1,lag_2,lag_3,roll_mean_20,roll_min_10,roll_max_10
1993-03-01,0.387554,0.751503,0.771258,0.496401,0.361801,0.771258
1993-04-01,0.427283,0.387554,0.751503,0.496275,0.387554,0.771258
1993-05-01,0.413890,0.427283,0.387554,0.496924,0.387554,0.771258
1993-06-01,0.428859,0.413890,0.427283,0.496759,0.387554,0.771258
1993-07-01,0.470126,0.428859,0.413890,0.495638,0.387554,0.771258
...,...,...,...,...,...,...
2008-02-01,1.219941,1.176589,1.163534,0.980390,0.561760,1.219941
2008-03-01,0.761822,1.219941,1.176589,0.978582,0.745258,1.219941
2008-04-01,0.649435,0.761822,1.219941,0.966838,0.649435,1.219941
2008-05-01,0.827887,0.649435,0.761822,0.955750,0.649435,1.219941
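
The first row of the training matrix is dated 1993-03-01 rather than 1991-07-01 because the earliest observations are needed to build the predictors. The sketch below reproduces that date with pandas alone, under the assumption that the number of observations consumed equals the largest of the number of lags and the window sizes (20 here). The variable names are illustrative.

```python
import pandas as pd

# Monthly index matching the h2o series: 204 observations starting in 1991-07
index = pd.date_range(start="1991-07-01", periods=204, freq="MS")

lags = 3
window_sizes = [20, 10, 10]

# Assumption: the largest window determines how many initial rows are consumed
rows_consumed = max(lags, max(window_sizes))

first_train_date = index[rows_consumed]
n_train_rows = len(index) - rows_consumed
print(first_train_date.date(), n_train_rows)
```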


It is also possible to use the <code>RollingFeatures</code> class outside the forecaster to gain a deeper insight into its behaviour. The `transform` method computes the rolling features for a given numpy array, which is assumed to contain as many past observations as the maximum window size required to compute the rolling features. The output is a numpy array with the rolling features for that array.

In [7]:
# Create rolling features from a given array
# ==============================================================================
x = np.arange(20)
window_features.transform(X=x)

array([ 9.5, 10. , 19. ])
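
The output above can be checked by hand. The sketch below uses only NumPy: each statistic is computed on the last values of the array, with as many values as its own window size (20 for the mean, 10 for the minimum and maximum).

```python
import numpy as np

x = np.arange(20)

# Each statistic looks only at the last `window_size` values of the array
manual = np.array([
    x[-20:].mean(),  # mean of the last 20 values
    x[-10:].min(),   # minimum of the last 10 values
    x[-10:].max(),   # maximum of the last 10 values
])
print(manual)  # [ 9.5 10.  19. ]
```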

The `transform_batch` method is designed to transform a whole pandas series from which multiple rolling windows can be extracted. The output is a pandas dataframe with the rolling features.

In [8]:
# Create rolling features from a pandas series
# ==============================================================================
window_features.transform_batch(y).head(3)

datetime,roll_mean_20,roll_min_10,roll_max_10
1993-03-01,0.496401,0.361801,0.771258
1993-04-01,0.496275,0.387554,0.771258
1993-05-01,0.496924,0.387554,0.771258


The reason for these two different ways of transforming the data is that the first one is used during the prediction process, where the forecaster only has access to the last window of the series, while the second one is used during the training process, where the forecaster has access to the whole series.
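
The relationship between the two modes can be sketched with pandas and NumPy alone, here for a rolling mean on a toy series. In batch mode every row is built from the window that precedes it (`closed="left"` excludes the current value), whereas in single-window mode only the last values of the series are available and they produce the feature for the next, not yet observed, step. The series and variable names are illustrative assumptions.

```python
import numpy as np
import pandas as pd

series = pd.Series(np.arange(10, dtype=float))
window_size = 3

# Batch mode (training): one feature value per row, using only previous values
batch_mean = series.rolling(window=window_size, closed="left").mean().dropna()

# Single-window mode (prediction): only the last window is available
last_window = series.to_numpy()[-window_size:]
next_step_mean = last_window.mean()

# A batch row equals the statistic of the window that precedes it
row = 5
manual_row = series.iloc[row - window_size:row].mean()
print(batch_mean.loc[row], manual_row, next_step_mean)
```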

## Create your custom window features

`RollingFeatures` is very useful for including some of the most commonly used predictors.  However, users may need to include additional predictors that are not provided by this class. In such cases, users can create their own custom class to compute the desired features and include them in the forecaster.

The custom class only requires 2 methods:

+ `transform`: method to compute the features from a numpy array. This method will be used to compute the features during the prediction process.

+ `transform_batch`: method to compute the features in batch from a pandas Series. This method will be used to compute the features during the training process.

and 2 attributes:

+ `features_names`: list with the names of the features.

+ `window_sizes`: maximum window size required to compute the features.

The following example shows how to create a custom class that computes the rolling skewness. The class is then used with a window size of 3.

In [9]:
# Custom class to create rolling skewness features
# ==============================================================================
from scipy.stats import skew

class RollingSkewness():
    """
    Custom class to create rolling skewness features.
    """

    def __init__(self, window_sizes, features_names='rolling_skewness'):
        
        if not isinstance(window_sizes, list):
            window_sizes = [window_sizes]
        self.window_sizes = window_sizes
        self.features_names = features_names

    def transform_batch(self, X: pd.Series) -> pd.DataFrame:
        
        rolling_obj = X.rolling(window=self.window_sizes[0], center=False, closed='left')
        rolling_skewness = rolling_obj.skew()
        rolling_skewness = pd.DataFrame({
                                self.features_names: rolling_skewness
                           }).dropna()

        return rolling_skewness

    def transform(self, X: np.ndarray) -> np.ndarray:
        
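        # Drop missing values before computing the statistic
        X = X[~np.isnan(X)]
        if len(X) > 0:
            rolling_skewness = np.array([skew(X, bias=False)])
        else:
            # Return an array, as annotated, even when no values are available
            rolling_skewness = np.array([np.nan])
        
        return rolling_skewness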

In [10]:
window_features = RollingSkewness(window_sizes=3)
window_features.transform_batch(y)

datetime,rolling_skewness
1991-10-01,-1.696160
1991-11-01,0.897261
1991-12-01,-1.602797
1992-01-01,1.681518
1992-02-01,-0.778727
...,...
2008-02-01,1.359033
2008-03-01,-1.674974
2008-04-01,1.466482
2008-05-01,-0.747574


In [11]:
window_features.transform(X=np.array([6, 12, 8]))

array([0.93521953])
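
For a custom class to behave consistently, `transform` and `transform_batch` should return the same value for the same window. The sketch below checks this for the window `[6, 12, 8]` without scipy: it writes out the bias-corrected sample skewness in NumPy and compares it with the value returned by the pandas rolling `skew()`. The helper name `sample_skewness` is hypothetical and only serves this illustration.

```python
import numpy as np
import pandas as pd

def sample_skewness(x: np.ndarray) -> float:
    """Bias-corrected (adjusted Fisher-Pearson) sample skewness."""
    n = len(x)
    deviations = x - x.mean()
    m2 = np.mean(deviations ** 2)  # second central moment
    m3 = np.mean(deviations ** 3)  # third central moment
    g1 = m3 / m2 ** 1.5            # biased skewness
    return g1 * np.sqrt(n * (n - 1)) / (n - 2)

window = np.array([6.0, 12.0, 8.0])

manual = sample_skewness(window)
from_pandas = pd.Series(window).rolling(window=3).skew().iloc[-1]
print(round(manual, 6), round(from_pandas, 6))
```

If the two estimators differed (for example, a biased one in `transform` and a bias-corrected one in `transform_batch`), the features seen at prediction time would not match those used for training.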

In [12]:
forecaster = ForecasterRecursive(
                regressor       = LGBMRegressor(random_state=123, n_jobs=-1, verbose=-1),
                lags            = 3,
                window_features = window_features,
             )
forecaster.fit(y=y)
forecaster.predict(steps=5)

2008-07-01    0.769870
2008-08-01    0.824683
2008-09-01    0.819595
2008-10-01    0.799849
2008-11-01    0.802432
Freq: MS, Name: pred, dtype: float64