## Choosing window features using LASSO

In [54]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Lasso
from sklearn.pipeline import Pipeline, make_pipeline

from feature_engine.timeseries.forecasting import WindowFeatures, LagFeatures, ExpandingWindowFeatures
from feature_engine.imputation import DropMissingData

from sktime.utils.plotting import plot_series


## Data
We will work with the hourly electricity demand dataset, which records electricity demand for the state of Victoria, Australia, from 2002 to the start of 2015.


In [3]:
data = pd.read_csv('../../Datasets/victoria_electricity_demand.csv', 
                   usecols=["demand", "temperature", "date_time"], 
                   index_col='date_time', parse_dates=['date_time'])

# For this demo we will use a subset of the data
data = data.loc["2010":]

# plot
plot_series(data['demand'])
plt.xticks(rotation=30);

<img src='./plots/victoria-electricity-deman-2010-2015.png'>

#### Lag features
* Create recent lags like lag-1, lag-2, lag-3
* Create a lag of 24hr
* Create a lag of 1-week


In [6]:
lag_transformer = LagFeatures(variables=['demand', 'temperature'], periods=[1,2,3, 24, 7*24])
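
To make the idea of a lag feature concrete, here is a minimal sketch that builds lags with plain pandas on a made-up hourly series. The toy values and the `toy` DataFrame are assumptions for illustration only; the column names simply mirror the `demand_lag_1` style seen in this notebook's output.

```python
import pandas as pd

# Toy hourly series with made-up values (not the Victoria dataset)
idx = pd.date_range("2010-01-01", periods=5, freq=pd.Timedelta(hours=1))
toy = pd.DataFrame({"demand": [10.0, 12.0, 11.0, 15.0, 14.0]}, index=idx)

# A lag-k feature is the value observed k rows (here: k hours) earlier
for lag in (1, 2):
    toy[f"demand_lag_{lag}"] = toy["demand"].shift(lag)

print(toy)
```

The first `k` rows of each lag-`k` column are missing because there is no earlier observation to copy from.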

#### Window Features
* Aggregation function 
    * "mean"
    * "std"
    * "kurt"
    * "skew"
* Window size
    * Daily (24 hours)
    * Weekly (24 hours * 7 days = 168 hours)
    * Monthly, approximated as 4 weeks (24 * 7 * 4 = 672 hours)
    * Yearly, approximated as 48 weeks (24 * 7 * 4 * 12 = 8064 hours, i.e. 336 days)

In [7]:
window_transformer = WindowFeatures( 
    variables=['demand', 'temperature'], 
    window=[24, 24*7, 24*7*4, 24*7*4*12],
    functions=["mean", "std", "kurt", "skew"],
    periods=1, freq='H'    
)
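
As a rough illustration of what a window feature is, the sketch below computes a rolling mean with plain pandas and shifts it by one step so that each row only sees past values. It uses a toy series and a window of 3 instead of 24; shifting by rows rather than by a frequency is an assumption that is only equivalent on a regular, gap-free hourly index.

```python
import pandas as pd

# Toy hourly series with made-up values (not the Victoria dataset)
idx = pd.date_range("2010-01-01", periods=6, freq=pd.Timedelta(hours=1))
demand = pd.Series([10.0, 12.0, 11.0, 15.0, 14.0, 13.0], index=idx, name="demand")

window = 3  # small window for readability; the notebook uses 24, 168, ...

# Rolling mean over the last `window` rows, then shift by one row so the
# feature at time t is built only from observations up to t-1
feature = (
    demand.rolling(window=window)
    .mean()
    .shift(1)
    .rename(f"demand_window_{window}_mean")
)

print(pd.concat([demand, feature], axis=1))
```

Without the shift, the current value of `demand` would leak into its own feature.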

#### Expanding features

* Aggregation function 
    * "mean"
    * "std"
    * "kurt"
    * "skew"

In [8]:
expanding_transformer = ExpandingWindowFeatures(
    variables=['demand','temperature'],
    functions=["mean", "std", "kurt", "skew"],
    periods=1, freq='H'
)
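
An expanding feature summarises all observations seen so far rather than a fixed-size window. A minimal pandas sketch on a toy series follows; the `demand_expanding_mean` name is an assumed naming convention used here for illustration.

```python
import pandas as pd

# Toy hourly series with made-up values (not the Victoria dataset)
idx = pd.date_range("2010-01-01", periods=6, freq=pd.Timedelta(hours=1))
demand = pd.Series([10.0, 12.0, 11.0, 15.0, 14.0, 13.0], index=idx, name="demand")

# Mean of everything observed so far, shifted by one row so that the
# feature at time t uses only observations up to t-1
feature = demand.expanding().mean().shift(1).rename("demand_expanding_mean")

print(pd.concat([demand, feature], axis=1))
```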

#### Drop missing data introduced by window and lag features

In [9]:
drop_na_transformer = DropMissingData()
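
The sketch below shows, with plain pandas on toy data, why rows go missing and how many are lost. It assumes the rolling window needs a full window of past values before it emits a number (the pandas default for an integer window), so the widest window determines how many leading rows are dropped.

```python
import numpy as np
import pandas as pd

# Toy hourly series with made-up values (not the Victoria dataset)
idx = pd.date_range("2010-01-01", periods=10, freq=pd.Timedelta(hours=1))
toy = pd.DataFrame({"demand": np.arange(10, dtype=float)}, index=idx)

toy["demand_lag_1"] = toy["demand"].shift(1)
toy["demand_window_4_mean"] = toy["demand"].rolling(4).mean().shift(1)

# Drop every row that has at least one missing feature value
clean = toy.dropna()
rows_lost = len(toy) - len(clean)

print(f"rows before: {len(toy)}, rows after: {len(clean)}, lost: {rows_lost}")
```

Here the window of 4 costs the first four rows, while the lag of 1 on its own would cost only one.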

#### Feature pipeline

In [10]:
feature_pipe = Pipeline(steps=[
    ('lags', lag_transformer),
    ('window_summary', window_transformer),
    ('expanding-window-summary', expanding_transformer),
    ('drop-na',drop_na_transformer)
])

In [11]:
df = feature_pipe.fit_transform(data)

#### Split the data into a training set and test set

In [51]:
split_date = df.index[-1] - pd.Timedelta(days=1)

train_df = df[df.index <= split_date]
test_df = df[df.index > split_date]

print('Train data ends at :',train_df.index[-1])
print('Test data start at :',test_df.index[0])

Train data ends at : 2015-02-27 23:00:00
Test data start at : 2015-02-28 00:00:00
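
The same split logic can be checked on a small synthetic hourly index. This is a sketch with made-up data; only the date range is chosen to end at the same hour as the output above.

```python
import pandas as pd

# Synthetic hourly index ending on 2015-02-28 23:00 (values are made up)
idx = pd.date_range("2015-02-25", "2015-02-28 23:00", freq=pd.Timedelta(hours=1))
toy = pd.DataFrame({"demand": range(len(idx))}, index=idx)

# Hold out everything after (last timestamp - 1 day) as the test set
split_date = toy.index[-1] - pd.Timedelta(days=1)
train = toy[toy.index <= split_date]
test = toy[toy.index > split_date]

print("Train ends at  :", train.index[-1])
print("Test starts at :", test.index[0])
print("Test rows      :", len(test))
```

Splitting by time rather than at random keeps all test observations strictly after the training observations.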


## Use LASSO for feature selection

In [53]:
features = train_df.drop(columns=['demand', 'temperature'])
target = train_df['demand']

In [58]:
model_pipe = make_pipeline(StandardScaler(), Lasso(alpha=1))
model_pipe.fit(features, target)

lasso_coef = model_pipe[-1].coef_

In [63]:
feature_importance = pd.DataFrame(data=lasso_coef, columns=['Feature Importance'], index=features.columns)
feature_importance.abs().nlargest(n=15, columns=['Feature Importance'])

| Feature | Feature Importance |
|---|---|
| demand_lag_1 | 2190.864887 |
| demand_lag_2 | 877.442149 |
| demand_lag_168 | 214.023662 |
| temperature_lag_1 | 191.264622 |
| demand_window_24_mean | 147.228444 |
| temperature_lag_3 | 103.872531 |
| demand_lag_24 | 102.21162 |
| demand_window_24_skew | 62.94547 |
| temperature_lag_168 | 54.882168 |
| demand_window_168_mean | 41.652587 |


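We can see that the lag features have the largest coefficients, with `demand_lag_1` dominating, but some window features (for example `demand_window_24_mean`) are also selected. This is to be expected, as the most recent observations tend to be very predictive of the next observation. Because the features were standardized before fitting, the absolute coefficient values are comparable across features.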
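
LASSO performs selection by shrinking some coefficients to exactly zero, so the selected features are those with a non-zero coefficient. The sketch below illustrates this on synthetic data; the feature names, the data-generating process and `alpha=0.2` are all assumptions chosen for the demonstration, not values from this notebook.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: two informative features and two pure-noise features
rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.normal(size=(500, 4)),
    columns=["f_strong", "f_weak", "f_noise_1", "f_noise_2"],
)
y = 5.0 * X["f_strong"] + 1.0 * X["f_weak"] + rng.normal(scale=0.5, size=500)

# Standardize so that coefficient magnitudes are comparable, then fit LASSO
pipe = make_pipeline(StandardScaler(), Lasso(alpha=0.2))
pipe.fit(X, y)

coefs = pd.Series(pipe[-1].coef_, index=X.columns)

# Keep only the features whose coefficient was not shrunk to zero
selected = coefs[coefs != 0].abs().sort_values(ascending=False)
print(selected)
```

A larger `alpha` removes more features and a smaller one keeps more, so in practice the value is usually tuned, for example with time-series cross-validation.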