# Feature importance 

The importance of the predictors included in a forecaster can be obtained with the method `get_feature_importance`. This method reads the `coef_` or `feature_importances_` attribute of the internal regressor, depending on which one the regressor exposes.

<script src="https://kit.fontawesome.com/d20edc211b.js" crossorigin="anonymous"></script>

<div class="admonition note" name="html-admonition" style="background: rgba(255,145,0,.1); padding-top: 0px; padding-bottom: 6px; border-radius: 8px; border-left: 8px solid #ff9100;">

<p class="title">
    <i class="fa-triangle-exclamation fa" style="font-size: 18px; color:#ff9100;"></i>
    <b> &nbsp; Warning</b>
</p>

This method only returns values if the regressor used inside the forecaster has the attribute `coef_` or `feature_importances_`.

</div>
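The lookup described above can be sketched in plain Python. This is a minimal illustration, not skforecast's actual implementation: the helper `extract_importance`, the three stand-in classes, the order in which the two attributes are checked, and the `None` return for unsupported regressors are all assumptions made for the example.

```python
# Hypothetical helper: mimics the idea of reading importance from a fitted
# regressor. It is NOT the code used inside skforecast.
def extract_importance(regressor, feature_names):
    """Return (feature, importance) pairs, or None if no attribute exists."""
    if hasattr(regressor, "feature_importances_"):
        # Tree-based models such as RandomForestRegressor expose this attribute.
        values = regressor.feature_importances_
    elif hasattr(regressor, "coef_"):
        # Linear models such as Ridge expose their coefficients instead.
        values = regressor.coef_
    else:
        # Assumption: nothing to report for regressors without either attribute.
        return None
    return list(zip(feature_names, values))


# Stand-in objects (hypothetical) so the sketch runs without fitting a model.
class TreeLike:
    feature_importances_ = [0.7, 0.3]


class LinearLike:
    coef_ = [0.5, -0.2]


class NoImportance:
    pass


names = ["lag_1", "lag_2"]
tree_result = extract_importance(TreeLike(), names)
linear_result = extract_importance(LinearLike(), names)
missing_result = extract_importance(NoImportance(), names)

print(tree_result)
print(linear_result)
print(missing_result)
```

Note that values read from `coef_` are signed coefficients, while values read from `feature_importances_` are non-negative, so the two kinds of output are not directly comparable.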

## Libraries

In [7]:
# Libraries
# ==============================================================================
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from skforecast.ForecasterAutoreg import ForecasterAutoreg
from skforecast.ForecasterAutoregDirect import ForecasterAutoregDirect

## Data

In [8]:
# Download data
# ==============================================================================
url = ('https://raw.githubusercontent.com/JoaquinAmatRodrigo/skforecast/master/data/h2o_exog.csv')
data = pd.read_csv(url, sep=',', header=0, names=['date', 'y', 'exog_1', 'exog_2'])

# Data preprocessing
# ==============================================================================
data['date'] = pd.to_datetime(data['date'], format='%Y/%m/%d')
data = data.set_index('date')
data = data.asfreq('MS')

## Extract feature importance from trained forecaster

In [9]:
# Create and fit forecaster
# ==============================================================================
forecaster = ForecasterAutoreg(
                regressor = RandomForestRegressor(),
                lags = 5
             )

forecaster.fit(y=data['y'], exog=data[['exog_1', 'exog_2']])

# Predictors importance
# ==============================================================================
forecaster.get_feature_importance()

|   | feature | importance |
|---|---------|------------|
| 0 | lag_1   | 0.533117   |
| 1 | lag_2   | 0.097804   |
| 2 | lag_3   | 0.029198   |
| 3 | lag_4   | 0.073071   |
| 4 | lag_5   | 0.053683   |
| 5 | exog_1  | 0.042011   |
| 6 | exog_2  | 0.171117   |


In [10]:
# Create and fit forecaster
# ==============================================================================
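forecaster = ForecasterAutoreg(
                regressor = Ridge(),
                lags = 5
             )

forecaster.fit(y=data['y'], exog=data[['exog_1', 'exog_2']])

# Predictors importance (for Ridge, these are the signed model coefficients)
# ==============================================================================
forecaster.get_feature_importance()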

|   | feature | importance |
|---|---------|------------|
| 0 | lag_1   | 0.327688   |
| 1 | lag_2   | -0.073593  |
| 2 | lag_3   | -0.152202  |
| 3 | lag_4   | -0.217106  |
| 4 | lag_5   | -0.1458    |
| 5 | exog_1  | 0.379798   |
| 6 | exog_2  | 0.668162   |


Since `ForecasterAutoregDirect` fits one model per step, it is necessary to specify the step whose model the feature importance should be retrieved from.

In [11]:
# Create and fit forecaster
# ==============================================================================
forecaster = ForecasterAutoregDirect(
                regressor = RandomForestRegressor(),
                steps = 10,
                lags = 5
             )

forecaster.fit(y=data['y'], exog=data[['exog_1', 'exog_2']])

# Predictors importance
# ==============================================================================
forecaster.get_feature_importance(step=1)

|   | feature | importance |
|---|---------|------------|
| 0 | lag_1   | 0.53758    |
| 1 | lag_2   | 0.102076   |
| 2 | lag_3   | 0.024955   |
| 3 | lag_4   | 0.074393   |
| 4 | lag_5   | 0.051036   |
| 5 | exog_1  | 0.05578    |
| 6 | exog_2  | 0.15418    |
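Because each step has its own model, it can be useful to collect the importance for every step at once. The following is a minimal sketch under stated assumptions: the helper `importance_by_step` and the stub class are hypothetical, and steps are assumed to be numbered from 1 to `steps`. The only call taken from the example above is `get_feature_importance(step=...)`. The stub stands in for a fitted `ForecasterAutoregDirect` so that the snippet runs without data.

```python
# Hypothetical helper: gathers the per-step importance of a direct forecaster.
def importance_by_step(forecaster, steps):
    """Map each step (assumed to run from 1 to `steps`) to its importance."""
    return {
        step: forecaster.get_feature_importance(step=step)
        for step in range(1, steps + 1)
    }


# Stand-in for a fitted ForecasterAutoregDirect (hypothetical, for illustration).
class StubDirectForecaster:
    def __init__(self, steps):
        self.steps = steps

    def get_feature_importance(self, step):
        # Reject steps outside the range the stub was created with.
        if not 1 <= step <= self.steps:
            raise ValueError(f"step must be between 1 and {self.steps}")
        # Made-up values; a real forecaster returns one row per predictor.
        return [("lag_1", 1.0 / step)]


result = importance_by_step(StubDirectForecaster(steps=3), steps=3)
for step, importance in result.items():
    print(step, importance)
```

With the forecaster fitted above, the equivalent call would be `importance_by_step(forecaster, steps=10)`, which returns one result per step.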

