
[ENH] Get feature importance or coef from reduction forecaster #3849

Open
iki77 opened this issue Nov 29, 2022 · 6 comments
Labels: enhancement (Adding new functionality), feature request (New feature or request)

Comments

iki77 commented Nov 29, 2022

Is your feature request related to a problem? Please describe.
There is no way to find out the feature importances (non-linear models) or coefficients (linear models) from a reduction forecaster.

Describe the solution you'd like
Implement feature_importances_ and coef_ on the reduction forecaster, and on ForecastingPipeline when it wraps a reduction forecaster.

iki77 added the enhancement (Adding new functionality) label on Nov 29, 2022
fkiraly added the feature request (New feature or request) label on Nov 29, 2022
fkiraly (Collaborator) commented Nov 29, 2022

Have you tried the get_fitted_params method? That should produce feature importances or coefficients for the reduction forecaster.

If it does not, code that calls forecaster.get_fitted_params() after fitting the forecaster on sktime dummy data would be appreciated.
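
For reference, a minimal sketch of such a call (the dataset and regressor here are illustrative choices, not from this thread):

from sklearn.ensemble import RandomForestRegressor
from sktime.datasets import load_airline
from sktime.forecasting.compose import make_reduction

# illustrative dummy data; any sktime-compatible series would do
y = load_airline()
forecaster = make_reduction(
    RandomForestRegressor(random_state=0),
    window_length=12,
    strategy="recursive",
)
forecaster.fit(y, fh=[1, 2, 3])
# inspect which fitted parameters (e.g. feature importances) are exposed
print(forecaster.get_fitted_params())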

iki77 (Author) commented Nov 29, 2022

Thanks, I can get the feature importances, but it is not easy for a beginner:

  • Reduction forecaster with regressor as estimator:
    forecaster.get_fitted_params()["estimator"].feature_importances_
  • Reduction forecaster with regressor pipeline as estimator:
    forecaster.get_fitted_params()["estimator"].named_steps["regressor"].feature_importances_
  • Forecasting pipeline with reduction forecaster:
    forecaster_pipe.get_fitted_params()["forecaster"].get_fitted_params()["estimator"].named_steps["regressor"].feature_importances_

It would be helpful if the reduction forecaster and forecasting pipeline had feature_importances_ & coef_ attributes that function as shortcuts.
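
Until then, a small helper can hide that drilling. A rough sketch, not part of sktime, assuming the get_fitted_params layout shown above:

from sklearn.pipeline import Pipeline

def get_reduction_importances(forecaster):
    # works on a fitted reduction forecaster; returns feature_importances_
    # or coef_ of the inner sklearn estimator, unwrapping a Pipeline if needed
    est = forecaster.get_fitted_params()["estimator"]
    if isinstance(est, Pipeline):
        est = est.steps[-1][1]  # final step of the wrapped sklearn pipeline
    for attr in ("feature_importances_", "coef_"):
        if hasattr(est, attr):
            return getattr(est, attr)
    return None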

fkiraly (Collaborator) commented Nov 29, 2022

Ok, does it not show up under the key "estimator__feature_importances" if you use version 0.14.0?
In the second case, under the key "estimator__named_steps__regressor__feature_importances"?

If not, it is a bug; please report it with the full code that produces the first example.

The second one might not be solvable though, as sklearn does not have a get_fitted_params interface.

iki77 (Author) commented Nov 30, 2022

Only case 1 works; here is the code:

import pandas as pd

from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sktime.forecasting.compose import make_reduction
from sktime.forecasting.compose import ForecastingPipeline
from sktime.transformations.series.summarize import WindowSummarizer
from sktime.utils._testing.hierarchical import _make_hierarchical  # generates the dummy data below

y = _make_hierarchical(n_columns=3, random_state=42)
X = y.iloc[:, :-1]
y = y.iloc[:, -1:]

regressor = RandomForestRegressor(random_state=42)
regressor_pipe = Pipeline(
    steps=[("regressor", regressor)]
)

# CASE 1 - WORKS
forecaster_1 = make_reduction(
    regressor,
    scitype="tabular-regressor",
    transformers=[WindowSummarizer(n_jobs=1)],
    window_length=None,
    strategy="recursive",
    pooling="global",
)
forecaster_1.fit(y=y, X=X)
forecaster_1.get_fitted_params()["estimator__feature_importances"] 

# CASE 2 - DOES NOT WORK
forecaster_2 = make_reduction(
    regressor_pipe,
    scitype="tabular-regressor",
    transformers=[WindowSummarizer(n_jobs=1)],
    window_length=None,
    strategy="recursive",
    pooling="global",
)
forecaster_2.fit(y=y, X=X)
forecaster_2.get_fitted_params()["estimator__named_steps__regressor__feature_importances"] # DOES NOT EXIST

# CASE 3 - DOES NOT WORK
forecaster_pipe = ForecastingPipeline(
    steps=[
        ("exog_dynamics", WindowSummarizer(n_jobs=1, target_cols=X.columns.tolist())),
        ("forecaster", forecaster_2),
    ]
)
forecaster_pipe.fit(y=y, X=X)

fkiraly (Collaborator) commented Dec 1, 2022

Ah!

Thanks for the example.

I think this is an instance of an incomplete feature, i.e., implementing get_fitted_params for the forecasting pipelines here: item 5 on the list in
#1497

In general, the special composites with get_params overrides also need get_fitted_params overrides; ordinary composites should be covered by the base class already.

I'll have a look at it; it should not be too difficult to add given the existing default implementations.
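
For illustration, such an override might look roughly like this. A sketch only, assuming the composite keeps its fitted steps as (name, estimator) pairs in a steps_ attribute and that each step supports get_fitted_params; this is not the actual sktime implementation:

def _get_fitted_params(self):
    # collect fitted params of each fitted step under "stepname__key" keys,
    # mirroring the prefixing convention that get_params uses for nesting
    fitted_params = {}
    for name, est in self.steps_:
        for key, value in est.get_fitted_params().items():
            fitted_params[f"{name}__{key}"] = value
    return fitted_params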

fkiraly (Collaborator) commented Dec 1, 2022

Example 3 is addressed by this: #3863

Example 2 is more difficult, since fitted parameters are not passed on within sklearn, and the "underscore at the end" convention for fitted attributes is not applied consistently within sklearn itself - e.g., it's called named_steps rather than named_steps_.

To fix this, one would have to write an estimator crawler specifically for scikit-learn - it might be easier to integrate a coherent get_fitted_params interface with sktime? Here's the package for it: https://github.com/sktime/skbase ...
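
A rough sketch of what such a crawler could look like (illustrative only, not existing sktime or skbase code):

from sklearn.pipeline import Pipeline

def crawl_fitted_params(est, prefix=""):
    # recursively collect public trailing-underscore (fitted) attributes of a
    # fitted sklearn object, descending into Pipeline steps
    params = {}
    if isinstance(est, Pipeline):
        for name, step in est.named_steps.items():
            params.update(crawl_fitted_params(step, prefix=f"{prefix}{name}__"))
        return params
    for attr in dir(est):
        if attr.endswith("_") and not attr.startswith("_"):
            try:
                params[prefix + attr.rstrip("_")] = getattr(est, attr)
            except AttributeError:
                continue  # declared as a property but not set on this estimator
    return params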
