-
Notifications
You must be signed in to change notification settings - Fork 855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[timeseries] add feature importance to TimeSeriesPredictor #4033
[timeseries] add feature importance to TimeSeriesPredictor #4033
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very elegant!
The main points that I want to discuss:
- Moving all feature importance logic to Trainer
- Using statistics of train data for masking
- Use full train_data/tuning_data by default
- Fixing seed for determinism
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
) | ||
|
||
if features is None: | ||
features = feature_importance_transform.covariate_metadata.all_features |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible that some features are present in data.columns
or data.static_features.columns
but are missing from covariate_metadata.all_features
because TimeSeriesFeatureGenerator
removes non-informative features from the data (e.g., duplicate feature, feature consisting of all constant values, feature that takes different value for each row, unrecognized dtype time datetime
).
We should manually check the columns in original data and assign feature importance of 0 to columns that are missing from transformed data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what is done in Tabular I believe
Job PR-4033-32a1e3d is done. |
|
Job PR-4033-c0a0175 is done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added initial review
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
Thanks for the review @Innixma ! This was the first cut trying to pin down the design. I'll base my replies on that. |
|
Job PR-4033-fe04e5f is done. |
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
|
|
Job PR-4033-0898cb0 is done. |
|
1 similar comment
|
|
timeseries/src/autogluon/timeseries/models/autogluon_tabular/mlforecast.py
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/trainer/abstract_trainer.py
Outdated
Show resolved
Hide resolved
Job PR-4033-5051648 is done. |
|
c517448
to
68b8095
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you for taking care of this massive feature! 🚀
timeseries/src/autogluon/timeseries/models/abstract/abstract_timeseries_model.py
Show resolved
Hide resolved
|
Job PR-4033-0c74a01 is done. |
Issue #, if available:
#3924
Description of changes:
This draft PR introduces the high level design for
TimeSeriesPredictor.feature_importance
.Additional to-dos:
(stretch) Revisit prediction caching logic to prevent models that don't use features from re-inferringBy submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.