-
Notifications
You must be signed in to change notification settings - Fork 862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[timeseries] Add native support for missing values #3995
Conversation
|
1 similar comment
|
bdf53bc
to
640f677
Compare
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall thanks a lot for this! Dropped a few minor comments and questions.
timeseries/src/autogluon/timeseries/models/abstract/abstract_timeseries_model.py
Outdated
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/models/abstract/abstract_timeseries_model.py
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/models/abstract/abstract_timeseries_model.py
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/models/local/abstract_local_model.py
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/models/local/abstract_local_model.py
Show resolved
Hide resolved
timeseries/src/autogluon/timeseries/models/autogluon_tabular/mlforecast.py
Show resolved
Hide resolved
|
Job PR-3995-7ff79e4 is done. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Only one comment.
if known_covariates_names == []: | ||
assert known_covariates_transformed is None | ||
else: | ||
assert not known_covariates_transformed[known_covariates_names].isna().any(axis=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great and I'll probably use it for feature importance.
Should we also test the filling logic, at least for the 'median' and 'mode' scenarios?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, added a test for that in test_learner
to ensure that this logic works after loading from disk
7ff79e4
to
bdf8f67
Compare
|
bdf8f67
to
b055879
Compare
|
1 similar comment
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @canerturkmen, I've added the tests for mode/median imputation + added missing values to DUMMY_TS_DATAFRAME
so that we consider more settings with NaNs
@@ -46,7 +46,7 @@ ignore-words-list = 'mape,ans,2st,fo,nd,te,fpr,coo,rouge' | |||
|
|||
|
|||
[tool.ruff] | |||
ignore = [ | |||
lint.ignore = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes a deprecation warning from ruff
if known_covariates_names == []: | ||
assert known_covariates_transformed is None | ||
else: | ||
assert not known_covariates_transformed[known_covariates_names].isna().any(axis=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, added a test for that in test_learner
to ensure that this logic works after loading from disk
@canerturkmen just finished some benchmarking on 12 datasets with missing values / 3 folds each: This PR branch has 65% win rate vs. current |
|
9122bbc
to
51aeea5
Compare
@@ -86,25 +86,6 @@ def test_when_local_model_saved_then_local_model_args_are_saved(model_class, hyp | |||
assert dict_equal_primitive(model._local_model_args, loaded_model._local_model_args) | |||
|
|||
|
|||
@pytest.mark.parametrize("model_class", TESTABLE_MODELS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have the same test here https://github.com/autogluon/autogluon/blob/master/timeseries/tests/unittests/models/test_models.py#L269.
|
|
Job PR-3995-e50709a is done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 🚀
Issue #, if available: fixes #3886
Description of changes:
target
column inTimeSeriesPredictor._check_and_prepare_data_frame
, we let each model use its own logic for handling missing values._get_tags()
mechanismTimeSeriesPredictor
now removes time series consisting of only NaN values fromtrain_data
duringfit()
TimeSeriesFeatureGenerator
DUMMY_TS_DATAFRAME
used in the tests to ensure that NaN support works in all scenariosTo do:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.