Update ProphetModel to handle external timestamp #203
Conversation
# Conflicts:
#   tests/test_models/test_inference/test_forecast.py
#   tests/test_models/test_inference/test_predict.py
🚀 Deployed on https://deploy-preview-203--etna-docs.netlify.app
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@           Coverage Diff            @@
##   unaligned-data    #203    +/-   ##
========================================
  Coverage        ?   89.88%
========================================
  Files           ?      198
  Lines           ?    13231
  Branches        ?        0
========================================
  Hits            ?    11893
  Misses          ?     1338
  Partials        ?        0
========================================
```

☔ View full report in Codecov by Sentry.
```python
if not pd.api.types.is_datetime64_dtype(df[self.timestamp_column]):
    raise ValueError("Invalid timestamp_column! Only datetime type is supported.")

if len(df[self.timestamp_column]) >= 3 and pd.infer_freq(df[self.timestamp_column]) is None:
```
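As a standalone illustration (not part of the PR's code), this is how `pd.infer_freq` behaves on regular vs. irregular timestamps, which is what the second check relies on:

```python
import pandas as pd

# Regular daily timestamps: pandas can infer a single frequency.
regular = pd.Series(pd.date_range("2021-01-01", periods=5, freq="D"))
print(pd.infer_freq(regular))  # prints "D"

# Irregular timestamps (a gap between Jan 2 and Jan 5): no single
# frequency fits, so infer_freq returns None and the check above fails.
irregular = pd.Series(pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-05"]))
print(pd.infer_freq(irregular))  # prints None
```

Note that `pd.infer_freq` raises on fewer than 3 timestamps, which is why the check is guarded by `len(...) >= 3`.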
It doesn't check that the frequency is always the same. For example, it works fine if we have one frequency for train and another for test. In theory, Prophet can work fine even if there is no regular frequency, but I'm not sure whether we should support this case or not.
We could probably infer the frequency during training, for example, and then check at inference that it is as expected.
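As a sketch of that idea (the class and method names are hypothetical, not the PR's API), assuming pandas is available:

```python
import pandas as pd


class TimestampFreqValidator:
    """Hypothetical sketch: remember the frequency inferred during fit
    and fail at inference time if the new data has a different one."""

    def __init__(self, timestamp_column: str):
        self.timestamp_column = timestamp_column
        self._fit_freq = None

    def fit(self, df: pd.DataFrame) -> "TimestampFreqValidator":
        # Store the frequency inferred from the training data.
        self._fit_freq = pd.infer_freq(df[self.timestamp_column])
        return self

    def validate(self, df: pd.DataFrame) -> None:
        # Fail loudly instead of warning (one of the options discussed
        # in this thread).
        freq = pd.infer_freq(df[self.timestamp_column])
        if freq != self._fit_freq:
            raise ValueError(
                f"Frequency mismatch: fitted with {self._fit_freq!r}, got {freq!r}."
            )
```

Calling `validate` on data whose inferred frequency differs from the fitted one raises, matching the "fail in this situation" option rather than the warning option.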
I think we should check the freq and give a warning if the freq for the train differs from the freq for the test.
I don't like the idea of a warning. It would be thrown by every per-segment model, which seems like too much.
I think it is a bad idea to have different frequencies, so the only option is to fail in this situation.
# Conflicts:
#   CHANGELOG.md
#   tests/test_models/test_inference/test_forecast.py
#   tests/test_models/test_inference/test_predict.py
Why don't we add this code to tests/conftest.py?
For now, it seems specific to inference tests. I'll think about moving it higher.
tests/test_models/test_prophet.py
Outdated
```python
@@ -15,14 +16,26 @@
from tests.test_models.utils import assert_sampling_is_valid


@pytest.fixture
```
Code duplication (like in tests/test_models/test_inference/conftest.py)
@d-a-bunin What do you think about Prophet picking the timestamp implicitly? We would then need to specify this implicit logic.
Where do you think this could be useful? I can see it being useful for working with pre-defined models on different datasets, e.g. in AutoML, where the set of configs is pre-defined.
We could make something simple (like it works now) and easy to improve, and then improve it according to our needs later.
No, not if we choose a random one.
It's just simpler to support. You've changed a lot of code here. Do we really want to change so many places just to support a corner case?
Because we already have similar behaviour for other transforms.
It works unpredictably; I don't think it is a good idea. Moreover, you suggested selecting the first one in your first message.
I don't think so. We would have to make the same code changes plus add the logic for automatic column detection. The current solution could be extended into an automatic one in the future by adding, e.g., a parameter.
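For concreteness, a hypothetical sketch of what such automatic detection could look like (the helper name and its fail-on-ambiguity behaviour are assumptions, not the PR's code):

```python
import pandas as pd


def detect_timestamp_column(df: pd.DataFrame) -> str:
    """Hypothetical helper: pick the timestamp column automatically.

    Fails rather than guessing when the choice is ambiguous, to avoid
    the unpredictable "random column" behaviour discussed above.
    """
    candidates = [
        col for col in df.columns if pd.api.types.is_datetime64_dtype(df[col])
    ]
    if len(candidates) != 1:
        raise ValueError(f"Expected exactly one datetime column, found {candidates!r}.")
    return candidates[0]
```

With exactly one datetime column the choice is unambiguous; with zero or several, it raises instead of silently picking one.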
# Conflicts:
#   tests/test_models/test_inference/test_forecast.py
Before submitting (must do checklist)
Proposed Changes
See #186.
Closing issues
Closes #186.