Relax constraints on test set for time series #3071

Merged: freddyaboulton merged 4 commits into main from 3070-relax-constraints-test-set-ts on Nov 18, 2021

Conversation

@freddyaboulton (Contributor)

Pull Request Description

Fixes #3070


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

codecov bot commented Nov 17, 2021

Codecov Report

Merging #3071 (76fe47e) into main (292d5aa) will increase coverage by 0.1%.
The diff coverage is 100.0%.


@@           Coverage Diff           @@
##            main   #3071     +/-   ##
=======================================
+ Coverage   99.8%   99.8%   +0.1%     
=======================================
  Files        312     312             
  Lines      30421   30433     +12     
=======================================
+ Hits       30330   30343     +13     
+ Misses        91      90      -1     
Impacted Files Coverage Δ
.../pipelines/time_series_classification_pipelines.py 100.0% <100.0%> (ø)
evalml/pipelines/time_series_pipeline_base.py 100.0% <100.0%> (+1.0%) ⬆️
.../tests/pipeline_tests/test_time_series_pipeline.py 99.8% <100.0%> (+0.1%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 292d5aa...76fe47e.

@ParthivNaresh (Contributor) left a comment:

Solid work mate, just left a question about our implementation of _move_index_forward. I think it might be difficult for OS users to understand the limitations of what they can do with forecasting time series when the test set is more than gap observations beyond the training set.

@@ -292,7 +292,7 @@
"You'll notice that in the code snippets here, we use the ``predict_in_sample`` pipeline method as opposed to the usual ``predict`` method. What's the difference?\n",
"\n",
"* ``predict_in_sample`` is used when the target value is known on the dates we are predicting on. This is true in cross validation. This method has an expected ``y`` parameter so that we can compute features using previous target values for all of the observations on the holdout set.\n",
"* ``predict`` is used when the target value is not known, e.g. the test dataset. The y parameter is not expected as only the target is observed in the training set. The test dataset must be separated by ``gap`` days from the training dataset. For the moment, the test set size must match ``forecast_horizon``.\n",
"* ``predict`` is used when the target value is not known, e.g. the test dataset. The y parameter is not expected as only the target is observed in the training set. The test dataset must be separated by ``gap`` days from the training dataset. For the moment, the test set size be less than or equal to ``forecast_horizon``.\n",
@ParthivNaresh (Contributor):
"the test set size must be less than..."

@@ -247,8 +247,11 @@ def predict(self, X, objective=None, X_train=None, y_train=None):
"Cannot call predict() on a component graph because the final component is not an Estimator."
)
X = infer_feature_types(X)
X.index = self._move_index_forward(
X_train.index[-X.shape[0] :], self.gap + X.shape[0]
)
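For reference, a minimal sketch of the index arithmetic the added lines perform, assuming `_move_index_forward(index, periods)` shifts the given index forward by `periods` steps (the concrete numbers are illustrative):

```python
import pandas as pd

# Training data indexed 0..99, a pipeline with gap=2, and a 5-row test set.
X_train_index = pd.RangeIndex(100)
gap = 2
n_test = 5  # X.shape[0]

# Take the last n_test training positions and shift them by gap + n_test,
# so the new test index starts gap + 1 steps past the end of training.
new_index = X_train_index[-n_test:] + (gap + n_test)
print(list(new_index))  # [102, 103, 104, 105, 106]
```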
@ParthivNaresh (Contributor):
I like this implementation but I had a question about it. Previously if a user were to set a forecast for indices that were more than gap units ahead of the training data, the second part of the warning above would be relevant: its index needs to start {self.gap + 1} values ahead of the training index.

[Screenshot: previous behavior, predict() raising the index error]

Now the same forecast attempt doesn't raise the error, but still sets the forecasts to the end of the training set.

[Screenshot: new behavior, predict() succeeding with the forecast index pinned to the end of the training set]

This is really useful for when the test set starts at 0 but if it continues the index from the end of the training set, this might give a false impression of future predictions. Maybe we should add a check here for test indices that begin more than gap observations away from the training set so the original error can still be raised.
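A hypothetical repro of that concern, with made-up data (training data indexed 0..99, gap=2, so forecasts land at 102..106 no matter where the test rows came from):

```python
# Rows whose original index (150..154) starts far more than gap steps
# beyond the end of the training data.
X_test = X_full.iloc[150:155]

preds = pipeline.predict(X_test, X_train=X_train, y_train=y_train)

# Before this change: ValueError, since the test index did not start
# gap + 1 values ahead of the training index.
# After this change: no error; the index is silently rewritten to 102..106,
# which may not be the horizon the user intended to forecast.
print(preds.index)
```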

@freddyaboulton (Contributor, Author):

Great question @ParthivNaresh ! I think the reason why I didn't want the check to be based on the index is that the index will probably not start at gap + 1 units away from the training/validation index.

My reasoning is that the kinds of examples we've been working off, where we manually split the same dataset into separate chunks, are not reflective of what happens in "real life". Users will give one dataset to AutoMLSearch (which will have an index from 0 to X.shape[0] or whatever), and then the test set will come from a different csv or file. When you read that in with pandas, the index will start at 0. I feel like it's awkward for users to have to re-index when we can do that calculation for them.

That being said, it does assume the user is uploading data that's temporally right after the dataset given to AutoMLSearch. We can check that via the new date_index parameter that we're now requiring!

I was actually thinking of doing that in a separate PR but the implementation should be small enough to do in this one maybe?
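A rough sketch of what that `date_index`-based check might look like (a hypothetical helper, not part of this PR):

```python
import pandas as pd

def validate_holdout_dates(X, X_train, date_index, gap, frequency):
    """Raise if the holdout data does not start gap + 1 periods after training."""
    offset = pd.tseries.frequencies.to_offset(frequency)
    last_train = pd.Timestamp(X_train[date_index].iloc[-1])
    first_test = pd.Timestamp(X[date_index].iloc[0])
    expected = last_train + (gap + 1) * offset
    if first_test != expected:
        raise ValueError(
            f"Holdout set must begin {gap + 1} periods after the end of the "
            f"training set: expected {expected}, got {first_test}."
        )
```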

@ParthivNaresh (Contributor):
I absolutely agree with the intent behind this. Users are almost guaranteed in the vast majority of cases to pass in a separate dataset for testing and forecasting from index 0, so this works great for that.

In the case that a user passes in a test set separately but tries to do something similar to above, where they might try and forecast values starting from [50 : ] as opposed to starting with the first observation, they'll run into a similar issue. But if that's going in as part of a separate PR I'm totally good with leaving it out of this one!

@freddyaboulton force-pushed the 3070-relax-constraints-test-set-ts branch from 03e071b to 76fe47e on November 18, 2021
@freddyaboulton merged commit 26d38a4 into main on Nov 18, 2021
@freddyaboulton deleted the 3070-relax-constraints-test-set-ts branch on November 19, 2021
@chukarsten mentioned this pull request on Nov 29, 2021