Improve time series data splitting #3616

Merged
eccabay merged 22 commits into main from 3287_ts_eval on Jul 22, 2022

@eccabay eccabay commented Jul 18, 2022

Contributor

Closes #3287

@eccabay eccabay changed the title from "3287 ts eval" to "Improve time series data splitting" Jul 18, 2022
codecov Bot commented Jul 18, 2022

Codecov Report

Merging #3616 (bd42deb) into main (4672902) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #3616     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        335     335             
  Lines      33504   33508      +4     
=======================================
+ Hits       33382   33386      +4     
  Misses       122     122             
Impacted Files                                         Coverage Δ
evalml/data_checks/ts_parameters_data_check.py         100.0% <ø> (ø)
evalml/tests/automl_tests/test_automl.py               99.5% <ø> (ø)
.../automl_tests/test_automl_search_classification.py  96.4% <ø> (ø)
...valml/tests/automl_tests/test_time_series_split.py  100.0% <ø> (ø)
...data_checks_tests/test_ts_parameters_data_check.py  100.0% <ø> (ø)
evalml/automl/engine/engine_base.py                    100.0% <100.0%> (ø)
.../preprocessing/data_splitters/time_series_split.py  96.7% <100.0%> (ø)
.../integration_tests/test_time_series_integration.py  100.0% <100.0%> (ø)
evalml/tests/utils_tests/test_gen_utils.py             100.0% <100.0%> (ø)
evalml/utils/gen_utils.py                              99.3% <100.0%> (+0.1%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@eccabay eccabay marked this pull request as ready for review July 21, 2022 14:42

@chukarsten chukarsten left a comment

Contributor

OK, just some minor nits. I think I get how this is working... it seems like a super minor change, passing the forecast horizon into the TSTimeSplit, had major results. Do we need to update the docs at all?

  self.n_splits = n_splits
- self._splitter = SkTimeSeriesSplit(n_splits=n_splits)
+ self._splitter = SkTimeSeriesSplit(
+     n_splits=n_splits, test_size=forecast_horizon or None

Contributor

lol what's going on with forecast_horizon? The docstring says int, but here it's None. Can't we just pass forecast_horizon here? The "or None" seems a little redundant, as it won't do anything to catch non-integer inputs.

Contributor Author

Yeah, this was an oversight on my part. The forecast horizon defaulting to 1 caused issues, so this was a workaround, but it's not actually necessary after updating the default value. I'll fix this.
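
As a quick check of the reviewer's point, here is a minimal sketch of what the expression `forecast_horizon or None` does to different inputs. The helper name `or_none` is hypothetical and exists only for this illustration; it is not part of EvalML.

```python
def or_none(forecast_horizon):
    """Mimic the `forecast_horizon or None` expression from the diff (hypothetical helper)."""
    return forecast_horizon or None


# Falsy inputs (None, 0) collapse to None; every other value passes through unchanged.
for value in (None, 0, 3, 2.5, "7"):
    print(repr(value), "->", repr(or_none(value)))
```

Non-integer values such as 2.5 or "7" pass straight through, so the expression only remaps falsy inputs and does no type validation.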

  @pytest.mark.parametrize(
      "gap,max_delay,forecast_horizon,n_splits",
-     [[7, 3, 1, 4], [0, 3, 2, 3], [1, 1, 1, 4]],
+     [[7, 3, 1, 5], [0, 8, 2, 3], [5, 4, 2, 4]],

Contributor

I'm not sure I can read these changes and know if they're correct.

Contributor Author

Should I refactor this to make it clearer?

@jeremyliweishih jeremyliweishih left a comment

Collaborator

Looks good to me, with some suggestions for improvements and some clarification questions. @eccabay and I spoke offline about the changes to the forecast horizon in the tests. The root cause is that a forecast_horizon of 1 errors out when scoring AUC, since AUC requires more than one class. The proposed solution is to remove AUC as a default objective for TS classification, and @eccabay will file an issue to track it.
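
To illustrate why a forecast_horizon of 1 is a problem for AUC: the test size now follows the forecast horizon, so a horizon of 1 gives a one-row validation fold, which can only ever contain one class. The sketch below is a plain-Python pairwise AUC, not EvalML's or scikit-learn's implementation; `pairwise_auc` is a hypothetical helper written for this example.

```python
def pairwise_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        # With a single class there are no pairs to compare, so AUC is undefined.
        raise ValueError("AUC is undefined when only one class is present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# A four-row fold with both classes can be scored.
print(pairwise_auc([0, 1, 1, 0], [0.1, 0.9, 0.4, 0.35]))

# A one-row fold has only one class, so scoring fails.
try:
    pairwise_auc([1], [0.7])
except ValueError as err:
    print("error:", err)
```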

Comment thread docs/source/release_notes.rst Outdated
Comment thread evalml/tests/automl_tests/test_automl.py Outdated
  self.max_delay = max_delay
  self.gap = gap
- self.forecast_horizon = forecast_horizon
+ self.forecast_horizon = forecast_horizon if forecast_horizon else 1

Collaborator

Would we want 1 if forecast_horizon isn't passed? Or is there a scenario where forecast_horizon isn't passed? If there isn't, I would argue that we should make the horizon a required parameter, or select a better default than 1 by looking at the time frequency of the passed-in data. This could be a low-priority follow-up issue, since we shouldn't be using it without forecast_horizon anyway!

Contributor Author

This situation is a little tricky, but I think we should keep forecast_horizon as an optional parameter.

Previously, the test size passed to scikit-learn's TimeSeriesSplit was None, so it defaulted to n_samples // (n_splits + 1). The only reason forecast_horizon was used in this function was to validate that the split size was large enough.

Notice here that self.forecast_horizon gets set to 1 if forecast_horizon is None. However, forecast_horizon, not self.forecast_horizon, is what's passed in to the SkTimeSeriesSplit. This way, if no forecast horizon is passed in, we maintain our historical behavior and only update the split size when a forecast horizon is explicitly set. Does that make sense?
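
The distinction between forecast_horizon and self.forecast_horizon can be sketched in plain Python. This is an illustration under stated assumptions, not EvalML's or scikit-learn's code: `expanding_window_splits` and `SketchTimeSeriesSplit` are hypothetical names, and the split arithmetic only mirrors the documented default test size of n_samples // (n_splits + 1), with no gap and no maximum train size.

```python
def expanding_window_splits(n_samples, n_splits, test_size=None):
    """Yield (train, test) index lists with an expanding training window."""
    if test_size is None:
        # Historical behavior: the default fold size depends on the data length.
        test_size = n_samples // (n_splits + 1)
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("Not enough samples for the requested splits")
    for test_start in range(first_test_start, n_samples, test_size):
        train = list(range(test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


class SketchTimeSeriesSplit:
    """Hypothetical stand-in for the splitter discussed in this thread."""

    def __init__(self, n_splits=3, forecast_horizon=None):
        self.n_splits = n_splits
        # Attribute used for validation only: falls back to 1 when unset.
        self.forecast_horizon = forecast_horizon if forecast_horizon else 1
        # The raw argument (possibly None) is what determines the test size.
        self._test_size = forecast_horizon

    def split(self, n_samples):
        return list(
            expanding_window_splits(n_samples, self.n_splits, self._test_size)
        )


# No horizon passed: test folds keep the default size of 20 // (3 + 1) rows.
for train, test in SketchTimeSeriesSplit(n_splits=3).split(20):
    print(len(train), test)

# Horizon of 2 passed explicitly: every test fold has exactly 2 rows.
for train, test in SketchTimeSeriesSplit(n_splits=3, forecast_horizon=2).split(20):
    print(len(train), test)
```

Storing the fallback value and the raw argument separately is what lets the unset case keep the old fold sizes while an explicit horizon shrinks each test fold to the horizon length.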

Comment thread evalml/tests/conftest.py Outdated
Comment thread evalml/utils/gen_utils.py
@eccabay eccabay enabled auto-merge (squash) July 22, 2022 18:40
@eccabay eccabay disabled auto-merge July 22, 2022 18:53
@eccabay eccabay enabled auto-merge (squash) July 22, 2022 19:47
@eccabay eccabay merged commit 54b53b0 into main Jul 22, 2022
@eccabay eccabay deleted the 3287_ts_eval branch July 22, 2022 20:02
@chukarsten chukarsten mentioned this pull request Jul 24, 2022

Linked issue: Improve Time Series Evaluation Algorithm

4 participants