
[BUG] Delay trimming in ForecastingGridSearchCV until after transforming #3132

Merged
merged 13 commits into from Aug 23, 2022

Conversation

miraep8
Collaborator

@miraep8 miraep8 commented Jul 29, 2022

@anthonygiorgio97 noticed and helpfully pointed out a strange bug that came up when using Differencer within TransformedTargetForecaster in ForecastingGridSearchCV. (See #2807 and #2880.)

From what I could tell this bug was caused by the fact that Differencer reduces the size of X. Usually this isn't a problem, but because ForecastingGridSearchCV was pre-trimming all X input before passing it into the underlying forecaster, X ended up being too small by the time it got to the final forecaster and was causing problems.
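The size mismatch can be illustrated with a toy sketch. This is plain pandas, not sktime's actual internals; the window length and the `diff` call merely stand in, conceptually, for the grid search's pre-trimming and for what Differencer does to its input:

```python
import pandas as pd

y = pd.Series(range(10))          # 10 observations

# Pre-trimming to a 5-element window *before* transforming,
# as ForecastingGridSearchCV effectively did:
window = y.iloc[-5:]              # 5 rows

# First-order differencing then drops one row, so the data
# arriving at the final forecaster is shorter than expected:
diffed = window.diff().dropna()   # only 4 rows remain

print(len(diffed))  # 4 -- one short of the 5 rows the forecaster needs
```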

My proposed solution is thus simple: avoid pre-trimming X in the _split function and instead trim it to the right size right before the actual forecasting step. (I do this specifically in _predict_last_window, but there is a decent chance it might need to change elsewhere as well.)
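The reordering described above can be sketched with the same toy example (again plain pandas, not the actual _predict_last_window code): transform the full series first, and trim only at the end.

```python
import pandas as pd

y = pd.Series(range(10))          # 10 observations

# New order: difference the *full* series first...
diffed = y.diff().dropna()        # 9 rows survive the transform

# ...then trim to the window length just before forecasting:
window = diffed.iloc[-5:]         # exactly 5 rows, as required

print(len(window))  # 5
```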

@miraep8
Collaborator Author

miraep8 commented Jul 29, 2022

As a note: tests are currently mostly passing (minus a weird CI/CD error), but only because I have commented out one of the tests in evaluate. I did this because not trimming X was causing some issues in ARIMA. Personally I think it makes more sense to test evaluate on a grid search object than on ARIMA (as my understanding is that this is what evaluate was designed to test), so my plan is to restructure that test with something other than ARIMA. (I also need to update the docs to make it clear that we are no longer trimming the same way as before.)

@fkiraly
Collaborator

fkiraly commented Jul 30, 2022

@miraep8, disagreed, evaluate needs to work with any estimator, and non-composite ones first, composite ones like grid search second.

There is nothing specific about grid search in the context of evaluate, even though both internally do temporal re-sampling.

I think your understanding of what evaluate does might be incorrect, hence the suggestion?

It does evaluation of an estimator, not tuning. Tuning and evaluation are two different things, although they are frequently confused.

@miraep8
Collaborator Author

miraep8 commented Jul 31, 2022

Yep, I do think I thought it was a tuning specific thing, thanks for catching that! :) I will put forward another solution that doesn't break evaluate for other forecasters.

@fkiraly
Collaborator

fkiraly commented Aug 6, 2022

checks seem to pass - is this ready?

@miraep8 miraep8 marked this pull request as ready for review August 6, 2022 23:06
Collaborator

@fkiraly fkiraly left a comment


Hm, this might solve the problem, but it introduces other problems in its stead:

  • the grid search gets a problematic trim_X argument, which changes nothing about the "formal" behaviour of the algorithm
  • evaluate gets the same argument, it does not seem to add to the specification semantics

Why I think this is a problem: think of the user journey. There does not seem to be a clear use of that parameter, as in expressing what the user wants. Instead, it is a fiddle/hack/magic argument, that the user needs to "know" how to set.

Also, it seems to introduce behavioural coupling between the grid search estimators and the evaluation function, not a good idea as it dilutes separated interfaces.

@fkiraly
Collaborator

fkiraly commented Aug 6, 2022

I would suggest instead: think along the lines of "all estimators do XYZ", don't introduce parameters that break the strategy pattern, or that are not directly useful as a clear element of semantic specification.

@miraep8
Collaborator Author

miraep8 commented Aug 7, 2022

I agree, @fkiraly! Thanks for looking this over and for the feedback.

@fkiraly
Collaborator

fkiraly commented Aug 10, 2022

Hm, looks simple enough, and localized.

Could we add a test that fails currently, but would be made to succeed after this change, i.e., a representative for the issue fixed?

@fkiraly
Collaborator

fkiraly commented Aug 15, 2022

Hmm, the test that you introduced does not break on main - so it isn't representative of the issue?
(It's a useful test, so let's keep it, but not as the representative.)

@fkiraly
Collaborator

fkiraly commented Aug 15, 2022

... does it break now?

@miraep8
Collaborator Author

miraep8 commented Aug 21, 2022

I had been trying some experimental work to delay trimming for both X and y, but I am now planning to split that into separate PRs.

@fkiraly
Collaborator

fkiraly commented Aug 22, 2022

Let us know when you feel this is ready for review.

@miraep8
Collaborator Author

miraep8 commented Aug 22, 2022

I would appreciate a look at it now!

@miraep8 miraep8 removed the request for review from aiwalter August 23, 2022 11:41
@miraep8 miraep8 requested a review from fkiraly August 23, 2022 11:41
Collaborator

@fkiraly fkiraly left a comment


Looks good to me!

In the end, not too much of a change, but of course the difficulty lies in the "where".

@fkiraly fkiraly merged commit 8a18f71 into sktime:main Aug 23, 2022