New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] fix temporal_train_test_split
for hierarchical and panel data in case where fh
is not passed
#5330
Conversation
…_split-hierarchical
…_split-hierarchical
not sure why this fails - perhaps undocumented part of contract? |
are you referring to the fh argument? I think the original author may not follow the convention of writing the docstring but he/she did specify the fh's expected input type. |
no, to the failure that seems to arise from sth numpy |
works now - re-requested review @hazrulakmal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_size=test_size, | ||
train_size=train_size, | ||
) | ||
return y_train, y_test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably does not need an else block, since if block contains a return already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the else block made the condition easier to spot for a reader of the code, because it matches the indentation and can be tracked up to the if
condition.
If we remove the else
, the if
condition will be harder to find.
This is my rationale (in general) to do this even if you have if
-return
, but I'm not too strongly opined about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For instance, I did the opposite with if fh is not None:
, because the else
was very long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a change to the code flow, and added comments - better? I went with your suggestion, but added comments to the fh
and the X
branching to make it clear to the reader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a non-blocking comment really. Usually tools like pylint
will complain against these (ref. R1705).
The change looks good to me. pylint
or etc. may still complain that function return signature vary between different return statements (ref. R1710), but I don't think there's a way to solve that here at all.
Hm, we should also perhaps add tests for this? The part that solves this imo is as I claimed in the PR preamble: "The issue with the panel data is solved by recurring to the existing broadcasting boilerplate of In more detail, It does not resolve #4968, as this deals only with the case where
To point you to the precise line, there are multiple of relevance, all in
|
temporal_train_test_split
for hierarchical and panel pandas
datatemporal_train_test_split
for hierarchical and panel data
temporal_train_test_split
for hierarchical and panel datatemporal_train_test_split
for hierarchical and panel data in case where fh
is not passed
* origin/main: (24 commits) [MNT] skpro as a soft dependency (sktime#5273) [DOC] Correct code block formatting for pre-commit install command (sktime#5377) [ENH] parallelization backend calls in utility module - part 2, backend parameter passing (sktime#5311) [MNT] [Dependabot](deps): Bump stefanzweifel/git-auto-commit-action from 4 to 5 (sktime#5373) [MNT] python 3.12 compatibility - replace `distutils` calls with equivalent functionality (sktime#5376) [DOC] `sktime` intro notebook (sktime#3793) [ENH] Skip unnecessary fit in `ForecastX` if inner `forecaster_y` ignores `X` (sktime#5353) [ENH] refactor parallelization backend calls in utility module (sktime#5268) [ENH] ARCH model (arch package) Issue sktime#2173 (sktime#5326) [ENH] add error message return to `deep_equals` assert in `test_reconstruct_identical` (sktime#4927) [DOC] Added all feature names to docstring for DateTimeFeatures class (sktime#5283) [BUG] fix `STLForecaster` tag `ignores-exogenous-X` to be correctly set for composites (sktime#5365) [BUG] Fix inconsistent date/time index in `plot_windows` sktime#4919 (sktime#5321) &benshamza, fkiraly [BUG] Fix `numba` errors when calling `tslearn` `lcss` (sktime#5368) [MNT] Dataset file restructure (sktime#5239) [BUG] fix `temporal_train_test_split` for hierarchical and panel data in case where `fh` is not passed (sktime#5330) [MNT] [Dependabot](deps): Bump styfle/cancel-workflow-action from 0.11.0 to 0.12.0 (sktime#5355) [ENH] minor fixes to `NaiveAligner` (sktime#5344) [DOC] Improve Readability of Notebook 2 - Classification, Regression & Clustering (sktime#5312) [DOC] Documented ax argument and the figure in plot_series (sktime#5325) ...
…recasting * origin/split-ci: (25 commits) added flags to codecov [MNT] skpro as a soft dependency (sktime#5273) [DOC] Correct code block formatting for pre-commit install command (sktime#5377) [ENH] parallelization backend calls in utility module - part 2, backend parameter passing (sktime#5311) [MNT] [Dependabot](deps): Bump stefanzweifel/git-auto-commit-action from 4 to 5 (sktime#5373) [MNT] python 3.12 compatibility - replace `distutils` calls with equivalent functionality (sktime#5376) [DOC] `sktime` intro notebook (sktime#3793) [ENH] Skip unnecessary fit in `ForecastX` if inner `forecaster_y` ignores `X` (sktime#5353) [ENH] refactor parallelization backend calls in utility module (sktime#5268) [ENH] ARCH model (arch package) Issue sktime#2173 (sktime#5326) [ENH] add error message return to `deep_equals` assert in `test_reconstruct_identical` (sktime#4927) [DOC] Added all feature names to docstring for DateTimeFeatures class (sktime#5283) [BUG] fix `STLForecaster` tag `ignores-exogenous-X` to be correctly set for composites (sktime#5365) [BUG] Fix inconsistent date/time index in `plot_windows` sktime#4919 (sktime#5321) &benshamza, fkiraly [BUG] Fix `numba` errors when calling `tslearn` `lcss` (sktime#5368) [MNT] Dataset file restructure (sktime#5239) [BUG] fix `temporal_train_test_split` for hierarchical and panel data in case where `fh` is not passed (sktime#5330) [MNT] [Dependabot](deps): Bump styfle/cancel-workflow-action from 0.11.0 to 0.12.0 (sktime#5355) [ENH] minor fixes to `NaiveAligner` (sktime#5344) [DOC] Improve Readability of Notebook 2 - Classification, Regression & Clustering (sktime#5312) ...
* origin/split-ci: (25 commits) added flags to codecov [MNT] skpro as a soft dependency (sktime#5273) [DOC] Correct code block formatting for pre-commit install command (sktime#5377) [ENH] parallelization backend calls in utility module - part 2, backend parameter passing (sktime#5311) [MNT] [Dependabot](deps): Bump stefanzweifel/git-auto-commit-action from 4 to 5 (sktime#5373) [MNT] python 3.12 compatibility - replace `distutils` calls with equivalent functionality (sktime#5376) [DOC] `sktime` intro notebook (sktime#3793) [ENH] Skip unnecessary fit in `ForecastX` if inner `forecaster_y` ignores `X` (sktime#5353) [ENH] refactor parallelization backend calls in utility module (sktime#5268) [ENH] ARCH model (arch package) Issue sktime#2173 (sktime#5326) [ENH] add error message return to `deep_equals` assert in `test_reconstruct_identical` (sktime#4927) [DOC] Added all feature names to docstring for DateTimeFeatures class (sktime#5283) [BUG] fix `STLForecaster` tag `ignores-exogenous-X` to be correctly set for composites (sktime#5365) [BUG] Fix inconsistent date/time index in `plot_windows` sktime#4919 (sktime#5321) &benshamza, fkiraly [BUG] Fix `numba` errors when calling `tslearn` `lcss` (sktime#5368) [MNT] Dataset file restructure (sktime#5239) [BUG] fix `temporal_train_test_split` for hierarchical and panel data in case where `fh` is not passed (sktime#5330) [MNT] [Dependabot](deps): Bump styfle/cancel-workflow-action from 0.11.0 to 0.12.0 (sktime#5355) [ENH] minor fixes to `NaiveAligner` (sktime#5344) [DOC] Improve Readability of Notebook 2 - Classification, Regression & Clustering (sktime#5312) ...
This PR fixes #5107 by replacing the inner workings of
temporal_train_test_split
with a splitter, making the design both DRY and reliant on splitter boilerplate.This is done as follows:
TemporalTrainTestSplitter
, which encapsulates thetemporal_train_test_split
logicTemporalTrainTestSplitter
is a re-implementation of thetemporal_train_test_split
logic as a splitter class, giving access to theBaseSplitter
boilerplate.The issue with the panel data is solved by recurring to the existing broadcasting boilerplate of
BaseSplitter
.Depends on #5332 for testing.