[ENH] splitter that replicates `loc` of another splitter #4851

fkiraly · 2023-07-08T00:58:58Z

This adds a splitter that takes another splitter and some template data, and always reproduces the same `loc` references as that splitter's splits on the template data.

Related issue and discussion: #4842

This is a more general form of `temporal_train_test_split`. It is particularly useful when we want to split both `X` and `y`, with the split defined primarily on `y`, and `X` can have different indices from `y`.

This ability partially addresses #4842, FYI @felipeangelimvieira
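
The idea of replicating `loc` references can be sketched without sktime. The following is a minimal illustration, not the code added in this PR: `same_loc_split` is a hypothetical helper, indices are plain lists of labels, and skipping labels that are absent from the target index is an assumption of the sketch.

```python
def same_loc_split(template_index, target_index, template_splits):
    """Re-express positional splits of one index as positions in another index.

    Hypothetical helper for illustration only; not the sktime implementation.
    """
    # position of each label in the target index
    position = {label: i for i, label in enumerate(target_index)}
    for train, test in template_splits:
        # labels (loc references) selected by the template split
        train_labels = [template_index[i] for i in train]
        test_labels = [template_index[i] for i in test]
        # the same labels, as positions in the target index;
        # labels missing from the target are skipped (assumption of this sketch)
        yield (
            [position[label] for label in train_labels if label in position],
            [position[label] for label in test_labels if label in position],
        )


# y covers 2000..2005, X covers 1998..2007, so positions differ but labels overlap
y_index = [2000, 2001, 2002, 2003, 2004, 2005]
X_index = list(range(1998, 2008))

# positional (iloc) splits defined on y
y_splits = [([0, 1, 2], [3]), ([0, 1, 2, 3], [4])]

# the same labels, expressed as positions in X
X_splits = list(same_loc_split(y_index, X_index, y_splits))
```

Here the first `y` split trains on labels 2000 to 2002 and tests on 2003; in `X` those labels sit two positions later, so the positional split shifts while the labels stay the same.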

fkiraly · 2023-07-08T09:40:14Z

This is weird: the equality test fails only on Windows, as the array type seems to be int64 vs int. Why only on Windows?
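
One possible explanation, offered as an assumption rather than a confirmed diagnosis of this failure: with NumPy versions before 2.0, the default integer type on Windows is 32-bit, while on Linux and macOS it is 64-bit, so integer arrays built without an explicit dtype can differ in dtype across platforms even when their values agree. The sketch below sets the dtypes explicitly, so it behaves the same on every platform, and contrasts a value comparison with a dtype-sensitive one.

```python
import numpy as np

# explicit dtypes stand in for the two platform defaults
a32 = np.array([0, 1, 2], dtype=np.int32)
a64 = np.array([0, 1, 2], dtype=np.int64)

# np.array_equal compares shape and values, not dtype
values_equal = np.array_equal(a32, a64)

# a strict comparison that also checks dtype sees a difference
dtypes_equal = a32.dtype == a64.dtype

# casting to a common dtype removes the difference
a32_cast = a32.astype(np.int64)
same_after_cast = np.array_equal(a32_cast, a64) and a32_cast.dtype == a64.dtype
```

A test that compares index arrays strictly, including dtype, would therefore pass on one platform and fail on another unless the dtype is fixed or the comparison is value-based.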

benHeid

Only one comment. I am also fine if it is answered on #4862, since it is exactly the same comment.

sktime/forecasting/model_selection/_split.py

fkiraly · 2023-07-13T11:44:21Z

review comments addressed, kindly re-review

This PR adds a composite splitter which takes any splitter and changes its test splits to be the union of respective test plus train split. Related issues and PR: #4842 #4851 #4861
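
The effect described here can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation from that PR: `add_train_to_test` is a hypothetical helper, splits are lists of integer positions, and returning the union in sorted order is a choice made for the sketch.

```python
def add_train_to_test(splits):
    """Wrap an iterable of (train, test) splits so test becomes train plus test.

    Hypothetical helper for illustration only; not the sktime implementation.
    """
    for train, test in splits:
        # train is passed through unchanged; test becomes the sorted union
        yield train, sorted(set(train) | set(test))


# two expanding-window style splits, as integer positions
base_splits = [([0, 1, 2], [3, 4]), ([0, 1, 2, 3], [4, 5])]

new_splits = list(add_train_to_test(base_splits))
```

The train sets are unchanged, and each test set now also contains the positions of the corresponding train set.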

benHeid

Thank you for addressing the comments :)

…not equal length (#4861)

This refactors the `evaluate` forecasting benchmark tool and adds features:

* refactors the internal `_split` to use the `BaseSplitter` interface instead of separate logic.
* to handle `X`, we use the `SameLocSplitter` from #4851 and the `TestPlusTrainSplitter` from #4862
* As #4851 is more general and allows `y` and `X` of different length, this fixes #4842
* new argument to `evaluate` which allows the user to pass the splitter for `X` explicitly as `cv_X`. If not passed, defaults to the `SameLocSplitter`, i.e., `X` split indices are same as from `y`. (requires no deprecation as added as last arg, and default is existing behaviour)
* proper docstring

Note: this changes behaviour in cases where `y` and `X` had same length but different index set. However, I would contend (is this correct?) that in these cases behaviour was generally unexpected or buggy, so no deprecation is needed.

Depends on: #4851 #4862

Related to, and indirectly fixes #4842

Includes MRE from #4842 as an integration test.

split same loc

7385918

fkiraly added the module:forecasting, enhancement, and module:metrics&benchmarking labels Jul 8, 2023

fkiraly requested review from achieveordie, benHeid and yarnabrina as code owners July 8, 2023 00:58

let's try without deep_equals

bd6c24c

fkiraly added 2 commits July 12, 2023 00:01

add tag

48d2369

add test

71ff91c

benHeid requested changes Jul 13, 2023

Update _split.py

cd468c6

fkiraly requested a review from benHeid July 13, 2023 11:44

benHeid previously approved these changes Jul 14, 2023

fkiraly added 2 commits July 14, 2023 12:42

merge from main manually

ce496f3

Merge branch 'main' into split_sameloc

88fe022

fkiraly dismissed benHeid’s stale review via 88fe022 July 14, 2023 11:43

Merge branch 'main' into split_sameloc

17c21e9

fkiraly merged commit ed5ea2d into main Jul 15, 2023
24 checks passed

fkiraly deleted the split_sameloc branch July 15, 2023 00:59

fkiraly mentioned this pull request Jul 16, 2023

[ENH] Simplified expanding window splitter anchored at time series end #4874

Closed
