Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] splitter that replicates loc of another splitter #4851

Merged
merged 8 commits into from Jul 15, 2023
Merged

Conversation

fkiraly
Copy link
Collaborator

@fkiraly fkiraly commented Jul 8, 2023

This adds a splitter that takes a splitter and some data, and always replicates the same loc references for splits of another splitter.

Related issue and discussion: #4842

More general form of temporal_train_test_split, particularly useful for the case where we want to split X and y, and primarily y, and X can have different indices from y.

This ability partially addresses #4842, FYI @felipeangelimvieira

@fkiraly fkiraly added module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting enhancement Adding new functionality module:metrics&benchmarking metrics and benchmarking modules labels Jul 8, 2023
@fkiraly
Copy link
Collaborator Author

fkiraly commented Jul 8, 2023

this is weird, equality tests fails only on windows, as array type seems to be int64 vs int. Why only on windows?

Copy link
Contributor

@benHeid benHeid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one comment, I am also fine if this comment is answered on #4862 since it is exactly the same comment

sktime/forecasting/model_selection/_split.py Outdated Show resolved Hide resolved
@fkiraly fkiraly requested a review from benHeid July 13, 2023 11:44
@fkiraly
Copy link
Collaborator Author

fkiraly commented Jul 13, 2023

review comments addressed, kindly re-review

benHeid pushed a commit that referenced this pull request Jul 14, 2023
This PR adds a composite splitter which takes any splitter and changes
its test splits to be the union of respective test plus train split.

Related issues and PR:
#4842
#4851
#4861
benHeid
benHeid previously approved these changes Jul 14, 2023
Copy link
Contributor

@benHeid benHeid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing the comments :)

@fkiraly fkiraly merged commit ed5ea2d into main Jul 15, 2023
24 checks passed
@fkiraly fkiraly deleted the split_sameloc branch July 15, 2023 00:59
fkiraly added a commit that referenced this pull request Jul 15, 2023
…not equal length (#4861)

This refactors the `evaluate` forecasting benchmark tool and adds
features:

* refactors the internal `_split` to use the `BaseSplitter` interface
instead of separate logic.
* to handle `X`, we use the `SameLocSplitter` from
#4851 and the
`TestPlusTrainSplitter` from #4862
* As #4851 is more general and allows `y` and `X` of different length,
this fixes #4842
* new argument to `evaluate` which allows the user to pass the splitter
for `X` explicitly as `cv_X`. If not passed, defaults to the
`SameLocSplitter`, i.e., `X` split indices are same as from `y`.
(requires no deprecation as added as last arg, and default is existing
behaviour)
* proper docstring

Note: this changes behaviour in cases where `y` and `X` had same length
but different index set. However, I would contend (is this correct?)
that in these cases behaviour was generally unexpected or buggy, so no
deprecation is needed.

Depends on:
#4851
#4862

Related to, and indirectly fixes
#4842

Includes MRE from #4842 as an integration test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Adding new functionality module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting module:metrics&benchmarking metrics and benchmarking modules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants