[MRG+1] ENH: added max_train_size to TimeSeriesSplit #8282
What does this implement/fix? Explain your changes.
This adds a parameter max_train_size to TimeSeriesSplit.
Any other comments?
There is one corner case where the size of the first train fold is smaller than the
@@            Coverage Diff             @@
##           master    #8282      +/-   ##
==========================================
+ Coverage   94.75%   94.75%   +<.01%
==========================================
  Files         342      342
  Lines       60809    60920     +111
==========================================
+ Hits        57617    57726     +109
- Misses       3192     3194       +2
I would have considered writing tests that avoided writing out each split, but instead checked something like the following invariant: the test set is the same as without max_train_size, and train with max_train_size is a suffix of train without max_train_size, limited to that length.
But this looks fine to me apart from those nitpicks.
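For illustration, the invariant suggested above could be checked roughly as follows. This is only a sketch, not the test that was merged: the helper names check_max_train_size_invariant and demo_with_sklearn are hypothetical, and the demo assumes a scikit-learn version in which TimeSeriesSplit accepts max_train_size.

```python
def check_max_train_size_invariant(full_splits, capped_splits, max_train_size):
    """Check capped splits against uncapped ones (hypothetical helper).

    Both arguments are parallel iterables of (train, test) index sequences:
    the first produced without max_train_size, the second with it.
    """
    full_splits = list(full_splits)
    capped_splits = list(capped_splits)
    assert len(full_splits) == len(capped_splits)
    for (train_full, test_full), (train_cap, test_cap) in zip(full_splits,
                                                              capped_splits):
        train_full, test_full = list(train_full), list(test_full)
        train_cap, test_cap = list(train_cap), list(test_cap)
        # The test set must not depend on max_train_size.
        assert test_cap == test_full
        # The capped train set is the suffix of the uncapped one, limited to
        # max_train_size entries. If the uncapped train set is already
        # shorter, the slice returns it unchanged.
        assert train_cap == train_full[-max_train_size:]
    return True


def demo_with_sklearn():
    """Run the check against TimeSeriesSplit for a few sizes (not called here)."""
    import numpy as np
    from sklearn.model_selection import TimeSeriesSplit

    X = np.zeros((10, 1))
    for max_train_size in (1, 2, 3, 20):
        full = TimeSeriesSplit(n_splits=3).split(X)
        capped = TimeSeriesSplit(n_splits=3,
                                 max_train_size=max_train_size).split(X)
        check_max_train_size_invariant(full, capped, max_train_size)
```

Calling demo_with_sklearn() exercises the check for several values, including one larger than every train fold, which covers the case where a train fold is shorter than the limit without listing the expected indices by hand.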