Add a first draft of stratification with groups #360
Merged
Hi all!
Following some conversation in #317, I figured I'd post a first draft of what stratification with grouping might look like. This first draft only handles `group_vfold_cv()` when `balance = "groups"`.

This version requires `strata` to be constant within each group, so every row of a group carries the same stratum value. If users have a non-constant variable they want to use for stratification, I think it makes sense to force them to create the strata themselves (as I do in the example), because any automated binning implementation will probably be non-optimal.

This implementation differs from `make_strata()` in that `strata` doesn't need to be quoted.

I'm not sure how you feel about putting the arguments after `...`. Because I only added `strata` and `pool`, and didn't include `breaks`, `nunique`, or `depth`, I didn't want to let people pass stratification arguments by position in case those other arguments get added later.

Addresses #317
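For readers wondering what "create the strata" could look like when the variable of interest varies within groups, here is a minimal base R sketch. The data frame, the column names (`id`, `y`, `stratum`), and the median split are all assumptions made for illustration; they are not taken from this PR's example.

```r
# Hypothetical data: 20 groups of 5 rows each, with a continuous outcome.
set.seed(1)
dat <- data.frame(
  id = rep(seq_len(20), each = 5),
  y  = rnorm(100)
)

# Summarise the outcome per group, then bin the group-level summary.
# A median split is just one possible choice of binning.
group_mean    <- tapply(dat$y, dat$id, mean)
group_stratum <- ifelse(group_mean > median(group_mean), "high", "low")

# Map each row back to its group's stratum, so the value is constant
# within every group.
dat$stratum <- unname(group_stratum[as.character(dat$id)])

# Check the requirement: every group carries exactly one stratum value.
n_strata_per_group <- tapply(dat$stratum, dat$id, function(x) length(unique(x)))
stopifnot(all(n_strata_per_group == 1))
```

Under the interface described above, the resulting column would then be supplied by name and unquoted, along the lines of `group_vfold_cv(dat, group = id, v = 5, strata = stratum)`.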
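The effect of placing `strata` and `pool` after `...` can be seen with a small stand-in function. `group_cv_args()` is hypothetical and only mirrors the argument order discussed here; it takes plain strings rather than unquoted column names to keep the sketch self-contained.

```r
# Hypothetical stand-in for the signature; it only reports what it received.
group_cv_args <- function(data, group, ..., strata = NULL, pool = 0.1) {
  list(
    n_dots          = length(list(...)),  # extra positional arguments land in `...`
    strata_supplied = !is.null(strata),
    pool            = pool
  )
}

# A third positional argument is captured by `...`, not matched to `strata`.
by_position <- group_cv_args("data", "group", "outcome")

# Naming the argument is the only way to reach `strata`.
by_name <- group_cv_args("data", "group", strata = "outcome")
```

Because arguments after `...` can only be matched by their full name, adding `breaks`, `nunique`, or `depth` later would not change the meaning of existing calls.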