Stratification in grouped resampling #317

@mikemahoney218

Feature

As part of closing #207, we've recently implemented a number of grouping functions, with group_mc_cv() (#313), group_initial_split() and group_validation_split() (#315), and group_bootstraps() (#316).

Right now, none of these functions support stratification, which would be useful if, for instance, you had repeated measurements on a number of patients and needed to stratify by outcome. We left it out partly so that we could implement grouped resampling quickly, and partly because we aren't sure what people would expect stratification to do when resampling by groups. Specific questions include:

  • How should strata be determined when the stratification variable isn't constant within a group? Median, mode, user-provided functions? What's a good default option?
  • What rules can we use to determine when a (group × stratum) combination needs to be pooled with others?
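
To make the two questions above concrete, here is a rough sketch in Python. It is not rsample code and not a proposal for the actual implementation; the function names, the `summary` and `min_groups` arguments, and the "pooled" label are all hypothetical. It assumes each group gets exactly one stratum (by mode, median, or a user-provided function) and that a stratum held by too few groups is merged into a catch-all.

```python
from collections import Counter
from statistics import median


def stratum_per_group(groups, values, summary="mode"):
    """Collapse a row-level stratification variable to one value per group.

    `summary` is a hypothetical argument: "mode", "median", or any callable
    that takes the list of a group's values and returns a single value.
    """
    by_group = {}
    for g, v in zip(groups, values):
        by_group.setdefault(g, []).append(v)

    if summary == "mode":
        # Counter.most_common() breaks ties by first occurrence.
        return {g: Counter(vs).most_common(1)[0][0] for g, vs in by_group.items()}
    if summary == "median":
        return {g: median(vs) for g, vs in by_group.items()}
    if callable(summary):
        return {g: summary(vs) for g, vs in by_group.items()}
    raise ValueError("summary must be 'mode', 'median', or a callable")


def pool_small_strata(group_strata, min_groups=2, pooled_label="pooled"):
    """Merge strata held by fewer than `min_groups` groups into one label.

    The threshold and the label are assumptions made for illustration only.
    """
    counts = Counter(group_strata.values())
    return {
        g: (s if counts[s] >= min_groups else pooled_label)
        for g, s in group_strata.items()
    }


# Repeated measurements on four patients; the outcome varies within patient p1.
patients = ["p1", "p1", "p1", "p2", "p2", "p3", "p3", "p4"]
outcomes = ["yes", "yes", "no", "no", "no", "yes", "yes", "maybe"]

strata = stratum_per_group(patients, outcomes, summary="mode")
pooled = pool_small_strata(strata, min_groups=2)
```

In this toy example p1 is assigned "yes" by majority vote, and the strata "no" and "maybe" each contain a single patient, so p2 and p4 end up pooled together. Whether that is the behavior people would expect is exactly the open question.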

If anyone has any thoughts on what they'd expect stratification to do in grouping functions, let us know here!
