Lower threshold for strata pooling #149

Merged: 6 commits, May 19, 2020

juliasilge
Member

This PR lowers the threshold for strata pooling from 15% of the total to 10% of the total. It also documents in the .Rd files that this pooling happens.

This new threshold seems to solve the problems users have reported. For example:

library(tidyverse)
library(rsample)

X <- matrix(rnorm(140 * 100), ncol = 100, nrow = 140)
y <- rep(letters[1:7], each = 20)

df <- tibble(X) %>%
  mutate(label = y)

vfold_cv(df, v = 3, strata = label)
#> #  3-fold cross-validation using stratification 
#> # A tibble: 3 x 2
#>   splits          id   
#>   <named list>    <chr>
#> 1 <split [91/49]> Fold1
#> 2 <split [91/49]> Fold2
#> 3 <split [98/42]> Fold3

Created on 2020-05-07 by the reprex package (v0.3.0)

Notice that the warning about "too little data to stratify" no longer appears.
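
For intuition, the arithmetic behind this can be checked by hand. Each of the seven classes holds 20 of 140 rows (about 14.3%), which is below the old 15% threshold but above the new 10% one. That is presumably why the warning fired before and does not now. The snippet below is only a rough sketch in base R, not rsample's internal code; `below_threshold()` is a hypothetical helper written for this illustration.

```r
# Class labels from the reprex: 7 classes, 20 rows each (140 rows total)
y <- rep(letters[1:7], each = 20)

# Share of the data held by each stratum (20 / 140 for every class)
pcts <- prop.table(table(y))

# Hypothetical helper (not part of rsample): names of the strata whose
# share of the data falls below a given pooling threshold
below_threshold <- function(pcts, pool) {
  names(pcts)[pcts < pool]
}

# Old threshold: every class is below 15%
below_threshold(pcts, pool = 0.15)

# New threshold: no class is below 10%
below_threshold(pcts, pool = 0.10)
```

With strata this evenly sized, the 10% threshold leaves all seven classes intact, so the folds can be stratified on `label` directly.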

This PR closes #86, #91, and #110.

@juliasilge
Member Author

GitHub Actions checks are failing because tidyr is not available.

@juliasilge juliasilge requested a review from topepo May 8, 2020 22:00
@github-actions

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Feb 21, 2021
@hfrick hfrick deleted the new-strata-threshold branch September 17, 2021 14:32
@hfrick hfrick restored the new-strata-threshold branch September 17, 2021 14:32
@hfrick hfrick deleted the new-strata-threshold branch September 17, 2021 14:32
Development

Successfully merging this pull request may close these issues.

Stratum sample sizes aren't maintained using rsample bootstraps() function.