Thinking about bootstrap grouping #269
Might be useful to have:
So that internally we can potentially do something different. Although I guess we can use the For Alternatively we could materialise directly, but then I'm not sure what the grouping would mean. We could materialise data for each group adjacently in memory, which could be interesting too.
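To make the "materialise data for each group adjacently in memory" idea concrete, here is a minimal base R sketch. It is only an illustration under stated assumptions: `materialise_bootstrap` is a hypothetical helper, not a dplyr function, and it simply stacks the resampled rows so that each replicate occupies one contiguous block, with an ordinary `replicate` column standing in for the grouping.

```r
# Hypothetical helper (not part of dplyr): materialise m bootstrap
# replicates of df as one data frame, with each replicate's rows contiguous.
materialise_bootstrap <- function(df, m) {
  n <- nrow(df)
  # One vector of n row positions (1-based, drawn with replacement) per replicate.
  indices <- replicate(m, sample.int(n, n, replace = TRUE), simplify = FALSE)
  # Stack the sampled rows; replicate i occupies rows ((i - 1) * n + 1):(i * n).
  out <- df[unlist(indices), , drop = FALSE]
  rownames(out) <- NULL
  # Label each block so the grouping survives as an ordinary column.
  cbind(replicate = rep(seq_len(m), each = n), out)
}

set.seed(1)
boot <- materialise_bootstrap(mtcars, 3)
dim(boot)
```

The trade-off in this sketch is memory: the result holds `m` full copies of the data, whereas storing only index vectors keeps a single copy.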
|
I'm not certain about this, but in answering this Stack Overflow question, I believe that I have discovered a bug in this function. Here are the steps to reproduce the bug:
library(dplyr)
mboot <- bootstrap(mtcars, 10)
bootstrap(mtcars, 3) %>% do(data.frame(x=1:2))
# Error: index out of bounds
And here is the proposed fix:
bootstrap <- function(df, m) {
  n <- nrow(df)
  attr(df, "indices") <- replicate(m, sample(n, replace = TRUE),
                                   simplify = FALSE)
  attr(df, "drop") <- TRUE
  attr(df, "group_sizes") <- rep(n, m)
  attr(df, "biggest_group_size") <- n
  attr(df, "labels") <- data.frame(replicate = 1:m)
  attr(df, "vars") <- list(quote(replicate)) # Change
  # attr(df, "vars") <- list(quote(boot)) # list(substitute(bootstrap(m)))
  class(df) <- c("grouped_df", "tbl_df", "tbl", "data.frame")
  df
}
Which fixes the above case:
bootstrap(mtcars, 3) %>% do(data.frame(x=1:2))
# Source: local data frame [6 x 2]
# Groups: replicate
#   replicate x
# 1         1 1
# 2         1 2
# 3         2 1
# 4         2 2
# 5         3 1
# 6         3 2
|
It looks like there's another small bug: since a grouped_df's indices are 0-indexed, not 1-indexed,
should be
Otherwise
and
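The snippets that belonged to this comment did not survive in this copy, so the following is only a guess at the kind of change being described, not the commenter's actual code. Assuming the `indices` attribute of a `grouped_df` in that dplyr version expects 0-based positions, the sampled positions would need to be shifted down by one. The helper name `zero_based_bootstrap_indices` is hypothetical.

```r
# Hypothetical helper, not dplyr code: draw m bootstrap index vectors
# for n rows, shifted to 0-based positions (0 .. n - 1).
zero_based_bootstrap_indices <- function(n, m) {
  replicate(m, sample.int(n, n, replace = TRUE) - 1L, simplify = FALSE)
}

set.seed(42)
idx <- zero_based_bootstrap_indices(nrow(mtcars), 3)

# Every position must fall inside 0 .. n - 1; an unshifted draw could
# contain n, which is one past the end under 0-based indexing.
stopifnot(all(unlist(idx) >= 0L), all(unlist(idx) <= nrow(mtcars) - 1L))

# R itself subsets with 1-based positions, so add 1 back when using
# the indices from R code.
first_replicate <- mtcars[idx[[1]] + 1L, ]
```

Under this assumption an unshifted draw would also never select the first row, since position 0 could not occur.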
|
I now think this should go in a separate partition package. |