do() drops columns when input is a 0-row grouped_df #625
This is cropping up in many places in ggvis, for example rstudio/ggvis#281. Perhaps there should at least be an option for |
Some more test cases:
df_str <- function(x) str(as.data.frame(x))
dat <- data.frame(x = numeric(0), g = character(0))
grp <- dat %>% group_by(g)
grp %>% do(.) %>% df_str()
grp %>% do(data.frame(y = integer(0))) %>% df_str()
grp %>% do(cbind(., y = integer(0))) %>% df_str()
f <- function() {
  cat("Hi!\n")
  data.frame()
}
dat %>% filter(FALSE) %>% do(f()) %>% df_str()
grp %>% filter(FALSE) %>% do(f()) %>% df_str() |
A couple more test cases:
# Start with no groups, do()
dat <- data.frame(x = numeric(0), g = character(0)) %>% group_by(g)
res <- dat %>% do(identity(.))
expect_true(setequal(names(res), names(dat)))
res <- dat %>% do(blankdf(.))
expect_true(setequal(names(res), c("g", "blank", "blank2")))
# Start with some groups, drop all rows, then do()
# The resulting tbl_df after the filter() has a slightly different structure
# from the one that started with zero rows.
dat <- data.frame(x = 1:4, g = c("a", "c", "a", "b")) %>% group_by(g) %>%
  filter(FALSE)
res <- dat %>% do(identity(.))
expect_true(setequal(names(res), names(dat)))
res <- dat %>% do(blankdf(.))
expect_true(setequal(names(res), c("g", "blank", "blank2"))) |
dplyr still drops columns when
f <- function(x) 1:4
dat <- mtcars %>% group_by(fcyl = factor(cyl))
# OK
dat %>% do(f = f(.)) %>% names()
# [1] "fcyl" "f"
# Drops cols
dat %>% filter(FALSE) %>% do(f = f(.)) %>% names()
# [1] "fcyl" |
Do you think these tests cover all the possibilities?
df_str <- function(x) str(as.data.frame(x))
dat <- data_frame(x = numeric(0), g = character(0))
grp <- dat %>% group_by(g)
emt <- grp %>% filter(FALSE)
dat %>% do(data.frame()) %>% type_sum()
dat %>% do(data.frame(y = integer(0))) %>% type_sum()
dat %>% do(data.frame(.)) %>% type_sum()
dat %>% do(data.frame(., y = integer(0))) %>% type_sum()
dat %>% do(y = ncol(.)) %>% type_sum()
grp %>% do(data.frame()) %>% type_sum()
grp %>% do(data.frame(y = integer(0))) %>% type_sum()
grp %>% do(data.frame(.)) %>% type_sum()
grp %>% do(data.frame(., y = integer(0))) %>% type_sum()
grp %>% do(y = ncol(.)) %>% type_sum()
emt %>% do(data.frame()) %>% type_sum()
emt %>% do(data.frame(y = integer(0))) %>% type_sum()
emt %>% do(data.frame(.)) %>% type_sum()
emt %>% do(data.frame(., y = integer(0))) %>% type_sum()
emt %>% do(y = ncol(.)) %>% type_sum() |
@wch ping |
Looks good to me. |
dat <- data_frame(x = numeric(0), g = character(0))
grp <- dat %>% group_by(g)
emt <- grp %>% filter(FALSE)
dat %>% do(data.frame()) %>% type_sum() %>%
expect_equal(character())
dat %>% do(data.frame(y = integer(0))) %>% type_sum() %>%
expect_equal(c(y = "int"))
dat %>% do(data.frame(.)) %>% type_sum() %>%
expect_equal(c(x = "dbl", g = "chr"))
dat %>% do(data.frame(., y = integer(0))) %>% type_sum() %>%
expect_equal(c(x = "dbl", g = "chr", y = "int"))
dat %>% do(y = ncol(.)) %>% type_sum() %>%
expect_equal(c(y = "list"))
# Grouped data frame should have same col types as ungrouped, with addition
# of grouping variable
grp %>% do(data.frame()) %>% type_sum() %>%
expect_equal(c(g = "chr"))
grp %>% do(data.frame(y = integer(0))) %>% type_sum() %>%
expect_equal(c(g = "chr", y = "int"))
grp %>% do(data.frame(.)) %>% type_sum() %>%
expect_equal(c(x = "dbl", g = "chr"))
grp %>% do(data.frame(., y = integer(0))) %>% type_sum() %>%
expect_equal(c(x = "dbl", g = "chr", y = "int"))
grp %>% do(y = ncol(.)) %>% type_sum() %>%
expect_equal(c(g = "chr", y = "list"))
# An empty grouped dataset should have the same types as grp
emt %>% do(data.frame()) %>% type_sum() %>%
expect_equal(c(g = "chr"))
emt %>% do(data.frame(y = integer(0))) %>% type_sum() %>%
expect_equal(c(g = "chr", y = "int"))
emt %>% do(data.frame(.)) %>% type_sum() %>%
expect_equal(c(x = "dbl", g = "chr"))
emt %>% do(data.frame(., y = integer(0))) %>% type_sum() %>%
expect_equal(c(x = "dbl", g = "chr", y = "int"))
emt %>% do(y = ncol(.)) %>% type_sum() %>%
expect_equal(c(g = "chr", y = "list"))
Currently the only failures are with named inputs. |
For the first test, perhaps you could use this so that it gets a named char vector:
dat %>% do(data.frame()) %>% type_sum() %>%
  expect_equal(c(x = "chr")[0]) |
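The difference between `character()` and `c(x = "chr")[0]` is easy to miss, so here is a small base-R sketch of it. The variable names are illustrative only; nothing here depends on dplyr or testthat.

```r
# A plain empty character vector has no names attribute.
unnamed <- character()

# Subsetting a named vector with index 0 keeps an (empty) names attribute.
named_empty <- c(x = "chr")[0]

length(named_empty)              # 0
is.null(names(unnamed))          # TRUE
is.null(names(named_empty))      # FALSE
identical(unnamed, named_empty)  # FALSE: the attributes differ
```

Both vectors are empty, but only the second one carries names, which is presumably why the suggestion above uses it as the expected value.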
Looks good to me! |
For example, the input here has two columns (and zero rows), but the output has only one column:
In my particular case, I would like it to simply return the input unchanged, but I'm not sure what the correct output should be, in general.
Since the function isn't actually called on zero-row data, there's no way of knowing what columns it would return. Maybe it should be called once with the zero-row data frame? In this case, that would be equivalent to doing
dat %>% group_by(g) %>% identity()
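The idea of calling the function once on the zero-row data frame can be sketched in base R. This is only an illustration of the proposal, not how dplyr implements `do()`: `do_by_group()` and `add_y()` are hypothetical helpers invented for this example.

```r
# Hypothetical helper (not part of dplyr): apply `f` to each group of `df`.
# When there are no groups, call `f` once on the zero-row input so the
# output columns can still be discovered.
do_by_group <- function(df, group_col, f) {
  pieces <- split(df, df[[group_col]], drop = TRUE)
  if (length(pieces) == 0) {
    return(f(df[0, , drop = FALSE]))
  }
  do.call(rbind, lapply(pieces, f))
}

# Hypothetical per-group function that adds one column.
add_y <- function(d) {
  d$y <- d$x * 2
  d
}

dat <- data.frame(x = numeric(0), g = character(0))

res_identity <- do_by_group(dat, "g", identity)
names(res_identity)  # "x" "g"

res_added <- do_by_group(dat, "g", add_y)
names(res_added)     # "x" "g" "y"
```

With this fallback, a zero-row input passed through `identity` comes back with both columns intact, which matches the behaviour asked for above. The cost is that `f` runs once even when there is no data, so any side effects in `f` would still happen.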