Failing test with next version of dtplyr #17
Comments
Ooooooh, the problem is that dtplyr always creates a unique name for the intermediate table it now needs for the grouped filter:
`_DT10` <- `_DT2`[between(distance, 200, 300) & !is.na(air_time)]
head(`_DT10`[`_DT10`[, .I[sum(!is.na(flight)) > 3000], by = .(origin,
    dest)]$V1, .(num_flts = sum(!is.na(flight)), dist = round(mean(distance,
    na.rm = TRUE)), avg_delay = round(mean(arr_delay, na.rm = TRUE))),
    keyby = .(origin, dest)][order(desc(num_flts), desc(avg_delay))],
    n = 100L)
And since the name is session unique, it's incremented between the two tests. Maybe I can make the name unique within a pipeline.
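To illustrate why a session-unique name differs between two otherwise identical tests, here is a minimal base-R sketch. `make_namer`, `next_name`, and `mask_names` are hypothetical helpers made up for this example; this is not dtplyr's actual naming code, only an assumed stand-in that behaves the way the comment above describes.

```r
# Hypothetical stand-in for a session-wide name generator; not dtplyr's code.
make_namer <- function(prefix = "_DT") {
  counter <- 0L
  function() {
    counter <<- counter + 1L  # shared state lives for the whole session
    paste0(prefix, counter)
  }
}

next_name <- make_namer()

# Two "tests" that build the same pipeline still receive different names,
# because the counter is never reset between them.
name_in_test_1 <- next_name()
name_in_test_2 <- next_name()
identical(name_in_test_1, name_in_test_2)

# One possible workaround in tests (an assumption, not something the thread
# settled on): mask the numeric suffix before comparing generated code.
mask_names <- function(code) gsub("_DT[0-9]+", "_DT", code)
identical(mask_names("`_DT2`[, .N]"), mask_names("`_DT10`[, .N]"))
```

Masking the suffix keeps the comparison independent of how many tables were created earlier in the session, at the cost of no longer checking which intermediate table is referenced.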
I'm having difficulties figuring out how to do this better because independent pipelines might get combined into one with a |
@hadley Thanks for looking into this. In these tests, I suppose I should have used I just committed that change (7aa7dbb), but now I'm getting some strange test errors. I'll try to debug. https://travis-ci.org/github/ianmcook/tidyquery/builds/757640217#L3025 |
Here's a minimal reprex of the type of failure I'm seeing after adding If I set up a testthat file like this:
library(dtplyr)
library(dplyr)
iris_dt <- lazy_dt(iris)
test_that("testthat works with dtplyr", {
iris_dt %>% select(Sepal.Length) %>% as.data.frame()
succeed()
})
then I run
If I just run the code outside of |
The traceback is:
9: `[.data.frame`(x, i, j)
8: `[.data.table`(`_DT12`, , .(Sepal.Length))
7: `_DT12`[, .(Sepal.Length)]
6: eval_tidy(quo) at tidyeval.R#11
5: dt_eval(x) at step.R#144
4: as.data.frame(dt_eval(x)) at step.R#144
3: as.data.frame.dtplyr_step(.)
2: as.data.frame(.)
1: iris_dt %>% select(Sepal.Length) %>% as.data.frame()
So I don't know why this behaviour would have changed, but I suspect it's caused by this commit: tidyverse/dtplyr@5f8d51b.
Yeah, with cedta decided 'tidyquery' wasn't data.table aware. Here is call stack with [[1L]] applied:
[[1]]
`%>%`
[[2]]
as.data.frame
[[3]]
as.data.frame.dtplyr_step
[[4]]
as.data.frame
[[5]]
dt_eval
[[6]]
eval_tidy
[[7]]
`[`
[[8]]
`[.data.table`
[[9]]
cedta
I can fix this in dtplyr.
The problem is that |
Looking at the code in https://github.com/Rdatatable/data.table/blob/master/R/cedta.R, it's not clear how to force this to be true; any approach is going to be hacky. The easiest fix would be for you to include |
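For context, data.table documents an opt-in for packages that use data.table syntax without importing data.table: defining a `.datatable.aware` variable in the package's namespace, which `cedta()` consults. The fragment below is only a sketch of that documented mechanism; the file name is hypothetical, and it may or may not be the fix meant in the comment above.

```r
# R/zzz.R  (hypothetical file name inside the package's R/ directory)
# Assumption: the package calls `[.data.table` indirectly but does not import
# data.table, so cedta() would otherwise treat it as not data.table aware.
.datatable.aware <- TRUE
```

Because the variable is looked up in the calling package's namespace, it needs to be defined at the top level of the package source rather than inside a function or a test file.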
This test now fails:
Unfortunately you don't get a particularly informative error (even with local_edition(3)) because the pipeline is rather deep. However, I think this is the key difference:
i.e. expected is generating one more intermediate data table name than expected; this is probably due to the new grouped filter behaviour. Indeed, if I remove filter(sum(!is.na(flight)) > 3000) and HAVING num_flts > 3000, the test passes.