Scoped *_all/*_at/*_if functions break on tbl_dt #43

Closed
alistaire47 opened this issue May 7, 2017 · 1 comment

Comments


alistaire47 commented May 7, 2017

I ran into an issue today when trying to use summarise_all on a data.table:

library(dplyr)
library(data.table)
library(dtplyr)

dtcars <- as.data.table(mtcars)

# what should happen
mtcars %>% group_by(cyl) %>% summarise_all(mean)
#> # A tibble: 3 × 11
#>     cyl      mpg     disp        hp     drat       wt     qsec        vs
#>   <dbl>    <dbl>    <dbl>     <dbl>    <dbl>    <dbl>    <dbl>     <dbl>
#> 1     4 26.66364 105.1364  82.63636 4.070909 2.285727 19.13727 0.9090909
#> 2     6 19.74286 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
#> 3     8 15.10000 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000
#> # ... with 3 more variables: am <dbl>, gear <dbl>, carb <dbl>

# what does happen
dtcars %>% group_by(cyl) %>% summarise_all(mean)
#> Error in !funs: invalid argument type

# explicitly coercing to tbl_dt doesn't help
dtcars %>% tbl_dt() %>% group_by(cyl) %>% summarise_all(mean)
#> Error in !funs: invalid argument type

# it's not a data.table issue
dtcars[, lapply(.SD, mean), by = cyl]
#>    cyl      mpg     disp        hp     drat       wt     qsec        vs
#> 1:   6 19.74286 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
#> 2:   4 26.66364 105.1364  82.63636 4.070909 2.285727 19.13727 0.9090909
#> 3:   8 15.10000 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000
#>           am     gear     carb
#> 1: 0.4285714 3.857143 3.428571
#> 2: 0.7272727 4.090909 1.545455
#> 3: 0.1428571 3.285714 3.500000

To be clear, the same thing happens whether dtplyr is loaded or not, and given the ! in the error, it looks like it's related to the rlang switchover. I'd post this as a dplyr issue, but since it's specific to data.table sources, here seems appropriate.

Versions:

sapply(c('dplyr', 'data.table', 'dtplyr'), packageVersion, simplify = FALSE)
#> $dplyr
#> [1] ‘0.5.0.9004’
#> 
#> $data.table
#> [1] ‘1.10.4’
#> 
#> $dtplyr
#> [1] ‘0.0.2.9000’

Update: I realized this was all on a slightly old version of rlang. I updated to 0.1, and now they all (including the data.table-free mtcars one) error out with

#> Error in new_language(expr, quote(.), !(!(!.args))) : 
#>   could not find function "new_language"

which is a bigger problem. I'm pretty sure there's still an underlying issue with data.tables, though, so I'll post anyway.


alistaire47 commented May 8, 2017

Update 2: The error from the previous update is now fixed by dplyr #2754. Without dtplyr, dplyr now works on data.tables by treating them as data.frames and converting them to tbl_dfs. With the dtplyr backend loaded, so that they are operated on as data.tables/tbl_dts, the scoped functions are still broken:

library(dplyr)
library(data.table)

dtcars <- as.data.table(mtcars)

dtcars %>% group_by(cyl) %>% summarise_all(mean)
#> # A tibble: 3 × 11
#>     cyl      mpg     disp        hp     drat       wt     qsec        vs
#>   <dbl>    <dbl>    <dbl>     <dbl>    <dbl>    <dbl>    <dbl>     <dbl>
#> 1     4 26.66364 105.1364  82.63636 4.070909 2.285727 19.13727 0.9090909
#> 2     6 19.74286 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
#> 3     8 15.10000 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000
#> # ... with 3 more variables: am <dbl>, gear <dbl>, carb <dbl>

dtcars %>% group_by(cyl) %>% mutate_all(mean) %>% head()
#> # A tibble: 6 × 11
#> # Groups: cyl [3]
#>        mpg   cyl     disp        hp     drat       wt     qsec        vs
#>      <dbl> <dbl>    <dbl>     <dbl>    <dbl>    <dbl>    <dbl>     <dbl>
#> 1 19.74286     6 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
#> 2 19.74286     6 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
#> 3 26.66364     4 105.1364  82.63636 4.070909 2.285727 19.13727 0.9090909
#> 4 19.74286     6 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
#> 5 15.10000     8 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000
#> 6 19.74286     6 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
#> # ... with 3 more variables: am <dbl>, gear <dbl>, carb <dbl>

library(dtplyr)

dtcars %>% summarise_all(mean)
#> Error in !funs: invalid argument type

dtcars %>% group_by(cyl) %>% summarise_all(mean)
#> Error in !funs: invalid argument type

dtcars %>% tbl_dt() %>% group_by(cyl) %>% summarise_all(mean)
#> Error in !funs: invalid argument type

dtcars %>% group_by(cyl) %>% mutate_all(mean)
#> Error in .subset(x, j): invalid subscript type 'language'
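
Until the scoped verbs work on tbl_dt, two workarounds seem plausible. The sketch below is untested against the dplyr/dtplyr development versions listed above. It assumes that dropping the data.table class with as.data.frame() makes dplyr dispatch to its own data-frame methods, and that the data.table `:=` idiom is an acceptable stand-in for group_by() + mutate_all(). The names res_tbl, res_dt, and cols are illustrative only.

```r
library(dplyr)
library(data.table)

dtcars <- as.data.table(mtcars)

# Workaround 1 (assumption): strip the data.table class so the scoped verb
# is handled by dplyr's data-frame methods rather than the dtplyr backend.
res_tbl <- dtcars %>%
  as.data.frame() %>%
  group_by(cyl) %>%
  summarise_all(mean)

# Workaround 2: stay in data.table. This mirrors group_by(cyl) %>% mutate_all(mean)
# by overwriting every non-grouping column with its per-group mean.
# copy() keeps the original dtcars unchanged, since := modifies by reference.
cols <- setdiff(names(dtcars), "cyl")
res_dt <- copy(dtcars)[, (cols) := lapply(.SD, mean), by = cyl, .SDcols = cols]
```

The first route gives up data.table's speed for the scoped call; the second keeps it but leaves the dplyr pipeline.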
alistaire47 changed the title from "summarise_all/_at/_if broken" to "Scoped *_all/*_at/*_if functions break on tbl_dt" on May 8, 2017
hadley closed this in addd7ff on May 25, 2019