relocate #232

Closed
D-Se opened this issue Mar 28, 2021 · 3 comments · Fixed by #233

D-Se commented Mar 28, 2021

relocate() executes immediately and returns a data.table directly, without an as.data.table() call.

library(data.table)
library(dplyr)  # provides %>% and relocate()
library(dtplyr)

flights_DT <- dtplyr::lazy_dt(nycflights13::flights) %>%
    as.data.table()

flights_DT %>%
    relocate(where(is.character), .before = where(is.numeric))

myoung3 commented Mar 28, 2021

Thanks for reporting.

From what I understand, piping flights_DT into tidyverse verbs won't use dtplyr at all: flights_DT is a data.table, so it is treated as a data.frame and dispatched to the normal dplyr methods. If you want the speed of data.table/dtplyr, you need to pipe an object of class dtplyr_step (i.e., the object created by calling lazy_dt() on a data.table).

Your code:

flights_DT <- dtplyr::lazy_dt(nycflights13::flights) %>%
    as.data.table()

is just a complicated way of writing as.data.table(nycflights13::flights) and thus your example isn't really using dtplyr at all.
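To make the round-trip point concrete, here is a minimal sketch on a small made-up data frame (the `df` object and its columns are assumptions for illustration, not part of the original report). It compares the result of going through lazy_dt() with a direct as.data.table() call.

```r
library(data.table)
library(dtplyr)

# Hypothetical toy data, standing in for nycflights13::flights
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Wrap in a lazy data.table, then immediately collect it again
round_trip <- as.data.table(lazy_dt(df))

# Convert directly, with no dtplyr step in between
direct <- as.data.table(df)

# Both are plain data.tables holding the same data
class(round_trip)
all.equal(round_trip, direct)
```

No verb is applied between lazy_dt() and as.data.table(), so there is nothing for dtplyr to translate.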

When piping a lazy dt (i.e., an object of class dtplyr_step) to relocate, the return value is also a lazy dt, as intended.

library(data.table)
library(dplyr,warn.conflicts = FALSE)
library(dtplyr)
flights_lazyDT <- dtplyr::lazy_dt(nycflights13::flights)
flights_DT <- flights_lazyDT %>% as.data.table()

out <- flights_lazyDT %>%
  relocate(where(is.character), .before = where(is.numeric)) 

class(flights_DT)
#> [1] "data.table" "data.frame"
class(out)
#> [1] "dtplyr_step_first" "dtplyr_step"

Created on 2021-03-27 by the reprex package (v1.0.0)

With that said, it does seem like you've found an issue with dplyr and not dtplyr, since it seems the default behavior should be conversion to a tibble, but that's not happening with relocate:

library(data.table)
library(dplyr,warn.conflicts = FALSE)
flights_DT <- as.data.table(nycflights13::flights)

out2 <- flights_DT %>%
  relocate(where(is.character), .before = where(is.numeric)) 


out3 <- flights_DT %>% 
  group_by(month) %>%
  summarize(x=mean(dep_delay,na.rm=TRUE))

class(out2)
#> [1] "data.table" "data.frame"
class(out3)
#> [1] "tbl_df"     "tbl"        "data.frame"

Created on 2021-03-27 by the reprex package (v1.0.0)

markfairbanks (Collaborator) commented

@myoung3 As of dtplyr v1.1.0, one of the new features is that the data.table methods of the dplyr verbs automatically convert their input to a lazy data.table. For example, with mutate():

library(dtplyr)
library(data.table)
library(dplyr)

test_dt <- data.table(x = 1:3, y = 1:3)

test_dt %>%
  mutate(double_x = x * 2)
#> Source: local data table [3 x 3]
#> Call:   copy(`_DT1`)[, `:=`(double_x = x * 2)]
#> 
#>       x     y double_x
#>   <int> <int>    <dbl>
#> 1     1     1        2
#> 2     2     2        4
#> 3     3     3        6
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results

It looks like relocate() doesn't have this implemented (even though it should).
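Until a verb has a data.table method, one possible workaround is to wrap the data.table in lazy_dt() explicitly, so the dtplyr_step method is used regardless. The sketch below assumes a small made-up table (`test_dt2` is a hypothetical name) and uses explicit column names rather than where(). It also shows one way to check whether a data.table method is registered for a given verb; the result of that check depends on the installed dtplyr version, so no output is shown for it.

```r
library(data.table)
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Hypothetical toy data for illustration
test_dt2 <- data.table(x = 1:3, y = c("a", "b", "c"))

# Explicit lazy_dt() does not rely on a relocate() method for data.table
out <- test_dt2 %>%
  lazy_dt() %>%
  relocate(y, .before = x)

# The result stays lazy until it is collected
inherits(out, "dtplyr_step")
#> [1] TRUE
names(as.data.table(out))
#> [1] "y" "x"

# Check whether a data.table method is registered for relocate();
# NULL from getS3method() means no such method was found
has_dt_method <- !is.null(
  utils::getS3method("relocate", "data.table", optional = TRUE)
)
has_dt_method
```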


myoung3 commented Mar 28, 2021

@markfairbanks Very cool! I had missed that. Sorry, @D-Se. My mistake.

And I was mistaken on the second point as well: it seems dplyr's behavior (whether it returns a tibble or a data.frame) actually depends on the function. See https://stackoverflow.com/questions/61067989/which-tidyverse-functions-return-tibbles
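A small sketch of that function-dependent behavior, using a plain base data.frame and only dplyr (no dtplyr loaded) to keep the dispatch simple. The `df` object is made up for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data as a base data.frame
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))

# relocate() keeps the class of its input
out_relocate <- df %>%
  relocate(x, .before = g)

# group_by() converts to a grouped tibble, so the summary is a tibble
out_summary <- df %>%
  group_by(g) %>%
  summarise(m = mean(x), .groups = "drop")

class(out_relocate)
#> [1] "data.frame"
class(out_summary)
#> [1] "tbl_df"     "tbl"        "data.frame"
```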

hadley pushed a commit that referenced this issue Mar 29, 2021
…min/_sample (#233)

* Add data.table methods for arrange, relocate, slice_head, slice_tail, slice_min, slice_max, & slice_sample
* Extract expression inside desc in case of quosure
* Use capture_dot() in slice_min_max() helper

Fixes #232