group_by.dtplyr_step issue #383

Closed
gpierard opened this issue Aug 7, 2022 · 1 comment

gpierard commented Aug 7, 2022

The example below illustrates a potential bug in group_by.dtplyr_step. Workarounds are discussed in the Stack Overflow question linked at the end of this post.


I'd like to group_by across several variables in dtplyr inside an lapply() call, but the syntax that works in dplyr fails after calling lazy_dt().

library(dplyr)
mycolumns <- c("Wind", "Month", "Ozone", "Solar.R")
columnpairs <- as.data.frame(combn(mycolumns, 2))

#         V1    V2      V3    V4      V5      V6
#    1  Wind  Wind    Wind Month   Month   Ozone
#    2 Month Ozone Solar.R Ozone Solar.R Solar.R

result_dplyr <- lapply(columnpairs, function(x) {
  airquality %>% 
    select(all_of(x)) %>% 
    group_by(across(all_of(x))) %>% filter(n() > 1)
  }
)

$V1
# A tibble: 105 x 2
# Groups:   Wind, Month [40]
    Wind Month
   <dbl> <int>
 1   7.4     5
 2   8       5
 3  11.5     5
 4  14.9     5
 5   8.6     5
 6   8.6     5
 7   9.7     5
 8  11.5     5
 9  12       5
10  11.5     5
# ... with 95 more rows

The same syntax fails once the data is wrapped with lazy_dt() from dtplyr:

library(dtplyr)
airq <- lazy_dt(airquality)

lapply(columnpairs, function(x) {
  airq %>% select(all_of(x)) %>% 
    group_by(across(all_of(x))) %>% filter(n() > 1)
})

Error in `all_of()`:
! object 'x' not found

This was answered with workarounds at https://stackoverflow.com/q/73267732/5224236
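
For readers who cannot follow the link, here is a minimal sketch of one possible workaround. It is an assumption on my part, not necessarily what the linked answers propose: instead of across(all_of(x)), the column names are converted to symbols with rlang::syms() and spliced into group_by() with !!!, so nothing has to look up x later. The name result_dtplyr is only illustrative, and the character columns in columnpairs assume R 4.0 or later.

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

mycolumns <- c("Wind", "Month", "Ozone", "Solar.R")
columnpairs <- as.data.frame(combn(mycolumns, 2))
airq <- lazy_dt(airquality)

# Assumed workaround: splice the column names in as symbols, so group_by()
# receives e.g. group_by(Wind, Month) and never has to evaluate `x` lazily.
result_dtplyr <- lapply(columnpairs, function(x) {
  airq %>%
    select(all_of(x)) %>%
    group_by(!!!rlang::syms(x)) %>%
    filter(n() > 1) %>%
    as_tibble()  # force evaluation of the lazy data.table pipeline
})
```

Because the symbols are injected when group_by() captures its arguments, this sketch should not depend on how the installed dtplyr version evaluates across().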

Collaborator

eutwt commented Aug 7, 2022

Thanks for the bug report! It looks like this is fixed in the development version, since #318. I'll close this now, but will reopen if you install using devtools::install_github("tidyverse/dtplyr") and still see a problem.

Reprex using v1.2.1 (current CRAN version)

devtools::load_all('~/Documents/GitHub/dtp')
#> ℹ Loading dtplyr
#> Warning: package 'testthat' was built under R version 4.1.2
library(dplyr, warn.conflicts = FALSE)

fun <- function(df, x) {
  df %>% group_by(across(all_of(x)))
}

lazy_dt(data.frame(a = 1)) %>% 
  fun('a')
#> Error in `dt_squash_across()` at dtp/R/tidyeval.R:88:4:
#> Caused by error in `all_of()`:
#> ! object 'x' not found

Created on 2022-08-07 by the reprex package (v2.0.1.9000)

Reprex using latest commit - cf7c2d8

devtools::load_all('~/Documents/GitHub/dtp')
#> ℹ Loading dtplyr
#> Warning: package 'testthat' was built under R version 4.1.2
library(dplyr, warn.conflicts = FALSE)

fun <- function(df, x) {
  df %>% group_by(across(all_of(x)))
}

lazy_dt(data.frame(a = 1)) %>% 
  fun('a')
#> Source: local data table [1 x 1]
#> Groups: a
#> Call:   `_DT1`
#> 
#>       a
#>   <dbl>
#> 1     1
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results

Created on 2022-08-07 by the reprex package (v2.0.1.9000)

eutwt closed this as completed Aug 7, 2022