
dtplyr not recognizing .data$var syntax and sometimes returning 0s after summarise with no warning #138

@sbashevkin

I recently noticed that using dtplyr with summarise and the .data$var syntax produces unexpected results without any warning. I use dtplyr within my package and refer to column names as .data$varname, as recommended. dtplyr does not seem to recognize that syntax: in group_by it throws an error, and in summarise it silently returns sums of 0.

Please see the reprex below.

Thank you,
Sam

library(magrittr) # Normally my package would import just %>%
library(rlang) # Normally my package would import .data
#> 
#> Attaching package: 'rlang'
#> The following object is masked from 'package:magrittr':
#> 
#>     set_names
d <- tibble::tibble(Group = rep(c("A", "B"), 10), Num = 1:20)

d
#> # A tibble: 20 x 2
#>    Group   Num
#>    <chr> <int>
#>  1 A         1
#>  2 B         2
#>  3 A         3
#>  4 B         4
#>  5 A         5
#>  6 B         6
#>  7 A         7
#>  8 B         8
#>  9 A         9
#> 10 B        10
#> 11 A        11
#> 12 B        12
#> 13 A        13
#> 14 B        14
#> 15 A        15
#> 16 B        16
#> 17 A        17
#> 18 B        18
#> 19 A        19
#> 20 B        20

# Works without `dtplyr`

d %>%
  dplyr::group_by(.data$Group) %>%
  dplyr::summarise(Num = sum(.data$Num, na.rm = TRUE)) %>%
  dplyr::ungroup() %>%
  tibble::as_tibble()
#> # A tibble: 2 x 2
#>   Group   Num
#>   <chr> <int>
#> 1 A       100
#> 2 B       110

# `.data` does not seem to work with `dtplyr`

d %>%
  dtplyr::lazy_dt() %>%
  dplyr::group_by(.data$Group) %>%
  dplyr::summarise(Num = sum(.data$Num, na.rm = TRUE)) %>%
  dplyr::ungroup() %>%
  tibble::as_tibble()
#> Error in eval(bysub, x, parent.frame()): object 'Group' not found

# But if you remove the `.data$` from `group_by` and leave it in
# the `summarise` call, it returns 0s, but no warnings or errors

d %>%
  dtplyr::lazy_dt() %>%
  dplyr::group_by(Group) %>%
  dplyr::summarise(Num = sum(.data$Num, na.rm = TRUE)) %>%
  dplyr::ungroup() %>%
  tibble::as_tibble()
#> # A tibble: 2 x 2
#>   Group   Num
#>   <chr> <int>
#> 1 A         0
#> 2 B         0

# With `group_by_at` (what I was actually trying to use in my case),
# you can use `.data$` but it again returns 0s with no warnings or errors

d %>%
  dtplyr::lazy_dt() %>%
  dplyr::group_by_at(dplyr::vars(.data$Group)) %>%
  dplyr::summarise(Num = sum(.data$Num, na.rm = TRUE)) %>%
  dplyr::ungroup() %>%
  tibble::as_tibble()
#> # A tibble: 2 x 2
#>   Group   Num
#>   <chr> <int>
#> 1 A         0
#> 2 B         0

Created on 2019-12-20 by the reprex package (v0.3.0)
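
For anyone hitting the same problem in package code, one possible interim workaround is sketched below. It is not something confirmed by the dtplyr maintainers. The idea is to drop the `.data$` pronoun, use bare column names, and declare those names with `utils::globalVariables()` so that R CMD check does not report them as undefined globals. The `group_by(Group)` call in the reprex already shows that dtplyr accepts a bare name there; that a bare name also works inside `summarise` is an assumption of this sketch.

```r
library(magrittr) # for %>%

# In a package, this call would sit at the top level of a file under R/
# to silence "no visible binding for global variable" notes.
utils::globalVariables(c("Group", "Num"))

# Same data as in the reprex above
d <- tibble::tibble(Group = rep(c("A", "B"), 10), Num = 1:20)

# Bare column names instead of .data$Group / .data$Num
res <- d %>%
  dtplyr::lazy_dt() %>%
  dplyr::group_by(Group) %>%
  dplyr::summarise(Num = sum(Num, na.rm = TRUE)) %>%
  dplyr::ungroup() %>%
  tibble::as_tibble()

res
```

The trade-off is that `globalVariables()` suppresses the check note for those names across the whole package, so a genuinely misspelled `Group` or `Num` elsewhere would no longer be flagged.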

Labels: bug (an unexpected problem or unintended behavior), dplyr-compat (dplyr compatibility issues)
