Skip to content

Across subquery produces wrong results when referring to another column of data #1015

Closed
@machow

Description

@machow

Hello--it appears that across in dbplyr returns different results from dplyr when both these happen...

  • a formula contains another variable from the data (e.g. ~ .x / mpg)
  • an early operation in across overrides the variable's name (e.g. result gets named mpg)

When this happens, operations on other selected variables use the overridden column, rather than the original (see this dplyr test, which I think would fail for dbplyr).

This is shown in the reprex below. It looks like the issue is a subquery is created during the across..

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# dplyr ----
cars_small <- mtcars %>%
  select(mpg, cyl, gear) %>%
  head(2)

# all columns divided by original mpg
cars_small %>%
  mutate(across(c(mpg, cyl, gear), ~ .x / mpg))
#>               mpg       cyl      gear
#> Mazda RX4       1 0.2857143 0.1904762
#> Mazda RX4 Wag   1 0.2857143 0.1904762

# dbplyr ----
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
copy_to(con, cars_small)

# mpg re-assigned in first operation, then used in later ones
q <- tbl(con, "cars_small") %>%
  mutate(across(everything(), ~ .x / mpg))

q
#> # Source:   SQL [2 x 3]
#> # Database: sqlite 3.38.5 [:memory:]
#>     mpg   cyl  gear
#>   <dbl> <dbl> <dbl>
#> 1     1     6     4
#> 2     1     6     4

show_query(q)
#> <SQL>
#> SELECT `mpg`, `cyl` / `mpg` AS `cyl`, `gear` / `mpg` AS `gear`
#> FROM (
#>   SELECT `mpg` / `mpg` AS `mpg`, `cyl`, `gear`
#>   FROM `cars_small`
#> )

Created on 2022-09-28 by the reprex package (v2.0.1)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugan unexpected problem or unintended behaviorverb trans 🤖Translation of dplyr verbs to SQL

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions