Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL translation includes vector names #269

Closed
alaindanet opened this issue Mar 22, 2019 · 1 comment
Milestone

Comments

@alaindanet
Copy link

@alaindanet alaindanet commented Mar 22, 2019

filter() has an inconsistent behavior when applied to a database or a data.frame. Ideally, I would expect the same behavior.

For a database (ie sql query) filter() sends the names of the named vector, resulting in an error. filter() does not result in an error for a data.frame and returns the expected output.
library('dplyr')
#> 
#> Attachement du package : 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library('datasets')

con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars)

cyl_id <- mtcars %>%
  filter(carb ==1) %>%
  select(cyl) %>%
  unlist

## With df:
mtcars %>%
  filter(cyl %in% cyl_id)
#>     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#> 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> 5  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#> 6  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#> 7  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
#> 8  19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#> 9  17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#> 10 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
#> 11 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
#> 12 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
#> 13 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
#> 14 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
#> 15 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
#> 16 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
#> 17 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
#> 18 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

## With db:
mtcars2 <- tbl(con, "mtcars")
mtcars2 %>%
  filter(cyl %in% cyl_id)
#> Error in result_create(conn@ptr, statement): near "AS": syntax error

# Fix:
cyl_id2 <- mtcars %>%
  filter(carb ==1) %>%
  select(cyl) %>%
  unlist(., use.names = FALSE)

mtcars2 %>%
  filter(cyl %in% cyl_id2)
#> # Source:   lazy query [?? x 11]
#> # Database: sqlite 3.22.0 [:memory:]
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6 160     110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6 160     110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4 108      93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6 258     110  3.08  3.22  19.4     1     0     3     1
#>  5  18.1     6 225     105  2.76  3.46  20.2     1     0     3     1
#>  6  24.4     4 147.     62  3.69  3.19  20       1     0     4     2
#>  7  22.8     4 141.     95  3.92  3.15  22.9     1     0     4     2
#>  8  19.2     6 168.    123  3.92  3.44  18.3     1     0     4     4
#>  9  17.8     6 168.    123  3.92  3.44  18.9     1     0     4     4
#> 10  32.4     4  78.7    66  4.08  2.2   19.5     1     1     4     1
#> # … with more rows

# Because the sql request send names and the values: 
my_query <- mtcars2 %>%
  filter(cyl %in% cyl_id)
my_query %>% show_query
#> <SQL>
#> SELECT *
#> FROM `mtcars`
#> WHERE (`cyl` IN (4.0 AS "cyl1", 6.0 AS "cyl2", 6.0 AS "cyl3", 4.0 AS "cyl4", 4.0 AS "cyl5", 4.0 AS "cyl6", 4.0 AS "cyl7"))

Created on 2019-03-22 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0.9000).

Thank you very much for this package (and the others)!

@hadley

This comment has been minimized.

Copy link
Member

@hadley hadley commented Mar 22, 2019

Minimal reprex:

library(dbplyr)
translate_sql(x %in% c(a = 1, b = 2))
#> <SQL> `x` IN (1.0 AS `a`, 2.0 AS `b`)

Created on 2019-03-22 by the reprex package (v0.2.1.9000)

@hadley hadley changed the title inconsistent filter() behavior for database and data.frame SQL translation includes vector names Mar 22, 2019
@hadley hadley added this to the v1.4.1 milestone Jun 4, 2019
@hadley hadley closed this in 7e177ea Jun 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.