Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL translation includes vector names #269

Closed
alaindanet opened this issue Mar 22, 2019 · 1 comment
Closed

SQL translation includes vector names #269

alaindanet opened this issue Mar 22, 2019 · 1 comment
Labels
bug an unexpected problem or unintended behavior func trans 🌍 Translation of individual functions to SQL
Milestone

Comments

@alaindanet
Copy link

alaindanet commented Mar 22, 2019

filter() has an inconsistent behavior when applied to a database or a data.frame. Ideally, I would expect the same behavior.

For a database (ie sql query) filter() sends the names of the named vector, resulting in an error. filter() does not result in an error for a data.frame and returns the expected output.
library('dplyr')
#> 
#> Attachement du package : 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library('datasets')

con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars)

cyl_id <- mtcars %>%
  filter(carb ==1) %>%
  select(cyl) %>%
  unlist

## With df:
mtcars %>%
  filter(cyl %in% cyl_id)
#>     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#> 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> 5  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#> 6  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#> 7  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
#> 8  19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#> 9  17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#> 10 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
#> 11 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
#> 12 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
#> 13 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
#> 14 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
#> 15 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
#> 16 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
#> 17 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
#> 18 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

## With db:
mtcars2 <- tbl(con, "mtcars")
mtcars2 %>%
  filter(cyl %in% cyl_id)
#> Error in result_create(conn@ptr, statement): near "AS": syntax error

# Fix:
cyl_id2 <- mtcars %>%
  filter(carb ==1) %>%
  select(cyl) %>%
  unlist(., use.names = FALSE)

mtcars2 %>%
  filter(cyl %in% cyl_id2)
#> # Source:   lazy query [?? x 11]
#> # Database: sqlite 3.22.0 [:memory:]
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6 160     110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6 160     110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4 108      93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6 258     110  3.08  3.22  19.4     1     0     3     1
#>  5  18.1     6 225     105  2.76  3.46  20.2     1     0     3     1
#>  6  24.4     4 147.     62  3.69  3.19  20       1     0     4     2
#>  7  22.8     4 141.     95  3.92  3.15  22.9     1     0     4     2
#>  8  19.2     6 168.    123  3.92  3.44  18.3     1     0     4     4
#>  9  17.8     6 168.    123  3.92  3.44  18.9     1     0     4     4
#> 10  32.4     4  78.7    66  4.08  2.2   19.5     1     1     4     1
#> # … with more rows

# Because the sql request send names and the values: 
my_query <- mtcars2 %>%
  filter(cyl %in% cyl_id)
my_query %>% show_query
#> <SQL>
#> SELECT *
#> FROM `mtcars`
#> WHERE (`cyl` IN (4.0 AS "cyl1", 6.0 AS "cyl2", 6.0 AS "cyl3", 4.0 AS "cyl4", 4.0 AS "cyl5", 4.0 AS "cyl6", 4.0 AS "cyl7"))

Created on 2019-03-22 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0.9000).

Thank you very much for this package (and the others)!

@hadley
Copy link
Member

hadley commented Mar 22, 2019

Minimal reprex:

library(dbplyr)
translate_sql(x %in% c(a = 1, b = 2))
#> <SQL> `x` IN (1.0 AS `a`, 2.0 AS `b`)

Created on 2019-03-22 by the reprex package (v0.2.1.9000)

@hadley hadley changed the title inconsistent filter() behavior for database and data.frame SQL translation includes vector names Mar 22, 2019
@hadley hadley added bug an unexpected problem or unintended behavior func trans 🌍 Translation of individual functions to SQL labels Mar 22, 2019
@hadley hadley added this to the v1.4.1 milestone Jun 4, 2019
@hadley hadley closed this as completed in 7e177ea Jun 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior func trans 🌍 Translation of individual functions to SQL
Projects
None yet
Development

No branches or pull requests

2 participants