Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query fails with joins using the . suffix when using BigQuery via odbc #214

Closed
nwstephens opened this issue Jan 11, 2019 · 4 comments
Closed
Labels
bug an unexpected problem or unintended behavior verb trans 🤖 Translation of dplyr verbs to SQL

Comments

@nwstephens
Copy link

nwstephens commented Jan 11, 2019

When I use dplyr to join tables, the default is to use .x and .y as a suffix for the column names. However, BigQuery does not support dots in column names. There is a feature to translate column names for rbigquery connections, however this feature does not exist for odbc connections.

When I use joins with odbc my query fails. A workaround is to manually set the suffix to _x or _y. However, I would like dbplyr to make that conversion for me as the default.

Example:

testr_mtcars_left <- tbl(odbcCon, "testr_mtcars")
testr_mtcars_right <- tbl(odbcCon, "testr_mtcars")
show_query(testr_mtcars_left)
show_query(testr_mtcars_right)
testr_mtcars_join_tbl <- testr_mtcars_left %>%
  inner_join(testr_mtcars_right, by = "row_names")
show_query(testr_mtcars_join_tbl)
testr_mtcars_join_tbl
Error: <SQL> 'SELECT *
FROM (SELECT `TBL_LEFT`.`row_names` AS `row_names`, `TBL_LEFT`.`mpg` AS `mpg.x`, `TBL_LEFT`.`cyl` AS `cyl.x`, `TBL_LEFT`.`disp` AS `disp.x`, `TBL_LEFT`.`hp` AS `hp.x`, `TBL_LEFT`.`drat` AS `drat.x`, `TBL_LEFT`.`wt` AS `wt.x`, `TBL_LEFT`.`qsec` AS `qsec.x`, `TBL_LEFT`.`vs` AS `vs.x`, `TBL_LEFT`.`am` AS `am.x`, `TBL_LEFT`.`gear` AS `gear.x`, `TBL_LEFT`.`carb` AS `carb.x`, `TBL_RIGHT`.`mpg` AS `mpg.y`, `TBL_RIGHT`.`cyl` AS `cyl.y`, `TBL_RIGHT`.`disp` AS `disp.y`, `TBL_RIGHT`.`hp` AS `hp.y`, `TBL_RIGHT`.`drat` AS `drat.y`, `TBL_RIGHT`.`wt` AS `wt.y`, `TBL_RIGHT`.`qsec` AS `qsec.y`, `TBL_RIGHT`.`vs` AS `vs.y`, `TBL_RIGHT`.`am` AS `am.y`, `TBL_RIGHT`.`gear` AS `gear.y`, `TBL_RIGHT`.`carb` AS `carb.y`
  FROM `testr_mtcars` AS `TBL_LEFT`
  INNER JOIN `testr_mtcars` AS `TBL_RIGHT`
  ON (`TBL_LEFT`.`row_names` = `TBL_RIGHT`.`row_names`)
) `lexvfpziyg`
LIMIT 10'
  nanodbc/nanodbc.cpp:1587: 42000: [RStudio][BigQuery] (70) Invalid query: Invalid field name "mpg.x". Fields mu
> class(con)
[1] "BigQuery"
attr(,"package")
[1] ".GlobalEnv"
@hadley hadley added bug an unexpected problem or unintended behavior verb trans 🤖 Translation of dplyr verbs to SQL labels Feb 4, 2019
@hadley hadley changed the title Query fails with joins using the . suffix when using dplyr with odbc Query fails with joins using the . suffix when using BigQuery via odbc Feb 6, 2019
@hadley
Copy link
Member

hadley commented Sep 24, 2020

Minimal reprex:

library(bigrquery)
library(dplyr, warn.conflicts = FALSE)

ds <- bq_test_dataset()
con <- DBI::dbConnect(ds)

df <- data.frame(x = 1:5, y = 5:1)
db <- copy_to(con, df, temporary = FALSE)
left_join(db, db, by = "x")
#> Error: Job 'gargle-169921.job_Dsq6JU2F71CYwujDlUhFQz18DMKC.US' failed
#> x Invalid field name "y.x". Fields must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long. [invalid]

Created on 2020-09-24 by the reprex package (v0.3.0.9001)

Docs for column names: https://cloud.google.com/bigquery/docs/schemas

@richy1996
Copy link

@nwstephens and/or @hadley is this solved now? I'm still getting the error. Is there anything special to add in the joins regarding sql_join_suffix()?

@nwstephens
Copy link
Author

@mgirlich
Copy link
Collaborator

It might be that odbc does does not create a connection of class <BigQueryConnection>. Therefore, the method sql_join_suffix.BigQueryConnection() does not apply. I think you could force that class via the dbms.name argument to odbc::dbConnect().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior verb trans 🤖 Translation of dplyr verbs to SQL
Projects
None yet
Development

No branches or pull requests

4 participants