Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

joins fail on Oracle because of multiple quoting #1177

Closed
eduardszoecs opened this issue Mar 2, 2023 · 5 comments · Fixed by #1260
Closed

joins fail on Oracle because of multiple quoting #1177

eduardszoecs opened this issue Mar 2, 2023 · 5 comments · Fixed by #1260

Comments

@eduardszoecs
Copy link

eduardszoecs commented Mar 2, 2023

This is similar to #1091
Although a fix was merged there, I still see the issue with current dbplyr 2.3.1 using an Oracle backend with pool.

This describes the problem, but is not a strictly reprocible example (I don't know how to mock a oracle database).

r$> con
<Pool> of OraConnection objects
  Objects checked out: 0
  Available in pool: 1
  Max size: Inf
  Valid: TRUE
 tbl(con, dbplyr::in_schema("s", "df1")) %>%
        dplyr::inner_join(tbl(con, in_schema("s", "df2")) ,
                          by = "x") %>% 
      dplyr::show_query()

Gives

SELECT
  """s"".""df1"""."x" AS "x",
  "col2"
FROM ("s"."df1") """s"".""df1"""
INNER JOIN ("s"."df2") """s".""df2"""
  ON ("""s"".""df1"""."x" = """s"".""df2"""."x")

and fails with

Error in `collect()`:
! Failed to collect lazy table.
Caused by error in `.oci.SendQuery()`:
! ORA-01741: illegal zero-length identifier
Run `rlang::last_error()` to see where the error occurred.

likekly because of the multiple quotes inserted. I don't know why or where they come from.

The proposed fix in #1091 using x_as and y_as works, but is ugly (in fact on x_as is needed):

 tbl(con, dbplyr::in_schema("s", "df1")) %>%
        dplyr::inner_join(tbl(con, in_schema("s", "df2")) ,
                          by = "x", x_as = "x", y_as = "y") %>% 
      dplyr::show_query()
SELECT
  "x"."x" AS "x",
FROM ("s"."df1") "x"
INNER JOIN ("s"."df2") "y"
  ON ("x"."x" = "y"."x")

Downgrading dbplyr to 2.2.1 also fixes the issue (uses LHS and RHS as names), but is also ugly to fix versions.

SELECT
  "x"."x" AS "x",
FROM ("s"."df1") "LHS"
INNER JOIN ("s"."df2") "RHS"
  ON ("LHS"."x" = "RHS"."x")

The tests are using SQLite, and work. So I assume it's a Oracle specific problem?

Any ideas what's going on or where / how to fix this?
I am not too deep in the dbplyr code-basis, I ssupect it's in the name creation....

session_info()
r$> devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.2 Patched (2022-11-10 r83330)
 os       Ubuntu 20.04.5 LTS
 system   x86_64, linux-gnu
 ui       X11
 language en_US.UTF-8
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       UTC
 date     2023-03-02
 pandoc   2.19.2 @ /usr/bin/pandoc

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 ! package       * version date (UTC) lib source
   blob            1.2.3   2022-04-10 [2] RSPM (R 4.2.0)
   brio            1.1.3   2021-11-30 [2] RSPM (R 4.2.0)
   bslib           0.4.2   2022-12-16 [2] RSPM (R 4.2.0)
   cachem          1.0.7   2023-02-24 [1] CRAN (R 4.2.2)
   callr           3.7.3   2022-11-02 [2] RSPM (R 4.2.0)
   cli             3.6.0   2023-01-09 [2] RSPM (R 4.2.0)
   crayon          1.5.2   2022-09-29 [2] RSPM (R 4.2.0)
   crul            1.3     2022-09-03 [2] RSPM (R 4.2.0)
   curl            5.0.0   2023-01-12 [2] RSPM (R 4.2.0)
   DBI             1.1.3   2022-06-18 [2] RSPM (R 4.2.0)
   dbplyr          2.3.1   2023-02-24 [1] CRAN (R 4.2.2)
   desc            1.4.2   2022-09-08 [2] RSPM (R 4.2.0)
   devtools        2.4.5   2022-10-11 [2] RSPM (R 4.2.0)
   digest          0.6.31  2022-12-11 [2] RSPM (R 4.2.0)
   dotenv        * 1.0.3   2021-04-22 [2] RSPM (R 4.2.0)
   dplyr           1.1.0   2023-01-29 [2] RSPM (R 4.2.0)
   ellipsis        0.3.2   2021-04-29 [2] RSPM (R 4.2.0)
   fansi           1.0.4   2023-01-22 [2] RSPM (R 4.2.0)
   fastmap         1.1.1   2023-02-24 [1] CRAN (R 4.2.2)
   fontawesome     0.5.0   2023-01-25 [2] RSPM (R 4.2.0)
   fs              1.6.1   2023-02-06 [2] RSPM (R 4.2.0)
   generics        0.1.3   2022-07-05 [2] RSPM (R 4.2.0)
   glue            1.6.2   2022-02-24 [2] RSPM (R 4.2.0)
   htmltools       0.5.4   2022-12-07 [2] RSPM (R 4.2.0)
   htmlwidgets     1.6.1   2023-01-07 [2] RSPM (R 4.2.0)
   httpcode        0.3.0   2020-04-10 [2] RSPM (R 4.2.0)
   httpuv          1.6.9   2023-02-14 [2] RSPM (R 4.2.0)
   httr            1.4.5   2023-02-24 [1] CRAN (R 4.2.2)
   jquerylib       0.1.4   2021-04-26 [2] RSPM (R 4.2.0)
   jsonlite        1.8.4   2022-12-06 [2] RSPM (R 4.2.0)
   later           1.3.0   2021-08-18 [2] RSPM (R 4.2.0)
   lifecycle       1.0.3   2022-10-07 [2] RSPM (R 4.2.0)
   magrittr        2.0.3   2022-03-30 [2] RSPM (R 4.2.0)
   memoise         2.0.1   2021-11-26 [2] RSPM (R 4.2.0)
   mime            0.12    2021-09-28 [2] RSPM (R 4.2.0)
   miniUI          0.1.1.1 2018-05-18 [2] RSPM (R 4.2.0)
   pillar          1.8.1   2022-08-19 [2] RSPM (R 4.2.0)
   pkgbuild        1.4.0   2022-11-27 [2] RSPM (R 4.2.0)
   pkgconfig       2.0.3   2019-09-22 [2] RSPM (R 4.2.0)
   pkgload         1.3.2   2022-11-16 [2] RSPM (R 4.2.0)
   pool            1.0.1   2023-02-21 [2] RSPM (R 4.2.0)
   prettyunits     1.1.1   2020-01-24 [2] RSPM (R 4.2.0)
   processx        3.8.0   2022-10-26 [2] RSPM (R 4.2.0)
   profvis         0.3.7   2020-11-02 [2] RSPM (R 4.2.0)
   promises        1.2.0.1 2021-02-11 [2] RSPM (R 4.2.0)
   ps              1.7.2   2022-10-26 [2] RSPM (R 4.2.0)
   purrr           1.0.1   2023-01-10 [2] RSPM (R 4.2.0)
   R6              2.5.1   2021-08-19 [2] RSPM (R 4.2.0)
   Rcpp            1.0.10  2023-01-22 [2] RSPM (R 4.2.0)
   remotes         2.4.2   2021-11-30 [2] RSPM (R 4.2.0)
   rlang           1.0.6   2022-09-24 [2] RSPM (R 4.2.0)
   ROracle         1.3-1.1 2021-11-10 [1] CRAN (R 4.2.2)
   rprojroot       2.0.3   2022-04-02 [2] RSPM (R 4.2.0)
   rstudioapi      0.14    2022-08-22 [2] RSPM (R 4.2.0)
   sass            0.4.5   2023-01-24 [2] RSPM (R 4.2.0)
   sessioninfo     1.2.2   2021-12-06 [2] RSPM (R 4.2.0)
   shiny           1.7.4   2022-12-15 [2] RSPM (R 4.2.0)
   shinyFeedback   0.4.0   2021-09-23 [1] CRAN (R 4.2.2)
   shinyjs         2.1.0   2021-12-23 [1] CRAN (R 4.2.2)
   shinyWidgets    0.7.6   2023-01-08 [1] CRAN (R 4.2.2)
   sofa            0.4.0   2020-06-26 [2] RSPM (R 4.2.0)
   stringi         1.7.12  2023-01-11 [2] RSPM (R 4.2.0)
   stringr         1.5.0   2022-12-02 [2] RSPM (R 4.2.0)
   testthat      * 3.1.6   2022-12-09 [2] RSPM (R 4.2.0)
   tibble          3.1.8   2022-07-22 [2] RSPM (R 4.2.0)
   tictoc          1.1     2022-09-03 [1] CRAN (R 4.2.2)
   tidyselect      1.2.0   2022-10-10 [2] RSPM (R 4.2.0)
   urlchecker      1.0.1   2021-11-30 [2] RSPM (R 4.2.0)
   usethis         2.1.6   2022-05-25 [2] RSPM (R 4.2.0)
   utf8            1.2.3   2023-01-31 [2] RSPM (R 4.2.0)
   vctrs           0.5.2   2023-01-23 [2] RSPM (R 4.2.0)
   waiter          0.2.5   2022-01-03 [1] CRAN (R 4.2.2)
   withr           2.5.0   2022-03-03 [2] RSPM (R 4.2.0)
   xtable          1.8-4   2019-04-21 [2] RSPM (R 4.2.0)

 [1] /usr/local/lib/R/site-library
 [2] /usr/lib/R/site-library
 [3] /usr/lib/R/library
@eduardszoecs
Copy link
Author

eduardszoecs commented Mar 2, 2023

This does not happen with SQLite

require(dplyr)
require(dbplyr)
con <- pool::dbPool(RSQLite::SQLite())
DBI::dbExecute(con, "ATTACH ':memory:' AS foo")
DBI::dbWriteTable(con, DBI::Id(schema = "foo", table = "df"), dplyr::tibble(x = 1:3, y = "a"))
DBI::dbWriteTable(con, DBI::Id(schema = "foo", table = "df2"), dplyr::tibble(x = 2:3, z = "b"))
df1 <- tbl(con, in_schema("foo", "df"))
df2 <- tbl(con, in_schema("foo", "df2"))
df1 %>% inner_join(df2, by = "x") 

SELECT ```foo``.``df```.*, `z`
FROM `foo`.`df` AS ```foo``.``df```
INNER JOIN `foo`.`df2` AS ```foo``.``df2```
  ON (```foo``.``df```.`x` = ```foo``.``df2```.`x`)

Or PostgreSQL

SELECT """foo"".""df""".*, "z"
FROM "foo"."df" AS """foo"".""df"""
INNER JOIN "foo"."df2" AS """foo"".""df2"""
  ON ("""foo"".""df"""."x" = """foo"".""df2"""."x")

So it's Oracle specific?

@apalacio9502
Copy link

Hello,

I confirm the error using a connection through Pool.

Perform a test using DBI and the result is as expected, multiple quotes do not appear. And in my case I have been using version 2.3.1 for multiple analyzes and I have only detected the error #1178.

With DBI + ODBC (Oracle Driver)
conexion <- dbConnect(odbc::odbc(), driver = "oracle")
table_dbi <- tbl(conexion,in_schema("CDATAPRE","EXT_DA_DIV_TRM"))
table_dbi %>% inner_join(table_dbi, "FECHA") %>% show_query()

image

However, using pool multiple quotes appear which is not supported by Oracle Data Base.

With Pool + ODBC (Oracle Driver)  
pool <- pool::dbPool(drv = odbc::odbc(),Driver = "Oracle") 
table_pool <- tbl(pool,in_schema("CDATAPRE","EXT_DA_DIV_TRM"))  
table_pool %>% inner_join(table_pool, "FECHA") %>% show_query() 

image

Regards,

@mgirlich
Copy link
Collaborator

I created a PR that should fix the issue. As I don't have an Oracle database to test with, it would be great if you could test the PR. You can install it via devtools::install_github("tidyverse/dbplyr").

@ablack3
Copy link
Contributor

ablack3 commented Jun 29, 2023

I'm still getting this error with the current CRAN version of dbplyr but not the current development version FYI.

@mgirlich
Copy link
Collaborator

Thanks for the feedback. I know there was an issue with joins in Oracle in the CRAN version, good to get confirmation it is fixed in the dev version 😄

@ablack3 ablack3 mentioned this issue Jul 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants