tbl %>% compute(name="test.table") tries to return a tbl-pointer for, e.g., "default.test.table" #74

Closed
OssiLehtinen opened this issue Jan 31, 2020 · 5 comments

@OssiLehtinen

Issue Description

Doing something like the following:

tbl(con, "table1") %>%
  compute(name = "test.table")

will result in an error, such as:

Error: SYNTAX_ERROR: line 2:6: Table awsdatacatalog.default.test.table does not exist

The command does create the table 'table' in the database 'test', but compute then tries to return a reference to a table named 'test.table' in the 'default' database (or whichever database you set when connecting to Athena).

@DyfanJones DyfanJones added the bug Something isn't working label Jan 31, 2020
@DyfanJones
Owner

DyfanJones commented Jan 31, 2020

Sorry about this. I think it is due to the compute method in dbplyr. Once the table has been created in Athena, dbplyr::compute calls the following:

tbl(x$src, name) %>%
    group_by(!!! syms(op_grps(x))) %>%
    add_op_order(op_sort(x))

This means it will look for a table called "test.table" in the connection's current schema. Have you tried changing your connection to:

library(DBI)
library(dplyr)

con <- dbConnect(RAthena::athena(), schema_name = "test")

tbl(con, "table1") %>%
  compute(name = "table")
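
To make the mismatch concrete, here is a minimal base R sketch of the two ways the string "test.table" can be read. The helper `split_table_name` is hypothetical and exists only for illustration; it is not part of RAthena or dbplyr, and it is not meant to describe how either package actually resolves names.

```r
# Hypothetical helper (an assumption, not RAthena/dbplyr code): split a
# "schema.table" string into its parts, falling back to a default schema
# when the name contains no dot.
split_table_name <- function(name, default_schema = "default") {
  parts <- strsplit(name, ".", fixed = TRUE)[[1]]
  if (length(parts) == 1) {
    list(schema = default_schema, table = parts)
  } else if (length(parts) == 2) {
    list(schema = parts[1], table = parts[2])
  } else {
    stop("Expected 'table' or 'schema.table', got: ", name)
  }
}

# Reading the whole name as one table inside the default schema,
# which matches the name shown in the error message.
single_identifier <- paste("default", "test.table", sep = ".")

# Reading the name as schema + table, which matches where the
# table was actually created.
parsed <- split_table_name("test.table")
qualified <- paste(parsed$schema, parsed$table, sep = ".")

print(single_identifier)
print(qualified)
```

The first reading yields a table in the 'default' schema that does not exist, while the second points at the table the query created.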

DyfanJones pushed a commit that referenced this issue Jan 31, 2020
@DyfanJones
Owner

DyfanJones commented Jan 31, 2020

I managed to create a solution that doesn't require rebuilding the dbplyr::compute method. Please check out PR #72.

library(DBI)
library(dplyr)

# connecting to default schema
con <- dbConnect(RAthena::athena())

# writing to the temp schema
tbl(con, "iris") %>%
  compute(name = "temp.iris")

Info: (Data scanned: 1.35 KB)
# Source:   table<temp.iris> [?? x 5]
# Database: Athena 1.11.5 [eu-west-1/default]
   sepal_length sepal_width petal_length petal_width species
          <dbl>       <dbl>        <dbl>       <dbl> <chr>  
 1          5.1         3.5          1.4         0.2 setosa 
 2          4.9         3            1.4         0.2 setosa 
 3          4.7         3.2          1.3         0.2 setosa 
 4          4.6         3.1          1.5         0.2 setosa 
 5          5           3.6          1.4         0.2 setosa 
 6          5.4         3.9          1.7         0.4 setosa 
 7          4.6         3.4          1.4         0.3 setosa 
 8          5           3.4          1.5         0.2 setosa 
 9          4.4         2.9          1.4         0.2 setosa 
10          4.9         3.1          1.5         0.1 setosa 
# … with more rows

Note: This behaviour differs from a lot of other DBI packages that integrate with dplyr. They will return the error you reported above.

@DyfanJones DyfanJones self-assigned this Jan 31, 2020
@OssiLehtinen
Author

Alright, the fix in PR #72 seems to do it.

I agree this is a bit of a deviation from the 'standard', but don't see an immediate issue with it. One can still use %>% compute(name = in_schema("schema", "table")) as usual and things work ok.

@DyfanJones
Owner

Closing the issue. If you find anything else, please raise a ticket and I will try to get it in before the next CRAN release.

DyfanJones pushed a commit that referenced this issue Jan 31, 2020
@DyfanJones DyfanJones mentioned this issue Jan 31, 2020
@DyfanJones
Owner

These changes have been pushed to CRAN.
