Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should DBI::dbWriteTable support overloading value argument to support a SQL query #488

Open
talegari opened this issue Jul 16, 2024 · 6 comments

Comments

@talegari
Copy link

This is one of my typical workflow: con (say odbc) --> tbl --> some_dplyr_ops --> save (as a table). Note that data does NOT come into the R's memory. But the code does look clumpsy:

tbl_dbplyr %>% 
  dbplyr::sql_render() %>% 
  paste("create table a.b.c as", .) %>%
  DBI::dbExecute(con, .)

This could be (intuitively) replaced by:

DBI::dbWriteTable(con, DBI::Id("a", "b", "c"), value = <sql>)
# or: DBI::dbAppendTable(con, DBI::Id("a", "b", "c"), value = <sql>)

Right now, value is understood to be a data.frame in R session's memory. In some cases, some database backends allow value to be a filename too.

Q1. Would overloading DBI::dbWriteTable to cover this case add value without breaking the semantics?
Q2. Having a new generic say DBI::dbWriteQuery make sense?

PS:

  1. dbplyr::compute does this job is most cases. In practice, it has hit roadblocks like "only temporary tables are supported" (based on DB backend) whereas raw DBI::dbExecute gets the job done.
  2. IMHO, support for value = <sql> should be at DBI level, not dbplyr.
@krlmlr
Copy link
Member

krlmlr commented Jul 16, 2024

Thanks. This is a tight balance -- how much SQL generation should DBI support. I see how it would be useful to have this "just work", at the same time, I don't want to take too much responsibility regarding SQL generation.

The specific cases where compute() doesn't work -- are there corresponding issues in the dbplyr repo?

@talegari
Copy link
Author

talegari commented Jul 16, 2024

@krlmlr

  1. About sql generation: DBI need not generate any sql, dbplyr does. Simply put, DBI needs to accept a query like it accepts a dataframe.
  2. compute() is supposed to fail (gracefully) in some cases. AFAIK, Hadley/dbplyr visualizes compute() more like a exploratory tool rather than a serious "write to database" tool. So, not ideal for production usecases.

One example is spark sql where we get a meaningful error.
image

IMHO, DBI has worked like a hard button (DBI::dbExecute) and it is the last resort for what a R-user can do from within R. Whatever did not work with compute() was accomplished this way successfully:

# this works
tile_details_tbl %>% 
  sql_render() %>% 
  paste("CREATE TABLE sandbox.srikanthks.tile_data AS", .) %>% 
  DBI::dbExecute(con, .)

This brings us exactly to:

Now that DBI already does it, should it support the functionality via dbWriteTable (most intuitive place a user would for).

As an individual, I can wrap the previous chunk within a function and make my workflow clean, but I think, getting this as a part of dbWriteTable makes me feel empowered!

@krlmlr
Copy link
Member

krlmlr commented Jul 16, 2024

The "CREATE TABLE sandbox.srikanthks.tile_data AS" part is what I refer to as SQL generation.

The dbplyr message seems to originate here: https://github.com/tidyverse/dbplyr/pull/1379/files#diff-f099d8ec45544adf5c40196344d55c458564434dbaa5d7ad090510f33d5cf63bR115 .

@hadley: do you remember why you added the early exit for compute(temporary = FALSE) in Spark?

@hadley
Copy link
Member

hadley commented Jul 16, 2024

@krlmlr I think that original hack was due to a misunderstanding. I wouldn't replicate it.

@krlmlr
Copy link
Member

krlmlr commented Jul 16, 2024

@hadley: Thanks. Does this mean that you'd accept a PR that fixes compute(temporary = FALSE) on Spark?

@hadley
Copy link
Member

hadley commented Jul 17, 2024

Yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants