New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dbSendQuery documentation unclear regarding when SQL request is executed vs prepared #268
Comments
Thanks for the feedback. RPostgres, RMariaDB, RSQLite and most likely odbc use the same code path both for parametrized and non-parametrized SQL. The SQL will be parsed for placeholders by the database library. Would that be an option for Teradata? I remember problems with RMariaDB where some queries could not be executed as parametrized SQL, we had to work around and revert to a different code path after failure. Perhaps you could look for Happy to adapt/enhance DBItest to this use case. |
We would like to avoid that approach if at all possible. We try to avoid parsing SQL in our drivers.
This is also true for the Teradata Database. Some kinds of SQL requests cannot be prepared. Unfortunately, this approach won't work for us, because the same error code is returned for syntax errors and errors due to a SQL request that can't be prepared.
Yes, this approach could work. This approach would have 3 use cases:
|
Thanks. Unfortunately, the cases 1 and 2 are indistinguishable. How about:
This would be consistent with existing implementations and, I guess, achieve what you're looking for.? |
Yes, that should work. The documentation does not currently talk about the expected behavior for non-parameterized SQL requests. Can the documentation for
Thanks! |
Yes, we need to add test cases and implementations in the different backends. I'll consider this for the upcoming update in February. |
This may be even more relevant for |
@krlmlr I have implemented Many other DBI methods such as Therefore, given how other DBI methods call We should state generally that
Thanks! |
A few loose thoughts:
I skimmed the DBI specification, time of execution/query doesn't currently seem to be specified indeed. It can't be "too early", though: it's currently not mandatory for the caller to indicate whether the query contains placeholders. This is done with a call to
Agree to update the documentation once this is adopted. |
Yes
Yes.
No, we do not need to override
It could, but it does not. Our goal for the Teradata driver is to be compatible with DBI >= 0.8 Therefore, we need the meaning of the
We have already implemented this behavior in the Teradata DBI driver, and it works well with DBI 0.8.
|
Thanks. I hear you, but the suggested treatment of Does your implementation pass the tests in DBItest? It would be great to achieve consensus here. Can we find a way that doesn't break current implementations of RMariaDB, RPostgres and RSQLite and achieves what you are looking for? Compatibility with DBI 0.8 will require overriding a few methods, but can be done. Alternatively, you could require a recent version of DBI. |
If you are willing to change the default implementation for DBI methods, then I recommend that the distinction between preparing and executing a parameterized SQL request be made explicit by introducing two new methods: In addition, the DBI documentation must be changed to clearly articulate the behavior of each of the DBI methods with respect to parameterized versus non-parameterized SQL. Here is a proposal that I believe is compatible with existing DBI drivers... (Existing method, modified documentation, modified default implementation)
(New method)
(Existing method, modified documentation, unchanged default implementation)
(Existing method, modified documentation, modified default implementation)
(New method)
(Existing method, modified documentation)
|
Thanks. I was contemplating new interface methods, it's a bit of work but seems worth it. I'm not sure I follow your proposal entirely, I'll draft mine and we see if we can meet halfway. I'm proposing two new methods,
Is the |
@krlmlr Yes, your proposed methods sound similar to mine. |
@tomcnolan: I feel that new generics are a bit too much for this problem. Did we discuss an optional argument to |
@krlmlr Clearly people expect
I disagree. DBI lacks a distinction between preparing and executing a SQL request. Other database access APIs such as ODBC and JDBC offer separate API functions for those two operations.
No, we have not discussed that approach yet. In my opinion, adding a non-default We have implemented the following behavior in the DBI driver for Teradata:
|
@tomcnolan it's not feasible to change the meaning of
It's not clear that is better than adding new I think adding new generics here is too heavy. |
@hadley @krlmlr People expect |
@tomcnolan: SummaryDifferent interfaces for prepared queries and immediate execution found in (at least) Teradata, ODBC, MSSQL and MySQL/MariaDB. The current default is to always use prepared queries. We need to offer a clear and unambiguous way to access the "immediate" API. I agree with @hadley that we shouldn't make backward-incompatible changes at this point. I also agree with @tomcnolan that we should make it easy for backend implementers to support the new interface so that no behavioral differences occur. Tests will be one aspect of this, but these might be difficult to implement in a backend-agnostic way. I also think we need to differentiate between The "immediate" API is optional, but important for some queries/backends. I rather like the idea of an optional Most users will call
|
@krlmlr Yes, I agree with this proposal. Thanks! |
@krlmlr I have a follow-up question regarding this topic:
What is the expected behavior from the driver when the app specifies |
I'll add it to the spec very soon, there's a related PR in the {odbc} package. The current implementation for {odbc}, {RSQLite}, {RMariaDB} and {RPostgres} always uses the corresponding "prepared query" API, even for queries/statements without parameters. I'd argue that Can Teradata use the "prepared query" API for queries without parameters? |
For the Teradata DBI Driver, we have added the The Teradata DBI Driver will behave as follows for the
The only remaining question is the expected behavior of the
So it seems reasonable that |
The Teradata Database allows almost all non-parameterized SQL requests to be prepared. Some (obviously non-parameterized) DDL commands cannot be prepared, so the Teradata DBI Driver cannot blindly attempt to prepare all SQL requests. The Teradata DBI Driver must assume by default that an app intends to execute a SQL request, and only prepare a SQL request if it's unambiguously certain that the app intended to prepare a SQL request. |
I think the following implementation sketch of dbGetQuery <- function(..., params = NULL, immediate = NA) {
immediate <- decide_immediate(immediate, params, ...)
res <- dbSendQuery(..., params = params, immediate = immediate)
on.exit(dbClearResult(res))
dbFetch(res, ...)
} This doesn't differ much from the current implementation, only the Similarly for Ideally you shouldn't even need to override Thank you for your patience and for your very valuable input! |
@krlmlr We still need guidance as to the expected behavior from |
Here is what our implementation of setMethod ("dbGetQuery", signature ("TeradataConnection", "character"), function (conn, statement, n = -1, params = NULL, immediate = NA, ...) {
# Optional arguments params and immediate appear before dots in order to accommodate partial argument matching.
if (conn@m_bTraceLog) {
cat (paste0 ("> enter dbGetQuery ", conn, " immediate=", immediate, " ", statement, "\n"))
on.exit (cat (paste0 ("< leave dbGetQuery ", conn, " immediate=", immediate, " ", statement, "\n")))
}
res <- DBI::dbSendQuery (conn, statement, params, immediate)
tryCatch ({
return (DBI::dbFetch (res, n = n))
}, finally = {
DBI::dbClearResult (res)
}) # end finally
}) # end method dbGetQuery |
What needs to be done to actually execute the request in the Have you seen DBIlog, which is a generic logging wrapper for DBI connections? I'm planning to send it to CRAN soon. |
The DBI documentation says that
In both your version of The |
I see now that the text is unclear -- it's referring only to queries that contain placeholders. For queries that don't contain placeholders, the
We might need a basic documentation that describes these workflows, for both parametrized and non-parametrized queries. |
OK, let's assume that the SQL request is parameterized. What is the expected behavior of the In that situation, both your version of |
Yes, this is the expected behavior, regardless of the value of the library(DBI)
conn <- dbConnect(RSQLite::SQLite())
dbGetQuery(conn, "SELECT ? AS a", params = list(1))
#> a
#> 1 1
dbGetQuery(conn, "SELECT ? AS a")
#> Error: Query needs to be bound before fetching Created on 2019-04-23 by the reprex package (v0.2.1.9000) |
@krlmlr @hadley We suspect that other existing libraries, like dbplyr, also expect The Teradata DBI Driver must be able to interoperate with dbplyr and other existing libraries that expect immediate execution from The Teradata DBI Driver will behave as follows for the
|
The following reprex gives an overview over the expected behavior (irrespective of the library(DBI)
con <- dbConnect(RSQLite::SQLite())
res <- dbSendQuery(con, "SELECT 1 AS a")
dbFetch(res)
#> a
#> 1 1
dbClearResult(res)
res <- dbSendQuery(con, "SELECT 1 AS a")
dbBind(res, params = list())
#> Error: Query does not require parameters.
dbFetch(res)
#> a
#> 1 1
dbClearResult(res)
res <- dbSendQuery(con, "SELECT ? AS a")
dbBind(res, params = list(1))
dbFetch(res)
#> a
#> 1 1
dbClearResult(res)
res <- dbSendQuery(con, "SELECT ? AS a")
dbFetch(res)
#> Error: Query needs to be bound before fetching
dbClearResult(res)
dbDisconnect(con) Created on 2019-04-24 by the reprex package (v0.2.1.9000)
Currently, Let me try to rephrase the problem, please correct me if I'm mistaken.
|
@krlmlr Yes, your points 1, 2, 3 are correct. For the Teradata DBI Driver to provide this behavior from your example, res <- dbSendQuery(con, "SELECT 1 AS a")
dbFetch(res)
#> a
#> 1 1
dbClearResult(res) Per your point 3, the driver cannot blindly attempt to prepare all SQL requests, because the Teradata Database supports a prepare operation for most, but not every, kind of SQL request. The error returned by the Teradata Database for an unsupported prepare operation is the same as the generic syntax error. The Teradata DBI Driver needs explicit unambiguous direction from the app as to whether the app wants to prepare versus the app wants to execute. Note that this problem doesn't arise in other language database APIs such the JDBC API which offers non-parameterized SQL functionality via |
Thanks. It feels that lifting the requirement that When to send the query to the database?The Teradata DBI Driver should send the query to the database on one of the following events:
The following example illustrates these modes of operation: library(DBI)
# Database communication occurs after <<<<< lines
con <- dbConnect(RSQLite::SQLite())
## <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
res <- dbSendQuery(con, "SELECT ? AS a", params = list(1))
dbFetch(res)
#> a
#> 1 1
dbClearResult(res)
res <- dbSendQuery(con, "SELECT 1 AS a")
## <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
dbFetch(res)
#> a
#> 1 1
dbClearResult(res)
res <- dbSendQuery(con, "SELECT ? AS a")
## <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
dbBind(res, params = list(1))
dbFetch(res)
#> a
#> 1 1
dbClearResult(res) Created on 2019-04-25 by the reprex package (v0.2.1.9000) As per your requirement, no client-side checking occurs, and the query and all parameters are sent in at the same time. No need to introduce new arguments to library(DBI)
con <- dbConnect(RSQLite::SQLite())
dbSendQuery(con, "SELECTT ? AS a")
#> Error: near "SELECTT": syntax error Created on 2019-04-25 by the reprex package (v0.2.1.9000) It feels like Most users will/should use "Prepared" vs. "direct" APIThis would come in addition to the behavior described above. It would only hint at the choice of the API. In particular, It would help if you please could point me to the upstream API the Teradata DBI Driver is using, and perhaps to the DBI Driver's source code. |
From my perspective, the duality of |
Unfortunately, that ship has sailed now -- I'd rather not add new methods to DBI at this time. |
- Specify `immediate` argument to `dbSendQuery()`, `dbGetQuery()`, `dbSendStatement()` and `dbExecute()` (#268).
- Specify `immediate` argument (r-dbi/DBI#268).
DBI 1.1.0 is on CRAN now. Thanks again for your input! |
What we plan to do in DuckDB is always internally prepare statements, even if no parameters are given. |
This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary. |
Hi folks, my team at Teradata is implementing a DBI Driver for the Teradata Database.
The documentation for dbSendQuery says:
The documentation for dbBind says:
The documentation appears to be saying that the dbSendQuery method will execute a non-parameterized SQL request, and will prepare (but not execute) a parameterized SQL request.
The documentation does not indicate how a DBI Driver should distinguish between a parameterized versus non-parameterized SQL request, in order to decide how to transmit the SQL request to the backend database in the appropriate manner (prepare versus execute).
In other database APIs, such as JDBC, there are separate API calls for prepare versus execute, so expected driver behavior is obvious.
My expectation is that the dbSendQuery documentation can be augmented to explain the decision-making criteria for whether the specified SQL request is to be prepared versus executed.
In particular, the documentation should indicate whether this behavior is driver-specific. If there is a recommended way for the application to indicate to the driver whether to prepare versus execute, such as an extra named parameter for the dbSendQuery, then the documentation should mention that.
The text was updated successfully, but these errors were encountered: