New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Non standard handling of iso-date strings in PrestoDB/AWS Athena #384
Comments
It's not super elegant, but you can use the technique below to work around the problem in the interim: library(dbplyr)
library(dplyr, warn.conflicts = FALSE)
athena_date <- function(x) sql(paste0("date('", x, "')"))
today <- Sys.Date()
tbl_lazy(mtcars) %>%
filter(date_column <= !!athena_date(today))
#> <SQL>
#> SELECT *
#> FROM `df`
#> WHERE (`date_column` <= date('2019-12-13')) Created on 2019-12-13 by the reprex package (v0.3.0) Otherwise, you have to wait for a custom backend to provide the correct escaping behaviour. |
Thanks for the response! The question would still be how to implement the escaping in the custom backend "RAthena"? Any rough guidelines would be highly appreciated, to help the development of the package. |
I don't have that loaded in my head right now, sorry. But it should be possible — you'll just need to track through the various escaping generics in dbplyr. |
I've been scratching my head trying to track the right generics, but a bit of a beginner with those, so no success so far. Well, I'll scratch some more, knowing doing this should be possible. |
Oh hmmm, maybe it's not currently possible? It looks like we'd need a new |
OK! Here's some documentation on Presto's datatypes (Athena is Presto under the hood): https://prestosql.io/docs/current/language/types.html In general, Presto doesn't automatically convert between varchar and other types, even for constants. So for a timestamp constant one would use "timestamp '2019-01-01 01:01:01.000'", for dates "date '2019-01-01'" and so on. Edit: the official (pretty much identical) documentation, as prestosql is a fork of prestodb: https://prestodb.io/docs/current/language/types |
I tried my hand on implementing the required changes, which can be found in the related pull request. Pretty new to this process, so hopefully it's not a complete mess and didn't step on too many toes along the way. Is it, btw, better to have the backend-athena.R type of content (the specific method implementations) here or in the custom backend package? In your book 'Advanced R' you point out that the owner of the generic should define the meththods too, but there seem to be exceptions to the rule with these backends. |
Fixed in #391 |
I wasn't paying attention and you're right, these should live in the backend packages. So I've removed from dbplyr and filed issues: |
Most databases are able to interpret iso-date strings as dates in, e.g., "where" clauses, such as
select * from table where date_column = '2019-01-01'
This is not the case with PrestoDB/AWS Athena, however. Instead, the strings need to be cast as dates first like:
select * from table where date_column = date('2019-01-01')
,select * from table where date_column = date'2019-01-01'
orselect * from table where date_column = cast('2019-01-01' as date)
However, when a date variable is used as a parameter id dplyr/dbplyr, is is translated to a iso-date string in the generated sql, so a command such as:
produces and error, as the generated sql does not have the explicit cast to date of the iso-date string. One needs to manually cast the date variable as a date as in
... %>% filter(date_column <= as.Date(today))
, but this feels unnecessary, as R knows today is a date already.A DBI backend for AWS Athena is being developed at https://github.com/DyfanJones/RAthena, and I'm asking for some guidance on how to address this issue there.
My question is: is there a method for overriding the default behavior of dbplyr, where dates are translated to quoted strings and instead have them translated to, e.g., "date'2019-01-01'"?
I'm sorry if this seems like a trivial question, but I'm drawing a blank from reading the database specific source files and packages. All I can come up with is to use assignInNameSpace to replace the escape.Date -method, but this seems pretty brutal.
Thanks for any hints!
The text was updated successfully, but these errors were encountered: