Related to #1163, I found myself having to write some methods from scratch to support sparkSQL (see @RevolutionAnalytics/dplyr.spark) that do what the methods for other databases do, with minor variations. I'm not sure it's feasible or urgent to provide an intermediate layer of abstraction, but I thought I'd list them here, since the duplication seems bad from a code-reuse standpoint, and maybe supporting new DBs could be made simpler.
- `db_explain` doesn't have a `DBIConnection` method. The `MySQLConnection` method works; I suspect that's standard SQL and it should become a `DBIConnection` method.
- `db_insert_into`: only a very minor difference between HiveQL and MySQL here, namely the file separator and the reserved word INPATH vs. INFILE. I would suggest a SQL-generating generic (like `sql_join`) instead of hardcoding the SQL in this method; for the file input, I'm not sure what I would do.
- `sql_join`: only the `DBIConnection` method exists. The differences between this and HiveQL are: extension to unique column names; ON syntax with fully qualified names instead of USING; adding a class `"join"` to the return value; not nesting the SQL inside a SELECT * FROM. I still have problems with duplicated columns, so this list may not be complete.
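To make the `db_explain` point concrete, here is a minimal sketch of what a shared `DBIConnection` method might look like, modeled on the existing `MySQLConnection` method. The use of `build_sql` and the exact output handling are assumptions based on how dplyr's other backend methods are written, not a tested patch:

```r
# Hypothetical sketch: promote db_explain to a DBIConnection method,
# assuming the plain "EXPLAIN <query>" prefix is standard enough
# across backends (it works for MySQL and HiveQL at least).
db_explain.DBIConnection <- function(con, sql, ...) {
  exsql <- build_sql("EXPLAIN ", sql, con = con)  # dplyr's SQL builder
  expl <- DBI::dbGetQuery(con, exsql)             # fetch the query plan
  # Return the plan as a single printable string, as the MySQL method does
  paste(utils::capture.output(print(expl)), collapse = "\n")
}
```

Backends whose EXPLAIN syntax differs (e.g. PostgreSQL's `EXPLAIN (FORMAT ...)`) could still override this with their own method, so the generic default costs nothing.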
`db_insert_into()`: I think it's better to fix this at the DBI level.
`sql_join()`: I think you'll need to do what you want in your own method. Making `sql_join()` more flexible is going to be tricky, and, as you can tell by how long it took me to respond to this issue, relying on me is likely to create a bottleneck.
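Overriding `sql_join()` in the backend, as suggested, might look roughly like the sketch below for a hypothetical `HiveConnection` class. It shows the two HiveQL differences mentioned above: ON with fully qualified names instead of USING, and a `"join"` class on the return value. All names here (`HiveConnection`, the `_LEFT`/`_RIGHT` table aliases) are illustrative, not actual dplyr.spark code:

```r
# Illustrative sketch of a backend-specific sql_join method for HiveQL.
# Assumes dplyr's sql() and build_sql() helpers, and that x and y are
# SQL fragments already aliased as _LEFT and _RIGHT by the caller.
sql_join.HiveConnection <- function(con, x, y, type = "inner",
                                    by = NULL, ...) {
  join <- switch(type,
    inner = sql("INNER JOIN"),
    left  = sql("LEFT OUTER JOIN"),
    stop("Unknown join type: ", type, call. = FALSE)
  )
  # HiveQL wants ON with qualified column names rather than USING (...):
  # _LEFT.col = _RIGHT.col for each join column, combined with AND.
  on <- sql(paste0("_LEFT.", by, " = _RIGHT.", by, collapse = " AND "))
  res <- build_sql(x, " ", join, " ", y, " ON ", on, con = con)
  # Tag the result so downstream methods can recognize a join
  structure(res, class = c("join", class(res)))
}
```

The duplicated-column problem would still need handling on top of this, e.g. by renaming the right-hand table's shared columns before building the join.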