Kirill method #8
Thanks for looking into it! This is what I was thinking about; I haven't reviewed the tests yet.
@@ -36,9 +40,11 @@
 #' }
 ark <- function(db_con, dir, lines = 10000L,
                 compress = c("bzip2", "gzip", "xz", "none"),
-                tables = list_tables(db_con)){
+                tables = list_tables(db_con),
+                use_alternate = FALSE){
krlmlr
Jul 25, 2018
Perhaps a method = c("keep-open", "window", "sql-window", "manual-window") would give more flexibility in the long run? I'm not in love with "keep-open", but let's think of a better name for the method than kirill ;-)
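A minimal sketch of how such a `method` argument is usually validated in R, using `match.arg()`. The function name `ark_method` is hypothetical and only illustrates the argument handling; the four choices are the ones suggested above.

```r
# Hypothetical helper: validate a `method` argument against the proposed choices.
# match.arg() returns the first choice when the argument is missing and
# raises an error for anything that is not one of the choices.
ark_method <- function(method = c("keep-open", "window",
                                  "sql-window", "manual-window")) {
  match.arg(method)
}

default_method <- ark_method()          # falls back to the first choice
window_method  <- ark_method("window")  # explicit choice
bad_method     <- try(ark_method("kirill"), silent = TRUE)  # not a valid choice
```

With this pattern the default method is simply whichever name is listed first, so reordering the vector changes the default without touching the function body.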
cboettig
Jul 25, 2018
Author
Member
Nice, I like this. I struggled to find concise, meaningful names and clearly didn't come up with any. keep-open is intuitive.
I think you're suggesting here that "window" would be the method that uses OFFSET, and "sql-window" would be the method that uses BETWEEN? (Might be good to bypass my 'automatic' has_between() method.)
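To make the distinction concrete, here is a rough sketch of the two query shapes being discussed. The helper `window_query` and the `id_col` column (here `row_id`) are hypothetical and not taken from the package; the BETWEEN form assumes the table has a contiguous integer row identifier starting at 1.

```r
# Hypothetical helper: build the SQL for one chunk of `lines` rows that
# starts after `start` rows have already been exported.
window_query <- function(tablename, start, lines,
                         method = c("window", "sql-window"),
                         id_col = "row_id") {   # assumed column name
  method <- match.arg(method)
  switch(method,
    # OFFSET-based paging: widely supported, but the database may have to
    # scan past all skipped rows on every chunk.
    "window" = sprintf("SELECT * FROM %s LIMIT %.0f OFFSET %.0f",
                       tablename, lines, start),
    # BETWEEN-based paging: filters on a row identifier instead.
    "sql-window" = sprintf("SELECT * FROM %s WHERE %s BETWEEN %.0f AND %.0f",
                           tablename, id_col, start + 1, start + lines)
  )
}

q_offset  <- window_query("flights", start = 100000, lines = 10000)
q_between <- window_query("flights", start = 100000, lines = 10000,
                          method = "sql-window")
```

Making the choice explicit through `method` means the caller decides which form to use, rather than relying on automatic detection.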
## Create header to avoid duplicate column names
query <- paste("SELECT * FROM", tablename, "LIMIT 0")
header <- DBI::dbGetQuery(db_con, query)
readr::write_tsv(header, con, append = FALSE)
krlmlr
Jul 25, 2018
I wonder if we can get rid of this dbGetQuery() call and handle also the initialization in the loop.
cboettig
Jul 25, 2018
Author
Member
That would be great. Any suggestions? (I tried just commenting this out, which kind of works with the readr functions, since they don't mind the headers being repeated: readr::read_tsv() detects this and removes it. But that's clearly not a robust solution.)
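One possible way to fold the header initialization into the loop, sketched under assumptions: the function name `export_keep_open` is hypothetical, the output is an uncompressed file path rather than the compressed connection used in the package, and the demo uses an in-memory RSQLite database. The idea is to keep a single result set open and let the first `write_tsv()` call (with `append = FALSE`) write the column names.

```r
library(DBI)

# Hypothetical helper: stream one table to a TSV file in chunks while
# keeping a single result set open for the whole export.
export_keep_open <- function(db_con, tablename, path, lines = 10000L) {
  res <- DBI::dbSendQuery(db_con, paste("SELECT * FROM", tablename))
  on.exit(DBI::dbClearResult(res), add = TRUE)

  first <- TRUE
  repeat {
    chunk <- DBI::dbFetch(res, n = lines)
    # append = FALSE on the first chunk writes the header exactly once;
    # later chunks are appended without column names.
    readr::write_tsv(chunk, path, append = !first)
    first <- FALSE
    if (DBI::dbHasCompleted(res)) break
  }
  invisible(path)
}

# Demo on a small in-memory table (assumes RSQLite is installed).
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "demo",
                  data.frame(id = 1:25, x = letters[1:25]),
                  row.names = FALSE)
out <- tempfile(fileext = ".tsv")
export_keep_open(con, "demo", out, lines = 10L)
back <- read.delim(out, stringsAsFactors = FALSE)
DBI::dbDisconnect(con)
```

Because the first fetch returns a data frame with the right column names even when the table is empty, no separate `LIMIT 0` query is needed in this sketch.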
append = append)

}

## need to convert large integers to characters
sql_integer <- function(x){
krlmlr
Jul 25, 2018
Would that be an option?
sprintf("%.0f", 1e13)
#> [1] "10000000000000"
Created on 2018-07-25 by the reprex package (v0.2.0).
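For context, a small sketch of why this matters when building queries. The body of `sql_integer` below is an assumed one-line completion based on the suggestion above, not the code from the pull request.

```r
# Assumed completion of sql_integer(): format whole numbers without
# scientific notation so they can be pasted into SQL safely.
sql_integer <- function(x) {
  sprintf("%.0f", x)
}

# Default character conversion may switch to scientific notation for
# large round numbers, which is not valid in an OFFSET clause.
naive_query <- paste("SELECT * FROM flights LIMIT 10000 OFFSET", 1e13)
safe_query  <- paste("SELECT * FROM flights LIMIT 10000 OFFSET",
                     sql_integer(1e13))
```

Unlike changing `options(scipen = ...)`, this leaves global state untouched and only affects the values that are explicitly formatted.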
cboettig
Jul 25, 2018
Author
Member
Yes! That's much nicer than messing with the scipen option...
@krlmlr Thanks for the feedback, I've implemented the |
Implements the method proposed by @krlmlr in #7. Kirill, it would be great if you could take a quick glance at this and let me know whether it looks like what you proposed. It does seem to work nicely in my (unit) tests.