Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kirill method #8

Merged
merged 14 commits into from Jul 25, 2018
Merged

Kirill method #8

merged 14 commits into from Jul 25, 2018

Conversation

@cboettig
Copy link
Member

@cboettig cboettig commented Jul 25, 2018

Implements the method proposed by @krlmlr in #7. Kirill, would be great if you could take a quick glance at this and let me know if this looks like what you proposed. Does seem to work nicely in my (unit) tests.

Copy link

@krlmlr krlmlr left a comment

Thanks for looking into it! This is what I was thinking about, I haven't reviewed the tests yet.

R/ark.R Outdated
@@ -36,9 +40,11 @@
#' }
ark <- function(db_con, dir, lines = 10000L,
compress = c("bzip2", "gzip", "xz", "none"),
tables = list_tables(db_con)){
tables = list_tables(db_con),
use_alternate = FALSE){

This comment has been minimized.

@krlmlr

krlmlr Jul 25, 2018

Perhaps a method = c("keep-open", "window", "sql-window", "manual-window") would give more flexibility in the long run? I'm not in love with "keep-open", but let's think of a better name for the method than kirill ;-)

This comment has been minimized.

@cboettig

cboettig Jul 25, 2018
Author Member

nice, I like this. I struggled with concise meaningful names and clearly didn't come up with something. keep-open is intuitive.

I think you're suggesting here that "window" would be the method that uses OFFSET, and "sql-window" would be the method that uses BETWEEN? (might be good to bipass my 'automatic' has_between() method.

R/ark.R Outdated
## Create header to avoid duplicate column names
query <- paste("SELECT * FROM", tablename, "LIMIT 0")
header <- DBI::dbGetQuery(db_con, query)
readr::write_tsv(header, con, append = FALSE)

This comment has been minimized.

@krlmlr

krlmlr Jul 25, 2018

I wonder if we can get rid of this dbGetQuery() call and handle also the initialization in the loop.

This comment has been minimized.

@cboettig

cboettig Jul 25, 2018
Author Member

That would be great. Any suggestions? (I tried just commenting this out, which kinda works with the readr functions since they actually don't mind the headers being repeated, readr::read_tsv() detects this and remove it, but clearly not a robust solution).

R/ark.R Outdated
append = append)

}

## need to convert large integers to characters
sql_integer <- function(x){

This comment has been minimized.

@krlmlr

krlmlr Jul 25, 2018

Would that be an option?

sprintf("%.0f", 1e13)
#> [1] "10000000000000"

Created on 2018-07-25 by the reprex package (v0.2.0).

This comment has been minimized.

@cboettig

cboettig Jul 25, 2018
Author Member

yes! that's much nicer than messing with scipen option...

cboettig added 8 commits Jul 25, 2018
no idea why travis and appveyor are each choosing a different test to fail on.  Cannot reproduce.
@cboettig cboettig merged commit 8642ba1 into master Jul 25, 2018
5 of 6 checks passed
5 of 6 checks passed
codecov/patch 84% of diff hit (target 91.22%)
Details
codecov/project 91.4% (+0.17%) compared to 01b3759
Details
continuous-integration/appveyor/branch AppVeyor build succeeded
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
@cboettig
Copy link
Member Author

@cboettig cboettig commented Jul 25, 2018

@krlmlr Thanks for the feedback, I've implemented the method = strategy you proposed, and fixed the handling of sci notation etc. Merging now as I think this is a solid improvement on the previous approach, but further feedback still welcome.

@cboettig cboettig deleted the kirill-method branch Jul 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.