Optionally avoid scratch #117

Open
wants to merge 5 commits into
base: master
Conversation

wlandau
Contributor

@wlandau wlandau commented Dec 3, 2019

As mentioned in #80, some use cases of RDS storrs require atomic writes, which depend on the scratch directory. However, writing to scratch and then moving the file creates a bottleneck on some systems, Windows in particular. This workflow spends a lot of time renaming tiny files, and the total runtime was around 104 seconds on my machine.

[Image: before-104s]

The changes in this PR cut the total runtime down to about 50 seconds, and file.rename() is no longer a bottleneck.

[Image: after-50s]
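For readers unfamiliar with the trade-off, here is a minimal sketch of the two write strategies discussed above. It is not storr's actual code: `write_via_scratch()` and `write_direct()` are hypothetical helper names, and the directory layout is an assumption made only for illustration.

```r
# Hypothetical helpers for illustration; these are NOT part of storr's API.

# Strategy 1: write to a scratch directory, then move the file into place.
write_via_scratch <- function(value, path, scratch_dir) {
  # Serialize into a temporary file inside the scratch directory first.
  tmp <- tempfile(tmpdir = scratch_dir)
  saveRDS(value, tmp)
  # Move the finished file to its destination. Readers never see a
  # half-written file, provided scratch and destination share a file system.
  ok <- file.rename(tmp, path)
  if (!ok) stop("could not move ", tmp, " to ", path)
  invisible(path)
}

# Strategy 2: write straight to the destination. This saves one file
# operation per object, but a crash or a concurrent reader can observe
# a partially written file.
write_direct <- function(value, path) {
  saveRDS(value, path)
  invisible(path)
}

# Assumed layout: a store root with "scratch" and "data" subdirectories.
root <- tempfile("store_")
scratch_dir <- file.path(root, "scratch")
data_dir <- file.path(root, "data")
dir.create(scratch_dir, recursive = TRUE)
dir.create(data_dir)

write_via_scratch(mtcars, file.path(data_dir, "a.rds"), scratch_dir)
write_direct(mtcars, file.path(data_dir, "b.rds"))
```

With many tiny objects, the extra `file.rename()` call per write in the first strategy is the cost that the profiling above points at.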

I still need to do more digging to make sure people can disable scratch with drake, but since progress logging is different from what it once was, I think it is worth a shot.

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] withr_2.1.2          tibble_2.1.3         storr_1.2.2          microbenchmark_1.4-7
[5] MakefileR_1.0        profile_1.0          fs_1.3.1             drake_7.7.0.9002    

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3      txtq_0.2.0      crayon_1.3.4    digest_0.6.23   R6_2.4.1       
 [6] backports_1.1.5 magrittr_1.5    pillar_1.4.2    rlang_0.4.2     rstudioapi_0.10
[11] filelock_1.0.2  tools_3.6.1     igraph_1.2.4.2  yaml_2.2.0      compiler_3.6.1 
[16] pkgconfig_2.0.3 base64url_1.4  

@wlandau
Contributor Author

@wlandau wlandau commented Dec 6, 2019

Hmm... the segfault on Travis seems to trace back to has_postgres(). @richfitz, what do you suggest we do?

@wlandau
Contributor Author

@wlandau wlandau commented Dec 7, 2019

Also, I just remembered that even if the keys are different, the data might still be the same, which creates a race condition. I ran right into that Chesterton's fence 😅. But we might still be able to save time by skipping scratch for the keys in many use cases, including drake.
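To illustrate why different keys can still collide, here is a simplified model of a content-addressed store, where values are saved under a hash of their contents and keys only point at that hash. The helper names (`hash_object()`, `set_value()`) and the layout are assumptions for this sketch, not storr's implementation, and MD5 via `tools::md5sum()` is used only to avoid extra dependencies.

```r
# Simplified, hypothetical model of a content-addressed store.
store_dir <- tempfile("cas_")
dir.create(file.path(store_dir, "keys"), recursive = TRUE)
dir.create(file.path(store_dir, "data"))

# Hash an R object by serializing it to a temporary file and hashing the file.
hash_object <- function(value) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(value, NULL), tmp)
  unname(tools::md5sum(tmp))
}

# Store the value under its hash, then point the key at that hash.
set_value <- function(key, value) {
  hash <- hash_object(value)
  data_path <- file.path(store_dir, "data", paste0(hash, ".rds"))
  # Unguarded check-then-write: two processes storing the same value under
  # different keys can both reach this line and write the same file at once.
  if (!file.exists(data_path)) saveRDS(value, data_path)
  writeLines(hash, file.path(store_dir, "keys", key))
  data_path
}

path_a <- set_value("a", 1:10)
path_b <- set_value("b", 1:10)
identical(path_a, path_b)
```

In this model, both keys resolve to the same data file, so writing the data directly (without scratch) is only safe when no other process can be writing an identical value at the same time. The small key files, by contrast, are distinct per key.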

2 participants