Parallelization in tar_render_rep() #36
Comments
You are right, multiple workers are trying to write to the same temporary directory. We need to set |
For reference, sometimes I still get this when I use multicore parallelism (but no errors with multiprocess parallelism):
> tar_make_clustermq(workers = 10)
Loading required package: future
● start target report_params
● built target report_params
● start branch report_05f093ab
● start branch report_082d5890
● start branch report_b31bb919
● start branch report_d0f80795
● start branch report_c5cceb83
● start branch report_f3ab2a96
● start branch report_557b3a5e
● start branch report_8f41e77f
● start branch report_51a54713
● start branch report_5779b1db
pandoc: /var/folders/k3/q1f45fsn4_13jbn0742d4zj40000gn/T//RtmpvbnUEf/rmarkdown-str17b97612e1f10.html: openBinaryFile: does not exist (No such file or directory)
pandoc: /var/folders/k3/q1f45fsn4_13jbn0742d4zj40000gn/T//RtmpvbnUEf/rmarkdown-str17ba155f17ca4.html: openBinaryFile: does not exist (No such file or directory)
After some experimentation, I am still not sure why |
Thank you very much for the fix 👍 Anyway, I have found some mistakes in my reprex and I am actually surprised it worked 😄 I fixed the Rmd generation. The error you are getting is related to rstudio/rmarkdown#1632 (comment) and could be fixed by a dirty fix |
Thanks for filling me in on the rest of the issue. Really helps to understand. I agree, I think the rest is outside the control of |
Prework
tarchetypes and not a known limitation, a usage error, or a bug in another package that tarchetypes depends on.
Description
When tar_render_rep() is run in parallel (tested with clustermq), intermediate knitr files with the same names (e.g. <Rmd_file>.knit.md) are shared by all workers and are therefore removed before pandoc runs. assignInNamespace("clean_tmpfiles", function() {}, ns = "rmarkdown") is used because of rstudio/rmarkdown#1632 (comment).
Reproducible example
Created on 2021-04-06 by the reprex package (v1.0.0)
Session info
Expected result
A different intermediates_dir should be passed to each rmarkdown::render() call in order to avoid the concurrent removal of intermediate knitr files. My suggestion: for intermediates_dir, use the basenames of output_files or create randomly named directories.
Diagnostic information
See Reproducible example
Thanks in advance for looking into this! For now, I have to run the pipeline sequentially with tar_make() to avoid this problem.
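To illustrate the suggestion under Expected result, here is a minimal sketch of rendering each report with its own intermediates directory. It is not the fix that went into tarchetypes; unique_intermediates_dir() and render_isolated() are hypothetical helper names, and the only rmarkdown API assumed is the intermediates_dir argument of rmarkdown::render(). Including the process ID in the directory name is an extra precaution I added for workers that share one temporary directory.

```r
# Hypothetical helper (not part of tarchetypes or rmarkdown): create a
# fresh directory for one report's intermediate files.
unique_intermediates_dir <- function(output_file, root = tempdir()) {
  stem <- tools::file_path_sans_ext(basename(output_file))
  # The process ID keeps workers apart even if they share a temp directory.
  dir <- tempfile(pattern = paste0(stem, "_", Sys.getpid(), "_"), tmpdir = root)
  dir.create(dir, recursive = TRUE)
  dir
}

# Hypothetical wrapper: render one report with its own intermediates_dir,
# so parallel workers should not clean up each other's .knit.md files.
render_isolated <- function(input, output_file, params = NULL) {
  rmarkdown::render(
    input = input,
    output_file = output_file,
    params = params,
    intermediates_dir = unique_intermediates_dir(output_file),
    quiet = TRUE
  )
}
```

A call might look like render_isolated("report.Rmd", "report_a.html", params = list(par = 1)), where the file names and the parameter are placeholders. Whether this alone avoids the pandoc error above is untested here.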