pandoc error with pdf when .Rmd is on network file share (NFS) #701

Open
opened this issue May 19, 2016 · 20 comments

GeorgeTomlinson commented May 19, 2016

I am on OSX El Capitan and have a network file share mounted using sshfs. When I run rmarkdown::render() (or use the menu item knit->PDF in RStudio) on a .Rmd file on my local drive, it works fine. When the .Rmd is located on the file share, it still generates a PDF, but it terminates with the message below. This in itself is usually not a problem, but there are two side effects:

(1) The PDF does not open in the internal viewer in RStudio. I have to open the PDF directly with another application.
(2) When I use the "keep_tex: true" option in the preamble, I don't get a .tex file.

One possible clue: path names to files and directories are not case-sensitive on my local drive (OSX) but are case-sensitive on the file share drive.

Error message:

pandoc: InterimReportMay18.pdf: hClose: invalid argument (Bad file descriptor)
Error: pandoc document conversion failed with error 1
Execution halted

bweigel commented Jul 8, 2016 • edited

I have a similar problem; however, my mount point is CIFS (running a Linux server and client):

/usr/bin/pandoc +RTS -K512m -RTS test.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash-implicit_figures --output test.html --smart --email-obfuscation none --self-contained --standalone --section-divs --template /home/mori/R/x86_64-pc-linux-gnu-library/3.2/rmarkdown/rmd/h/default.html --variable 'theme:bootstrap' --include-in-header /tmp/RtmpqFVGzL/rmarkdown-str12bf7b3a9f6c.html --mathjax --variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --no-highlight --variable highlightjs=/home/mori/R/x86_64-pc-linux-gnu-library/3.2/rmarkdown/rmd/h/highlight

aborts with

pandoc: test.html: hClose: hardware fault (Input/output error)
Error: pandoc document conversion failed with error 1
Execution halted

However, it runs from the terminal when I remove some of the optional parameters:

/usr/bin/pandoc +RTS -K512m -RTS test.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash-implicit_figures --output test.html --smart --email-obfuscation none --self-contained --standalone --section-divs --template /home/mori/R/x86_64-pc-linux-gnu-library/3.2/rmarkdown/rmd/h/default.html

No error, and the resulting HTML looks OK. It would be neat if anyone had an idea how to fix this.

stroobandt commented Aug 2, 2016 • edited

I ran into the same error using pandoc without RStudio on a remote Samba/CIFS drive:

.pdf: hClose: does not exist (Host is down)

Generating the PDF locally works. The error reminds me of a clock skew error. Using touch * did not resolve the problem though.

cderv commented Aug 2, 2016

By the way, this issue was also reported to pandoc at jgm/pandoc#1326. They do not seem to consider it a pandoc issue. I have the same error with HTML conversion from markdown on a network file share.

stroobandt commented Aug 2, 2016

 Yup, I am convinced this is not an RStudio problem. It only got reported first by people using RStudio. We need more testing with producing pandoc output on remote drives. Here, it seems to happen only when a large PDF output needs to be generated. Most of the time it does not work, but sometimes it does for the same file. Hence my intuition that timing might be involved.

kevinushey commented Aug 2, 2016 • edited

One way that rmarkdown could potentially handle this would be to have pandoc generate the resulting document in the same directory as the input document, and then copy that document back to the desired location. (We might already do that in some cases?) That may work more reliably than having pandoc attempt to write files to remote drives.

stroobandt commented Aug 2, 2016

In my case, this would not be a solution; both the input and output file are in the same remote folder. It would be better for pandoc to generate the output in some fail-safe local temporary directory (e.g. /tmp) and then copy the output file to the final remote destination directory on the network drive.

kevinushey commented Aug 2, 2016 • edited

 Maybe an R_PANDOC_OUTPUT_DIR environment variable to specify where pandoc should attempt to generate its outputs, and then we could copy those rendered outputs to the desired final destination?
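
To make the proposal concrete, here is a rough sketch of how such a variable might be looked up. This is purely illustrative: rmarkdown does not read R_PANDOC_OUTPUT_DIR, and the helper name pandoc_staging_dir() is made up for this example.

```r
# Hypothetical sketch: rmarkdown does NOT consult R_PANDOC_OUTPUT_DIR.
# The variable name comes from the proposal above; the function name is invented.
pandoc_staging_dir <- function(default = tempdir()) {
  dir <- Sys.getenv("R_PANDOC_OUTPUT_DIR", unset = NA)

  # Fall back to a local default when the variable is unset or empty.
  if (is.na(dir) || !nzchar(dir)) {
    return(default)
  }

  # Create the staging directory on demand so pandoc can write into it.
  if (!dir.exists(dir)) {
    dir.create(dir, recursive = TRUE)
  }
  dir
}
```

The returned directory would then be where pandoc writes its output before the rendered files are copied to the final destination.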

jjallaire commented Aug 2, 2016

 I realize it seems like this is an expedient solution to the problem at hand, but anything to do with output directories in rmarkdown gets dicey pretty quickly. That is because we've already got output_dir and intermediates_dir arguments (I'd suggest looking at those to see if they solve the problem) and any new options/features around directories have to deal with the intersection of states created by those options.

cderv commented Aug 3, 2016

Following your suggestion, I used intermediates_dir and output_dir to try to solve the issue with a standalone HTML document from Rmd. (I notice that if I do not want a standalone HTML file, there is no error.)

When I run rmarkdown::render with the default intermediates_dir and output_dir:

#> output file: Test_file.knit.md
/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS Test_file.utf8.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output Test_file.html --smart --email-obfuscation none --self-contained --standalone --section-divs --table-of-contents --toc-depth 3 --variable toc_float=1 --variable toc_selectors=h1,h2,h3 --variable toc_collapsed=1 --variable toc_smooth_scroll=1 --variable toc_print=1 --template /home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/default.html --variable 'theme:united' --include-in-header /tmp/RtmpU2y6ls/rmarkdown-str3f1b55553a92.html --mathjax --variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --highlight-style tango --variable navigationjs=/home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/navigation-1.1
#> pandoc: Test_file.html: hClose: does not exist (Host is down)
#> Error: pandoc document conversion failed with error 1

Test_file.html is created in the working directory, and so is the Test_file_files directory. However, pandoc does not seem to find the file and convert it to a self-contained HTML file.

I first set intermediates_dir to a local directory but left output_dir at its default, that is, the network drive where my Rmd file is located. All intermediate files (md and png images) are placed on the local drive, but the HTML output file and its associated folder (before conversion to a standalone HTML file) are located on the network drive. There is still an error with the pandoc conversion, as the files folder of the HTML is not found.

Example for Test_file.Rmd and intermediates_dir = "~/TempRMD":

#> output file: /home/ruser01/TempRMD/Test_file.knit.md
/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS /home/ruser01/TempRMD/Test_file.utf8.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output Test_file.html --smart --email-obfuscation none --self-contained --standalone --section-divs --table-of-contents --toc-depth 3 --variable toc_float=1 --variable toc_selectors=h1,h2,h3 --variable toc_collapsed=1 --variable toc_smooth_scroll=1 --variable toc_print=1 --template /home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/default.html --variable 'theme:united' --include-in-header /tmp/RtmpU2y6ls/rmarkdown-str3f1b7118b9a.html --mathjax --variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --highlight-style tango --variable navigationjs=/home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/navigation-1.1
#> pandoc: Could not fetch Test_file_files/figure-html/Fig_1-1.png
#> Test_file_files/figure-html/Fig_1-1.png: openBinaryFile: does not exist (No such file or directory)
#> Error: pandoc document conversion failed with error 67

The error is now different. Test_file.html and the Test_file_files directory are in my working directory (the default output_dir), but setting intermediates_dir seems to make pandoc search elsewhere.

Example for Test_file.Rmd with intermediates_dir = "~/TempRMD" and output_dir = "~/TempRMD/doc":

#> output file: /home/ruser01/TempRMD/Test_file.knit.md
/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS /home/ruser01/TempRMD/Test_file.utf8.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output /home/ruser01/TempRMD/doc/Test_file.html --smart --email-obfuscation none --self-contained --standalone --section-divs --table-of-contents --toc-depth 3 --variable toc_float=1 --variable toc_selectors=h1,h2,h3 --variable toc_collapsed=1 --variable toc_smooth_scroll=1 --variable toc_print=1 --template /home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/default.html --variable 'theme:united' --include-in-header /tmp/RtmpU2y6ls/rmarkdown-str3f1b1896b739.html --mathjax --variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --highlight-style tango --variable navigationjs=/home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/navigation-1.1
Output created: /home/ruser01/TempRMD/doc/Test_file.html

It works! However, everything is on my local drive and I have to move the output HTML file to my working directory on the network drive. Not perfect, but better than the annoying error.

These tests make me think that there is something odd with file paths between rmarkdown::render and pandoc, because I think that example 2, with just the intermediates_dir argument, should have worked. However, I do not know whether it is a rmarkdown::render issue or a pandoc issue. I may try some pandoc command lines to check that.

I did not test with PDF output. If someone is willing to check whether intermediates_dir and output_dir work there too, that would be helpful.
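
The working configuration above (both intermediates_dir and output_dir on a local disk), followed by the manual move to the network drive, could probably be wrapped up along these lines. This is only a minimal sketch: render_then_copy() and its final_dir, local_dir and render_fun arguments are hypothetical names rather than anything in rmarkdown, and it has not been tested with PDF output. The only real API it relies on is rmarkdown::render() with its output_dir and intermediates_dir arguments.

```r
# Hypothetical helper (not part of rmarkdown): render on a local disk, then
# copy the finished document to its final, possibly remote, directory.
render_then_copy <- function(input,
                             final_dir = dirname(input),
                             local_dir = tempfile("rmd-local-"),
                             render_fun = rmarkdown::render,
                             ...) {
  # Mirror the layout that worked above: intermediates in one local
  # directory, output in a "doc" subdirectory of it.
  out_dir <- file.path(local_dir, "doc")
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

  # pandoc only ever writes to the local disk.
  out <- render_fun(input,
                    output_dir = out_dir,
                    intermediates_dir = local_dir,
                    ...)

  # Only the finished document touches the share, via a plain file copy.
  dest <- file.path(final_dir, basename(out))
  if (!file.copy(out, dest, overwrite = TRUE)) {
    stop("could not copy ", out, " to ", dest)
  }
  invisible(dest)
}

# Example call (assumes rmarkdown and pandoc are installed):
# render_then_copy("Test_file.Rmd")
```

The render_fun argument exists only so the copy logic can be tried without pandoc; in normal use the default would be left alone.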

stroobandt commented Aug 3, 2016 • edited

 I am not (yet) an RStudio user. I only commented on this bug because I wanted to make clear that this is a more general pandoc issue, so I am not going to get into RStudio specifics. I resolved my issue by letting pandoc generate its PDF output in /tmp before moving it back to the remote folder that contains the input file and the makefile. More details on this solution can be found on the corresponding pandoc issue page.

raubreywhite commented Oct 27, 2016 • edited

This is my solution for Ubuntu (Windows will require different system calls):

RmdToDOCX <- function(inFile = "", outFile = "", tocDepth = 2, copyFrom = NULL) {
  # Optionally copy the input out of the copyFrom directory into the working directory
  if (!is.null(copyFrom)) {
    if (!stringr::str_detect(inFile, paste0("^", copyFrom, "/"))) {
      stop(paste0("inFile does not start with ", copyFrom, "/ and you are using copyFrom=", copyFrom))
    }
    file.copy(inFile, gsub(paste0("^", copyFrom, "/"), "", inFile), overwrite = TRUE)
    inFile <- gsub(paste0("^", copyFrom, "/"), "", inFile)
  }
  try({
    # Render into a local temporary directory rather than the network share
    outDir <- tempdir()
    originalOutFile <- outFile
    #if (RAWmisc::PandocInstalled()) {
    outFile <- unlist(stringr::str_split(outFile, "/"))
    if (length(outFile) == 1) {
      #outDir <- getwd()
    } else {
      #outDir <- file.path(getwd(), outFile[-length(outFile)])
      outFile <- outFile[length(outFile)]
    }
    css <- system.file("extdata", "custom.css", package = "RAWmisc")
    rmarkdown::render(input = inFile, output_file = outFile, output_dir = outDir,
                      output_format = rmarkdown::word_document(toc = TRUE, toc_depth = tocDepth))
    #}
    #else {
    #}
    # Replace any previous output, then copy the rendered file to its final location
    cmd <- paste0("rm -f ", file.path(getwd(), originalOutFile))
    system(cmd)
    print(cmd)
    cmd <- paste0("cp -f ", file.path(outDir, outFile), " ", file.path(getwd(), originalOutFile))
    system(cmd)
    print(cmd)
  }, TRUE)
  # Remove the temporary copy of the input
  if (!is.null(copyFrom)) {
    file.remove(inFile)
  }
}

RmdToDOCX(inFile = "RunWP2.Rmd", outFile = paste0("reports_formatted/WP2_", format(Sys.time(), "%Y_%m_%d"), ".docx"))

mentioned this issue Nov 3, 2016

harrismcgehee commented Feb 22, 2017

 @kevinushey Is there any chance you all are still looking at this? Would there be a way to use a temp directory / temp file and then copy to destination? I believe this issue or similar also affects Notebook files on CIFS drives. They show output in the editor, but not in the Viewer and an error message displays at the top: "Error creating notebook: pandoc document conversion failed with error 1"
added this to the v1.8 milestone Oct 17, 2017
modified the milestones: v1.8, v1.9 Nov 15, 2017

mwip commented Jan 21, 2018 • edited

I happened to encounter the same problem. I use Linux Mint (18.3) on both my Laptop as well as my Desktop. I store my .Rmd on a NAS which is mounted via CIFS on the Desktop and synchronized on the Laptop (some Cloud Service).
When knitting the .Rmd on the Laptop (i.e. on its hard drive) no problems occur.
However, when knitting on the Desktop (via CIFS), the size of the resulting PDF seems to influence whether the pandoc conversion succeeds. Strangely, when I add a few images to my Beamer presentation, the file will not compile anymore. The error is:

pandoc: 01_courseintro.pdf: hPutBuf: invalid argument (Bad file descriptor)
Error: pandoc document conversion failed with error 1
Execution halted


Yet, when I comment out some of the ![](somepic.png) lines at random, the compilation succeeds again.
Furthermore, a test on the local hard drive of my Desktop worked fine as well...

I am looking forward to seeing this one fixed in v1.9. Thanks in advance. Let me know if you need any more info.

UPDATE:

devtools::install_github("rstudio/rmarkdown") from #590 fixed it for me so far.

modified the milestones: v1.9, v1.10 Mar 4, 2018
modified the milestones: v1.10, v1.11 Jun 15, 2018
removed this from the v1.11 milestone Jun 25, 2018
mentioned this issue Jun 27, 2018
mentioned this issue Oct 2, 2018

fvanrenterghem commented Jan 31, 2019

 With the latest rmarkdown, I'm still experiencing this issue on Windows 10 with the Rmd on a network drive. Saving it locally and knitting works.

bac3917 commented Apr 10, 2019

I'm struggling with this issue now, using Windows 10, and the following sessionInfo():

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.1 rsconnect_0.8.13 htmltools_0.3.6 tools_3.5.1 yaml_2.2.0 Rcpp_1.0.1
[7] rmarkdown_1.12 knitr_1.22 xfun_0.6 digest_0.6.18 evaluate_0.13

 It looks like Pandoc actually found a fix for this issue using \\?\UNC\. But this comment jgm/pandoc#5127 (comment) shows that this user had to use forward slashes / instead of backslashes \ in the folder path. By default, the rmarkdown package seems to convert all folder paths to the R friendly \ before passing them to Pandoc.

 I've been trying to make a fix but am a bit stuck. If the RMD is very basic and doesn't have any figures or special options, then changing line 61 of pandoc.R (the pandoc_convert function) from args <- c(input) to args <- c(normalizePath(input)), and line 71 from args <- c(args, "--output", output) to args <- c(args, "--output", normalizePath(output)), fixes the NFS issue. However, this creates several more issues. Figures can no longer be found. Additionally, this hard-coded change causes the intermediates_dir option (render("testRMDknit.RMD",intermediates_dir = "C:\\temp")) to fail. So it is clear that I need to change the path in places other than where I am currently changing it. I'm not really familiar enough with this package. I've been trying to figure out where in the render() function these changes would need to be made.
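
For reference, base R's normalizePath() takes a winslash argument that controls the separator it emits on Windows, so the forward-slash form can be requested directly. A minimal sketch follows; pandoc_path() is a hypothetical helper name, not something in rmarkdown, and whether passing such paths to pandoc actually resolves the network drive error is an untested assumption.

```r
# Hypothetical helper, not rmarkdown's API: normalize a path and ask for
# forward slashes as the separator.
pandoc_path <- function(path) {
  # winslash only has an effect on Windows; mustWork = FALSE avoids an error
  # for output files that do not exist yet.
  normalizePath(path, winslash = "/", mustWork = FALSE)
}

# Example: pandoc_path("testRMDknit.RMD")
```

This does nothing about figure paths or intermediates_dir, which would still need to agree with whatever path form is passed to pandoc.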

 Trying to change this in the render() function, but it's complicated. I can get the function to run until I build it in the package. There is a lot going on here with these directories.
mentioned this issue Jul 25, 2019

rogerjbos commented Sep 25, 2019

 Does knitting files on a network still work if using an older version of Pandoc? Is downgrading pandoc the (short-term) answer?
mentioned this issue Oct 23, 2019