pandoc error with pdf when .Rmd is on network file share (NFS) #701

Open
GeorgeTomlinson opened this issue May 19, 2016 · 20 comments

Comments


@GeorgeTomlinson GeorgeTomlinson commented May 19, 2016

I am on OS X El Capitan and have a network file share mounted using sshfs.

When I run rmarkdown::render() (or use the menu item Knit -> PDF in RStudio) on a .Rmd file on my local drive, it works fine. When the .Rmd is located on the file share, it still generates a PDF, but it terminates with the message below. This in itself is usually not a problem, but there are two side effects:

(1) The PDF does not open in the internal viewer in RStudio. I have to open the PDF directly with another application.

(2) When I use the "keep_tex: true" option in the preamble, I don't get a .tex file

One possible clue: path names to files and directories are not case-sensitive on my local drive (OSX) but are case-sensitive on the file share drive.

Error message:

pandoc: InterimReportMay18.pdf: hClose: invalid argument (Bad file descriptor)
Error: pandoc document conversion failed with error 1
Execution halted


@bweigel bweigel commented Jul 8, 2016

I have a similar problem, but my mount point is CIFS (running a Linux server and client):

/usr/bin/pandoc +RTS -K512m -RTS test.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash-implicit_figures --output test.html --smart --email-obfuscation none --self-contained --standalone --section-divs --template /home/mori/R/x86_64-pc-linux-gnu-library/3.2/rmarkdown/rmd/h/default.html --variable 'theme:bootstrap' --include-in-header /tmp/RtmpqFVGzL/rmarkdown-str12bf7b3a9f6c.html --mathjax --variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --no-highlight --variable highlightjs=/home/mori/R/x86_64-pc-linux-gnu-library/3.2/rmarkdown/rmd/h/highlight
aborts with

pandoc: test.html: hClose: hardware fault (Input/output error)
Error: pandoc document conversion failed with error 1
Execution halted

However, I get it to run from the terminal, when I remove some of the optional parameters:

/usr/bin/pandoc +RTS -K512m -RTS test.md --to html --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash-implicit_figures --output test.html --smart --email-obfuscation none --self-contained --standalone --section-divs --template /home/mori/R/x86_64-pc-linux-gnu-library/3.2/rmarkdown/rmd/h/default.html

No error, resulting html looks ok.
It would be great if anyone had an idea how to fix this.


@stroobandt stroobandt commented Aug 2, 2016

I ran into the same error using pandoc without RStudio on a remote Samba/CIFS drive.
.pdf: hClose: does not exist (Host is down)
Generating the PDF locally works.
The error reminds me of a clock skew error.
Using touch * did not resolve the problem though.

Contributor

@cderv cderv commented Aug 2, 2016

By the way, this issue was also reported to pandoc at jgm/pandoc#1326. They do not seem to consider it a pandoc issue.

I have the same error with html conversion from markdown on a network file share.


@stroobandt stroobandt commented Aug 2, 2016

Yup, I am convinced this is not an RStudio problem. It only got reported first by people using RStudio. We need more testing with producing pandoc output on remote drives.
Here, it seems to happen only when a large PDF output needs to be generated. Most of the time it does not work, but sometimes it does for the same file. Hence my intuition that timing might be involved.

Contributor

@kevinushey kevinushey commented Aug 2, 2016

One way that rmarkdown could potentially handle this would be to have pandoc generate the resulting document in the same directory as the input document, and then copy that document back to the desired location. (We might already do that in some cases?)

That may work more reliably than having pandoc attempt to write files to remote drives.


@stroobandt stroobandt commented Aug 2, 2016

In my case, this would not be a solution; both the input and output files are in the same remote folder.
It would be better for pandoc to generate the output in some fail-safe local temporary directory (e.g. /tmp) and then copy the output file to the final destination directory on the network drive.
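
Editor's sketch: the write-locally-then-copy idea described above could look roughly like this in R. The helper name write_via_local_temp and its writer argument are hypothetical and not part of pandoc or rmarkdown; the sketch only assumes that the tool producing the file can be told where to write it.

```r
# Hypothetical helper (not part of rmarkdown or pandoc): let `writer` create the
# file on local disk, then copy the finished file to the network destination.
write_via_local_temp <- function(final_path, writer) {
  ext <- tools::file_ext(final_path)
  tmp <- tempfile(fileext = if (nzchar(ext)) paste0(".", ext) else "")
  on.exit(unlink(tmp), add = TRUE)

  writer(tmp)  # e.g. a call to pandoc with `tmp` as its output file
  if (!file.exists(tmp)) {
    stop("writer did not create ", tmp)
  }
  if (!file.copy(tmp, final_path, overwrite = TRUE)) {
    stop("could not copy ", tmp, " to ", final_path)
  }
  invisible(final_path)
}

# Usage (assumes pandoc is on the PATH; the file names are placeholders):
# write_via_local_temp("report.pdf", function(out) {
#   system2("pandoc", c("report.md", "-o", out))
# })
```

Only the finished file crosses the network mount, in a single copy, so pandoc itself never opens a file handle on the share for writing.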

Contributor

@kevinushey kevinushey commented Aug 2, 2016

Maybe an R_PANDOC_OUTPUT_DIR environment variable to specify where pandoc should attempt to generate its outputs, and then we could copy those rendered outputs to the desired final destination?
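
Editor's sketch: a wrapper script could honor such a variable along the following lines. R_PANDOC_OUTPUT_DIR is only the name proposed in this comment; rmarkdown does not read it, and the function pandoc_staging_dir is likewise hypothetical.

```r
# R_PANDOC_OUTPUT_DIR is the variable name proposed in this thread; it is an
# assumption for illustration and is not read by rmarkdown itself.
pandoc_staging_dir <- function(default = tempdir()) {
  dir <- Sys.getenv("R_PANDOC_OUTPUT_DIR", unset = "")
  if (!nzchar(dir)) {
    dir <- default  # fall back to the session temp directory on local disk
  }
  if (!dir.exists(dir)) {
    dir.create(dir, recursive = TRUE)
  }
  dir
}
```

The returned directory would then be used as the place where pandoc writes its output before the result is copied to the final destination.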

Member

@jjallaire jjallaire commented Aug 2, 2016

I realize it seems like this is an expedient solution to the problem at
hand, but anything to do with output directories in rmarkdown gets dicey
pretty quickly. That is because we've already got output_dir and
intermediates_dir arguments (I'd suggest looking at those to see if they
solve the problem) and any new options/features around directories have to
deal with the intersection of states created by those options.


Contributor

@cderv cderv commented Aug 3, 2016

Following your suggestion, I used intermediates_dir and output_dir to try to solve the issue with a standalone html document from Rmd. (I notice that if I do not want a standalone html, there is no error.)


When I run rmarkdown::render with default intermediates_dir and output_dir

#> output file: Test_file.knit.md

/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS Test_file.utf8.md --to html 
--from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash 
--output Test_file.html --smart --email-obfuscation none --self-contained --standalone 
--section-divs --table-of-contents --toc-depth 3 --variable toc_float=1 
--variable toc_selectors=h1,h2,h3 --variable toc_collapsed=1 --variable toc_smooth_scroll=1 
--variable toc_print=1 --template /home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/default.html --variable 'theme:united' 
--include-in-header /tmp/RtmpU2y6ls/rmarkdown-str3f1b55553a92.html --mathjax 
--variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --highlight-style tango 
--variable navigationjs=/home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/navigation-1.1 

#> pandoc: Test_file.html: hClose: does not exist (Host is down)
#> Erreur : pandoc document conversion failed with error 1

Test_file.html is created in the working directory, and so is the Test_file_files directory. However, pandoc does not seem to find the file and convert it to a self-contained html.


I first set intermediates_dir to a local directory but left output_dir at its default, that is, the network drive where my Rmd file is located.
All intermediate files (md and png images) are placed on the local drive, but the html output file and its associated folder (before conversion to a standalone html file) are located on the network drive.
There is still an error with the pandoc conversion, as the files folder of the html is not found.

Example for Test_file.Rmd and intermediates_dir = "~/TempRMD"

#> output file: /home/ruser01/TempRMD/Test_file.knit.md

/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS /home/ruser01/TempRMD/Test_file.utf8.md --to html 
--from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash 
--output Test_file.html --smart --email-obfuscation none 
--self-contained --standalone --section-divs --table-of-contents 
--toc-depth 3 --variable toc_float=1 --variable toc_selectors=h1,h2,h3 
--variable toc_collapsed=1 --variable toc_smooth_scroll=1 
--variable toc_print=1 
--template /home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/default.html --variable 'theme:united' 
--include-in-header /tmp/RtmpU2y6ls/rmarkdown-str3f1b7118b9a.html 
--mathjax --variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' 
--highlight-style tango --variable navigationjs=/home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/navigation-1.1 

#> pandoc: Could not fetch Test_file_files/figure-html/Fig_1-1.png
#> Test_file_files/figure-html/Fig_1-1.png: openBinaryFile: does not exist (No such file or directory)
#> Erreur : pandoc document conversion failed with error 67

The error is now different. Test_file.html and Test_file_files directory are in my working directory (default output_dir) but setting intermediates_dir seems to make pandoc search elsewhere.


Example for Test_file.Rmd and intermediates_dir = "~/TempRMD" and output_dir = "~/TempRMD/doc"

#> output file: /home/ruser01/TempRMD/Test_file.knit.md

/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS /home/ruser01/TempRMD/Test_file.utf8.md --to html 
--from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output /home/ruser01/TempRMD/doc/Test_file.html --smart 
--email-obfuscation none --self-contained --standalone --section-divs 
--table-of-contents --toc-depth 3 --variable toc_float=1 --variable toc_selectors=h1,h2,h3 
--variable toc_collapsed=1 --variable toc_smooth_scroll=1 
--variable toc_print=1 --template /home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/default.html --variable 'theme:united' 
--include-in-header /tmp/RtmpU2y6ls/rmarkdown-str3f1b1896b739.html --mathjax 
--variable 'mathjax-url:https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --highlight-style tango --variable navigationjs=/home/ruser01/R/x86_64-redhat-linux-gnu-library/3.1/rmarkdown/rmd/h/navigation-1.1 

Output created: /home/ruser01/TempRMD/doc/Test_file.html

It works! However, everything is on my local drive and I have to move the output html file to my working directory on the network drive. Not perfect, but better than the annoying error.
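
Editor's sketch: the working configuration above, plus the manual copy step, could be wrapped as follows. The function render_locally_then_copy and its render_fun argument are hypothetical; the sketch assumes that rmarkdown::render returns the path of the output file and that the output is a single self-contained file (a separate _files directory would need to be copied as well).

```r
# Hypothetical wrapper; `render_fun` defaults to rmarkdown::render but can be
# swapped for any function with the same arguments and return value.
render_locally_then_copy <- function(input, render_fun = rmarkdown::render, ...) {
  work_dir <- tempfile("rmd-render-")
  out_dir <- file.path(work_dir, "doc")
  dir.create(out_dir, recursive = TRUE)
  on.exit(unlink(work_dir, recursive = TRUE), add = TRUE)

  # Keep both intermediates and output on local disk while pandoc runs.
  rendered <- render_fun(input, intermediates_dir = work_dir,
                         output_dir = out_dir, ...)

  # Copy the single output file next to the input on the network share.
  final <- file.path(dirname(input), basename(rendered))
  if (!file.copy(rendered, final, overwrite = TRUE)) {
    stop("could not copy ", rendered, " to ", final)
  }
  invisible(final)
}

# Usage (requires rmarkdown and pandoc; the file name is a placeholder):
# render_locally_then_copy("Test_file.Rmd")
```

Passing the renderer as an argument is a design choice made here so that the copy logic can be exercised without pandoc installed.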


These tests make me think that there is something odd with file paths between rmarkdown::render and pandoc, because I think that example 2, with just the intermediates_dir argument, should have worked. However, I do not know whether it is an rmarkdown::render issue or a pandoc issue.
I may try some pandoc command lines to check that.

I did not test with pdf output. If someone is willing to check whether intermediates_dir and output_dir work there too, that would be helpful.


@stroobandt stroobandt commented Aug 3, 2016

I am not (yet) an RStudio user.
I only commented on this bug because I wanted to make clear that this is a more general pandoc issue.
So I am not going to get into RStudio specifics.

I resolved my issue by letting pandoc generate its PDF output in /tmp, before moving it back to the remote folder which contains the input file and the makefile.

More details on this solution can be found over at the pertaining pandoc issue page.


@stroobandt stroobandt commented Aug 3, 2016


@raubreywhite raubreywhite commented Oct 27, 2016

This is my solution for Ubuntu (Windows will require different system calls):


RmdToDOCX <- function(inFile = "", outFile = "", tocDepth = 2, copyFrom = NULL) {
  # Optionally copy the input out of `copyFrom` into the working directory.
  if (!is.null(copyFrom)) {
    prefix <- paste0(copyFrom, "/")
    if (!startsWith(inFile, prefix)) {
      stop(paste0("inFile does not start with ", copyFrom,
                  "/ and you are using copyFrom=", copyFrom))
    }
    localCopy <- substring(inFile, nchar(prefix) + 1)
    file.copy(inFile, localCopy, overwrite = TRUE)
    inFile <- localCopy
  }
  try({
    # Render into the session temp directory (local disk), not the share.
    outDir <- tempdir()
    originalOutFile <- outFile
    outFile <- basename(outFile)
    rmarkdown::render(input = inFile, output_file = outFile,
                      output_dir = outDir,
                      output_format = rmarkdown::word_document(toc = TRUE,
                                                               toc_depth = tocDepth))
    # Remove any previous output, then copy the fresh file to its destination.
    target <- file.path(getwd(), originalOutFile)
    cmd <- paste0("rm -f ", shQuote(target))
    system(cmd)
    print(cmd)
    cmd <- paste0("cp -f ", shQuote(file.path(outDir, outFile)), " ", shQuote(target))
    system(cmd)
    print(cmd)
  }, TRUE)
  # Clean up the temporary local copy of the input.
  if (!is.null(copyFrom)) {
    file.remove(inFile)
  }
}


RmdToDOCX(
  inFile = "RunWP2.Rmd",outFile = paste0("reports_formatted/WP2_",format(Sys.time(), "%Y_%m_%d"),".docx"))
@harrismcgehee

Contributor

@harrismcgehee harrismcgehee commented Feb 22, 2017

@kevinushey Is there any chance you all are still looking at this? Would there be a way to use a temp directory / temp file and then copy to destination?

I believe this issue or similar also affects Notebook files on CIFS drives. They show output in the editor, but not in the Viewer and an error message displays at the top: "Error creating notebook: pandoc document conversion failed with error 1"

@yihui yihui added this to the v1.8 milestone Oct 17, 2017
@yihui yihui modified the milestones: v1.8, v1.9 Nov 15, 2017

@mwip mwip commented Jan 21, 2018

I encountered the same problem. I use Linux Mint (18.3) on both my laptop and my desktop. I store my .Rmd on a NAS which is mounted via CIFS on the desktop and synchronized to the laptop (via a cloud service).
When knitting the .Rmd on the laptop (i.e. on its hard drive), no problems occur.
However, when knitting on the desktop (via CIFS), the size of the resulting PDF seems to influence whether the pandoc conversion succeeds. Strangely, when I add a few images to my Beamer presentation, the file no longer compiles. The error is:

pandoc: 01_courseintro.pdf: hPutBuf: invalid argument (Bad file descriptor)
Error: pandoc document conversion failed with error 1
Execution halted

Yet, when I comment out some of the ![](somepic.png) lines at random, the compilation succeeds again.
Furthermore, a test on the local hard drive of my desktop worked fine as well.

I am looking forward to seeing this one fixed in v1.9. Thanks in advance. Let me know if you need any more info.

UPDATE:

devtools::install_github("rstudio/rmarkdown") from #590 fixed it for me so far.


@fvanrenterghem fvanrenterghem commented Jan 31, 2019

With the latest rmarkdown, I'm still experiencing this issue on Windows 10 with the Rmd on a network drive. Saving it locally and knitting works.


@bac3917 bac3917 commented Apr 10, 2019

I'm struggling with this issue now, using Windows 10, and the following sessionInfo():

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.1 rsconnect_0.8.13 htmltools_0.3.6 tools_3.5.1 yaml_2.2.0 Rcpp_1.0.1
[7] rmarkdown_1.12 knitr_1.22 xfun_0.6 digest_0.6.18 evaluate_0.13


@adam-sampson adam-sampson commented Jun 3, 2019

It looks like Pandoc actually found a fix for this issue using \\?\UNC\. But this comment jgm/pandoc#5127 (comment) shows that this user had to use forward slashes / instead of backslashes \ in the folder path. By default, the rmarkdown package seems to convert all folder paths to the R friendly \ before passing them to Pandoc.
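
Editor's sketch: if forward slashes are indeed what pandoc needs, base R's normalizePath() has a winslash argument that controls the separator on Windows. The helper name to_forward_slash_path is hypothetical, and whether paths produced this way resolve the error on a network share has not been verified in this thread.

```r
# Hypothetical helper: absolute path with forward slashes on every platform.
# `winslash` only has an effect on Windows; elsewhere paths already use "/".
to_forward_slash_path <- function(path) {
  normalizePath(path, winslash = "/", mustWork = FALSE)
}

# Usage: normalize a directory that is known to exist.
local_dir <- to_forward_slash_path(tempdir())
```

Setting mustWork = FALSE avoids an error or warning for an output file that does not exist yet; in that case the path may be returned without being fully resolved.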


@adam-sampson adam-sampson commented Jun 25, 2019

I've been trying to make a fix but am a bit stuck.

If the RMD is very basic and doesn't have any figures or special options, then the following two changes to pandoc.R (the pandoc_convert function) fix the NFS issue:

  • line 61: from args <- c(input) to args <- c(normalizePath(input))
  • line 71: from args <- c(args, "--output", output) to args <- c(args, "--output", normalizePath(output))

However, this creates several more issues. Figures can no longer be found. Additionally, this hard-coded change causes the intermediates_dir option (render("testRMDknit.RMD",intermediates_dir = "C:\\temp")) to fail. So it is clear that the path needs to be changed in places other than where I am changing it. I'm not familiar enough with this package, and I've been trying to figure out where in the render() function these changes would need to be made.


@adam-sampson adam-sampson commented Jun 25, 2019

Trying to change this in the render() function, but it's complicated. I can get the function to run until I build it in the package. There is a lot going on here with these directories.


@rogerjbos rogerjbos commented Sep 25, 2019

Does knitting files on a network still work if using an older version of Pandoc? Is downgrading pandoc the (short-term) answer?
