{{ message }}

# Add Pandoc support, refactor external helpers#4060

Merged
merged 1 commit into from Nov 30, 2017
Merged

# Add Pandoc support, refactor external helpers#4060

merged 1 commit into from Nov 30, 2017

## Conversation

### betaveros commented Nov 7, 2017 • edited

 Recognize the Pandoc format under the file extension .pandoc or .pdc, and shell out to pandoc as an external helper to format Pandoc content. Add a configuration option, externalHelperArguments, that can be used to override the additional arguments passed to external helpers. Refactor out repeated code with external helpers. Change the error output formatting. I did not see any of the external helpers print the string "" to represent stdin as a file; just prepending the file name to error output is more general and doesn't sacrifice that much in terms of readability. The text was updated successfully, but these errors were encountered:

### bep commented Nov 7, 2017

 A quick question: Is it a remote possibility to do "shell injection" via os/exec so I could configure it to do pandoc --strip-comments && rm -rf / In the above I'm mostly worried about the people on the forum etc. that asks me questions of the type "can you clone my site and see what the problem is?" ... If I would start worrying about some malware type of situations, I would stop doing that. Note that I would expect the golang team to protect against this. But I would like to be really sure.

### betaveros commented Nov 7, 2017

 I don't think shell injection can be achieved in exec.Command, but the broader point is well-taken; if you're worried about malicious config files we shouldn't allow the user to pass arbitrary arguments to external helpers, since these helpers might not be designed to be safe under those circumstances. For example, it looks like pandoc takes an option --filter=PROGRAM that will call an arbitrary executable (perhaps, a malicious shell script elsewhere in the repository), so an attacker with full control over pandoc's arguments could do a lot of damage even without shell injection. It's unfortunate though, most of the allure of using pandoc is getting access to all of its options and I don't think there is a universal set that will make everybody who wants to use pandoc happy (I for one haven't even decided which of the math output formats I want for my static site yet). I could whitelist some pandoc options, but it would either be woefully incomplete or extremely bloated for an external helper that I understand probably isn't a central use case Hugo wants to support. Still, I guess an incomplete whitelist is better than nothing. (If you're worried about malicious config files in general, I'll also note in passing that it looks like attackers can also specify arbitrary executables in the configuration option newContentEditor, so they could also cause a malicious script to be invoked on hugo new.)

### bep commented Nov 7, 2017

 If you're worried about malicious config files in general I will make a note about the newContentEditor, but that command is at least not part of the main Hugo build.

mentioned this pull request Nov 7, 2017

### bep commented Nov 7, 2017

 I will think about this a little, about the cost vs value vs added support etc. This isn't "core Hugo", so any new additions in this department should come cheaply, and this currently does not look like it.

### betaveros commented Nov 7, 2017

 I dropped the options for now to keep things simple. If there's demand for external options we (or somebody else) can revisit this later.

 Add Pandoc support, refactor external helpers 
 a3f0dc6 
Recognize the Pandoc format under the file extension .pandoc or .pdc,
and shell out to pandoc as an external helper to format Pandoc content.

Refactor out repeated code with external helpers. Change the error
output formatting. I did not see any of the external helpers print the
string "<input>" to represent stdin as a file; just prepending the file
name to error output is more general and doesn't sacrifice that much in
terms of readability.

### betaveros commented Nov 8, 2017

 Actually I'm going to go out on a limb and make --mathjax the single option to pandoc, since the default math option is rather lame, MathJax is somewhat officially recommended in the Hugo documentation, and the output markup looks fine for rendering with other LaTeX libraries if people want to customize that.

### betaveros commented Nov 13, 2017

 Any updates? The PR as it currently stands should be fairly cheap; it doesn't use pandoc any differently from the way it previously used asciidoc/asciidoctor/rst2html.

### betaveros commented Nov 22, 2017

 @bep any updates on the status of this PR, or anything else you'd like to be changed?

### bep commented Nov 22, 2017

 It looks very good, I will merge it in "a couple of days"; I will probably have to get a 0.31.1 patch release out the window, and try to keep the master branch as quiet as possible right after a release.

mentioned this pull request Nov 22, 2017

### shawnohare commented Nov 30, 2017

 I stumbled upon this PR because I was interested in adding the feature as well. Thanks for putting in the effort to do this! I wonder if the current references to "pandoc" (especially in the documentation) should be changed to more explicitly reference "pandoc markdown", since pandoc itself is not a format, per se.

merged commit e69da7a into gohugoio:master Nov 30, 2017
3 checks passed
mentioned this pull request Dec 24, 2017

### adityam commented Jan 4, 2018 • edited

 Is it possible to reconsider allowing options to pandoc (either via config.toml or via the options in each .pdc file)? In particular, I am thinking of the possibility of adding bibliography files and citation style. This will make it much easier to create course notes using hugo. If you are reluctant to add arbitrary flags, one option is to simply add --filter pandoc-citeproc flag to pandoc.

### rdwatters commented Jan 4, 2018 • edited

 @betaveros I’m curious if you see utility in pandoc’s MD-to-Word functionality (what I admittedly have to use Pandoc for most at work) for Hugo sites. I could see some very cool COPE workflows for users who need to create PDF versions of content as well. I’ve been woefully absent in the Hugo world for the past few months. Thanks for your work on this. I might be an edge case, but I’m really excited about this feature!

mentioned this pull request Jan 23, 2018
mentioned this pull request Jan 23, 2018

### dreamcat4 commented Jan 23, 2018 • edited

 Documentation here: https://gohugo.io/content-management/formats/#additional-formats-through-external-helpers This is a refactor to inlude panadoc support, still with default arguments - Hugo version 0.32 and higher.

### dreamcat4 commented Jan 23, 2018 • edited

 Ah, it looks like you dropped this feature from the PR: Add a configuration option, externalHelperArguments, that can be used to override the additional arguments passed to external helpers. Well I was actually very much looking forward to seeing that, for passing custom asciidoc default arguments. This is a valuable improvement (for asciidoc users). I dropped the options for now to keep things simple. If there's demand for external options we (or somebody else) can revisit this later. Yes I need this / would like to see this. Is it possible to reconsider allowing options to pandoc (either via config.toml or via the options in each .pdc file)? In particular, I am thinking of the possibility of adding bibliography files and citation style. This will make it much easier to create course notes using hugo. This is a similar / same reasons why asciidoctor users also need this externalHelperArguments override string in config.toml global configuration. @betaveros do you still want to make a new PR for the externalHelperArguments feature? As a seperate PR ? @bep Can we have further discussions about this feature? Do you see the value in it?

### valouille commented Feb 16, 2018 • edited

 Hello, For my usecase, I certainly see value in this feature, because I would like to be able to pass options to asciidoctor like -r asciidoctor-diagram for the generation of plantUML diagrams. Currently, I'm getting this error : ERROR 2018/02/16 10:46:15 post/2018-01-012-doc.adoc: asciidoctor: WARNING: : line 20: invalid style for literal block: plantuml

### bilke commented Feb 22, 2018

 @bep Would you allow the argument --filter pandoc-citeproc? One had to check if the executable pandoc-citeproc is found in the path and if yes this argument can be added. I would make a PR if you agree. cc: @adityam

mentioned this pull request May 30, 2018
mentioned this pull request Jun 7, 2018

### raghur commented Aug 6, 2018

 I've run into this as well - need to customize args passed to asciidoctor - attributes, plugins etc and right now there's no way to do it short of putting everything in each asciidoc file Also, is something like PR 4310 planned? It doesn't say why it was closed (just points to this issue)

### kirubakaran commented Nov 11, 2018

 @shawnohare Re your #4060 (comment) I use org-mode. So I pass --from=org flag myself, as a temporary solution, as I've described here: https://kirubakaran.com/blog/org-pandoc-hugo/#update I think that inferring from format from the filename would be a good solution. Example: foo.org.pandoc to send --from=org when calling Pandoc. Thoughts?

mentioned this pull request Mar 11, 2019

### ankitp94 commented May 30, 2019

 Is it possible to reconsider allowing options to pandoc (either via config.toml or via the options in each .pdc file)? In particular, I am thinking of the possibility of adding bibliography files and citation style. This will make it much easier to create course notes using hugo. If you are reluctant to add arbitrary flags, one option is to simply add --filter pandoc-citeproc flag to pandoc. This would be certainly helpful!

### RMStoica-zivver commented May 11, 2020

 @rikhuijzer it would be nice if we didn't have to resort to bash trickery though. it's an answer, not a solution

### rikhuijzer commented May 11, 2020 • edited

 @RMStoica-zivver Why do you aim this critique to me exactly? Actually, I get that this is not a priority. I assume that a typical user would like to build the Hugo website via CI. The default Hugo Docker image does not include pandoc [1]. So you will always have to create a build image yourself with Pandoc and Hugo. Switching to the Netlify build image [2] is okay but you don't have any Pandoc extensions there, such as pandoc-citeproc [2]. For a typical user I would say lets not use Pandoc and keep the build simple. An atypical user can copy the "bash trickery" listed above. About the "bash trickery". It is fairly straightforward, and I think that Hugo is quite properly set up that it even allows for such "trickery". They could have hard-coded Pandoc.

### RMStoica-zivver commented May 11, 2020

 @rikhuijzer it was not a critique of course, just a statement of fact - that the answer by @adytiam is a workaround, not a solution to the problem of hugo not passing arguments to pandoc or any other external renderer. As for the discussion on typical or atypical users, I would refrain, since I do not know how many use pandoc/asciidoctor or would like to, but perhaps you know more.

mentioned this pull request Aug 12, 2020