pandoc-tar is a simple command line tool to help with batch
conversions of documents using Pandoc. It reads a tar archive of
documents from standard input, converts them to the desired format,
and writes a tar archive of the converted documents to standard
output.
Running the pandoc command separately for each document can incur
several seconds' worth of startup overhead for a batch with tens of
documents. This is too much for interactive use.
Running pandoc *.md, i.e. giving many source filenames on the
command line, requires the source documents to be disk files, and the
destination documents will be concatenated into one file, making it
hard to split them apart later.
A separate tool called pandoc-server runs a web server which
provides a /convert-batch endpoint. This is fast and
avoids temp files, but even localhost-only web servers come with
security implications and other complexities.
Tar is a simple, ubiquitous format that is easy to read and write from countless programming languages. Tar files are well suited for pipes.
JSON would also work well over a pipe, but it is less standard than tar, and it is less clear how JSON should handle mixed character encodings and arbitrary binary data.
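As a rough illustration of the point about encodings and binary data, the sketch below packs three byte strings into a tar stream and reads them back unchanged, using only Python's standard `tarfile` module. The helper names `pack` and `unpack` and the file names are made up for this example; they are not part of pandoc-tar.

```python
import io
import tarfile


def pack(docs):
    """Pack a {name: bytes} mapping into an uncompressed tar archive."""
    buf = io.BytesIO()
    # "w|" writes the archive as a forward-only stream, which is what a pipe needs.
    with tarfile.open(fileobj=buf, mode="w|") as tar:
        for name, data in docs.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack(archive):
    """Read a tar archive back into a {name: bytes} mapping."""
    docs = {}
    # "r|" reads sequentially without seeking, so it would also work on a pipe.
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r|") as tar:
        for member in tar:
            if member.isfile():
                docs[member.name] = tar.extractfile(member).read()
    return docs


# Hypothetical batch: the same text in two encodings, plus raw binary data.
docs = {
    "utf8.md": "# Café\n".encode("utf-8"),
    "latin1.md": "# Café\n".encode("latin-1"),
    "blob.bin": bytes(range(256)),
}

# Tar stores each member as opaque bytes, so nothing is re-encoded on the way.
assert unpack(pack(docs)) == docs
```

Because each member is just a length-prefixed run of bytes, no escaping or base64 step is needed, which is the step a JSON envelope would have to specify.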
You will need stack or cabal. With stack:
% stack install
A pandoc-tar executable will be put in ~/.local/bin.
produce-markdown-tar | pandoc-tar --from markdown --to json | consume-json-tar
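The same pipeline can be driven from a program instead of a shell. The following is a minimal sketch in Python, assuming `pandoc-tar` is on the `PATH` and accepts a plain uncompressed tar archive on standard input. The helper names (`pack`, `unpack`, `pandoc_tar_argv`, `convert_batch`) are invented for this example, and the names of the members in the output archive are whatever pandoc-tar chooses; the sketch does not assume they match the input names.

```python
import io
import subprocess
import tarfile


def pack(docs):
    """Pack a {name: bytes} mapping into an uncompressed tar archive."""
    buf = io.BytesIO()
    # USTAR is chosen here as a conservative format; this is an assumption,
    # not a documented requirement of pandoc-tar.
    with tarfile.open(fileobj=buf, mode="w|", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in docs.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack(archive):
    """Read a tar archive back into a {name: bytes} mapping."""
    docs = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r|") as tar:
        for member in tar:
            if member.isfile():
                docs[member.name] = tar.extractfile(member).read()
    return docs


def pandoc_tar_argv(source_format, target_format, command="pandoc-tar"):
    """Build the command line using the --from and --to options shown above."""
    return [command, "--from", source_format, "--to", target_format]


def convert_batch(docs, source_format, target_format):
    """Send all documents through one pandoc-tar process and collect the results."""
    result = subprocess.run(
        pandoc_tar_argv(source_format, target_format),
        input=pack(docs),        # tar archive goes to standard input
        stdout=subprocess.PIPE,  # converted tar archive comes back on standard output
        check=True,              # raise CalledProcessError on a non-zero exit status
    )
    return unpack(result.stdout)
```

A call such as `convert_batch({"a.md": b"# Hello\n"}, "markdown", "json")` would then start a single process for the whole batch and involve no temporary files.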
pandoc-tar [--version] [-v|--verbose] [-f|--from FORMAT] (-t|--to FORMAT)
           [-w|--wrap WRAPOPT] [-c|--columns INT] [-s|--standalone]
           [-m|--template TEMPLATE]

Available options:
  -h,--help                Show this help text
  --version                Show version.
  -v,--verbose             Write details to standard output
  -f,--from FORMAT         Force input markup format
  -t,--to FORMAT           Output markup format
  -w,--wrap WRAPOPT        Text-wrapping style for output. (default: WrapAuto)
  -c,--columns INT         Width of output in columns. (default: 72)
  -s,--standalone          Produce stand-alone output documents.
  -m,--template TEMPLATE   Pandoc template to use.

pandoc-tar is a straightforward adaptation of pandoc-server to use
stdio and tar instead of HTTP and JSON. pandoc-server is written by
Pandoc's author, John MacFarlane, who graciously listened to but
ultimately declined my feature request to support tar in pandoc
itself.