Pandoc Integration

Brett Terpstra edited this page Jan 14, 2026 · 3 revisions

Integrating with Pandoc

Apex focuses on producing high‑quality HTML. For richer publishing targets like DOCX, PDF, RTF, OPML, ODT, and more, you can combine Apex with Pandoc by piping Apex’s HTML output into Pandoc.

The general pattern is:

apex input.md --standalone | pandoc -f html -t FORMAT -o output.FILE
  • --standalone tells Apex to generate a full HTML5 document (<html>, <head>, <body>).
  • -f html tells Pandoc to read HTML.
  • -t FORMAT and -o output.FILE select Pandoc’s output format and destination file.

Pandoc must be installed separately (brew install pandoc on macOS, or your platform’s package manager).
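Because the pipeline depends on two separately installed tools, a small preflight check can save a confusing error halfway through. The following is a minimal sketch; `require_cmd` is a hypothetical helper name, not something Apex or Pandoc provides.

```shell
# require_cmd (hypothetical helper): succeed only if every named command
# can be found on PATH; report the first missing one on stderr.
require_cmd() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing required command: $cmd" >&2
      return 1
    fi
  done
}

# Usage: run the conversion only when both tools are present.
# require_cmd apex pandoc &&
#   apex input.md --standalone | pandoc -f html -t docx -o output.docx
```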


Converting to DOCX

Use Pandoc’s DOCX writer to produce Word documents from Apex HTML output:

apex input.md --standalone \
  | pandoc -f html -t docx -o output.docx

You can also write the HTML to a file first:

apex input.md --standalone -o output.html
pandoc -f html -t docx output.html -o output.docx

To set the document title from Apex:

apex input.md --standalone --title "My Document" \
  | pandoc -f html -t docx -o output.docx

Pandoc will use the <title> element from the HTML when available.


Converting to PDF

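You can generate PDFs by letting Pandoc drive a PDF engine. Pandoc defaults to pdflatex, which requires a LaTeX installation; depending on your setup, you can select a different engine with --pdf-engine, including HTML‑to‑PDF engines such as wkhtmltopdf or weasyprint: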

apex input.md --standalone \
  | pandoc -f html -t pdf -o output.pdf

With an explicit title:

apex input.md --standalone --title "My PDF Document" \
  | pandoc -f html -t pdf -o output.pdf

If you prefer an intermediate HTML file:

apex input.md --standalone -o output.html
pandoc -f html -t pdf output.html -o output.pdf
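Which PDF engine is installed varies from machine to machine, so it can help to pick one explicitly rather than rely on Pandoc's default. The sketch below assumes a candidate list of engines that Pandoc's --pdf-engine option accepts; `first_available` is a hypothetical helper, and the order of candidates is only an example.

```shell
# first_available (hypothetical helper): print the first argument that
# exists on PATH, or return non-zero when none of them do.
first_available() {
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
      return 0
    fi
  done
  return 1
}

# Usage (candidate list is an assumption; adjust to your installation).
# With no -t given, Pandoc infers PDF output from the .pdf extension.
# engine=$(first_available weasyprint wkhtmltopdf xelatex pdflatex) &&
#   apex input.md --standalone \
#     | pandoc -f html --pdf-engine="$engine" -o output.pdf
```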

Including a CSS Stylesheet

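Apex can link or embed a CSS stylesheet in the generated HTML. Keep in mind that Pandoc's HTML reader parses the document into its own internal representation and generally does not carry stylesheets from the input through to the output, so this styling mainly affects the HTML that Apex produces. To style a PDF rendered by an HTML‑based engine, pass the stylesheet to Pandoc as well with its --css option; DOCX and ODT styling comes from a reference document supplied with --reference-doc.

Linking an external stylesheet

apex input.md --standalone --style styles.css \
  | pandoc -f html -t pdf -o output.pdf

This produces HTML with:

<link rel="stylesheet" href="styles.css">

The link applies when the HTML itself is viewed or published. For HTML‑based PDF output, also give Pandoc the stylesheet (--css styles.css) so the engine applies the same rules.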

Embedding CSS into the HTML

To avoid external file dependencies, embed CSS directly:

apex input.md --standalone --style styles.css --embed-css \
  | pandoc -f html -t pdf -o output.pdf

This inlines styles.css into a <style> block in the HTML <head>, which is convenient for sharing a single self‑contained document pipeline.
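If you write the HTML to a file first, you can check that embedding worked before sharing it. This is a rough sketch: `is_self_contained` is a hypothetical helper, and it assumes the linked form looks like the `<link rel="stylesheet" ...>` tag shown above.

```shell
# is_self_contained (hypothetical helper): succeed when the HTML file has
# an inline <style> block and no <link rel="stylesheet"> reference.
is_self_contained() {
  grep -qi '<style' "$1" && ! grep -qi 'rel="stylesheet"' "$1"
}

# Usage:
# apex input.md --standalone --style styles.css --embed-css -o output.html
# is_self_contained output.html && echo "CSS is embedded"
```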


Using Metadata and Title

Apex supports metadata via YAML front matter, external metadata files, and --meta CLI flags. This metadata can influence both Apex output and what Pandoc sees.

YAML front matter

In your Markdown:

---
title: My Research Paper
author: Jane Example
date: 2024-01-01
---

Then:

apex input.md --standalone \
  | pandoc -f html -t docx -o output.docx

Apex will turn this into appropriate HTML metadata (including <title>), which Pandoc can map into the target format’s title and author fields.

External metadata file

You can load shared metadata for multiple documents:

# meta.yml
title: Shared Title
author: Team Apex

Then:

apex input.md --standalone --meta-file meta.yml \
  | pandoc -f html -t pdf -o output.pdf
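Since the point of a shared metadata file is to reuse it across documents, a loop is the natural next step. The sketch below uses only the flags shown above; `out_name` and `convert_with_shared_meta` are hypothetical helper names.

```shell
# out_name (hypothetical helper): swap a file's extension for a new one,
# e.g. docs/report.md + pdf -> docs/report.pdf.
# Assumes the file name has an extension.
out_name() {
  printf '%s.%s\n' "${1%.*}" "$2"
}

# convert_with_shared_meta (hypothetical helper): apply one metadata file
# to every Markdown file given, writing a PDF next to each source file.
convert_with_shared_meta() {
  meta=$1
  shift
  for src in "$@"; do
    apex "$src" --standalone --meta-file "$meta" \
      | pandoc -f html -t pdf -o "$(out_name "$src" pdf)" || return 1
  done
}

# Usage:
# convert_with_shared_meta meta.yml chapter1.md chapter2.md
```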

Inline CLI metadata

You can also set or override metadata from the command line. This example passes Pandoc's own --metadata flags on the Pandoc side of the pipe:

apex input.md --standalone \
  | pandoc -f html -t docx -o output.docx \
      --metadata=author="Your Name" \
      --metadata=subject="Project Report"

This lets you mix Apex’s metadata (front matter, --meta-file) with Pandoc’s own metadata flags.


Other Useful Formats

The same pattern works for many other Pandoc targets:

  • RTF

    apex input.md --standalone \
      | pandoc -f html -t rtf -o output.rtf
  • OPML (outline)

    apex input.md --standalone \
      | pandoc -f html -t opml -o output.opml
  • ODT (LibreOffice/OpenOffice)

    apex input.md --standalone \
      | pandoc -f html -t odt -o output.odt
  • EPUB

    apex input.md --standalone \
      | pandoc -f html -t epub -o output.epub
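When you need several of these formats at once, you can render the HTML a single time and reuse it. This is a minimal sketch: `export_formats` is a hypothetical helper, and it relies on the format name doubling as the file extension, which holds for rtf, opml, odt, epub, and docx but not for every Pandoc writer.

```shell
# export_formats (hypothetical helper): run Apex once, then hand the same
# HTML file to Pandoc for each requested format.
export_formats() {
  src=$1
  shift
  html="${src%.*}.html"
  apex "$src" --standalone -o "$html" || return 1
  for fmt in "$@"; do
    pandoc -f html -t "$fmt" "$html" -o "${src%.*}.$fmt" || return 1
  done
}

# Usage:
# export_formats input.md rtf opml odt epub
```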

Preprocessing with md-fixup

For a complete pipeline, you can preprocess your Markdown with md-fixup before passing it to Apex:

md-fixup input.md | apex --standalone | pandoc -f html -t docx -o output.docx

md-fixup can fix common Markdown issues and apply search-and-replace transformations before Apex processes the document. See Usage for more details.
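One caveat with a three-stage pipe: by default a shell pipeline reports only the exit status of its last command, so a failure in md-fixup or Apex can go unnoticed. The sketch below runs the stages one at a time through temporary files instead; `staged_docx` is a hypothetical helper, and it assumes md-fixup writes its result to standard output, as the pipeline above implies.

```shell
# staged_docx (hypothetical helper): run md-fixup, Apex, and Pandoc as
# separate steps and stop at the first one that fails.
staged_docx() {
  src=$1
  out=$2
  work=$(mktemp -d) || return 1
  md-fixup "$src" > "$work/fixed.md" &&
    apex "$work/fixed.md" --standalone -o "$work/doc.html" &&
    pandoc -f html -t docx "$work/doc.html" -o "$out"
  status=$?
  rm -rf "$work"   # clean up intermediates whether or not a stage failed
  return "$status"
}

# Usage:
# staged_docx input.md output.docx || echo "conversion failed" >&2
```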


By combining Apex and Pandoc, you can keep Apex focused on Markdown parsing and HTML generation while Pandoc handles the heavy lifting for complex document formats.
