Releases: jgm/pandoc
pandoc 3.1.3
-
New input format: `typst`.
-
New module: Text.Pandoc.Readers.Typst [API change].
-
DocBook reader:
- Support more emphasis roles (Albert Krewinkel). The role “bf” is taken to indicate “bold face”, i.e., “strongly emphasized” text, while “underline” leads to underlined text.
-
JATS reader:
-
Org reader (Albert Krewinkel):
- Require abstract environment to use lowercase.
- Treat `#+NAME` as synonym for `#+LABEL` (#8578).
-
ODT reader:
-
RST reader:
- Fix sorting on anonymous keys (#8877). This fixes a link resolution bug affecting RST documents with anonymous links.
-
HTML reader:
- Fix iframe with data URI of an image (#8856). In this case we don’t want to try to parse the data at the URL. Instead, create an image inside a div.
-
RTF reader:
- Fix bug in table parsing (#8767). In certain cases, text before a table was being incorporated into the table itself.
-
Docx reader:
- Introduce support for Intense Quote (Stephan Meijer).
-
Markdown reader:
- Disallow escaping of `~` and `"` in `markdown_strict` (#8777, Albert Krewinkel). This matches the behavior of the legacy `Markdown.pl` as well as what is described in the manual.
-
LaTeX reader: ignore args to column type in `\multicolumn` (#8789).
-
HTML writer:
- Use first paragraph in task item as checkbox label (#8729, Albert Krewinkel).
-
Ms writer:
- Coerce titles to inlines (#8835). Block-level formatting is not allowed inside `.TL`.
-
LaTeX writer:
- Fix width for multicolumn simple table (#8831).
-
Jira writer:
- Use first code block class as highlighting language (#8814, Albert Krewinkel). The writer no longer searches the list of classes for a known programming language but always uses the first class in that list as the language identifier.
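As a rough illustration of the change, the following Python sketch contrasts the two selection rules described above. It models the described behavior only and is not the writer's Haskell code; the function names and the `KNOWN_LANGUAGES` set are hypothetical.

```python
from typing import Optional, Sequence

# Hypothetical stand-in for the set of languages a highlighter recognizes.
KNOWN_LANGUAGES = {"python", "haskell", "java"}


def old_language(classes: Sequence[str]) -> Optional[str]:
    """Earlier rule as described: search all classes for a known language."""
    return next((c for c in classes if c in KNOWN_LANGUAGES), None)


def new_language(classes: Sequence[str]) -> Optional[str]:
    """Rule as described for 3.1.3: the first class is always the language."""
    return classes[0] if classes else None


classes = ["numberLines", "python"]
print(old_language(classes))  # python
print(new_language(classes))  # numberLines
```

Under this model, a code block whose classes are `numberLines python` no longer resolves to `python`, so the language class would need to come first.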
-
OpenDocument writer:
-
ODT writer:
- Don’t add settings.xml (Michael Stahl). This will cause defaults to be used, which is what we want.
- Don’t add unnecessary Configurations2 directory (Michael Stahl).
- Don’t add thumbnail (Michael Stahl).
- Put `manifest.version` on directory file-entry (Michael Stahl). See ODF 1.3 part 2, 4.16.14.1.
- Stop validator complaints by producing ODF 1.3 (Michael Stahl).
-
MediaWiki writer:
- Remove links from inside links in mediawiki writer (#8739, Wout Gevaert).
-
Typst writer:
- Omit bibliography if `citations` not enabled (#8763). With this change, the typst writer will omit the `#bibliography` command when `citations` is not enabled. (If you want to use pandoc’s own `--citeproc`, you should combine it with `-t typst-citations` to disable native typst citations.)
- Use `<..>` for labels, create internal links.
- Use `#footnote` for notes (#8893).
- Fix alignment issue in lists. It’s an aesthetic issue only; the first line had an extra space indent after the list marker.
-
Commonmark writer:
- Use shortcut reference links: commonmark supports these.
-
EPUB template: add `lang` attribute to `<html>` (Gabriel Lewertoski).
-
Template styles.html: fix task-list styling in reveal.js (#8731, Albert Krewinkel).
-
LaTeX template: Fix `\babelfont` (#8728).
-
Text.Pandoc.Parsing:
- Remove unnecessary ‘spaces’ in `parseFromString`.
-
Text.Pandoc.ImageSize: Drop BOM at start of SVG if present. Otherwise our code can fail to determine image size.
-
Lua subsystem:
- Fix value of PANDOC_SCRIPT_FILE for custom readers & writers (#8781, Albert Krewinkel). The value did not hold the actual file path for scripts in the custom folder of the datadir.
-
Fix YAML in translation files for `cs` and `pl` (#8787).
-
Fix pdf output via typst (#8754). One must now use `typst compile` rather than `typst`.
-
MANUAL.txt:
- Added note that the user will need to create the user data dir (#8727).
- Add `wikilinks` to non-default extensions (Ilona).
- Update link to custom djot writer (Albert Krewinkel).
- Better link to citation syntax.
- Fix typo (sdhoward).
- Note that `#` fancy list markers don’t work with commonmark (#8772, William Lupton).
- Add commonmark `fenced_div` note (#8773, William Lupton).
- Move highlighting documentation, with minor adjustments (William Lupton).
- Fix inaccurate statement about spaces and tabs in template syntax (Frank Seifferth).
-
Update documentation for org-mode (Christian Christiansen, #8716).
-
doc/lua-filter.md:
-
CONTRIBUTING.md: update info on ghc versions.
-
INSTALL.md:
- Fix cabal install instructions (Albert Krewinkel).
- Use more relevant link to NetBSD/pkgsrc entry (Charlotte Koch).
- Fix Windows install instructions for winget (#8799).
-
Tests: Rename test/docx/block_quotes_parse_indent.native for consistency (Stephan Meijer).
-
Add `tls` constraint on cabal.project. This is needed to avoid problems caused by the transition to `crypton`.
-
Require texmath 0.12.8.
pandoc 3.1.2
-
Add a Lua REPL (Albert Krewinkel). This can be started with `pandoc lua -i`. It is also possible to instruct a filter to open the REPL at a certain point, for debugging (see `pandoc.cli.repl`).
-
Support `typst` as a `--pdf-engine`.
-
Add typst writer (#8713). New module Text.Pandoc.Writers.Typst, exporting `writeTypst` [API change].
-
Org reader:
- Allow zero width space as an escape character (#8716, Christian Christiansen). Allow the character U+200B to be used as an escape character as described in the Org-mode documentation (https://orgmode.org/manual/Escape-Character.html).
-
DocBook reader:
-
HTML reader:
- Fix behavior with `-native_spans-raw_html` (#8711). Previously with this configuration, `<span>`s were not treated as inline elements at all.
-
HTML writer:
- Avoid duplicate classes (#8705).
- Use img element instead of embed for `.svg.gz` and `.png.gz` etc. (#8699).
- HTML writer footnotes changes (#8695): when `--reference-location=section` or `=block`, use an `aside` element for the notes rather than a `section`. When `--reference-location=section`, include the `aside` element inside the section element, rather than outside. (In slide shows, this option causes footnotes on a slide to be displayed at the bottom of the slide.)
-
EPUB writer:
- Use different structure for epub footnotes (#8676, see #8672, #5583). Many EPUB readers are thrown off by pandoc’s current footnote output. Both the ol and the fact that the footnote backlink is at the end of the note seem to pose problems. With this commit, we now create a list of aside (or div) elements, instead of an ordered list. Each element begins with a note number that is linked back to the note reference. (So, the backlink occurs at the beginning rather than the end.) Thanks to @Porges and @lewer.
-
Docx writer:
- Include abstract title (#8702). Uses localized term for abstract.
-
Markdown writer:
- Use implicit figures if there’s a caption but no alt (#8689, Albert Krewinkel).
-
Jira reader (Albert Krewinkel):
- Add panel title as nested div (#8681).
- Require jira-wiki-markup 1.5.1 (#8680). This fixes a bug in the parser that caused text between two exclamation marks to be parsed as an image. The first `!` of image markup must now be followed by a non-space character; otherwise, the enclosed text is parsed as normal content.
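The image-markup rule can be approximated with two regular expressions. This Python sketch only illustrates the rule as stated here; the real parser lives in the jira-wiki-markup library and is not regex-based in this form, and the names `OLD_IMAGE`, `NEW_IMAGE`, and `find_images` are hypothetical.

```python
import re
from typing import List

# Approximation of the earlier behavior: anything between two "!" is an image.
OLD_IMAGE = re.compile(r"!([^!\n]+)!")

# Approximation of the fixed behavior: the opening "!" must be followed
# directly by a non-space character.
NEW_IMAGE = re.compile(r"!(?=\S)([^!\n]+)!")


def find_images(text: str, pattern: "re.Pattern[str]" = NEW_IMAGE) -> List[str]:
    """Return the contents of every span the pattern treats as image markup."""
    return pattern.findall(text)


sentence = "Wow! That is great! Really"
print(find_images(sentence, OLD_IMAGE))  # [' That is great']
print(find_images(sentence))             # []
print(find_images("!photo.png!"))        # ['photo.png']
```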
-
Ms writer:
- Fix handling of Figure (#8660).
-
ICML writer:
- Fix images with data (#8675). The Contents element should be inside Properties.
-
LaTeX writer:
- Add Chinese to Babel languages.
- Fix background image in Beamer when there are figure environments (#8671, Martín Pozo).
-
LaTeX template:
- Add `babelfonts` variable to default LaTeX template. This allows specifying certain fonts to be used with certain babel languages. Thanks to Frederik Elwert.
- Fix highlight/underline with lualatex (#8707). We need the lua-ul package instead of soul, which doesn’t work with lualatex.
-
Lua (Albert Krewinkel):
- Add `pandoc.cli.repl` function.
- Fix `json.encode` for nested AST elements. Ensures that objects with nested AST elements can be encoded as JSON.
- Auto-generate docs for pandoc modules.
- Load text module as `pandoc.text`. This only affects the name in the Lua-internal documentation. It is still possible to load the modules via `require 'text'`, although this is deprecated.
- Move docs from module `text` to `pandoc.text`. The latter is easier to use and more consistent with the other modules.
- Keep the Lua stack clean. A metatable used during initialization was not properly removed from the stack. Likewise, accessing the CommonState from Lua previously led to the pollution of the Lua stack with a left-over value.
- Add function `pandoc.format.from_path`.
- Allow getting the JSON encoding of log messages.
-
Text.Pandoc.Format: Add new function `formatFromFilePaths` [API change] (#8710, Albert Krewinkel).
-
The old Text.Pandoc.App.FormatHeuristics module has been removed.
-
In `--version`, use Windows `%APPDATA%` variable to describe user data dir (#8686, Pablo Rodríguez).
-
Text.Pandoc.App.CommandLineOptions: don’t lowercase arg to `--from`/`--read` (Albert Krewinkel). This prevented users from using custom writers with uppercase characters in their filenames. Format-normalization, including lower-casing of format identifiers, happens during format parsing.
-
Documentation:
- Add `doc/nix.md`.
- Add `doc/extras.md`. This was formerly in the website repo.
- `doc/lua-filters.md`: improve docs for `pandoc.zip`.
-
Factor out `make_macos_release.sh` from the release candidate workflow. Use cabal instead of stack to build the macos binary.
-
Modify linux/make_artifacts.sh so it will work on cirrus.
-
Switch to hslua-2.3
-
Depend on latest releases of texmath, doclayout.
pandoc 3.1.1
-
EPUB reader: Give additional information in error if the epub zip container can’t be unpacked.
-
TSV reader: don’t gobble tabs as whitespace (#8661).
-
Org reader: accept empty tables (#8659).
-
LaTeX reader: fix multiplication syntax for tabular (#8658). We recognized `*{6}{...}` but not `*6{...}` or `*6c`.
-
Docx reader: parse image alt texts in LibreOffice generated files. LibreOffice tags images slightly differently than Word; this change lets the parser take that difference into account when looking for an image description (alt text).
-
DocBook reader:
-
JATS reader: avoid generating duplicate figure captions (#8669).
-
RST reader: align with spec in syntax for role names (#8653). In particular, we now allow colons in role names.
-
Add note on converting from .doc format to FAQs (#8654).
-
Trap error in getAppUserDataDirectory (#8648). This can raise an error if pandoc is run in a non-user environment.
-
LaTeX writer: do not use longtable foot with Beamer (#8638, Albert Krewinkel). The table foot is made part of the table body, as otherwise it won’t show up in the output. The root cause for this is that longtable cannot detect page breaks in Beamer.
-
LaTeX template: Add CJKsansfont and CJKmonofont for XeLaTeX (#8656, Yudong Jin). `CJKsansfont` and `CJKmonofont` will be set for xelatex only if `CJKmainfont` is also provided.
-
URL style in ConTeXt (#8612, Thomas Hodgson). Previously, a URL like this would be in monospace text: `\useURL[url1][https://example.com]`. Now, it will match the main text unless the `linkstyle` variable is set, which controls the styling of all links. Closes #8602.
-
Asciidoc writer: Properly escape `|` in table cells (#8665).
-
asciidoc{,tor} template: fix revision date when author is unset (#8637, arcnmx). Revision line syntax is only valid in combination with an author line, so the date attribute must be set explicitly when the author is missing
-
HTML writer: allow “track” element to be treated as block-level HTML (#8629).
-
Include needed polyfill when MathJax is used (#8625).
-
JATS writer: include alt-text in `<graphic>`, `<inline-graphic>` elements (#8631, Albert Krewinkel).
-
Chunked HTML writer: Retain metadata in processing sections for chunked HTML (#8620). Previously we suppressed metadata in all but the top page, in order to prevent the title block from being printed on every page. This prevented use of custom variables set by metadata fields. This commit moves to a better solution: a conditional in the default template restricts the title block to the top page.
-
Lua API:
- Add new function `pandoc.system.cputime` (Albert Krewinkel). The function returns the CPU time consumed by pandoc and can be used to benchmark Lua computations.
- Add module `pandoc.json` to handle JSON encoding (#8605, Albert Krewinkel).
-
Use pandoc-lua-marshal 0.2.1 (Albert Krewinkel). All major AST elements now have `__tojson` metamethods that return the JSON representation of an element. This allows JSON-encoding these elements with libraries that respect the `__tojson` metamethod, including dkjson.
-
Use latest zip-archive. This allows pandoc to open certain epubs that it could not open before.
-
Use commonmark-extensions 0.2.3.4. This fixes some bugs involving definition lists and inline formatting.
-
Use latest skylighting-format-context
-
MANUAL.txt:
- Document chunk-template in defaults file.
- Remove obsolete “raw content in a style” section.
- Revise documentation for `--mathml` to reflect support in all major browsers (#8667).
-
docs/custom-readers.md: Update JSON parsing example. The example now uses the built-in `pandoc.json` library to parse the API output.
-
doc/press.md: Add article on CiTO in J Cheminform by @egonw.
-
doc/lua-filters.md: fix typo in `run_json_filter` (Morgan Willcock).
pandoc 3.1
-
Fix regression with `--print-highlight-style` option (#8586).
-
Add new `--chunk-template` option (#8581), allowing more control over the filenames in chunked HTML output.
-
Text.Pandoc.App: Add `optChunkTemplate` constructor to Opt [API change].
-
Text.Pandoc.Options: add `writerChunkTemplate` constructor to `WriterOptions` [API change].
-
Text.Pandoc.Chunks: add Data, Typeable, Generic, ToJSON, FromJSON instances for `PathTemplate` [API change].
-
Text.Pandoc.Citeproc: Fix bug in `metaValueToReference` (#8611). This bug caused us to get some repeated content when converting MetaBlock to Inlines.
-
Textile reader:
-
ODT reader: fix blockquote indent detection (#3437, Daniel Kessler).
-
LaTeX writer: include short figure/table caption if one is given (Albert Krewinkel). Short captions are used by LaTeX when generating the list of figures or list of tables. Adding a short caption will now overwrite the full caption in these lists.
-
Powerpoint writer: fix handling of simple figures (#8565, Albert Krewinkel). This ensures that simple figures are displayed in the same way as before the introduction of a dedicated `Figure` constructor in the AST.
-
Use released skylighting 0.13.2.1
-
INSTALL.md: direct people to cabal install pandoc-cli.
-
doc/lua-filters.md: document ‘Figure’ type and constructor (Albert Krewinkel). Fix typos (Martin Joerg).
-
Fix link in manual (#8583, Salim B).
pandoc 3.0.1
-
Fix use of extensions with custom readers (#8571).
-
Text.Pandoc.Writers.Shared: export `setupTranslations` [API change]. Use this in HTML and OpenDocument writers, to ensure that translations are set up properly even when we don’t go through `convertWithOpts`.
-
LaTeX reader: fix regression in macro resolution for environments (#8573).
-
Chunked HTML writer: Fix handling of images with absolute URLs (#8567).
-
HTML writer:
- Don’t omit newlines in task lists.
- Don’t disable checkboxes in task lists (#8562).
-
Ensure that automatically set variables `pandoc-version`, `outputfile`, `title-prefix`, `epub-cover-image`, `curdir`, `dzslides-core` can be overridden by `--variable` on the command line. Previously they would create lists in the template Context, which is not desirable.
-
Fix man page copying in `linux/make_artifacts.sh` (#8566). Previously we were copying the pandoc-server.1 man page to pandoc-lua.1.
-
pandoc.cabal: remove cabal.project, stack.yaml from extra-source-files (#8560). The problem is that if these are in extra-source-files, then they get put in the tarball, and then anyone trying to build the source from an unpacked tarball will run into the problem that cabal.project and stack.yaml refer to pandoc-server, pandoc-lua-engine, and pandoc-cli, which aren’t in the tarball.
-
Require texmath 0.12.6 for better MathML output.
-
Fix typo in Lua filter documentation (Carlos Scheidegger).
-
Fix formatting of link in pandoc-server.md (James Scott-Brown).
-
Minor changelog fixups.
pandoc 3.0
-
Split pandoc-server, pandoc-cli, and pandoc-lua-engine into separate packages (#8309). Note that installing the `pandoc` package from Hackage will no longer give you the `pandoc` executable; for that you need to install `pandoc-cli`.
-
Pandoc now behaves like a Lua interpreter when called as `pandoc-lua` or when `pandoc lua` is used (#8311, Albert Krewinkel). The Lua API that is available in filters is automatically available to the interpreter. (See the `pandoc-lua` man page.)
-
Pandoc behaves like a server when called as `pandoc-server` or when `pandoc server` is used. (See the `pandoc-server` man page.)
-
A new command-line option, `--list-tables`, causes tables to be formatted as list tables in RST (#4564, with Francesco Occhipinti).
-
New command line option: `--epub-title-page=true|false` allows the EPUB title page to be omitted (#6097).
-
`--reference-doc` can now accept a URL argument (#8535) and load a remote reference doc.
-
`--version` output no longer contains version info for dependent packages. Instead, it contains a “Features” line that indicates whether the binary was compiled with support for acting as a server, and for using Lua filters and Custom writers.
-
A new option `--split-level` replaces `--epub-chapter-level` and affects both EPUB and chunked HTML output. `--epub-chapter-level` will still work but is deprecated.
-
Multiple input files with `--file-scope`: fix case where the links are URL-encoded, e.g. with `%20` (#8467).
-
Produce error if `--csl` is used more than once (#8195, Prat).
-
Remove deprecated `--atx-headers` option.
-
Remove deprecated option `--strip-empty-paragraphs`.
-
In `--verbose` mode add message when running citeproc (as with other filters).
-
Add new `mark` extension for highlighted text in Markdown, using `==` delimiters (#7743).
-
Add new extensions `wikilinks_title_after_pipe` and `wikilinks_title_before_pipe` for `commonmark` and `markdown` (#2923, Albert Krewinkel). The former enables links of style `[[Name of page|Title]]` and the latter `[[Title|Name of page]]`. Titles are optional in both variants, so this works for both: `[[https://example.org]]`, `[[Name of page]]`. The writer is modified to render links with title `wikilink` as a wikilink if a respective extension is enabled. Pandoc will use `wikilinks_title_after_pipe` if both extensions are enabled.
-
Add prefixes to identifiers with `--file-scope` (#6384). This change only affects the case where `--file-scope` is used and more than one file is specified on the command line. In this case, identifiers will be prefixed with a string derived from the file path, to disambiguate them. For example, an identifier `foo` in `contents/file1.txt` will become `contents__file1.txt__foo`. Links will be adjusted accordingly: if `file2.txt` links to `file1.txt#foo`, then the link will be changed to point to `#file1.txt__foo`. Similarly, a link to `file1.txt` will point to `#file1.txt`. A Div with an identifier derived from the file path will be added around each file’s content, so that links to files will still work.
-
New output format: `chunkedhtml`. This creates a zip file containing multiple HTML files, one for each section, linked with “next,” “previous,” “up,” and “top” links. (If `-o` is used with an argument without an extension, it is treated as a directory and the zip file is automatically extracted there, unless it already exists.) The top page will contain a table of contents if `--toc` is used. A `sitemap.json` file is also included. The option `--split-level` determines the level at which sections are to be split.
-
Support complex figures (Albert Krewinkel, Aner Lucero). There is now a dedicated Figure block constructor for figures. The old hack of representing a figure as `Para [Image attr [..alt..] (source, "fig:title")]` has been dropped. Here is a summary of figure support in different formats:
- Markdown reader: paragraphs containing just an image are treated as figures if the `implicit_figures` extension is enabled. The identifier is used as the figure’s identifier and the image description is also used as figure caption; all other attributes are treated as belonging to the image.
- Markdown writer: figures are output as implicit figures if possible, via HTML if the `raw_html` extension is enabled, and as Div elements otherwise.
- HTML reader: `<figure>` elements are parsed as figures, with the caption taken from the respective `<figcaption>` elements.
- HTML writer: the alt text is no longer constructed from the caption, as was the case with implicit figures. This reduces duplication, but comes at the risk of images that are missing alt texts. Authors should take care to provide alt texts for all images. Some readers, most notably the Markdown reader with the `implicit_figures` extension, add a caption that’s identical to the image description. The writer checks for this and adds an `aria-hidden` attribute to the `<figcaption>` element in that case.
- JATS reader: The `<fig>` and `<caption>` elements are parsed into figure elements, even if the contents are more complex.
- JATS writer: The `<fig>` and `<caption>` elements are used to write figures.
- LaTeX reader: support for figures with non-image contents and for subfigures.
- LaTeX writer: complex figures, e.g. with non-image contents and subfigures, are supported. The `subfigure` template variable is set if the document contains subfigures, triggering the conditional loading of the subcaption package. Contents of figures that contain tables become unwrapped, as longtable environments are not allowed within figures.
- DokuWiki, Haddock, Jira, Man, MediaWiki, Ms, Muse, PPTX, RTF, TEI, ZimWiki writers: Figures are rendered like Div elements.
- Asciidoc writer: The figure content is unwrapped; each image in the figure becomes a separate figure.
- Classic custom writers: Figures are passed to the global function `Figure(caption, contents, attr)`, where `caption` and `contents` are strings and `attr` is a table of key-value pairs.
- ConTeXt writer: Figures are wrapped in a “placefigure” environment with `\startplacefigure`/`\endplacefigure`, adding the figure’s caption and listing title as properties. Subfigures are placed in a single row with the `\startfloatcombination` environment.
- DocBook writer: Uses `mediaobject` elements, unless the figure contains subfigures or tables, in which case the figure content is unwrapped.
- Docx writer: figures with multiple content blocks are rendered as tables with style `FigureTable`; like before, single-image figures are still output as paragraphs with style `Figure` or `Captioned Figure`, depending on whether a caption is attached.
- DokuWiki writer: Caption and “alt-text” are no longer combined. The alt text of a figure will now be lost in the conversion.
- FB2 writer: The figure caption is added as alt text to the images in the figure; pre-existing alt texts are kept.
- ICML writer: Only single-image figures are supported. The contents of figures with additional elements get unwrapped.
- OpenDocument writer: A separate paragraph is generated for each block element in a figure, each with style `FigureWithCaption`. Behavior for single-image figures therefore remains unchanged.
- Org writer: Only the first element in a figure is given a caption; additional block elements in the figure are appended without any caption being added.
- RST writer: Single-image figures are supported as before; the contents of more complex images become nested in a container of type `float`.
- Texinfo writer: Figures are rendered as float with type `figure`.
- Textile writer: Figures are rendered with the help of HTML elements.
- XWiki: Figures are placed in a group.
-
Changes in custom readers/writers:
- It is now possible to have a custom reader and a custom writer for a format together in the same file. The file may also define a custom template for the writer.
- Pandoc now checks the folder `custom` in the user’s data directory for a matching script if it can’t find one in the local directory. Previously, the `readers` and `writers` data directories were searched for custom readers and writers, respectively. Scripts in those directories must be moved to the `custom` folder.
- Custom readers used to implement a fallback behavior that allowed consuming just a string value as input to the `Reader` function. This has been removed; the first argument is now always a list of sources. Use `tostring` on that argument to get a string.
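The lookup order for custom scripts can be pictured with a small Python model. It is a sketch of the order described above, not pandoc's code: `find_custom_script` and its parameters are hypothetical names, and the real resolution may involve further steps.

```python
from pathlib import Path
from typing import Optional


def find_custom_script(name: str, local_dir: Path, data_dir: Path) -> Optional[Path]:
    """Look for `name` locally first, then in the data directory's `custom` folder."""
    candidates = [
        local_dir / name,            # script in the local directory
        data_dir / "custom" / name,  # fallback folder named in the changelog
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

In this model, a script that still sits in a `readers` or `writers` subfolder of the data directory is never found, which matches the note that such scripts must be moved to `custom`.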
-
New module Text.Pandoc.Writers.ChunkedHTML, exporting `writeChunkedHtml` [API change].
-
We now set the `pandoc-version` variable centrally rather than in the writers. One effect is the man writer now emits a comment with the pandoc version.
-
pandoc-server:
- Add simple CORS support to pandoc-server (#8427).
- Print message to stderr when starting the server.
-
Docx reader:
-
ODT reader:
-
DocBook reader:
-
JATS reader:
- Handle uri element in references (#8270).
-
Ipynb reader:
...
pandoc 2.19.2
-
Fix regression with data uris in 2.19.1 (#8239). In 2.19.1 we used the base64URL encoding rather than base64.
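The difference between the two encodings is easy to see with Python's standard `base64` module. This is only an illustration of the two alphabets, not pandoc's code; the two-byte payload is chosen so that the outputs differ.

```python
import base64

payload = bytes([0xFB, 0xFF])  # bytes whose encoding uses the differing characters

standard = base64.b64encode(payload).decode("ascii")
url_safe = base64.urlsafe_b64encode(payload).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=

# A data URI is expected to carry the standard alphabet ("+" and "/").
data_uri = f"data:application/octet-stream;base64,{standard}"
print(data_uri)
```

Consumers that decode with the standard alphabet will not read `-` and `_` as intended, which is why using the URL-safe variant in data URIs was a regression.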
-
pandoc-server: handle `citeproc` parameter as documented (#8235).
-
Org reader: treat emacs-jupyter src blocks as code cells (#8236, Albert Krewinkel). This improves support for notebook-like org files that are intended to be used with emacs-jupyter package.
-
HTML writer and templates: revert to using `width` property for column widths (Albert Krewinkel). The default `flex` and `overflow-x` properties of a column are set to `auto`. In combination, these changes allow getting good results when using columns with or without explicit widths.
-
Org writer (Albert Krewinkel):
- Add support for jupyter notebook cells (#6367).
- Prefix code language of ipynb code blocks with `jupyter-`. This is the convention used by the emacs-jupyter package.
- Keep code block attributes as header args. This allows keeping more information in the resulting `src` blocks, making it easier to roundtrip from or through Org. Org babel ignores unknown header arguments.
- Add code block identifier as `#+name` to src blocks.
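Taken together, the Org writer changes above suggest output along the following lines. This Python sketch is a guess at the general shape, assuming Org's usual `:key value` header-argument syntax; `org_src_block` is a hypothetical helper, and pandoc's exact spacing and ordering may differ.

```python
from typing import Dict, Optional


def org_src_block(code: str, language: str, identifier: str = "",
                  attributes: Optional[Dict[str, str]] = None) -> str:
    """Render a notebook code cell as an Org src block (illustrative layout only)."""
    lines = []
    if identifier:
        # Code block identifier becomes a #+name line.
        lines.append(f"#+name: {identifier}")
    # Language is prefixed with "jupyter-", the emacs-jupyter convention.
    header = f"#+begin_src jupyter-{language}"
    # Remaining attributes are kept as header arguments.
    for key, value in (attributes or {}).items():
        header += f" :{key} {value}"
    lines.extend([header, code, "#+end_src"])
    return "\n".join(lines)


print(org_src_block("print(1)", "python", "cell-1", {"session": "py"}))
```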
-
Fix some typos in the codebase (luz paz).
-
Require hslua-module-path 1.0.3 (#8228, Albert Krewinkel).
pandoc 2.19.1
-
Add server capabilities.
- New exported module Text.Pandoc.Server [API change].
- The pandoc executable now starts up a web server when renamed or symlinked as `pandoc-server`, and functions as a CGI program when renamed or symlinked as `pandoc-server.cgi`. See the man page for `pandoc-server` for full documentation.
-
Text.Pandoc.App.Opts: Redo `FromJSON` for `Opt` so that optional values can be omitted (in which case the values from `defaultOptions` are used).
-
Org reader: treat “abstract” block as metadata (Albert Krewinkel, #8204). A block of type “abstract” is assumed to define the document’s abstract. It is transferred from the main text to the metadata.
-
Org template: add abstract from metadata as block of type “abstract” (#8204).
-
HTML writer: use `flex` property for column widths (Albert Krewinkel, #8232).
-
LaTeX writer:
-
LaTeX template: fix behavior of `colorlinks` variable (Albert Krewinkel, #8226). Fixes a regression in 2.19 that required the `boxlinks` variable to be set in addition to the usual link coloring variables. Otherwise links were never colored in LaTeX PDF output.
-
Text.Pandoc.Highlighting: Export `lookupHighlightingStyle` [API change]. Previously this lived in an unexported module Text.Pandoc.App.CommandLineOptions, under the name `lookupHighlightStyle`.
-
Text.Pandoc.App:
- Remove unneeded MonadIO constraints in readSources.
- Factor out `convertWithOpts'` from `convertWithOpts`. This runs in any PandocMonad, MonadIO, MonadMask instance. So far it is not exported, but it might find a use later.
-
Support `--strip-comments` in commonmark/gfm (#8222). This change makes the commonmark reader sensitive to `readerStripComments`.
-
Lua: add function `pandoc.utils.citeproc` (Albert Krewinkel). The function runs the citeproc processor on a Pandoc document. Exposing this functionality to Lua makes it possible to run citation processing as part of a filter or writer, simplifies the creation of multiple bibliographies, and enables the use of varying citation styles in different parts of a document.
-
Refactor `linux/make_artifacts.sh`.
-
Update INSTALL.md installation from source instructions.
-
Use base64 package instead of base64-bytestring. It is supposed to be faster and more standards-compliant.
-
trypandoc improvements:
- Add dropdown with canned examples.
- Add citeproc support.
- Support csv, bibliographic and binary formats.
- Add load from file.
- Add permalink. Don’t always reload page.
- Use vanilla JS and CSS + the new `pandoc-server.cgi`.
-
Allow haddock-library-1.11.0.
-
Convert `tool/extract-changes.hs` to a Lua filter.
pandoc 2.19
-
Add `--embed-resources` flag (Elliot Bobrow, #7331). This can be used to embed resources without implying `--standalone`. Deprecate `--self-contained` in favor of `--embed-resources --standalone`.
-
Allow environment variable interpolation in `highlight-style` and `pdf-engine` fields in defaults files (#8061; Jaehwang Jung, #8073).
-
Allow placing custom readers and writers in user data directory (Albert Krewinkel, #8112) (`readers` and `writers` subdirectories).
-
Add `tsv` (tab separated values) as an input format (#7974). [API change]: Text.Pandoc.Readers.CSV now exports `readTSV`. Internal change: In Text.Pandoc.CSV, `CSVOptions` has changed so that `csvQuote` takes a Maybe value.
-
Add `tex_math_dollars` to `gfm` default extensions (reflecting gfm’s new support for math).
-
RST, Org, Markdown readers: support rowspans and colspans in grid tables (#8202, Albert Krewinkel). Note: the writers do not yet support these more complex grid table features, so these complex grid tables will not round-trip.
-
HTML, LaTeX, and MediaWiki readers: use `formatCode` (#8162, #8129, Elliot Bobrow). This moves formatting from inside inline code elements to the outside, since pandoc’s Code element only takes string content.
-
Markdown reader:
-
HTML reader:
- Allow sublists that are not marked as items (Albert Krewinkel, #8150). This is technically invalid HTML, but it can be found in the wild and browsers handle it.
-
Org reader (Albert Krewinkel):
- Recognize absolute paths on Windows (Albert Krewinkel, #8201).
- Recognize {webp,jxl} files as images (YI).
- Allow attrs for Org tables (Albert Krewinkel, #8049). Tables with attributes are no longer wrapped in Div elements; attributes are added directly to the table element.
- Support line selection in INCLUDE directives (Brian Leung, #8060).
- Fix Post / Pre mixup when setting emphasis chars (Amir Dekel, #8134).
-
LaTeX reader:
- Support `\includesvg` (#8027).
- Unescape characters in `\lstinline` inside `\passthrough` (#8179).
- Improve `mathEnvWith` (#8122). When converting e.g. an align environment to an aligned environment inside a Math element, we need to include a newline before the `\end{aligned}`, since the previous line might end in a comment.
- Fix treatment of extensions for `\input` in LaTeX reader (#8092). Previously we required a `.tex` extension, but TeX allows any extension for `\input` (as opposed to `\include`).
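One way to picture the relaxed `\input` rule is a helper that appends `.tex` only when the argument carries no extension at all. This Python sketch assumes that extension-less names still default to `.tex`, as in TeX itself; `input_filename` is a hypothetical name and the reader's actual file lookup may differ.

```python
from pathlib import PurePosixPath


def input_filename(arg: str) -> str:
    """Keep any extension the author wrote; add ".tex" only when there is none."""
    if PurePosixPath(arg).suffix:
        return arg
    return arg + ".tex"


for name in ("chapter1", "intro.tex", "macros.sty"):
    print(name, "->", input_filename(name))
```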
-
RTF reader:
- Support `\nosupersub` (#8170).
-
TikiWiki reader:
- Support underlined text
-
DocBook reader:
- Improved reading of `<xref>` elements (Frerich Raabe, #8065).
-
JATS reader:
-
RIS reader:
-
MediaWiki reader:
- Allow HTML comment after row start (#8110).
-
DokuWiki reader:
- The `tex_math_dollars` extension is now supported for `dokuwiki` (but off by default) (#8178).
- Content inside `<latex>...</latex>` is parsed as raw LaTeX inline, and inside `<LATEX>..</LATEX>` as raw LaTeX block (#8178).
- The behavior of `<php>...</php>` is changed, so that instead of producing a code block, it produces raw HTML with `<?php ... ?>`.
-
LaTeX writer:
- Improve grouping with autocites (#8088).
- Extend list of book documentclasses (Wentau Han, #8053).
- Fix width of multicolumn cells (Albert Krewinkel, #8090). Cells spanning multiple columns must be given an explicit width, calculated from the table properties.
- Beamer: allow containsverbatim as alternative to fragile (#8080).
-
HTML writer:
- Add ‘footnotes’ identifier to footnotes section (#8043).
- Fix bug with --number-offset. This formerly caused section divs to be produced, even when --section-divs was not specified (#8097).
- Use CSS flexboxes for columns (Albert Krewinkel). This allows an arbitrary number of columns, while the previous approach assumed exactly two columns.
- Allow “spanlike” classes to be combined (see #8194). Previously classes like “underline” and “marked” had to be the first class in a span in order for the span to be interpreted as a “ul” or “mark” element. This commit allows these special classes to be “stacked,” e.g. [test]{.mark .underline}; in addition, the special classes are no longer required to come first in the list of classes.
- Avoid doubled style attribute when height and width are added to style because of an image, but the image already has a style attribute (#8047).
- Do not include the deprecated doc-endnote role (#8030). doc-endnote was deprecated in DPUB-ARIA 1.1.
- Remove extra soft break for tasklist (black-desk, #8142). Browsers display the extra newline character between checkbox and text as a space, which prevents tasklist items from being aligned.
-
EPUB writer:
- Allow choice of math method for v3 (#8164). Previously we always used MathML for math in EPUB3, because the spec includes MathML. But this is not widely supported by readers, so it seems better to allow users to choose their math method as they can with EPUB2 or HTML. NOTE: Existing workflows that produce EPUBv3 documents including math will be affected by this change. You must add --mathml to your command line if you want to continue producing MathML.
-
RST writer:
-
Ms writer:
- Add comment in preamble stating generator.
- Fix roff ms syntax highlighting definitions (#8175, thanks to Branden Robinson).
-
ConTeXt writer:
-
Support complex table structures (Albert Krewinkel, #8116). The following table features are now supported in ConTeXt:
- colspans,
- rowspans,
- multiple bodies,
- row headers, and
- multi-row table head and foot.
The wrapping placetable environment is also given a reference option with the table identifier, enabling referencing of the table from within the document.
-
Unify link handling (Albert Krewinkel, #8096). Autolinks, i.e. links with content that’s the same as the linked URL, are now marked with the \url command. All other links, both internal and external, are created with the \goto command, leading to shorter, slightly more idiomatic code. As before, autolinks can still be styled via \setupurl, other links via \setupinteraction.
-
Use “sectionlevel” environment for headings (Albert Krewinkel, #5539). The document hierarchy is now conveyed using the \startsectionlevel/\stopsectionlevel by default. This makes it easy to include pandoc-generated snippets in documents at arbitrary levels. The more semantic environments “chapter”, “section”, “subsection”, etc. are used if the --top-level-division command line parameter is set to a non-default value.
-
-
Docx writer:
- Add w:lang to rPr for Span and Div with lang attribute, so that Word can know that “Apfel” is not a spelling error (#8026).
- Prevent crashing when handling invalid tables (Albert Krewinkel, #8102). Tables with different numbers of cells per row would sometimes crash pandoc. This fix prevents this by cutting off overlong rows.
-
ICML writer:
- Support custom-style attribute on Table (#8079).
-
AsciiDoc writer:
- Fix commas in link text (#8070). Commas in link text trigger interpretation of attributes. To block this, we replace them with numeric entities.
- Fix underline. We were rendering it as +++text+++; this is now changed to [.underline]#text#. See comment at #8070.
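The comma fix for AsciiDoc link text noted above can be sketched as follows. This is an illustration only, not the writer's actual Haskell code: both function names are hypothetical, and the use of the decimal entity `&#44;` is an assumption about which numeric entity is emitted.

```python
def escape_link_text_commas(text: str) -> str:
    """Replace commas so AsciiDoc does not parse the text as link attributes.

    Assumption: the comma is written as the decimal numeric entity &#44;.
    """
    return text.replace(",", "&#44;")


def asciidoc_link(url: str, text: str) -> str:
    """Render a link in AsciiDoc's url[text] form (hypothetical helper)."""
    return f"{url}[{escape_link_text_commas(text)}]"


if __name__ == "__main__":
    print(asciidoc_link("https://example.com", "Hello, world"))
```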
-
FB2 writer:
- Fix handling of non-section Divs (#8123).
-
Markdown writer:
-
Text.Pandoc.Class:
- Add new function findFileWithDataFallback [API Change] (Albert Krewinkel).
- fillMediaBag: Keep attributes of original image on Span (Albert Krewinkel, #8099). Images that cannot be fetched are replaced with a Span that contains the image’s description. The span now also inherits all attributes of the original image. Furthermore, the classes image and placeholder are added, and path and title are stored in attributes original-image-src and original-image-title, respectively.
-
Text.Pandoc.Shared:
- makeSections: don’t make a section for a div with class “fragments” (#8098).
- Ensure that Nulls are ignored by makeSection and in segmenting slides (#8155).
- Add formatCode function to Text.Pandoc.Shared [API change] (Elliot Bobrow, #8129).
- taskListItemToAscii: handle asciidoctor’s characters (#8011). Asciidoctor uses different unicode characters for task lists; we should recognize them too and be able to convert them to ascii task lists in formats like gfm.
- Deprecate deLink and mark for later removal.
-
Text.Pandoc.Writers.Shared:
- toTableOfContents: Don’t replace links with empty spans in TOC (#8020).
-
Text.Pandoc.Readers.Metadata:
- Ensure that metadata values w/o trailing newlines are parsed as inlines, ...
pandoc 2.18
-
New input formats: endnotexml (EndNote XML bibliography), ris (RIS bibliography).
-
A RIS bibliography file may now be used with --citeproc.
-
Citeproc: Allow a formatted bibliography to be placed in metadata fields via a Div with class refs (#7969, #526). Thus, one can include a metadata field, say refs, whose content is an empty div with id refs, and the formatted bibliography will be put into this metadata field. It may then be interpolated into a template using the variable refs.
-
Ensure that you don’t get PDF output to terminal. -t pdf now behaves like -t docx and gives an error unless the output is redirected.
-
--version now prints hslua version (#7929) and Lua version (#7997, Albert Krewinkel).
-
Change --metadata-file parsing so that, when the input format is not markdown or a markdown variant, pandoc’s markdown is used (#6832, #7926). When the input format is a markdown variant, the same format is used. Reason for the change: it doesn’t make sense to run the markdown parser with a set of extensions designed for a non-markdown format, and this dramatically limits what people can do in metadata files.
-
Trim whitespace from math in --webtex (#7892). This fixes problems with --webtex and markdown output, when display math starts or ends with a newline.
-
New exported module Text.Pandoc.Readers.EndNote, exporting readEndNoteXML, readEndNoteXMLCitation, and readEndNoteXMLReferences. [API change]
-
--self-contained: issue warning rather than failing with an error if a resource can’t be found (#7904).
-
New exported module, Text.Pandoc.Readers.RIS, exporting readRIS (#7894).
-
LaTeX reader:
- Handle subequations as inline math environment (#7883).
- Rudimentary support for vbox (#7939).
- Support \today (#7905).
- Handle \label and \ref for footnotes (#7930).
- Allow inline groups starting with \bgroup (#7953).
- Use custom TokStream that keeps track of whether macros are expanded. This allows us to improve performance a bit by avoiding unnecessary runs of the macro expansion code (e.g. from 24 ms to 20 ms on our standard benchmark).
- Further optimizations for inline parsing.
- Better handling of \usepackage. If the package is local but causes parse errors, parse everything up to the error and skip the rest. Issue a CouldNotParseIncludeFile warning indicating that parsing failed at that point.
warning indicating that parsing failed at that point. - Text.Pandoc.Readers.LaTeX.Parsing: Monoid and Semigroup instances for TokStream.
-
HTML reader:
-
DocBook reader:
- Handle complete set of entities as specified at https://www.w3.org/2003/entities/2007doc/byalpha.html (#7938).
- Handle abstract in info section (#7747).
- Improve info parsing.
- Simplify metadata parsing code (#7747). Handle abstract as block-level content. Report skipped info elements with --verbose.
- Handle address and copyright in metadata (#7747).
-
DokuWiki reader:
- Add DokuWiki table alignment (#5202, damon-sava-stanley).
-
RST reader:
-
JATS reader:
- Improve handling of fn-group elements (#6348, Albert Krewinkel). Footnotes in <fn-group> elements are collected and re-inserted into the document as proper footnotes in the place where they are referenced.
- Handle pub-date (#8000).
- Support PMID, DOI, issue in citations (#7995).
- Improve refs parsing. Handle issn and isbn; use simpler form for issued date.
- Strip ‘ref-’ from ref id in constructing CSL id. This allows better round-tripping, because the JATS writer adds the ref- prefix to the citation id to get the ref element’s id.
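The round-tripping argument in the last JATS reader entry above can be shown with a short Python sketch. It is only an illustration of the prefix convention described there; the function names are hypothetical and do not correspond to pandoc's Haskell API.

```python
REF_PREFIX = "ref-"


def writer_ref_id(citation_id: str) -> str:
    """What the JATS writer is described as doing: prefix the citation id."""
    return REF_PREFIX + citation_id


def reader_csl_id(ref_id: str) -> str:
    """What the JATS reader now does: strip the prefix if it is present."""
    if ref_id.startswith(REF_PREFIX):
        return ref_id[len(REF_PREFIX):]
    return ref_id


if __name__ == "__main__":
    # Writing and then reading gives back the original citation id.
    print(reader_csl_id(writer_ref_id("smith2020")))
```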
-
Org reader:
-
Allow “:” in property drawer keys (Lucas V. R). Any non-space character is allowed as property drawer key, including “:” itself (so it is not really a delimiter). The real delimiter is a space character, so in a drawer like
:PROPERTIES:
::k:ey:: value
:END:
“:k:ey:” is a key with value “value”.
-
Allow comments above property drawer.
-
More flexible LaTeX environments (Lucas V. R).
-
Handle #+bibliography: as metadata so that it can work with --citeproc.
-
Parse #+print_bibliography: as Div with id refs.
-
Allow multiple #+bibliography:.
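The property drawer key rule from the Org reader notes above (the key may contain “:”, and the first space is the real delimiter) can be sketched in Python. This is a simplified illustration, not the reader's actual parser; `parse_property_line` is a hypothetical name, and handling of the :PROPERTIES: and :END: lines is assumed to happen elsewhere.

```python
from typing import Optional, Tuple


def parse_property_line(line: str) -> Optional[Tuple[str, str]]:
    """Split one drawer line of the form ':KEY: value' into (key, value).

    The key is everything between the first and last colon of the first
    whitespace-delimited token, so it may itself contain colons.
    Drawer delimiters (:PROPERTIES:, :END:) are assumed to be filtered
    out by the caller.
    """
    parts = line.strip().split(None, 1)
    if not parts:
        return None
    head = parts[0]
    if len(head) < 3 or not (head.startswith(":") and head.endswith(":")):
        return None
    key = head[1:-1]
    value = parts[1].strip() if len(parts) > 1 else ""
    return key, value


if __name__ == "__main__":
    print(parse_property_line("::k:ey:: value"))
```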
-
-
Markdown reader:
-
Docx reader:
- Enable citations extension for docx reader (#7840). When enabled, Zotero, Mendeley, and EndNote citations embedded in a docx are parsed as native pandoc citations. (When disabled, the generated citation text and bibliography are passed through as regular text.) The bibliography generated by the plugin is suppressed. Instead, bibliographic data embedded in citation items is added to the references metadata field so that it can be used with --citeproc.
-
Docbook writer:
- Interpret links without contents as cross-references (#7360, Jan Tojnar). Links without text contents are converted to <xref> elements. DocBook processors will generate appropriate cross-reference text when presented with an xref element.
-
Docx writer:
- Single numbering ID for examples (#7895, mjfs). This change ensures that example list items all belong to a single number sequence, so that if items are added or deleted in a word processor, the other items will renumber automatically.
- Add bookmark with table id to table (#7989, Nikolai Korobeinikov, #7285). This allows tables with ids to be linked to.
-
Ipynb writer:
- Handle metadata better (#7928). Previously we used the markdown writer to render metadata. This had some undesirable consequences (e.g. en dash expanded to -- when smart is enabled), so now we use the plain writer.
-
LaTeX writer:
- Avoid extra space before \CSLRightInline (#7932).
- Add scrreport to chaptersClasses (#6168, ivardb).
- Support page, trim, clip attributes on images (#7181).
- Add () after booktabs rules (#8001). These commands take optional arguments with () and [], which can lead to problems if the content of the table cell begins with these characters.
-
RST writer:
- Support all standard metadata (“bibliographic”) fields.
-
HTML writer: performance improvements.
-
Org writer:
- Stop indenting property drawers, quote blocks (#3245, Albert Krewinkel). This follows the current default org-mode behavior.
-
Markdown writer:
- Move table-related code into submodule (Albert Krewinkel).
- Don’t produce redundant header identifier when the gfm_auto_identifiers extension is set (#7941).
- Update escaping rules for \. We now escape \ only if raw_tex is enabled or it is followed by a non-alphanumeric.
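The backslash escaping rule stated above can be sketched in Python. This is a rough illustration of the stated condition, not the Markdown writer's Haskell code; `escape_backslashes` is a hypothetical name, and treating a backslash at the very end of the text as "followed by a non-alphanumeric" is an assumption.

```python
def escape_backslashes(text: str, raw_tex: bool) -> str:
    """Double a backslash only when raw_tex is on or a non-alphanumeric follows.

    Assumption: a trailing backslash (nothing after it) is escaped too.
    """
    out = []
    for i, ch in enumerate(text):
        if ch == "\\":
            nxt = text[i + 1] if i + 1 < len(text) else ""
            # "".isalnum() is False, so the end-of-text case is escaped.
            if raw_tex or not nxt.isalnum():
                out.append("\\\\")
                continue
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(escape_backslashes(r"C:\temp and \*not emphasis\*", raw_tex=False))
```

With raw_tex disabled, a backslash before a letter or digit cannot start a TeX command or a Markdown escape, so leaving it alone keeps the output more readable.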
-
JATS writer:
- Encode author “others” as <etal/> (Albert Krewinkel). Citeproc adopted the BibTeX convention to use the author name “others” when there are additional authors that are not named. JATS uses the <etal> element for this.
- Avoid doubled ref-list element (#7990). Previously when generating JATS with the element_citations extension enabled, the references were put in a doubly-nested ref-list element (<ref-list><ref-list>...).
- Keep edition info in element citations (#7993, Albert Krewinkel).
- Fix handling of CSL variable ‘page’ (not ‘pages’ as we had before). It should go to ‘lpage’ and ‘rpage’, not ‘page-range’.
-
EPUB writer: refactor for clarity (#7991, Jonathan Dönszelmann, Ola Wolska, Ivar de Bruin, Jaap de Jong).
-
Custom writer (Albert Krewinkel):
- Support new-style Writer function (Albert Krewinkel). See the documentation for custom writers for details.
- Produce stacktrace if Writer function fails
-
Text.Pandoc.Logging: add CouldNotParseIncludeFile constructor for LogMessage [API change].
-
Text.Pandoc.Shared:
- Put id attributes on TOC entries (#7907, damon-sava-stanley). The id is “toc-” followed by the id of the linked-to header or section. Affects HTML, Markdown, PowerPoint, and RTF.
- Define ordNub as alias for nubOrd from containers package (#7963, Albert Krewinkel).
- Export ensureValidXmlIdentifiers. This function changes identifiers that don’t start with letters, and internal links to these identifiers, making them compatible with XML standards. The change is simple: we add id_ to the front. There is potential for duplication if there are already id_... identifiers defined, but this seems rare enough not to worry too much about.
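The identifier rewrite described for ensureValidXmlIdentifiers can be sketched in Python. This is an illustration of the stated rule only, not the Haskell function itself: both function names are hypothetical, and using `str.isalpha` as the test for "starts with a letter" is an assumption.

```python
def ensure_valid_xml_identifier(ident: str) -> str:
    """Prefix identifiers that do not start with a letter with 'id_'."""
    if ident and not ident[0].isalpha():
        return "id_" + ident
    return ident


def fix_internal_link(target: str) -> str:
    """Apply the same rewrite to internal link targets of the form '#id'."""
    if target.startswith("#"):
        return "#" + ensure_valid_xml_identifier(target[1:])
    return target


if __name__ == "__main__":
    print(ensure_valid_xml_identifier("1"), fix_internal_link("#1"))
```

As the entry notes, this scheme can collide with an existing identifier that already starts with id_; the sketch makes no attempt to detect that case.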
-
Ensure that valid XML identifiers are used in Docbook, EPUB, FB2, HTML4, S5, Slidy, Slideous, ICML, ODT, TEI writers. Thus, if you convert [anchor]{#1} and [link to](#1), id_1 will be used instead of 1 for the identifier.
-
Lua (Albert Krewinkel).
- Add module pandoc.layout to format and layout text.
- Move custom writer code into Lua hierarchy.
- Use pandoc-lua-marshal 0.1.5.
- Allow any type of callable object as argument to List functions filter, map, and find_if. These previously required the argument to be of...