pandoc 3.0
-
Split pandoc-server, pandoc-cli, and pandoc-lua-engine into separate packages (#8309). Note that installing the pandoc package from Hackage will no longer give you the pandoc executable; for that you need to install pandoc-cli.
-
Pandoc now behaves like a Lua interpreter when called as pandoc-lua or when pandoc lua is used (#8311, Albert Krewinkel). The Lua API that is available in filters is automatically available to the interpreter. (See the pandoc-lua man page.)
-
Pandoc behaves like a server when called as pandoc-server or when pandoc server is used. (See the pandoc-server man page.)
-
A new command-line option, --list-tables, causes tables to be formatted as list tables in RST (#4564, with Francesco Occhipinti).
-
New command-line option: --epub-title-page=true|false allows the EPUB title page to be omitted (#6097).
-
--reference-doc can now accept a URL argument (#8535) and load a remote reference doc.
-
--version output no longer contains version info for dependent packages. Instead, it contains a “Features” line that indicates whether the binary was compiled with support for acting as a server, and for using Lua filters and custom writers.
-
A new option --split-level replaces --epub-chapter-level and affects both EPUB and chunked HTML output. --epub-chapter-level will still work but is deprecated.
-
Multiple input files with --file-scope: fix case where the links are URL-encoded, e.g. with %20 (#8467).
-
Produce error if --csl is used more than once (#8195, Prat).
-
Remove deprecated --atx-headers option.
-
Remove deprecated option --strip-empty-paragraphs.
-
In --verbose mode, add message when running citeproc (as with other filters).
-
Add new mark extension for highlighted text in Markdown, using == delimiters (#7743).
-
Add new extensions wikilinks_title_after_pipe and wikilinks_title_before_pipe for commonmark and markdown (#2923, Albert Krewinkel). The former enables links of style [[Name of page|Title]] and the latter [[Title|Name of page]]. Titles are optional in both variants, so this works for both: [[https://example.org]], [[Name of page]]. The writer is modified to render links with title wikilink as a wikilink if a respective extension is enabled. Pandoc will use wikilinks_title_after_pipe if both extensions are enabled.
-
Add prefixes to identifiers with --file-scope (#6384). This change only affects the case where --file-scope is used and more than one file is specified on the command line. In this case, identifiers will be prefixed with a string derived from the file path, to disambiguate them. For example, an identifier foo in contents/file1.txt will become contents__file1.txt__foo. Links will be adjusted accordingly: if file2.txt links to file1.txt#foo, then the link will be changed to point to #file1.txt__foo. Similarly, a link to file1.txt will point to #file1.txt. A Div with an identifier derived from the file path will be added around each file’s content, so that links to files will still work.
-
New output format: chunkedhtml. This creates a zip file containing multiple HTML files, one for each section, linked with “next,” “previous,” “up,” and “top” links. (If -o is used with an argument without an extension, it is treated as a directory and the zip file is automatically extracted there, unless it already exists.) The top page will contain a table of contents if --toc is used. A sitemap.json file is also included. The option --split-level determines the level at which sections are to be split.
-
Support complex figures (Albert Krewinkel, Aner Lucero). There is now a dedicated Figure block constructor for figures. The old hack of representing a figure as Para [Image attr [..alt..] (source, "fig:title")] has been dropped. Here is a summary of figure support in different formats:
- Markdown reader: paragraphs containing just an image are treated as figures if the implicit_figures extension is enabled. The identifier is used as the figure’s identifier and the image description is also used as figure caption; all other attributes are treated as belonging to the image.
- Markdown writer: figures are output as implicit figures if possible, via HTML if the raw_html extension is enabled, and as Div elements otherwise.
- HTML reader: <figure> elements are parsed as figures, with the caption taken from the respective <figcaption> elements.
- HTML writer: the alt text is no longer constructed from the caption, as was the case with implicit figures. This reduces duplication, but comes at the risk of images that are missing alt texts. Authors should take care to provide alt texts for all images. Some readers, most notably the Markdown reader with the implicit_figures extension, add a caption that’s identical to the image description. The writer checks for this and adds an aria-hidden attribute to the <figcaption> element in that case.
- JATS reader: The <fig> and <caption> elements are parsed into figure elements, even if the content is more complex.
- JATS writer: The <fig> and <caption> elements are used to write figures.
- LaTeX reader: support for figures with non-image contents and for subfigures.
- LaTeX writer: complex figures, e.g. with non-image contents and subfigures, are supported. The subfigure template variable is set if the document contains subfigures, triggering the conditional loading of the subcaption package. Contents of figures that contain tables are unwrapped, as longtable environments are not allowed within figures.
- DokuWiki, Haddock, Jira, Man, MediaWiki, Ms, Muse, PPTX, RTF, TEI, ZimWiki writers: Figures are rendered like Div elements.
- AsciiDoc writer: The figure contents are unwrapped; each image in the figure becomes a separate figure.
- Classic custom writers: Figures are passed to the global function Figure(caption, contents, attr), where caption and contents are strings and attr is a table of key-value pairs.
- ConTeXt writer: Figures are wrapped in a “placefigure” environment with \startplacefigure/\endplacefigure, adding the features caption and listing title as properties. Subfigures are placed in a single row with the \startfloatcombination environment.
- DocBook writer: Uses mediaobject elements, unless the figure contains subfigures or tables, in which case the figure content is unwrapped.
- Docx writer: figures with multiple content blocks are rendered as tables with style FigureTable; like before, single-image figures are still output as paragraphs with style Figure or Captioned Figure, depending on whether a caption is attached.
- DokuWiki writer: Caption and “alt-text” are no longer combined. The alt text of a figure will now be lost in the conversion.
- FB2 writer: The figure caption is added as alt text to the images in the figure; pre-existing alt texts are kept.
- ICML writer: Only single-image figures are supported. The contents of figures with additional elements get unwrapped.
- OpenDocument writer: A separate paragraph is generated for each block element in a figure, each with style FigureWithCaption. Behavior for single-image figures therefore remains unchanged.
- Org writer: Only the first element in a figure is given a caption; additional block elements in the figure are appended without any caption being added.
- RST writer: Single-image figures are supported as before; the contents of more complex images become nested in a container of type float.
- Texinfo writer: Figures are rendered as float with type figure.
- Textile writer: Figures are rendered with the help of HTML elements.
- XWiki: Figures are placed in a group.
-
Changes in custom readers/writers:
- It is now possible to have a custom reader and a custom writer for a format together in the same file. The file may also define a custom template for the writer.
- Pandoc now checks the folder custom in the user’s data directory for a matching script if it can’t find one in the local directory. Previously, the readers and writers data directories were searched for custom readers and writers, respectively. Scripts in those directories must be moved to the custom folder.
- Custom readers used to implement a fallback behavior that allowed consuming just a string value as input to the Reader function. This has been removed; the first argument is now always a list of sources. Use tostring on that argument to get a string.
-
New module Text.Pandoc.Writers.ChunkedHTML, exporting writeChunkedHtml [API change].
-
We now set the pandoc-version variable centrally rather than in the writers. One effect is that the man writer now emits a comment with the pandoc version.
-
pandoc-server:
- Add simple CORS support to pandoc-server (#8427).
- Print message to stderr when starting the server.
-
Docx reader:
-
ODT reader:
-
DocBook reader:
-
JATS reader:
- Handle uri element in references (#8270).
-
Ipynb reader:
- Add cell id to attachment filename when storing in MediaBag (#8415). Otherwise attachments with the same name can overwrite each other.
-
LaTeX reader:
- Skip parenthesized args of toprule, midrule, etc (#8242).
- Handle ## macro arguments properly (#8243).
- Remove unused function toksToString in Parsing module.
- Support more soul commands, including \hl.
- Add unnumbered class for \part* (#8447).
- Fix TEXINPUTS handling (#8392). If TEXINPUTS ends with :, then the system default TEXINPUTS is added. We handle this by just adding the working directory in this case.
- Parse short table caption (see jgm/pandoc-types#103). This is not too useful yet, because writers don’t do anything with the short caption.
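The TEXINPUTS rule above can be sketched as follows. This is a minimal illustration in Python, not pandoc's Haskell code: the function name texinputs_dirs, the sep parameter, and the choice to drop empty components in the middle of the value are all assumptions made for the example.

```python
from typing import List


def texinputs_dirs(value: str, sep: str = ":") -> List[str]:
    """Split a TEXINPUTS-style value into search directories.

    Assumption for this sketch: a trailing separator, which normally means
    "append the system default", is approximated by appending the working
    directory ".", as the changelog entry describes.
    """
    dirs = [d for d in value.split(sep) if d]  # drop empty components
    if value.endswith(sep):
        dirs.append(".")  # stand-in for the system default search path
    return dirs


print(texinputs_dirs("/home/me/tex:/opt/styles:"))
print(texinputs_dirs("/home/me/tex"))
```

The first call ends with the separator, so the working directory is appended; the second call returns only the listed directory.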
-
MediaWiki reader:
- Parse table cell with attributes, to support rowspan, colspan (#8231, Ruqi).
- Refine “blending” rules for MediaWiki links (#8525, Ruqi). The rules for “blending” characters outside a link into the link are described here: https://en.wikipedia.org/wiki/Help:Wikitext#Blend_link These pose a problem for CJK languages, which generally don’t have spaces after links. However, it turns out that the blending behavior, as implemented on Wikipedia, is (contrary to the documentation) only for ASCII letters. This commit implements that restriction, which fixes the problem for CJK.
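The ASCII-only blending restriction can be illustrated with a short Python sketch. It is not the reader's actual implementation; split_link_trail is a hypothetical helper that only shows which characters after a link would be blended into it under the rule described above.

```python
import string
from typing import Tuple

ASCII_LETTERS = set(string.ascii_letters)


def split_link_trail(text_after_link: str) -> Tuple[str, str]:
    """Split the text following a link into (blended, rest).

    Only a leading run of ASCII letters is blended into the link;
    anything else, including CJK characters, stays outside it.
    """
    i = 0
    while i < len(text_after_link) and text_after_link[i] in ASCII_LETTERS:
        i += 1
    return text_after_link[:i], text_after_link[i:]


print(split_link_trail("s are fun"))
print(split_link_trail("是一种语言"))
```

With this rule, the "s" after a link such as [[cat]] becomes part of the link text, while CJK text directly after a link is left untouched.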
-
HTML reader:
- Fix regression for <tt> (#8330). It was no longer being parsed as Code (Justin Wood).
-
RST reader:
- Support mark role for round-trip.
-
Textile reader:
-
Markdown reader:
-
Allow fenced code block “bare” language to be combined with attributes (#8174, Siphalor), e.g.
```haskell {.class #id} ```
-
Allow table caption labels to start with lowercase t (#8259).
-
Grid tables: allow specifying a table foot by enclosing it with part separator lines, i.e., row separator lines consisting only of + and = characters (#8257, Albert Krewinkel). E.g.:

    +------+-------+
    | Item | Price |
    +======+=======+
    | Eggs | 5£    |
    +------+-------+
    | Spam | 3£    |
    +======+=======+
    | Sum  | 8£    |
    +======+=======+
-
Fix implicit_header_references with duplicate headings (#8300). The documentation says that when more than one heading has the same text, an implicit reference [Heading text][] refers to the first one. Previously pandoc linked to the last one instead. This patch makes pandoc conform to the documented behavior.
-
Parse highlighted text inside ==..== if the mark extension is enabled.
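The implicit_header_references fix in this list amounts to a "first heading wins" lookup. The Python sketch below illustrates that rule only; build_heading_index and the sample identifiers are hypothetical and do not mirror the reader's internals.

```python
from typing import Dict, Iterable, Tuple


def build_heading_index(headings: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map heading text to an identifier, keeping the first occurrence."""
    index: Dict[str, str] = {}
    for text, ident in headings:
        # setdefault leaves an existing entry alone, so later duplicates
        # never overwrite the first heading with the same text.
        index.setdefault(text, ident)
    return index


# Illustrative identifiers; the duplicate heading gets a distinct one.
headings = [("Intro", "intro"), ("Usage", "usage"), ("Intro", "intro-1")]
index = build_heading_index(headings)
print(index["Intro"])
```

A plain assignment (index[text] = ident) would reproduce the old behavior, where the last heading with a given text won.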
-
-
Org reader:
-
BibTeX reader:
- Fix handling of % in url field (#7678). % does not function as a comment character inside url (where URL-encoding is common).
- Allow url field in bibtex as well as biblatex (#8287). This field is not officially supported for BibTeX, but many styles can handle it (https://www.bibtex.com/f/url-field/), and others will ignore it.
- Support software type in biblatex <-> CSL conversions (#8504).
- Make sure version field comes through in biblatex (#8504).
-
BibTeX writer:
- Pass through url even for bibtex (#8287).
-
Org writer:
-
EndNote reader:
- Better error when parsing EndNote references fails.
-
DocBook writer:
- Rename Text.Pandoc.Writers.Docbook -> Text.Pandoc.Writers.DocBook. Rename writeDocbook -> writeDocBook, for consistency with the DocBook reader’s naming. [API change]
- Fix position of textobject (#8437). It is a child of inlinemediaobject, not imageobject.
- Add regression tests for #8437.
- Render image alt text using textobject element (#8437).
- Don’t indent contents of title element.
- Store “unnumbered” class in DocBook role attribute (#1402, lifeunleaded).
-
ConTeXt writer (Albert Krewinkel):
- Support syntax highlighting for code.
- Always use \type for inline code. Inline code that contained curly braces was previously rendered with \mono; this led to unexpected results when the presentation of \type was customized, as those changes would not have been applied to code rendered with \mono.
- Add support for unlisted, unnumbered headings (#8486).
- Support tagging extension (Albert Krewinkel). Paragraphs are enclosed by \bpar and \epar commands, and highlight commands are used for emphasis. This results in much better tagging in PDF output.
-
LaTeX writer:
- Do not repeat caption on headless tables (Albert Krewinkel). The caption of headless tables was repeated on each page that contained part of the table. It is now made part of the “first head”, i.e. the table head that is printed only once.
- Add separator line between table’s body and its foot (Albert Krewinkel).
- Ignore languages with no babel equivalent, instead of generating an invalid command in the preamble (#8325).
- Use \includesvg for SVGs and include the svg package (#8334).
- Use soul instead of ulem for strikeout, underline (#8411). This handles things like hyphenation, line breaks, and nonbreaking spaces better.
- Use \toprule\noalign{} instead of \toprule() in tables, and similarly for \midrule and \bottomrule (#8223). This facilitates redefining \toprule, \midrule, and \bottomrule without needing to gobble the ()s. (Those who redefine these macros on the assumption that they will be followed by () may need to change their definitions.)
- Support highlighted text for Span with class mark.
-
JATS writer:
- Use <break/> for LineBreak in the limited contexts that accept it (#8344).
- Officially deprecate writeJATS in favor of writeJatsArchiving.
-
RTF writer:
- Add space after unicode escape commands (#8264). This fixes a bug that caused characters to disappear after unicode escapes.
-
RST writer:
-
Commonmark writer:
- Ensure that we don’t have blank lines in raw HTML (#8307).
-
HTML writer:
- Only add role attribute in HTML5 (#8241). It is not valid in HTML4.
- Avoid aria-hidden in code blocks for HTML4 (#8241).
- Only treat . . . as a slide pause in slides, and not in regular HTML output (#8281).
- Properly merge classes for headings of level > 6 (#8363).
- Prevent <a> inside <a> (#7585). If a link text contains a link, we replace it with a span.
- Replace deprecated aria roles for bibliography entries (#8354). doc-biblioentry -> listitem, doc-bibliography -> list.
- Remove obsolete stuff about mathml-script. This was a shim we used to include for mathml support. We don’t do anything with this any more, so this is dead code.
- Include math links if there are raw commands or environments that can be interpreted as math e.g. by MathJax (#8469).
- Add prooftree to list of math environments (#8462). This will cause raw LaTeX prooftree environments to be rendered appropriately when --mathjax is used.
-
HTML, Markdown writers: filter out empty class attributes (#8251). These should not be generated by any pandoc readers, but they might be produced programmatically.
-
Markdown writer:
- Avoid HTML fallbacks in the generated TOC (Albert Krewinkel, #8131). The generated table of contents usually has IDs for each TOC link, allowing to link back to specific parts of the TOC. However, this leads to unidiomatic markup in formats like gfm, which do not support attributes on links and hence fall back to HTML. The IDs on TOC items are now removed in that case, leading to more aesthetic TOCs.
- Escape ! before [ (#8254).
- Support mark extension.
-
AsciiDoc writer:
-
ODT writer:
- Fix relative links (#3524).
-
Docx writer:
- Better handling of tables in lists (#5947). Previously the content of each list cell was indented when the table belonged to a list item.
- Indent tables in list items (#5947).
- Adjust correct attribute on lang element (#7022). For East Asian languages, we need to adjust w:eastAsia rather than w:val. This allows normal fonts to be used for any Latin-font text. Similarly, for bidi languages, we need to adjust w:bidi rather than w:val. We treat he and ar as bidi languages, and zh, ja, ko as East Asian languages.
- Support relative image widths (Albert Krewinkel). Image widths given in percent are interpreted to be relative to the text width. Previously, percent widths were taken relative to the image’s native size, inconsistently with other writers.
- Avoid using ‘error’ for unassigned table cells (#8468). Instead, throw a regular pandoc error.
- Render a Span with class mark as highlighted. Currently yellow is hardcoded.
-
MediaWiki writer:
- Use the ‘new’ table structure, so that colspan and rowspan are supported (Wout Gevaert).
-
Man writer:
- Use UTF-8 by default for non-ascii characters (#8507). Only use groff escapes if --ascii has been specified on the command line (writerPreferAscii).
-
ICML writer:
- Use Contents element for images with raw data instead of a link with a data: uri (#8398).
-
EPUB writer:
- Refactor to use Text.Pandoc.Chunks.
- Refactored and simplified code.
- Make title page optional (#6097).
-
Ms writer:
- Properly format display equations (#8308).
- Remove -C option on PSPIC. Some old versions don’t support this option, and since it’s the default it shouldn’t be necessary.
-
XWiki writer:
- Use template if it is specified (#8296). Previously templates were ignored.
-
LaTeX template:
- Set fonts after Beamer theme (Jeremie Knuesel). Beamer themes such as metropolis and saintpetersburg change the default fonts. This change gives precedence to the user font settings by moving them after the loading of the Beamer theme.
- Set \babelfont when mainlang and lang are specified and pdflatex is not being used (#8538). This is needed for good results in Arabic.
- Add variable urlstyle (#8429, Amar Al-Zubaidi). This is set to same by default, so users should not experience any change.
-
HTML template:
- Remove default font size, line height and font family in default inline css (#8423). mainfont, fontsize, and linestretch can still be used as before; the only difference is that we no longer provide opinionated defaults. This commit also adds a maxwidth variable that sets max-width; if not set, 36em is used as a default.
- Add code { hyphens: manual; }.
- Use styles.citations.html partial in styles.html.
- Fix class name hanging -> hanging-indent in styles.citations.html.
- Put Consolas before Lucida Console for code font (#8543). This is to prevent Lucida Console from being used on Windows, where it causes spacing issues in some applications, with boldface glyphs wider than regular ones.
-
EPUB CSS changes: Reduce the amount of inline CSS used for EPUBs (#8379). Almost everything is now in the default EPUB CSS (data/epub.css), which can be overridden either by putting epub.css in the user data directory or by using --css on the command line. Inline styles are only used for syntax highlighting (which depends on the style specified, and is only included on pages with highlighted code) and for bibliography formatting (which can depend on the CSL style, and is only used in the page containing the bibliography).

Note that, for compatibility with older readers, we don’t use flexbox to style column/columns divs by default, as we do in HTML. Instead, we use an older method which only works when there are two column divs inside a columns div. If you need more than two columns and aren’t worried about support for older EPUB readers, you can modify the default CSS (there is a comment in the CSS telling you what to do).
-
Reveal.js template: prevent line-wrapping of parallax options (#8503, Albert Krewinkel).
-
reference.pptx: Remove unsupported element (#8342, #6338, Link Swanson). The default template contained text above the header, which can mislead users into thinking there is a way to put text there using pandoc.
-
Text.Pandoc.Readers.Metadata:
-
Text.Pandoc.App:
- Move initial input-to-Pandoc code to internal submodule (Albert Krewinkel).
- Change parseOptionsFromArgs and parseOptions (#8406). They now return Either OptInfo Opt. [API change]
- Add OptInfo type [API change].
- Add handleOptInfo function. This performs the IO actions for things like --version that were previously done in parseOptionsFromArgs [API change].
- convertWithOpts: add argument for a ScriptingEngine [API change].
- Unify check for standalone output (Albert Krewinkel).
- New optEpubTitlePage field on Opt [API change] (#6097).
- Remove optEpubChapterLevel, add optSplitLevel [API change].
- Export IpynbOutput(..) [API change].
-
Text.Pandoc.App.OutputSettings:
- Remove unused field outputWriterName in OutputSettings.
-
Text.Pandoc.Citeproc:
- Check both extension and mime type to determine bibliography type when the bibliography is fetched remotely (#7151).
- CslJson: allow an object with items property in addition to an array of references. This is what is returned by e.g. https://api.zotero.org/groups/904125/items?v=...&format=csljson
- Require a digit for an implicit “page” locator inside explicit locator syntax {...} (#8288). Previously a locator specified as {} would be rendered as p. with nothing after it.
- Update sub verbo to sub-verbo (#8315). This is a change in the term’s canonical name in citeproc. As a result of this change, sub verbo locators have not worked in pandoc since citeproc 0.7.
- Text.Pandoc.Citeproc.MetaValue: remove unused function metaValueToPath.
- Add internal module Text.Pandoc.Citeproc.Name (#8345). This exports toName, which previously had been part of T.P.Citeproc.BibTeX, and allows for cleaner module dependencies.
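The CslJson change in this list (accepting either a bare array or an object with an items property) can be sketched in Python as below. The function extract_references and the sample data are hypothetical and only illustrate the two accepted shapes; pandoc's own decoder is written in Haskell and may differ in detail.

```python
import json
from typing import Any, Dict, List


def extract_references(csl_json_text: str) -> List[Dict[str, Any]]:
    """Return the list of references from either accepted CSL JSON shape."""
    data = json.loads(csl_json_text)
    if isinstance(data, list):
        return data  # plain array of references
    if isinstance(data, dict) and isinstance(data.get("items"), list):
        return data["items"]  # object wrapping the array under "items"
    raise ValueError("expected an array or an object with an 'items' array")


bare = '[{"id": "doe2020", "type": "book"}]'
wrapped = '{"items": [{"id": "doe2020", "type": "book"}]}'
print(extract_references(bare) == extract_references(wrapped))
```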
-
Export module Text.Pandoc.Slides [API Change] (Albert Krewinkel).
-
Add new module Text.Pandoc.Format [API change] (Albert Krewinkel). The module provides functions and types for format spec parsing and processing. The function parseFormatSpec was moved from Text.Pandoc.Extensions to the new module and renamed to parseFlavoredFormat. It now operates in a PandocMonad and is based on the updated types.
-
Text.Pandoc.Sources:
- Add UpdateSourcePos instances for String and strict and lazy ByteString [API change].
-
Text.Pandoc.Extensions:
- Fix JSON decoding of Extensions (#8352, Albert Krewinkel).
- Add new exported function readExtension [API change].
- Remove parseFormatSpec [API change]. This has been moved to Text.Pandoc.Format and renamed as parseFlavoredFormat (Albert Krewinkel).
- Simpler implementation of Extensions based on Set (benchmarks show no performance penalty).
- Add CustomExtension constructor to Extension [API change].
- Remove Bounded, Enum instances for Extension.
- Add extensionsToList function.
- Revise readExtension so it can handle CustomExtension, and so that it returns a Text rather than Maybe Text.
- Add showExtension [API change].
- Add Ext_mark extension [API change].
- Add Ext_tagging constructor [API change] (Albert Krewinkel).
- Add Ext_wikilinks_title_after_pipe, Ext_wikilinks_title_before_pipe [API change] (Albert Krewinkel).
-
Text.Pandoc.PDF:
-
Text.Pandoc.MIME:
-
Text.Pandoc.XML:
- Re-export lookupEntity from commonmark-hs [API change].
-
Text.Pandoc.Parsing:
- Remove gratuitous renaming of Parsec types. We were exporting Parser, ParserT as synonyms of Parsec, ParsecT. There is no good reason for this and it can cause confusion. Also, when possible, we replace imports of Text.Parsec with Text.Pandoc.Parsing. The idea is to make it easier, at some point, to switch to megaparsec or another parsing engine if we want to. New (re-)exports: Stream(..), updatePosString, SourceName, Parsec, ParsecT. Removed exports: Parser, ParserT [API change].
- Export errorMessages, messageString [API change].
- Export fromParsecError, which can be used to turn a parsec ParseError into a regular PandocParseError (#8382) [API change].
- Remove nested [API change]. It was not being used, and in fact it was a bad idea from the beginning, as it had no hope of solving the problem it was introduced to solve.
- Change characterReference, charsInBalanced. characterReference now returns a Text (some named references don’t correspond to a single Char). Use the lookupEntity function from commonmark-hs instead of the slow one from tagsoup [API change].
- charsInBalanced now takes a Text parser rather than a Char parser as argument [API change].
-
Text.Pandoc.Shared:
- Export textToIdentifier [API change].
- Remove deprecated crFilter. [API change]
- Remove deprecated deLink. [API change]
- Deprecate notElemText.
- Deprecate makeMeta.
- Remove pandocVersion (now available in Text.Pandoc.Version as pandocVersionText).
- Remove findM [API change]. This was only used in one place, and can be replaced with simpler code.
- Remove deprecated makeMeta [API change].
- Remove ordNub [API change]. This is just nubOrd from Data.Containers.ListUtils.
- Remove mapLeft [API change]. This is just a synonym for Bifunctor.first.
- Remove elemText, notElemText [API change].
- Drop export of pandocVersion and pandocVersionText, which are now exported by Text.Pandoc.Version.
- Remove escapeURI, isURI. These are now exported by Text.Pandoc.URI, and removing them from Shared helps make the module structure more straightforward.
- Use LineBreak as default block sep in blocksToInlines (#8499, Albert Krewinkel). This change also affects the pandoc.utils.blocks_to_inlines Lua function.
- defaultUserDataDir is no longer exported (it has been moved to T.P.Data) [API change].
- New function figureDiv, offering a standardized way to convert a figure into a Div element (Albert Krewinkel) [API change].
-
Text.Pandoc.Writers.Shared:
- Export htmlAddStyle, htmlAlignmentToString and htmlAttrs [API change] (Wout Gevaert).
- Use ‘literal tag’ instead of ‘text (T.unpack tag)’ in tagWithAttrs (Wout Gevaert).
- toTableOfContents: handle nested Divs better (#8402).
-
Rename Text.Pandoc.Network.HTTP -> Text.Pandoc.URI. This is still an unexported internal module. Export urlEncode, escapeURI, isURI, schemes, uriPathToPath. Drop exports of schemes and uriPathToPath.
-
Text.Pandoc.URI isURI: don’t require non-ASCII characters to be escaped (#8508).
-
Rename Text.Pandoc.Readers.LaTeX.Types -> Text.Pandoc.TeX (internal module).
-
Text.Pandoc.Options:
- WriterOptions now has a field writerListTables, specifying that list tables be used in RST output [API change].
- New writerEpubTitlePage field on WriterOptions (#6097) [API change].
- Remove writerEpubChapterLevel, add writerSplitLevel [API change].
-
Text.Pandoc.Filter:
- Export applyFilters [API change].
- Export applyJSONFilter [API Change] (Albert Krewinkel).
- Parameterize applyFilters over scripting engine [API change] (Albert Krewinkel).
-
New exported module Text.Pandoc.Chunks [API change]. This module provides functions to split Pandoc documents into chunks to be rendered in separate files, e.g. one per section. Internal identifiers are rewritten appropriately to point to the new locations (#6122).
-
Text.Pandoc.Readers:
- Change argument type of getReader, so it takes a FlavoredFormat instead of a Text [API change] (Albert Krewinkel).
-
Text.Pandoc.Writers:
- Change argument type of getWriter, so it takes a FlavoredFormat instead of a Text [API change] (Albert Krewinkel).
-
Text.Pandoc.Templates:
- Do not try to normalize input to getDefaultTemplate (Albert Krewinkel). The function getDefaultTemplate no longer splits off extension modifiers from the given format, as that conflicts with using custom writers as formats. Haskell library users should use getDefaultTemplate <=< (fmap formatName . parseFlavoredFormat) if the input format can still contain extensions. The same is true for compileDefaultTemplate, which calls getDefaultTemplate internally.
- Add Wrapper type documentation (#8490, William Rusnack).
-
New exported module Text.Pandoc.Scripting (Albert Krewinkel). The module contains the central data structure for scripting engines (e.g., Lua) [API change].
-
Text.Pandoc.Error:
- Add new PandocError constructor PandocNoScriptingEngine [API change] (Albert Krewinkel).
- Add new PandocError constructor PandocFormatError [API change] (Albert Krewinkel). The new error is used to report problems with input or output format specifications.
- Add new PandocError constructor PandocNoTemplateError (Albert Krewinkel).
- Remove PandocParsecError constructor from PandocError (#8385). Henceforth we just use PandocParseError.
-
New module Text.Pandoc.Version, exporting pandocVersionText and pandocVersion [API change]. pandocVersion returns a Version instead of a Text, which is consistent with pandocTypesVersion.
-
Text.Pandoc.Class:
- Make getPOSIXTime, getZonedTime sensitive to SOURCE_DATE_EPOCH environment variable if set (#7093). (getTimestamp was already sensitive.) This ensures that EPUB builds are reproducible.
- Text.Pandoc.Class no longer exports readDataFile, readDefaultDataFile, setTranslations, and translateTerm [API change].
- Text.Pandoc.Class now exports checkUserDataDir [API change].
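The idea behind honoring SOURCE_DATE_EPOCH can be shown with a small Python sketch. It is not the code in Text.Pandoc.Class: the function current_time is hypothetical, and falling back to the real clock when the variable is missing or malformed is an assumption of this example.

```python
import os
from datetime import datetime, timezone
from typing import Mapping


def current_time(environ: Mapping[str, str] = os.environ) -> datetime:
    """Return a build timestamp, pinned by SOURCE_DATE_EPOCH when set."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is not None:
        try:
            # The variable holds seconds since the Unix epoch, in UTC.
            return datetime.fromtimestamp(int(raw), tz=timezone.utc)
        except ValueError:
            pass  # assumption: ignore a malformed value and use the clock
    return datetime.now(tz=timezone.utc)


pinned = current_time({"SOURCE_DATE_EPOCH": "1674000000"})
print(pinned.isoformat())
```

Because every timestamp written into the output comes from the same fixed value, two builds of the same input can produce identical files.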
-
T.P.Class.IO: export function writeMedia [API change] (Albert Krewinkel). This is useful for the pandoc.mediabag module.
-
Separate out Text.Pandoc.Data and Text.Pandoc.Translations from Text.Pandoc.Class (#8348). This makes Text.Pandoc.Class more self-contained.
- Text.Pandoc.Data is now an exported module, providing readDataFile and readDefaultDataFile (both formerly provided by Text.Pandoc.Class), and also getDataFileNames (formerly unexported in Text.Pandoc.App.CommandLineOptions) and defaultUserDataDir (formerly provided by Text.Pandoc.Shared). [API change]
- Text.Pandoc.Translations is now an exported module (along with Text.Pandoc.Translations.Types), providing readTranslations, getTranslations, setTranslations, translateTerm, lookupTerm, Term(..), and Translations [API change].
-
Text.Pandoc now exports Text.Pandoc.Data and setTranslations and translateTerm [API change].
-
Export module Text.Pandoc.Class.IO [API change]. The module is useful when defining instances of class PandocMonad for types that are also instances of MonadIO.
-
Remove modules Text.Pandoc.Writers.Custom and Text.Pandoc.Readers.Custom [API Change] (Albert Krewinkel). The functions writeCustom and readCustom are available from module Text.Pandoc.Lua.
-
Text.Pandoc.Server:
-
Split this module into a separate package, pandoc-server, allowing the pandoc library to be compiled without server support.
-
Return object if JSON is accepted. Previously we just returned a JSON-encoded string. Now we return something like:
    {
      "output": "<p>hello</p>",
      "base64": false,
      "messages": [
        {
          "message": "Not rendering RawInline (Format \"tex\") \"\\\\noe\"",
          "verbosity": "INFO"
        }
      ]
    }
This is a change in the pandoc-server JSON API.
-
Set translations in the writer based on lang metadata.
-
Return error in JSON object if response is JSON.
-
Remove parseServerOpts. [API change]
-
Text.Pandoc.Lua:
-
This module has been moved to a separate package, pandoc-lua-engine.
-
Export applyFilter, readCustom, and writeCustom. No longer export the lower-level function runFilterFile [API change].
-
Change type of applyFilter [API Change] (Albert Krewinkel). The module Text.Pandoc.Filter.Lua has been merged into Text.Pandoc.Lua. The function applyFilter now has type
    applyFilter :: (PandocMonad m, MonadIO m) => Environment -> [String] -> FilePath -> Pandoc -> m Pandoc
where Environment is defined in Text.Pandoc.Filter.Environment.
-
Export new function getEngine [API Change]. The function returns the Lua scripting engine.
-
Add unexported modules T.P.Lua.Reader, T.P.Lua.Writer. These contain the definitions of readCustom and writeCustom that were previously in T.P.Readers.Custom and T.P.Writers.Custom.
-
Clean up module dependencies, for a cleaner module dependency graph.
-
The writeCustom function has changed to return a Writer and an ExtensionsConfig [API change]. This allows ByteString writers to be defined.
-
The readCustom function has changed to return a Reader and an ExtensionsConfig [API change].
-
Lua subsystem (Albert Krewinkel):
- The whole Lua subsystem has been moved to a separate package, pandoc-lua-engine. pandoc does not depend on it. convertWithOpts has a new parameter that can be used to pass in the scripting engine defined in pandoc-lua-engine (or a different one, in theory).
- Fix the behavior of Lua “Version” objects under equality comparisons (#8267).
- Support running Lua with a GC-collected Lua state.
- Ensure that extensions marshaling is consistent.
- Produce more informative error messages for pandoc errors. Errors are reported in Lua in the same words in which they would be reported in the terminal.
- Add new module pandoc.format. The module provides functions to query the set of extensions supported by formats and the set of extensions enabled by default.
- Add function pandoc.template.apply.
- Add function pandoc.template.meta_to_context. The function converts Meta values to template contexts; the intended use is in combination with pandoc.template.apply.
- Allow Doc values in WriterOptions.variables. The specialized peeker and pusher functions for Context Text values do not go via JSON, and thus keep Doc values unchanged during round-tripping.
- Fix rendering of Lua errors in Lua, so that the Error running Lua message is not prepended multiple times.
- Add new module pandoc.zip.
- Allow strings in place of compiled templates (#8321). This allows a string to be used as a parameter to pandoc.template.apply and in the WriterOptions template field.
- Rename reader_extensions/writer_extensions globals as Extensions (#8390).
- Add pandoc.scaffolding.Writer (#8377). This can be used to reduce boilerplate in custom writers.
- Fix peeker for PandocError (Albert Krewinkel). String error messages were incorrectly popped off the stack when retrieving a PandocError.
- Add functions pandoc.text.toencoding, pandoc.text.fromencoding (#8512, Albert Krewinkel).
- Add pandoc.cli module. Allow processing of CLI options in Lua.
- Support -D CLI option for custom writers. A new error PandocNoTemplateError (code 87) is thrown if a template is required but cannot be found.
- Allow table structure as format spec. This allows structured values to be passed as format specifiers to pandoc.write and pandoc.read.
- Add function pandoc.mediabag.write (Albert Krewinkel).
- Add module pandoc.structure (Albert Krewinkel). The function make_sections has been given a friendlier interface and moved to the new module; the old pandoc.utils.make_sections
has been deprecated.
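A minimal sketch of the relocated make_sections, assuming pandoc 3.0 or later and a script run with `pandoc lua`. The sample Markdown input is made up for illustration, and only the number_sections option is shown:

```lua
-- Hypothetical sample input; any document with headings would do.
local doc = pandoc.read('# One\n\nFirst.\n\n## Two\n\nSecond.\n', 'markdown')

-- Options are passed as a table rather than as positional arguments.
local sections = pandoc.structure.make_sections(
  doc.blocks,
  {number_sections = true}
)

-- Each top-level section is a Div wrapping its header and content;
-- the header's identifier is moved to the Div.
for _, div in ipairs(sections) do
  print(div.t, div.identifier, div.classes[1])
end
```

Nested headings end up as Divs inside the content of their parent section, so only the top-level section appears in the loop above.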
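The two template functions listed above are meant to be used together. The following is a minimal sketch, assuming pandoc 3.0 or later and a script run with `pandoc lua`. The sample document and the to_text helper are hypothetical, and stringify is only one simple way to render metadata values:

```lua
-- Hypothetical sample document with a title in its metadata.
local doc = pandoc.read('---\ntitle: Release notes\n---\n\nBody.\n', 'markdown')

-- Hypothetical helper: render Blocks or Inlines found in the metadata
-- as plain text.
local function to_text(elements)
  return pandoc.utils.stringify(elements)
end

-- Convert the Meta values into a template context, then apply a
-- compiled template to that context.
local context = pandoc.template.meta_to_context(doc.meta, to_text, to_text)
local template = pandoc.template.compile('Title: $title$')
local rendered = pandoc.template.apply(template, context)

-- apply returns a Doc; render it to obtain a string.
print(pandoc.layout.render(rendered))
```

A real writer would more likely pass functions that render Blocks and Inlines in the target format instead of stringifying them.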
-
Custom writers:
-
The global variables PANDOC_DOCUMENT and PANDOC_WRITER_OPTIONS are no longer set when the writer script is loaded. Both variables are still set in classic writers before the conversion is started, so they can be used when they are wrapped in functions.
-
Deprecate classic custom writers.
-
Add function pandoc.write_classic. The function can be used to convert a classic writer into a new-style writer by setting it as the value of Writer:
    Writer = pandoc.write_classic
or to fully restore the old behavior:
    function Writer (doc, opts)
      PANDOC_DOCUMENT = doc
      PANDOC_WRITER_OPTIONS = opts
      -- loadfile runs the script file; load would treat the path as Lua source
      loadfile(PANDOC_SCRIPT_FILE)()
      return pandoc.write_classic(doc, opts)
    end
-
Support extensions in custom writers. Custom writers can define the extensions that they support via the global writer_extensions. The variable’s value must be a table with all supported extensions as keys, and their default status as values. For example, the below specifies that the writer supports the extensions smart and sourcepos, but only the smart extension is enabled by default:
    writer_extensions = {
      smart = true,
      sourcepos = false,
    }
-
Custom writers can define a default template via a global Template function; the data directory is no longer searched for a default template. Writer authors can restore the old lookup behavior with
    Template = function ()
      local template = pandoc.template
      return template.compile(template.default(PANDOC_SCRIPT_FILE))
    end
-
Custom readers:
-
Support extensions in custom readers. Custom readers, like writers, can define the set of supported extensions by setting a global. E.g.:
    reader_extensions = {
      smart = true,
      citations = false,
    }
-
Use latest versions of commonmark-extensions, texmath, citeproc, gridtables, and skylighting.
-
Use pandoc-types 1.23. This adds the Figure Block constructor and removes the Null Block constructor.
-
Require aeson >= 2.0.
-
Use jira-wiki-markup 1.5.0 (#8511, Albert Krewinkel). Fixes issues with icon-like sequences at the beginning of words.
-
Use doctemplates 0.11, avoiding a transitive dependency on HsYAML.
-
Use skylighting 0.13.1.2.
-
Allow mtl 2.3.1 (Alexander Batischev).
-
Use latest skylighting-format-context.
-
Allow building with mtl 2.3.
-
Remove lua53 flag. We now only support Lua 5.4.
-
Add hie.yaml for Haskell Language Server.
-
Add tools/latex-package-dependencies.lua.
-
Update default CSL with latest chicago-author-date.csl.
-
make_artifacts.sh: various small improvements.
-
Remove sample.lua from data files (#8356).
-
Documentation:
- Deprecate PANDOC_WRITER_OPTIONS in custom writers (Albert Krewinkel).
- Document pandoc.write_classic (Albert Krewinkel).
- Document new table features (Albert Krewinkel).
- Clarify what background-image does in reveal.js (#6450).
- Documentation improvements for blank_before_blockquote (#8324, Pranesh Prakash).
- Update grid table documentation (#8346).
- Add note about MathJax fonts to --embed-resources.
- Use cabal’s --package-env more (#8317, Artem Pelenitsyn).
- Modify Zerobrane instructions to use Lua 5.4 (#8353, Ian Max Andolina).
- Fix documentation for highlight-style in pandoc-server.md.
- Fix link to Fedora package site (#8246, Akos Marton).
- Rephrase paragraph on format extensions (#8375, Ilona Silverwood).
- Update README.template (#8496, Sven Wick).
- Fix a tiny typo in lua-filters.md (TomBen).
- Clarify that --css should be used with -s.
- Clarify font selection for pdf -t ms (#8421, nbehrnd).
- Clarify docs for --metadata-file (#8459).
- Fix typo in epub.md (Vladimir Alexiev).
- Add missing backtick in filters.md (R. N. West).
- doc/lua-filters.md: add documentation for pandoc.format (Albert Krewinkel).
- Fix epub-embed-font documentation (#8455, Terence Eden).
- Remove obsolete Templates section in CONTRIBUTING.md.
- Add manual section on accessible PDFs, archiving standards (#8312, Albert Krewinkel).
-
Tests.Command: remove unused runTest.
-
Add pandoc-lua.1 man page.
-
Improve shell.nix.
-
Add tools/moduledeps.lua for inspecting the internal module dependency tree.
-
Fix macOS zip so pandoc-server is a symlink. This cuts its size by 2x.
-
CI: Improve CI speed by caching more, eliminating macOS builds, and splitting benchmarks into a separate action, run by manual dispatch. (We still test that benchmarks build in the regular CI.) The cache can be expired manually by modifying the secret CACHE_VERSION.
-
Remove the unnecessary Setup.hs from pandoc. Cabal does not need this with build-type ‘simple’.
-
Add pandoc-lua and pandoc-server (symlinks) and their man pages to releases.
-
Use hslua-cli package for pandoc-lua interface (Albert Krewinkel).
-
Add server flag to pandoc-cli, allowing it to be compiled without server support.
-
pandoc-cli: Allow building a binary without Lua support (Albert Krewinkel). Disabling the lua cabal flag will result in a binary without Lua.
-
Move --version handling to pandoc-cli. We need it here in order to print information about whether server and Lua support have been compiled in.
-
Move nightly flag from pandoc to pandoc-cli (#8339).
-
Makefile changes:
- make help will now print all the targets and what they do.
- Add targets: coverage, weeder, moduledeps, prerelease, ghcid, repl, linecounts, hie.yaml, binpath.
- Note that you can alias pandoc=`make binpath` for convenient local testing of a build.
- Rename quick-cabal -> build, quick-test -> test.
- Exclude tests from SOURCEFILES.
-
Factor out xml-light into an internal library.
-
Add CITATION.cff (#8434).
-
Move trypandoc to a separate repository, jgm/trypandoc.