Releases: jgm/pandoc

pandoc 3.1.3

07 Jun 04:42
@jgm jgm
  • New input format: typst.

  • New module: Text.Pandoc.Readers.Typst [API change].

  • DocBook reader:

    • Support more emphasis roles (Albert Krewinkel). The role “bf” is taken to indicate “bold face”, i.e., “strongly emphasized” text, while “underline” leads to underlined text.
  • JATS reader:

    • Improve title and label parsing in the JATS reader (#8718, Noah Malmed).
    • Add rowspan, colspan and alignment to cells in JATS table reader (#8408, Noah Malmed).
  • Org reader (Albert Krewinkel):

    • Require abstract environment to use lowercase.
    • Treat #+NAME as synonym for #+LABEL (#8578).
  • ODT reader:

    • Allow lists in table cells (#8892).
    • Allow frames inside spans (#8886).
  • RST reader:

    • Fix sorting on anonymous keys (#8877). This fixes a link resolution bug affecting RST documents with anonymous links.
  • HTML reader:

    • Fix iframe with data URI of an image (#8856). In this case we don’t want to try to parse the data at the URL. Instead, create an image inside a div.
  • RTF reader:

    • Fix bug in table parsing (#8767). In certain cases, text before a table was being incorporated into the table itself.
  • Docx reader:

    • Introduce support for Intense Quote (Stephan Meijer).
  • Markdown reader:

    • Disallow escaping of ~ and " in markdown_strict (#8777, Albert Krewinkel). This matches the behavior of the legacy Markdown.pl as well as what is described in the manual.
  • LaTeX reader: ignore args to column type in \multicolumn (#8789).

  • HTML writer:

    • Use first paragraph in task item as checkbox label (#8729, Albert Krewinkel).
  • Ms writer:

    • Coerce titles to inlines (#8835). Block-level formatting is not allowed inside .TL.
  • LaTeX writer:

    • Fix width for multicolumn simple table (#8831).
  • Jira writer:

    • Use first code block class as highlighting language (#8814, Albert Krewinkel). The writer no longer searches the list of classes for a known programming language but always uses the first class in that list as the language identifier.
  • OpenDocument writer:

    • Handle row header column cells as header cells (#8764, Michael Stahl).
    • Fix invalid text:p inside text:p from meta (#8256).
  • ODT writer:

    • Don’t add settings.xml (Michael Stahl). This will cause defaults to be used, which is what we want.
    • Don’t add unnecessary Configurations2 directory (Michael Stahl).
    • Don’t add thumbnail (Michael Stahl).
    • Put manifest.version on directory file-entry (Michael Stahl). See ODF 1.3 part 2, 4.16.14.1.
    • Stop validator complaints by producing ODF 1.3 (Michael Stahl).
  • MediaWiki writer:

    • Remove links from inside links in mediawiki writer (#8739, Wout Gevaert).
  • Typst writer:

    • Omit bibliography if citations not enabled (#8763). With this change, the typst writer will omit the #bibliography command when citations is not enabled. (If you want to use pandoc’s own --citeproc, you should combine it with -t typst-citations to disable native typst citations.)
    • Use <..> for labels, create internal links.
    • Use #footnote for notes (#8893).
    • Fix alignment issue in lists. It’s an aesthetic issue only; the first line had an extra space indent after the list marker.
  • Commonmark writer:

    • Use shortcut reference links: commonmark supports these.
  • EPUB template: add lang attribute to <html> (Gabriel Lewertoski).

  • Template styles.html: fix task-list styling in reveal.js (#8731, Albert Krewinkel).

  • LaTeX template: Fix \babelfont (#8728).

  • Text.Pandoc.Parsing:

    • Remove unnecessary ‘spaces’ in parseFromString.
  • Text.Pandoc.ImageSize: Drop BOM at start of SVG if present. Otherwise our code can fail to determine image size.

  • Lua subsystem:

    • Fix value of PANDOC_SCRIPT_FILE for custom readers & writers (#8781, Albert Krewinkel). The value did not hold the actual file path for scripts in the custom folder of the datadir.
  • Fix YAML in translation files for cs and pl (#8787).

  • Fix pdf output via typst (#8754). One must now use typst compile rather than typst.

  • MANUAL.txt:

    • Added note that the user will need to create the user data dir (#8727).
    • Add wikilinks to non-default extensions (Ilona).
    • Update link to custom djot writer (Albert Krewinkel).
    • Better link to citation syntax.
    • Fix typo (sdhoward).
    • Note that # fancy list markers don’t work with commonmark (#8772, William Lupton).
    • Add commonmark fenced_div note (#8773, William Lupton).
    • Move highlighting documentation, with minor adjustments (William Lupton).
    • Fix inaccurate statement about spaces and tabs in template syntax (Frank Seifferth).
  • Update documentation for org-mode (Christian Christiansen, #8716).

  • doc/lua-filter.md:

    • Fix typos (#8734, perro tuerto).
    • Fix anchor (Toni Dietze).
    • Use full field name in example (#8857, Matt Dodson).
    • Fix copy-paste error (#8798, thron7).
  • CONTRIBUTING.md: update info on ghc versions.

  • INSTALL.md:

    • Fix cabal install instructions (Albert Krewinkel).
    • Use more relevant link to NetBSD/pkgsrc entry (Charlotte Koch).
    • Fix Windows install instructions for winget (#8799).
  • Tests: Rename test/docx/block_quotes_parse_indent.native for consistency (Stephan Meijer).

  • Add tls constraint on cabal.project. This is needed to avoid problems caused by the transition to crypton.

  • Require texmath 0.12.8.
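
The Text.Pandoc.ImageSize entry above concerns a byte-order mark (BOM) that precedes the SVG markup. As a rough illustration only, written in Python rather than pandoc's Haskell, stripping a UTF-8 BOM before inspecting the data might look like the sketch below. The function name drop_bom is hypothetical and not part of pandoc.

```python
import codecs

def drop_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte-order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# An SVG file saved with a BOM: the markup no longer starts at byte 0.
svg = codecs.BOM_UTF8 + b'<svg width="10" height="20"></svg>'
cleaned = drop_bom(svg)
```

Once the BOM is gone, code that expects the data to begin with the markup can find it again.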

pandoc 3.1.2

28 Mar 00:00
@jgm jgm
  • Add a Lua REPL (Albert Krewinkel). This can be started with pandoc lua -i. It is also possible to instruct a filter to open the REPL at a certain point, for debugging (see pandoc.cli.repl).

  • Support typst as a --pdf-engine.

  • Add typst writer (#8713). New module Text.Pandoc.Writers.Typst, exporting writeTypst [API change].

  • Org reader:

  • DocBook reader:

    • Handle “book” for xref references (#8712, Andres Freund). This also adds a test xref to book and part.
    • Handle <part> (#8712).
  • HTML reader:

    • Fix behavior with -native_spans-raw_html (#8711). Previously with this configuration, <span>s were not treated as inline elements at all.
  • HTML writer:

    • Avoid duplicate classes (#8705).
    • Use img element instead of embed for .svg.gz and .png.gz etc. (#8699).
    • HTML writer footnotes changes (#8695): when --reference-location=section or =block, use an aside element for the notes rather than a section. When --reference-location=section, include the aside element inside the section element, rather than outside. (In slide shows, this option causes footnotes on a slide to be displayed at the bottom of the slide.)
  • EPUB writer:

    • Use different structure for epub footnotes (#8676, see #8672, #5583). Many EPUB readers are thrown off by pandoc’s current footnote output. Both the ol and the fact that the footnote backlink is at the end of the note seem to pose problems. With this commit, we now create a list of aside (or div) elements, instead of an ordered list. Each element begins with a note number that is linked back to the note reference. (So, the backlink occurs at the beginning rather than the end.) Thanks to @Porges and @lewer.
  • Docx writer:

    • Include abstract title (#8702). Uses localized term for abstract.
  • Markdown writer:

    • Use implicit figures if there’s a caption but no alt (#8689, Albert Krewinkel).
  • Jira reader (Albert Krewinkel):

    • Add panel title as nested div (#8681).
    • Require jira-wiki-markup 1.5.1 (#8680). This fixes a bug in the parser that caused text between two exclamation marks to be parsed as an image. The first ! of image markup must now be followed by a non-space character; otherwise, the enclosed text is parsed as normal content.
  • Ms writer:

    • Fix handling of Figure (#8660).
  • ICML writer:

    • Fix images with data (#8675). The Contents element should be inside Properties.
  • LaTeX writer:

    • Add Chinese to Babel languages.
    • Fix background image in Beamer when there are figure environments (#8671, Martín Pozo).
  • LaTeX template:

    • Add babelfonts variable to default LaTeX template. This allows specifying certain fonts to be used with certain babel languages. Thanks to Frederik Elwert.
    • Fix highlight/underline with lualatex (#8707). We need the lua-ul package instead of soul, which doesn’t work with lualatex.
  • Lua (Albert Krewinkel):

    • Add pandoc.cli.repl function
    • Fix json.encode for nested AST elements. Ensures that objects with nested AST elements can be encoded as JSON.
    • Auto-generate docs for pandoc modules.
    • Load text module as pandoc.text. This only affects the name in the Lua-internal documentation. It is still possible to load the modules via require 'text', although this is deprecated.
    • Move docs from module text to pandoc.text. The latter is easier to use and more consistent with the other modules.
    • Keep the Lua stack clean. A metatable used during initialization was not properly removed from the stack. Likewise, accessing the CommonState from Lua previously led to the pollution of the Lua stack with a left-over value.
    • Add function pandoc.format.from_path.
    • Allow getting the JSON encoding of log messages.
  • Text.Pandoc.Format: Add new function formatFromFilePaths [API change] (#8710, Albert Krewinkel).

  • The old Text.Pandoc.App.FormatHeuristics module has been removed.

  • In --version, use Windows %APPDATA% variable to describe user data dir (#8686, Pablo Rodríguez).

  • Text.Pandoc.App.CommandLineOptions: don’t lowercase arg to --from/--read (Albert Krewinkel). This prevented users from using custom writers with uppercase characters in their filenames. Format-normalization, including lower-casing of format identifiers, happens during format parsing.

  • Documentation:

    • Add doc/nix.md.
    • Add doc/extras.md. This was formerly in the website repo.
    • doc/lua-filters.md: improve docs for pandoc.zip.
  • Factor out make_macos_release.sh from the release candidate workflow. Use cabal instead of stack to build the macos binary.

  • Modify linux/make_artifacts.sh so it will work on cirrus.

  • Switch to hslua-2.3

  • Depend on latest releases of texmath, doclayout.
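
The Jira reader entry above describes a rule for image markup: the opening ! must be followed by a non-space character, otherwise the text is ordinary content. The sketch below illustrates that rule with a Python regular expression. It is an assumption-laden approximation, not the jira-wiki-markup parser: the pattern IMAGE and the helper find_images are hypothetical names, and image parameters are ignored.

```python
import re

# Hypothetical pattern: "!" followed by a non-space character, then any
# run of characters other than "!" or a newline, then a closing "!".
IMAGE = re.compile(r"!(?=\S)([^!\n]+)!")

def find_images(text: str) -> list:
    """Return the contents of every span that this sketch treats as an image."""
    return IMAGE.findall(text)

images = find_images("See !diagram.png! for details.")
plain = find_images("Wow! That is great! Really.")
```

In the second call both exclamation marks are followed by a space, so nothing is treated as an image.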

pandoc 3.1.1

05 Mar 23:10
@jgm jgm
  • EPUB reader: Give additional information in error if the epub zip container can’t be unpacked.

  • TSV reader: don’t gobble tabs as whitespace (#8661).

  • Org reader: accept empty tables (#8659).

  • LaTeX reader: fix multiplication syntax for tabular (#8658). We recognized *{6}{...} but not *6{...} or *6c.

  • Docx reader: parse image alt texts in LibreOffice generated files. LibreOffice tags images slightly differently than Word; this change lets the parser take that difference into account when looking for an image description (alt text).

  • DocBook reader:

    • Fix <xref> references to tables in DocBook files (#8626, Pavol Otto).
    • Parse figure as a Figure element in the AST (#8668).
  • JATS reader: avoid generating duplicate figure captions (#8669).

  • RST reader: align with spec in syntax for role names (#8653). In particular, we now allow colons in role names.

  • Add note on converting from .doc format to FAQs (#8654).

  • Trap error in getAppUserDataDirectory (#8648). This can raise an error if pandoc is run in a non-user environment.

  • LaTeX writer: do not use longtable foot with Beamer (#8638, Albert Krewinkel). The table foot is made part of the table body, as otherwise it won’t show up in the output. The root cause for this is that longtable cannot detect page breaks in Beamer.

  • LaTeX template: Add CJKsansfont and CJKmonofont for XeLaTeX (#8656, Yudong Jin). CJKsansfont and CJKmonofont will be set for xelatex only if CJKmainfont is also provided.

  • URL style in ConTeXt (#8612, Thomas Hodgson). Previously, a URL like this would be in monospace text: \useURL[url1][https://example.com]. Now, it will match the main text unless the linkstyle variable is set, which controls the styling of all links. Closes #8602.

  • Asciidoc writer: Properly escape | in table cells (#8665).

  • asciidoc{,tor} template: fix revision date when author is unset (#8637, arcnmx). Revision line syntax is only valid in combination with an author line, so the date attribute must be set explicitly when the author is missing.

  • HTML writer: allow “track” element to be treated as block-level HTML (#8629).

  • Include needed polyfill when MathJax is used (#8625).

  • JATS writer: include alt-text in <graphic>, <inline-graphic> elements (#8631, Albert Krewinkel).

  • Chunked HTML writer: Retain metadata in processing sections for chunked HTML (#8620). Previously we suppressed metadata in all but the top page, in order to prevent the title block from being printed on every page. This prevented use of custom variables set by metadata fields. This commit moves to a better solution: a conditional in the default template restricts the title block to the top page.

  • Lua API:

    • Add new function pandoc.system.cputime (Albert Krewinkel). The function returns the CPU time consumed by pandoc and can be used to benchmark Lua computations.
    • Add module pandoc.json to handle JSON encoding (#8605, Albert Krewinkel).
  • Use pandoc-lua-marshal 0.2.1 (Albert Krewinkel). All major AST elements now have __tojson metamethods that return the JSON representation of an element. This makes it possible to JSON-encode these elements with libraries that respect the __tojson metamethod, including dkjson.

  • Use latest zip-archive. This allows pandoc to open certain epubs that it could not open before.

  • Use commonmark-extensions 0.2.3.4. This fixes some bugs involving definition lists and inline formatting.

  • Use latest skylighting-format-context

  • MANUAL.txt:

    • Document chunk-template in defaults file.
    • Remove obsolete “raw content in a style” section.
    • Revise documentation for --mathml to reflect support in all major browsers (#8667).
  • docs/custom-readers.md: Update JSON parsing example. The example now uses the built-in pandoc.json library to parse the API output.

  • doc/press.md: Add article on CiTO in J Cheminform by @egonw.

  • doc/lua-filters.md: fix typo in run_json_filter (Morgan Willcock).
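
The LaTeX reader entry above names three spellings of column repetition in a tabular specification: *{6}{...}, *6{...} and *6c. As a hedged sketch of what expanding them means, the Python below rewrites each form into the repeated columns. It is not pandoc's parser: REPEAT and expand_columns are hypothetical names, nested braces are not handled, and an unbraced count is assumed to be a single digit.

```python
import re

# Hypothetical pattern: a count that is either braced ({6}) or a single
# unbraced digit (6), followed by a braced group or a single column letter.
REPEAT = re.compile(r"\*(?:\{(\d+)\}|(\d))(?:\{([^{}]*)\}|([A-Za-z]))")

def expand_columns(spec: str) -> str:
    """Expand *-repetitions in a column specification into plain columns."""
    def repl(match):
        count = int(match.group(1) or match.group(2))
        body = match.group(3) if match.group(3) is not None else match.group(4)
        return body * count
    return REPEAT.sub(repl, spec)

braced = expand_columns("*{3}{c}")    # "ccc"
unbraced = expand_columns("l*2{rc}")  # "lrcrc"
bare = expand_columns("*3c|")         # "ccc|"
```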

pandoc 3.1

10 Feb 06:39
@jgm jgm
  • Fix regression with --print-highlight-style option (#8586).

  • Add new --chunk-template option (#8581), allowing more control over the filenames in chunked HTML output.

  • Text.Pandoc.App: Add optChunkTemplate constructor to Opt [API change].

  • Text.Pandoc.Options: add writerChunkTemplate constructor to WriterOptions [API change].

  • Text.Pandoc.Chunks: add Data, Typeable, Generic, ToJSON, FromJSON instances for PathTemplate [API change].

  • Text.Pandoc.Citeproc: Fix bug in metaValueToReference (#8611). This bug caused us to get some repeated content when converting MetaBlock to Inlines.

  • Textile reader:

    • Support footnote backlinks (#8585, Stephen Altamirano).
    • Don’t allow brackets in URLs (#8582).
  • ODT reader: fix blockquote indent detection (#3437, Daniel Kessler).

  • LaTeX writer: include short figure/table caption if one is given (Albert Krewinkel). Short captions are used by LaTeX when generating the list of figures or list of tables. Adding a short caption will now overwrite the full caption in these lists.

  • Powerpoint writer: fix handling of simple figures (#8565, Albert Krewinkel). This ensures that simple figures are displayed in the same way as before the introduction of a dedicated Figure constructor in the AST.

  • Improve handling of % in bib(la)tex parsing (#8597, #8595).

  • Use released skylighting 0.13.2.1

  • INSTALL.md: direct people to cabal install pandoc-cli.

  • doc/lua-filters.md: document ‘Figure’ type and constructor (Albert Krewinkel). Fix typos (Martin Joerg).

  • Fix link in manual (#8583, Salim B).

pandoc 3.0.1

25 Jan 19:32
@jgm jgm
  • Fix use of extensions with custom readers (#8571).

  • Text.Pandoc.Writers.Shared: export setupTranslations [API change]. Use this in HTML and OpenDocument writers, to ensure that translations are set up properly even when we don’t go through convertWithOpts.

  • LaTeX reader: fix regression in macro resolution for environments (#8573).

  • Chunked HTML writer: Fix handling of images with absolute URLs (#8567).

  • HTML writer:

    • Don’t omit newlines in task lists.
    • Don’t disable checkboxes in task lists (#8562).
  • Ensure that automatically set variables pandoc-version, outputfile, title-prefix, epub-cover-image, curdir, dzslides-core can be overridden by --variable on the command line. Previously they would create lists in the template Context, which is not desirable.

  • Fix man page copying in linux/make_artifacts.sh (#8566). Previously we were copying the pandoc-server.1 man page to pandoc-lua.1.

  • pandoc.cabal: remove cabal.project, stack.yaml from extra-source-files (#8560). The problem is that if these are in extra-source-files, then they get put in the tarball, and then anyone trying to build the source from an unpacked tarball will run into the problem that cabal.project and stack.yaml refer to pandoc-server, pandoc-lua-engine, and pandoc-cli, which aren’t in the tarball.

  • Require texmath 0.12.6 for better MathML output.

  • Fix typo in Lua filter documentation (Carlos Scheidegger).

  • Fix formatting of link in pandoc-server.md (James Scott-Brown).

  • Minor changelog fixups.

pandoc 3.0

18 Jan 21:28
@jgm jgm
  • Split pandoc-server, pandoc-cli, and pandoc-lua-engine into separate packages (#8309). Note that installing the pandoc package from Hackage will no longer give you the pandoc executable; for that you need to install pandoc-cli.

  • Pandoc now behaves like a Lua interpreter when called as pandoc-lua or when pandoc lua is used (#8311, Albert Krewinkel). The Lua API that is available in filters is automatically available to the interpreter. (See the pandoc-lua man page.)

  • Pandoc behaves like a server when called as pandoc-server or when pandoc server is used. (See the pandoc-server man page.)

  • A new command-line option --list-tables causes tables to be formatted as list tables in RST (#4564, with Francesco Occhipinti).

  • New command line option: --epub-title-page=true|false allows the EPUB title page to be omitted (#6097).

  • --reference-doc can now accept a URL argument (#8535) and load a remote reference doc.

  • --version output no longer contains version info for dependent packages. Instead, it contains a “Features” line that indicates whether the binary was compiled with support for acting as a server, and for using Lua filters and Custom writers.

  • A new option --split-level replaces --epub-chapter-level and affects both EPUB and chunked HTML output. --epub-chapter-level will still work but is deprecated.

  • Multiple input files with --file-scope: fix case where the links are URL-encoded, e.g. with %20 (#8467).

  • Produce error if --csl is used more than once (#8195, Prat).

  • Remove deprecated --atx-headers option.

  • Remove deprecated option --strip-empty-paragraphs.

  • In --verbose mode add message when running citeproc (as with other filters).

  • Add new mark extension for highlighted text in Markdown, using == delimiters (#7743).

  • Add new extensions wikilinks_title_after_pipe and wikilinks_title_before_pipe for commonmark and markdown. (#2923, Albert Krewinkel). The former enables links of style [[Name of page|Title]] and the latter [[Title|Name of page]]. Titles are optional in both variants, so this works for both: [[https://example.org]], [[Name of page]]. The writer is modified to render links with title wikilink as a wikilink if a respective extension is enabled. Pandoc will use wikilinks_title_after_pipe if both extensions are enabled.

  • Add prefixes to identifiers with --file-scope (#6384). This change only affects the case where --file-scope is used and more than one file is specified on the command line. In this case, identifiers will be prefixed with a string derived from the file path, to disambiguate them. For example, an identifier foo in contents/file1.txt will become contents__file1.txt__foo. Links will be adjusted accordingly: if file2.txt links to file1.txt#foo, then the link will be changed to point to #file1.txt__foo. Similarly, a link to file1.txt will point to #file1.txt. A Div with an identifier derived from the file path will be added around each file’s content, so that links to files will still work.

  • New output format: chunkedhtml. This creates a zip file containing multiple HTML files, one for each section, linked with “next,” “previous,” “up,” and “top” links. (If -o is used with an argument without an extension, it is treated as a directory and the zip file is automatically extracted there, unless it already exists.) The top page will contain a table of contents if --toc is used. A sitemap.json file is also included. The option --split-level determines the level at which sections are to be split.

  • Support complex figures (Albert Krewinkel, Aner Lucero). There is now a dedicated Figure block constructor for figures. The old hack of representing a figure as Para [Image attr [..alt..] (source, "fig:title")] has been dropped. Here is a summary of figure support in different formats:

    • Markdown reader: paragraphs containing just an image are treated as figures if the implicit_figures extension is enabled. The identifier is used as the figure’s identifier and the image description is also used as figure caption; all other attributes are treated as belonging to the image.
    • Markdown writer: figures are output as implicit figures if possible, via HTML if the raw_html extension is enabled, and as Div elements otherwise.
    • HTML reader: <figure> elements are parsed as figures, with the caption taken from the respective <figcaption> elements.
    • HTML writer: the alt text is no longer constructed from the caption, as was the case with implicit figures. This reduces duplication, but comes at the risk of images that are missing alt texts. Authors should take care to provide alt texts for all images. Some readers, most notably the Markdown reader with the implicit_figures extension, add a caption that’s identical to the image description. The writer checks for this and adds an aria-hidden attribute to the <figcaption> element in that case.
    • JATS reader: The <fig> and <caption> elements are parsed into figure elements, even if the contents is more complex.
    • JATS writer: The <fig> and <caption> elements are used to write figures.
    • LaTeX reader: support for figures with non-image contents and for subfigures.
    • LaTeX writer: complex figures, e.g. with non-image contents and subfigures, are supported. The subfigure template variable is set if the document contains subfigures, triggering the conditional loading of the subcaption package. Contents of figures that contain tables become unwrapped, as longtable environments are not allowed within figures.
    • DokuWiki, Haddock, Jira, Man, MediaWiki, Ms, Muse, PPTX, RTF, TEI, ZimWiki writers: Figures are rendered like Div elements.
    • Asciidoc writer: The figure contents is unwrapped; each image in the figure becomes a separate figure.
    • Classic custom writers: Figures are passed to the global function Figure(caption, contents, attr), where caption and contents are strings and attr is a table of key-value pairs.
    • ConTeXt writer: Figures are wrapped in a “placefigure” environment with \startplacefigure/\stopplacefigure, adding the figure’s caption and listing title as properties. Subfigures are placed in a single row with the \startfloatcombination environment.
    • DocBook writer: Uses mediaobject elements, unless the figure contains subfigures or tables, in which case the figure content is unwrapped.
    • Docx writer: figures with multiple content blocks are rendered as tables with style FigureTable; like before, single-image figures are still output as paragraphs with style Figure or Captioned Figure, depending on whether a caption is attached.
    • DokuWiki writer: Caption and “alt-text” are no longer combined. The alt text of a figure will now be lost in the conversion.
    • FB2 writer: The figure caption is added as alt text to the images in the figure; pre-existing alt texts are kept.
    • ICML writer: Only single-image figures are supported. The contents of figures with additional elements gets unwrapped.
    • OpenDocument writer: A separate paragraph is generated for each block element in a figure, each with style FigureWithCaption. Behavior for single-image figures therefore remains unchanged.
    • Org writer: Only the first element in a figure is given a caption; additional block elements in the figure are appended without any caption being added.
    • RST writer: Single-image figures are supported as before; the contents of more complex figures become nested in a container of type float.
    • Texinfo writer: Figures are rendered as float with type figure.
    • Textile writer: Figures are rendered with the help of HTML elements.
    • XWiki: Figures are placed in a group.
  • Changes in custom readers/writers:

    • It is now possible to have a custom reader and a custom writer for a format together in the same file. The file may also define a custom template for the writer.
    • Pandoc now checks the folder custom in the user’s data directory for a matching script if it can’t find one in the local directory. Previously, the readers and writers data directories were searched for custom readers and writers, respectively. Scripts in those directories must be moved to the custom folder.
    • Custom readers used to implement a fallback behavior that allowed a plain string value to be consumed as input to the Reader function. This has been removed; the first argument is now always a list of sources. Use tostring on that argument to get a string.
  • New module Text.Pandoc.Writers.ChunkedHTML, exporting writeChunkedHtml [API change].

  • We now set the pandoc-version variable centrally rather than in the writers. One effect is the man writer now emits a comment with the pandoc version.

  • pandoc-server:

    • Add simple CORS support to pandoc-server (#8427).
    • Print message to stderr when starting the server.
  • Docx reader:

    • Mark unnumbered headings with class unnumbered (#8148, Albert Krewinkel). This change ensures good conversion results when converting with --number-sections.
    • Support parsing of highlighted text.
    • Fix handling of oMathPara in w:p with other content (#8483).
  • ODT reader:

    • Fix relative links. ODT adds a ../ to relative links (see #3524); this needs to be removed when converting from ODT.
    • Handle “section” elements (#8409).
    • Rename Text.Pandoc.Readers.Odt -> Text.Pandoc.Readers.ODT, for consistency with Writers.ODT. Rename readOdt -> readODT. [API change]
  • DocBook reader:

    • Support href on link even in a fragment (#8437). (We now just look for an href attribute without worrying about the namespace.)
    • Parse title from imageobject/objectinfo (#8437).
  • JATS reader:

    • Handle uri element in references (#8270).
  • Ipynb reader:

...
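
The wikilinks_title_after_pipe and wikilinks_title_before_pipe entry above differs only in which side of the pipe holds the title. The Python sketch below illustrates the two readings. It is not pandoc's implementation: parse_wikilink is a hypothetical helper, and reusing the target as the title when no pipe is present is an assumption made for this example.

```python
def parse_wikilink(link: str, title_after_pipe: bool = True) -> tuple:
    """Split a "[[...]]" wikilink into a (target, title) pair."""
    if not (link.startswith("[[") and link.endswith("]]")):
        raise ValueError("not a wikilink: " + link)
    inner = link[2:-2]
    if "|" not in inner:
        # Assumption for this sketch: without a title, reuse the target.
        return inner, inner
    left, right = inner.split("|", 1)
    # title_after_pipe:  [[Name of page|Title]]
    # title_before_pipe: [[Title|Name of page]]
    return (left, right) if title_after_pipe else (right, left)

after = parse_wikilink("[[Name of page|Title]]")
before = parse_wikilink("[[Title|Name of page]]", title_after_pipe=False)
bare = parse_wikilink("[[https://example.org]]")
```

Both spellings resolve to the same target and title here, which is why only one of the two extensions can apply to a given document.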

pandoc 2.19.2

22 Aug 19:15
@jgm jgm
  • Fix regression with data uris in 2.19.1 (#8239). In 2.19.1 we used the base64URL encoding rather than base64.

  • pandoc-server: handle citeproc parameter as documented (#8235).

  • Org reader: treat emacs-jupyter src blocks as code cells (#8236, Albert Krewinkel). This improves support for notebook-like org files that are intended to be used with emacs-jupyter package.

  • HTML writer and templates: revert to using width property for column widths (Albert Krewinkel). The default flex and overflow-x properties of a column are set to auto. In combination, these changes give good results when using columns with or without explicit widths.

  • Org writer (Albert Krewinkel):

    • Add support for Jupyter notebook cells (#6367).
    • Prefix code language of ipynb code blocks with jupyter-. This is the convention used by the emacs-jupyter package.
    • Keep code block attributes as header args. This keeps more information in the resulting src blocks, making it easier to roundtrip from or through Org. Org babel ignores unknown header arguments.
    • Add code block identifier as #+name to src blocks.
  • Fix some typos in the codebase (luz paz).

  • Require hslua-module-path 1.0.3 (#8228, Albert Krewinkel).
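
The data URI regression above comes down to two base64 alphabets: standard base64 uses + and /, while the URL-safe variant (base64URL) replaces them with - and _. A small Python illustration follows, using the standard library rather than pandoc's Haskell code. The three input bytes are chosen only because they exercise both differing characters.

```python
import base64

# Bytes whose 6-bit groups map to the two characters that differ
# between the standard and the URL-safe base64 alphabets.
data = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(data).decode("ascii")          # "+//+"
url_safe = base64.urlsafe_b64encode(data).decode("ascii")  # "-__-"
```

The two encodings differ for the same input, so a consumer expecting the standard alphabet cannot decode URL-safe output unchanged.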

pandoc 2.19.1

19 Aug 06:56
@jgm jgm
  • Add server capabilities.

    • New exported module Text.Pandoc.Server [API change].
    • The pandoc executable now starts up a web server when renamed or symlinked as pandoc-server, and functions as a CGI program when renamed or symlinked as pandoc-server.cgi. See the man page for pandoc-server for full documentation.
  • Text.Pandoc.App.Opts: Redo FromJSON for Opt so that optional values can be omitted (in which case the values from defaultOptions are used).

  • Org reader: treat “abstract” block as metadata (Albert Krewinkel, #8204). A block of type “abstract” is assumed to define the document’s abstract. It is transferred from the main text to the metadata.

  • Org template: add abstract from metadata as block of type “abstract” (#8204).

  • HTML writer: use flex property for column widths (Albert Krewinkel, #8232).

  • LaTeX writer:

    • Add label to tables that have an identifier (Albert Krewinkel, #8219). Tables with an identifier are marked with a \label. A caption is always included in this case, even if the caption is empty.
    • Use \textquotesingle for straight quotes in text.
    • Fix widths of multicolumn cells (#8218).
  • LaTeX template: fix behavior of colorlinks variable (Albert Krewinkel, #8226). Fixes a regression in 2.19 that required the boxlinks variable to be set in addition to the usual link coloring variables. Otherwise links were never colored in LaTeX PDF output.

  • Text.Pandoc.Highlighting: Export lookupHighlightingStyle [API change]. Previously this lived in an unexported module Text.Pandoc.App.CommandLineOptions, under the name lookupHighlightStyle.

  • Text.Pandoc.App:

    • Remove unneeded MonadIO constraints in readSources.
    • Factor out convertWithOpts' from convertWithOpts. This runs in any PandocMonad, MonadIO, MonadMask instance. So far it is not exported, but it might find a use later.
  • Support --strip-comments in commonmark/gfm (#8222). This change makes the commonmark reader sensitive to readerStripComments.

  • Lua: add function pandoc.utils.citeproc (Albert Krewinkel). The function runs the citeproc processor on a Pandoc document. Exposing this functionality to Lua makes it possible to run citation processing as part of a filter or writer, simplifies the creation of multiple bibliographies, and enables the use of varying citation styles in different parts of a document.

  • Refactor linux/make_artifacts.sh.

  • Update INSTALL.md installation from source instructions.

  • Use base64 package instead of base64-bytestring. It is supposed to be faster and more standards-compliant.

  • trypandoc improvements:

    • Add dropdown with canned examples.
    • Add citeproc support.
    • Support csv, bibliographic and binary formats.
    • Add load from file.
    • Add permalink. Don’t always reload page.
    • Use vanilla JS and CSS + the new pandoc-server.cgi.
  • Allow haddock-library-1.11.0.

  • Convert tool/extract-changes.hs to a Lua filter.

pandoc 2.19

04 Aug 07:01
@jgm jgm
  • Add --embed-resources flag (Elliot Bobrow, #7331). This can be used to embed resources without implying --standalone. Deprecate --self-contained in favor of --embed-resources --standalone.

  • Allow environment variable interpolation in highlight-style and pdf-engine fields in defaults files (#8061; Jaehwang Jung, #8073).

  • Allow placing custom readers and writers in the user data directory, in its readers and writers subdirectories (Albert Krewinkel, #8112).

  • Add tsv (tab separated values) as an input format (#7974). [API change]: Text.Pandoc.Readers.CSV now exports readTSV. Internal change: In Text.Pandoc.CSV, CSVOptions has changed so that csvQuote takes a Maybe value.

  • Add tex_math_dollars to gfm default extensions (reflecting gfm’s new support for math).

  • RST, Org, Markdown readers: support rowspans and colspans in grid tables (#8202, Albert Krewinkel). Note: the writers do not yet support these more complex grid table features, so these complex grid tables will not round-trip.

  • HTML, LaTeX, and MediaWiki readers: use formatCode (#8162, #8129, Elliot Bobrow). This moves formatting from inside inline code elements to the outside, since pandoc’s Code element only takes string content.

  • Markdown reader:

    • Don’t parse inline notes with blank lines inside (#8028).
    • Allow attributes in special spans (e.g. smallcaps, underline) (Albert Krewinkel, #4102). These spans are parsed as SmallCaps or Underline elements, but any attributes are included in a wrapping Span.
  • HTML reader:

    • Allow sublists that are not marked as items (Albert Krewinkel, #8150). This is technically invalid HTML, but it can be found in the wild and browsers handle it.
  • Org reader (Albert Krewinkel):

    • Recognize absolute paths on Windows (Albert Krewinkel, #8201).
    • Recognize {webp,jxl} files as images (YI).
    • Allow attrs for Org tables (Albert Krewinkel, #8049). Tables with attributes are no longer wrapped in Div elements; attributes are added directly to the table element.
    • Support line selection in INCLUDE directives (Brian Leung, #8060).
    • Fix Post / Pre mixup when setting emphasis chars (Amir Dekel, #8134).
  • LaTeX reader:

    • Support \includesvg (#8027).
    • Unescape characters in \lstinline inside \passthrough (#8179).
    • Improve mathEnvWith (#8122). When converting e.g. an align environment to an aligned environment inside a Math element, we need to include a newline before the \end{aligned}, since the previous line might end in a comment.
    • Fix treatment of extensions for \input in LaTeX reader (#8092). Previously we required a .tex extension, but TeX allows any extension for \input (as opposed to \include).
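
The reason for the newline in the mathEnvWith change above can be illustrated with a small Python sketch. This is not pandoc's Haskell code; the function name `wrap_in_aligned` is hypothetical and only shows why the closing command needs its own line.

```python
def wrap_in_aligned(body: str) -> str:
    """Wrap the body of an align-like environment in an aligned environment.

    The newline before \\end{aligned} matters: if the last line of the body
    ends in a TeX comment, an \\end placed on that same line would be
    swallowed by the comment.
    """
    return "\\begin{aligned}" + body + "\n\\end{aligned}"


# Hypothetical body whose last line ends in a comment.
body = "\na &= b \\\\\nc &= d % last line ends in a comment"
print(wrap_in_aligned(body))
```

Without the inserted newline, the `\end{aligned}` would land after the `%` and TeX would never see it.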
  • RTF reader:

    • Support \nosupersub (#8170).
  • TikiWiki reader:

    • Support underlined text.
  • DocBook reader:

    • Improve reading of <xref> elements (Frerich Raabe, #8065).
  • JATS reader:

    • Strip ref- prefix from ref id in xref (#8007).
    • Support edition in references (#8087).
  • RIS reader:

    • Make parser more forgiving (#8034). Allow blank lines after entries. Allow entries with no space after the -, provided they just have a newline, e.g. DB -\n.
    • Get right order of names (#8055).
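
As a rough illustration of the more forgiving RIS parsing described above, the following Python sketch accepts blank lines after entries and tags with nothing after the hyphen. It is a simplified stand-in, not pandoc's parser; the names `RIS_LINE` and `parse_ris` are hypothetical, and the sketch assumes the usual "two-character tag, two spaces, hyphen" layout.

```python
import re

# Two-character tag, two spaces, a hyphen, then optionally a space and a value.
# Making the " value" part optional accepts bare lines such as "DB  -".
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")


def parse_ris(text: str) -> list:
    """Parse RIS text into a list of records mapping tag -> list of values."""
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines, including those after an entry
        match = RIS_LINE.match(line.rstrip())
        if match is None:
            continue  # this sketch silently skips lines it cannot read
        tag, value = match.group(1), match.group(2) or ""
        if tag == "ER":  # end of record
            records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    return records


sample = "TY  - JOUR\nAU  - Doe, Jane\nDB  -\nER  - \n\n\nTY  - BOOK\nER  -\n"
print(parse_ris(sample))
```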
  • MediaWiki reader:

    • Allow HTML comment after row start (#8110).
  • DokuWiki reader:

    • The tex_math_dollars extension is now supported for dokuwiki (but off by default) (#8178).
    • Content inside <latex>...</latex> is parsed as raw LaTeX inline, and inside <LATEX>..</LATEX> as raw LaTeX block (#8178).
    • The behavior of <php>...</php> is changed, so that instead of producing a code block, it produces raw HTML with <?php ... ?>.
  • LaTeX writer:

    • Improve grouping with autocites (#8088).
    • Extend list of book documentclasses (Wentau Han, #8053).
    • Fix width of multicolumn cells (Albert Krewinkel, #8090). Cells spanning multiple columns must be given an explicit width, calculated from the table properties.
    • Beamer: allow containsverbatim as alternative to fragile (#8080).
  • HTML writer:

    • Add ‘footnotes’ identifier to footnotes section (#8043).
    • Fix bug with --number-offset. This formerly caused section divs to be produced, even when --section-divs was not specified (#8097).
    • Use CSS flexboxes for columns (Albert Krewinkel). This allows an arbitrary number of columns, while the previous approach assumed exactly two columns.
    • Allow “spanlike” classes to be combined (see #8194). Previously classes like “underline” and “mark” had to be the first class in a span in order for the span to be interpreted as a “u” or “mark” element. This commit allows these special classes to be “stacked,” e.g. [test]{.mark .underline}; in addition, the special classes are no longer required to come first in the list of classes.
    • Avoid doubled style attribute when height and width are added to style because of an image, but the image already has a style attribute (#8047).
    • Do not include the deprecated doc-endnote role (#8030). doc-endnote was deprecated in DPUB-ARIA 1.1.
    • Remove extra soft break for tasklist (black-desk, #8142). Browsers display the extra newline character between checkbox and text as a space, which prevents tasklist items from being aligned.
  • EPUB writer:

    • Allow choice of math method for v3 (#8164). Previously we always used MathML for math in EPUB3, because the spec includes MathML. But this is not widely supported by readers, so it seems better to allow users to choose their math method as they can with EPUB2 or HTML. NOTE: Existing workflows that produce EPUBv3 documents including math will be affected by this change. You must add --mathml to your command line if you want to continue producing MathML.
  • RST writer:

    • Fix missing spaces with nested inlines (#8182).
    • Always escape literal backslash (#8178).
  • Ms writer:

    • Add comment in preamble stating generator.
    • Fix roff ms syntax highlighting definitions (#8175, thanks to Branden Robinson).
  • ConTeXt writer:

    • Support complex table structures (Albert Krewinkel, #8116). The following table features are now supported in ConTeXt:

      • colspans,
      • rowspans,
      • multiple bodies,
      • row headers, and
      • multi-row table head and foot.

      The wrapping placetable environment is also given a reference option with the table identifier, enabling referencing of the table from within the document.

    • Unify link handling (Albert Krewinkel, #8096). Autolinks, i.e. links with content that’s the same as the linked URL, are now marked with the \url command. All other links, both internal and external, are created with the \goto command, leading to shorter, slightly more idiomatic code. As before, autolinks can still be styled via \setupurl, other links via \setupinteraction.

    • Use “sectionlevel” environment for headings (Albert Krewinkel, #5539). The document hierarchy is now conveyed using \startsectionlevel/\stopsectionlevel by default. This makes it easy to include pandoc-generated snippets in documents at arbitrary levels. The more semantic environments “chapter”, “section”, “subsection”, etc. are used if the --top-level-division command line parameter is set to a non-default value.

  • Docx writer:

    • Add w:lang to rPr for Span and Div with lang attribute, so that Word can know that “Apfel” is not a spelling error (#8026).
    • Prevent crashing when handling invalid tables (Albert Krewinkel, #8102). Tables with different numbers of cells per row would sometimes crash pandoc. This fix prevents this by cutting off overlong rows.
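
The idea of cutting off overlong rows can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `trim_overlong_rows` is hypothetical, rows are modeled as plain lists of cells, the column count is passed in explicitly, and column spans are ignored. How pandoc itself determines the column count is not stated in the changelog.

```python
def trim_overlong_rows(rows, num_columns):
    """Cut every row down to at most num_columns cells; shorter rows are kept."""
    return [row[:num_columns] for row in rows]


# A table declared with two columns, but with one row that has three cells.
table = [["a", "b"], ["c", "d", "e"], ["f"]]
print(trim_overlong_rows(table, 2))
```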
  • ICML writer:

    • Support custom-style attribute on Table (#8079).
  • AsciiDoc writer:

    • Fix commas in link text (#8070). Commas in link text trigger interpretation of attributes. To block this, we replace them with numeric entities.
    • Fix underline. We were rendering it as +++text+++; this is now changed to [.underline]#text#. See the comment at #8070.
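
The comma fix for AsciiDoc link text can be pictured with the following Python sketch. It is not the writer's actual code: the function names are hypothetical, and the exact entity form (`&#44;`, the decimal character reference for a comma) is an assumption consistent with "numeric entities" in the entry above.

```python
def escape_link_text(text: str) -> str:
    """Replace commas with a numeric character reference (44 is the code point of ',')."""
    return text.replace(",", "&#44;")


def asciidoc_link(url: str, text: str) -> str:
    """Render a URL link in AsciiDoc's url[text] form with escaped link text."""
    return f"{url}[{escape_link_text(text)}]"


print(asciidoc_link("https://example.com", "Hello, world"))
```

Because the comma never appears literally inside the brackets, it cannot be read as a separator between link attributes.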
  • FB2 writer:

    • Fix handling of non-section Divs (#8123).
  • Markdown writer:

    • Disable soft wrapping when hard_line_breaks enabled (#8035). We were already doing this for markdown; this commit does the same thing for markua and commonmark and gfm.
    • Avoid excessive indentation on bullet lists for commonmark, markua, gfm. They are now nested by 2 spaces instead of 4 (#8011).
  • Text.Pandoc.Class:

    • Add new function findFileWithDataFallback [API Change] (Albert Krewinkel).
    • fillMediaBag: Keep attributes of original image on Span (Albert Krewinkel, #8099). Images that cannot be fetched are replaced with a Span that contains the image’s description. The span now also inherits all attributes of the original image. Furthermore, the classes image and placeholder are added, and path and title are stored in attributes original-image-src and original-image-title, respectively.
  • Text.Pandoc.Shared:

    • makeSections: don’t make a section for a div with class “fragments” (#8098).
    • Ensure that Nulls are ignored by makeSection and in segmenting slides (#8155).
    • Add formatCode function to Text.Pandoc.Shared [API change] (Elliot Bobrow, #8129).
    • taskListItemToAscii: handle asciidoctor’s characters (#8011). Asciidoctor uses different unicode characters for task lists; we should recognize them too and be able to convert them to ascii task lists in formats like gfm.
    • Deprecate deLink and mark for later removal.
  • Text.Pandoc.Writers.Shared:

    • toTableOfContents: Don’t replace links with empty spans in TOC (#8020).
  • Text.Pandoc.Readers.Metadata:

    • Ensure that metadata values w/o trailing newlines are parsed as inlines, ...

pandoc 2.18

04 Apr 18:10
@jgm jgm
  • New input formats: endnotexml (EndNote XML bibliography), ris (RIS bibliography).

  • A RIS bibliography file may now be used with --citeproc.

  • Citeproc: Allow a formatted bibliography to be placed in metadata fields via a Div with id refs (#7969, #526). Thus, one can include a metadata field, say refs, whose content is an empty div with id refs, and the formatted bibliography will be put into this metadata field. It may then be interpolated into a template using the variable refs.

  • Ensure that you don’t get PDF output to terminal. -t pdf now behaves like -t docx and gives an error unless the output is redirected.

  • --version now prints hslua version (#7929) and Lua version (#7997, Albert Krewinkel).

  • Change --metadata-file parsing so that, when the input format is not markdown or a markdown variant, pandoc’s markdown is used (#6832, #7926). When the input format is a markdown variant, the same format is used. Reason for the change: it doesn’t make sense to run the markdown parser with a set of extensions designed for a non-markdown format, and this dramatically limits what people can do in metadata files.

  • Trim whitespace from math in --webtex (#7892). This fixes problems with --webtex and markdown output, when display math starts or ends with a newline.

  • New exported module Text.Pandoc.Readers.EndNote, exporting readEndNoteXML, readEndNoteXMLCitation, and readEndNoteXMLReferences. [API change]

  • --self-contained: issue warning rather than failing with an error if a resource can’t be found (#7904).

  • New exported module, Text.Pandoc.Readers.RIS, exporting readRIS (#7894).

  • LaTeX reader:

    • Handle subequations as inline math environment (#7883).
    • Rudimentary support for vbox (#7939).
    • Support \today (#7905).
    • Handle \label and \ref for footnotes (#7930).
    • Allow inline groups starting with \bgroup (#7953).
    • Use custom TokStream that keeps track of whether macros are expanded. This allows us to improve performance a bit by avoiding unnecessary runs of the macro expansion code (e.g. from 24 ms to 20 ms on our standard benchmark).
    • Further optimizations for inline parsing.
    • Better handling of \usepackage. If the package is local but causes parse errors, parse everything up to the error and skip the rest. Issue a CouldNotParseIncludeFile warning indicating that parsing failed at that point.
    • Text.Pandoc.Readers.LaTeX.Parsing: Monoid and Semigroup instances for TokStream.
  • HTML reader:

    • Give warnings and emit empty note when parsing <a epub:type="noteref"> and the identifier doesn’t correspond to anything in the note table (#7884). Previously we just silently skipped these cases.
    • Fix parsing of epub footnotes (#7884).
  • DocBook reader:

  • DokuWiki reader:

    • Add DokuWiki table alignment (#5202, damon-sava-stanley).
  • RST reader:

    • Fix treatment of headerless simple tables (#7902).
    • Wrap math in Span to preserve attributes (#7998, Albert Krewinkel). Math elements with a name, classes, or other fields are wrapped in a Span with these attributes.
  • JATS reader:

    • Improve handling of fn-group elements (#6348, Albert Krewinkel). Footnotes in <fn-group> elements are collected and re-inserted into the document as proper footnotes in the place where they are referenced.
    • Handle pub-date (#8000).
    • Support PMID, DOI, issue in citations (#7995).
    • Improve refs parsing. Handle issn and isbn; use simpler form for issued date.
    • Strip ‘ref-’ from ref id in constructing CSL id. This allows better round-tripping, because the JATS writer adds the ref- prefix to the citation id to get the ref element’s id.
  • Org reader:

    • Allow “:” in property drawer keys (Lucas V. R). Any non-space character is allowed as property drawer key, including “:” itself (so it is not really a delimiter). The real delimiter is a space character, so in a drawer like

      :PROPERTIES:
      ::k:ey:: value
      :END:
      

      “:k:ey:” is a key with value “value”.

    • Allow comments above property drawer.

    • More flexible LaTeX environments (Lucas V. R).

    • Handle #+bibliography: as metadata so that it can work with --citeproc.

    • Parse #+print_bibliography: as Div with id refs.

    • Allow multiple #+bibliography:.
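
The property drawer rule described above (any non-space characters may form the key, and the real delimiter is the space after the closing colon) can be sketched with a regular expression. This is an illustration in Python rather than the Org reader's actual parser; `PROPERTY_LINE` and `parse_property` are hypothetical names, and the sketch does not handle properties with empty values.

```python
import re

# ":" + key (any run of non-space characters) + ":" + whitespace + value.
# The greedy \S+ runs up to the last colon before the first whitespace.
PROPERTY_LINE = re.compile(r"^\s*:(\S+):\s+(.*)$")


def parse_property(line: str):
    """Return (key, value) for a property line, or None if it is not one."""
    match = PROPERTY_LINE.match(line)
    return (match.group(1), match.group(2)) if match else None


print(parse_property("::k:ey:: value"))
```

Applied to the drawer line from the example, the key comes out as `:k:ey:` and the value as `value`, while `:PROPERTIES:` and `:END:` are not treated as properties because nothing follows the closing colon.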

  • Markdown reader:

    • Allow one-column pipe tables with pipe on right (#7919).
    • Remove restriction on identifiers, so they no longer need to begin with a letter (#7920).
  • Docx reader:

    • Enable citations extension for docx reader (#7840). When enabled, Zotero, Mendeley, and EndNote citations embedded in a docx are parsed as native pandoc citations. (When disabled, the generated citation text and bibliography are passed through as regular text.) The bibliography generated by the plugin is suppressed. Instead, bibliographic data embedded in citation items is added to the references metadata field so that it can be used with --citeproc.
  • DocBook writer:

    • Interpret links without contents as cross-references (#7360, Jan Tojnar). Links without text contents are converted to <xref> elements. DocBook processors will generate appropriate cross-reference text when presented with an xref element.
  • Docx writer:

    • Single numbering ID for examples (#7895, mjfs). This change ensures that example list items all belong to a single number sequence, so that if items are added or deleted in a word processor, the other items will renumber automatically.
    • Add bookmark with table id to table (#7989, Nikolai Korobeinikov, #7285). This allows tables with ids to be linked to.
  • Ipynb writer:

    • Handle metadata better (#7928). Previously we used the markdown writer to render metadata. This had some undesirable consequences (e.g. en dash expanded to -- when smart enabled), so now we use the plain writer.
  • LaTeX writer:

    • Avoid extra space before \CSLRightInline (#7932).
    • Add scrreport to chaptersClasses (#6168, ivardb).
    • Support page,trim,clip attributes on images (#7181).
    • Add () after booktabs rules (#8001). These commands take optional arguments with () and [], which can lead to problems if the content of the table cell begins with these characters.
  • RST writer:

    • Support all standard metadata (“bibliographic”) fields.
  • HTML writer: performance improvements.

  • Org writer:

    • Stop indenting property drawers, quote blocks (#3245, Albert Krewinkel). This follows the current default org-mode behavior.
  • Markdown writer:

    • Move table-related code into submodule (Albert Krewinkel).
    • Don’t produce redundant header identifier when the gfm_auto_identifiers extension is set (#7941).
    • Update escaping rules for \. We now escape \ only if raw_tex is enabled or it is followed by a non-alphanumeric.
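
The updated backslash rule can be modeled in Python as follows. This is a sketch of the stated rule only, not the Markdown writer's code; the function name is hypothetical, and treating a backslash at the very end of the text as "followed by a non-alphanumeric" is an assumption of this example.

```python
def escape_backslashes(text: str, raw_tex: bool) -> str:
    """Escape a backslash only if raw_tex is on or the next character is not alphanumeric."""
    out = []
    for i, ch in enumerate(text):
        if ch == "\\":
            nxt = text[i + 1] if i + 1 < len(text) else ""
            # Assumption: end of text counts as "not alphanumeric" ("".isalnum() is False).
            if raw_tex or not nxt.isalnum():
                out.append("\\\\")
                continue
        out.append(ch)
    return "".join(out)


print(escape_backslashes("a\\b and a\\*b", raw_tex=False))
```

With raw_tex disabled, `\b` is left alone because it cannot be mistaken for a Markdown escape, whereas `\*` is escaped so that the backslash stays literal.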
  • JATS writer:

    • Encode author “others” as <etal/> (Albert Krewinkel). Citeproc adopted the BibTeX convention to use the author name “others” when there are additional authors that are not named. JATS uses the <etal> element for this.
    • Avoid doubled ref-list element (#7990). Previously when generating JATS with the element_citations extension enabled, the references were put in a doubly-nested ref-list element (<ref-list><ref-list>...).
    • Keep edition info in element citations (#7993, Albert Krewinkel).
    • Fix handling of CSL variable ‘page’ (not ‘pages’ as we had before). It should go to ‘lpage’ and ‘rpage’, not ‘page-range’.
  • EPUB writer: refactor for clarity (#7991, Jonathan Dönszelmann, Ola Wolska, Ivar de Bruin, Jaap de Jong).

  • Custom writer (Albert Krewinkel):

    • Support new-style Writer function (Albert Krewinkel). See the documentation for custom writers for details.
    • Produce stacktrace if Writer function fails.
  • Text.Pandoc.Logging: add CouldNotParseIncludeFile constructor for LogMessage [API change].

  • Text.Pandoc.Shared:

    • Put id attributes on TOC entries (#7907, damon-sava-stanley). The id is “toc-” followed by the id of the linked header or section. Affects HTML, Markdown, PowerPoint, and RTF.
    • Define ordNub as alias for nubOrd from containers package (#7963, Albert Krewinkel).
    • Export ensureValidXmlIdentifiers. This function changes identifiers that don’t start with letters, and internal links to these identifiers, making them compatible with XML standards. The change is simple: we add id_ to the front. There is potential for duplication if there are already id_... identifiers defined, but this seems rare enough not to worry too much about.
  • Ensure that valid XML identifiers are used in Docbook, EPUB, FB2, HTML4, S5, Slidy, Slideous, ICML, ODT, TEI writers. Thus, if you convert [anchor]{#1} and [link to](#1), id_1 will be used instead of 1 for the identifier.
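
The renaming described for ensureValidXmlIdentifiers can be sketched in Python. This mirrors only the rule as stated in the changelog (add id_ when the identifier does not start with a letter, and update internal links to match); the function names are hypothetical and the real Haskell function operates on a whole Pandoc document.

```python
def valid_xml_id(identifier: str) -> str:
    """Prefix identifiers that do not start with a letter with 'id_'."""
    if identifier and not identifier[0].isalpha():
        return "id_" + identifier
    return identifier


def fix_link_target(target: str) -> str:
    """Apply the same renaming to internal links of the form '#identifier'."""
    if target.startswith("#"):
        return "#" + valid_xml_id(target[1:])
    return target


# [anchor]{#1} and [link to](#1) from the example above.
print(valid_xml_id("1"), fix_link_target("#1"))
```

As the entry on ensureValidXmlIdentifiers notes, this scheme can collide with identifiers that already start with id_, which the sketch makes no attempt to detect.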

  • Lua (Albert Krewinkel).

    • Add module pandoc.layout to format and layout text.
    • Move custom writer code into Lua hierarchy.
    • Use pandoc-lua-marshal 0.1.5.
    • Allow any type of callable object as argument to List functions filter, map, and find_if. These previously required the argument to be of...