pandoc 2.2.1

@jgm jgm released this May 11, 2018 · 134 commits to master since this release

  • Restored and undeprecated gladtex for HTML math (#4607).

    • Added GladTeX constructor to Text.Pandoc.Options.HTMLMathMethod [API change, reverts removal in v2.2]
    • Restored and undeprecated --gladtex option, removed in v2.2.
  • LaTeX reader: Handle $ in \text{..} inside math (#4576).

  • Org reader: Fix image filename recognition (Albert Krewinkel). Use a function from the filepath library to check whether a string is a valid file name. The custom validity checker that was used before gave wrong results (e.g. for absolute file paths on Windows, kawabata/ox-pandoc#52).

  • FB2 reader: Replace some errors with warnings (Alexander Krotov).

  • HTML writer:

    • Strip links from headers when creating TOC (#4340). Otherwise the TOC entries will not link to the sections.
    • Fix regression with tex math environments in HTML + MathJax (#4639).
  • Muse writer (Alexander Krotov): Add support for left-align and right-align classes (#4542).

  • Docx writer: Support underline (#4633).

  • Text.Pandoc.Parsing: Lookahead for non-whitespace after singleQuoteStart and doubleQuoteStart (#4637).

  • test-pandoc-utils.lua: more robust testing on both Windows and *nix. Previously the pipe tests were only run if /bin/false and /bin/sed were present, which they aren’t in default MacOS and Windows systems. Fixed by using tr and false, which should always be in the path on a *nix system, and find and echo for Windows.

  • Text.Pandoc.Shared: add uriPathToPath. This adjusts the path from a file: URI in a way that is sensitive to Windows/Linux differences. Thus, on Windows, /c:/foo gets interpreted as c:/foo, but on Linux, /c:/foo gets interpreted as /c:/foo. See #4613.

  • Use uriPathToPath with file: URIs (#4613).
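
The platform-sensitive adjustment described above can be sketched in Python. This is only an illustration of the behaviour stated in the note, not pandoc's Haskell implementation: the function name `uri_path_to_path`, the explicit `windows` flag, and the drive-letter check are all assumptions made for the example.

```python
import re

# Hypothetical helper for illustration only; pandoc's real function is the
# Haskell uriPathToPath in Text.Pandoc.Shared.
_DRIVE_PREFIX = re.compile(r"^/[A-Za-z]:")


def uri_path_to_path(path: str, windows: bool) -> str:
    """Adjust the path component of a file: URI for the host platform."""
    if windows and _DRIVE_PREFIX.match(path):
        # On Windows, "/c:/foo" names the drive path "c:/foo".
        return path[1:]
    # On other systems the path is used unchanged.
    return path


print(uri_path_to_path("/c:/foo", windows=True))   # c:/foo
print(uri_path_to_path("/c:/foo", windows=False))  # /c:/foo
```

Passing the platform as a parameter, rather than detecting it, keeps the sketch testable on any system.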

  • Revert piping HTML to pdf-engine (Mauro Bieg, #4413). Use a temp file as before.

  • Text.Pandoc.Class: Catch IO errors when writing media files and issue a warning, rather than an error (Francesco Occhipinti, #4559).

  • Don’t lowercase custom writer filename (Alexander Krotov, #4610).

  • MANUAL (Mauro Bieg):

    • Clarify truthiness in template variables (#2281).
    • Clarify pipe table width calculation (#4520).
  • ConTeXt template: New Greek fallback typeface (Pablo Rodríguez, #4405). CMU Serif gives better typographic results than the previous Greek fallback DejaVu Serif.

  • Make HTML template polyglot (#4606, OvidiusCicero), by making <link rel="stylesheet" href="$css$"> self-closing.

  • Use texmath 0.11, allowing better translation of non-ASCII characters in math (#4642).

pandoc 2.2

@jgm jgm released this Apr 27, 2018 · 178 commits to master since this release

  • New input format: fb2 (FictionBook2) (Alexander Krotov).

  • Make --ascii work for all XML formats (ICML, OPML, JATS,…), and for ms and man.

  • Remove deprecated --latexmathml, --gladtex, --mimetex, --jsmath, -m, --asciimathml options.

  • New module Text.Pandoc.Readers.FB2, exporting readFB2 (Alexander Krotov, API change).

  • Markdown reader:

    • Allow empty key-value attributes, like title="" (#2944).
    • Handle table w/o following blank line in fenced div (#4560).
    • Remove “fallback” for doubleQuote parser. Previously the parser tried to be efficient – if no end double quote was found, it would just return the contents. But this could backfire in a case like **this should "be bold**, since the fallback would return the content "be bold** and the closing boldface delimiter would never be encountered.
    • Improve computation of the relative width of the last column in a multiline table, so we can round-trip tables without constantly shrinking the last column.
  • EPUB reader:

    • Fix images with space in file path (#4344).
  • LaTeX reader:

    • Properly resolve section numbers with \ref and chapters (#4529).
    • Parse sloppypar environment (#4517, Marc Schreiber).
    • Improve handling of raw LaTeX (for markdown etc.) (#4589, #4594). Previously there were some bugs in how macros were handled.
    • Support \MakeUppercase, \MakeLowercase, \uppercase, \lowercase, and also \MakeTextUppercase and \MakeTextLowercase from textcase (#4959).
  • Textile reader:

    • Fixed tables with no body rows (#4513). Previously these raised an exception.
  • Mediawiki reader:

    • Improve table parsing (#4508). This fixes detection of table attributes and also handles ! characters in cells.
  • DocBook reader:

    • Properly handle title in section element (#4526). Previously we just got section_title for section (though sect1, sect2, etc. were handled properly).
    • Read tex math as output by asciidoctor (#4569, Joe Hermaszewski).
  • Docx reader:

    • Combine adjacent CodeBlocks with the same attributes into a single CodeBlock. This prevents a multiline codeblock in Word from being read as different paragraphs.
  • RST reader:

    • Allow < 3 spaces indent under directives (#4579).
    • Fix anonymous redirects with backticks (#4598).
  • Muse reader (Alexander Krotov):

    • Add support for Text::Amuse multiline headings.
    • Add <math> tag support.
    • Add support for <biblio> and <play> tags.
    • Allow links to have empty descriptions.
    • Require block <literal> tags to be on separate lines.
    • Allow - in anchors.
    • Allow verse to be indented.
    • Allow nested footnotes.
    • Internal improvements.
  • Muse writer (Alexander Krotov):

    • Escape > only at the beginning of a line.
    • Escape ] in image title.
    • Escape ] brackets in URLs as %5D.
    • Only escape brackets when necessary.
    • Escape ordered list markers.
    • Do not escape list markers unless preceded by space.
    • Escape strings starting with space.
    • Escape semicolons and markers after line break.
    • Escape ; to avoid accidental comments.
    • Don’t break headers, line blocks and tables with line breaks.
    • Correctly output empty headings.
    • Escape horizontal rule only if at the beginning of the line.
    • Escape definition list terms starting with list markers.
    • Place header IDs before header.
    • Improve span writing.
    • Do not join Spans in normalization.
    • Don’t align ordered list items.
    • Remove key-value pairs from attributes before normalization.
    • Enable --wrap=preserve for all tests by default.
    • Reduced <verbatim> tags in output.
    • Internal changes.
  • RST writer:

    • Use more consistent indentation (#4563). Previously we used an odd mix of 3- and 4-space indentation. Now we use 3-space indentation, except for ordered lists, where indentation must depend on the width of the list marker.
    • Flatten nested inlines (#4368, Francesco Occhipinti). Nested inlines are not valid RST syntax, so we flatten them following some readability criteria discussed in #4368.
  • EPUB writer:

    • Ensure that pagetitle is always set, even when structured titles are used. This prevents spurious warnings about empty title elements (#4486).
  • FB2 writer (Alexander Krotov):

    • Output links inline instead of producing notes. Previously all links were turned into footnotes with unclickable URLs inside.
    • Allow emphasis and notes in titles.
    • Don’t intersperse paragraph with empty lines.
    • Convert metadata value abstract to book annotation.
    • Use <empty-line /> for HorizontalRule rather than LineBreak. FB2 does not have a way to represent line breaks inside paragraphs; previously we used <empty-line /> elements, but these are not allowed inside paragraphs.
  • Powerpoint writer (Jesse Rosenthal):

    • Handle Quoted Inlines (#4532).
    • Simplify code with ParseXml.
    • Allow fallback options when looking for placeholder type.
    • Check reference-doc for all layouts.
    • Simplify speaker notes logic.
    • Change notes state to a simpler per-slide value.
    • Remove Maybe from SpeakerNotes in Slide. mempty means no speaker notes.
    • Add tests for improved speaker notes.
    • Handle speaker notes earlier in the conversion process.
    • Keep notes with related blocks (#4477). Some blocks automatically split slides (imgs, tables, column divs). We assume that any speaker notes immediately following these are connected to these elements, and keep them with the related blocks, splitting after them.
    • Remove docProps/thumbnail.jpeg in data dir (Jesse Rosenthal, #4588). It contained a nonfree ICC color calibration profile and is not needed for production of a powerpoint document.
  • Markdown writer:

    • Include a blank line at the end of the row in a single-row multiline table, to prevent it from being interpreted as a simple table (#4578).
  • CommonMark writer:

    • Correctly ignore LaTeX raw blocks when raw_tex is not enabled (#4527, quasicomputational).
  • EPUB writer:

    • Add epub:type="footnotes" to notes section in EPUB3 (#4489).
  • LaTeX writer:

    • In beamer, don’t use format specifier for default ordered lists (#4556). This gives better results for styles that put ordered list markers in boxes or circles.
    • Update \lstinline delimiters (#4369, Tim Parenti).
  • Ms writer:

    • Use \f[R] rather than \f[] to reset font (#4552).
    • Use \f[BI] and \f[CB] in headers, instead of \f[I] and \f[C], since the header font is automatically bold (#4552).
    • Use \f[CB] rather than \f[BC] for monospace bold (#4552).
    • Create pdf anchor for a Div with an identifier (#4515).
    • Escape / character in anchor ids (#4515).
    • Improve escaping for anchor ids: we now use _uNNN_ instead of uNNN to avoid ambiguity.
  • Man writer:

    • Don’t escape U+2019 as ' (#4550).
  • Text.Pandoc.Options:

    • Removed JsMath, LaTeXMathML, and GladTeX constructors from Text.Pandoc.Options.HTMLMathMethod [API change].
  • Text.Pandoc.Class:

    • writeMedia: unescape URI-escaping in file path. This avoids writing things like file%20one.png to the file system.
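
The effect of unescaping can be seen with Python's standard library. This is a minimal sketch of the idea only; the helper name `media_file_name` is hypothetical, and pandoc's writeMedia is Haskell code that does more than this.

```python
from urllib.parse import unquote


def media_file_name(uri_path: str) -> str:
    """Undo URI percent-escaping before using a path on the file system."""
    return unquote(uri_path)


print(media_file_name("file%20one.png"))  # file one.png
```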
  • Text.Pandoc.Parsing:

    • Fix romanNumeral parser (#4480). We previously accepted ‘DDC’ as 1100.
    • uri: don’t treat * characters at end as part of URI (#4561).
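
To see why ‘DDC’ could come out as 1100, compare a lenient evaluation that simply combines symbol values with a parser that first checks that the numeral is well formed. This Python sketch is an assumption-laden illustration, not pandoc's romanNumeral parser: the names `lenient_value` and `parse_roman` and the validating regular expression are invented for the example.

```python
import re
from typing import Optional

VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Well-formed numerals only: at most one D, L and V, and subtractive
# pairs (CM, CD, XC, XL, IX, IV) only in their standard positions.
WELL_FORMED = re.compile(
    r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"
)


def lenient_value(numeral: str) -> int:
    """Combine symbol values without checking that the numeral is valid."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = VALUES[symbol]
        if i + 1 < len(numeral) and value < VALUES[numeral[i + 1]]:
            total -= value  # a smaller symbol before a larger one subtracts
        else:
            total += value
    return total


def parse_roman(numeral: str) -> Optional[int]:
    """Return the value of a well-formed numeral, or None otherwise."""
    if not numeral or not WELL_FORMED.fullmatch(numeral):
        return None
    return lenient_value(numeral)


print(lenient_value("DDC"))    # 1100
print(parse_roman("DDC"))      # None
print(parse_roman("MCMXCIV"))  # 1994
```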
  • Text.Pandoc.MIME:

    • Use the alias application/eps for EPS (#2067). This will ensure that we retain the eps extension after reading the image into a mediabag and writing it again.
  • Text.Pandoc.PDF:

    • Use withTempDir in html2pdf.
    • With xelatex, don’t compress images til the last run (#4484). This saves time for image-heavy documents.
    • Don’t try to convert EPS files (#2067). pdflatex converts them itself, and JuicyPixels can’t do it.
    • For pdflatex, use a temp directory in the working directory. Otherwise we can have problems with the EPS conversion pdflatex tries to do, which can’t operate on a file above the working directory without --shell-escape.
  • Changes to tests to accommodate changes in pandoc-types. In jgm/pandoc-types#36 we changed the table builder to pad cells. This commit changes tests (and two readers) to accord with this behavior.

  • Set default extensions for beamer same as latex.

  • LaTeX template:

    • Add beameroption variable (#4359, Étienne Bersac).
    • Use pgfpages package; this is needed for notes on second screen in beamer (Étienne Bersac).
    • Add background-image variable (#4601, John Muccigrosso).
  • reveal.js template: Add background-image variable (#4600, John Muccigrosso).

  • ms template: Fix date. Previously .ND was used, but this only works if you have a title page, which we don’t. Thanks to @teoric.

  • Removed pragmas for unused extensions (#4506, Anabra).

  • Fix bash completion for --print-default-data-file (#4549). Previously this looked in the filesystem, even if pandoc was compiled with embed_data_files (and sometimes it looked in a nonexistent build directory). Now the bash completion script just includes a hard-coded list of data file names.


  • MANUAL.txt:

    • Clarify template vs metadata variables (#4501, Mauro Bieg).
    • Fix raw content example (#4479, Mauro Bieg).
    • Specify that you use html for raw output in epub.
    • Add examples for raw docx blocks (#4472, Tristan Stenner). The documentation states that the target format name should match the output format, which isn’t the case for docx/openxml and some others.
    • Don’t say that empty_paragraphs affects markdown output (#4540).
    • Consolidate input/output format documentation (#4577, Mauro Bieg).
  • New README template. Take in/out formats from manual.

  • Fix example in lua-filters docs (#4459, HeirOfNorton).

  • Use the -threaded GHC flag when building benchmarks (#4587, Francesco Occhipinti).

  • Bump temporary upper bound to 1.4.

  • Use pandoc-citeproc

  • Use texmath- (fixes escapes in math in ms, #4597).

  • Removed old lib directory. This was used for something long ago, but plays no role now.

  • Removed unneeded data file LaTeXMathML.js.

  • Create 64- and 32-bit versions of Windows binary packages.

pandoc 2.1.3

@jgm jgm released this Mar 19, 2018 · 360 commits to master since this release

  • Docx reader (Jesse Rosenthal):

    • Add tests for nested smart tags.
    • Parse nested smart tags.
    • Make unwrapSDT into a general unwrap function that can unwrap both nested SDT tags and smartTags. This makes the SmartTags constructor in the Docx type unnecessary, so we remove it (#4446).
    • Remove unused docxWarnings (Alexander Krotov).
  • RST reader: Allow unicode bullet characters (#4454).

  • Haddock reader: Better table handling, using haddock-library’s new table support, if compiled against a version that includes it. Note that tables with col/rowspans will not translate well into Pandoc.

  • Muse reader (Alexander Krotov):

    • Various internal improvements.
    • Require closing tag to have the same indentation as opening.
    • Do not reparse blocks inside unclosed block tag (#4425).
    • Parse <class> tag (supported by Emacs Muse).
    • Do not produce empty Str element for unindented verse lines.
  • LaTeX reader:

    • Add support to parse unit string of \SI command (closes #4296, Marc Schreiber).
  • Haddock writer: In the writer, we now render tables always as grid tables, since Haddock supports these.

  • DokuWiki writer: rewrite backSlashLineBreaks (#4445, Mauro Bieg).

  • Docx writer: Fixed formatting of DefaultStyle ordered lists in docx writer. We want decimal for the top level, not lower roman.

  • RST writer:

    • Strip whitespace at beginning and ending of inline containers (#4327, Francesco Occhipinti).
    • Filter out empty inline containers (#4434). There is nothing in RST that corresponds to e.g. Emph [], so we just ignore elements like this (Francesco Occhipinti).
  • Muse writer (Alexander Krotov):

    • Support spans with anchors.
    • Replace smallcaps with emphasis before normalization.
    • Output smallcaps as emphasis.
    • Expand Cite before list normalization.
    • Write empty inline lists as <verbatim></verbatim>.
    • Remove empty Str from the beginning of inline lists during normalization.
    • Escape “-” to avoid creating bullet lists.
    • Fix math expansion for more than one expression per paragraph.
    • Expand math before inline list normalization.
  • Dokuwiki writer: fix LineBreaks in Tables (#4313, Mauro Bieg).

  • Ms writer:

    • Asciify pdf anchors, since unicode anchors don’t work (#4436). Internal links should be converted automatically, so this shouldn’t affect users directly.
    • Don’t escape hyphens as \-; that’s for a minus sign (#4467).
  • Beamer writer: put hyperlink after \begin{frame} and not in the title (#4307). If it’s in the title, then we get a titlebar on slides with the plain attribute, when the id is non-null. This fixes a regression in 2.0.

  • EPUB writer: Remove notes from TOC in nav.xhtml (#4453, Mauro Bieg).

  • JATS writer: Remove extraneous, significant whitespace (#4335, Nokome Bentley).

  • html2pdf: inject base tag with current working directory (#4413, Mauro Bieg). This helps ensure that linked resources are included.

  • Add Semigroup instances for everything for which we defined a Monoid instance previously (API change):

    • Text.Pandoc.Class.FileTree.
    • Text.Pandoc.Translations.Translations.
    • Text.Pandoc.Extensions.Extensions.
    • Text.Pandoc.Readers.Odt.StyleReader.Styles.
    • Text.Pandoc.Pretty.Doc.
    • Text.Pandoc.MediaBag.MediaBag.
  • Add custom Prelude to give clean code for Monoid and Semigroup that works with ghc 7.10-8.4. The custom Prelude (prelude/Prelude) is used for ghc versions < 8.4. NoImplicitPrelude is used in all source files, and Prelude is explicitly imported (this is necessary for ghci to work properly with the custom prelude).

  • Text.Pandoc.Writers.Shared (Francesco Occhipinti):

    • Export stripLeadingTrailingSpace.
    • Don’t wrap lines in grid tables when --wrap=none (#4320).
    • gridTable: Don’t wrap lines in tables when --wrap=none. Instead, expand cells, even if it results in cells that don’t respect relative widths or surpass page column width. This change affects RST, Markdown, and Haddock writers.
  • Raise error if someone tries to print docx, odt, etc. template (#4441).

  • LaTeX template: Provide bidi package’s option using \PassOptionsToPackage (#4357, Václav Haisman). This avoids a clash when polyglossia loads it first and then it is loaded again for XeLaTeX.

  • ConTeXt template: Added pdfa variable to generate PDF/A (#4294, Henri Menke). Instructions on how to install the ICC profiles on ConTeXt standalone can be found in the wiki. If the ICC profiles are not available, the log will contain error messages.

  • Use latest pandoc-types, skylighting

  • Use latest pandoc-citeproc in binary package.

  • Bump upper bound for time, criterion, haddock-library, exceptions, http-types, aeson.

  • Bump upper bound tasty-quickcheck 0.10 (#4429, Felix Yan).

  • pandoc.cabal: fix up other-extensions and language fields. Language is now consistently Haskell2010, and other-extensions is consistently NoImplicitPrelude. Everything else to be specified in the module header as needed.

  • Removed old-locale flag and Text.Pandoc.Compat.Time. This is no longer necessary since we no longer support ghc 7.8.

  • Make weigh-pandoc into a benchmark program. Remove weigh-pandoc flag. weigh-pandoc is now built (and run) automatically when you build (and run) benchmarks.

  • MANUAL: add instructions for background images reveal.js (#4325, John Muccigrosso).

  • appveyor: use VS 2013 environment instead of VS 2015 for Windows builds.

pandoc 2.1.2

@jgm jgm released this Mar 3, 2018 · 453 commits to master since this release

  • Markdown reader:

    • Fix parsing bug with nested fenced divs (#4281). Previously we allowed “nonindent spaces” before the opening and closing :::, but this interfered with list parsing, so now we require the fences to be flush with the margin of the containing block.
  • Commonmark reader:

    • raw_html is now on by default. It can be disabled explicitly using -f commonmark-raw_html.
  • Org reader (Albert Krewinkel):

    • Move citation tests to separate module.

    • Allow changing emphasis syntax (#4378). The characters allowed before and after emphasis can be configured via #+pandoc-emphasis-pre and #+pandoc-emphasis-post, respectively. This makes it possible to change which strings are recognized as emphasized text on a per-document or even per-paragraph basis. Example:

      #+pandoc-emphasis-pre: "-\t ('\"{"
      #+pandoc-emphasis-post: "-\t\n .,:!?;'\")}["
  • LaTeX reader:

    • Fixed comments inside citations (#4374).
    • Fix regression in package options including underscore (#4424).
    • Make --trace work.
    • Fixed parsing of tabular* environment (#4279).
  • RST reader:

    • Fix regression in parsing of headers with trailing space (#4280).
  • Muse reader (Alexander Krotov):

    • Enable <literal> tags even if amuse extension is enabled. Amusewiki disables tags for security reasons. If user wants similar behavior in pandoc, RawBlocks and RawInlines can be removed or replaced with filters.
    • Remove space prefix from <literal> tag contents.
    • Do not consume whitespace while looking for closing end tag.
    • Convert alphabetical list markers to decimal in round-trip test. Alphabetical lists are an addition of Text::Amuse. They are not present in Emacs Muse and can be ambiguous when list starts with “i.”, “c.” etc.
    • Allow <quote> and other tags to be indented.
    • Allow single colon in definition list term.
    • Fix parsing of verse in lists.
    • Improved parsing efficiency. Avoid parseFromString. Lists are parsed in linear instead of exponential time now.
    • Replace ParserState with MuseState.
    • Prioritize lists with roman numerals over alphabetical lists. This is to make sure “i.” starts a roman numbered list, instead of a list with letter “i” (followed by “j”, “k”, …).
    • Fix directive parsing.
    • Parse definition lists with multiple descriptions.
    • Parse next list item before parsing more item contents.
    • Fixed a bug: headers did not terminate lists.
    • Move indentation parsing from definitionListItem to definitionList.
    • Paragraph indentation does not indicate nested quote. Muse allows indentation to indicate quotation or alignment, but only on the top level, not within a <quote> or list.
    • Require that block tags are on separate lines. Text::Amuse already explicitly requires it anyway.
    • Fix matching of closing inline tags.
    • Various internal changes.
    • Fix parsing of nested definition lists.
    • Require only one space for nested definition list indentation.
    • Do not remove trailing whitespace from <code>.
    • Fix parsing of trailing whitespace. Newline after whitespace now results in softbreak instead of space.
  • Docx reader (Jesse Rosenthal, except where noted):

    • Handle nested sdt tags (#4415).
    • Don’t look up dependent run styles if +styles is enabled.
    • Move pandoc inline styling inside custom-style span.
    • Read custom styles (#1843). This will read all paragraph and character classes as divs and spans, respectively. Dependent styles will still be resolved, but will be wrapped with appropriate style tags. It is controlled by the +styles extension (-f docx+styles). This can be used in conjunction with the custom-style feature in the docx writer for a pandoc-docx editing workflow. Users can convert from an input docx, reading the custom-styles, and then use that same input docx file as a reference-doc for producing an output docx file. Styles will be maintained across the conversion, even if pandoc doesn’t understand them.
    • Small change to Fields hyperlink parser. Previously, unquoted string required a space at the end of the line (and consumed it). Now we either take a space (and don’t consume it), or end of input.
    • Pick table width from the longest row or header (Francesco Occhipinti, #4360).
  • Muse writer (Alexander Krotov):

    • Change verse markup: > instead of <verse> tag.
    • Remove empty strings during inline normalization.
    • Don’t indent nested definition lists.
    • Use unicode quotes for quoted text.
    • Write image width specified in percent in Text::Amuse mode.
    • Don’t wrap displayMath into <verse>.
    • Escape nonbreaking space (~~).
    • Join code with different attributes during normalization.
    • Indent lists inside Div.
    • Support definitions with multiple descriptions.
  • Powerpoint writer (Jesse Rosenthal):

    • Use table styles. This will use the default table style in the reference-doc file. As a result, tables will be easier to use in a template and will match the color scheme.
    • Remove empty slides. Because of the way that slides were split, these could be accidentally produced by comments after images. When animations are added, there will be a way to add an empty slide with either incremental lists or pauses.
    • Implement syntax highlighting. Note that background colors can’t be implemented in PowerPoint, so highlighting styles that require these will be incomplete.
    • New test framework for pptx. We now compare the output of the Powerpoint writer with files that we know to (a) not be corrupt, and (b) to show the desired output behavior (details below).
    • Add notesMaster to presentation.xml if necessary.
    • Ignore links and (end)notes in speaker notes.
    • Output speaker notes.
    • Read speaker note templates conditionally. If there are speaker notes in the presentation, we read in the notesMasters templates from the reference pptx file.
    • Fix deletion track changes (#4303, Jesse Rosenthal).
  • Markdown writer: properly escape @ to avoid capture as citation (#4366).

  • LaTeX writer:

    • Put hypertarget inside figure environment (#4388). This works around a problem with the endfloat package and makes pandoc’s output compatible with it.
    • Fix image height with percentage (#4389). This previously caused the image to be resized to a percentage of textwidth, rather than textheight.
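
The corrected mapping for percentage dimensions can be illustrated as follows. This is a rough Python sketch of the rule stated above (width relative to \textwidth, height relative to \textheight); the function name `latex_image_option` and the exact output format are assumptions for the example, not pandoc's writer code.

```python
def latex_image_option(attribute: str, value: str) -> str:
    """Render a width or height attribute as an \\includegraphics option."""
    # Each attribute scales against its own reference length.
    reference = {"width": r"\textwidth", "height": r"\textheight"}[attribute]
    if value.endswith("%"):
        fraction = float(value[:-1]) / 100
        return f"{attribute}={fraction:g}{reference}"
    # Absolute dimensions such as "3cm" pass through unchanged.
    return f"{attribute}={value}"


print(latex_image_option("height", "50%"))  # height=0.5\textheight
print(latex_image_option("width", "50%"))   # width=0.5\textwidth
```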
  • ConTeXt writer (Henri Menke):

    • New section syntax and support --section-divs (#2609). \section[my-header]{My Header} -> \section[title={My Header},reference={my-header}]. The ConTeXt writer now supports the --section-divs option to write sections in the fenced style, with \startsection and \stopsection.
    • xtables: correct wrong usage of caption (Henri Menke).
  • Docx writer:

    • Fix image resizing with multiple images (#3930, Andrew Pritchard).
    • Use new golden framework (Jesse Rosenthal).
    • Make more deterministic to facilitate testing (Jesse Rosenthal).
      • getUniqueId now calls to the state to get an incremented digit, instead of calling to P.uniqueHash.
      • we always start the PRNG in mkNumbering/mkAbstractNum with the same seed (1848), so our randoms should be the same each time.
    • Fix ids in comment writing (Jesse Rosenthal). Comments from --track-changes=all were producing corrupt docx, because the writer was trying to get id from the (ID,_,_) field of the attributes, and ignoring the “id” entry in the key-value pairs. We now check both.
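
The two determinism techniques mentioned above (an incrementing counter for ids, and a fixed seed for pseudo-random values) can be sketched in Python. This is an illustration under stated assumptions, not the docx writer's code: `numbering_ids`, `fresh_id`, and the value range are invented for the example, and only the seed 1848 comes from the note.

```python
import random
from itertools import count

SEED = 1848  # the fixed seed mentioned in the note


def numbering_ids(n: int) -> list:
    """Draw n pseudo-random ids from a generator that always starts at SEED."""
    rng = random.Random(SEED)
    return [rng.randrange(0x10000000, 0xFFFFFFFF) for _ in range(n)]


# A counter kept in state yields 1, 2, 3, ... on every run, unlike a hash
# of a freshly allocated object.
_counter = count(1)


def fresh_id() -> int:
    return next(_counter)


# Same seed, same sequence: output is reproducible across runs.
print(numbering_ids(3) == numbering_ids(3))  # True
```

Reproducible output is what makes it practical to compare generated files against stored reference files in tests.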
  • Ms writer: Added papersize variable.

  • TEI writer:

    • Use height instead of depth for images (#4331).
    • Ensure that id prefix is always used.
    • Don’t emit role attribute; that was a leftover from the Docbook writer.
    • Use ‘xml:id’, not ‘id’ attribute (#4371).
  • AsciiDoc writer:

    • Do not output implicit heading IDs (#4363, Alexander Krotov). Convert to asciidoc-auto_identifiers for old behaviour.
  • RST writer:

    • Remove blockToRST' moving its logic into fixBlocks (Francesco Occhipinti).
    • Insert comment between lists and quotes (#4248, Francesco Occhipinti).
  • RST template: remove definition of ‘math’ role as raw. This used to be needed prior to v 0.8 of docutils, but now math support is built-in.

  • Slides: Use divs to set incremental/non-incremental (#4381, Jesse Rosenthal). The old method (list inside blockquote) still works, but we are encouraging the use of divs with class incremental or nonincremental.

  • Text.Pandoc.ImageSize:

    • Make image size detection for PDFs more robust (#4322).
    • Determine image size for PDFs (#4322).
    • EMF Image size support (#4375, Andrew Pritchard).
  • Text.Pandoc.Extensions:

    • Add Ext_styles (Jesse Rosenthal, API change). This will be used in the docx reader (defaulting to off) to read paragraph and character styles not understood by pandoc (as divs and spans, respectively).
    • Made Ext_raw_html default for commonmark format.
  • Text.Pandoc.Parsing:

    • Export manyUntil (Alexander Krotov, API change).
    • Export improved sepBy1 (Alexander Krotov).
    • Export list marker parsers: upperRoman, lowerRoman, decimal, lowerAlpha, upperAlpha (Alexander Krotov, API change).
  • Tests/Lua: fix tests on windows (Albert Krewinkel).

  • Lua: register script name in global variable (#4393). The name of the Lua script which is executed is made available in the global Lua variable PANDOC_SCRIPT_FILE, both for Lua filters and custom writers.

  • Tests: Abstract powerpoint tests out to OOXML tests (Jesse Rosenthal). There is very little pptx-specific in these tests, so we abstract out the basic testing function so it can be used for docx as well. This should allow us to catch some errors in the docx writer that slipped by the roundtrip testing.

  • Lua filters: store constructors in registry (Albert Krewinkel). Lua functions used to construct AST element values are stored in the Lua registry for quicker access. Getting a value from the registry is much faster than getting a global value (partly due to idiosyncrasies of hslua); this change results in a considerable performance boost.

  • Documentation:

    • doc/ Add draft of Org-mode documentation (Albert Krewinkel).
    • doc/ document global vars set for filters (Albert Krewinkel).
    • mention Stack version. (#4343, Adam Brandizzi).
    • MANUAL: add documentation on custom styles (Jesse Rosenthal).
    • MANUAL.txt: Document incremental and nonincremental divs (Jesse Rosenthal). Blockquoted lists are still described, but fenced divs are presented in preference.
    • MANUAL.txt: document header and footer variables (newmana).
    • MANUAL.txt: self-contained implies standalone (#4304, Daniel Lublin).
    • label was renamed. (#4310, Alexander Brandizzi).
  • Require tagsoup 0.14.3 (#4282), fixing HTML tokenization bug.

  • Use latest texmath.

  • Use latest pandoc-citeproc.

  • Allow exceptions 0.9.

  • Require aeson-pretty 0.8.5 (#4394).

  • Bump blaze-markup, blaze-html lower bounds to 0.8, 0.9 (#4334).

  • Update tagsoup to 0.14.6 (Alexander Krotov, #4282).

  • Removed ghc-prof-options. As of cabal 1.24, sensible defaults are used.

  • Update default.nix to current nixpkgs-unstable for hslua-0.9.5 (#4348, jarlg).

pandoc 2.1.1

@jgm jgm released this Jan 18, 2018 · 659 commits to master since this release

  • Markdown reader:

    • Don’t coalesce adjacent raw LaTeX blocks if they are separated by a blank line. See lierdakil/pandoc-crossref#160.
    • Improved inlinesInBalancedBrackets (#4272, jgm/pandoc-citeproc#315). The change both improves performance and fixes a regression whereby normal citations inside inline notes and figure captions were not parsed correctly.
  • RST reader:

    • Better handling for headers with an anchor (#4240). Instead of creating a Div containing the header, we put the id directly on the header. This way header promotion will work properly.
    • Add aligned environment when needed in math (#4254). rst2latex.py uses an align* environment for math in .. math:: blocks, so this math may contain line breaks. If it does, we put the math in an aligned environment to simulate rst2latex.py’s behavior.
  • HTML reader:

    • Fix col width parsing for percentages < 10% (#4262, n3fariox).
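
For reference, a percentage width parser that handles one-digit values might look like the Python sketch below. The note does not say what caused the original bug, so this only shows the expected results; `relative_col_width` and its regular expression are hypothetical and unrelated to the HTML reader's actual code.

```python
import re
from typing import Optional

# One or more digits, optional decimal part, then a percent sign.
_PERCENT = re.compile(r"\s*(\d+(?:\.\d+)?)\s*%\s*")


def relative_col_width(width: str) -> Optional[float]:
    """Convert a width such as "5%" to a fraction of the table width."""
    match = _PERCENT.fullmatch(width)
    if match is None:
        return None  # not a percentage (for example "10em")
    return float(match.group(1)) / 100


print(relative_col_width("5%"))   # 0.05
print(relative_col_width("25%"))  # 0.25
```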
  • LaTeX reader:

    • Advance source position at end of stream.

    • Pass through macro defs in rawLaTeXBlock even if the latex_macros extension is set (#4246). This reverts to earlier behavior and is probably safer on the whole, since some macros only modify things in included packages, which pandoc’s macro expansion can’t modify.

    • Fixed pos calculation in tokenizing escaped space.

    • Allow macro definitions inside macros (#4253). Previously such input sent the reader into an infinite loop.

    • Fix inconsistent column widths (#4238). This fixes a bug whereby column widths for the body were different from widths for the header in some tables.

  • Docx reader (Jesse Rosenthal):

    • Parse hyperlinks in instrText tags (#3389, #4266). This was a form of hyperlink found in older versions of word. The changes introduced for this, though, create a framework for parsing further fields in MS Word (see the spec, ECMA-376-1:2016, §17.16.5, for more on these fields). We introduce a new module, Text.Pandoc.Readers.Docx.Fields which contains a simple parsec parser. At the moment, only simple hyperlink fields are accepted, but that can be extended in the future.
  • Muse reader (Alexander Krotov):

    • Parse ~~ as non-breaking space in Text::Amuse mode.
    • Refactor list parsing.
  • Powerpoint writer (Jesse Rosenthal):

    • Change reference to notesSlide to endNotesSlide.
    • Move image sizing into picProps.
    • Improve table placement.
    • Make our own _rels/.rels file.
    • Import reference-doc images properly.
    • Move Presentation.hs out of PandocMonad.
    • Refactor into separate modules. T.P.W.Powerpoint.Presentation defines the Presentation datatype and goes Pandoc->Presentation; T.P.W.Powerpoint.Output goes Presentation->Archive. Text.Pandoc.Writers.Powerpoint is a thin wrapper around the two modules.
    • Avoid overlapping blocks in column output.
    • Position images correctly in two-column layout.
    • Make content shape retrieval environment-aware.
    • Improve image handling. We now determine image and caption placement by getting the dimensions of the content box in a given layout. This allows for images to be correctly sized and positioned in a different template. Note that images without captions and headers are no longer full-screened. We can’t do this dependably in different layouts, because we don’t know where the header is (it could be to the side of the content, for example).
    • Read presentation size from reference file. Our presentation size is now dependent on the reference/template file we use.
    • Handle (sub)headers above slidelevel correctly. Above the slidelevel, subheaders will be printed in bold and given a bit of extra space before them. Note that at the moment, no distinction is made between levels of headers above the slide header, though that can be changed.
    • Check for required files. Since we now import from reference/dist file by glob, we need to make sure that we’re getting the files we need to make a non-corrupt Powerpoint. This performs that check.
    • Improve templating using --reference-doc. Templating should work much more reliably now.
    • Include Notes slide in TOC.
    • Set notes slide header to slide-level.
    • Add table of contents. This is triggered by the --toc flag. Note that in a long slide deck this risks overrunning the text box. The user can address this by setting --toc-depth=1.
    • Set notes slide number correctly.
    • Clean up adding metadata slide. We want to count the slide numbers correctly if it’s in there.
    • Add anchor links. For anchor-type links ([foo](#bar)) we produce an anchor link. In powerpoint these are links to slides, so we keep track of a map relating anchors to the slides they occur on.
    • Make the slide number available to the blocks. For anchors, block-processing functions need to know what slide number they’re in. We make the envCurSlideId available to blocks.
    • Move curSlideId to environment.
    • Allow setting toc-title in metadata.
    • Link notes to endnotes slide.
  • Markdown writer:

    • Fix cell width calculation (#4265). Previously we could get ever-lengthening cell widths when a table was run repeatedly through pandoc -f markdown -t markdown.
  • LaTeX writer:

    • Escape & in lstinline (Robert Schütz).
  • ConTeXt writer:

    • Use xtables instead of Tables (#4223, Henri Menke). Default to xtables for context output. Natural Tables are used if the new ntb extension is set.
  • HTML writer:

    • Fixed footnote backlinks with --id-prefix (#4235).
  • Text.Pandoc.Extensions: Added Ext_ntb constructor (API change, Henri Menke).

  • Text.Pandoc.ImageSize: add derived Eq instance to Dimension (Jesse Rosenthal, API change).

  • Lua filters (Albert Krewinkel):

    • Make PANDOC_READER_OPTIONS available. The options which were used to read the document are made available to Lua filters via the PANDOC_READER_OPTIONS global.
    • Add lua module pandoc.utils.run_json_filter, which runs a JSON filter on a Pandoc document.
    • Refactor filter-handling code into Text.Pandoc.Filter.JSON, Text.Pandoc.Filter.Lua, and Text.Pandoc.Filter.Path.
    • Improve error messages. Provide more context about the task which caused an error.
  • data/pandoc.lua (Albert Krewinkel):

    • Accept singleton inline as a list. Every constructor which accepts a list of inlines now also accepts a single inline element for convenience.
    • Accept single block as singleton list. Every constructor which accepts a list of blocks now also accepts a single block element for convenience. Furthermore, strings are accepted as shorthand for {pandoc.Str "text"} in constructors.
    • Add attr, listAttributes accessors. Elements with attributes got an additional attr accessor. Previously, attributes were accessible only via the identifier, classes, and attributes fields, which conflicted with the documentation, which indirectly states that such elements have an attr property.
    • Drop _VERSION. Having a _VERSION became superfluous, as this module is closely tied to the pandoc version, which is available via PANDOC_VERSION.
    • Fix access to Attr components. Accessing an Attr value (e.g., Attr().classes) was broken; the more common case of accessing it via an Inline or Block element was unaffected by this.
  • Move metaValueToInlines from Docx writer to Text.Pandoc.Writers.Shared, so it can be used by other writers (Jesse Rosenthal).

  • MANUAL.txt:

    • Clarify otherlangs in LaTeX (#4072).
    • Clarify latex_macros extension.
    • Recommend use of raw_attribute extension in header includes (#4253).
  • Allow latest QuickCheck, tasty, criterion.

  • Remove custom prelude and ghc 7.8 support.

  • Reduce compiler noise (exact paths for compiled modules).

pandoc 2.1

@jgm jgm released this Jan 8, 2018 · 764 commits to master since this release

  • Allow filters and lua filters to be interspersed (#4196). Previously we ran all lua filters before JSON filters. Now we run filters in the order they are presented on the command line, whether lua or JSON. There are two incompatible API changes: The type of applyFilters has changed, and applyLuaFilters has been removed. Filter is also now exported.

  • Use latest skylighting and omit the missingIncludes check, fixing a major performance regression in earlier releases of the 2.x series (#4226). Behavior change: If you use a custom syntax definition that refers to a syntax you haven’t loaded, pandoc will now complain when it is highlighting the text, rather than doing a check at the start. This change dramatically speeds up invocations of pandoc on short inputs.

  • Text.Pandoc.Class: make FileTree opaque (don’t export FileTree constructor). This forces users to interact with it using insertInFileTree and getFileInfo, which normalize file names.

  • Markdown reader:

    • Rewrite inlinesInBalancedBrackets. The rewrite is much more direct, avoiding parseFromString. And it performs significantly better; unfortunately, parsing time still increases exponentially (see #1735).
    • Avoid parsing raw tex unless \ + letter seen. This seems to help with the performance problem, #4216.
  • LaTeX reader: Simplified a check for raw tex command.

  • Muse reader (Alexander Krotov):

    • Enable round trip test (#4107).
    • Automatically translate #cover into #cover-image. Amusewiki uses #cover directive to specify cover image.
  • Docx reader (Jesse Rosenthal):

    • Allow for insertion/deletion of paragraphs (#3927). If the paragraph has a deleted or inserted paragraph break (depending on the track-changes setting) we hold onto it until the next paragraph. This takes care of accept and reject. For this we introduce a new state which holds the ils from the previous para if necessary. For --track-changes=all, we add an empty span with class paragraph-insertion/paragraph-deletion at the end of the paragraph prior to the break to be inserted or deleted.
    • Remove unused anchors (#3679). Docx produces a lot of anchors with nothing pointing to them—we now remove these to produce cleaner output. Note that this has to occur at the end of the process because it has to follow link/anchor rewriting.
    • Read multiple children of w:sdtContents.
    • Combine adjacent anchors. There isn’t any reason to have numerous anchors in the same place, since we can’t maintain docx’s non-nesting overlapping. So we reduce to a single anchor.
    • Improved tests.
  • Muse writer (Alexander Krotov): don’t escape URIs from AST

  • Docx writer:

    • Removed redundant subtitle in title (Sebastian Talmon).
    • firstRow table definition compatibility for Word 2016 (Sebastian Talmon). Word 2016 seems to use a default value of “1” for table headers if there is no firstRow definition (although a default value of 0 is documented), so all tables get the first row formatted as a header. Setting the parameter to 0 if the table has no header row fixes this for Word 2016.
    • Fix custom styles with spaces in the name (#3290).
  • Powerpoint writer (Jesse Rosenthal):

    • Ignore Notes div for parity with other slide outputs.
    • Set default slidelevel correctly. We had previously defaulted to slideLevel 2. Now we use the correct behavior of defaulting to the highest level header followed by content. We change an expected test result to match this behavior.
    • Split blocks correctly for linked images.
    • Combine adjacent runs.
    • Make inline code inherit code size. Previously (a) the code size wasn’t set when we force size, and (b) the properties were set from the default, instead of inheriting.
    • Simplify replaceNamedChildren function.
    • Allow linked images. The following markdown: [![Image Title](image.jpg)]( will now produce a linked image in the resulting PowerPoint file.
    • Fix error with empty table cell. We require an empty <a:p> tag, even if the cell contains no paragraphs—otherwise PowerPoint complains of corruption.
    • Implement two-column slides. This uses the columns/column div format described in the pandoc manual. At the moment, only two columns (half the screen each) are allowed. Custom widths are not supported.
    • Added more tests.
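The default slide level rule mentioned above ("the highest level header followed by content") can be illustrated with a small sketch. This is written in Python for illustration only and is not pandoc's implementation: the function name default_slide_level, the tuple encoding of blocks, and the fallback value of 1 when no header is followed by content are all assumptions.

```python
def default_slide_level(blocks):
    """Pick a slide level from a flat list of blocks.

    Each block is a hypothetical tuple: ("header", level) or ("content",).
    """
    # Collect the level of every header that is directly followed by non-header content.
    candidates = [
        block[1]
        for block, following in zip(blocks, blocks[1:])
        if block[0] == "header" and following[0] != "header"
    ]
    # "Highest" level means the smallest level number.
    # Assumption: fall back to 1 when no header is followed by content.
    return min(candidates, default=1)


doc = [("header", 1), ("header", 2), ("content",), ("header", 2), ("content",)]
print(default_slide_level(doc))
```

In this example the level-1 header is followed only by another header, so the level-2 headers are the highest ones that introduce content.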
  • OpenDocument/ODT writers: improved rendering of formulas (#4170, oltolm).

  • Lua filters (Albert Krewinkel):

    • data/pandoc.lua: drop ‘pandoc-api-version’ from Pandoc objects

    • The current pandoc-types version is made available to Lua programs in the global PANDOC_API_VERSION. It contains the version as a list of numbers.

    • The pandoc version is available as a global PANDOC_VERSION (a list of numbers).

    • data/pandoc.lua: make Attr an AstElement.

    • data/pandoc.lua: make all types subtypes of AstElement. Pandoc, Meta, and Citation were just plain functions and did not set a metatable on the returned value, which made it difficult to amend objects of these types with new behavior. They are now subtypes of AstElement, meaning that all their objects can gain new features when a method is added to the behavior object (e.g., pandoc.Pandoc.behavior).

    • data/pandoc.lua: split type and behavior tables. Clearly distinguish between a type and the behavioral properties of an instance of that type. The behavior of a type (and all its subtypes) can now be amended by adding methods to that type’s behavior object, without exposing the type object’s internals. E.g.:

      pandoc.Inline.behavior.frob = function () print'42' end
      local str = pandoc.Str'hello'
      str.frob() -- outputs '42'
    • data/pandoc.lua: fix Element inheritance. Extending all elements of a given type (e.g., all inline elements) was difficult, as the table used to look up unknown methods would be reset every time a new element of that type was created, preventing recursive property lookup. This was changed so that all methods and attributes of supertypes are now available to their subtypes.

    • data/pandoc.lua: fix attribute names of Citation (#4222). The fields were named like the Haskell fields, not like the documented, shorter version. The names are changed to match the documentation and Citations are given a shared metatable to enable simple extensibility.

    • data/pandoc.lua: drop function pandoc.global_filter.

    • Bump hslua version to 0.9.5. This version fixes a bug that made it difficult to handle failures while getting lists or a Map from Lua. A bug in pandoc, which made it necessary to always pass a tag when using MetaList or MetaBlock, is fixed as a result. Using the pandoc module’s constructor functions for these values is now optional (if still recommended).

    • Stop exporting pushPandocModule (API change). The introduction of runPandocLua renders direct use of this function obsolete.

    • Update generation of module docs for lua filters.

    • Lua.Module.Utils: make stringify work on MetaValues (John MacFarlane). I’m sure this was intended in the first place, but currently only Meta is supported.

  • Improve benchmarks.

    • Set the default extensions properly.
    • Improve benchmark argument parsing. You can now say make bench BENCHARGS="markdown latex reader" and both the markdown and latex readers will be benchmarked.
  • MANUAL.txt simplify and add more structure (Mauro Bieg).

  • Generate from template and MANUAL.txt. make will rebuild the generated file after changes to MANUAL.txt have been made.

  • Update copyright notices to include 2018 (Albert Krewinkel).

pandoc 2.0.6

@jgm jgm released this Dec 29, 2017 · 862 commits to master since this release

  • Added jats as an input format.

    • Add Text.Pandoc.Readers.JATS, exporting readJATS (API change) (Hamish Mackenzie).
    • Improved citation handling in JATS reader. JATS citations are now converted to pandoc citations, and JATS ref-lists are converted into a references field in metadata, suitable for use with pandoc-citeproc. Thus a JATS article with embedded bibliographic information can be processed with pandoc and pandoc-citeproc to produce a formatted bibliography.
  • Allow --list-extensions to take an optional FORMAT argument. This lists the extensions set by default for the selected FORMAT. The extensions are now alphabetized, and the + or - indicating the default setting comes before, rather than after, the extension.

  • Markdown reader:

    • Preserve original whitespace between blocks.
    • Recognize \placeformula as context.
    • Be pickier about table captions. A caption starts with a : which can’t be followed by punctuation. Otherwise we can falsely interpret the start of a fenced div, or even a table header line like :--:|:--:, as a caption.
    • Always use four space rule for example lists. It would be awkward to indent example list contents to the first non-space character after the label, since example list labels are often long. Thanks to Bernhard Fisseni for the suggestion.
    • Improve raw tex parsing. Note that the Markdown reader is also affected by the latex_macros extension changes described below under the LaTeX reader.
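The stricter caption rule described above (a leading : that must not be followed by punctuation) can be sketched as follows. This is an illustrative Python sketch, not the reader's actual parser; the function name looks_like_caption_start is hypothetical, and only the colon form of captions is considered.

```python
import string


def looks_like_caption_start(line):
    """Return True if the line could begin a table caption in the ':' form."""
    stripped = line.lstrip()
    if not stripped.startswith(":"):
        return False
    rest = stripped[1:]
    # Reject a colon that is immediately followed by punctuation, so that
    # fenced div openers (":::") and header lines (":--:|:--:") are not captions.
    return not (rest and rest[0] in string.punctuation)


for candidate in [": Sample caption", "::: warning", ":--:|:--:"]:
    print(repr(candidate), looks_like_caption_start(candidate))
```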
  • LaTeX reader:

    • latex_macros extension changes (#4179). Don’t pass through macro definitions themselves when latex_macros is set. The macros have already been applied. If latex_macros is enabled, then rawLaTeXBlock in Text.Pandoc.Readers.LaTeX will succeed in parsing a macro definition, and will update pandoc’s internal macro map accordingly, but the empty string will be returned.
    • Export tokenize, untokenize (API change).
    • Use applyMacros in rawLaTeXBlock, rawLaTeXInline.
    • Refactored inlineCommand.
    • Fix bug in tokenizer. Material following ^^ was dropped if it wasn’t a character escape. This only affected invalid LaTeX, so we didn’t see it in the wild, but it appeared in a QuickCheck test failure.
    • Fix regression in LaTeX tokenization (#4159). This mainly affects the Markdown reader when parsing raw LaTeX with escaped spaces.
    • Add tests of LaTeX tokenizer.
    • Support \foreignlanguage from babel.
    • Be more tolerant of & character (#4208). This allows us to parse unknown tabular environments as raw LaTeX.
  • Muse reader (Alexander Krotov):

    • Parse anchors immediately after headings as IDs.
    • Require that note references do not start with 0.
    • Parse empty comments correctly.
  • Org reader (Albert Krewinkel):

    • Fix asterisks-related parsing error (#4180).
    • Support minlevel option for includes (#4154). The level of headers in included files can be shifted to a higher level by specifying a minimum header level via the :minlevel parameter. E.g. #+include: "" :minlevel 1 will shift the headers in the included file such that the topmost headers become level 1 headers.
    • Break-up org reader test file into multiple modules.
  • OPML reader:

    • Enable raw HTML and other extensions by default for notes (#4164). This fixes a regression in 2.0. Note that extensions can now be individually disabled, e.g. -f opml-smart-raw_html.
  • RST reader:

    • Allow empty list items (#4193).
    • More accurate parsing of references (#4156). Previously we erroneously included the enclosing backticks in a reference ID (#4156). This change also disables interpretation of syntax inside references, as in docutils. So, there is no emphasis in `my *link*`_.
  • Docx reader:

    • Continue lists after interruption (#4025, Jesse Rosenthal). Docx expects that lists will continue where they left off after an interruption and introduces a new id if a list is starting again. So we keep track of the state of lists and use them to define a “start” attribute, if necessary.
    • Add tests for structured document tags unwrapping (Jesse Rosenthal).
    • Preprocess Document body to unwrap w:sdt elements (Jesse Rosenthal, #4190).
  • Plain writer:

    • Don’t linkify table of contents.
  • RST writer:

    • Fix anchors for headers (#4188). We were missing an _.
  • PowerPoint writer (Jesse Rosenthal):

    • Treat lists inside BlockQuotes as lists. We don’t yet produce incremental lists in PowerPoint, but we should at least treat lists inside BlockQuotes as lists, for compatibility with other slide formats.
    • Add ability to force size. This replaces the more specific blockQuote runProp, which only affected the size of blockquotes. We can use this for notes, etc.
    • Implement notes. This currently prints all notes on a final slide. Note that at the moment, there is a danger of text overflowing the note slide, since there is no logic for adding further slides.
    • Implement basic definition list functionality to PowerPoint writer.
    • Don’t look for default template file for Powerpoint (#4181).
    • Add pptx to isTextFormat list. This is used to check standalone and not writing to the terminal.
    • Obey slide level option (Jesse Rosenthal).
    • Introduce tests.
  • Docx writer:

    • Ensure that distArchive is the one that comes with pandoc (#4182). Previously a reference.docx in ~/.pandoc (or the user data dir) would be used instead, and this could cause problems because a user-modified docx sometimes lacks vital sections that we count on the distArchive to supply.
  • Org writer:

    • Do not wrap “-” to avoid accidental bullet lists (Alexander Krotov).
    • Don’t allow fn refs to wrap to beginning of line (#4171, with help from Alexander Krotov). Otherwise they can be interpreted as footnote definitions.
  • Muse writer (Alexander Krotov):

    • Don’t wrap note references to the next line (#4172).
  • HTML writer:

    • Use br elements in line blocks instead of relying on CSS (#4162). HTML-based templates have had the custom CSS for div.line-block removed. Those maintaining custom templates will want to remove this too. We still enclose line blocks in a div with class line-block.
  • LaTeX writer:

    • Use \renewcommand for \textlatin with babel (#4161). This avoids a clash with a deprecated \textlatin command defined in Babel.
    • Allow fragile=singleslide attribute in beamer slides (#4169).
    • Use \endhead after \toprule in headerless tables (#4207).
  • FB2 writer:

    • Add cover image specified by cover-image meta (Alexander Krotov, #4195).
  • JATS writer (Hamish Mackenzie):

    • Support writing <fig> and <table-wrap> elements with <title> and <caption> inside them by using Divs with class set to one of fig, table-wrap or caption (Hamish Mackenzie). The title is included as a Heading so the constraint on where Heading can occur is also relaxed.
    • Leave out empty alt attributes on links.
    • Deduplicate image mime type code.
    • Make <p> optional in <td> and <th> (#4178).
    • Self closing tags for empty xref (#4187).
    • Improve support for code language.
  • Custom writer:

    • Use init file to set up Lua interpreter (Albert Krewinkel). The same init file (data/init) that is used to set up the Lua interpreter for Lua filters is also used to set up the interpreter of custom writers.lua.
    • Define instances for newtype wrapper (Albert Krewinkel). The custom writer used its own ToLuaStack instance definitions, which made it difficult to share code with Lua filters, as this could result in conflicting instances. A Stringify wrapper is introduced to avoid this problem.
    • Added tests for custom writer.
    • Fixed definition lists and tables in data/sample.lua.
  • Fixed regression: when target is PDF, writer extensions were being ignored. So, for example, pandoc -t latex-smart -o file.pdf did not work properly.

  • Lua modules (Albert Krewinkel):

    • Add pandoc.utils module, to hold utility functions.
    • Create a Haskell module Text.Pandoc.Lua.Module.Pandoc to define the pandoc lua module.
    • Make a Haskell module for each Lua module. Move definitions for the pandoc.mediabag modules to a separate Haskell module.
    • Move sha1 from the main pandoc module to pandoc.utils.
    • Add function pandoc.utils.hierarchicalize (convert list of Pandoc blocks into (hierarchical) list of Elements).
    • Add function pandoc.utils.normalize_date (parses a date and converts it (if possible) to “YYYY-MM-DD” format).
    • Add function pandoc.utils.to_roman_numeral (allows conversion of numbers below 4000 into roman numerals).
    • Add function pandoc.utils.stringify (converts any AST element to a string with formatting removed).
    • data/init.lua: load pandoc.utils by default
    • Turn pipe, read into full Haskell functions. The pipe and read utility functions are converted from hybrid lua/haskell functions into full Haskell functions. This avoids the need for intermediate _pipe/_read helper functions, which have been dropped.
    • pandoc.lua: re-add missing MetaMap function. This was a bug introduced in version 2.0.4.
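To make the stated range of pandoc.utils.to_roman_numeral concrete (numbers below 4000), here is a minimal Python sketch of the usual greedy conversion. It is not pandoc's Haskell implementation; in particular, raising ValueError for out-of-range input is an assumption of this sketch, not documented pandoc behavior.

```python
def to_roman_numeral(number):
    """Convert an integer in 1..3999 to an uppercase Roman numeral."""
    # Assumption: out-of-range input raises; pandoc may handle it differently.
    if not 0 < number < 4000:
        raise ValueError("expected an integer between 1 and 3999")
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in numerals:
        # Take as many copies of the largest remaining symbol as fit.
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


print(to_roman_numeral(2018))  # MMXVIII
```

The upper bound follows from the notation itself: without an extra symbol for 5000, 4000 cannot be written in the standard subtractive form.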
  • Text.Pandoc.Class: Add insertInFileTree [API change]. This gives a pure way to insert an ersatz file into a FileTree. In addition, we normalize paths both on insertion and on lookup.

  • Text.Pandoc.Shared: export blocksToInlines' (API change, Mauro Bieg).

  • Text.Pandoc.MIME: Add opus to MIME type table as audio/ogg (#4198).

  • Text.Pandoc.Extensions: Order the constructors for Extension alphabetically. This makes them appear in order in --list-extensions.

  • Allow lenient decoding of latex error logs, which are not always properly UTF8-encoded (#4200).

  • Update latex template to work with recent versions of beamer. The old template produced numbered sections with some recent versions of beamer. Thanks to Thomas Hodgson.

  • Updated reference.docx (#4175). Instead of just “Hello, world”, the document now contains exemplars of most of the styles that have an effect on pandoc documents. This makes it easier to see the effect of style changes.

  • Removed default.theme data file (#4096). It is no longer needed now that we have --print-highlight-style.

  • Added stack.lts9.yaml for building with lts 9 and ghc 8.0.2. We still need this for the alpine static linux build, since we don’t have ghc 8.2.2 for that yet.

  • Removed stack.pkg.yaml. We only really need stack.yaml; we can put flag settings for pandoc-citeproc there.

  • Makefile: Add ‘trypandoc’ and ‘pandoc-templates’ targets to make releases easier.

  • MANUAL.txt:

    • Add note on what formats have +smart by default.
    • Use native syntax for custom-style (#4174, Mauro Bieg).
    • Introduce dedicated Extensions section, since some extensions affect formats other than markdown (Mauro Bieg, #4204).
    • Clarify default html output for --section-divs (Richard Edwards).
  • Say that Text.Pandoc.JSON comes from pandoc-types. Closes jgm/pandoc-website#16.

  • Delete removed -S option from command (#4151, Georger Araújo).

pandoc 2.0.5

@jgm jgm released this Dec 13, 2017 · 1023 commits to master since this release

  • Fix a bug in 2.0.4, whereby pandoc could not read the theme files
    generated with --print-highlight-style (#4133). Improve JSON
    serialization of styles.

  • Fix CSS issues involving line numbers (#4128).
    Highlighted code blocks are now enclosed in a div with class sourceCode.
    Highlighting CSS no longer sets a generic color for pre and code; we only
    set these for class sourceCode.

  • --pdf-engine-opt: fix bug where option order was reversed (#4137).

  • Add PowerPoint (pptx) writer (Jesse Rosenthal).
    It works following the standard Pandoc conventions for making other
    sorts of slides. Caveats:

    • Syntax highlighting is not yet implemented. (This is difficult
      because there are no character classes in Powerpoint.)
    • Footnotes and Definition lists are not yet implemented. (Notes will
      usually take the form of a final slide.)
    • Image placement and auto-resizing has a few glitches.
    • Reference powerpoint files don’t work dependably from the command
      line. This will be implemented, but at the moment users are advised
      to change themes from within Powerpoint.
  • Create shared Text.Pandoc.Writers.OOXML module (Jesse Rosenthal).
    This is for functions used by both Powerpoint and Docx writers.

  • Add default pptx data for Powerpoint writer (Jesse Rosenthal).

  • Add empty_paragraphs extension.

    • Deprecate --strip-empty-paragraphs option. Instead we now
      use an empty_paragraphs extension that can be enabled on
      the reader or writer. By default, disabled.
    • Add Ext_empty_paragraphs constructor to Extension.
    • Revert “Docx reader: don’t strip out empty paragraphs.”
      This reverts commit d6c58eb.
    • Implement empty_paragraphs extension in docx reader and writer,
      opendocument writer, html reader and writer.
    • Add tests for empty_paragraphs extension.
  • Markdown reader:

    • Don’t parse native div as table caption (#4119).
    • Improved computation of column widths in pipe tables.
      Pipe tables with lines longer than the text width (as set
      by --columns) are now scaled to text width, with the relative
      widths of columns determined by the ratios between the
      header lines. Previously we computed column widths using
      the ratio of header line lengths to column width, so that
      tables with narrow header lines were extremely thin, which
      was very rarely the desired result.
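The new column-width rule for pipe tables can be sketched roughly as follows: when any table line exceeds the text width, each column's relative width is its share of the total length of the header separator cells. This Python sketch is an illustration under assumptions, not the reader's code; the function name pipe_table_widths, the default of 72 columns, and the use of None to mean "no explicit widths" are hypothetical choices.

```python
def pipe_table_widths(lines, columns=72):
    """Return relative column widths for a pipe table, or None if it fits.

    lines: the raw table lines, where lines[1] is the separator row.
    """
    if all(len(line) <= columns for line in lines):
        return None  # assumption: None stands for "no explicit widths needed"
    # Drop the outer pipes, then measure each separator cell.
    separator = lines[1].strip().strip("|")
    lengths = [len(cell.strip()) for cell in separator.split("|")]
    total = sum(lengths)
    # Each column gets its share of the text width.
    return [length / total for length in lengths]


table = [
    "| a | b |",
    "|---|---------|",
    "| " + "x" * 80 + " | y |",
]
print(pipe_table_widths(table))  # [0.25, 0.75]
```

Because the fractions sum to 1, the table fills the text width regardless of how short the separator cells are, which is the difference from the earlier behavior described above.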
  • LaTeX reader: fix \ before newline (#4134). This should be a space,
    as long as it’s not followed by a blank line. This has been fixed at the
    tokenizer level.

  • Muse reader (Alexander Krotov):

    • Add test for #disable-tables directive in Emacs mode.
    • Don’t allow emphasis to be preceded by letter.
    • Add underline support in Emacs Muse mode.
    • Support multiline directives in Amusewiki mode.
  • Man writer: omit internal links (#4136). That is, just print the link
    text without the URL.

  • Markdown reader: accept processing instructions as raw HTML (#4125).

  • Lua filters (Albert Krewinkel):

    • Use script to initialize the interpreter. The file init.lua is
      used to initialize the Lua interpreter which is used in Lua filters.
      This gives users the option to require libraries which they want to
      use in all of their filters, and to extend default modules.
    • Fix package loading for Lua 5.1. The list of package searchers is
      named package.loaders in Lua 5.1 and LuaJIT, and package.searchers
      in Lua 5.2 and later.
    • Refactor lua module handling. The integration with Lua’s package/module
      system is improved: A pandoc-specific package searcher is prepended to
      the searchers in package.searchers. The modules pandoc and
      pandoc.mediabag can now be loaded via require.
    • Bump lower bound of hslua. The release hslua 0.9.3 contains a new
      function which makes using Haskell functions as package loaders much
      easier.
  • reveal.js template: add title-slide identifier to title slide (#4120).
    This allows it to be styled more easily.

  • LaTeX template: Added support for pagestyle variable (#4135,
    Thomas Hodgson)

  • Add -threaded to ghc-options for executable (#4130, fixes a build
    error on linux).

pandoc 2.0.4

@jgm jgm released this Dec 4, 2017 · 1066 commits to master since this release

  • Add --print-highlight-style option. This generates a JSON version
    of a highlighting style, which can be saved as a .theme file, modified,
    and used with --highlight-style (#4106, #4096).

  • Add --strip-empty-paragraphs option. This works for any input format.
    It is primarily intended for use with docx and odt documents where
    empty paragraphs have been used for inter-paragraph spaces.

  • Support --webtex for gfm output.

  • Recognize .muse file extension.

  • Support beamer \alert in LaTeX reader. Closes #4091.

  • Docx reader: don’t strip out empty paragraphs (#2252).
    Users who have a conversion pipeline from docx may want to consider adding
    --strip-empty-paragraphs to the command line.

  • Org reader (Albert Krewinkel): Allow empty list items (#4090).

  • Muse reader (Alexander Krotov):

    • Parse markup in definition list terms.
    • Allow definition to end with EOF.
    • Make code blocks round trip.
    • Drop common space prefix from list items.
    • Add partial round trip test.
    • Don’t interpret XML entities.
    • Remove nested.
    • Parse ~~ as non-breaking space in Emacs mode.
    • Correctly remove indentation from notes. Exactly one space is
      required and considered to be part of the marker.
    • Allow list items to be empty.
    • Add ordered list test.
    • Add more multiline definition tests.
    • Don’t allow blockquotes within lists.
    • Fix reading of multiline definitions.
    • Add inline <literal> support.
    • Concatenate inlines of the same type
  • Docx writer: allow empty paragraphs (#2252).

  • CommonMark/gfm writer:

    • Use raw html for native divs/spans (#4113). This allows a pandoc
      markdown native div or span to be rendered in gfm using raw html tags.
    • Implement raw_html and raw_tex extensions. Note that raw_html
      is enabled by default for gfm, while raw_tex is disabled by default.
  • Muse writer (Alexander Krotov):

    • Test that inline math conversion result is normalized.
      Without normalization this test produced
    • Improve inline list normalization and move to writer.
    • Escape hash symbol.
    • Escape ---- to avoid accidental horizontal rules.
    • Escape only </code> inside code tag.
    • Additional <verbatim> is not needed as <code> is verbatim already.
  • LaTeX writer:

    • Allow specifying just width or height for image size.
      Previously both needed to be specified (unless the image was
      being resized to be smaller than its original size).
      If height but not width is specified, we now set width to
      textwidth. If width but not height is specified, we now set
      height to textheight. Since we have keepaspectratio, this
      yields the desired result.
    • Escape ~ and _ in code with --listings (#4111).
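The image-size rule described above (fill in the missing dimension with the text width or text height) can be sketched as a small helper that builds the option string for \includegraphics. This Python sketch is illustrative only: the function name includegraphics_options is hypothetical, and it assumes keepaspectratio is set elsewhere, as the entry above implies.

```python
def includegraphics_options(width=None, height=None):
    """Build width/height options, filling in whichever dimension is missing."""
    if width is None and height is None:
        return ""  # nothing specified: leave sizing to the defaults
    if width is None:
        width = r"\textwidth"    # only height given: bound width by the text width
    if height is None:
        height = r"\textheight"  # only width given: bound height by the text height
    # With keepaspectratio in effect (assumed), the image is scaled to fit
    # both bounds, so the unspecified dimension does not distort it.
    return f"width={width},height={height}"


print(includegraphics_options(height="2in"))
```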
  • HTML writer: export tagWithAttributes. This is a helper allowing
    other writers to create single HTML tags.

  • Let papersizes a0, a1, a2, … be case-insensitive by
    converting the case as needed in LaTeX and ConTeXt writers.

  • Change fixDisplayMath from Text.Pandoc.Writers.Shared
    so that it no longer produces empty Para’s as an artifact.

  • Text.Pandoc.Shared.blocksToInlines: rewrote using builder.
    This gives us automatic normalization, so we don’t get
    for example two consecutive Spaces.

  • Include default CSS for ‘underline’ class in HTML-based templates.

  • revealjs template: add tex2jax configuration for the
    math plugin. With the next release of reveal.js, this will
    fix the problem of $s outside of math contexts being
    interpreted as math delimiters (#4027).

  • pandoc.lua module for use in lua filters (Albert Krewinkel):

    • Add basic lua List module (#4099, #4081). The List module is
      automatically loaded, but not assigned to a global variable. It can be
      included in filters by calling List = require 'List'. Lists of blocks,
      lists of inlines, and lists of classes are now given List as a metatable,
      making working with them more convenient. E.g., it is now possible to
      concatenate lists of inlines using Lua’s concatenation operator ..
      (requires at least one of the operands to have List as a metatable):

      function Emph (emph)
        local s = {pandoc.Space(), pandoc.Str 'emphasized'}
        return pandoc.Span(emph.content .. s)
      end

      The List metatable is assigned to the tables which get passed to
      the constructors MetaBlocks, MetaInline, and MetaList. This
      enables the use of the resulting objects as lists.

    • Lua/StackInstances: push Pandoc and Meta via constructor.
      Pandoc and Meta elements are now pushed by calling the respective
      constructor functions of the pandoc Lua module. This makes serialization
      consistent with the way blocks and inlines are pushed to lua and allows
      List methods to be used with the blocks value.

    • Add documentation for pandoc.List in

  • Use latest tagsoup. This fixes a bug in parsing HTML tags with
    & (but not a valid entity) following them (#4094, #4088).

  • Use skylighting, fixing the color of unmarked code text
    when numberLines is used (#4103).

  • Make normalizeDate more forgiving (Mauro Bieg, #4101), not
    requiring a leading 0 on single-digit days.

  • Fix --help output for --highlight-style to include FILE (Mauro
    Bieg, #4095).

  • Clearer deprecation warning for --latexmathml, --asciimathml, -m.
    Previously we only mentioned --latexmathml, even if -m was used.

  • Changelog: fix description of lua filters in 2.0 release
    (Albert Krewinkel). Lua filters were initially run after conventional
    (JSON) filters. However, this was changed later to make it easier to deal
    with files in the mediabag. The changelog is updated to describe that
    feature of the 2.0 release correctly.

  • Change Generic JSON instances to TemplateHaskell (Jasper Van der Jeugt,
    #4085). This reduces compile time and memory usage significantly.

  • Added tikz filter example.

  • Create alternative zip file for macOS binaries.

  • Create alternative zip file for Windows binaries.

  • Update since we now provide zips for binaries.

  • Relax http-types dependency (Justus Sagemüller, #4084).

  • Add, to docs. These used to live in
    the website repo.

  • Add packages target to Makefile.

  • Bump bounds for binary, http-types, tasty-hunit.

pandoc 2.0.3

@jgm jgm released this Nov 21, 2017 · 1149 commits to master since this release

  • Lua filters: preload text module (Albert Krewinkel, #4077).
    The text module is preloaded in lua. The module contains some UTF-8
    aware string functions, implemented in Haskell. The module is loaded on
    request only, e.g.:

    text = require 'text'
    function Str (s)
      s.text = text.upper(s.text)
      return s
    end

  • Allow table-like access to attributes in lua filters (Albert Krewinkel,
    #4071). Attribute lists are represented as associative lists in Lua. Pure
    associative lists are awkward to work with. A metatable is attached to
    attribute lists, making it possible to access and use the associative
    list as if the attributes were stored as normal key-value pairs in a table.
    Note that this changes the way pairs works on attribute lists. Instead
    of producing integer keys and two-element tables, the resulting iterator
    function now returns the key and value of those pairs. Use ipairs to
    get the old behavior. Warning: the new iteration mechanism only works if
    pandoc has been compiled with Lua 5.2 or later (current default: 5.3).
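
    The difference between pairs and ipairs on attribute lists can be
    illustrated with a small stand-alone emulation. This is a sketch in
    plain Lua, not pandoc's own code: the AttributeList metatable and the
    sample attributes are hypothetical, and __pairs requires Lua 5.2 or
    later, matching the warning above.

    ```lua
    -- Hypothetical stand-in for pandoc's attribute list metatable.
    local AttributeList = {}

    -- Key lookup: scan the {key, value} pairs for a matching key.
    AttributeList.__index = function (t, key)
      if type(key) ~= 'string' then return nil end
      for _, kv in ipairs(t) do
        if kv[1] == key then return kv[2] end
      end
      return nil
    end

    -- pairs() yields key and value instead of index and pair (Lua 5.2+).
    AttributeList.__pairs = function (t)
      local i = 0
      return function ()
        i = i + 1
        local kv = rawget(t, i)
        if kv then return kv[1], kv[2] end
      end, t, nil
    end

    local attributes = setmetatable(
      {{'width', '50%'}, {'lang', 'en'}}, AttributeList)

    print(attributes.width)                -- table-like access by key
    for k, v in pairs(attributes) do       -- new behavior: key, value
      print(k, v)
    end
    for i, kv in ipairs(attributes) do     -- old behavior: index, pair
      print(i, kv[1], kv[2])
    end
    ```

    The underlying data stays an associative list, so ipairs still sees
    the two-element tables in their original order.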

  • Text.Pandoc.Parsing.uri: allow & and = as word characters (#4068).
    This fixes a bug where pandoc would stop parsing a URI with an
    empty attribute: for example, &a=&b= would stop at a.
    (The uri parser tries to guess which punctuation characters
    are part of the URI and which might be punctuation after it.)

  • Introduce HasSyntaxExtensions typeclass (Alexander Krotov, #4074).

    • Added new HasSyntaxExtensions typeclass for ReaderOptions and
      WriterOptions.
    • Reimplemented isEnabled function from Options.hs to accept both
      ReaderOptions and WriterOptions.
    • Replaced enabled from CommonMark.hs with new isEnabled.
  • Add amuse extension (Alexander Krotov) to enable Amuse wiki
    behavior for muse. New Ext_amuse constructor for
    Extension. Note: this is switched on by default; for
    Emacs behavior, use muse-amuse.

  • Muse reader (Alexander Krotov):

    • Count only one space as part of list item marker.
    • Produce SoftBreaks on newlines. Now wrapping can be preserved
      with --wrap=preserve.
    • Add Text::Amuse footnote extensions. Footnote end is indicated by
      indentation, so footnotes can be placed anywhere in the text,
      not just at the end of it.
    • Accept Emacs Muse definition lists when -amuse.
      Emacs Muse does not require indentation.
  • HTML reader:

    • Ensure we don’t produce level 0 headers (#4076), even for chapter
      sections in epubs. This causes problems because writers aren’t set
      up to expect these.
    • Allow spaces after \( and before \) with tex_math_single_backslash.
      Previously \( \frac{1}{a} < \frac{1}{b} \) was not parsed as math in
      markdown or html +tex_math_single_backslash.
  • MANUAL: clarify that math extensions work with HTML.
    Clarify that tex_math_dollars and tex_math_single_backslash
    will work with HTML as well as Markdown.

  • Creole reader: Fix performance issue for longer lists (Sascha Wilde,

  • RST reader: better support for ‘container’ directive (#4066).
    Create a div, incorporate name attribute and classes.

  • LaTeX reader:

    • Support column specs like *{2}{r} (#4056). This is equivalent to
      rr. We now expand it like a macro.
    • Allow optional args for parbox (#4056).
    • Allow optional arguments on \footnote (#4062).
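
    The macro-like expansion of column specs such as *{2}{r}, noted in
    the LaTeX reader entry above, can be sketched in a few lines. The
    function expand_colspec is a hypothetical illustration in Lua and
    does not reflect how the reader (written in Haskell) implements it.

    ```lua
    -- Hypothetical helper: expand *{n}{cols} into n copies of cols.
    local function expand_colspec (spec)
      local count
      repeat
        -- Innermost groups (no nested braces) are expanded first;
        -- the loop repeats until no *{n}{...} group is left.
        spec, count = spec:gsub('%*{(%d+)}{([^{}]*)}', function (n, cols)
          return cols:rep(tonumber(n))
        end)
      until count == 0
      return spec
    end

    print(expand_colspec('*{2}{r}'))    -- rr
    print(expand_colspec('l*{2}{r}c'))  -- lrrc
    ```
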
  • EPUB writer: Fixed path for cover image (#4069). It was previously
    media/media/imagename, and should have been media/imagename.

  • Markdown writer: fix bug with doubled footnotes in grid tables

  • LaTeX template: include natbib/biblatex after polyglossia (#4073).
    Otherwise we seem to get an error; biblatex wants polyglossia
    language to be defined.

  • Added examples to lua filters documentation.