
Releases: jgm/pandoc

pandoc 1.14.0.1

29 May 04:40
@jgm jgm
  • Fixed problem with building of reference.docx and reference.odt when the embed_data_files flag is used. Instead of having a phase of the build where reference.docx and reference.odt are created from their constituent data files, we now construct these archives from their constituents when a docx or odt is built. The constituent files have been moved from extra-source-files to data-files, and reference.docx and reference.odt have been removed. Users can create their own reference.docx or reference.odt by using pandoc to create a simple docx or odt. make-reference-files.hs has been removed, simplifying the build process (#2187)
  • Don't include generated man pages in extra-source-files (#2189).
  • Bumped upper bound for aeson.
  • ConTeXt writer: create internal link anchors for Div elements with identifiers. (This is needed for linked citations to work.)

pandoc 1.14

28 May 05:18
@jgm jgm

New features

  • Added commonmark as input and output format.
  • Added --verbose flag for debugging output in PDF production (#1840, #1653).
  • Allow wildcards in --epub-embed-font arguments (#1939).
  • Added --latex-engine-opt option (#969, #1779, Sumit Sahrawat).
  • Added shortcut_reference_links extension (Konstantin Zudov, #1977). This is enabled by default for those markdown flavors that support reading shortcut reference links, namely: markdown, markdown_strict, markdown_github, markdown_php. If the extension is enabled, the reader parses shortcut reference links like [foo], and the writer creates such links unless doing so would cause problems. Users of markdown flavors that support shortcut reference links should not notice a difference in reading markdown, but the markdown pandoc produces may differ. If shortcut links are not desired, the extension can be disabled in the normal way.
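
The shortcut reference link rule above can be illustrated with a small Python sketch. This is not pandoc's parser: the regular expressions and the function name `resolve_shortcut_links` are assumptions made for illustration, and real markdown label matching has more cases than shown here.

```python
import re

# A reference definition such as "[foo]: http://example.com/".
DEFINITION = re.compile(r'^[ ]{0,3}\[([^\]]+)\]:\s*(\S+)', re.MULTILINE)
# A bracketed label that is not part of [text][label], [text](url) or a definition.
SHORTCUT = re.compile(r'(?<!\])\[([^\[\]]+)\](?![\[(:])')


def normalize(label):
    """Collapse whitespace and ignore case, as reference labels usually do."""
    return ' '.join(label.split()).casefold()


def resolve_shortcut_links(text):
    """Return (label, url) pairs for shortcut links whose label is defined."""
    refs = {normalize(m.group(1)): m.group(2) for m in DEFINITION.finditer(text)}
    found = []
    for m in SHORTCUT.finditer(text):
        key = normalize(m.group(1))
        if key in refs:
            found.append((m.group(1), refs[key]))
    return found
```

A bracketed label with no matching definition is left alone, which is why readers of existing documents should rarely notice a difference.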

Behavior changes

  • --toc is now supported for docx output (#458, Nikolay Yakimov). A "dirty" TOC is created at the beginning of the document. It can be regenerated after the document has been opened.
  • An implicit --filter pandoc-citeproc is now triggered only when the --bibliography option is used, and not when the bibliography field in metadata is specified (#1849).
  • Markdown reader:
  • Reference links with implicit_header_references are no longer case-sensitive (#1606).
  • Definition lists no longer require indentation for first line (#2087). Previously the body of the definition (after the : or ~ marker) needed to be in column 4. This commit relaxes that requirement, to better match the behavior of PHP Markdown Extra. So, now this is a valid definition list:
```
foo
: bar
```
  • Resolve a potential ambiguity with table captions:
```
foo

  : bar

  -----
  table
  -----
```

Is "bar" a definition, or the caption for the table? We'll count it as a caption for the table.

  • Disallow headerless pipe tables (#1996), to conform to GFM and PHP Markdown Extra. Note: If you have been using headerless pipe tables, this change may cause existing tables to break.
  • Allow pipe tables with header but no body (#2017).
  • Allow a digit as first character of a citation key (Matthias Troffaes). See jgm/pandoc-citeproc#97
  • LaTeX reader:
  • Don't limit includes to .tex extension (#1882). If the extension is not .tex, it must be given explicitly in the \input or \include.
  • Docx reader:
  • Allow numbering in the style file. This allows inherited styles with numbering (lists) (Jesse Rosenthal).
  • Org reader:
  • Support smart punctuation (Craig Bosma).
  • Drop trees with a :noexport: tag (Albert Krewinkel). Trees having a :noexport: tag set are not exported. This mirrors org-mode.
  • Put header tags into empty spans (Albert Krewinkel, #2160). Org mode allows headers to be tagged: * Headline :TAG1:TAG2. Instead of being interpreted as part of the headline, the tags are now put into the attributes of empty spans. Spans without textual content won't be visible by default, but they are detectable by filters. They can also be styled using CSS when written as HTML.
  • Generalize code block result parsing (Albert Krewinkel). Previously, only code blocks were recognized as result blocks; now, any kind of block can be the result.
  • Append newline to the LineBreak in Dokuwiki, HTML, EPUB, LaTeX, MediaWiki, OpenDocument, Texinfo writers (#1924, Tim Lin).
  • HTML writer:
  • Add "inline" or "display" class to math spans (#1914). This allows inline and display math to be styled differently.
  • Include raw latex blocks if --mathjax specified (#1938).
  • Require highlighting-kate >= 0.5.14 (#1903). This ensures that all code blocks will be wrapped in a div with class sourceCode. Also, the default highlighting CSS now adds div.sourceCode { overflow-x: auto; }, which means that code blocks (even with line numbers) will acquire a scroll bar on screens too small to display them (e.g. mobile phones). See also jgm/highlighting-kate#65.
  • LaTeX writer:
  • Use a declaration for tight lists (Jose Luis Duran, Joseph Harriott). Previously, pandoc hard-coded some commands to make tight lists in LaTeX. Now we use a custom command instead, allowing the styling to be changed in a macro in the header. (Note: existing templates may need to be modified to include the definition of this macro. See the current template.)
  • Beamer output: if the header introducing a slide has the class fragile, add the [fragile] option to the slide (#2119).
  • MediaWiki writer:
  • Use File: instead of the deprecated Image: for images and other media files (Greg Rundlett).
  • DocBook writer:
  • Render a Div (id,_,_) [Para _] element as a para element with an id attribute. This makes links to citations work in DocBook with pandoc-citeproc.
  • RST writer:
  • Normalize headings to sequential levels (Nikolay Yakimov). This is pretty much required by docutils.
  • Treat headings in block quotes, etc as rubrics (Nikolay Yakimov).
  • Better handling of raw latex inline (#1961). We use :raw-latex:`...` and add a definition for this role to the template.
  • EPUB writer:
  • Remove linear=no from cover itemref (#1609).
  • Don't use sup element for epub footnotes (#1995). Instead, just use an a element with class footnoteRef. This allows more styling options, and provides better results in some readers (e.g. iBooks, where anything inside the a tag breaks popup footnotes).
  • Take TOC title from toc-title metadata field.
  • Docx writer:
  • Implemented FirstParagraph style (Jesse Rosenthal). Following the ODT writer, we add the FirstParagraph style to the first text paragraph following an image, blockquote, table, heading, or beginning of document. This allows it to be styled differently. The default is for it to be the same as Normal.
  • Added BodyText style (Jesse Rosenthal). We apply a BodyText style to all unstyled paragraphs. This is, essentially, the same as Normal, except that since not everything inherits from BodyText (the metadata won't, for example, or the headers or footnote numbers), we can change the text in the body without having to make exceptions for everything. If we do want to change everything, we can still do it through Normal.
  • Altered Blockquote style slightly (Jesse Rosenthal). Since BlockQuote derives from BodyText, we just want to specify by default that it won't indent, regardless of what BodyText does. Note that this will not produce any visible difference in the default configuration.
  • Take TOC title from toc-title metadata field (Nikolay Yakimov).
  • Added a style to figure images (Nikolay Yakimov). Figures with empty captions use style Figure. Figures with nonempty captions use style Figure with Caption, which is based on Figure, and additionally has keepNext set.
  • ODT writer:
  • Added figure captions (Nikolay Yakimov). The following styles are used for figures: Figure (for figures with an empty caption), FigureWithCaption (based on Figure, for figures with a caption), and FigureCaption (based on Caption, for figure captions). Also, TableCaption (based on Caption) is used for table captions.
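
The RST writer's normalization of headings to sequential levels can be sketched in Python as follows. This is an illustrative algorithm under the assumption that each heading is placed one level below its nearest enclosing heading; the function name `normalize_levels` is hypothetical, and the writer's actual Haskell code may differ in detail.

```python
def normalize_levels(levels):
    """Renumber heading levels so that no level is skipped.

    `levels` lists the heading levels in document order. Each heading
    receives the depth of its position in the outline, so a jump from
    level 1 to level 3 becomes a step from 1 to 2.
    """
    result = []
    stack = []  # original levels of the headings that are currently "open"
    for level in levels:
        # Close every open heading at the same or a deeper original level.
        while stack and stack[-1] >= level:
            stack.pop()
        stack.append(level)
        result.append(len(stack))
    return result
```

For example, a document whose headings have levels 1, 3, 5 would be written with levels 1, 2, 3, which is the sequential structure docutils expects.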

API changes

  • New Text.Pandoc.Error module with PandocError type (Matthew Pickering).
  • All readers now return Either PandocError Pandoc instead of Pandoc (Matthew Pickering). This allows better handling of errors.
  • Added Text.Pandoc.Writers.CommonMark, exporting writeCommonMark.
  • Added Text.Pandoc.Readers.CommonMark, exporting readCommonMark.
  • Derive Data and Typeable instances for MediaBag, Extension, ReaderOptions, EPUBVersion, CiteMethod, ObfuscationMethod, HTMLSlideVariant, TrackChanges, WriterOptions (Shabbaz Youssefi).
  • New Ext_shortcut_reference_links constructor for Extension (Konstantin Zudov).

Bug fixes

  • Markdown reader:
  • Allow smart ' after inline math (#1909, Nikolay Yakimov).
  • Check for tex macros after indented code (#1973).
  • Rewrote charsInBalancedBrackets for efficiency.
  • Make sure a closing </div> doesn't get included in a definition list item (#2127).
  • Don't parse bracketed text as citation if it might be a link, image, or footnote (Nikolay Yakimov).
  • Require space after key in mmd title block (#2026, Nikolay Yakimov). Require space after key-value delimiter colon in mmd title block.
  • Require nonempty value in mmd title block (Nikolay Yakimov).
  • Disable all metadata block extensions when parsing metadata field values (#2026, Nikolay Yakimov). Otherwise we could get a mmd title block inside YAML metadata, for example.
  • HTML reader:
  • Improve self-closing tag detection in htmlInBalanced (#2146).
  • Handle tables with <th> in body rows (#1859, mb21).
  • Fixed htmlTag (#1820). If the tag parses as a comment, we check to see if the input starts with <!--. If not, it's bogus comment mode and we fail htmlTag.
  • Handle base tag; if it has an href value, this is added to all relative URLs in links and images.
  • DocBook reader:
  • Look inside "info" elements for section titles (#1931).
  • Docx reader:
  • Parse images in deprecated vml format (Jesse Rosenthal).
  • Allow sub/superscript verbatims (Jesse Rosenthal). Verbatim usually shuts off all other run styles, but we don't want it to shut off sub/superscript.
  • LaTeX reader:
  • Handle tabular* environment (#1850). Note that the table width is not actually parsed or taken into account, but pandoc no longer chokes on it.
  • Ignore options in \lstinline rather than raising error (#1997).
  • Add some test cases for simple tables (Mathias Schenner).
  • Handle valign argument in tables (Mathias Schenner) (cur...
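
The HTML reader's handling of the base tag, noted above, amounts to resolving relative URLs against the base href. A minimal Python sketch, assuming standard URL resolution via `urllib.parse.urljoin` (pandoc's own logic may simply prepend the base and so differ in edge cases); `resolve_links` is a hypothetical helper name:

```python
from urllib.parse import urljoin


def resolve_links(base_href, urls):
    """Resolve each URL against base_href; absolute URLs pass through unchanged."""
    if not base_href:
        # No <base href="..."> in the document: leave the URLs as they are.
        return list(urls)
    return [urljoin(base_href, url) for url in urls]
```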

pandoc 1.13.2

20 Dec 08:43
@jgm jgm

This is mainly a spit-and-polish release, though there is one new reader and some minor new features. Note that, for the first time, we are providing a linux binary (64-bit Debian/Ubuntu).

  • TWiki Reader: add new twiki reader (API change, Alexander Sulfrian).
  • Markdown reader:
  • Better handling of paragraph in div (#1591). Previously text that ended a div would be parsed as Plain unless there was a blank line before the closing div tag.
  • Don't treat a citation as a reference link label (#1763).
  • Fix autolinks with following punctuation (#1811). The price of this is that autolinked bare URIs can no longer contain > characters, but this is not a big issue.
  • Fix Ext_lists_without_preceding_blankline bug (#1636, Artyom).
  • Allow startnum to work without fancy_lists. Formerly pandoc -f markdown-fancy_lists+startnum did not work properly.
  • RST reader (all Daniel Bergey):
  • Parse quoted literal blocks (#65). RST quoted literal blocks are the same as indented literal blocks (which pandoc already supports) except that the quote character is preserved in each line.
  • Parse RST class directives. The class directive accepts one or more class names, and creates a Div value with those classes. If the directive has an indented body, the body is parsed as the children of the Div. If not, the first block following the directive is made a child of the Div. This differs from the behavior of rst2xml, which does not create a Div element. Instead, the specified classes are applied to each child of the directive. However, most Pandoc Block constructors do not take an Attr argument, so we can't duplicate this behavior.
  • Warn about skipped directives.
  • Literal role now produces Code. Code role should have "code" class.
  • Improved support for custom roles
    - Added sourceCode to classes for :code: role, and anything inheriting from it.
    - Add the name of the custom role to classes if the Inline constructor supports Attr.
    - If the custom role directive does not specify a parent role, inherit from the :span: role.

This differs somewhat from the rst2xml.py behavior. If a custom role inherits from another custom role, Pandoc will attach both roles' names as classes. rst2xml.py will only use the class of the directly invoked role (though in the case of inheriting from a :code: role with a :language: defined, it will also provide the inherited language as a class).

  • Warn about ignored fields in role directives.
  • LaTeX reader:
  • Parse label after caption into a span instead of inserting an additional paragraph of bracketed text (#1747).
  • Parse math environments as inline when possible (#1821).
  • Better handling of \noindent and \greektext (#1783).
  • Handle \texorpdfstring more gracefully.
  • Handle \cref and \sep (Wikiwide).
  • Support \smartcite and \Smartcite from biblatex.
  • HTML reader:
  • Retain display type of MathML output (#1719, Matthew Pickering).
  • Recognise <br> tags inside <pre> blocks (#1620, Matthew Pickering).
  • Make embed tag either block or inline (#1756).
  • DocBook reader:
  • Handle keycombo, keycap (#1815).
  • Get string content in inner tags for literal elements (#1816).
  • Handle menuchoice elements better, with a > between (#1817).
  • Include id on section headers (#1818).
  • Document/test "type" as implemented (Bryan O'Sullivan).
  • Add support for calloutlist and callout (Bryan O'Sullivan). We treat a calloutlist as a bulleted list. This works well in practice.
  • Add support for classname (Bryan O'Sullivan).
  • Docx reader:
  • Fix Windows path for image lookup (Jesse Rosenthal). Don't use os-sensitive "combine", since we always want the paths in our zip-archive to use forward-slashes.
  • Single-item headers in ordered lists are headers (Jesse Rosenthal). When users number their headers, Word understands that as a single item enumerated list. We make the assumption that such a list is, in fact, a header.
  • Rewrite rewriteLink to work with new headers (Jesse Rosenthal). There could be new top-level headers after making lists, so we have to rewrite links after that.
  • Use polyglot header list (Jesse Rosenthal). We're just keeping a list of header formats that different languages use as their default styles. At the moment, we have English, German, Danish, and French. We can continue to add to this. This is simpler than parsing the styles file, and perhaps less error-prone, since there seems to be some variations, even within a language, of how a style file will define headers.
  • Remove header class properly in other langs (Jesse Rosenthal). When we encounter one of the polyglot header styles, we want to remove that from the par styles after we convert to a header. To do that, we have to keep track of the style name, and remove it appropriately.
  • Account for external link URLs with anchors. Previously, if a URL had an anchor, the reader would incorrectly identify it as an internal link and return only the anchor as URL. (Caleb McDaniel)
  • Fix for Issue #1692 (i18n styles) (Nikolay Yakimov).
  • Org reader:
  • Added state changing blanklines (Jesse Rosenthal). This allows us to emphasize at the beginning of a new paragraph (or, in general, after blank lines).
  • Fixed bug with bulleted lists:
    - a
    - b
    * c

was being parsed as a list, even though an unindented * should make a heading. See http://orgmode.org/manual/Plain-lists.html#fn-1.

  • Org reader: absolute, relative paths in link (#1741, Albert Krewinkel). The org reader was too restrictive when parsing links; some relative links and links to files given as absolute paths were not recognized correctly.
  • Org reader: allow empty links (jgm/gitit#471, Albert Krewinkel). This is important for use in gitit, which uses empty links for wikilinks.
  • Respect indent when parsing Org bullet lists (#1650, Timothy Humphries). Fixes issue with top-level bullet list parsing.
  • Fix indent issue for definition lists (Timothy Humphries, see #1650, #1698, #1680).
  • Parse multi-inline terms correctly in definition list (#1649, Matthew Pickering).
  • Fix rules for emphasis recognition (Albert Krewinkel). Things like /hello,/ or /hi'/ were falsely recognized as emphasised strings. This is wrong, as , and ' are forbidden border chars and may not occur on the inner border of emphasized text.
  • Drop COMMENT document trees (Albert Krewinkel). Document trees under a header starting with the word COMMENT are comment trees and should not be exported. Those trees are dropped silently (#1678).
  • Properly handle links to file:target (Albert Krewinkel). Org links like [[file:target][title]] were not handled correctly, parsing the link target verbatim. The org reader is changed such that the leading file: is dropped from the link target (see #756, #1812).
  • Parse LaTeX-style MathML entities (#1657, Albert Krewinkel). Org supports special symbols which can be included using LaTeX syntax, but are actually MathML entities. Examples for this are \nbsp (non-breaking space), \Aacute (the letter A with accent acute) or \copy (the copyright sign ©)
  • EPUB reader:
  • URI handling improvements. Now we outsource most of the work to fetchItem'. Also, do not include queries in file extensions (#1671).
  • LaTeX writer:
  • Use \texorpdfstring for section captions when needed (Vaclav Zeman).
  • Handle consecutive linebreaks (#1733).
  • Protect graphics in headers (Jesse Rosenthal). Graphics in \section/\subsection etc titles need to be \protected.
  • Put ~ before header in list item text (Jesse Rosenthal). Because of the built-in line skip, LaTeX can't handle a section header as the first element in a list item.
  • Avoid using reserved characters as \lstinline delimiters (#1595).
  • Better handling of display math in simple tables (#1754). We convert display math to inline math in simple tables, since LaTeX can't deal with display math in simple tables.
  • Escape spaces in code (#1694, Bjorn Buckwalter).
  • MediaWiki writer:
  • Fixed links with URL = text. Previously these were rendered as bare words, even if the URL was not an absolute URL (#1825).
  • ICML writer:
  • Don't force all citations into footnotes.
  • RTF writer:
  • Add blankline at end of output (#1732, Matthew Pickering).
  • RST writer:
  • Ensure blank line after figure.
  • Avoid excess whitespace after last list item (#1777).
  • Wrap line blocks with spaces before continuations (#1656).
  • Fixed double-rendering of footnotes in RST tables (#1769).
  • DokuWiki writer:
  • Better handling of block quotes. This change ensures that multiple paragraph blockquotes are rendered using native > rather than as HTML (#1738).
  • Fix external images (#1739). Preface relative links with ":", absolute URIs without. (Timothy Humphries)
  • HTML writer:
  • Use protocol-relative URL for mathjax.
  • Put newline between img and caption paragraph.
  • MathML now outputted with tex annotation (#1635, Matthew Pickering).
  • Add support for KaTeX HTML math (#1626, Matthew Pickering). This adds KaTeX to HTMLMathMethod (API change).
  • Don't double render when email-obfuscation=none (#1625, Matthew Pickering).
  • Make header attributes work outside top level (#1711). Previously they only appeared on top level header elements. Now they work e.g. in blockquotes.
  • ODT writer:
  • Correctly handle images without extensions (#1729).
  • Strip querystring in ODT write (#1682, Todd Sifleet).
  • FB2 writer:
  • Add newline to output.
  • EPUB writer:
  • Don't add sourceURL to absolute URIs (#1669).
  • Don't use unsupported opf:title-type for epub2.
  • Include "landmarks" section in nav document for epub3 (#1757).
  • Removed playOrder from navpoint elements in ncx ...
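
The point above about not including queries in file extensions (EPUB reader, #1671; compare the ODT writer's query string fix, #1682) can be shown with a short Python sketch. It is only an illustration of the idea using the standard library; `extension_of` is a hypothetical name, not a pandoc function.

```python
import posixpath
from urllib.parse import urlsplit


def extension_of(url):
    """Return the file extension of a URL or path, ignoring query and fragment."""
    path = urlsplit(url).path  # drops "?query" and "#fragment"
    return posixpath.splitext(path)[1]
```

Taking the extension of the raw string instead would give `.jpg?version=2` for `images/cover.jpg?version=2`, which no MIME type lookup would recognize.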

pandoc 1.13.1

31 Aug 06:23
@jgm jgm
  • Fixed --self-contained with Windows paths (#1558). Previously C:\foo.js was being wrongly interpreted as a URI.
  • HTML reader: improved handling of tags that can be block or inline. Previously a section like this would be enclosed in a paragraph, with RawInline for the video tags (since video is a tag that can be either block or inline):
```
<video controls="controls">
   <source src="../videos/test.mp4" type="video/mp4" />
   <source src="../videos/test.webm" type="video/webm" />
   <p>
      The videos can not be played back on your system.<br/>
      Try viewing on Youtube (requires Internet connection):
      <a href="http://youtu.be/etE5urBps_w">Relative Velocity on
Youtube</a>.
   </p>
</video>
```

This change will cause the video and source tags to be parsed as RawBlock instead, giving better output. The general change is this: when we're parsing a "plain" sequence of inlines, we don't parse anything that COULD be a block-level tag.

  • Docx reader:
  • Be sensitive to user styles. Note that "Hyperlink" is "blacklisted," as we don't want the default underline styling to be inherited by all links by default (Jesse Rosenthal).
  • Read single paragraph in table cell as Plain (Jesse Rosenthal). This makes the docx reader's native output fit with the way the markdown reader understands its markdown output.
  • Textile writer: Extended the range of cases where native textile tables will be used (as opposed to raw HTML): we now handle any alignment type, but only for simple tables with no captions.
  • Txt2Tags reader:
  • Header is now parsed only if standalone flag is set (Matthew Pickering).
  • The header is now parsed as meta information. The first line is the title, the second is the author and third line is the date (Matthew Pickering).
  • Corrected formatting of %%mtime macro (Matthew Pickering).
  • Fixed crash when reading from stdin.
  • EPUB writer: Don't use page-progression-direction in EPUB2, which doesn't support it. Also, if page-progression-direction not specified in metadata, don't include the attribute even in EPUB3; not including it is the same as including it with the value "default", as we did before. (#1550)
  • Org writer: Accept example lines with indentation at the beginning (Calvin Beck).
  • DokuWiki writer:
  • Refactor to use Reader monad (Matthew Pickering).
  • Avoid using raw HTML in table cells; instead, use \\ instead of newlines (Jesse Rosenthal).
  • Properly handle HTML table cell alignments, and use spacing to make the tables look prettier (#1566).
  • Docx writer:
  • Bibliography entries get Bibliography style (#1559).
  • Implement change tracking (Jesse Rosenthal).
  • LaTeX writer:
  • Fixed a bug that caused a table caption to repeat across all pages (Jose Luis Duran).
  • Improved vertical spacing in tables and made it customizable using standard lengths set by booktabs. See https://groups.google.com/forum/#!msg/pandoc-discuss/qMu6_5lYy0o/ZAU7lzAIKw0J (Jose Luis Duran).
  • Added \strut to fix spacing in multiline tables (Jose Luis Duran).
  • Use \tabularnewline instead of \\ in table cells (Jose Luis Duran).
  • Made horizontal rules more flexible (Jose Luis Duran).
  • Text.Pandoc.MIME:
  • Added MimeType (type synonym for String) and getMimeTypeDef. Code cleanups (Artyom Kazak).
  • Templates:
  • LaTeX template: disable microtype protrusion for typewriter font (#1549, thanks lemzwerg).
  • Improved OSX build procedure.
  • Added network-uri flag, to deal with split of network-uri from network.
  • Fix build dependencies for the trypandoc flag, so that they are ignored if trypandoc flag is set to False (Gabor Pali).
  • Updated README to remove outdated claim that --self-contained looks in the user data directory for missing files.
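
The --self-contained fix for Windows paths (#1558) is easy to reproduce in miniature: a generic URI parser sees the drive letter in C:\foo.js as a one-letter scheme. The Python sketch below shows the pitfall and one possible guard; the regular expression and the name `is_uri` are assumptions for illustration, not pandoc's code.

```python
import re
from urllib.parse import urlsplit

# A drive letter followed by a slash or backslash, e.g. "C:\" or "d:/".
WINDOWS_DRIVE = re.compile(r'^[A-Za-z]:[\\/]')


def is_uri(source):
    """Guess whether `source` is a URI rather than a local file path."""
    if WINDOWS_DRIVE.match(source):
        # "C:\foo.js" would otherwise look like a URI with scheme "c".
        return False
    return bool(urlsplit(source).scheme)
```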

pandoc 1.13.0.1

18 Aug 01:11
@jgm jgm

This release fixes a couple of serious regressions in 1.13.

  • Docx writer:
    • Fixed regression which bungled list numbering (#1544), causing all lists to appear as basic ordered lists.
    • Include row width in table rows (Christoffer Ackelman, Viktor Kronvall). Added a property to all table rows where the sum of column widths is specified in pct (fraction of 5000). This helps persuade Word to lay out the table with the widths we specify.
  • Fixed a bug in Windows 8 which caused pandoc not to find the pandoc-citeproc filter (#1542).
  • Docx reader: miscellaneous under-the-hood improvements (Jesse Rosenthal). Most significantly, the reader now uses Builder, leading to some performance improvements.
  • HTML reader: Parse appropriately styled span as SmallCaps.
  • Markdown writer: don't escape $, ^, ~ when tex_math_dollars, superscript, and subscript extensions, respectively, are deactivated (#1127).
  • Added trypandoc flag to build CGI executable used in the online demo.
  • Makefile: Added 'quick', 'osxpkg' targets.
  • Updated README in templates to indicate templates license. The templates are dual-licensed, BSD3 and GPL2+.
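
The row width property mentioned under the Docx writer is expressed in pct units, where 5000 stands for 100% of the available width. A minimal Python sketch of the arithmetic, assuming column widths are given as fractions of the full width (the function names are hypothetical):

```python
def widths_to_pct(fractions):
    """Convert column widths given as fractions (0.0 to 1.0) to pct units of 5000."""
    return [round(fraction * 5000) for fraction in fractions]


def row_width_pct(fractions):
    """Width of a whole row: the sum of its column widths in pct units."""
    return sum(widths_to_pct(fractions))
```

A three-column table with relative widths 0.25, 0.5 and 0.25 thus gets columns of 1250, 2500 and 1250, and a row width of 5000.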

pandoc 1.13

16 Aug 04:06
@jgm jgm

New features

  • Added docx as an input format (Jesse Rosenthal). The docx reader includes conversion of native Word equations to pandoc LaTeX Math elements. Metadata is taken from paragraphs at the beginning of the document with styles Author, Title, Subtitle, Date, and Abstract.
  • Added epub as an input format (Matthew Pickering). The epub reader includes conversion of MathML to pandoc LaTeX Math elements.
  • Added t2t (Txt2Tags) as an input format (Matthew Pickering). Txt2tags is a lightweight markup format described at http://txt2tags.org/.
  • Added dokuwiki as an output format (Clare Macrae).
  • Added haddock as an output format.
  • Added --extract-media option to extract media contained in a zip container (docx or epub) while adjusting image paths to point to the extracted images.
  • Added a new markdown extension, compact_definition_lists, that restores the syntax for definition lists of pandoc 1.12.x, allowing tight definition lists with no blank space between items, and disallowing lazy wrapping. (See below under behavior changes.)
  • Added an extension epub_html_exts for parsing HTML in EPUBs.
  • Added extensions native_spans and native_divs to activate parsing of material in HTML span or div tags as Pandoc Span inlines or Div blocks.
  • --trace now works with the Markdown, HTML, Haddock, EPUB, Textile, and MediaWiki readers. This is an option intended for debugging parsing problems; ordinary users should not need to use it.
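
To give a rough idea of what --extract-media has to do for a docx file, the following Python sketch pulls media files out of a zip container. It assumes the usual docx layout, in which media live under word/media/, and it leaves out the rewriting of image paths; `extract_media` is a hypothetical helper, not part of pandoc.

```python
import io
import zipfile


def extract_media(archive_bytes, media_prefix="word/media/"):
    """Return {file name: contents} for every media entry in a zip container."""
    media = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            # Skip directory entries and anything outside the media folder.
            if name.startswith(media_prefix) and not name.endswith("/"):
                media[name[len(media_prefix):]] = archive.read(name)
    return media
```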

Behavior changes

  • Changed behavior of the markdown_attribute extension, to bring it in line with PHP markdown extra and multimarkdown. Setting markdown="1" on an outer tag affects all contained tags, recursively, until it is reversed with markdown="0" (#1378).
  • Revised markdown definition list syntax (#1429). Both the reader and writer are affected. This change brings pandoc's definition list syntax into alignment with that used in PHP markdown extra and multimarkdown (with the exception that pandoc is more flexible about the definition markers, allowing tildes as well as colons). Lazily wrapped definitions are now allowed. Blank space is required between list items. The space before a definition is used to determine whether it is a paragraph or a "plain" element. WARNING: This change may break existing documents! Either check your documents for definition lists without blank space between items, or use markdown+compact_definition_lists for the old behavior.
  • .numberLines now works in fenced code blocks even if no language is given (#1287, jgm/highlighting-kate#40).
  • Improvements to --filter:
  • Don't search PATH for a filter with an explicit path. This fixed a bug wherein --filter ./caps.py would run caps.py from the system path, even if there was a caps.py in the working directory.
  • Respect shebang if filter is executable (#1389).
  • Don't print misleading error message. Previously pandoc would say that a filter was not found, even in a case where the filter had a syntax error.
  • HTML reader:
  • Parse div and span elements even without --parse-raw, provided native_divs and native_spans extensions are set. Motivation: these now generate native pandoc Div and Span elements, not raw HTML.
  • Parse EPUB-specific elements if the epub_html_exts extension is enabled. These include switch, footnote, rearnote, noteref.
  • Org reader:
  • Support for inline LaTeX. Inline LaTeX is now accepted and parsed by the org-mode reader. Both math symbols (like \tau) and LaTeX commands (like \cite{Coffee}), can be used without any further escaping (Albert Krewinkel).
  • Textile reader and writer:
  • The raw_tex extension is no longer set by default. You can enable it with textile+raw_tex.
  • DocBook reader:
  • Support equation, informalequation, inlineequation elements with mml:math content. This is converted into LaTeX and put into a Pandoc Math inline.
  • Revised plain output, largely following the style of Project Gutenberg:
  • Emphasis is rendered with _underscores_, strong emphasis with ALL CAPS.
  • Headings are rendered differently, with space to set them off, not with setext style underlines. Level 1 headers are ALL CAPS.
  • Math is rendered using unicode when possible, but without the distracting emphasis markers around variables.
  • Footnotes use a regular [n] style.
  • Markdown writer:
  • Horizontal rules are now a line across the whole page.
  • Prettier pipe tables. Columns are now aligned (#1323).
  • Respect the raw_html extension. pandoc -t markdown-raw_html no longer emits any raw HTML, including span and div tags generated by Span and Div elements.
  • Use span with style for SmallCaps (#1360).
  • HTML writer:
  • Autolinks now have class uri, and email autolinks have class email, so they can be styled.
  • Docx writer:
  • Document formatting is carried over from reference.docx. This includes margins, page size, page orientation, header, and footer, including images in headers and footers.
  • Include abstract (if present) with Abstract style (#1451).
  • Include subtitle (if present) with Subtitle style, rather than tacking it on to the title (#1451).
  • Org writer:
  • Write empty span elements with an id attribute as org anchors. For example Span ("uid",[],[]) [] becomes <<uid>>.
  • LaTeX writer:
  • Put table captions above tables, to match the conventional standard. (Previously they appeared below tables.)
  • Use \(..\) instead of $..$ for inline math (#1464).
  • Use \nolinkurl in email autolinks. This allows them to be styled using \urlstyle{tt}. Thanks to Ulrike Fischer for the solution.
  • Use \textquotesingle for ' in inline code. Otherwise we get curly quotes in the PDF output (#1364).
  • Use \footnote<.>{..} for notes in beamer, so that footnotes do not appear before the overlays in which their markers appear (#1525).
  • Don't produce a \label{..} for a Div or Span element. Do produce a \hyperdef{..} (#1519).
  • EPUB writer:
  • If the metadata includes page-progression-direction (which can be ltr or rtl), the page-progression-direction attribute will be set in the EPUB spine (#1455).
  • Custom lua writers:
  • Custom writers now work with --template.
  • Removed HTML header scaffolding from sample.lua.
  • Made citation information available in lua writers.
  • --normalize and Text.Pandoc.Shared.normalize now consolidate adjacent RawBlocks when possible.
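
The --filter lookup rule described above (no PATH search when the filter is given with an explicit path) can be sketched in Python. This is an approximation for illustration only: `resolve_filter` is a hypothetical name, and pandoc's own lookup, including its handling of shebangs and interpreters, is more involved.

```python
import os
import shutil


def resolve_filter(name):
    """Locate a filter: explicit paths are used as given, bare names via PATH."""
    if os.path.dirname(name):
        # "./caps.py" or "/opt/filters/caps.py": never fall back to PATH.
        return name if os.path.isfile(name) else None
    # A bare name such as "pandoc-citeproc" is searched on PATH.
    return shutil.which(name)
```

With this rule, `./caps.py` always refers to the file in the working directory, even if another caps.py exists somewhere on the system path.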

API changes

  • Added Text.Pandoc.Readers.Docx, exporting readDocx (Jesse Rosenthal).
  • Added Text.Pandoc.Readers.EPUB, exporting readEPUB (Matthew Pickering).
  • Added Text.Pandoc.Readers.Txt2Tags, exporting readTxt2Tags (Matthew Pickering).
  • Added Text.Pandoc.Writers.DokuWiki, exporting writeDokuWiki (Clare Macrae).
  • Added Text.Pandoc.Writers.Haddock, exporting writeHaddock.
  • Added Text.Pandoc.MediaBag, exporting MediaBag, lookupMedia, insertMedia, mediaDirectory, extractMediaBag. The docx and epub readers return a pair of a Pandoc document and a MediaBag with the media resources they contain. This can be extracted using --extract-media. Writers that incorporate media (PDF, Docx, ODT, EPUB, RTF, or HTML formats with --self-contained) will look for resources in the MediaBag generated by the reader, in addition to the file system or web.
  • Text.Pandoc.Readers.TexMath: Removed deprecated readTeXMath. Renamed readTeXMath' to texMathToInlines.
  • Text.Pandoc: Added Reader data type (Matthew Pickering). readers now associates names of readers with Reader structures. This allows inclusion of readers, like the docx reader, that take binary rather than textual input.
  • Text.Pandoc.Shared:
  • Added capitalize (Artyom Kazak), and replaced uses of map toUpper (which give bad results for many languages).
  • Added collapseFilePath, which removes intermediate . and .. from a path (Matthew Pickering).
  • Added fetchItem', which works like fetchItem but searches a MediaBag before looking on the net or file system.
  • Added withTempDir.
  • Added removeFormatting.
  • Added extractSpaces (from HTML reader) and generalized its type so that it can be used by the docx reader (Matthew Pickering).
  • Added ordNub.
  • Added normalizeInlines, normalizeBlocks.
  • normalize is now Pandoc -> Pandoc instead of Data a => a -> a. Some users may need to change their uses of normalize to the newly exported normalizeInlines or normalizeBlocks.
  • Text.Pandoc.Options:
  • Added writerMediaBag to WriterOptions.
  • Removed deprecated and no longer used readerStrict in ReaderOptions. This is handled by readerExtensions now.
  • Added Ext_compact_definition_lists.
  • Added Ext_epub_html_exts.
  • Added Ext_native_divs and Ext_native_spans. This allows users to turn off the default pandoc behavior of parsing contents of div and span tags in markdown and HTML as native pandoc Div blocks and Span inlines.
  • Text.Pandoc.Parsing:
  • Generalized readWith to readWithM (Matthew Pickering).
  • Export runParserT and Stream (Matthew Pickering).
  • Added HasQuoteContext type class (Matthew Pickering).
  • Generalized types of mathInline, smartPunctuation, quoted, singleQuoted, doubleQuoted, failIfInQuoteContext, applyMacros (Matthew Pickering).
  • Added custom token (Matthew Pickering).
  • Added stateInHtmlBlock to ParserState. This is used to keep track of the ending tag we're waiting for when we're parsing inside HTML block tags.
  • Added stateMarkdownAttribute to ParserState. This is used to keep track of whether the markdown attribute has been set in an enclosing tag.
  • Generalized type of registerHeader, using new type classes HasReaderOptions, `...
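As a rough illustration of what collapseFilePath (listed under Text.Pandoc.Shared above) is described as doing, here is a minimal Python sketch that removes intermediate . and .. segments from a /-separated path. The name collapse_path and the handling of edge cases (a leading .., stepping above the root, an empty result) are assumptions made for this example, not pandoc's actual implementation.

```python
def collapse_path(path: str) -> str:
    """Remove intermediate "." and ".." segments from a "/"-separated path.

    Hypothetical helper for illustration only; edge-case behaviour is assumed.
    """
    absolute = path.startswith("/")
    stack = []
    for segment in path.split("/"):
        if segment in ("", "."):
            # Empty segments (from "//" or a trailing "/") and "." add nothing.
            continue
        if segment == ".." and stack and stack[-1] != "..":
            # ".." cancels the previous real directory name.
            stack.pop()
        elif segment == ".." and absolute:
            # Assumption: an absolute path cannot climb above the root.
            continue
        else:
            # Ordinary names, and leading ".." in relative paths, are kept.
            stack.append(segment)
    collapsed = "/".join(stack)
    if absolute:
        return "/" + collapsed
    return collapsed or "."


print(collapse_path("images/./icons/../logo.png"))
```

Under these assumptions a leading .. in a relative path is preserved, because there is nothing earlier in the path for it to cancel.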

pandoc 1.12.4.2

14 May 22:06
@jgm jgm
  • Require highlighting-kate >= 0.5.8. Fixes a performance regression.
  • Shared: addMetaValue now behaves slightly differently: if both the new and old values are lists, it concatenates their contents to form a new list.
  • LaTeX reader:
  • Set bibliography in metadata from \bibliography or \addbibresource command.
  • Don't error on %foo with no trailing newline.
  • Org reader:
  • Support code block headers (#+BEGIN_SRC ...) (Albert Krewinkel).
  • Fix parsing of blank lines within blocks (Albert Krewinkel).
  • Support pandoc citation extension (Albert Krewinkel). This can be turned off by specifying org-citations as the input format.
  • Markdown reader:
  • citeKey moved to Text.Pandoc.Parsing so it can be used by other readers (Albert Krewinkel).
  • Text.Pandoc.Parsing:
  • Added citeKey (see above).
  • Added HasLastStrPosition type class and updateLastStrPos and notAfterString functions.
  • Updated copyright notices (Albert Krewinkel).
  • Added default.icml to data files so it installs with the package.
  • OSX package:
  • The binary is now built with options to ensure that it can be used with OSX 10.6+.
  • Moved OSX package materials to osx directory.
  • Added OSX package uninstall script, included in the zip container (thanks to Daniel T. Staal).
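The new addMetaValue behavior noted above (two list values are concatenated rather than one replacing the other) can be sketched in Python as follows. This is only an illustration: metadata is modelled as a plain dict, the name add_meta_value is hypothetical, and the handling of non-list values is an assumption, since the entry describes only the case where both values are lists.

```python
def add_meta_value(meta: dict, key: str, new):
    """Return a copy of `meta` with `new` merged in under `key`.

    Hypothetical model of the behaviour described in the changelog entry.
    """
    result = dict(meta)
    if key not in result:
        # No existing value: just store the new one.
        result[key] = new
        return result
    old = result[key]
    if isinstance(old, list) and isinstance(new, list):
        # Described case: both values are lists, so concatenate their contents.
        result[key] = old + new
    else:
        # ASSUMPTION (not stated in the changelog): wrap scalars in lists
        # and concatenate, keeping the old value first.
        old_items = old if isinstance(old, list) else [old]
        new_items = new if isinstance(new, list) else [new]
        result[key] = old_items + new_items
    return result


merged = add_meta_value({"bibliography": ["a.bib"]}, "bibliography", ["b.bib"])
print(merged)
```

The sketch returns a new dict instead of mutating its argument, which keeps the example easy to test.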

pandoc 1.12.4

08 May 07:26
@jgm jgm
  • Made it possible to run filters that aren't executable (#1096).
    Pandoc first tries to find the executable (searching the path
    if path isn't given). If it fails, but the file exists and has
    a .py, .pl, .rb, .hs, or .php extension, pandoc runs the filter
    using the appropriate interpreter. This should make it easier to
    use filters on Windows, and make it more convenient for everyone.
  • Added Emacs org-mode reader (Albert Krewinkel).
  • Added InDesign ICML Writer (mb21).
  • MediaWiki reader:
    • Accept image links in more languages (Jaime Marquínez Ferrándiz).
    • Fixed bug in certain nested lists (#1213). If a level 2 list was
      followed by a level 1 list, the first item of the level 1 list
      would be lost.
    • Handle table rows containing just an HTML comment (#1230).
  • LaTeX reader:
    • Give better location information on errors, pointing to line
      numbers within included files (#1274).
    • Better handling of table environment (#1204).
      Positioning options no longer rendered verbatim.
    • Better handling of figure and table with caption (#1204).
    • Handle @{} and p{length} in tabular. The length is not actually
      recorded, but at least we get a table (#1180).
    • Properly handle \nocite. It now adds a nocite metadata
      field. Citations there will appear in the bibliography but not
      in the text (unless you explicitly put a $nocite$ variable
      in your template).
  • Markdown reader:
    • Ensure that whole numbers in YAML metadata are rendered without
      decimal points. (This became necessary with changes to aeson
      and yaml libraries. aeson >= 0.7 and yaml >= 0.8.8.2 are now required.)
    • Fixed regression on line breaks in strict mode (#1203).
    • Small efficiency improvements.
    • Improved parsing of nested divs. Formerly a closing div tag
      would be missed if it came right after other block-level tags.
    • Avoid backtracking when closing </div> not found.
    • Fixed bug in reference link parsing in markdown_mmd.
    • Fixed a bug in list parsing (#1154). When reading a raw list
      item, we now strip off up to 4 spaces.
    • Fixed parsing of empty reference link definitions (#1186).
    • Made one-column pipe tables work (#1218).
  • Textile reader:
    • Better support for attributes. Instead of being ignored, attributes
      are now parsed and included in Span inlines. The output will be a bit
      different from stock textile: e.g. for *(foo)hi*, we'll get
      <em><span class="foo">hi</span></em> instead of
      <em class="foo">hi</em>. But at least the data is not lost.
    • Improved treatment of HTML spans (%) (#1115).
    • Improved link parsing. In particular we now pick up on attributes.
      Since pandoc links can't have attributes, we enclose the whole link in
      a span if there are attributes (#1008).
    • Implemented correct parsing rules for inline markup (#1175, Matthew
      Pickering).
    • Use Builder (Matthew Pickering).
  • DocBook reader:
    • Better treatment of formalpara. We now emit the title (if present)
      as a separate paragraph with boldface text (#1215).
    • Set metadata author not authors.
    • Added recognition of authorgroup and releaseinfo elements (#1214,
      Matthew Pickering).
    • Converted current meta information parsing in DocBook to a more
      extensible version which is aware of the more recent meta
      representation (Matthew Pickering).
  • HTML reader:
    • Require tagsoup 0.13.1, to fix a bug with parsing of script tags
      (#1248).
    • Treat processing instructions & declarations as block. Previously
      these were treated as inline, and included in paragraph tags in HTML
      or DocBook output, which is generally not what is wanted (#1233).
    • Updated closes with rules from HTML5 spec.
    • Use Builder (Matthew Pickering, #1162).
  • RST reader:
    • Remove duplicate http in PEP links (Albert Krewinkel).
    • Make rst figures true figures (#1168, CasperVector).
    • Enhanced Pandoc's support for rST roles (Merijn Verstraaten).
      rST parser now supports: all built-in rST roles, new role definition,
      role inheritance, though with some limitations.
    • Use author rather than authors in metadata.
    • Better handling of directives. We now correctly handle field
      lists that are indented more than three spaces. We treat an
      aafig directive as a code block with attributes, so it can be
      processed in a filter (#1212).
  • LaTeX writer:
    • Mark span contents with label if span has an ID (Albert Krewinkel).
    • Made --toc-depth work well with books in latex/pdf output (#1210).
    • Handle line breaks in simple table cells (#1217).
    • Workaround for level 4-5 headers in quotes. These previously produced
      invalid LaTeX: \paragraph or \subparagraph in a quote environment.
      This adds an \mbox{} in these contexts to work around the problem.
      See http://tex.stackexchange.com/a/169833/22451 (#1221).
    • Use \/ to avoid en-dash ligature instead of -{}- (Vaclav Zeman).
      This is to fix LuaLaTeX output. The -{}- sequence does not avoid the
      ligature with LuaLaTeX but \/ does.
    • Fixed string escaping in hyperref and hyperdef (#1130).
  • ConTeXt writer: Improved autolinks (#1270).
  • DocBook writer:
    • Improve handling of hard line breaks (Neil Mayhew). Use a
      <literallayout> for the entire paragraph, not just for the
      newline character.
    • Don't let line breaks inside footnotes influence the enclosing
      paragraph (Neil Mayhew).
    • Distinguish tight and loose lists in DocBook output, using
      spacing="compact" (Neil Mayhew, #1250).
  • Docx writer: When needed files are not present in the user's
    reference.docx, fall back on the versions in the reference.docx
    in pandoc's data files. This fixes a bug that occurs when a
    reference.docx saved by LibreOffice is used. (#1185)
  • EPUB writer:
    • Include extension in epub ids. This fixes a problem with duplicate
      ids for fonts and images with the same base name but different
      extensions (#1254).
    • Handle files linked in raw img tags (#1170).
    • Handle media in audio source tags (#1170).
      Note that we now use a media directory rather than images.
    • Incorporate files linked in video tags (#1170). src and poster
      will both be incorporated into content.opf and the epub container.
  • HTML writer:
    • Add colgroup around col tags (#877). Also affects EPUB writer.
    • Fixed bug with unnumbered section headings. Unnumbered section
      headings (with class unnumbered) were getting numbers.
    • Improved detection of image links. Previously image links with
      queries were not recognized, causing <embed> to be used instead
      of <img>.
  • Man writer: Ensure that terms in definition lists aren't line wrapped
    (#1195).
  • Markdown writer:
    • Use proper escapes to avoid unwanted lists (#980). Previously we used
      0-width spaces, an ugly hack.
    • Use longer backtick fences if needed (#1206). If the content contains a
      backtick fence and there are attributes, make sure longer fences are
      used to delimit the code. Note: This works well in pandoc, but github
      markdown is more limited, and will interpret the first string of three
      or more backticks as ending the code block.
  • RST writer: Avoid stack overflow with certain tables (#1197).
  • RTF writer: Fixed table cells containing paragraphs.
  • Custom writer:
    • Correctly handle UTF-8 in custom lua scripts (#1189).
    • Fix bugs with lua scripts with mixed-case filenames and
      paths containing + or - (#1267). Note that getWriter
      in Text.Pandoc no longer returns a custom writer on input
      foo.lua.
  • AsciiDoc writer: Handle multiblock and empty table cells
    (#1245, #1246). Added tests.
  • Text.Pandoc.Options: Added readerTrace to ReaderOptions.
  • Text.Pandoc.Shared:
    • Added compactify'DL (formerly in markdown reader) (Albert Krewinkel).
    • Fixed bug in toRomanNumeral: numbers ending with '9' would
      be rendered as Roman numerals ending with 'IXIV' (#1249). Thanks to
      Jesse Rosenthal.
    • openURL: set proxy with value of http_proxy env variable (#1211).
      Note: proxies with non-root paths are not supported, due to
      limitations in http-conduit.
  • Text.Pandoc.PDF:
    • Ensure that temp directories are deleted on Windows (#1192). The PDF is
      now read as a strict bytestring, ensuring that process ownership will
      be terminated, so the temp directory can be deleted.
    • Use / as path separators in a few places, even on Windows.
      This seems to be necessary for texlive (#1151, thanks to Tim Lin).
    • Use ; for TEXINPUTS separator on Windows (#1151).
    • Changes to error reporting, to handle non-UTF8 error output.
  • Text.Pandoc.Templates:
    • Removed unneeded datatype context (Merijn Verstraaten).
    • YAML objects resolve to "true" in conditionals (#1133).
      Note: If address is a YAML object and you just have $address$
      in your template, the word true will appear, which may be
      unexpected. (Previously nothing would appear.)
  • Text.Pandoc.SelfContained: Handle poster attribute in video
    tags (#1188).
  • Text.Pandoc.Parsing:
    • Made F an instance of Applicative (#1138).
    • Added stateCaption.
    • Added HasMacros, simplified other typeclasses.
      Removed updateHeaderMap, setHeaderMap, getHeaderMap,
      updateIdentifierList, setIdentifierList, getIdentifierList.
    • Changed the smart punctuation parser to return Inlines
      rather than Inline (Matthew Pickering).
    • Changed HasReaderOptions, HasHeaderMap, HasIdentifierList
      from typeclasses of monads to typeclasses of states. This simplifies
      the instance definitions and provides more flexibility. Generalized
      type of getOption and added a default definition. Removed
      askReaderOption. Added extractReaderOption. Added
      extractHeaderMap and updateHeaderMap in HasHeaderMap.
      ...
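The "longer backtick fences" change in the Markdown writer entry above comes down to choosing a fence that is longer than any run of backticks in the code being wrapped. A minimal Python sketch of that idea follows. The names fence_for and fenced_block are hypothetical, and the sketch counts every backtick run in the content, which may be more conservative than what pandoc actually checks.

```python
import re


def fence_for(code: str) -> str:
    """Pick a backtick fence longer than any backtick run inside `code`."""
    runs = re.findall(r"`+", code)
    longest = max((len(run) for run in runs), default=0)
    # A fence needs at least three backticks, and one more than the longest run.
    return "`" * max(3, longest + 1)


def fenced_block(code: str, attrs: str = "") -> str:
    """Wrap `code` in a fence; `attrs` is appended to the opening fence."""
    fence = fence_for(code)
    return f"{fence}{attrs}\n{code}\n{fence}"


inner = "```\nnested fence\n```"
print(fenced_block(inner, " {.markdown}"))
```

As the entry notes, a renderer that ends the block at the first run of three or more backticks will still misread such output, so this only helps with parsers that respect fence length.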

pandoc 1.12.3

10 Jan 19:35
@jgm jgm
  • The --bibliography option now sets the biblio-files variable. So, if you're using --natbib or --biblatex, you can just use --bibliography=foo.bib instead of -V biblio-files=foo.
  • Don't run pandoc-citeproc filter if --bibliography is used together with --natbib or --biblatex (Florian Eitel).
  • Template changes:
  • Updated beamer template to include booktabs.
  • Added abstract variable to LaTeX template.
  • Put header-includes after title in LaTeX template (#908).
  • Allow use of \includegraphics[size] in beamer. This just required porting a macro definition from the default LaTeX template to the default beamer template.
  • reference.docx: Include FootnoteText style. Otherwise Word ignores the style, even when specified in the pPr. (#901)
  • reference.odt: Tidied styles.xml.
  • Relaxed version bounds for dependencies.
  • Added withSocketsDo around http conduit code in openURL, so it works on Windows (#1080).
  • Added Cite function to sample.lua.
  • Markdown reader:
  • Fixed regression in title blocks (#1089). If author field was empty, date was being ignored.
  • Allow backslash-newline hard line breaks in grid and multiline table cells.
  • Citation keys may now start with underscores, and may contain underscores adjacent to internal punctuation.
  • LaTeX reader:
  • Add support for Verb macro (jrnold) (#1090).
  • Support babel-style quoting: "`..."'.
  • Markdown reader: Properly handle script blocks in strict mode. (That is, markdown-markdown_in_html_blocks.) Previously a spurious <p> tag was being added (#1093).
  • DocBook reader: Avoid failure if tbody contains no tr or row elements.
  • LaTeX writer:
  • Factored out function for table cell creation.
  • Better treatment of footnotes in tables. Notes now appear in the regular sequence, rather than in the table cell. (This was a regression in 1.10.)
  • HTML reader: Parse name/content pairs from meta tags as metadata. Closes #1106.
  • Moved fixDisplayMath from Docx writer to Writer.Shared.
  • OpenDocument writer: Fixed RawInline, RawBlock so they don't escape.
  • ODT writer: Use mathml for proper rendering of formulas. Note: LibreOffice's support for this seems a bit buggy. But it should be better than what we had before.
  • RST writer: Ensure no blank line after def in definition list (#992).
  • Markdown writer: Don't use tilde code blocks with braced attributes in markdown_github output. A consequence of this change is that the backtick form will be preferred in general if both are enabled. That is good, as it is much more widespread than the tilde form. (#1084)
  • Docx writer: Fixed problem with some modified reference docx files. Include word/_rels/settings.xml.rels if it exists, as well as other rels files besides the ones pandoc generates explicitly.
  • HTML writer:
  • With --toc, headers no longer link to themselves (#1081).
  • Omit footnotes from TOC entries. Otherwise we get doubled footnotes when headers have notes!
  • EPUB writer:
  • Avoid duplicate notes when headings contain notes. This arose because the headings are copied into the metadata "title" field, and the note gets rendered twice. We strip the note now before putting the heading in "title".
  • Strip out footnotes from toc entries.
  • Fixed bug with --epub-stylesheet. Now the contents of writerEpubStylesheet (set by --epub-stylesheet) should again work, and take precedence over a stylesheet specified in the metadata.
  • Text.Pandoc.Pretty: Added nestle. API change.
  • Text.Pandoc.MIME: Added wmf, emf.
  • Text.Pandoc.Shared: fetchItem now handles image URLs beginning with //.
  • Text.Pandoc.ImageSize: Parse EXIF format JPEGs. Previously we could only get size information for JFIF format, which led to squished images in Word documents. Closes #976.
  • Removed old MarkdownTest_1.0.3 directory (#1104).
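The fetchItem entry above concerns image URLs that begin with //, that is, protocol-relative URLs. One plausible way to handle them is to prepend a scheme before fetching, sketched here in Python. The function name and the default scheme of http are assumptions for illustration; the changelog does not say which scheme pandoc uses.

```python
def resolve_protocol_relative(url: str, default_scheme: str = "http") -> str:
    """Turn a protocol-relative URL ("//host/path") into an absolute one.

    Hypothetical helper; URLs that already carry a scheme, and ordinary
    relative paths, are returned unchanged.
    """
    if url.startswith("//"):
        # ASSUMPTION: fall back to `default_scheme` when none is given.
        return f"{default_scheme}:{url}"
    return url


print(resolve_protocol_relative("//example.com/images/logo.png"))
```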

pandoc 1.12.2.1

09 Dec 03:32
@jgm jgm
  • Markdown reader: Fixed regression in list parser, involving continuation lines containing raw HTML (or even verbatim raw HTML).