Releases: jgm/pandoc
pandoc 1.14.0.1
- Fixed problem with building of `reference.docx` and `reference.odt` when the `embed_data_files` flag is used. Instead of having a phase of the build where `reference.docx` and `reference.odt` are created from their constituent data files, we now construct these archives from their constituents when a `docx` or `odt` is built. The constituent files have been moved from `extra-source-files` to `data-files`, and `reference.docx` and `reference.odt` have been removed. Users can create their own `reference.docx` or `reference.odt` by using pandoc to create a simple `docx` or `odt`. `make-reference-files.hs` has been removed, simplifying the build process (#2187).
- Don't include generated man pages in extra-source-files (#2189).
- Bumped upper bound for aeson.
- ConTeXt writer: create internal link anchors for Div elements with identifiers. (This is needed for linked citations to work.)
pandoc 1.14
New features
- Added `commonmark` as input and output format.
- Added `--verbose` flag for debugging output in PDF production (#1840, #1653).
- Allow wildcards in `--epub-embed-font` arguments (#1939).
- Added `--latex-engine-opt` option (#969, #1779, Sumit Sahrawat).
- Added `shortcut_reference_links` extension (Konstantin Zudov, #1977). This is enabled by default for those markdown flavors that support reading shortcut reference links, namely: `markdown`, `markdown_strict`, `markdown_github`, `markdown_php`. If the extension is enabled, the reader parses shortcut reference links like `[foo]`, and the writer creates such links unless doing so would cause problems. Users of markdown flavors that support shortcut reference links should not notice a difference in reading markdown, but the markdown pandoc produces may differ. If shortcut links are not desired, the extension can be disabled in the normal way.
Behavior changes
- `--toc` is now supported for `docx` output (#458, Nikolay Yakimov). A "dirty" TOC is created at the beginning of document. It can be regenerated after the document has been opened.
- An implicit `--filter pandoc-citeproc` is now triggered only when the `--bibliography` option is used, and not when the `bibliography` field in metadata is specified (#1849).
- Markdown reader:
- Reference links with `implicit_header_references` are no longer case-sensitive (#1606).
- Definition lists no longer require indentation for first line (#2087). Previously the body of the definition (after the `:` or `~` marker) needed to be in column 4. This commit relaxes that requirement, to better match the behavior of PHP Markdown Extra. So, now this is a valid definition list:
```
foo
: bar
```
- Resolve a potential ambiguity with table captions:
```
foo
: bar
-----
table
-----
```
Is "bar" a definition, or the caption for the table? We'll count it as a caption for the table.
- Disallow headerless pipe tables (#1996), to conform to GFM and PHP Markdown Extra. Note: If you have been using headerless pipe tables, this change may cause existing tables to break.
- Allow pipe tables with header but no body (#2017).
- Allow a digit as first character of a citation key (Matthias Troffaes). See jgm/pandoc-citeproc#97
- LaTeX reader:
- Don't limit includes to `.tex` extension (#1882). If the extension is not `.tex`, it must be given explicitly in the `\input` or `\include`.
- Docx reader:
- Allow numbering in the style file. This allows inherited styles with numbering (lists) (Jesse Rosenthal).
- Org reader:
- Support smart punctuation (Craig Bosma).
- Drop trees with a :noexport: tag (Albert Krewinkel). Trees having a `:noexport:` tag set are not exported. This mirrors org-mode.
- Put header tags into empty spans (Albert Krewinkel, #2160). Org mode allows headers to be tagged: `* Headline :TAG1:TAG2`. Instead of being interpreted as part of the headline, the tags are now put into the attributes of empty spans. Spans without textual content won't be visible by default, but they are detectable by filters. They can also be styled using CSS when written as HTML.
- Generalize code block result parsing (Albert Krewinkel). Previously, only code blocks were recognized as result blocks; now, any kind of block can be the result.
- Append newline to the LineBreak in Dokuwiki, HTML, EPUB, LaTeX, MediaWiki, OpenDocument, Texinfo writers (#1924, Tim Lin).
- HTML writer:
- Add "inline" or "display" class to math spans (#1914). This allows inline and display math to be styled differently.
- Include raw latex blocks if `--mathjax` specified (#1938).
- Require highlighting-kate >= 0.5.14 (#1903). This ensures that all code blocks will be wrapped in a `div` with class `sourceCode`. Also, the default highlighting CSS now adds `div.sourceCode { overflow-x: auto; }`, which means that code blocks (even with line numbers) will acquire a scroll bar on screens too small to display them (e.g. mobile phones). See also jgm/highlighting-kate#65.
- LaTeX writer:
- Use a declaration for tight lists (Jose Luis Duran, Joseph Harriott). Previously, pandoc hard-coded some commands to make tight lists in LaTeX. Now we use a custom command instead, allowing the styling to be changed in a macro in the header. (Note: existing templates may need to be modified to include the definition of this macro. See the current template.)
- Beamer output: if the header introducing a slide has the class `fragile`, add the `[fragile]` option to the slide (#2119).
- MediaWiki writer:
- Use `File:` instead of the deprecated `Image:` for images and other media files (Greg Rundlett).
- DocBook writer:
- Render a `Div (id,_,_) [Para _]` element as a `para` element with an `id` attribute. This makes links to citations work in DocBook with pandoc-citeproc.
- RST writer:
- Normalize headings to sequential levels (Nikolay Yakimov). This is pretty much required by docutils.
- Treat headings in block quotes, etc as rubrics (Nikolay Yakimov).
- Better handling of raw latex inline (#1961). We use `` :raw-latex:`...` `` and add a definition for this role to the template.
- EPUB writer:
- Remove `linear=no` from cover `itemref` (#1609).
- Don't use `sup` element for epub footnotes (#1995). Instead, just use an `a` element with class `footnoteRef`. This allows more styling options, and provides better results in some readers (e.g. iBooks, where anything inside the `a` tag breaks popup footnotes).
- Take TOC title from `toc-title` metadata field.
- Docx writer:
- Implemented `FirstParagraph` style (Jesse Rosenthal). Following the ODT writer, we add the `FirstParagraph` style to the first text paragraph following an image, blockquote, table, heading, or beginning of document. This allows it to be styled differently. The default is for it to be the same as `Normal`.
- Added `BodyText` style (Jesse Rosenthal). We apply a `BodyText` style to all unstyled paragraphs. This is, essentially, the same as `Normal`, except that since not everything inherits from `BodyText` (the metadata won't, for example, or the headers or footnote numbers), we can change the text in the body without having to make exceptions for everything. If we do want to change everything, we can still do it through `Normal`.
- Altered `Blockquote` style slightly (Jesse Rosenthal). Since `BlockQuote` derives from `BodyText`, we just want to specify by default that it won't indent, regardless of what `BodyText` does. Note that this will not produce any visible difference in the default configuration.
- Take TOC title from `toc-title` metadata field (Nikolay Yakimov).
- Added a style to figure images (Nikolay Yakimov). Figures with empty captions use style `Figure`. Figures with nonempty captions use style `Figure with Caption`, which is based on `Figure`, and additionally has `keepNext` set.
- ODT writer:
- Added figure captions (Nikolay Yakimov). The following styles are used for figures: `Figure` -- for figure with empty caption, `FigureWithCaption` (based on `Figure`) -- for figure with caption, `FigureCaption` (based on `Caption`) -- for figure captions. Also, `TableCaption` (based on `Caption`) is used for table captions.
API changes
- New `Text.Pandoc.Error` module with `PandocError` type (Matthew Pickering).
- All readers now return `Either PandocError Pandoc` instead of `Pandoc` (Matthew Pickering). This allows better handling of errors.
- Added `Text.Pandoc.Writers.CommonMark`, exporting `writeCommonMark`.
- Added `Text.Pandoc.Readers.CommonMark`, exporting `readCommonMark`.
- Derive `Data` and `Typeable` instances for `MediaBag`, `Extension`, `ReaderOptions`, `EPUBVersion`, `CiteMethod`, `ObfuscationMethod`, `HTMLSlideVariant`, `TrackChanges`, `WriterOptions` (Shabbaz Youssefi).
- New `Ext_shortcut_reference_links` constructor for `Extension` (Konstantin Zudov).
Bug fixes
- Markdown reader:
- Allow smart `'` after inline math (#1909, Nikolay Yakimov).
- Check for tex macros after indented code (#1973).
- Rewrote `charsInBalancedBrackets` for efficiency.
- Make sure a closing `</div>` doesn't get included in a definition list item (#2127).
- Don't parse bracketed text as citation if it might be a link, image, or footnote (Nikolay Yakimov).
- Require space after key in mmd title block (#2026, Nikolay Yakimov). Require space after key-value delimiter colon in mmd title block.
- Require nonempty value in mmd title block (Nikolay Yakimov).
- Disable all metadata block extensions when parsing metadata field values (#2026, Nikolay Yakimov). Otherwise we could get a mmd title block inside YAML metadata, for example.
- HTML reader:
- Improve self-closing tag detection in `htmlInBalanced` (#2146).
- Handle tables with `<th>` in body rows (#1859, mb21).
- Fixed `htmlTag` (#1820). If the tag parses as a comment, we check to see if the input starts with `<!--`. If not, it's bogus comment mode and we fail `htmlTag`.
- Handle `base` tag; if it has an `href` value, this is added to all relative URLs in links and images.
- DocBook reader:
- Look inside "info" elements for section titles (#1931).
- Docx reader:
- Parse images in deprecated vml format (Jesse Rosenthal).
- Allow sub/superscript verbatims (Jesse Rosenthal). Verbatim usually shuts off all other run styles, but we don't want it to shut off sub/superscript.
- LaTeX reader:
- Handle `tabular*` environment (#1850). Note that the table width is not actually parsed or taken into account, but pandoc no longer chokes on it.
- Ignore options in `\lstinline` rather than raising error (#1997).
- Add some test cases for simple tables (Mathias Schenner).
- Handle valign argument in tables (Mathias Schenner) (cur...
pandoc 1.13.2
This is mainly a spit-and-polish release, though there is one new reader and some minor new features. Note that, for the first time, we are providing a linux binary (64-bit Debian/Ubuntu).
- TWiki Reader: add new twiki reader (API change, Alexander Sulfrian).
- Markdown reader:
- Better handling of paragraph in div (#1591). Previously text that ended a div would be parsed as Plain unless there was a blank line before the closing div tag.
- Don't treat a citation as a reference link label (#1763).
- Fix autolinks with following punctuation (#1811). The price of this is that autolinked bare URIs can no longer contain `>` characters, but this is not a big issue.
- Fix `Ext_lists_without_preceding_blankline` bug (#1636, Artyom).
- Allow `startnum` to work without `fancy_lists`. Formerly `pandoc -f markdown-fancy_lists+startnum` did not work properly.
- RST reader (all Daniel Bergey):
- Parse quoted literal blocks (#65). RST quoted literal blocks are the same as indented literal blocks (which pandoc already supports) except that the quote character is preserved in each line.
- Parse RST class directives. The class directive accepts one or more class names, and creates a Div value with those classes. If the directive has an indented body, the body is parsed as the children of the Div. If not, the first block following the directive is made a child of the Div. This differs from the behavior of rst2xml, which does not create a Div element. Instead, the specified classes are applied to each child of the directive. However, most Pandoc Block constructors do not take an Attr argument, so we can't duplicate this behavior.
- Warn about skipped directives.
- Literal role now produces Code. Code role should have "code" class.
- Improved support for custom roles
- Added `sourceCode` to classes for `:code:` role, and anything inheriting from it.
- Add the name of the custom role to classes if the Inline constructor supports Attr.
- If the custom role directive does not specify a parent role, inherit from the `:span:` role.
This differs somewhat from the `rst2xml.py` behavior. If a custom role inherits from another custom role, Pandoc will attach both roles' names as classes. `rst2xml.py` will only use the class of the directly invoked role (though in the case of inheriting from a `:code:` role with a `:language:` defined, it will also provide the inherited language as a class).
- Warn about ignored fields in role directives.
- LaTeX reader:
- Parse label after caption into a span instead of inserting an additional paragraph of bracketed text (#1747).
- Parse math environments as inline when possible (#1821).
- Better handling of `\noindent` and `\greektext` (#1783).
- Handle `\texorpdfstring` more gracefully.
- Handle `\cref` and `\sep` (Wikiwide).
- Support `\smartcite` and `\Smartcite` from biblatex.
- HTML reader:
- Retain display type of MathML output (#1719, Matthew Pickering).
- Recognise `<br>` tags inside `<pre>` blocks (#1620, Matthew Pickering).
- Make `embed` tag either block or inline (#1756).
- DocBook reader:
- Handle `keycombo`, `keycap` (#1815).
- Get string content in inner tags for literal elements (#1816).
- Handle `menuchoice` elements better, with a `>` between (#1817).
- Include `id` on section headers (#1818).
- Document/test "type" as implemented (Brian O'Sullivan).
- Add support for calloutlist and callout (Brian O'Sullivan). We treat a calloutlist as a bulleted list. This works well in practice.
- Add support for `classname` (Bryan O'Sullivan).
- Docx reader:
- Fix window path for image lookup (Jesse Rosenthal). Don't use os-sensitive "combine", since we always want the paths in our zip-archive to use forward-slashes.
- Single-item headers in ordered lists are headers (Jesse Rosenthal). When users number their headers, Word understands that as a single item enumerated list. We make the assumption that such a list is, in fact, a header.
- Rewrite rewriteLink to work with new headers (Jesse Rosenthal). There could be new top-level headers after making lists, so we have to rewrite links after that.
- Use polyglot header list (Jesse Rosenthal). We're just keeping a list of header formats that different languages use as their default styles. At the moment, we have English, German, Danish, and French. We can continue to add to this. This is simpler than parsing the styles file, and perhaps less error-prone, since there seems to be some variations, even within a language, of how a style file will define headers.
- Remove header class properly in other langs (Jesse Rosenthal). When we encounter one of the polyglot header styles, we want to remove that from the par styles after we convert to a header. To do that, we have to keep track of the style name, and remove it appropriately.
- Account for external link URLs with anchors. Previously, if a URL had an anchor, the reader would incorrectly identify it as an internal link and return only the anchor as URL. (Caleb McDaniel)
- Fix for Issue #1692 (i18n styles) (Nikolay Yakimov).
- Org reader:
- Added state changing blanklines (Jesse Rosenthal). This allows us to emphasize at the beginning of a new paragraph (or, in general, after blank lines).
- Fixed bug with bulleted lists:
- a
- b
- c
was being parsed as a list, even though an unindented `*` should make a heading. See http://orgmode.org/manual/Plain-lists.html#fn-1.
- Org reader: absolute, relative paths in link (#1741, Albert Krewinkel). The org reader was too restrictive when parsing links; some relative links and links to files given as absolute paths were not recognized correctly.
- Org reader: allow empty links (jgm/gitit#471, Albert Krewinkel). This is important for use in gitit, which uses empty links for wikilinks.
- Respect indent when parsing Org bullet lists (#1650, Timothy Humphries). Fixes issue with top-level bullet list parsing.
- Fix indent issue for definition lists (Timothy Humphries, see #1650, #1698, #1680).
- Parse multi-inline terms correctly in definition list (#1649, Matthew Pickering).
- Fix rules for emphasis recognition (Albert Krewinkel). Things like `/hello,/` or `/hi'/` were falsely recognized as emphasised strings. This is wrong, as `,` and `'` are forbidden border chars and may not occur on the inner border of emphasized text.
- Drop COMMENT document trees (Albert Krewinkel). Document trees under a header starting with the word `COMMENT` are comment trees and should not be exported. Those trees are dropped silently (#1678).
- Properly handle links to `file:target` (Albert Krewinkel). Org links like `[[file:target][title]]` were not handled correctly, parsing the link target verbatim. The org reader is changed such that the leading `file:` is dropped from the link target (see #756, #1812).
- Parse LaTeX-style MathML entities (#1657, Albert Krewinkel). Org supports special symbols which can be included using LaTeX syntax, but are actually MathML entities. Examples for this are `\nbsp` (non-breaking space), `\Aacute` (the letter A with accent acute) or `\copy` (the copyright sign ©).
- EPUB reader:
- URI handling improvements. Now we outsource most of the work to `fetchItem'`. Also, do not include queries in file extensions (#1671).
- LaTeX writer:
- Use `\texorpdfstring` for section captions when needed (Vaclav Zeman).
- Handle consecutive linebreaks (#1733).
- Protect graphics in headers (Jesse Rosenthal). Graphics in `\section`/`\subsection` etc titles need to be `\protect`ed.
- Put `~` before header in list item text (Jesse Rosenthal). Because of the built-in line skip, LaTeX can't handle a section header as the first element in a list item.
- Avoid using reserved characters as `\lstinline` delimiters (#1595).
- Better handling of display math in simple tables (#1754). We convert display math to inline math in simple tables, since LaTeX can't deal with display math in simple tables.
- Escape spaces in code (#1694, Bjorn Buckwalter).
- MediaWiki writer:
- Fixed links with URL = text. Previously these were rendered as bare words, even if the URL was not an absolute URL (#1825).
- ICML writer:
- Don't force all citations into footnotes.
- RTF writer:
- Add blankline at end of output (#1732, Matthew Pickering).
- RST writer:
- Ensure blank line after figure.
- Avoid excess whitespace after last list item (#1777).
- Wrap line blocks with spaces before continuations (#1656).
- Fixed double-rendering of footnotes in RST tables (#1769).
- DokuWiki writer:
- Better handling of block quotes. This change ensures that multiple paragraph blockquotes are rendered using native `>` rather than as HTML (#1738).
- Fix external images (#1739). Preface relative links with ":", absolute URIs without. (Timothy Humphries)
- HTML writer:
- Use protocol-relative URL for mathjax.
- Put newline between img and caption paragraph.
- MathML now outputted with tex annotation (#1635, Matthew Pickering).
- Add support for KaTeX HTML math (#1626, Matthew Pickering). This adds `KaTeX` to `HTMLMathMethod` (API change).
- Don't double render when `email-obfuscation=none` (#1625, Matthew Pickering).
- Make header attributes work outside top level (#1711). Previously they only appeared on top level header elements. Now they work e.g. in blockquotes.
- ODT writer:
- Correctly handle images without extensions (#1729).
- Strip querystring in ODT write (#1682, Todd Sifleet).
- FB2 writer:
- Add newline to output.
- EPUB writer:
- Don't add `sourceURL` to absolute URIs (#1669).
- Don't use unsupported `opf:title-type` for epub2.
- Include "landmarks" section in nav document for epub3 (#1757).
- Removed playOrder from navpoint elements in ncx ...
pandoc 1.13.1
- Fixed `--self-contained` with Windows paths (#1558). Previously `C:\foo.js` was being wrongly interpreted as a URI.
- HTML reader: improved handling of tags that can be block or inline. Previously a section like this would be enclosed in a paragraph, with RawInline for the video tags (since video is a tag that can be either block or inline):
<video controls="controls">
<source src="../videos/test.mp4" type="video/mp4" />
<source src="../videos/test.webm" type="video/webm" />
<p>
The videos can not be played back on your system.<br/>
Try viewing on Youtube (requires Internet connection):
<a href="http://youtu.be/etE5urBps_w">Relative Velocity on
Youtube</a>.
</p>
</video>
This change will cause the video and source tags to be parsed as RawBlock instead, giving better output. The general change is this: when we're parsing a "plain" sequence of inlines, we don't parse anything that COULD be a block-level tag.
- Docx reader:
- Be sensitive to user styles. Note that "Hyperlink" is "blacklisted," as we don't want the default underline styling to be inherited by all links by default (Jesse Rosenthal).
- Read single paragraph in table cell as `Plain` (Jesse Rosenthal). This makes the docx reader's native output fit with the way the markdown reader understands its markdown output.
- Textile writer: Extended the range of cases where native textile tables will be used (as opposed to raw HTML): we now handle any alignment type, but only for simple tables with no captions.
- Txt2Tags reader:
- Header is now parsed only if standalone flag is set (Matthew Pickering).
- The header is now parsed as meta information. The first line is the `title`, the second is the `author` and third line is the `date` (Matthew Pickering).
- Corrected formatting of `%%mtime` macro (Matthew Pickering).
- Fixed crash when reading from stdin.
- EPUB writer: Don't use page-progression-direction in EPUB2, which doesn't support it. Also, if page-progression-direction not specified in metadata, don't include the attribute even in EPUB3; not including it is the same as including it with the value "default", as we did before. (#1550)
- Org writer: Accept example lines with indentation at the beginning (Calvin Beck).
- DokuWiki writer:
- Refactor to use Reader monad (Matthew Pickering).
- Avoid using raw HTML in table cells; use `\\` in place of newlines (Jesse Rosenthal).
- Properly handle HTML table cell alignments, and use spacing to make the tables look prettier (#1566).
- Docx writer:
- Bibliography entries get `Bibliography` style (#1559).
- Implement change tracking (Jesse Rosenthal).
- LaTeX writer:
- Fixed a bug that caused a table caption to repeat across all pages (Jose Luis Duran).
- Improved vertical spacing in tables and made it customizable using standard lengths set by booktabs. See https://groups.google.com/forum/#!msg/pandoc-discuss/qMu6_5lYy0o/ZAU7lzAIKw0J (Jose Luis Duran).
- Added `\strut` to fix spacing in multiline tables (Jose Luis Duran).
- Use `\tabularnewline` instead of `\\` in table cells (Jose Luis Duran).
- Made horizontal rules more flexible (Jose Luis Duran).
- Text.Pandoc.MIME:
- Added `MimeType` (type synonym for `String`) and `getMimeTypeDef`. Code cleanups (Artyom Kazak).
- Templates:
- LaTeX template: disable microtype protrusion for typewriter font (#1549, thanks lemzwerg).
- Improved OSX build procedure.
- Added `network-uri` flag, to deal with split of `network-uri` from `network`.
- Fix build dependencies for the `trypandoc` flag, so that they are ignored if `trypandoc` flag is set to False (Gabor Pali).
- Updated README to remove outdated claim that `--self-contained` looks in the user data directory for missing files.
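The `--self-contained` fix for Windows paths in this release (#1558) comes down to not mistaking a drive letter for a URI scheme. The Python sketch below shows one plausible heuristic for that distinction. It is an assumption made for illustration (the helper name `looks_like_uri` is hypothetical) and is not taken from pandoc's source.

```python
import re

# RFC 3986 scheme syntax: a letter followed by letters, digits, "+", "-" or ".".
SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def looks_like_uri(path):
    """Heuristic: treat `path` as a URI only if its scheme is longer than one letter."""
    match = SCHEME.match(path)
    if match is None:
        return False
    # A one-letter "scheme" such as "C:" is almost certainly a Windows drive letter.
    return len(match.group(1)) > 1


if __name__ == "__main__":
    for candidate in [r"C:\foo.js", "http://example.com/foo.js", "foo.js"]:
        print(candidate, "->", looks_like_uri(candidate))
```

The trade-off of this heuristic is that a genuine one-letter URI scheme would be treated as a local path, which seems acceptable since such schemes are rare in practice.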
pandoc 1.13.0.1
This release fixes a couple of serious regressions in 1.13.
- Docx writer:
- Fixed regression which bungled list numbering (#1544), causing all lists to appear as basic ordered lists.
- Include row width in table rows (Christoffer Ackelman, Viktor Kronvall). Added a property to all table rows where the sum of column widths is specified in pct (fraction of 5000). This helps persuade Word to lay out the table with the widths we specify.
- Fixed a bug in Windows 8 which caused pandoc not to find the `pandoc-citeproc` filter (#1542).
- Docx reader: miscellaneous under-the-hood improvements (Jesse Rosenthal). Most significantly, the reader now uses Builder, leading to some performance improvements.
- HTML reader: Parse appropriately styled span as SmallCaps.
- Markdown writer: don't escape `$`, `^`, `~` when `tex_math_dollars`, `superscript`, and `subscript` extensions, respectively, are deactivated (#1127).
- Added `trypandoc` flag to build CGI executable used in the online demo.
- Makefile: Added 'quick', 'osxpkg' targets.
- Updated README in templates to indicate templates license. The templates are dual-licensed, BSD3 and GPL2+.
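The docx table row width mentioned in this release is given in pct units, where 5000 stands for 100% of the available width. As a small worked example, the Python sketch below converts relative column widths to that unit and sums them for a row. The function names are hypothetical and the rounding rule is an assumption; pandoc's own writer may compute the value differently.

```python
def to_pct_units(fraction):
    """Convert a width given as a fraction (0.0 to 1.0) into pct units (5000 = 100%)."""
    return int(round(fraction * 5000))


def row_width_pct(column_fractions):
    """Row width in pct units: the sum of the relative column widths."""
    return to_pct_units(sum(column_fractions))


if __name__ == "__main__":
    # Two columns taking a quarter and a half of the page width.
    print(row_width_pct([0.25, 0.5]))
```

For columns of 25% and 50%, the row width is 0.75 × 5000 = 3750 pct units.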
pandoc 1.13
New features
- Added `docx` as an input format (Jesse Rosenthal). The docx reader includes conversion of native Word equations to pandoc LaTeX `Math` elements. Metadata is taken from paragraphs at the beginning of the document with styles `Author`, `Title`, `Subtitle`, `Date`, and `Abstract`.
- Added `epub` as an input format (Matthew Pickering). The epub reader includes conversion of MathML to pandoc LaTeX `Math` elements.
- Added `t2t` (Txt2Tags) as an input format (Matthew Pickering). Txt2tags is a lightweight markup format described at http://txt2tags.org/.
- Added `dokuwiki` as an output format (Clare Macrae).
- Added `haddock` as an output format.
- Added `--extract-media` option to extract media contained in a zip container (docx or epub) while adjusting image paths to point to the extracted images.
- Added a new markdown extension, `compact_definition_lists`, that restores the syntax for definition lists of pandoc 1.12.x, allowing tight definition lists with no blank space between items, and disallowing lazy wrapping. (See below under behavior changes.)
- Added an extension `epub_html_exts` for parsing HTML in EPUBs.
- Added extensions `native_spans` and `native_divs` to activate parsing of material in HTML span or div tags as Pandoc Span inlines or Div blocks.
- `--trace` now works with the Markdown, HTML, Haddock, EPUB, Textile, and MediaWiki readers. This is an option intended for debugging parsing problems; ordinary users should not need to use it.
Behavior changes
- Changed behavior of the `markdown_attribute` extension, to bring it in line with PHP markdown extra and multimarkdown. Setting `markdown="1"` on an outer tag affects all contained tags, recursively, until it is reversed with `markdown="0"` (#1378).
- Revised markdown definition list syntax (#1429). Both the reader and writer are affected. This change brings pandoc's definition list syntax into alignment with that used in PHP markdown extra and multimarkdown (with the exception that pandoc is more flexible about the definition markers, allowing tildes as well as colons). Lazily wrapped definitions are now allowed. Blank space is required between list items. The space before a definition is used to determine whether it is a paragraph or a "plain" element. WARNING: This change may break existing documents! Either check your documents for definition lists without blank space between items, or use `markdown+compact_definition_lists` for the old behavior.
- `.numberLines` now works in fenced code blocks even if no language is given (#1287, jgm/highlighting-kate#40).
- Improvements to `--filter`:
- Don't search PATH for a filter with an explicit path. This fixed a bug wherein `--filter ./caps.py` would run `caps.py` from the system path, even if there was a `caps.py` in the working directory.
- Respect shebang if filter is executable (#1389).
- Don't print misleading error message. Previously pandoc would say that a filter was not found, even in a case where the filter had a syntax error.
- HTML reader:
- Parse `div` and `span` elements even without `--parse-raw`, provided `native_divs` and `native_spans` extensions are set. Motivation: these now generate native pandoc Div and Span elements, not raw HTML.
- Parse EPUB-specific elements if the `epub_html_exts` extension is enabled. These include `switch`, `footnote`, `rearnote`, `noteref`.
- Org reader:
- Support for inline LaTeX. Inline LaTeX is now accepted and parsed by the org-mode reader. Both math symbols (like `\tau`) and LaTeX commands (like `\cite{Coffee}`), can be used without any further escaping (Albert Krewinkel).
- Textile reader and writer:
- The `raw_tex` extension is no longer set by default. You can enable it with `textile+raw_tex`.
- DocBook reader:
- Support `equation`, `informalequation`, `inlineequation` elements with `mml:math` content. This is converted into LaTeX and put into a Pandoc Math inline.
- Revised `plain` output, largely following the style of Project Gutenberg:
- Emphasis is rendered with `_underscores_`, strong emphasis with ALL CAPS.
- Headings are rendered differently, with space to set them off, not with setext style underlines. Level 1 headers are ALL CAPS.
- Math is rendered using unicode when possible, but without the distracting emphasis markers around variables.
- Footnotes use a regular `[n]` style.
- Markdown writer:
- Horizontal rules are now a line across the whole page.
- Prettier pipe tables. Columns are now aligned (#1323).
- Respect the `raw_html` extension. `pandoc -t markdown-raw_html` no longer emits any raw HTML, including span and div tags generated by Span and Div elements.
- Use span with style for `SmallCaps` (#1360).
- HTML writer:
- Autolinks now have class `uri`, and email autolinks have class `email`, so they can be styled.
- Docx writer:
- Document formatting is carried over from `reference.docx`. This includes margins, page size, page orientation, header, and footer, including images in headers and footers.
- Include abstract (if present) with `Abstract` style (#1451).
- Include subtitle (if present) with `Subtitle` style, rather than tacking it on to the title (#1451).
- Org writer:
- Write empty span elements with an id attribute as org anchors. For example `Span ("uid",[],[]) []` becomes `<<uid>>`.
- LaTeX writer:
- Put table captions above tables, to match the conventional standard. (Previously they appeared below tables.)
- Use `\(..\)` instead of `$..$` for inline math (#1464).
- Use `\nolinkurl` in email autolinks. This allows them to be styled using `\urlstyle{tt}`. Thanks to Ulrike Fischer for the solution.
- Use `\textquotesingle` for `'` in inline code. Otherwise we get curly quotes in the PDF output (#1364).
- Use `\footnote<.>{..}` for notes in beamer, so that footnotes do not appear before the overlays in which their markers appear (#1525).
- Don't produce a `\label{..}` for a Div or Span element. Do produce a `\hyperdef{..}` (#1519).
- EPUB writer:
- If the metadata includes `page-progression-direction` (which can be `ltr` or `rtl`), the `page-progression-direction` attribute will be set in the EPUB spine (#1455).
- Custom lua writers:
- Custom writers now work with `--template`.
- Removed HTML header scaffolding from `sample.lua`.
- Made citation information available in lua writers.
- `--normalize` and `Text.Pandoc.Shared.normalize` now consolidate adjacent `RawBlock`s when possible.
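The `--filter` improvements above use `caps.py` as their example. A minimal sketch of what such a filter could look like in Python follows. It assumes only that text nodes in pandoc's JSON output are objects of the form `{"t": "Str", "c": "..."}`; the overall document layout differs between pandoc versions, so the walker is deliberately generic. The function names are made up for this illustration and this is not the `caps.py` shipped with any pandoc distribution.

```python
#!/usr/bin/env python3
"""caps.py: a sketch of a pandoc JSON filter that upper-cases all text."""
import json
import sys


def walk(node):
    """Recursively copy a JSON tree, upper-casing the content of every Str node."""
    if isinstance(node, list):
        return [walk(item) for item in node]
    if isinstance(node, dict):
        if node.get("t") == "Str" and isinstance(node.get("c"), str):
            return {"t": "Str", "c": node["c"].upper()}
        return {key: walk(value) for key, value in node.items()}
    return node


def main():
    raw = sys.stdin.read()
    if not raw.strip():
        return  # nothing was piped in, so there is nothing to transform
    json.dump(walk(json.loads(raw)), sys.stdout)


if __name__ == "__main__":
    main()
```

Saved as `caps.py` in the working directory and made executable, it would be invoked as `pandoc --filter ./caps.py`; because the path is explicit, pandoc no longer searches PATH for it, and the shebang line selects the interpreter.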
API changes
- Added `Text.Pandoc.Readers.Docx`, exporting `readDocx` (Jesse Rosenthal).
- Added `Text.Pandoc.Readers.EPUB`, exporting `readEPUB` (Matthew Pickering).
- Added `Text.Pandoc.Readers.Txt2Tags`, exporting `readTxt2Tags` (Matthew Pickering).
- Added `Text.Pandoc.Writers.DokuWiki`, exporting `writeDokuWiki` (Clare Macrae).
- Added `Text.Pandoc.Writers.Haddock`, exporting `writeHaddock`.
- Added `Text.Pandoc.MediaBag`, exporting `MediaBag`, `lookupMedia`, `insertMedia`, `mediaDirectory`, `extractMediaBag`. The docx and epub readers return a pair of a `Pandoc` document and a `MediaBag` with the media resources they contain. This can be extracted using `--extract-media`. Writers that incorporate media (PDF, Docx, ODT, EPUB, RTF, or HTML formats with `--self-contained`) will look for resources in the `MediaBag` generated by the reader, in addition to the file system or web.
- `Text.Pandoc.Readers.TexMath`: Removed deprecated `readTeXMath`. Renamed `readTeXMath'` to `texMathToInlines`.
- `Text.Pandoc`: Added `Reader` data type (Matthew Pickering). `readers` now associates names of readers with `Reader` structures. This allows inclusion of readers, like the docx reader, that take binary rather than textual input.
- `Text.Pandoc.Shared`:
- Added `capitalize` (Artyom Kazak), and replaced uses of `map toUpper` (which give bad results for many languages).
- Added `collapseFilePath`, which removes intermediate `.` and `..` from a path (Matthew Pickering).
- Added `fetchItem'`, which works like `fetchItem` but searches a `MediaBag` before looking on the net or file system.
- Added `withTempDir`.
- Added `removeFormatting`.
- Added `extractSpaces` (from HTML reader) and generalized its type so that it can be used by the docx reader (Matthew Pickering).
- Added `ordNub`.
- Added `normalizeInlines`, `normalizeBlocks`.
- `normalize` is now `Pandoc -> Pandoc` instead of `Data a => a -> a`. Some users may need to change their uses of `normalize` to the newly exported `normalizeInlines` or `normalizeBlocks`.
- `Text.Pandoc.Options`:
- Added `writerMediaBag` to `WriterOptions`.
- Removed deprecated and no longer used `readerStrict` in `ReaderOptions`. This is handled by `readerExtensions` now.
- Added `Ext_compact_definition_lists`.
- Added `Ext_epub_html_exts`.
- Added `Ext_native_divs` and `Ext_native_spans`. This allows users to turn off the default pandoc behavior of parsing contents of div and span tags in markdown and HTML as native pandoc Div blocks and Span inlines.
- `Text.Pandoc.Parsing`:
- Generalized `readWith` to `readWithM` (Matthew Pickering).
- Export `runParserT` and `Stream` (Matthew Pickering).
- Added `HasQuoteContext` type class (Matthew Pickering).
- Generalized types of `mathInline`, `smartPunctuation`, `quoted`, `singleQuoted`, `doubleQuoted`, `failIfInQuoteContext`, `applyMacros` (Matthew Pickering).
- Added custom `token` (Matthew Pickering).
- Added `stateInHtmlBlock` to `ParserState`. This is used to keep track of the ending tag we're waiting for when we're parsing inside HTML block tags.
- Added `stateMarkdownAttribute` to `ParserState`. This is used to keep track of whether the markdown attribute has been set in an enclosing tag.
- Generalized type of `registerHeader`, using new type classes `HasReaderOptions`, `...
pandoc 1.12.4.2
- Require highlighting-kate >= 0.5.8. Fixes a performance regression.
- Shared: `addMetaValue` now behaves slightly differently: if both the new and old values are lists, it concatenates their contents to form a new list.
- LaTeX reader:
- Set `bibliography` in metadata from `\bibliography` or `\addbibresource` command.
- Don't error on `%foo` with no trailing newline.
- Org reader:
- Support code block headers (`#+BEGIN_SRC ...`) (Albert Krewinkel).
- Fix parsing of blank lines within blocks (Albert Krewinkel).
- Support pandoc citation extension (Albert Krewinkel). This can be turned off by specifying `org-citation` as the input format.
- Markdown reader: `citeKey` moved to `Text.Pandoc.Parsing` so it can be used by other readers (Albert Krewinkel).
- `Text.Pandoc.Parsing`:
- Added `citeKey` (see above).
- Added `HasLastStrPosition` type class and `updateLastStrPos` and `notAfterString` functions.
- Updated copyright notices (Albert Krewinkel).
- Added default.icml to data files so it installs with the package.
- OSX package:
- The binary is now built with options to ensure that it can be used with OSX 10.6+.
- Moved OSX package materials to osx directory.
- Added OSX package uninstall script, included in the zip container (thanks to Daniel T. Staal).
pandoc 1.12.4
- Made it possible to run filters that aren't executable (#1096).
Pandoc first tries to find the executable (searching the path
if path isn't given). If it fails, but the file exists and has
a `.py`, `.pl`, `.rb`, `.hs`, or `.php` extension, pandoc runs the filter
using the appropriate interpreter. This should make it easier to
use filters on Windows, and make it more convenient for everyone.
- Added Emacs org-mode reader (Albert Krewinkel).
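The filter lookup described in the first item of this release (#1096) can be sketched roughly as follows. This is an illustrative Python sketch, not pandoc's implementation: the function name `filter_command` and the interpreter command names in `INTERPRETERS` are assumptions, since the notes only say "the appropriate interpreter".

```python
import os
import shutil

# Assumed extension-to-interpreter mapping (hypothetical command names).
INTERPRETERS = {
    ".py": "python",
    ".pl": "perl",
    ".rb": "ruby",
    ".hs": "runhaskell",
    ".php": "php",
}


def filter_command(path, which=shutil.which, exists=os.path.exists):
    """Return the argv used to launch a filter, or None if it cannot be run.

    `which` and `exists` are injectable so the lookup can be exercised
    without touching the real file system.
    """
    # Step 1: try to find an executable (searches PATH for bare names).
    executable = which(path)
    if executable is not None:
        return [executable]
    # Step 2: fall back to an interpreter chosen by file extension,
    # but only if the file actually exists.
    extension = os.path.splitext(path)[1]
    if exists(path) and extension in INTERPRETERS:
        return [INTERPRETERS[extension], path]
    return None
```

With this shape, a non-executable `myfilter.py` that exists on disk would be launched through the interpreter, while a file with an unlisted extension would still be rejected.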
- Added InDesign ICML Writer (mb21).
- MediaWiki reader:
- LaTeX reader:
- Give better location information on errors, pointing to line
numbers within included files (#1274).
- LaTeX reader: Better handling of `table` environment (#1204).
Positioning options no longer rendered verbatim.
- Better handling of figure and table with caption (#1204).
- Handle `@{}` and `p{length}` in tabular. The length is not actually
recorded, but at least we get a table (#1180).
- Properly handle `\nocite`. It now adds a `nocite` metadata
field. Citations there will appear in the bibliography but not
in the text (unless you explicitly put a `$nocite$` variable
in your template).
- Markdown reader:
- Ensure that whole numbers in YAML metadata are rendered without
decimal points. (This became necessary with changes to aeson
and yaml libraries. aeson >= 0.7 and yaml >= 0.8.8.2 are now required.)
- Fixed regression on line breaks in strict mode (#1203).
- Small efficiency improvements.
- Improved parsing of nested `div`s. Formerly a closing `div` tag
would be missed if it came right after other block-level tags.
- Avoid backtracking when closing `</div>` not found.
- Fixed bug in reference link parsing in `markdown_mmd`.
- Fixed a bug in list parsing (#1154). When reading a raw list
item, we now strip off up to 4 spaces.
- Fixed parsing of empty reference link definitions (#1186).
- Made one-column pipe tables work (#1218).
- Textile reader:
- Better support for attributes. Instead of being ignored, attributes
are now parsed and included in Span inlines. The output will be a bit
different from stock textile: e.g. for `*(foo)hi*`, we'll get
`<em><span class="foo">hi</span></em>` instead of
`<em class="foo">hi</em>`. But at least the data is not lost.
- Improved treatment of HTML spans (%) (#1115).
- Improved link parsing. In particular we now pick up on attributes.
Since pandoc links can't have attributes, we enclose the whole link in
a span if there are attributes (#1008).
- Implemented correct parsing rules for inline markup (#1175, Matthew
Pickering).
- Use Builder (Matthew Pickering).
- DocBook reader:
- Better treatment of `formalpara`. We now emit the title (if present)
as a separate paragraph with boldface text (#1215).
- Set metadata `author` not `authors`.
- Added recognition of `authorgroup` and `releaseinfo` elements (#1214,
Matthew Pickering).
- Converted current meta information parsing in DocBook to a more
extensible version which is aware of the more recent meta
representation (Matthew Pickering).
- HTML reader:
- Require tagsoup 0.13.1, to fix a bug with parsing of script tags
(#1248).
- Treat processing instructions & declarations as block. Previously
these were treated as inline, and included in paragraph tags in HTML
or DocBook output, which is generally not what is wanted (#1233).
- Updated `closes` with rules from HTML5 spec.
- Use Builder (Matthew Pickering, #1162).
- RST reader:
- Remove duplicate `http` in PEP links (Albert Krewinkel).
- Make rst figures true figures (#1168, CasperVector).
- Enhanced Pandoc's support for rST roles (Merijn Verstraaten).
rST parser now supports: all built-in rST roles, new role definition,
role inheritance, though with some limitations.
- Use `author` rather than `authors` in metadata.
- Better handling of directives. We now correctly handle field
lists that are indented more than three spaces. We treat an
`aafig` directive as a code block with attributes, so it can be
processed in a filter (#1212).
- LaTeX writer:
- Mark span contents with label if span has an ID (Albert Krewinkel).
- Made `--toc-depth` work well with books in latex/pdf output (#1210).
- Handle line breaks in simple table cells (#1217).
- Workaround for level 4-5 headers in quotes. These previously produced
invalid LaTeX: `\paragraph` or `\subparagraph` in a `quote` environment.
This adds an `\mbox{}` in these contexts to work around the problem.
See http://tex.stackexchange.com/a/169833/22451 (#1221).
- Use `\/` to avoid en-dash ligature instead of `-{}-` (Vaclav Zeman).
This is to fix LuaLaTeX output. The `-{}-` sequence does not avoid the
ligature with LuaLaTeX but `\/` does.
- Fixed string escaping in `hyperref` and `hyperdef` (#1130).
- ConTeXt writer: Improved autolinks (#1270).
- DocBook writer:
- Improve handling of hard line breaks in Docbook writer
(Neil Mayhew). Use a `<literallayout>` for the entire paragraph, not
just for the newline character.
- Don't let line breaks inside footnotes influence the enclosing
paragraph (Neil Mayhew).
- Distinguish tight and loose lists in DocBook output, using
`spacing="compact"` (Neil Mayhew, #1250).
- Docx writer: When needed files are not present in the user's
`reference.docx`, fall back on the versions in the `reference.docx`
in pandoc's data files. This fixes a bug that occurs when a
`reference.docx` saved by LibreOffice is used (#1185).
- EPUB writer:
- Include extension in epub ids. This fixes a problem with duplicate
ids for fonts and images with the same base name but different
extensions (#1254).
- Handle files linked in raw `img` tags (#1170).
- Handle media in `audio` source tags (#1170).
Note that we now use a `media` directory rather than `images`.
- Incorporate files linked in `video` tags (#1170). `src` and `poster`
will both be incorporated into `content.opf` and the epub container.
- HTML writer:
- Add colgroup around col tags (#877). Also affects EPUB writer.
- Fixed bug with unnumbered section headings. Unnumbered section
headings (with class `unnumbered`) were getting numbers.
- Improved detection of image links. Previously image links with
queries were not recognized, causing `<embed>` to be used instead
of `<img>`.
- Man writer: Ensure that terms in definition lists aren't line wrapped
(#1195).
- Markdown writer:
- Use proper escapes to avoid unwanted lists (#980). Previously we used
0-width spaces, an ugly hack. - Use longer backtick fences if needed (#1206). If the content contains a
backtick fence and there are attributes, make sure longer fences are
used to delimit the code. Note: This works well in pandoc, but github
markdown is more limited, and will interpret the first string of three
or more backticks as ending the code block.
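The fence-length rule in the item above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the helper names `fence_for` and `fenced_block` are hypothetical, and the sketch lengthens the fence past any run of backticks in the content, rather than reproducing the writer's exact conditions (a fence in the content plus attributes).

```python
import re


def fence_for(code, minimum=3):
    """Return a backtick fence strictly longer than any backtick run in `code`."""
    runs = re.findall(r"`+", code)
    longest = max((len(run) for run in runs), default=0)
    return "`" * max(minimum, longest + 1)


def fenced_block(code, attrs=""):
    """Wrap `code` in a fence that cannot be closed early by its own content."""
    fence = fence_for(code)
    return f"{fence}{attrs}\n{code}\n{fence}"
```

Content that itself contains a three-backtick fence gets a four-backtick delimiter, which pandoc reads back correctly. As the note above says, a parser that treats the first run of three or more backticks as the closing fence will still end the block early.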
- RST writer: Avoid stack overflow with certain tables (#1197).
- RTF writer: Fixed table cells containing paragraphs.
- Custom writer:
- AsciiDoc writer: Handle multiblock and empty table cells
(#1245, #1246). Added tests.
- `Text.Pandoc.Options`: Added `readerTrace` to `ReaderOptions`.
- `Text.Pandoc.Shared`:
- Added `compactify'DL` (formerly in markdown reader) (Albert Krewinkel).
- Fixed bug in `toRomanNumeral`: numbers ending with '9' would
be rendered as Roman numerals ending with 'IXIV' (#1249). Thanks to
Jesse Rosenthal.
- `openURL`: set proxy with value of http_proxy env variable (#1211).
Note: proxies with non-root paths are not supported, due to
limitations in `http-conduit`.
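For reference, the expected behavior behind the `toRomanNumeral` fix (#1249) can be shown with a short Python sketch. This is not pandoc's Haskell code; `to_roman` is a hypothetical name, and the table-driven greedy approach is just one common way to get a trailing 9 rendered as `IX`.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_TABLE = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number):
    """Convert a positive integer to an uppercase Roman numeral."""
    if number <= 0:
        raise ValueError("Roman numerals are only defined for positive integers")
    parts = []
    for value, symbol in ROMAN_TABLE:
        # Emit the symbol as many times as it fits, then continue
        # with the remainder so nothing is emitted twice.
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)
```

Because the remainder is carried forward after each step, `IX` consumes the whole 9 and no `IV` can follow it.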
- `Text.Pandoc.PDF`:
- Ensure that temp directories are deleted on Windows (#1192). The PDF is
now read as a strict bytestring, ensuring that process ownership will
be terminated, so the temp directory can be deleted.
- Use `/` as path separators in a few places, even on Windows.
This seems to be necessary for texlive (#1151, thanks to Tim Lin).
- Use `;` for `TEXINPUTS` separator on Windows (#1151).
- Changes to error reporting, to handle non-UTF8 error output.
- `Text.Pandoc.Templates`:
- Removed unneeded datatype context (Merijn Verstraaten).
- YAML objects resolve to "true" in conditionals (#1133).
Note: If `address` is a YAML object and you just have `$address$`
in your template, the word `true` will appear, which may be
unexpected. (Previously nothing would appear.)
- `Text.Pandoc.SelfContained`: Handle `poster` attribute in `video`
tags (#1188).
- `Text.Pandoc.Parsing`:
- Made `F` an instance of Applicative (#1138).
- Added `stateCaption`.
- Added `HasMacros`, simplified other typeclasses.
Removed `updateHeaderMap`, `setHeaderMap`, `getHeaderMap`,
`updateIdentifierList`, `setIdentifierList`, `getIdentifierList`.
- Changed the smart punctuation parser to return `Inlines`
rather than `Inline` (Matthew Pickering).
- Changed `HasReaderOptions`, `HasHeaderMap`, `HasIdentifierList`
from typeclasses of monads to typeclasses of states. This simplifies
the instance definitions and provides more flexibility. Generalized
type of `getOption` and added a default definition. Removed
`askReaderOption`. Added `extractReaderOption`. Added
`extractHeaderMap` and `updateHeaderMap` in `HasHeaderMap`.
...
pandoc 1.12.3
- The `--bibliography` option now sets the `biblio-files` variable. So, if you're using `--natbib` or `--biblatex`, you can just use `--bibliography=foo.bib` instead of `-V biblio-files=foo`.
- Don't run pandoc-citeproc filter if `--bibliography` is used together with `--natbib` or `--biblatex` (Florian Eitel).
- Template changes:
- Updated beamer template to include booktabs.
- Added `abstract` variable to LaTeX template.
- Put `header-includes` after `title` in LaTeX template (#908).
- Allow use of `\includegraphics[size]` in beamer. This just required porting a macro definition from the default LaTeX template to the default beamer template.
- `reference.docx`: Include `FootnoteText` style. Otherwise Word ignores the style, even when specified in the `pPr` (#901).
- `reference.odt`: Tidied `styles.xml`.
- Relaxed version bounds for dependencies.
- Added `withSocketsDo` around http conduit code in `openURL`, so it works on Windows (#1080).
- Added `Cite` function to `sample.lua`.
- Markdown reader:
- Fixed regression in title blocks (#1089). If author field was empty, date was being ignored.
- Allow backslash-newline hard line breaks in grid and multiline table cells.
- Citation keys may now start with underscores, and may contain underscores adjacent to internal punctuation.
- LaTeX reader:
- Add support for `Verb` macro (jrnold) (#1090).
- Support babel-style quoting: `` "`..."' ``.
- Properly handle script blocks in strict mode. (That is, `markdown-markdown_in_html_blocks`.) Previously a spurious `<p>` tag was being added (#1093).
- Docbook reader: Avoid failure if `tbody` contains no `tr` or `row` elements.
- LaTeX writer:
- Factored out function for table cell creation.
- Better treatment of footnotes in tables. Notes now appear in the regular sequence, rather than in the table cell. (This was a regression in 1.10.)
- HTML reader: Parse name/content pairs from meta tags as metadata. Closes #1106.
- Moved `fixDisplayMath` from Docx writer to `Writer.Shared`.
- OpenDocument writer: Fixed `RawInline`, `RawBlock` so they don't escape.
- ODT writer: Use mathml for proper rendering of formulas. Note: LibreOffice's support for this seems a bit buggy. But it should be better than what we had before.
- RST writer: Ensure no blank line after def in definition list (#992).
- Markdown writer: Don't use tilde code blocks with braced attributes in `markdown_github` output. A consequence of this change is that the backtick form will be preferred in general if both are enabled. That is good, as it is much more widespread than the tilde form. (#1084)
- Docx writer: Fixed problem with some modified reference docx files. Include `word/_rels/settings.xml.rels` if it exists, as well as other `rels` files besides the ones pandoc generates explicitly.
- HTML writer:
- With `--toc`, headers no longer link to themselves (#1081).
- Omit footnotes from TOC entries. Otherwise we get doubled footnotes when headers have notes!
- EPUB writer:
- Avoid duplicate notes when headings contain notes. This arose because the headings are copied into the metadata "title" field, and the note gets rendered twice. We strip the note now before putting the heading in "title".
- Strip out footnotes from toc entries.
- Fixed bug with `--epub-stylesheet`. Now the contents of `writerEpubStylesheet` (set by `--epub-stylesheet`) should again work, and take precedence over a stylesheet specified in the metadata.
- `Text.Pandoc.Pretty`: Added `nestle`. API change.
- `Text.Pandoc.MIME`: Added `wmf`, `emf`.
- `Text.Pandoc.Shared`: `fetchItem` now handles image URLs beginning with `//`.
- `Text.Pandoc.ImageSize`: Parse EXIF format JPEGs. Previously we could only get size information for JFIF format, which led to squished images in Word documents. Closes #976.
- Removed old `MarkdownTest_1.0.3` directory (#1104).
pandoc 1.12.2.1
- Markdown reader: Fixed regression in list parser, involving continuation lines containing raw HTML (or even verbatim raw HTML).