# dfc/pandoc forked from jgm/pandoc

pandoc (1.10.1.1)

  * Windows installer improvements:
    + The installer is now signed with a certificate.
    + WiX is used instead of InnoSetup. The installer is now a standard msi file.
    + The version number is now auto-detected, and need not be updated separately.
  * OSX installer improvements:
    + The package and pandoc executable are now signed with a certificate.
    + RTF version of license is used.
  * Converted COPYING to markdown.
  * Text.Pandoc.UTF8: Strip off BOM if present. Closes #743.
  * README: List proper default data directory for Windows 7.
  * Added --default-image-extension and readerDefaultImageExtension. This allows you to determine extensions for extensionless image files from the command line, using different extensions for different output formats. Currently only works for input in markdown and LaTeX.
  * Beamer template: Fixed captions with longtable. Thanks to Joost Kremers.
  * Text.Pandoc.Parsing: Optimized oneOfStringsCI. This dramatically reduces the speed penalty that comes from enabling the autolink_bare_uris extension. The penalty is still substantial (in one test, from 0.33s to 0.44s), but nowhere near what it used to be. The RST reader is also much faster now, as it autodetects URIs.
  * HTML reader: Handle  tag.

pandoc (1.10.1)

  * Markdown reader: various optimizations, leading to a significant performance boost.
  * RST reader: Allow anonymous form of inline links:  hello __  Closes #724.
  * Mediawiki reader: Don't require newlines after tables. Thanks to jrunningen for the patch. Closes #733.
  * Fixed LaTeX macro parsing. Now LaTeX macro definitions are preserved when output is LaTeX, and applied when it is another format. Partially addresses #730.
  * Markdown and RST readers: Added parser to block that skips blank lines. This fixes a subtle regression involving grid tables with empty cells. Also added test for grid table with empty cells. Closes #732.
  * RST writer: Use .. code:: language for code blocks with language. Closes #721.
  * DocBook writer: Fixed output for hard line breaks, adding a newline between  tags.
  * Markdown writer: Use an autolink when link text matches url. Previously we also checked for a null title, but this test fails for links produced by citeproc-hs in bibliographies. So, if the link has a title, it will be lost on conversion to an autolink, but that seems okay.
  * Markdown writer: Set title, author, date variables as before. These are no longer used in the default template, since we use titleblock, but we set them anyway for those who use custom templates.
  * LaTeX writer: Avoid extra space at start/end of table cell. Thanks to Nick Bart for the suggestion of using @{}.
  * Text.Pandoc.Parsing:
    + More efficient version of anyLine.
    + Type of macro has changed; the parser now returns Blocks instead of Block.
  * Relaxed old-time version bound, allowing 1.0.*.
  * Removed obsolete hsmarkdown script. Those who need hsmarkdown should create a symlink as described in the README.

pandoc (1.10.0.5)

  * Markdown reader: Try lhsCodeBlock before rawTeXBlock. Otherwise \begin{code}...\end{code} isn't handled properly in markdown+lhs. Thanks to Daniel Miot for noticing the bug and suggesting the fix.
  * Markdown reader: Fixed bug with headerless grid tables. The 1.10 code assumed that each table header cell contains exactly one block. That failed for headerless tables (0) and also for tables with multiple blocks in a header cell. The code is fixed and tests provided. Thanks to Andrew Lee for pointing out the bug.
  * Markdown reader: Fixed regressions in fenced code blocks. Closes #722.
    + Tilde code fences can again take a bare language string (~~~ haskell), not just curly-bracketed attributes (~~~ {.haskell}).
    + Backtick code blocks can take the curly-bracketed attributes.
    + Backtick code blocks don't *require* a language.
    + Consolidated code for the two kinds of fenced code blocks.
  * LaTeX template: Use \urlstyle{same} to avoid monospace URLs.
  * Markdown writer: Use proportional font for email autolinks with obfuscation. Closes #714.
  * Corrected name of blank_before_blockquote in README. Closes #718.
  * Text.Pandoc.Shared: Fixed bug in uri parser. The bug prevented an autolink at the end of a string (e.g. at the end of a line block line) from counting as a link. Closes #711.
  * Use the hsb2hs preprocessor instead of TH for embed_data_files. This should work on Windows, unlike the TH solution with file-embed.
  * Eliminated use of TH in test suite.
  * Added Text.Pandoc.Data (non-exported) to hold the association list of embedded data files, if the embed_data_files flag is selected. This isolates the code that needs special treatment with file-embed or hsb2hs.
  * Changes to make-windows-installer.bat.
    + Exit batch file if any of the cabal-dev installs fail.
    + There's no longer any need to reinstall highlighting-kate.
    + Don't start with a cabal update; leave that to the user.
    + Force reinstall of pandoc.
  * Fixed EPUB writer so it builds with blaze-html 0.4.x. Thanks to Jens Petersen.

pandoc (1.10.0.4)

  * Fixed bug with escaped % in LaTeX reader. Closes #710.

pandoc (1.10.0.3)

  * Added further missing fb2 tests to cabal file.

pandoc (1.10.0.2)

  * Added fb2 tests to cabal file's extra-source-files.

pandoc (1.10.0.1)

  * Bump version bounds on test-framework packages.

pandoc (1.10)

  [new features]

  * New input formats: mediawiki (MediaWiki markup).
  * New output formats: epub3 (EPUB v3 with MathML), fb2 (FictionBook2 ebooks).
  * New --toc-depth option, specifying how many levels of headers to include in a table of contents.
  * New --epub-chapter-level option, specifying the header level at which to divide EPUBs into separate files. Note that this normally affects only performance, not the visual presentation of the EPUB in a reader.
  * Removed the --strict option. Instead of using --strict, one can now use the format name markdown_strict for either input or output.
    This gives more fine-grained control than --strict did, allowing one to convert from pandoc's markdown to strict markdown or vice versa.
  * It is now possible to enable or disable specific syntax extensions by appending them (with + or -) to the writer or reader name. For example, pandoc -f markdown-footnotes+hard_line_breaks disables footnotes and enables treating newlines as hard line breaks. The literate Haskell extensions are now implemented this way as well, using either +lhs or +literate_haskell. For a list of extension names, see the README under "Pandoc's Markdown."
  * The following aliases have been introduced for specific combinations of markdown extensions: markdown_phpextra, markdown_github, markdown_mmd, markdown_strict. These aliases work just like regular reader and writer names, and can be modified with extension modifiers as described above. (Note that conversion from one markdown dialect to another does not work perfectly, because there are differences in markdown parsers besides just the extensions, and because pandoc's internal document model is not rich enough to capture all of the extensions.)
  * New --html-q-tags option. The previous default was to use <q> tags for smart quotes in HTML5. But <q> tags are also valid HTML4. Moreover, they are not a robust way of typesetting quotes, since some user agents don't support them, and some CSS resets (e.g. bootstrap) prevent pandoc's quotes CSS from working properly. We now just insert literal quote characters by default in both html and html5 output, but this option is provided for those who still want <q> tags.
  * The markdown reader now prints warnings (to stderr) about duplicate link and note references. Closes #375.
  * Markdown syntax extensions:
    + Added pipe tables. Thanks to François Gannaz for the initial patch. These conform to PHP Markdown Extra's pipe table syntax. A subset of org-mode table syntax is also supported, which means that you can use org-mode's nice table editor to create tables.
    + Added support for RST-style line blocks. These are useful for verse and addresses.
    + Attributes can now be specified for headers, using the same syntax as in code blocks. (However, currently only the identifier has any effect in most writers.) For example,

          # My header {#foo}

          See [the header above](#foo).

    + Pandoc will now act as if link references have been defined for all headers without explicit identifiers. So, you can do this:

          # My header

          Link to [My header].
          Another link to [it][My header].

      Closes #691.
  * LaTeX reader:
    + Command macros now work everywhere, including non-math. Environment macros are still not supported.
    + \input now works, as well as \include. TEXINPUTS is used. Pandoc looks recursively into included files for more included files.

  [behavior changes]

  * The Markdown reader no longer puts the text of autolinks in a Code inline. This means that autolinks will no longer appear in a monospace font.
  * The character / can now appear in markdown citation keys.
  * HTML blocks in strict_markdown are no longer required to begin at the left margin. Technically this is required, according to the markdown syntax document, but Markdown.pl and other markdown processors are more liberal.
  * The -V option has been changed so that if there are duplicate variables, those specified later on the command line take precedence.
  * Tight lists now work in LaTeX and ConTeXt output.
  * The LaTeX writer no longer relies on the enumerate package. Instead, it uses standard LaTeX commands to change the list numbering style.
  * The LaTeX writer now uses longtable instead of ctable. This allows tables to be split over page boundaries.
  * The RST writer now uses a line block to render paragraphs containing linebreaks (which previously weren't supported at all).
  * The markdown writer now applies the --id-prefix to footnote IDs. Closes #614.
  * The plain writer no longer uses backslash-escaped line breaks (which are not very "plain").
  * Text.Pandoc.UTF8: Better error message for invalid UTF8. Read bytestring and use Text's decodeUtf8 instead of using System.IO.hGetContents. This way you get a message saying "invalid UTF-8 stream" instead of "invalid byte sequence." You are also told which byte caused the problem.
  * Docx, ODT, and EPUB writers now download images specified by a URL instead of skipping them or raising an error.
  * EPUB writer:
    + The default CSS now left-aligns headers by default, instead of centering. This is more consistent with the rest of the writers.
    + A proper multi-level table of contents is now used in toc.ncx. There is no longer a subsidiary table of contents at the beginning of each chapter.
    + Code highlighting now works by default.
    + Section divs are used by default for better semantic markup.
    + The title is used instead of "Title Page" in the table of contents. Otherwise we have a hard-coded English string, which looks strange in ebooks written in other languages. Closes #572.
  * HTML writer:
    + Put mathjax in span with class "math". Closes #562.
    + Put citations in a span with class "citation." In HTML5, also include a data-cite attribute with a space-separated list of citation keys.
  * Text.Pandoc.UTF8: use universalNewlineMode in reading. This treats both \r\n and \n as \n on input, no matter what platform we're running on.
  * Citation processing is now done in the Markdown and LaTeX readers, not in pandoc.hs. This makes it easier for library users to use citations.

  [template changes]

  * HTML: Added css to template to preserve spaces in  tags. Thanks to Dirk Laurie.
  * Beamer: Remove English-centric strings in section pages. Section pages used to have "Section" and a number as well as the section title. Now they just have the title. Similarly for part and subsection. Closes #566.
  * LaTeX, ConTeXt: Added papersize variable.
  * LaTeX, Beamer templates: Use longtable instead of ctable.
  * LaTeX, Beamer templates: Don't require 'float' package for tables.
    We don't actually seem to use the '[H]' option.
  * LaTeX: Use upquote package if it is available. This fixes straight quotes in verbatim environments.
  * Markdown, plain: Fixed titleblock so it is just a single string. Previously separate title, author, and date variables were used, but this didn't allow different kinds of title blocks.
  * EPUB:
    + Rationalized templates. Previously there were three different templates involved in epub production. There is now just one template, default.epub or default.epub3. It can now be overridden using --template, just like other templates. The titlepage is now folded into the default template. A titlepage variable selects it.
    + UTF-8, lang tag, meta tags, title element.
  * Added scale-to-width feature to beamer template.

  [API changes]

  * Text.Pandoc.Definition: Added Attr field to Header. Previously header identifiers were autogenerated by the writers. Now they are added in the readers (either automatically or explicitly).
  * Text.Pandoc.Builder:
    + Inlines and Blocks are now synonyms for Many Inline and Many Block. Many is a newtype wrapper around Seq, with custom Monoid instances for Many Inline and Many Block. This allows Many to be made an instance of Foldable and Traversable.
    + The old Listable class has been removed.
    + The module now exports isNull, toList, fromList.
    + The old Read and Show instances have been removed; derived instances are now used.
    + Added headerWith.
  * The readers now take a ReaderOptions rather than a ParserState as a parameter. Indeed, not all parsers use the ParserState type; some have a custom state. The motivation for this change was to separate user-specifiable options from the accounting functions of parser state.
  * New module Text.Pandoc.Options. This includes the WriterOptions formerly in Text.Pandoc.Shared, and its associated data types.
    It also includes a new type ReaderOptions, which contains many options formerly in ParserState, and its associated data types:
    + ParserState.stateParseRaw -> ReaderOptions.readerParseRaw.
    + ParserState.stateColumns -> ReaderOptions.readerColumns.
    + ParserState.stateTabStop -> ReaderOptions.readerTabStop.
    + ParserState.stateOldDashes -> ReaderOptions.readerOldDashes.
    + ParserState.stateLiterateHaskell -> ReaderOptions.readerLiterateHaskell.
    + ParserState.stateCitations -> ReaderOptions.readerReferences.
    + ParserState.stateApplyMacros -> ReaderOptions.readerApplyMacros.
    + ParserState.stateIndentedCodeClasses -> ReaderOptions.readerIndentedCodeClasses.
    + Added ReaderOptions.readerCitationStyle.
  * WriterOptions now includes writerEpubVersion, writerEpubChapterLevel, writerEpubStylesheet, writerEpubFonts, writerReferenceODT, writerReferenceDocx, and writerTOCDepth. writerEPUBMetadata has been renamed writerEpubMetadata for consistency.
  * Changed signatures of writeODT, writeDocx, writeEPUB, since they no longer take stylesheet, fonts, and reference files as separate parameters.
  * Removed writerLiterateHaskell from WriterOptions, and readerLiterateHaskell from ReaderOptions. LHS is now handled by an extension (Ext_literate_haskell).
  * Removed deprecated writerXeTeX.
  * Removed writerStrict from WriterOptions. Added writerExtensions. Strict is now handled through extensions.
  * Text.Pandoc.Options exports pandocExtensions, strictExtensions, phpMarkdownExtraExtensions, githubMarkdownExtensions, and multimarkdownExtensions, as well as the Extensions type.
  * New Text.Pandoc.Readers.MediaWiki module, exporting readMediaWiki.
  * New Text.Pandoc.Writers.FB2 module, exporting writeFB2 (thanks to Sergey Astanin).
  * Text.Pandoc:
    + Added getReader, getWriter to Text.Pandoc.
    + writers is now an association list (String, Writer). A Writer can be a PureStringWriter, an IOStringWriter, or an IOByteStringWriter. ALL writers are now in the 'writers' list, including the binary writers and FB2 writer.
      This allows code in pandoc.hs to be simplified.
    + Changed type of readers, so all readers are in IO. Users who want pure readers can still get them from the reader modules; this just affects the function getReader that looks up a reader based on the format name. The point of this change is to make it possible to print warnings from the parser.
  * Text.Pandoc.Parsing:
    + The module now exports all Parsec functions used in pandoc code. No other module directly imports Parsec. This will make it easier to change the parsing backend in the future, if we want to.
    + Text.Parsec is used instead of Text.ParserCombinators.Parsec.
    + Export the type synonym Parser.
    + Export widthsFromIndices, NoteTable', KeyTable', Key', toKey', withQuoteContext, singleQuoteStart, singleQuoteEnd, doubleQuoteStart, doubleQuoteEnd, ellipses, apostrophe, dash, nested, F(..), askF, asksF, runF, lineBlockLines.
    + ParserState is no longer an instance of Show.
    + Added stateSubstitutions and stateWarnings to ParserState.
    + Generalized type of withQuoteContext.
    + Added guardEnabled, guardDisabled, getOption.
    + Removed failIfStrict.
    + lookupKeySrc and fromKey are no longer exported.
  * Data.Default instances are now provided for ReaderOptions, WriterOptions, and ParserState. Text.Pandoc re-exports def, so you can now use def instead of defaultWriterOptions (which is still defined). Closes #546.
  * Text.Pandoc.Shared:
    + Added safeRead.
    + Renamed removedLeadingTrailingSpace to trim, removeLeadingSpace to triml, and removeTrailingSpace to trimr.
    + Count \r as space in trim functions.
    + Moved renderTags' from HTML reader and Text.Pandoc.SelfContained to Shared.
    + Removed failUnlessLHS.
    + Export compactify', formerly in Markdown reader.
    + Export isTightList.
    + Do not export findDataFile.
    + readDataFile now returns a strict ByteString.
    + Export readDataFileUTF8 which returns a String, like the old readDataFile.
    + Export fetchItem and openURL.
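The trim helpers and safeRead mentioned in the Text.Pandoc.Shared entry can be illustrated with a small standalone sketch. This is not pandoc's source: the names trim, triml, trimr and safeRead come from the changelog, but the bodies below, the helper isTrimSpace, and the choice of Maybe as the result type of safeRead are assumptions made for illustration.

```haskell
import Data.Char (isSpace)

-- Hypothetical helper: the characters the trim functions treat as
-- whitespace. '\r' is included, following the changelog entry
-- "Count \r as space in trim functions".
isTrimSpace :: Char -> Bool
isTrimSpace c = c `elem` " \t\r\n"

-- Remove leading whitespace.
triml :: String -> String
triml = dropWhile isTrimSpace

-- Remove trailing whitespace.
trimr :: String -> String
trimr = reverse . triml . reverse

-- Remove whitespace from both ends.
trim :: String -> String
trim = triml . trimr

-- A read that cannot crash: it succeeds only if 'reads' yields exactly
-- one parse and nothing but whitespace is left over. Returning Maybe is
-- an assumption of this sketch.
safeRead :: Read a => String -> Maybe a
safeRead s = case reads s of
  [(x, rest)] | all isSpace rest -> Just x
  _                              -> Nothing
```

With these definitions, trim applied to a line ending in "\r\n" drops both characters, and safeRead "42x" :: Maybe Int gives Nothing where a bare read would raise an exception.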
  * Text.Pandoc.ImageSize: Use strict, not lazy bytestrings. Removed readImageSize.
  * Text.Pandoc.UTF8: Export encodePath, decodePath, decodeArg, toString, fromString, toStringLazy, fromStringLazy.
  * Text.Pandoc.UTF8 is now an exposed module.
  * Text.Pandoc.Biblio:
    + csl parameter now a String rather than a FilePath.
    + Changed type of processBiblio. It is no longer in the IO monad. It now takes a Maybe Style argument rather than parameters for CSL and abbrev filenames. (pandoc.hs now calls the functions to parse the style file and add abbreviations.)
  * Markdown reader now exports readMarkdownWithWarnings.
  * Text.Pandoc.RTF now exports writeRTFWithEmbeddedImages instead of rtfEmbedImage.

  [bug fixes]

  * Make --ascii work properly with --self-contained. Closes #568.
  * Markdown reader:
    + Fixed link parser to avoid exponential slowdowns. Closes #620. Previously the parser would hang on input like this:

          [[[[[[[[[[[[[[[[[[hi

      We fixed this by making the link parser parse characters between balanced brackets (skipping brackets in inline code spans), then parsing the result as an inline list. One change is that [hi *there]* bud](/url) is now no longer parsed as a link. But in this respect pandoc behaved differently from most other implementations anyway, so that seems okay.
    + Look for raw html/latex blocks before tables. Otherwise the following gets parsed as a table:

          \begin{code}
          --------------
          -- My comment.
          \end{code}

      Closes #578.
  * RST reader:
    + Added support for :target: on .. image:: blocks and substitutions.
    + Field list fixes:
      - Fixed field lists items with body beginning after a new line (Denis Laxalde).
      - Allow any char but ':' in names of field lists in RST reader (Denis Laxalde).
      - Don't allow line breaks in field names.
      - Require whitespace after field list field names.
      - Don't create empty definition list for metadata field lists.
        Previously a field list consisting only of metadata fields (author, title, date) would be parsed as an empty DefinitionList, which is not legal in LaTeX and not needed in any format.
    + Don't recognize inline-markup starts inside words. For example, 2*2 = 4*1 should not contain an emphasized section. Added test case for "Literal symbols". Closes #569.
    + Allow dashes as separator in simple tables. Closes #555.
    + Added support for container, compound, epigraph, rubric, highlights, pull-quote.
    + Added support for .. code::.
    + Made directive labels case-insensitive.
    + Removed requirement that directives begin at left margin. This was (correctly) not in earlier releases; docutils doesn't make the requirement.
    + Added support for replace:: and unicode:: substitutions.
    + Ignore unknown interpreted roles.
    + Renamed image parser to subst, since it now handles all substitution references.
  * Textile reader:
    + Allow newlines before pipes in table. Closes #654.
    + Fixed bug with list items containing line breaks. Now pandoc correctly handles hard line breaks inside list items. Previously they broke list parsing.
    + Implemented comment blocks.
    + Fixed bug affecting words ending in hyphen.
    + Properly handle links with surrounding brackets. Square brackets need to be used when the link isn't surrounded by spaces or punctuation, or when the URL ending may be ambiguous. Closes #564.
    + Removed nullBlock. Better to know about parsing problems than to skip stuff when we get stuck.
    + Allow ID attributes on headers.
    + Textile reader: Avoid parsing dashes as strikeout. Previously the input text-- text-- text-- text-- would be parsed with strikeouts rather than dashes. This fixes the problem by requiring that a strikeout delimiting - not be followed by a -. Closes #631.
    + Expanded list of stringBreakers. This fixes a bug on input like "(_hello_)" which should be a parenthesized emphasized "hello". The new list is taken from the PHP source of textile 2.4.
    + Fixed autolinks.
      Previously the textile reader and writer incorrectly implemented RST-style autolinks for URLs and email addresses. This has been fixed. Now an autolink is done this way: "":http://myurl.com.
    + Fixed footnotes bug in textile. This affected notes occurring before punctuation, e.g. foo[1].. Closes #518.
  * LaTeX reader:
    + Better handling of citation commands.
    + Better handling of \noindent.
    + Added a 'try' in rawLaTeXBlock, so we can handle \begin without {. Closes #622.
    + Made rawLaTeXInline try to parse block commands as well. This is usually what we want, given how rawLaTeXInline is used in the markdown and textile readers. If a block-level LaTeX command is used in the middle of a paragraph (e.g. \subtitle inside a title), we can treat it as raw inline LaTeX.
    + Handle \slash command. Closes #605.
    + Basic \enquote support.
    + Fixed parsing of paragraphs beginning with a group. Closes #606.
    + Use curly quotes for bare straight quotes.
    + Support obeylines environment. Closes #604.
    + Guard against "begin", "end" in inlineCommand and blockCommand.
    + Better error messages for environments. Now it should tell you that it was looking for \end{env}, instead of giving "unknown parse error."
  * HTML reader:
    + Added HTML 5 tags to list of block-level tags.
    + HTML reader: Fixed bug in htmlBalanced, which caused hangs in parsing certain markdown input using strict mode.
    + Parse  as Quoted DoubleQuote.
    + Handle nested  tags properly.
    + Modified htmlTag for fewer false positives. A tag must start with < followed by !, ?, /, or a letter. This makes it more useful in the wikimedia and markdown parsers.
  * DocBook reader: Support title in "figure" element. Closes #650.
  * MediaWiki writer:
    + Remove newline after   in translation of LineBreak. There's no particular need for a newline (other than making the generated MediaWiki source look nice to a human), and in fact sometimes it is incorrect: in particular, inside an enumeration, list items cannot have embedded newline characters.
      (Brent Yorgey)
    + Use  not  for Code.
  * Man writer: Escape - as \-. Unescaped -'s become hyphens, while \-'s are left as ascii minus signs. That is preferable for use with command-line options. See . Thanks to Andrea Bolognani for bringing the issue to our attention.
  * RST writer:
    + Improved line block output. Use nonbreaking spaces for initial indent (otherwise lost in HTML and LaTeX). Allow multiple paragraphs in a single line block. Allow soft breaks with continuations in line blocks.
    + Properly handle images with no alt text. Closes #678.
    + Fixed bug with links with duplicate text. We now (a) use anonymous links for links with inline URLs, and (b) use an inline link instead of a reference link if the reference link would require a label that has already been used for a different link. Closes #511.
    + Fixed hyperlinked images. Closes #611. Use :target: field when you have a simple linked image.
    + Don't add :align: center to figures.
  * Texinfo writer: Fixed internal cross-references. Now we insert anchors after each header, and use @ref instead of @uref for links. Commas are now escaped as @comma{} only when needed; previously all commas were escaped. (This change is needed, in part, because @ref commands must be followed by a real comma or period.) Also insert a blank line in front of @verbatim environments.
  * DocBook writer:
    + Made --id-prefix work in DocBook as well as HTML. Closes #607.
    + Don't include empty captions in figures. Closes #581.
  * LaTeX writer:
    + Use \hspace* for nonbreaking space after line break, since ~ spaces after a line break are just ignored. Closes #687.
    + Don't escape _ in URLs or hyperref identifiers.
    + Properly escape strings inside \url{}. Closes #576.
    + Use [fragile] only for slides containing code rendered using listings. Closes #649.
    + Escape | as \vert in LaTeX math. This avoids a clash with highlighting-kate's macros, which redefine | as a short verbatim delimiter. Thanks to Björn Peemöller for raising this issue.
    + Use minipage rather than parbox for block containers in tables. This allows verbatim code to be included in grid tables. Closes #663.
    + Prevent paragraphs containing only linebreaks or spaces.
  * HTML writer:
    + Included highlighting-css for code spans, too. Previously it was only included if used in a code block. Closes #653.
    + Improved line breaks with   tags. We now put a newline between   and   when there are multiple definitions.
    + Changed mathjax cdn url so it doesn't use https. (This caused problems when used with --self-contained.) See #609.
  * EPUB writer:
    + --number-sections now works properly.
    + Don't strip meta and link elements in epub metadata. Patch from aberrancy. Closes #589.
    + Fixed a couple validation bugs.
    + Use ch001, ch002, etc. for chapter filenames. This improves sorting of chapters in some readers, which apparently sort ch2 after ch10. Closes #610.
  * ODT writer: properly set title property (Arlo O'Keeffe).
  * Docx writer:
    + Fixed bug with nested lists. Previously a list like

          1. one
             - a
             - b
          2. two

      would come out with a bullet instead of "2." Thanks to Russell Allen for reporting the bug.
    + Use w:cr in w:r instead of w:br for linebreaks. This seems to fix a problem viewing pandoc-generated docx files in LibreOffice.
    + Use integer ids for bookmarks. Closes #626.
    + Added nsid to abstractNum elements. This helps when merging word documents with numbered or bulleted lists. Closes #627.
    + Use separate footnotes.xml for notes. This seems to help LibreOffice convert the file, even though it was valid docx before. Closes #637.
    + Use rIdNN identifiers for r:embed in images.
    + Avoid reading image files again when we've already processed them.
    + Fixed typo in reference.docx that prevented image captions from working. Thanks to Huashan Chen.
  * Text.Pandoc.Parsing:
    + Fixed bug in withRaw, which didn't correctly handle the case where nothing is parsed.
    + Made emailAddress parser more correct.
      Now it is based on RFC 822, though it still doesn't implement quoted strings in email addresses.
    + Revised URI parser. It now allows many more schemes, allows uppercase URIs, and better handles trailing punctuation and trailing slashes in bare URIs. Added many tests.
    + Simplified and improved singleQuoteStart. This makes 's', 'l', etc. parse properly. Formerly we had some English-centric heuristics, but they are no longer needed. Closes #698.
  * Text.Pandoc.Pretty: Added wide punctuation range to charWidth. This fixes bug with Chinese commas in markdown and reST tables, and a bug that caused combining characters to be dropped.
  * Text.Pandoc.MIME: Added MIME types for .wof and .eot. Closes #640.
  * Text.Pandoc.Biblio:
    + Run mvPunc and deNote on metadata too. This fixed a bug with notes on titles using footnote styles.
    + Fixed bug in fetching CSL files from CSL data directory.
  * pandoc.hs: Give correct value to writerSourceDirectory when a URL is provided. It should be the URL up to the path.
  * Fixed/simplified diff output for tests.
  * Biblio: Make sure mvPunc and deNote run on metadata too. This fixed a bug with notes on titles using footnote styles.

  [under the hood improvements]

  * We no longer depend on utf8-string. Instead we use functions defined in Text.Pandoc.UTF8 that use Data.Text's conversions.
  * Use safeRead instead of using reads directly (various modules).
  * "Implicit figures" (images alone in a paragraph) are now handled differently. The markdown reader gives their titles the prefix fig:; the writers look for this before treating the image as a figure. Though this is a bit of a hack, it has two advantages: (i) implicit figures can be limited to the markdown reader, and (ii) they can be deactivated by turning off the implicit_figures extension.
  * catch from Control.Exception is now used instead of the old Prelude catch.
  * Text.Pandoc.Shared: Improved algorithm for normalizeSpaces and oneOfStrings (which is now non-backtracking).
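To make the idea of a non-backtracking oneOfStrings concrete, here is a minimal sketch written as a pure function over String rather than as a Parsec parser. It is an illustration under stated assumptions, not pandoc's implementation: the function oneOfStringsWith, its Maybe result, and the pure-function setting are all hypothetical. The technique shown is to read the input once, narrowing the set of candidate strings character by character, instead of trying each candidate in turn and rewinding after every failure.

```haskell
import Data.Char (toLower)

-- Match the longest candidate that is a prefix of the input, using the
-- given character comparison. Returns the consumed text and the rest.
-- The input is scanned once; the candidate set shrinks as we go.
oneOfStringsWith :: (Char -> Char -> Bool)
                 -> [String] -> String -> Maybe (String, String)
oneOfStringsWith matches = go []
  where
    go acc cands input =
      let -- A candidate that has been fully consumed means we have a
          -- complete match ending at the current position.
          matchedHere
            | any null cands = Just (reverse acc, input)
            | otherwise      = Nothing
      in case input of
           (c:rest)
             -- Keep only candidates whose next character matches c.
             | cands'@(_:_) <- [xs | (x:xs) <- cands, x `matches` c] ->
                 case go (c:acc) cands' rest of
                   Nothing -> matchedHere  -- no longer match exists
                   longer  -> longer       -- prefer the longest match
           _ -> matchedHere

-- Case-sensitive variant.
oneOfStrings :: [String] -> String -> Maybe (String, String)
oneOfStrings = oneOfStringsWith (==)

-- Case-insensitive variant, in the spirit of oneOfStringsCI.
oneOfStringsCI :: [String] -> String -> Maybe (String, String)
oneOfStringsCI = oneOfStringsWith (\a b -> toLower a == toLower b)
```

For example, oneOfStrings ["http:", "https:", "ftp:"] "https://x" yields Just ("https:", "//x"), and oneOfStringsCI ["http:"] "HTTP:x" yields Just ("HTTP:", "x"). The cost is proportional to the length of the matched prefix times the number of surviving candidates, which is why this shape helps when the candidate list is a long list of URI schemes.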
  * Text.Pandoc.Biblio: Remove workaround for toCapital. Now citeproc-hs is fixed upstream, so this is no longer needed. Closes #531.
  * Textile reader: Improved speed of hyphenedWords. This speeds up the textile reader by about a factor of 4.
  * Use Text.Pandoc.Builder in RST reader, for more flexibility, better performance, and automatic normalization.
  * Major rewrite of markdown reader:
    + Use Text.Pandoc.Builder instead of lists. This also means that everything is normalized automatically.
    + Move to a one-pass parsing strategy, returning values in the reader monad, which are then run (at the end of parsing) against the final parser state.
  * In HTML writer, we now use toHtml instead of pre-escaping. We work around the problem that blaze-html unnecessarily escapes ' by pre-escaping just the ' characters, instead of the whole string. If blaze-html later stops escaping ' characters, we can simplify strToHtml to toHtml. Closes #629.
  * Moved code for embedding images in RTFs from pandoc.hs to the RTF writer (which now exports writeRTFWithEmbeddedImages).
  * Moved citation processing from pandoc.hs into the readers. This makes things more convenient for library users.
  * The man pages are now built by an executable make-pandoc-man-pages, which has its own stanza in the cabal file so that dependencies can be handled by Cabal. Special treatment in Setup.hs ensures that this executable never gets installed; it is only used to create the man pages.
  * The cabal file has been modified so that the pandoc library is used in building the pandoc executable. (This required moving pandoc.hs from src to the top-level directory.) This cuts compile time in half.
  * -O2 is no longer used in building pandoc. The performance improvement it yields is so slight that it is not worth it. (Measured with benchmarks on ghc 7.4.)
  * The executable and library flags have been removed.
  * -threaded has been removed from ghc-options.
  * Version bounds of dependencies have been raised, and the blaze_html_0_5 flag now defaults to True. Pandoc now compiles on GHC 7.6.
  * We now require base >= 4.2.
  * Integrated the benchmark program into cabal. One can now do:

        cabal configure --enable-benchmarks && cabal build
        cabal bench --benchmark-option='markdown' --benchmark-option='-s 20'

    The benchmark now uses README + testsuite, so benchmark results from older versions aren't comparable.
  * Integrated test suite with cabal. To run tests, configure with --enable-tests, then cabal test. You can specify particular tests using --test-options='-t markdown'. No output is shown unless tests fail. The Haskell test modules have been moved from src/ to tests/.
  * Moved all data files and templates to the data/ subdirectory.
  * Added an embed_data_files cabal flag. This causes all data files to be embedded in the binary, so that the binary is self-sufficient and can be relocated anywhere, copied on a USB key, etc. The Windows installer now uses this. (Since we no longer have the option to build the executable without the library, this is the only way to get a relocatable binary on Windows.)
  * Removed pcre3.dll from windows package. It isn't needed unless highlighting-kate is compiled with the pcre-light flag. By default, regex-pcre-builtin is used.

pandoc (1.9.4.2)

  * Don't encode/decode file paths if base >= 4.4. Prior to base 4.4, filepaths and command line arguments were treated as unencoded lists of bytes, not unicode strings, so we had to work around that by encoding and decoding them. This commit adds CPP checks for the base version that intelligently enable encoding/decoding when needed. Fixes a bug with multilingual filenames when pandoc was compiled with ghc 7.4 (#540).
  * Don't generate an empty H1 after hrule slide breaks. We now use a slide-level header with contents [Str "\0"] to mark an hrule break. This avoids creation of an empty H1 in these contexts. Closes #484.
* Docbook reader: Added support for "bold" emphasis. Thanks to mb21.
* In make_osx_package.sh, ensure citeproc-hs is built with the embed_data_files flag.
* MediaWiki writer: Avoid extra blank lines after sublists (Gavin Beatty).
* ConTeXt writer: Don't escape &, ^, <, >, _, simplified escapes for } and { to \{ and \} (Aditya Mahajan).
* Fixed handling of absolute URLs in CSS imports with --self-contained. Closes #535.
* Added webm to mime types. Closes #543.
* Added some missing exports and tests to the cabal file (Alexander V Vershilov).
* Compile with -rtsopts and -threaded by default.

pandoc (1.9.4.1)

* Markdown reader: Added cf. and cp. to list of likely abbreviations.
* LaTeX template: Added linkcolor, urlcolor and links-as-notes variables. Make TOC links black.
* LaTeX template improvements.
  + Don't print date unless one is given explicitly in the document.
  + Simplified templates.
  + Use fontenc [T1] by default, and lmodern.
  + Use microtype if available.
* Biblio:
  + Add comma to beginning of bare suffix, e.g. @item1 [50]. Motivation: @item1 [50] should be as close as possible to [@item1, 50].
  + Added workaround for a bug in citeproc-hs 0.3.4 that causes footnotes beginning with a citation to be empty. Closes #531.
* Fixed documentation on mixed lists. Closes #533.

pandoc (1.9.4)

* Simplified Text.Pandoc.Biblio and fixed bugs with citations inside footnotes and captions. We now handle note citations by inserting footnotes during initial citation processing, and doing a separate pass later to remove notes inside notes.
* Added 'zenburn' highlight style from highlighting-kate.
* Added Slideous writer. Slideous is an HTML + javascript slide show format, similar to Slidy, but works with IE 7. (Jonas Smedegaard)
* LaTeX writer:
  + Ensure we don't have extra blank lines at ends of cells. This can cause LaTeX errors, as they are interpreted as new paragraphs.
  + More consistent interblock spacing.
  + Require highlighting-kate >= 0.5.1, for proper highlighted inline code in LaTeX. Closes #527.
  + Ensure that a Verbatim at the end of a footnote is followed by a newline. (Fixes a regression in the previous version.)
  + In default template, use black for internal links and TOC. Added commented-out code to use footnotes for links, as would be suitable in print output.
* Beamer writer: When --incremental is used, lists inside a block quote should appear all at once. (This makes Beamer output consistent with the HTML slide show formats.)
* ConTeXt writer:
  + Escape % as \letterpercent{} not \letterpercent , to avoid gobbling spaces after the % sign.
  + Ensure space after \stopformula.
* Markdown writer:
  + Use : form instead of ~ in definition lists, for better compatibility with other markdown implementations.
  + Don't wrap the term, because it breaks definition lists.
  + Use a nonzero space to prevent false recognition of list marker in ordered lists. Closes #516.
* Org writer: Add space before language name. Closes #523.
* Docx writer: Simplified bullet characters so they work properly with Word 2007. Closes #520.
* LaTeX reader: Support \centerline.
* RST reader: handle figures. Closes #522.
* Textile reader: fix for  and ==. Closes #517. (Paul Rivier)

pandoc (1.9.3)

* Fixed bug in fromEntities. The previous version would turn hi & low you know; into hi &.
* HTML reader:
  + Don't skip nonbreaking spaces. Previously a paragraph containing just a nonbreaking space would be rendered as an empty paragraph. Thanks to Paul Vorbach for pointing out the bug.
  + Support   and  in tables. Closes #486.
* Markdown reader:
  + Don't recognize references inside delimited code blocks.
  + Allow list items to begin with lists.
* Added basic docbook reader (John MacFarlane and Mauro Bieg).
* LaTeX reader:
  + Handle \bgroup, \egroup, \begingroup, \endgroup.
  + Control sequences can't be followed by a letter. This fixes a bug where \begingroup was parsed as \begin followed by group.
  + Parse 'dimension' arguments to unknown commands, e.g. \parindent0pt.
  + Make \label and \ref sensitive to --parse-raw. If --parse-raw is selected, these will be parsed as raw latex inlines, rather than bracketed text.
  + Don't crash on unknown block commands (like \vspace{10pt}) inside \author; just skip them. Closes #505.
* Textile reader:
  + Implemented literal escapes with == and . Closes #473.
  + Added support for LaTeX blocks and inlines (Paul Rivier).
  + Better conformance to RedCloth inline parsing (Paul Rivier).
  + Parse '+text+' as emphasized (should be underlined, but this is better than leaving literal plus characters in the output).
* Docx writer: Fixed multi-paragraph list items. Previously they each got a list marker. Closes #457.
* LaTeX writer:
  + Added --no-tex-ligatures option to avoid replacing quotation marks and dashes with TeX ligatures.
  + Use fixltx2e package to provide '\textsubscript'.
  + Improve spacing around LaTeX block environments: quote, verbatim, itemize, description, enumerate. Closes #502.
  + Use blue instead of pink for URL links in latex/pdf output.
* ConTeXt writer: Fixed escaping of %. In text, % needs to be escaped as \letterpercent, not \%. Inside URLs, % needs to be escaped as \%. Thanks to jmarca and adityam for the fix. Closes #492.
* Texinfo writer: Escape special characters in node titles. This fixes a problem pointed out by Joost Kremers. Pandoc used to escape an '@' in a chapter title, but not in the corresponding node title, leading to invalid texinfo.
* Fixed document encoding in texinfo template. Resolves Debian Bug #667816.
* Markdown writer:
  + Don't force delimited code blocks to be flush left. Fixes bug with delimited code blocks inside lists etc.
  + Escape < and .
* LaTeX writer: Use \hyperref[ident]{text} for internal links. Previously we used \href{\#ident}{text}, which didn't work on all systems. Thanks to Dirk Laurie.
* RST writer: Don't wrap link references. Closes #487.
* Updated to use latest versions of blaze-html, mtl.

pandoc (1.9.2)

* LaTeX reader:
  + Made lstlisting work as a proper verbatim environment.
  + Fixed bug parsing LaTeX tables with one column.
* LaTeX writer:
  + Use {} around ctable caption, so that formatting can be used.
  + Don't require eurosym package unless document has a €.
* LaTeX template: Added variables for geometry, romanfont, sansfont, mathfont, mainfont so users can more easily customize fonts.
* PDF writer:
  + Run latex engine at least two times, to ensure that PDFs will have hyperlinked bookmarks.
  + Added PDF metadata (title, author) in LaTeX standalone and PDF output.
* Texinfo writer: retain directories in image paths. (Peter Wang)
* RST writer: Better handling of inline formatting, in accord with docutils' "inline markup recognition rules" (though we don't implement the unicode rules fully). Now hi*there*hi gets rendered properly as hi\ *there*\ hi, and unnecessary \  are avoided around :math:, :sub:, :sup:.
* RST reader:
  + Parse \  as null, not escaped space.
  + Allow  :math:...  even when not followed by blank or \. This does not implement the complex rule docutils follows, but it should be good enough for most purposes.
  + Add support for the rST default-role directive. (Greg Maslov)
* Text.Pandoc.Parsing: Added stateRstDefaultRole field to ParserState. (Greg Maslov)
* Markdown reader: Properly handle citations nested in other inline elements.
* Markdown writer: don't replace empty alt in image with "image".
* DZSlides: Updated template.html and styles in default template. Removed bizarre CSS for q in dzslides template.
* Avoid repeated id attribute in section and header in HTML slides.
* README improvements: new instructions on internal links, removed misleading note on reST math.
* Build system:
  + Fixed Windows installer so that dzslides works.
  + Removed stripansi.sh.
  + Added .travis.yml for Travis continuous integration support.
  + Fixed upper bound for zlib (Sergei Trofimovich).
  + Fixed upper bound for test-framework.
  + Updated haddocks for haddock-2.10 (Sergei Trofimovich).

pandoc (1.9.1.2)

* Added beamer+lhs as output format.
* Don't escape < in