This is a Saxon extension function that wraps the org.commonmark
Markdown parser.
In your stylesheet, declare the extension namespace:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://nwalsh.com/xslt"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="ext xs"
version="3.0">
Then you can call the extension function:
<xsl:sequence select="ext:commonmark('This is *bold*', $options)"/>
You can use function-available()
to code more defensively. The
options map is a map from QName keys to values. Four keys are
recognized:
- escape-html
- boolean, enables the
escapeHtml()
option on theHtmlRenderer
- sanitize-urls
- boolean, enables the
sanitizeUrls()
option on theHtmlRenderer
- ext:parser
- “xml”, “html”, or “none”; how to parse the result
- ext:extensions
- list-of-strings, adds extensions to the Markdown parser
The default value for escape-html
and sanitize-urls
is “false”. (Neither
option seems to have any effect in the 0.21.0 version of the CommonMark parser.)
The default value for ext:parser
is “xml”.
The default value for ext:extensions
is the empty sequence.
You can omit the second argument entirely if you’re happy with those defaults.
The output from the Markdown parser is a string containing the derived HTML markup. You can further parse that as XML or HTML, to return an XDM node for further processing, or you can omit parsing and return the string.
If the parser is “xml”, the rendered HTML is parsed as XML (by fn:parse-xml
).
If the first attempt to parse fails, a second attempt is made with a
single <div>
wrapped around the html. (In other words, if the CommonMark parse
produces more than one top-level element, then they’re all wrapped in a “div” to
make a well-formed XML document.)
The resulting HTML will be in the XHTML namespace.
If the Markdown contains markup that’s intrinsically not well-formed, then it cannot be parsed as XML.
If the parser is “html”, the rendered HTML is parsed as HTML using the
nu.validator
htmlparser
. This always produces well-formed results, by
altering the markup if necessary. The top-level element returned is always
<html>
in the XHTML namespace.
If the parser is “none”, the rendered HTML is returned as a string.
The org.commonmark
parser supports a number of extensions. You can use them
by listing the full class name in the extensions property. For example:
<xsl:sequence
select="ext:commonmark($markdown,
map{
xs:QName('ext:extensions'):
('org.commonmark.ext.gfm.tables.TablesExtension',
'org.commonmark.ext.gfm.strikethrough.StrikethroughExtension')})"/>
You must also make sure that the corresponding class is on your classpath.
Initialization of each extension relies on the class providing a static
method create()
that returns an instance of Extension
. That’s the
way all of the standard extensions from org.commonmark
work.
If you are trying to use an extension that has to be instantiated in some other way, it won’t work.