Syntax
This page is a memo about the various elements of the kramdown-rfc syntax. It is not ordered by significance of the element, but by its technical aspects.
Some additional syntax was added to the standard markdown syntax as already exhibited by kramdown.
Reference: lib/kramdown-rfc2629.rb, i.e., class Kramdown::Parser::RFC2629Kramdown
Files can be included with the syntax {::include fn} (this needs to
be flush left in column 1). A typical example from a recent RFC,
where the contents of a figure were machine-generated:
~~~~~~~~~~
{::include ../ghc/packets-new/p4.out}
~~~~~~~~~~
{: #example2 title="A longer RPL example"}
(Note that this can be suppressed for use in servers by setting the
environment variable KRAMDOWN_SAFE; it may not be as useful for
online tools as it is for Makefile-driven draft generation.)
If you would like to include a file inside an included file, the parent file must use include-nested instead of include.
In the parent file:
{::include-nested ./section/child.md}
In markdown, an internal reference with an empty link text should have the form [](#RFC7252).
Interestingly, this syntax (empty link text) has
historically not been supported by kramdown,
but that has been changed with the upstream kramdown release 1.10.
In any case, tool support may be limited for empty link text (e.g.,
the link might simply be invisible).
So the following syntax for an internal reference with empty link text
was invented (note that the # is implied unless ref is a full URI [1.0.29]):
- {{ref}} for an internal reference, regardless of biblio, section, figure, table.
- {{?ref}} for a biblio reference that automatically generates the entry as informative
- {{!ref}} for a biblio reference that automatically generates the entry as normative
Where ref is not valid syntax for an XML ID, it is prefixed by an underscore.
To support referenced drafts evolving into RFCs without having to
touch the entire document, they can be given a kramdown-internal
symbolic name in the YAML, which is then referenced as {{-name}}.
(1.0.29 adds support for adding alias names to other biblio references.)
An internal reference is normally rendered by xml2rfc with the word Section/Appendix/Figure/Table and the section/item number (format=default). There are shortcuts for setting the format to other values:
- {{<target}} for <xref target="target" format="counter"/>, i.e., section number only. This is useful in combinations such as: ...discussed in Sections {{<intro}} and {{<extro}}...
- {{<<target}} for <xref target="target" format="title"/>, i.e., the title of the referenced section. This can be used if adding the section title to the reference is helpful, e.g., "... discussed in {{extro}}, {{<<extro}}.", or of course on its own.
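The mapping from shortcut to xref format can be sketched in a few lines of Ruby. This is only an illustration of the rules listed above, not the parser code in kramdown-rfc: the helper name expand_xref_shortcuts and the FORMATS table are made up for this example, and the sketch deliberately ignores the other brace forms ({{?ref}}, {{!ref}}, {{-name}}, and the triple-brace contact syntax).

```ruby
# Hypothetical helper, for illustration only (not kramdown-rfc's parser).
# Maps the shortcut prefix to the xref "format" attribute described above.
FORMATS = { "" => nil, "<" => "counter", "<<" => "title" }.freeze

def expand_xref_shortcuts(text)
  # Matches {{target}}, {{<target}} and {{<<target}}; the target is
  # restricted here to a simple ASCII name, which is an assumption.
  text.gsub(/\{\{(<<|<)?([A-Za-z_][\w.-]*)\}\}/) do
    prefix = $1.to_s
    target = $2
    format = FORMATS[prefix]
    if format
      %(<xref target="#{target}" format="#{format}"/>)
    else
      %(<xref target="#{target}"/>) # format=default is left implicit
    end
  end
end

puts expand_xref_shortcuts("discussed in Sections {{<intro}} and {{<extro}}")
puts expand_xref_shortcuts("discussed in {{extro}}, {{<<extro}}.")
```

The first call yields two counter-format references, the second a default reference followed by a title-format one.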
Markdown does not have syntax for index entries. kramdown-rfc emulates some asciidoc/pandoc syntax here.
- Leading ! ➔ primary flag is set
- item/optional subitem comma-separated, either one needs to be in quotes if it contains a comma
RFCXMLv3 has some special treatment for contact names, via the <contact/> element.
kramdown-rfc provides some basic syntax for <contact/> in text:
{{{Jürgen Möllemann}}}
and
{{{トヨタ自動車株式会社}{Toyota Jidosha}}}
create contact elements with fullname and optionally asciiFullname attributes:
<contact fullname="Jürgen Möllemann"/>
<contact fullname="トヨタ自動車株式会社" asciiFullname="Toyota Jidosha"/>
(The transliteration of the second example is somewhat wrong, but we don’t have latinFullname, just asciiFullname; this will need to be fixed in the course of the upcoming beyond-ASCII renovation of RFCXMLv3.)
Note that with the current XML2RFC implementation of RFCXMLv3 there is no need to input asciiFullnames for fullnames that are using Latin characters only.
Also, the RFCXMLv3 grammar used by XML2RFC has mysterious limitations
in the context where a <contact/> can be used. (This can be fixed
using a hack, but that doesn't help with the installations in the
I-D submit system and at the RFC editor.)
Within the limitations created by the above grammar issues, this syntax can also be abused to work around limitations in using beyond-ASCII characters in running text:
set {{{α}{}}}<sub>aimd</sub> to 1 once W<sub>est</sub> reaches W_max (#2)
{{{Voilà}}}.
This makes use of the undocumented XML2RFC feature that beyond-Latin fullnames can be used with an empty asciiFullname, leading to the weird bracefest around the α (which is Greek, not Latin).
(And please note the discussion in https://mailarchive.ietf.org/arch/msg/rfc-markdown/e9nnRhTVna5zM2wyfa-l-smYUMw for why I ultimately did go ahead with syntax for mentioning human names that remotely reminds one of triple parentheses.)
It is not always straightforward how to elicit RFCXML behavior where markdown has been shaped by its HTML background.
Footnotes turn into crefs (editing comments). Note that [IALs][inline-attribute-lists] are needed to add a source attribute, making this a bit unwieldy unless complemented with an ALD:
Another questionable paragraph.[^1]{: source="observer"}
[^1]: so why not delete it
<!-- taking the noise out into an ALD: -->
{:cabo: source="cabo"}
(This section to be removed by the RFC editor.)[^2]{:cabo}
[^2]: are we sure about this?
The IALs can also be attached to the markdown footnote instead of to the footnote reference:
Another questionable paragraph.[^1]
[^1]: so why not delete it
{: source="observer"}
<!-- taking the noise out into an ALD: -->
{:cabo: source="cabo"}
(This section to be removed by the RFC editor.)[^2]
[^2]: are we sure about this?
{:cabo}
Please take note of the syntax of kramdown footnotes (emphasis and redactions mine):
- The footnote name in square brackets, optionally indented up to three spaces,
- then a colon and one or more optional spaces,
- then the text of the footnote
- and optionally more text on the following lines which have to follow the syntax for standard code blocks (the leading four spaces are naturally stripped from the text)
- (my addition): and optionally an IAL on the next line.
Since 1.0.30, kramdown-rfc converts display math into a crude ASCII art form (using the tex2mail tool, if available). This is probably useful only for a minority of applications. There is also no way to do embedded math (i.e., within a paragraph).
The preamble of normative terms like MAY and MUST can be included with the directive:
{::boilerplate bcp14-tagged}
This adds the following text from the BCP to the output:
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
This version also wraps all occurrences of the normative terms in semantic tags understood by the xml2rfc processor. To insert the boilerplate text but skip the additional tagging, use the following directive:
{::boilerplate bcp14}
A kramdown mapping from an abbreviation to a full phrase is instead used to automatically create index entries.
*[IANA]:
*[MUST]: BCP14
*[MAY]: BCP14
*[MUST NOT]: BCP14
*[SHALL]: BCP14
*[CBOR]: (((Object Representation, Concise Binary))) (((CBOR)))
The word in square brackets (which must match exactly, case-sensitively) is entered into the index automatically for each place where it occurs. If no title is given, just the word is entered (first example). If one is given, that becomes the main item (the auto-indexed word becomes the subitem, second example and following). If full control is desired (e.g., for multiple entries per occurrence), just write down the full index entries instead (last example).
(IAL = Inline Attribute Lists, a kramdown extension borrowed from Maruku. Note that the kramdown support for ALD = Attribute List Definitions comes free of charge with markdown and has been used to factor out some ugly, repetitive IALs.)
Generally, IAL attributes are copied to the XML elements created. There is some special handling for
- id, which is usually translated to anchor, so the #id syntax can be used in IALs
- href, which is usually translated to target; in a link, it is translated to xref for internal links and eref for external ones; &foo; is converted to an entity reference
There are also some element-specific hacks, such as:
- translating artwork-align etc. on a figure to an align attribute etc. for the artwork in the figure (same for sourcecode-markers and sourcecode-name).
- translating class for code blocks into type (after removing the language- inserted by kramdown)
- an attribute cols is translated into ttcol attributes for a table (so the writer does not have to memorize the arcane markdown table attribute syntax, which is also supported)
- an attribute gi can be used to set the element type for elements where kramdown-rfc may be guessing wrong (e.g., artwork vs. sourcecode).
- element content is turned into title attributes where needed (with some markup loss)
- codespans are converted to style="verb", em/strong to emph/strong
- vspace hacks [TODO]
The canonical way to include YANG as sourcecode is:
~~~~ yang
{::include yang/ietf-voucher-latest.yang}
~~~~
{: sourcecode-markers="true" sourcecode-name="ietf-voucher@2021-07-02.yang"}
The following ALDs are predefined:
ALD name | ALD value | Usage |
---|---|---|
unnumbered | numbered="false" | section: do not number (e.g., acknowledgements) |
removeinrfc | removeinrfc="true" | section: remove in RFC |
notoc | toc="exclude" | section: do not include in table of contents |
quote | gi="blockquote" | block quote: turn into <blockquote> |
aside | gi="aside" | block quote: turn into <aside> |
markers | sourcecode-markers="true" | code block: use <CODE BEGINS/ENDS> |
vspace | vspace="0" | definition list (break after term) |
compact | spacing="compact" | list (definition/numbered/unnumbered) |
noabbrev | noabbrev="true" | suppress abbrev processing for a paragraph or a span |
For instance, a proper blockquote can be notated:
{:quote}
> The future is already here, it is just not evenly distributed.
Similar for an aside:
{:aside}
> The correct unit for bitrate is bit/s, not bps.
Mostly documented by examples in README.md.
The structure of the YAML header is mostly derived from the structure of the front matter in RFCXML. The YAML header is a map (hash) with keys as defined below.
Some of these keys create attributes in the <rfc> element, with the
following abbreviations also available:
Name | <rfc> Attribute |
---|---|
ipr | ipr |
docname | docName |
cat | category |
number | number |
obsoletes | obsoletes |
updates | updates |
seriesno | seriesNo |
The following hash keys create front matter elements or their attributes:
Name | XML | Type |
---|---|---|
title | <title> | String |
abbrev | <title abbrev="..."> | String |
author | <author...> | Author |
date | <date> | Date |
area | <area> | String |
wg | <workgroup> | String |
kw | <keyword> | String |
An Author is structured like an author in the references list.
A date can be null (for "n.d."), a YAML date (as in 2016-03-06), an
integer for a year (2016),
or a string that will be parsed into YYYY-MM (defaulting to now if
that parse fails); an array of strings is joined and turned into the
actual value of the year attribute.
Explicitly setting the date to false makes the date disappear; this is
not the same as leaving it out (setting it to null, effectively), as
the "n.d." can serve as a reminder to the author that a date is missing.
There are some additions for evoking other XML features:
- entity: a hash named "entity" is turned into additional entity declarations that then can be referenced as {{&entityname}} in the text.

      entity:
        SELF: "[RFCXXXX]"

- pi: a hash named "pi" allows adding processing instructions of the form <?rfc name="value" ?> to the front matter. A value that is not given (i.e., nil) is interpreted as "yes", as is true. false turns into "no".

      pi:
        toc: yes
        tocdepth: 4
        sortrefs: yes
        symrefs: yes
        compact: yes
        comments: yes

  As an abbreviation, if all values are "yes", the hash can be replaced by an array of names, as in (without the tocdepth setting):

      pi: [toc, sortrefs, symrefs, compact, comments]
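The value rules for pi can be illustrated with a small Ruby sketch. It is a reading of the description above under stated assumptions, not the code kramdown-rfc actually runs: the helper names normalize_pi and pi_to_xml are hypothetical, and the YAML is parsed with Ruby's bundled yaml library.

```ruby
require "yaml"

# Hypothetical helpers, for illustration only (not kramdown-rfc's code).
# Array form: every listed name counts as "yes".
# Hash form: nil and true become "yes", false becomes "no",
# anything else is kept as its string form.
def normalize_pi(pi)
  pi = pi.to_h { |name| [name, nil] } if pi.is_a?(Array)
  pi.to_h do |name, value|
    text = case value
           when nil, true then "yes"
           when false then "no"
           else value.to_s
           end
    [name.to_s, text]
  end
end

# Renders the normalized hash as processing instructions.
def pi_to_xml(pi)
  normalize_pi(pi).map { |name, value| %(<?rfc #{name}="#{value}" ?>) }
end

header = YAML.load("pi:\n  toc: yes\n  tocdepth: 4\n  comments: no\n")
puts pi_to_xml(header["pi"])
puts pi_to_xml(%w[toc sortrefs symrefs])
```

Note that the unquoted yes and no in the YAML arrive in Ruby as true and false, which is why the sketch maps Booleans back to "yes" and "no".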
The YAML keys normative and informative are maps (hashes) that
create reference entries. In each, a map key that is an RFC name, an
I-D name, or one of the additional reference styles for external
standards documents from W3C etc. (to be defined what is in bibxml2
etc.) does not need a map value, as in:
normative:
  RFC2119:
  RFC7252:
  I-D.ietf-core-block:
  W3C.REC-xml.xml:
For references not in the xml2rfc.ietf.org libraries, more
information has to be given (example with slightly over the top usage
of seriesinfo):
informative:
  TypedArray:
    -: ta
    target: https://www.khronos.org/registry/typedarray/specs/1.0/
    title: Typed Array Specification
    author:
      -
        ins: V. Vukicevic
        name: Vladimir Vukicevic
        org: Mozilla Corporation
      -
        ins: K. Russell
        name: Kenneth Russell
        org: Google, Inc.
    seriesinfo:
      ISBN: 9780470747995
      DOI: 10.1109/MIC.2012.29
    date: 2011-02-08
    format:
      TXT: http://foo.bar/baz.txt
      PDF: http://foo.bar/baz.pdf
    ann: >
      This is a long annotation.
Here, TypedArray is the reference label as will be visible in the
document, - is an alias name (see above), target, title,
seriesinfo, format, ann (annotation), date and author are the parts of the
reference as defined in RFCXML. More information about authors below.
In RFCXML references, seriesinfo is sometimes used as a catchall for
things that don't fit elsewhere. A slightly over the top example:
seriesinfo:
  "ISO/IEC": JTC1/SC2/WG2 N4246R2
  ISSN: "0001-0782"
  "ACM Press": "Communications of the ACM vol 13 no 7 pp 422-426"
  "in: ECMA-262 6th Edition,": "The ECMAScript 2015 Language Specification"
  "Ph.D.": "Dissertation, University of California, Irvine"
  ISBN: 9780470747995
  DOI: 10.1109/MIC.2012.29
  'CoRE ticket': '#204'
  IEEE: "Transactions on Information Theory, Vol. 23, No. 3, pp. 337-343"
  'HTTPBIS ticket': '#131, closed 2009-12-02'
An author key can be a single hash or an array of hashes, each an
author entry. An author entry has a combined initials/surname (ins)
entry, a full name (name), and other keys as defined in RFCXML:
Name | Element/Attribute |
---|---|
role | role |
org | organization |
abbrev | organization/abbrev |
street | postal/street |
city | postal/city |
region | postal/region |
code | postal/code |
country | postal/country |
phone | phone |
facsimile | facsimile |
uri | uri |
After the YAML header, markdown sections are delimited by lines surrounded by empty lines, of the form:
--- sectionname
The sections are as follows:
Section | Purpose |
---|---|
abstract | Abstract |
middle | Main body, chapters of the document |
back | Back matter, appendices |
fluff | Unused, treated as comments |
In addition, sections normative and informative can be used to add
XML material to the front of the respective reference sections (the
text generated from the YAML header is appended to that text). The
usefulness of this function is limited as kramdown translates GIs and
attribute names into lower case, and RFCXML inexplicably contains some
mixed-case GIs; these therefore cannot be used from the markdown.
While markdown text entry usually provides a smooth experience, there are a few pitfalls that need to be watched:
- Use of <something> without intending to create an XML element --- always escape left angle brackets that could be mistaken for XML with \< constructs. (The symptom may be that the XML-like text is simply swallowed, a hard to understand error message from xml2rfc, or weird formatting.)
- Use of beyond-ASCII characters together with coding: us-ascii --- don't use coding: us-ascii unless you really need it. xml2rfc can usually convert beyond-ASCII characters into ASCII for the text version, and the HTML version may even benefit from the correct beyond-ASCII characters. (For final submission, of course the RFC-editor "non-ASCII" guidelines need to be fulfilled.)
- Use of HT characters (horizontal tabs). In the best case, they will unpredictably mess up the alignment of artwork; in the worst case they will send xml2rfc into a loop. Avoid horizontal tabs and other control characters in the markdown input.
Some of these pitfalls could be addressed by adding some detection in the kramdown-rfc converter, but not all of them.
The YAML format that kramdown-rfc2629 uses for its structured header is trying to be a user-friendly configuration file format. Most text strings can be written without using quote characters around them. However, in the process, YAML makes some assumptions that may lead to surprising results.
- What looks like a number is parsed as a number. For most data items in the YAML header, this is then converted back into a string. However, the number parsing may lead to surprises, such as interpretation as an octal number:

      code: 02700

  ... which becomes 1472 when converted back to a (now decimal) string. Instead, one should write:

      code: "02700"

- YAML detects other formats, such as Booleans, Dates, etc. When turned back into strings, unquoted n|N|no|No|NO|false|False|FALSE|off|Off|OFF become false, and unquoted y|Y|yes|Yes|YES|true|True|TRUE|on|On|ON become true. If in doubt, do quote.

- YAML syntax such as an embedded colon is interpreted, so the following leads to a syntax error:

      title: XDR: External Data Representation Standard

  Quoting is one way to handle this, but the preferred way to enter free-form text in a YAML field is the > syntax:

      title: >
        XDR: External Data Representation Standard

  Note that the indented lines become the text; it can be continued over multiple lines up to the next outdented line:

      title: >
        Information technology — Procedures for the operation of
        object identifier registration authorities: General
        procedures and top arcs of the international object identifier tree
      author:
        org: International Telecommunications Union

  Internal newlines and following indentation are converted to a single space when this format is parsed, so the preceding example is exactly equivalent to the following:

      title: >
        Information technology — Procedures for the operation of object identifier registration authorities: General procedures and top arcs of the international object identifier tree
      author:
        org: International Telecommunications Union
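These YAML surprises are easy to reproduce outside of kramdown-rfc. The following is a minimal sketch using Ruby's bundled yaml library (Psych); that this matches what a given kramdown-rfc installation sees is an assumption, and the variable names are chosen for this example only.

```ruby
require "yaml"

# A leading zero turns an unquoted digit string into an octal integer.
octal  = YAML.load("code: 02700")["code"]     # 1472
quoted = YAML.load('code: "02700"')["code"]   # "02700"

# Unquoted yes/no style words become Booleans; quoting keeps the text.
flag      = YAML.load("sortrefs: yes")["sortrefs"]
flag_text = YAML.load('sortrefs: "yes"')["sortrefs"]

# An unquoted value with an embedded ": " is rejected by the parser.
colon_error =
  begin
    YAML.load("title: XDR: External Data Representation Standard")
    nil
  rescue Psych::SyntaxError => e
    e.class
  end

# A folded scalar (>) accepts the colon and joins continuation
# lines with a single space.
folded = YAML.load(<<~YAML_TEXT)["title"]
  title: >
    XDR: External Data
    Representation Standard
YAML_TEXT

p octal, quoted, flag, flag_text, colon_error, folded.strip
```

The folded value carries a trailing newline after parsing, which is why the sketch strips it before printing.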