Martin Thomson edited this page Oct 9, 2023 · 38 revisions

Elements of kramdown-rfc syntax

This page is a memo about the various elements of the kramdown-rfc syntax. It is not ordered by significance of the element, but by its technical aspects.

Additions to the kramdown syntax

kramdown-rfc adds some syntax on top of the extended markdown syntax that kramdown already provides. Reference: lib/kramdown-rfc2629.rb, i.e., class Kramdown::Parser::RFC2629Kramdown

File inclusion

Files can be included with the syntax {::include fn} (this needs to be flush left in column 1). A typical example from a recent RFC, where the contents of a figure were machine-generated:

~~~~~~~~~~
{::include ../ghc/packets-new/p4.out}
~~~~~~~~~~
{: #example2 title="A longer RPL example"}

(Note that this can be suppressed for use in servers by setting the environment variable KRAMDOWN_SAFE; it may not be as useful for online tools as it is for Makefile-driven draft generation.)

If you would like to include a file inside an included file, the parent file must use include-nested instead of include.

In the parent file:

{::include-nested ./section/child.md}

{{}} syntax (links)

In markdown, an internal reference with an empty link text should have the form [](#RFC7252). Interestingly, this syntax (empty link text) has historically not been supported by kramdown, but that has been changed with the upstream kramdown release 1.10. In any case, tool support may be limited for empty link text (e.g., the link might simply be invisible).

So the following syntax for an internal reference with empty link text was invented (note that the # is implied unless ref is a full URI [1.0.29]):

  • {{ref}} for an internal reference, regardless of biblio, section, figure, table.
  • {{?ref}} for a biblio reference that automatically generates the entry as informative
  • {{!ref}} for a biblio reference that automatically generates the entry as normative

Where ref is not valid syntax for an XML ID, it is prefixed by an underscore.

To support referenced drafts evolving into RFCs without having to touch the entire document, they can be given a kramdown-internal symbolic name in the YAML, which is then referenced as {{-name}}. (1.0.29 adds support for adding alias names to other biblio references.)

An internal reference is normally rendered by xml2rfc with the word Section/Appendix/Figure/Table and the section/item number (format=default). There are shortcuts for setting the format to other values:

  • {{<target}} for <xref target="target" format="counter"/>, i.e., section number only. This is useful in combinations such as ...discussed in Sections {{<intro}} and {{<extro}}....
  • {{<<target}} for <xref target="target" format="title"/>, i.e., the title of the referenced section. This can be used if adding the section title to the reference is helpful, e.g., ... discussed in {{extro}}, {{<<extro}}., or of course on its own.
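
As a rough sketch of how these forms combine in running text (the section titles and the anchors intro and extro are made up for illustration, and RFC7252 and RFC7228 merely serve as sample citations):

```markdown
# Introduction {#intro}

This protocol builds on CoAP {{!RFC7252}}; background on constrained
nodes can be found in {{?RFC7228}}.
The remaining issues are discussed in {{extro}}, {{<<extro}}.

# Open Issues {#extro}

Nothing has been decided yet.

# Summary

The material in Sections {{<intro}} and {{<extro}} is not normative.
```

Going by the descriptions above, {{!RFC7252}} should generate a normative entry and {{?RFC7228}} an informative one, so neither would need to be listed in the YAML header.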

((())) syntax (index entries)

Markdown does not have syntax for index entries. kramdown-rfc emulates some asciidoc/pandoc syntax here.

  • Leading ! ➔ primary flag is set
  • item/optional subitem comma-separated, either one needs to be in quotes if it contains a comma
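
A small sketch of what such entries might look like in running text. The sentences are invented, and the use of double quotes as the quoting character is an assumption based on the rule above:

```markdown
CBOR (((CBOR))) is a binary data format.
This paragraph defines the term "observe". (((!observe)))
Large payloads use block-wise transfer. (((transfer, block-wise)))
Quoting keeps a comma inside a single item. ((("Union, International Telecommunication")))
```

In the third line, transfer would be the item and block-wise the subitem; the leading ! in the second line marks the entry as primary.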

{{{}}} syntax (contact names)

RFCXMLv3 has some special treatment for contact names, via the <contact/> element.

kramdown-rfc provides some basic syntax for <contact/> in text:

{{{Jürgen Möllemann}}} and {{{トヨタ自動車株式会社}{Toyota Jidosha}}} create contact elements with fullname and optionally asciiFullname attributes:

<contact fullname="Jürgen Möllemann"/>
<contact fullname="トヨタ自動車株式会社" asciiFullname="Toyota Jidosha"/>

(The transliteration of the second example is somewhat wrong, but we don’t have latinFullname, just asciiFullname; this will need to be fixed in the course of the upcoming beyond-ASCII renovation of RFCXMLv3.)

Note that with the current XML2RFC implementation of RFCXMLv3 there is no need to input asciiFullnames for fullnames that are using Latin characters only.

Also, the RFCXMLv3 grammar used by XML2RFC has mysterious limitations in the context where a <contact/> can be used. (This can be fixed using a hack, but that doesn't help with the installations in the I-D submit system and at the RFC editor.)

Within the limitations created by the above grammar issues, this syntax can also be abused to work around limitations in using beyond-ASCII characters in running text:

set {{{α}{}}}<sub>aimd</sub> to 1 once W<sub>est</sub> reaches W_max (#2)
{{{Voilà}}}.

This makes use of the undocumented XML2RFC feature that beyond-Latin fullnames can be used with an empty asciiFullname, leading to the weird bracefest around the α (which is Greek, not Latin).

(And please note the discussion in https://mailarchive.ietf.org/arch/msg/rfc-markdown/e9nnRhTVna5zM2wyfa-l-smYUMw for why I ultimately did go ahead with syntax for mentioning human names that remotely reminds one of triple parentheses.)

Mapping kramdown's markdown elements to RFCXML

It is not always straightforward how to elicit RFCXML behavior where markdown has been shaped by its HTML background.

Footnotes ➔ crefs

Footnotes turn into crefs (editing comments). Note that IALs (inline attribute lists) are needed to add a source attribute, making this a bit unwieldy unless complemented with an ALD:

Another questionable paragraph.[^1]{: source="observer"}

[^1]: so why not delete it

<!-- taking the noise out into an ALD: -->

{:cabo: source="cabo"}

(This section to be removed by the RFC editor.)[^2]{:cabo}

[^2]: are we sure about this?

The IALs can also be attached to the markdown footnote instead of to the footnote reference:

Another questionable paragraph.[^1]

[^1]: so why not delete it
{: source="observer"}

<!-- taking the noise out into an ALD: -->

{:cabo: source="cabo"}

(This section to be removed by the RFC editor.)[^2]

[^2]: are we sure about this?
{:cabo}

Please take note of the syntax of kramdown footnotes (emphasis and redactions mine):

  • The footnote name in square brackets, optionally indented up to three spaces,
  • then a colon and one or more optional spaces,
  • then the text of the footnote
  • and optionally more text on the following lines which have to follow the syntax for standard code blocks (the leading four spaces are naturally stripped from the text)
  • (my addition): and optionally an IAL on the next line.

Math handling

Since 1.0.30, kramdown-rfc converts display math into a crude ASCII art form (using the tex2mail tool, if available). This is probably useful only for a minority of applications. There is also no way to do embedded math (i.e., within a paragraph).
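
A minimal sketch of display math, assuming kramdown's usual math block delimiters ($$ on lines of their own, separated from the surrounding paragraphs by blank lines). The formula is only an illustration, and the quality of the ASCII rendering depends on tex2mail being installed:

```markdown
The midpoint is computed as follows:

$$
x = \frac{a + b}{2}
$$

The value of x is then rounded down.
```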

Normative Terms (BCP14 / RFC2119 / RFC8174)

The preamble of normative terms like MAY and MUST can be included with the directive:

{::boilerplate bcp14-tagged}

This adds the following text from the BCP to the output:

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

This version also wraps all occurrences of the normative terms in semantic tags understood by the xml2rfc processor. To insert the boilerplate text but skip the additional tagging, use the following directive:

{::boilerplate bcp14}
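
As an illustration of where the directive typically sits, here is a sketch with invented section titles and requirements; the directive is on a line of its own, and the normative terms are simply written in capitals in the text:

```markdown
# Conventions and Definitions

{::boilerplate bcp14-tagged}

# Protocol Operation

A client MUST NOT retransmit a request more than four times.
A server MAY drop requests that it cannot parse.
```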

Abbreviations ➔ automatic irefs

A kramdown mapping from an abbreviation to a full phrase is instead used to automatically create index entries.

*[IANA]:
*[MUST]: BCP14
*[MAY]: BCP14
*[MUST NOT]: BCP14
*[SHALL]: BCP14
*[CBOR]: (((Object Representation, Concise Binary))) (((CBOR)))

The word in square brackets (which must match exactly, case-sensitively) is entered into the index automatically for each place where it occurs. If no title is given, just the word is entered (first example). If one is given, that becomes the main item (the auto-indexed word becomes the subitem, second example and following). If full control is desired (e.g., for multiple entries per occurrence), just write down the full index entries instead (last example).

Special IAL attributes

(IAL = Inline Attribute Lists, a kramdown extension borrowed from Maruku. Note that the support for ALD = Attribute List Definitions comes free of charge with kramdown and has been used to factor out some ugly, repetitive IALs.)

Generally, IAL attributes are copied to the XML elements created. There is some special handling for

  • id, which is usually translated to anchor, so the #id syntax can be used in IALs
  • href, which is usually translated to target; in a link, it is translated to xref for internal links and eref for external ones; &foo; is converted to an entity reference

There are also some element-specific hacks, such as:

  • translating artwork-align etc. on a figure to an align attribute etc. for the artwork in the figure (same for sourcecode-markers and sourcecode-name).

  • translating class for code blocks into type (after removing the language- inserted by kramdown)

  • an attribute cols is translated into ttcol attributes for a table (so the writer does not have to memorize the arcane markdown table attribute syntax, which is also supported)

  • an attribute gi can be used to set the element type for elements where kramdown-rfc may be guessing wrong (e.g., artwork vs. sourcecode).

  • element content is turned into title attributes where needed (with some markup loss)

  • codespans are converted to style="verb", em/strong to emph/strong

  • vspace hacks [TODO]

The canonical way to include YANG as sourcecode is:

~~~~ yang
{::include yang/ietf-voucher-latest.yang}
~~~~
{: sourcecode-markers="true" sourcecode-name="ietf-voucher@2021-07-02.yang"}
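
A smaller sketch of the id and class handling described above, using an invented JSON snippet and the hypothetical anchor fig-reading. The #fig-reading in the IAL should end up as the anchor, and the json language tag as the type; which element is chosen (artwork vs. sourcecode) is up to kramdown-rfc unless gi is set:

```markdown
~~~ json
{"temperature": 22.5, "unit": "Cel"}
~~~
{: #fig-reading title="A sensor reading"}

The reading shown in {{fig-reading}} is only an example.
```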

The following ALDs are predefined:

| ALD name | ALD value | Usage |
|---|---|---|
| unnumbered | numbered="false" | section: do not number (e.g., acknowledgements) |
| removeinrfc | removeinrfc="true" | section: remove in RFC |
| notoc | toc="exclude" | section: do not include in table of contents |
| quote | gi="blockquote" | block quote: turn into <blockquote> |
| aside | gi="aside" | block quote: turn into <aside> |
| markers | sourcecode-markers="true" | code block: use <CODE BEGINS/ENDS> |
| vspace | vspace="0" | definition list (break after term) |
| compact | spacing="compact" | list (definition/numbered/unnumbered) |
| noabbrev | noabbrev="true" | suppress abbrev processing for a paragraph or a span |

For instance, a proper blockquote can be notated:

{:quote}
> The future is already here, it is just not evenly distributed.

Similar for an aside:

{:aside}
> The correct unit for bitrate is bit/s, not bps.
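
The section and list ALDs can be applied in the same way. A sketch with invented section titles and list items, assuming the block IAL is placed directly after a heading or directly before a list:

```markdown
# Acknowledgments
{:unnumbered}

Thanks to everyone who reviewed this document.

# Change Log
{:removeinrfc}

{:compact}
* -01: added examples
* -00: initial version
```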

The YAML header

Mostly documented by examples in README.md.

The structure of the YAML header is mostly derived from the structure of the front matter in RFCXML. The YAML header is a map (hash) with keys as defined below.

Some of these keys create attributes in the <rfc> element, with the following abbreviations also available:

| Name | <rfc> Attribute |
|---|---|
| ipr | ipr |
| docname | docName |
| cat | category |
| number | number |
| obsoletes | obsoletes |
| updates | updates |
| seriesno | seriesNo |

The following hash keys create front matter elements or their attributes:

| Name | XML | Type |
|---|---|---|
| title | <title> | String |
| abbrev | <title abbrev="..."> | String |
| author | <author...> | Author |
| date | <date> | Date |
| area | <area> | String |
| wg | <workgroup> | String |
| kw | <keyword> | String |

An Author is structured like an author in the references list. A date can be null (for "n.d."), a YAML date (as in 2016-03-06), an integer for a year (2016), or a string that will be parsed into YYYY-MM (defaulting to now if that parse fails); an array of strings is joined and turned into the actual value of the year attribute. Explicitly setting the date to false makes the date disappear; this is not the same as leaving it out (setting it to null, effectively), as the "n.d." can serve as a reminder to the author that a date is missing.
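
Putting the keys from the two tables together, the start of a kramdown-rfc source file might look roughly like this. All values are placeholders chosen for illustration, not taken from a real draft:

```markdown
---
title: An Example Protocol for Illustration
abbrev: Example Protocol
docname: draft-example-protocol-00
cat: info
ipr: trust200902
area: Applications
wg: Example Working Group
kw: example
date: 2016-03-06

author:
  ins: J. Doe
  name: Jane Doe
  org: Example Corporation
  email: jane.doe@example.org
```

Here the date is given as a YAML date; per the description above, an integer year such as 2016 or a null value would also be accepted.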

There are some additions for evoking other XML features:

  • entity: a hash named "entity" is turned into additional entity declarations that then can be referenced as {{&entityname}} in the text.

    entity:
      SELF: "[RFCXXXX]"
  • pi: a hash named "pi" allows adding processing instructions of the form <?rfc name="value" ?> to the front matter. A value that is not given (i.e., nil) is interpreted as "yes", as is true. false turns into "no".

    pi:
      toc: yes
      tocdepth: 4
      sortrefs: yes
      symrefs: yes
      compact: yes
      comments: yes

    As an abbreviation, if all values are "yes", the hash can be replaced by an array of names, as in (without the tocdepth setting):

    pi: [toc, sortrefs, symrefs, compact, comments]

References

The YAML keys normative and informative are maps (hashes) that create reference entries. In each, a map key that is an RFC name, an I-D name, or one of the additional reference styles for external standards documents from W3C etc. (exactly what is available in bibxml2 etc. remains to be defined) does not need a map value, as in:

normative:
  RFC2119:
  RFC7252:
  I-D.ietf-core-block:
  W3C.REC-xml.xml:

For references not in the xml2rfc.ietf.org libraries, more information has to be given (example with slightly over the top usage of seriesinfo):

informative:
  TypedArray:
    -: ta
    target: https://www.khronos.org/registry/typedarray/specs/1.0/
    title: Typed Array Specification
    author:
      -
        ins: V. Vukicevic
        name: Vladimir Vukicevic
        org: Mozilla Corporation
      -
        ins: K. Russell
        name: Kenneth Russell
        org: Google, Inc.
    seriesinfo:
      ISBN: 9780470747995
      DOI: 10.1109/MIC.2012.29
    date: 2011-02-08
    format:
      TXT: http://foo.bar/baz.txt
      PDF: http://foo.bar/baz.pdf
    ann: >
      This is a long annotation.

Here, TypedArray is the reference label as it will be visible in the document, and - is an alias name (see above). The keys target, title, author, date, seriesinfo, format, and ann (annotation) are the parts of the reference as defined in RFCXML. More information about authors is given below.
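
In the body text, the entry defined above could then be cited as follows. The sentences are invented, and the second line assumes that the alias ta is usable via the {{-name}} form described earlier:

```markdown
The element types follow the Typed Array Specification {{TypedArray}}.
Later sections cite the same entry through its alias, as in {{-ta}}.
```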

Seriesinfo

In RFCXML references, seriesinfo is sometimes used as a catchall for things that don't fit elsewhere. A slightly over the top example:

    seriesinfo:
      "ISO/IEC": JTC1/SC2/WG2 N4246R2
      ISSN: "0001-0782"
      "ACM Press": "Communications of the ACM vol 13 no 7 pp 422-426"
      "in: ECMA-262 6th Edition,": "The ECMAScript 2015 Language Specification"
      "Ph.D.": "Dissertation, University of California, Irvine"
      ISBN: 9780470747995
      DOI: 10.1109/MIC.2012.29
      'CoRE ticket': '#204'
      IEEE: "Transactions on Information Theory, Vol. 23, No. 3, pp. 337-343"
      'HTTPBIS ticket': '#131, closed 2009-12-02'

Author

An author key can be a single hash or an array of hashes, each an author entry. An author entry has a combined initials/surname entry (ins), a full name (name), and other keys as defined in RFCXML:

| Name | Element/Attribute |
|---|---|
| role | role |
| org | organization |
| abbrev | organization/abbrev |
| street | postal/street |
| city | postal/city |
| region | postal/region |
| code | postal/code |
| country | postal/country |
| phone | phone |
| facsimile | facsimile |
| email | email |
| uri | uri |
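
A sketch of an author array using several of these keys. Names, addresses, and URIs are placeholders; the postal code is quoted so that YAML does not treat it as a number:

```markdown
---
author:
  - ins: J. Doe
    name: Jane Doe
    role: editor
    org: Example Corporation
    abbrev: ExCorp
    street: 1 Example Street
    city: Kauniainen
    code: "02700"
    country: Finland
    email: jane.doe@example.org
    uri: https://www.example.org/~jdoe
  - ins: R. Roe
    name: Richard Roe
    org: Example University
    email: roe@example.edu
```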

Sections

After the YAML header, markdown sections are delimited by lines surrounded by empty lines, of the form:

--- sectionname

The sections are as follows:

| Section | Purpose |
|---|---|
| abstract | Abstract |
| middle | Main body, chapters of the document |
| back | Back matter, appendices |
| fluff | Unused, treated as comments |
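
The overall shape of a source file might then look like the following skeleton. The header is reduced to a few placeholder keys, and the section contents are invented:

```markdown
---
title: An Example Protocol for Illustration
docname: draft-example-protocol-00
cat: info
author:
  ins: J. Doe
  name: Jane Doe

--- abstract

This document describes an example protocol.

--- middle

# Introduction

The example protocol solves an example problem.

--- back

# Acknowledgments
{:unnumbered}

Thanks to the reviewers.

--- fluff

Notes to self; this part is treated as a comment.
```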

In addition, sections normative and informative can be used to add XML material to the front of the respective reference sections (the text generated from the YAML header is appended to that text). The usefulness of this function is limited as kramdown translates GIs and attribute names into lower case, and RFCXML inexplicably contains some mixed-case GIs; these therefore cannot be used from the markdown.

Avoiding Pitfalls

While markdown text entry usually provides a smooth experience, there are a few pitfalls that need to be watched:

  • Use of <something> without intending to create an XML element --- always escape left angle brackets that could be mistaken for XML with \< constructs. (The symptom may be that the XML-like text is simply swallowed, a hard-to-understand error message from xml2rfc, or weird formatting.)

  • Use of beyond-ASCII characters together with coding: us-ascii --- don't use coding: us-ascii unless you really need it. xml2rfc can usually convert beyond-ASCII characters into ASCII for the text version, and the HTML version may even benefit from the correct beyond-ASCII characters. (For final submission, of course the RFC-editor "non-ASCII" guidelines need to be fulfilled.)

  • Use of HT characters (horizontal tabs). In the best case, they will unpredictably mess up the alignment of artwork; in the worst case they will send xml2rfc into a loop. Avoid horizontal tabs and other control characters in the markdown input.

Some of these pitfalls could be addressed by adding some detection in the kramdown-rfc converter, but not all of them.
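
As a quick illustration of the first pitfall (the sentence itself is invented), escaping the left angle brackets keeps placeholder text from being read as XML:

```markdown
Each option is written as \<name>=\<value> on a line of its own.
```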

Pitfalls in YAML

The YAML format that kramdown-rfc2629 uses for its structured header is trying to be a user-friendly configuration file format. Most text strings can be written without using quote characters around them. However, in the process, YAML makes some assumptions that may lead to surprising results.

  • What looks like a number is parsed as a number. For most data items in the YAML header, this is then converted back into a string. However, the number parsing may lead to surprises, such as interpretation as an octal number:

    code: 02700

    ... which becomes 1472 when converted back to a (now decimal) string. Instead, one should write:

    code: "02700"
  • YAML detects other formats, such as Booleans, Dates, etc. When turned back into strings, unquoted n|N|no|No|NO|false|False|FALSE|off|Off|OFF become false, and unquoted y|Y|yes|Yes|YES|true|True|TRUE|on|On|ON become true.

    If in doubt, do quote.

  • YAML syntax such as embedded colons is interpreted, so the following leads to a syntax error:

    title: XDR: External Data Representation Standard

    Quoting is one way to handle this, but the preferred way to enter free-form text in a YAML field is the > syntax:

    title: >
      XDR: External Data Representation Standard

    Note that the indented lines become the text; it can be continued over multiple lines up to the next outdented line:

    title: >
      Information technology —
      Procedures for the operation of object identifier
      registration authorities:
      General procedures and top arcs of the international
      object identifier tree
    author:
      org: International Telecommunications Union

    Internal newlines and following indentation are converted to a single space when this format is parsed, so the preceding example is exactly equivalent to the following:

    title: >
      Information technology — Procedures for the operation of object
      identifier registration authorities: General procedures and top
      arcs of the international object identifier tree
    author:
      org: International Telecommunications Union
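
Taken together, a header fragment that sidesteps these pitfalls might look like this sketch. The author data is a placeholder, and the two-letter country value is used here only to show the Boolean problem:

```markdown
---
title: >
  XDR: External Data Representation Standard
author:
  ins: O. Nordmann
  name: Ola Nordmann
  code: "0155"    # quoted so that it is not parsed as a number
  country: "NO"   # quoted so that it is not read as Boolean false
```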