
Markdown Notation

Don Mendelson edited this page Oct 24, 2022 · 4 revisions

Markdown Overview

Markdown is a notation used in user forums, README files, and the like. Its appeal is that a markdown file can be read and written by a person as plain text, unlike HTML or XML, which rely on complex tag schemes. At the same time, markdown was designed so that tools can render it as a web page.

There are numerous markdown editors with web preview. But with a little experience, a dedicated markdown editor becomes unnecessary because the format is so simple that any plain text editor will do.

For the most part, a markdown document is just plain text in which a few marks are recognized for enrichment. You need to know only a few features of markdown in order to write rules of engagement and use md2orchestra:

  • Paragraphs are separated by an extra line break.
  • Headings begin with a hash mark "#" for each heading level.
  • Words can be emphasized by typing *italic* for italic or **bold** for bold.
  • Tables are drawn with bars "|" to separate columns and one row of hyphens to set off the column headers. (You don't need to line up columns precisely, unless you want to.)

See this cheat sheet for more.

Conventionally, markdown files are saved with the ".md" extension.

Interpretation in Tablature

Not every markdown feature is distinguished or processed in Tablature. Here is its specific interpretation:

Blocks

A markdown document is constructed of blocks. Each block is one of the following:

  • Heading
  • Paragraph, List, or Blockquote--Tablature parses these block types but treats them the same.
  • Table

Heading

An ATX-style heading (ATX was apparently a forerunner of markdown) begins with one or more unescaped hashes # (see Escaped Characters below), optionally followed by a single space, followed by heading text that ends at a newline. The number of hashes indicates the heading level.

Tablature tokenizes headings to discover key words. Normally, tokens are simply separated by whitespace. However, names can be written with internal spaces if they are surrounded by double quotes.
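The heading rules above can be sketched in a few lines of Python. This is only an illustration of the notation, not Tablature's actual implementation; the function name parse_heading and both regular expressions are hypothetical.

```python
import re

# Hypothetical helpers for illustration only; not Tablature's API.
# One or more hashes, an optional single space, then the heading text.
HEADING = re.compile(r'^(#+) ?(.*)$')
# A token is either a double-quoted name (may contain spaces) or a run of non-whitespace.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

def parse_heading(line):
    """Return (level, tokens) for a heading line, or None if it is not a heading."""
    match = HEADING.match(line.rstrip("\n"))
    if match is None:
        return None  # includes lines that start with an escaped hash such as \#
    hashes, text = match.groups()
    tokens = [quoted if quoted else bare for quoted, bare in TOKEN.findall(text)]
    return len(hashes), tokens

level, tokens = parse_heading('## Message "New Order Single" type D')
print(level, tokens)
```

Here the quoted name "New Order Single" comes back as a single token, while the other words are split on whitespace.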

Paragraph

A paragraph is text separated from other blocks by an extra newline. The initial character of a paragraph cannot be an unescaped #, >, or | since those characters begin a heading, a blockquote, and a table column, respectively. Paragraph text may include inline marks for italic, bold, and so forth.

List

A list is a block of text delimited by an extra newline in which each line begins with one of the following:

  • One of -, +, or * interpreted as a bullet.
  • One or more digits followed by a period . and a space for a numbered list item.

Blockquote

A blockquote is a block of text delimited by an extra newline in which each line begins with a greater-than character >, optionally followed by one space.
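As a rough sketch of how the block types described so far can be told apart, the Python below splits a document on blank lines and inspects the start of each line. The names split_blocks and classify_block are hypothetical, and requiring a space after a bullet character is an assumption made here so that a paragraph starting with *italic* is not mistaken for a list.

```python
import re

# Hypothetical sketch; not Tablature's implementation.
# Assumption: a bullet (-, +, *) must be followed by a space.
LIST_ITEM = re.compile(r'^(?:[-+*]|\d+\.) ')

def split_blocks(text):
    """Split a document into blocks separated by an extra (blank) line."""
    return [block for block in re.split(r'\n\s*\n', text.strip("\n")) if block]

def classify_block(block):
    """Name the block type based on the initial characters of its lines."""
    lines = block.split("\n")
    if lines[0].startswith("#"):
        return "heading"
    if lines[0].startswith("|"):
        return "table"
    if all(line.startswith(">") for line in lines):
        return "blockquote"
    if all(LIST_ITEM.match(line) for line in lines):
        return "list"
    return "paragraph"

document = (
    "# Title\n\n"
    "Some *text* here.\n\n"
    "- one\n- two\n\n"
    "> quoted\n> more\n\n"
    "| a | b |\n|---|---|\n| 1 | 2 |"
)
for block in split_blocks(document):
    print(classify_block(block))
```

Since Tablature treats paragraphs, lists, and blockquotes the same, the distinction among those three matters only for rendering.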

Table

A table consists of three parts: a table heading row, a delimiter row, and any number of data rows. In all rows, table columns are delimited by a pipe | character. A pipe character following the rightmost column is optional. It is not necessary to line up the columns, although doing so makes the table easier to read.

Table Column Headings

Each column in the first row contains text used as the heading for that column. As in other rows, columns are separated by a pipe character. Within a column, the text may have leading and trailing spaces, which are stripped off by Tablature.

For simplicity, markdown tables do not support headings that span columns.

Table Delimiter Row

Each column in the delimiter row is filled with one or more hyphen characters. The hyphens may optionally be preceded by one colon : and may be followed by one colon. A column delimiter of |:- is aligned left, |-: is aligned right, and |:-: is centered. Alignment is purely for rendering and makes no difference in how Tablature handles data.
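The colon rules can be illustrated with a small Python sketch. The function column_alignment is hypothetical, and the "default" result for a column with no colons is an assumption, since the alignment of such a column is left to the renderer.

```python
import re

# Hypothetical sketch; not Tablature's implementation.
# Optional leading colon, one or more hyphens, optional trailing colon.
DELIMITER_CELL = re.compile(r'^(:?)-+(:?)$')

def column_alignment(cell):
    """Return the alignment implied by one column of a delimiter row."""
    match = DELIMITER_CELL.match(cell.strip())
    if match is None:
        raise ValueError(f"not a delimiter column: {cell!r}")
    leading, trailing = match.groups()
    if leading and trailing:
        return "center"
    if trailing:
        return "right"
    if leading:
        return "left"
    return "default"  # assumption: no colons means renderer default

print([column_alignment(c) for c in [":-", "-:", ":-:", "---"]])
```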

Table Data

Each cell in a data row is delimited by a pipe | character. A trailing pipe on the rightmost column is optional. The text may have leading or trailing spaces which are stripped off by Tablature.
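A minimal Python sketch of splitting a row into cells follows. The function split_row is hypothetical and is not Tablature's code; it honors backslash escapes so that an escaped pipe stays inside its cell, but for brevity it does not treat backtick literals specially.

```python
# Hypothetical sketch; not Tablature's implementation.
def split_row(line):
    """Split a table row on unescaped pipes and strip each cell."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]  # a leading pipe opens the first column
    cells, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair intact
            i += 2
            continue
        if ch == "|":
            cells.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    cells.append("".join(current))
    if len(cells) > 1 and cells[-1] == "":
        cells.pop()  # the optional trailing pipe leaves an empty remainder
    return [cell.strip() for cell in cells]

print(split_row("| Name | Tag | Values |"))
print(split_row("| Name | Tag | Values"))
```

Both calls yield the same three cells, since the trailing pipe is optional.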

Escaped Characters

Any punctuation character may be preceded by a backslash \ character to indicate that it should be treated as a literal rather than a special character. Such a character is said to be escaped. For example, an escaped pipe character is treated as a literal symbol, not as a table column delimiter. Without a backslash, the character is said to be unescaped and may have a special meaning in markdown. The sequence \\ is interpreted as a literal backslash.
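To illustrate the escape rule, the Python sketch below removes the backslash from any escaped punctuation character. The function unescape is hypothetical, and treating "punctuation" as the ASCII set in Python's string.punctuation is an assumption.

```python
import re
import string

# Hypothetical sketch; not Tablature's implementation.
# Assumption: "punctuation" means the ASCII punctuation characters.
ESCAPED = re.compile(r'\\([' + re.escape(string.punctuation) + r'])')

def unescape(text):
    """Replace each backslash-escaped punctuation character with the literal character."""
    return ESCAPED.sub(r'\1', text)

print(unescape(r'price \| quantity'))
print(unescape(r'C:\\temp'))
```

A backslash before a letter or digit is left in place, because only punctuation characters can be escaped.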

Literals

Text embedded within a pair of backticks ` is considered literal and does not need escape characters.

Unimplemented Features

The following leaf blocks of markdown are not implemented in Tablature:

  • Thematic breaks
  • Setext headings (which underline the heading text with a row of = or - instead of prefixing it with #)
  • Fenced code blocks
  • Indented code blocks
  • HTML blocks

Inline features of markdown are generally passed through without special handling.