Markdown Abstract Syntax Tree.
MDAST represents markdown as an abstract syntax tree. Abstract means not all information is stored in this tree and an exact replica of the original document cannot be re-created. Syntax Tree means syntax is present in the tree, thus an exact syntactic document can be re-created.
MDAST is a subset of unist, and is implemented by remark.
This document may not be released.
See releases for released documents.
The latest released version is 2.2.0.
AST
Root
Root (Parent) houses all nodes.
interface Root <: Parent {
type: "root";
}
Paragraph
Paragraph (Parent) represents a unit of discourse dealing with a
particular point or idea.
interface Paragraph <: Parent {
type: "paragraph";
}
For example, the following markdown:
Alpha bravo charlie.
Yields:
{
type: 'paragraph',
children: [{type: 'text', value: 'Alpha bravo charlie.'}]
}
Blockquote
Blockquote (Parent) represents a quote.
interface Blockquote <: Parent {
type: "blockquote";
}
For example, the following markdown:
> Alpha bravo charlie.
Yields:
{
type: 'blockquote',
children: [{
type: 'paragraph',
children: [{type: 'text', value: 'Alpha bravo charlie.'}]
}]
}
Heading
Heading (Parent) represents a heading, just as in HTML, with a depth greater
than or equal to 1 and less than or equal to 6.
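The depth constraint can also be checked at runtime. The following is a minimal sketch, assuming a hypothetical helper named `isValidHeading` (it is not part of mdast or remark):

```javascript
// Hypothetical helper: checks that a node looks like a Heading,
// i.e. type "heading", an integer depth from 1 to 6, and a children array.
function isValidHeading(node) {
  return (
    node !== null &&
    typeof node === 'object' &&
    node.type === 'heading' &&
    Number.isInteger(node.depth) &&
    node.depth >= 1 &&
    node.depth <= 6 &&
    Array.isArray(node.children)
  );
}

const heading = {
  type: 'heading',
  depth: 1,
  children: [{type: 'text', value: 'Alpha'}]
};

console.log(isValidHeading(heading)); // true
console.log(isValidHeading({...heading, depth: 7})); // false
```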
interface Heading <: Parent {
type: "heading";
depth: 1 <= uint32 <= 6;
}
For example, the following markdown:
# Alpha
Yields:
{
type: 'heading',
depth: 1,
children: [{type: 'text', value: 'Alpha'}]
}
Code
Code (Text) occurs at block level (see InlineCode
for code spans).
The opening fence of fenced code can be followed by a language tag (lang),
and then optionally by white-space followed by the meta value (meta).
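To illustrate how the text after an opening fence maps onto lang and meta, here is a minimal sketch. The helper name `parseInfoString` is hypothetical, and the splitting rule (first run of non-white-space is the language tag, the rest is meta) is an assumption drawn from the description above; a real parser such as remark may treat edge cases differently.

```javascript
// Hypothetical helper: splits the text following an opening fence
// into a language tag and a meta value. Missing parts become null.
function parseInfoString(info) {
  const trimmed = info.trim();
  if (trimmed === '') {
    return {lang: null, meta: null};
  }
  // First run of non-white-space is the language tag; anything after
  // the separating white-space is kept verbatim as meta.
  const match = /^(\S+)(?:\s+([\s\S]*))?$/.exec(trimmed);
  return {
    lang: match[1],
    meta: match[2] === undefined ? null : match[2]
  };
}

const withMeta = parseInfoString('js highlight-line="2"');
const langOnly = parseInfoString('js');
const empty = parseInfoString('');
console.log(withMeta, langOnly, empty);
```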
interface Code <: Text {
type: "code";
lang: string | null;
meta: string | null;
}
For example, the following markdown:
    foo()
Yields:
{
type: 'code',
lang: null,
meta: null,
value: 'foo()'
}
And the following markdown:
```js highlight-line="2"
foo()
bar()
baz()
```
Yields:
{
type: 'code',
lang: 'js',
meta: 'highlight-line="2"',
value: 'foo()\nbar()\nbaz()'
}
InlineCode
InlineCode (Text) occurs inline (see Code for blocks).
Inline code does not sport lang or meta properties.
interface InlineCode <: Text {
type: "inlineCode";
}
For example, the following markdown:
`foo()`
Yields:
{type: 'inlineCode', value: 'foo()'}
YAML
YAML (Text) can occur at the start of a document, and contains
embedded YAML data.
interface YAML <: Text {
type: "yaml";
}
Note: YAML used to be available through the core of remark and is therefore specified here. Support for it has now moved to
remark-frontmatter, and the definition here may be removed in the future.
For example, the following markdown:
---
foo: bar
---
Yields:
{type: 'yaml', value: 'foo: bar'}
HTML
HTML (Text) contains embedded HTML.
interface HTML <: Text {
type: "html";
}
For example, the following markdown:
<div>
Yields:
{type: 'html', value: '<div>'}
List
List (Parent) contains ListItems.
No other nodes may occur in lists.
The start property contains the starting number of the list when
ordered: true; null otherwise.
When all list items have loose: false, the list’s loose property is also
false. Otherwise, loose: true.
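The rule for the list's loose property can be sketched as a small function. The name `isLooseList` is hypothetical; it only restates the rule above and is not taken from remark.

```javascript
// Hypothetical helper: a list is tight (loose: false) only when every
// one of its items has loose: false; otherwise it is loose.
function isLooseList(items) {
  return !items.every((item) => item.loose === false);
}

const tightItems = [
  {type: 'listItem', checked: null, loose: false, children: []},
  {type: 'listItem', checked: null, loose: false, children: []}
];
const mixedItems = [
  {type: 'listItem', checked: null, loose: false, children: []},
  {type: 'listItem', checked: null, loose: true, children: []}
];

console.log(isLooseList(tightItems)); // false
console.log(isLooseList(mixedItems)); // true
```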
interface List <: Parent {
type: "list";
ordered: true | false;
start: uint32 | null;
loose: true | false;
}
For example, the following markdown:
1. [x] foo
Yields:
{
type: 'list',
ordered: true,
start: 1,
loose: false,
children: [{
type: 'listItem',
checked: true,
loose: false,
children: [{
type: 'paragraph',
children: [{type: 'text', value: 'foo'}]
}]
}]
}
ListItem
ListItem (Parent) is a child of a List.
Loose ListItems often contain more than one block-level element.
A checked property exists on ListItems, set to true (when checked), false
(when unchecked), or null (when not containing a checkbox).
See Task Lists on GitHub for information.
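As an illustration of the three possible values of checked, the sketch below reads a task-list marker from the start of an item's text. The helper `parseCheckbox` is hypothetical and the pattern is a simplification; the actual GitHub task-list rules have more detail than is shown here.

```javascript
// Hypothetical helper: detects a leading "[x]" or "[ ]" marker.
// Returns checked: true, false, or null (no checkbox), plus the
// remaining text with the marker removed.
function parseCheckbox(text) {
  const match = /^\[([ xX])\][ \t]+/.exec(text);
  if (!match) {
    return {checked: null, rest: text};
  }
  return {
    checked: match[1] !== ' ',
    rest: text.slice(match[0].length)
  };
}

const done = parseCheckbox('[x] foo');
const todo = parseCheckbox('[ ] bar');
const plain = parseCheckbox('baz');
console.log(done.checked, todo.checked, plain.checked); // true false null
```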
interface ListItem <: Parent {
type: "listItem";
checked: true | false | null;
loose: true | false;
}
For an example, see the definition of List.
Table
Table (Parent) represents tabular data, with alignment.
Its children are TableRows, the first of which acts as a table
header row.
table.align represents the alignment of columns.
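The values in table.align correspond to the colons in the delimiter row of the table. A minimal sketch of that mapping follows; the helper names `alignFromDelimiter` and `parseAlignRow` are hypothetical, and the row splitting is simplified (it does not handle escaped pipes).

```javascript
// Hypothetical helper: maps one delimiter cell such as ":--" or ":-:"
// to an alignType value ("left", "right", "center", or null).
function alignFromDelimiter(cell) {
  const trimmed = cell.trim();
  const left = trimmed.startsWith(':');
  const right = trimmed.endsWith(':');
  if (left && right) return 'center';
  if (left) return 'left';
  if (right) return 'right';
  return null;
}

// Hypothetical helper: splits a delimiter row into cells and maps each one.
function parseAlignRow(row) {
  return row
    .trim()
    .replace(/^\|/, '') // drop the leading pipe, if any
    .replace(/\|$/, '') // drop the trailing pipe, if any
    .split('|')
    .map((cell) => alignFromDelimiter(cell));
}

const align = parseAlignRow('| :-- | :-: |');
console.log(align); // [ 'left', 'center' ]
```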
interface Table <: Parent {
type: "table";
align: [alignType];
}
enum alignType {
"left" | "right" | "center" | null;
}
For example, the following markdown:
| foo | bar |
| :-- | :-: |
| baz | qux |
Yields:
{
type: 'table',
align: ['left', 'center'],
children: [
{
type: 'tableRow',
children: [
{
type: 'tableCell',
children: [{type: 'text', value: 'foo'}]
},
{
type: 'tableCell',
children: [{type: 'text', value: 'bar'}]
}
]
},
{
type: 'tableRow',
children: [
{
type: 'tableCell',
children: [{type: 'text', value: 'baz'}]
},
{
type: 'tableCell',
children: [{type: 'text', value: 'qux'}]
}
]
}
]
}
TableRow
TableRow (Parent).
Its children are always TableCells.
interface TableRow <: Parent {
type: "tableRow";
}
For an example, see the definition of Table.
TableCell
TableCell (Parent).
Contains a single tabular field.
interface TableCell <: Parent {
type: "tableCell";
}
For an example, see the definition of Table.
ThematicBreak
A ThematicBreak (Node) represents a break in content, often shown
as a horizontal rule, or by two HTML section elements.
interface ThematicBreak <: Node {
type: "thematicBreak";
}
For example, the following markdown:
***
Yields:
{type: 'thematicBreak'}
Break
Break (Node) represents an explicit line break.
interface Break <: Node {
type: "break";
}
For example, the following markdown (interpuncts represent spaces):
foo··
bar
Yields:
{
type: 'paragraph',
children: [
{type: 'text', value: 'foo'},
{type: 'break'},
{type: 'text', value: 'bar'}
]
}
Emphasis
Emphasis (Parent) represents slight emphasis.
interface Emphasis <: Parent {
type: "emphasis";
}
For example, the following markdown:
*alpha* _bravo_
Yields:
{
type: 'paragraph',
children: [
{
type: 'emphasis',
children: [{type: 'text', value: 'alpha'}]
},
{type: 'text', value: ' '},
{
type: 'emphasis',
children: [{type: 'text', value: 'bravo'}]
}
]
}
Strong
Strong (Parent) represents strong emphasis.
interface Strong <: Parent {
type: "strong";
}
For example, the following markdown:
**alpha** __bravo__
Yields:
{
type: 'paragraph',
children: [
{
type: 'strong',
children: [{type: 'text', value: 'alpha'}]
},
{type: 'text', value: ' '},
{
type: 'strong',
children: [{type: 'text', value: 'bravo'}]
}
]
}
Delete
Delete (Parent) represents text ready for removal.
interface Delete <: Parent {
type: "delete";
}
For example, the following markdown:
~~alpha~~
Yields:
{
type: 'delete',
children: [{type: 'text', value: 'alpha'}]
}
Link
Link (Parent) represents the humble hyperlink.
interface Link <: Parent {
type: "link";
url: string;
title: string | null;
}
For example, the following markdown:
[alpha](http://example.com "bravo")
Yields:
{
type: 'link',
url: 'http://example.com',
title: 'bravo',
children: [{type: 'text', value: 'alpha'}]
}
Image
Image (Node) represents the figurative figure.
interface Image <: Node {
type: "image";
url: string;
title: string | null;
alt: string | null;
}
For example, the following markdown:
![alpha](http://example.com "bravo")
Yields:
{
type: 'image',
url: 'http://example.com',
title: 'bravo',
alt: 'alpha'
}
Footnote
Footnote (Parent) represents an inline marker, whose content
relates to the document but is outside its flow.
interface Footnote <: Parent {
type: "footnote";
}
For example, the following markdown:
[^alpha bravo]
Yields:
{
type: 'footnote',
children: [{type: 'text', value: 'alpha bravo'}]
}
LinkReference
LinkReference (Parent) represents a humble hyperlink, its url
and title defined somewhere else in the document by a
Definition.
referenceType is needed to detect if a reference was meant as a reference
([foo][]) or just unescaped brackets ([foo]).
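The three referenceType values can be told apart from the shape of the source text. The sketch below assumes a hypothetical helper `referenceTypeOf` and deliberately ignores nested brackets and escapes, so it is an illustration rather than a faithful parser.

```javascript
// Hypothetical helper: classifies the raw source of a reference.
// "[foo][bar]" is full, "[foo][]" is collapsed, "[foo]" is shortcut.
// An optional leading "!" (image reference) is accepted as well.
function referenceTypeOf(source) {
  if (/^!?\[[^\]]+\]\[\]$/.test(source)) return 'collapsed';
  if (/^!?\[[^\]]+\]\[[^\]]+\]$/.test(source)) return 'full';
  if (/^!?\[[^\]]+\]$/.test(source)) return 'shortcut';
  return null; // not a reference in this simplified model
}

console.log(referenceTypeOf('[alpha][bravo]')); // full
console.log(referenceTypeOf('[foo][]')); // collapsed
console.log(referenceTypeOf('[foo]')); // shortcut
```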
interface LinkReference <: Parent {
type: "linkReference";
identifier: string;
referenceType: referenceType;
}
enum referenceType {
"shortcut" | "collapsed" | "full";
}
For example, the following markdown:
[alpha][bravo]
Yields:
{
type: 'linkReference',
identifier: 'bravo',
referenceType: 'full',
children: [{type: 'text', value: 'alpha'}]
}
ImageReference
ImageReference (Node) represents a figurative figure, its url and
title defined somewhere else in the document by a Definition.
referenceType is needed to detect if a reference was meant as a reference
(![foo][]) or just unescaped brackets (![foo]).
See LinkReference for the definition of referenceType.
interface ImageReference <: Node {
type: "imageReference";
identifier: string;
referenceType: referenceType;
alt: string | null;
}
For example, the following markdown:
![alpha][bravo]
Yields:
{
type: 'imageReference',
identifier: 'bravo',
referenceType: 'full',
alt: 'alpha'
}
FootnoteReference
FootnoteReference (Node) is like Footnote, but its
content is already outside the document's flow: placed in a
FootnoteDefinition.
interface FootnoteReference <: Node {
type: "footnoteReference";
identifier: string;
}
For example, the following markdown:
[^alpha]
Yields:
{
type: 'footnoteReference',
identifier: 'alpha'
}
Definition
Definition (Node) represents the definition (as in, location and
title) of a LinkReference or an
ImageReference.
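To show how a reference and its Definition fit together, here is a minimal sketch that collects definitions from a tree and looks one up by identifier. The helpers `collectDefinitions` and `resolveReference` are hypothetical; matching identifiers case-insensitively and letting the first definition win are assumptions of this sketch, not statements about remark or mdast-util-definitions.

```javascript
// Hypothetical helper: walks a tree depth-first and indexes
// definition nodes by lower-cased identifier (first one wins).
function collectDefinitions(root) {
  const definitions = new Map();
  const visit = (node) => {
    if (node.type === 'definition') {
      const key = node.identifier.toLowerCase();
      if (!definitions.has(key)) definitions.set(key, node);
    }
    (node.children || []).forEach((child) => visit(child));
  };
  visit(root);
  return definitions;
}

// Hypothetical helper: returns the url and title for a linkReference
// or imageReference, or null when no definition matches.
function resolveReference(reference, definitions) {
  const definition = definitions.get(reference.identifier.toLowerCase());
  return definition ? {url: definition.url, title: definition.title} : null;
}

const tree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [{
        type: 'linkReference',
        identifier: 'alpha',
        referenceType: 'shortcut',
        children: [{type: 'text', value: 'alpha'}]
      }]
    },
    {type: 'definition', identifier: 'alpha', url: 'http://example.com', title: null}
  ]
};

const definitions = collectDefinitions(tree);
const resolved = resolveReference(tree.children[0].children[0], definitions);
console.log(resolved);
```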
interface Definition <: Node {
type: "definition";
identifier: string;
url: string;
title: string | null;
}
For example, the following markdown:
[alpha]: http://example.com
Yields:
{
type: 'definition',
identifier: 'alpha',
url: 'http://example.com',
title: null
}
FootnoteDefinition
FootnoteDefinition (Parent) represents the definition (as in,
content) of a FootnoteReference.
interface FootnoteDefinition <: Parent {
type: "footnoteDefinition";
identifier: string;
}
For example, the following markdown:
[^alpha]: bravo and charlie.
Yields:
{
type: 'footnoteDefinition',
identifier: 'alpha',
children: [{
type: 'paragraph',
children: [{type: 'text', value: 'bravo and charlie.'}]
}]
}
TextNode
TextNode (Text) represents everything that is just text.
Note that its type property is text, but it is different from
Text.
interface TextNode <: Text {
type: "text";
}
For example, the following markdown:
Alpha bravo charlie.
Yields:
{type: 'text', value: 'Alpha bravo charlie.'}
List of Utilities
mdast-util-assert: Assert MDAST nodes
mdast-add-list-metadata: Enhances the metadata of list and listItem nodes
mdast-comment-marker: Parse a comment marker
mdast-util-compact: Make an MDAST tree compact
mdast-util-definitions: Find definition nodes
mdast-flatten-listitem-paragraphs: Flatten listItem and (nested) paragraph into one listItem node
mdast-flatten-nested-lists: Transforms an MDAST tree to avoid lists inside lists
mdast-util-heading-range: Markdown heading as ranges
mdast-util-heading-style: Get the style of a heading node
mdast-util-inject: Inject a tree into another at a given heading
mdast-util-to-string: Get the plain text content of a node
mdast-flatten-image-paragraphs: Flatten paragraph and image into one image node
mdast-move-images-to-root: Moves image nodes up the tree until they are strict children of the root
mdast-normalize-headings: Ensure at most one top-level heading is in the document
mdast-squeeze-paragraphs: Remove empty paragraphs
mdast-util-toc: Generate a Table of Contents from a tree
mdast-util-to-hast: Transform MDAST to HAST
mdast-util-to-nlcst: Transform MDAST to NLCST
mdast-zone: HTML comments as ranges or markers
Contribute
mdast is built by people just like you! Check out
contributing.md for ways to get started.
This project has a Code of Conduct. By interacting with this repository, organisation, or community you agree to abide by its terms.
Want to chat with the community and contributors? Join us in Gitter!
Have an idea for a cool new utility or tool? That’s great! If you want
feedback, help, or just to share it with the world, you can do so by creating
an issue in the syntax-tree/ideas repository!
Acknowledgments
The initial release of this project was authored by @wooorm.
Special thanks to @eush77 for their work, ideas, and incredibly valuable feedback!
Thanks to @anandthakker, @BarryThePenguin, @izumin5210, @jasonLaster, @justjake, @KyleAMathews, @Rokt33r, @rhysd, @Sarah-Seo, @sethvincent, and @simov for contributing commits since!