Natural Language Concrete Syntax Tree, used in @retextjs


Natural Language Concrete Syntax Tree format.

Natural language for human and machine.

NLCST discloses the parts of natural language as a concrete syntax tree. Concrete means all information is stored in this tree and an exact replica of the original document can be re-created.
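To illustrate what "concrete" means in practice, here is a minimal sketch in TypeScript. The `TextLike` and `ParentLike` shapes and the `serialize` helper are assumptions made for this example and are not part of the specification; the sketch only shows that concatenating every text value in document order gives back the original text.

```typescript
// Simplified node shapes, assumed for illustration only.
interface TextLike {
  type: string;
  value: string;
}

interface ParentLike {
  type: string;
  children: Array<ParentLike | TextLike>;
}

type NlcstNode = ParentLike | TextLike;

// Hypothetical helper: join every text value in document order.
function serialize(node: NlcstNode): string {
  if ("children" in node) {
    return node.children.map((child) => serialize(child)).join("");
  }
  return node.value;
}

const greetingTree: NlcstNode = {
  type: "SentenceNode",
  children: [
    { type: "WordNode", children: [{ type: "TextNode", value: "Hello" }] },
    { type: "PunctuationNode", value: "," },
    { type: "WhiteSpaceNode", value: " " },
    { type: "WordNode", children: [{ type: "TextNode", value: "world" }] },
    { type: "PunctuationNode", value: "!" }
  ]
};

// Nothing is lost: white space and punctuation are nodes too.
const rebuiltText = serialize(greetingTree); // "Hello, world!"
```

Because white space and punctuation are stored as nodes rather than discarded, no separate copy of the source text is needed to reproduce it.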

NLCST is a subset of Unist, and implemented by retext.

This document may not be released. See releases for released documents. The latest released version is 1.0.1.

List of Utilities

In addition, see Unist for other utilities which work with retext nodes.



Root (Parent) houses all nodes.

interface Root <: Parent {
  type: "RootNode";
}


Paragraph (Parent) represents a self-contained unit of discourse in writing dealing with a particular point or idea.

interface Paragraph <: Parent {
  type: "ParagraphNode";
}


Sentence (Parent) represents a grouping of grammatically linked words that in principle expresses a complete thought, although it may make little sense taken in isolation, out of context.

interface Sentence <: Parent {
  type: "SentenceNode";
}


Word (Parent) represents the smallest element that may be uttered in isolation with semantic or pragmatic content.

interface Word <: Parent {
  type: "WordNode";
}
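The four parent nodes above nest inside one another. The following TypeScript sketch shows one plausible tree for the text "Simple text." and a small counting helper. The `NlcstText` and `NlcstParent` shapes and the `countType` function are hypothetical names chosen for this example; how a real parser splits text into nodes may differ.

```typescript
// Simplified node shapes, assumed for illustration only.
interface NlcstText {
  type: string;
  value: string;
}

interface NlcstParent {
  type: string;
  children: Array<NlcstParent | NlcstText>;
}

type NlcstNode = NlcstParent | NlcstText;

// Root > Paragraph > Sentence > Word, with symbols between the words.
const documentTree: NlcstParent = {
  type: "RootNode",
  children: [
    {
      type: "ParagraphNode",
      children: [
        {
          type: "SentenceNode",
          children: [
            { type: "WordNode", children: [{ type: "TextNode", value: "Simple" }] },
            { type: "WhiteSpaceNode", value: " " },
            { type: "WordNode", children: [{ type: "TextNode", value: "text" }] },
            { type: "PunctuationNode", value: "." }
          ]
        }
      ]
    }
  ]
};

// Hypothetical helper: count nodes of a given type anywhere in the tree.
function countType(node: NlcstNode, type: string): number {
  const own = node.type === type ? 1 : 0;
  if (!("children" in node)) {
    return own;
  }
  return node.children.reduce((sum, child) => sum + countType(child, type), own);
}

const wordCount = countType(documentTree, "WordNode"); // 2
const sentenceCount = countType(documentTree, "SentenceNode"); // 1
```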


Symbol (Text) represents typographical devices like white space, punctuation, signs, and more, different from characters which represent sounds (like letters and numerals).

interface Symbol <: Text {
  type: "SymbolNode";
}


Punctuation (Symbol) represents typographical devices which aid understanding and correct reading of other grammatical units.

interface Punctuation <: Symbol {
  type: "PunctuationNode";
}


WhiteSpace (Symbol) represents typographical devices devoid of content, separating other grammatical units.

interface WhiteSpace <: Symbol {
  type: "WhiteSpaceNode";
}
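Symbol, Punctuation, and WhiteSpace all carry a plain `value`, so they can be told apart by `type` alone. The TypeScript sketch below groups a handful of such nodes by type. The `NlcstText` shape and the `groupByType` helper are assumptions for this example, and which characters a parser assigns to which category is up to that parser; the values here are illustrative.

```typescript
// Simplified text-like node shape, assumed for illustration only.
interface NlcstText {
  type: string;
  value: string;
}

const symbolNodes: NlcstText[] = [
  { type: "PunctuationNode", value: "," },
  { type: "WhiteSpaceNode", value: " " },
  { type: "SymbolNode", value: "&" },
  { type: "WhiteSpaceNode", value: "\n" },
  { type: "PunctuationNode", value: "." }
];

// Hypothetical helper: collect node values per node type, in document order.
function groupByType(nodes: NlcstText[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const node of nodes) {
    const values = groups[node.type] ?? [];
    values.push(node.value);
    groups[node.type] = values;
  }
  return groups;
}

const groupedSymbols = groupByType(symbolNodes);
// groupedSymbols.PunctuationNode holds "," and "."
// groupedSymbols.WhiteSpaceNode holds " " and "\n"
// groupedSymbols.SymbolNode holds "&"
```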


Source (Text) represents an external (ungrammatical) value embedded into a grammatical unit: a hyperlink, a line, and such.

interface Source <: Text {
  type: "SourceNode";
}
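As a rough illustration of how a Source node might sit inside a sentence, the TypeScript sketch below embeds a URL and renders the sentence with and without it. The node shapes, the `render` helper, and its `includeSource` flag are hypothetical and exist only for this example.

```typescript
// Simplified node shapes, assumed for illustration only.
interface NlcstText {
  type: string;
  value: string;
}

interface NlcstParent {
  type: string;
  children: Array<NlcstParent | NlcstText>;
}

type NlcstNode = NlcstParent | NlcstText;

const sentenceWithSource: NlcstParent = {
  type: "SentenceNode",
  children: [
    { type: "WordNode", children: [{ type: "TextNode", value: "Visit" }] },
    { type: "WhiteSpaceNode", value: " " },
    // The URL is kept verbatim and is not split into words.
    { type: "SourceNode", value: "https://example.com" },
    { type: "PunctuationNode", value: "." }
  ]
};

// Hypothetical helper: serialize a node, optionally leaving out Source values.
function render(node: NlcstNode, includeSource: boolean): string {
  if ("children" in node) {
    return node.children.map((child) => render(child, includeSource)).join("");
  }
  if (node.type === "SourceNode" && !includeSource) {
    return "";
  }
  return node.value;
}

const fullText = render(sentenceWithSource, true); // "Visit https://example.com."
const proseOnly = render(sentenceWithSource, false); // "Visit ."
```

Keeping the embedded value in a single node lets tools that analyze prose skip it while the document can still be reproduced exactly.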


TextNode (Text) represents actual content in an NLCST document: one or more characters. Note that its type property is TextNode, but it is different from the abstract Text interface.

interface TextNode <: Text {
  type: "TextNode";
}
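The difference between TextNode and the abstract Text interface shows up when a word contains something other than letters. In the TypeScript sketch below, every child has a `value` (the Text interface), but only some children have the type `TextNode`. The split of "don't" into three children is an assumption about how a parser might tokenize it, and the `NlcstText` and `NlcstWord` shapes are simplified for this example.

```typescript
// Simplified shapes, assumed for illustration only.
interface NlcstText {
  type: string;
  value: string;
}

interface NlcstWord {
  type: "WordNode";
  children: NlcstText[];
}

const contraction: NlcstWord = {
  type: "WordNode",
  children: [
    { type: "TextNode", value: "don" },
    { type: "PunctuationNode", value: "'" },
    { type: "TextNode", value: "t" }
  ]
};

// Every child implements the abstract Text interface, so all have a value.
const wordText = contraction.children.map((child) => child.value).join(""); // "don't"

// Only children whose type is "TextNode" hold actual content characters.
const lettersOnly = contraction.children
  .filter((child) => child.type === "TextNode")
  .map((child) => child.value)
  .join(""); // "dont"
```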



nlcst is built by people just like you! Check out the contributing guidelines for ways to get started.

This project has a Code of Conduct. By interacting with this repository, organisation, or community you agree to abide by its terms.

Want to chat with the community and contributors? Join us in Gitter!

Have an idea for a cool new utility or tool? That’s great! If you want feedback, help, or just to share it with the world, you can do so by creating an issue in the syntax-tree/ideas repository!


The initial release of this project was authored by @wooorm.

Thanks to @nwtn, @tmcw, @muraken720, and @dozoisch for contributing commits since!


CC-BY-4.0 © Titus Wormer