Natural Language Concrete Syntax Tree, used in @retextjs

NLCST

Natural Language Concrete Syntax Tree format.


Natural language for human and machine.

NLCST discloses the parts of natural language as a concrete syntax tree. Concrete means all information is stored in this tree and an exact replica of the original document can be re-created.
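As a rough illustration of what "concrete" means here, the sketch below rebuilds the source text of a small tree by concatenating the values of its leaves. The `NlcstNode` shape and the `toText` helper are hypothetical names made up for this example and are not part of the specification. The `value` and `children` fields are assumed from Unist, and the node type strings follow the definitions later in this document.

```typescript
// Hypothetical minimal node shape for this sketch (not the normative definition).
interface NlcstNode {
  type: string;
  value?: string;          // assumed to be present on text-like leaves
  children?: NlcstNode[];  // assumed to be present on parents
}

// Walk the tree depth first and join every leaf value in document order.
function toText(node: NlcstNode): string {
  if (node.children) {
    return node.children.map(toText).join("");
  }
  return node.value !== undefined ? node.value : "";
}

// A hand-written tree for the text "Hello, world!".
const tree: NlcstNode = {
  type: "RootNode",
  children: [{
    type: "ParagraphNode",
    children: [{
      type: "SentenceNode",
      children: [
        { type: "WordNode", children: [{ type: "TextNode", value: "Hello" }] },
        { type: "PunctuationNode", value: "," },
        { type: "WhiteSpaceNode", value: " " },
        { type: "WordNode", children: [{ type: "TextNode", value: "world" }] },
        { type: "PunctuationNode", value: "!" },
      ],
    }],
  }],
};

const rebuilt = toText(tree);
```

Because white space and punctuation are kept as nodes of their own, nothing from the input is lost on the way back to a string.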

NLCST is a subset of Unist and is implemented by retext.

This document may contain unreleased changes. See releases for released documents. The latest released version is 1.0.1.

Table of Contents

List of Utilities

In addition, see Unist for other utilities which work with retext nodes.

CST

Root

Root (Parent) houses all nodes.

interface Root <: Parent {
  type: "RootNode";
}

Paragraph

Paragraph (Parent) represents a self-contained unit of discourse in writing dealing with a particular point or idea.

interface Paragraph <: Parent {
  type: "ParagraphNode";
}

Sentence

Sentence (Parent) represents a grouping of grammatically linked words that in principle expresses a complete thought, although it may make little sense when taken in isolation, out of context.

interface Sentence <: Parent {
  type: "SentenceNode";
}

Word

Word (Parent) represents the smallest element that may be uttered in isolation with semantic or pragmatic content.

interface Word <: Parent {
  type: "WordNode";
}
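Since Word is a Parent, a single word can hold several children. The sketch below counts words in a sentence, treating each `WordNode` as one unit regardless of what it contains. The `NlcstNode` shape and `countWords` are hypothetical names for this example, and the way the contraction is split into children is an assumption chosen for illustration; a real parser may divide it differently.

```typescript
// Hypothetical minimal node shape for this sketch (not the normative definition).
interface NlcstNode {
  type: string;
  value?: string;
  children?: NlcstNode[];
}

// Count WordNodes. A word is counted once and its children are not descended into.
function countWords(node: NlcstNode): number {
  if (node.type === "WordNode") {
    return 1;
  }
  let total = 0;
  for (const child of node.children || []) {
    total += countWords(child);
  }
  return total;
}

// "Don't stop." with the contraction modelled as one word of three children (assumed).
const sentence: NlcstNode = {
  type: "SentenceNode",
  children: [
    {
      type: "WordNode",
      children: [
        { type: "TextNode", value: "Don" },
        { type: "PunctuationNode", value: "'" },
        { type: "TextNode", value: "t" },
      ],
    },
    { type: "WhiteSpaceNode", value: " " },
    { type: "WordNode", children: [{ type: "TextNode", value: "stop" }] },
    { type: "PunctuationNode", value: "." },
  ],
};

const wordCount = countWords(sentence);
```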

Symbol

Symbol (Text) represents typographical devices like white space, punctuation, signs, and more, as distinct from characters that represent sounds (like letters and numerals).

interface Symbol <: Text {
  type: "SymbolNode";
}

Punctuation

Punctuation (Symbol) represents typographical devices which aid understanding and correct reading of other grammatical units.

interface Punctuation <: Symbol {
  type: "PunctuationNode";
}

WhiteSpace

WhiteSpace (Symbol) represents typographical devices devoid of content, separating other grammatical units.

interface WhiteSpace <: Symbol {
  type: "WhiteSpaceNode";
}

Source

Source (Text) represents an external (ungrammatical) value embedded into a grammatical unit: a hyperlink, a line, and such.

interface Source <: Text {
  type: "SourceNode";
}
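One possible use of Source nodes is pulling embedded values, such as URLs, out of a tree without treating them as language. The sketch below collects the value of every `SourceNode`. The `NlcstNode` shape and `collectSources` are hypothetical names for this example, and whether a given parser emits a URL as a Source node is an assumption here.

```typescript
// Hypothetical minimal node shape for this sketch (not the normative definition).
interface NlcstNode {
  type: string;
  value?: string;
  children?: NlcstNode[];
}

// Gather the values of all SourceNodes in document order.
function collectSources(node: NlcstNode): string[] {
  if (node.type === "SourceNode") {
    return node.value !== undefined ? [node.value] : [];
  }
  let found: string[] = [];
  for (const child of node.children || []) {
    found = found.concat(collectSources(child));
  }
  return found;
}

// "Visit https://example.com." with the URL modelled as a Source (assumed).
const linkSentence: NlcstNode = {
  type: "SentenceNode",
  children: [
    { type: "WordNode", children: [{ type: "TextNode", value: "Visit" }] },
    { type: "WhiteSpaceNode", value: " " },
    { type: "SourceNode", value: "https://example.com" },
    { type: "PunctuationNode", value: "." },
  ],
};

const sources = collectSources(linkSentence);
```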

TextNode

TextNode (Text) represents actual content in an NLCST document: one or more characters. Note that its type property is TextNode, but it is different from the abstract Text interface.

interface TextNode <: Text {
  type: "TextNode";
}
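The interfaces above could be mirrored in TypeScript as a discriminated union keyed on `type`. This is only a sketch under stated assumptions: the names `UnistText`, `UnistParent`, `AnyNlcstNode`, and `isParent` are hypothetical, the `value` and `children` fields are taken from Unist, and the inheritance between node kinds (for example Punctuation extending Symbol) is flattened for brevity. Which children each parent may contain is not modelled, since this document does not spell it out.

```typescript
// Assumed from Unist: text-like nodes carry `value`, parents carry `children`.
interface UnistText<T extends string> { type: T; value: string }
interface UnistParent<T extends string> { type: T; children: AnyNlcstNode[] }

// Leaves.
type TextNode = UnistText<"TextNode">;
type SymbolNode = UnistText<"SymbolNode">;
type PunctuationNode = UnistText<"PunctuationNode">;
type WhiteSpaceNode = UnistText<"WhiteSpaceNode">;
type SourceNode = UnistText<"SourceNode">;

// Parents.
type WordNode = UnistParent<"WordNode">;
type SentenceNode = UnistParent<"SentenceNode">;
type ParagraphNode = UnistParent<"ParagraphNode">;
type RootNode = UnistParent<"RootNode">;

type ParentNode = WordNode | SentenceNode | ParagraphNode | RootNode;
type AnyNlcstNode =
  | TextNode | SymbolNode | PunctuationNode | WhiteSpaceNode | SourceNode
  | ParentNode;

// Type guard: narrows a node to one of the parent kinds.
function isParent(node: AnyNlcstNode): node is ParentNode {
  return "children" in node;
}

const word: AnyNlcstNode = {
  type: "WordNode",
  children: [{ type: "TextNode", value: "tree" }],
};
const space: AnyNlcstNode = { type: "WhiteSpaceNode", value: " " };
```

With the union in place, a `switch` on `node.type` lets the compiler check that every node kind is handled.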

Related

Contribute

nlcst is built by people just like you! Check out contributing.md for ways to get started.

This project has a Code of Conduct. By interacting with this repository, organisation, or community you agree to abide by its terms.

Want to chat with the community and contributors? Join us in Gitter!

Have an idea for a cool new utility or tool? That’s great! If you want feedback, help, or just to share it with the world you can do so by creating an issue in the syntax-tree/ideas repository!

Acknowledgments

The initial release of this project was authored by @wooorm.

Thanks to @nwtn, @tmcw, @muraken720, and @dozoisch for contributing commits since!

License

CC-BY-4.0 © Titus Wormer