tree-sitter-lux is a Tree-sitter grammar for the Lux language. It is based on this syntax document.
npm install tree-sitter tree-sitter-lux
A basic Node.js script might look like this:
const Parser = require('tree-sitter');
const Lux = require('tree-sitter-lux');
const parser = new Parser();
parser.setLanguage(Lux);
const sourceCode = '(+ 1 1)';
const tree = parser.parse(sourceCode);
console.log(tree.rootNode.toString());
This produces the following syntax tree:
(lux
(form
(identifier)
(natural)
(natural)))
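To work with the tree programmatically rather than as a printed string, a recursive walk over each node's named children is usually enough. The sketch below assumes only the `type` and `namedChildren` properties of a syntax node; `outline` is a hypothetical helper, not part of tree-sitter-lux, and the `root` object is a hand-built stand-in with the same shape as the tree above.

```javascript
// Hypothetical helper, not part of tree-sitter-lux: collects one line per
// named node, indented by depth. It relies only on the `type` and
// `namedChildren` properties of a syntax node.
function outline(node, depth = 0, lines = []) {
  lines.push('  '.repeat(depth) + node.type);
  for (const child of node.namedChildren) {
    outline(child, depth + 1, lines);
  }
  return lines;
}

// Stand-in object shaped like the tree for '(+ 1 1)'.
const leaf = (type) => ({ type, namedChildren: [] });
const root = {
  type: 'lux',
  namedChildren: [
    { type: 'form', namedChildren: [leaf('identifier'), leaf('natural'), leaf('natural')] },
  ],
};

console.log(outline(root).join('\n'));
```

With a real parser, `outline(tree.rootNode)` should work the same way on the `tree` from the script above.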
Currently the grammar recognizes all the basic Lux literals comment, bit, natural, integer, revolution, fraction, text, identifier, tag, form, tuple and record.
Recognizing definitions, anonymous functions and modules
The node types in the abstract syntax tree generated by tree-sitter correspond to Lux syntax tokens.
Additional meaning that is derived from those syntax tokens, e.g. that (def: x Int +1) is a definition, might be encoded using fields on the node.
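Until such fields exist, a consumer of the tree can derive this meaning itself. The following is a minimal sketch, assuming that the head `def:` is parsed as an `identifier` node and that its source text is available through the node's `text` property; `isDefinition` and the stand-in objects are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: true when a `form` node looks like `(def: ...)`.
// Assumption: the grammar parses the head `def:` as an `identifier`.
function isDefinition(node) {
  if (node.type !== 'form') return false;
  // Skip any leading comments before the head of the form.
  const head = node.namedChildren.find((child) => child.type !== 'comment');
  return head !== undefined && head.type === 'identifier' && head.text === 'def:';
}

// Stand-in objects shaped like the nodes for (def: x Int +1) and (+ 1 2).
const defForm = {
  type: 'form',
  namedChildren: [
    { type: 'identifier', text: 'def:' },
    { type: 'identifier', text: 'x' },
    { type: 'identifier', text: 'Int' },
    { type: 'integer', text: '+1' },
  ],
};
const sumForm = {
  type: 'form',
  namedChildren: [
    { type: 'identifier', text: '+' },
    { type: 'natural', text: '1' },
    { type: 'natural', text: '2' },
  ],
};

console.log(isDefinition(defForm), isDefinition(sumForm));
```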
The top level node type is always lux.
Children of the lux node are of one of the following types:
Recognizes comments, e.g. ## this is a comment.
Recognizes bits, e.g. #0 and #1.
Recognizes naturals, e.g. 123.
Recognizes integers, e.g. +123 and -456.
Recognizes revolutions, e.g. .123.
Recognizes fractions, e.g. +123.456.
Recognizes text, e.g. "text".
Recognizes identifiers, e.g. identifier, prefix.identifier, or ..identifier.
Recognizes tags, e.g. #tag.
Recognizes forms, e.g. (+ 1 2).
This example produces the following syntax tree:
(lux
(form
(identifier)
(natural)
(natural)))
Children of form nodes can be of any of the top level types.
Recognizes tuples, e.g. [a +2 "c"].
This example produces the following syntax tree:
(lux
(tuple
(identifier)
(integer)
(text)))
Children of tuple nodes can be of any of the top level types.
Recognizes records, e.g. {#a b "c" 4}.
This example produces the following syntax tree:
(lux
(record
(pair (tag) (identifier))
(pair (text) (natural))))
Children of record nodes can be of type comment or pair.
Recognizes pairs of syntax tokens, but only inside records, e.g. #a b inside {#a b}.
Children of pair nodes can be of any of the top level types.
Don't assume that there are exactly two children inside a pair. There might be a comment between the key and value. However, you can assume that there are always exactly two non-comment nodes inside a pair. Otherwise there would be an error.
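One way to follow this advice is to filter out comment nodes before reading the key and value. The sketch below assumes comments appear among a pair's `namedChildren` with type `comment`; `pairEntry` is a hypothetical helper and `examplePair` is a hand-built stand-in for a real `pair` node.

```javascript
// Hypothetical helper: returns the key and value nodes of a `pair`,
// ignoring any comment nodes that sit between or around them.
function pairEntry(pair) {
  const parts = pair.namedChildren.filter((child) => child.type !== 'comment');
  if (parts.length !== 2) {
    // Should not happen for a well-formed pair, but guards against ERROR nodes.
    throw new Error(`expected 2 non-comment children, found ${parts.length}`);
  }
  const [key, value] = parts;
  return { key, value };
}

// Stand-in shaped like a pair with a comment between the key and the value.
const examplePair = {
  type: 'pair',
  namedChildren: [
    { type: 'tag', text: '#a' },
    { type: 'comment', text: '## the value follows' },
    { type: 'identifier', text: 'b' },
  ],
};

const entry = pairEntry(examplePair);
console.log(entry.key.text, entry.value.text);
```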
Anything that is not recognized as valid Lux syntax will be encoded by a node of type ERROR or MISSING.
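A caller that wants to report such problems can walk the whole tree and collect the offending nodes. This is a sketch under two assumptions worth checking against the installed binding: error nodes have the type string `ERROR`, and missing nodes are flagged through `isMissing`, which may be a property or a method depending on the version of the Node.js binding. `collectProblems` and `brokenTree` are hypothetical names.

```javascript
// Reads the missing flag whether the binding exposes it as a property or a method.
function isMissing(node) {
  return typeof node.isMissing === 'function' ? node.isMissing() : Boolean(node.isMissing);
}

// Hypothetical helper: depth-first collection of ERROR and missing nodes.
// Uses `children` rather than `namedChildren` so anonymous nodes are visited too.
function collectProblems(node, found = []) {
  if (node.type === 'ERROR' || isMissing(node)) {
    found.push(node);
  }
  for (const child of node.children) {
    collectProblems(child, found);
  }
  return found;
}

// Stand-in tree containing one ERROR node and one node flagged as missing.
const brokenTree = {
  type: 'lux',
  children: [
    {
      type: 'form',
      children: [
        { type: 'identifier', children: [] },
        { type: 'ERROR', children: [] },
      ],
    },
    { type: 'natural', isMissing: true, children: [] },
  ],
};

console.log(collectProblems(brokenTree).map((node) => node.type));
```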