This repository has been archived by the owner on Aug 10, 2022. It is now read-only.

new_luif.md


Suggested LUIF syntax, 3 June 2015

This is for a seamless grammar -- a lexer with a single lexeme. Lexers allow semantics in Kollos, so this does everything.

seamless l0.E
   (number) -> number
   || (E ws? '*' ws? E) -> E:1*E:2
   || (E ws? '+' ws? E) -> E:1+E:2
token ws ([\009\010\013\032]) -> nil
token l0.number ([%d]+)
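To make the intended result concrete, here is a minimal Python sketch of what the grammar above is meant to compute. It is not Kollos or Marpa: it is a hand-written precedence-climbing evaluator, and the names `evaluate`, `NUMBER`, `WS` and `PRECEDENCE` are hypothetical. It assumes that `*` binds tighter than `+` (following the order of the `||` alternatives) and that both operators are left-associative; neither assumption is stated explicitly in this note.

```python
import re

# Hypothetical model of the seamless grammar; NOT the Kollos implementation.
NUMBER = re.compile(r"[0-9]+")      # token l0.number ([%d]+)
WS = re.compile(r"[\t\n\r ]?")      # ws? -- at most one of [\009\010\013\032]
# Tighter-binding operators get higher numbers, mirroring the || ordering.
PRECEDENCE = {"+": 1, "*": 2}


def evaluate(text):
    """Evaluate text the way the example grammar's semantics would."""
    value, pos = _parse(text, 0, 1)
    if pos != len(text):
        raise ValueError(f"unexpected input at offset {pos}")
    return value


def _parse(text, pos, min_prec):
    match = NUMBER.match(text, pos)
    if match is None:
        raise ValueError(f"number expected at offset {pos}")
    left, pos = int(match.group()), match.end()    # (number) -> number
    while True:
        op_pos = WS.match(text, pos).end()         # optional ws before operator
        op = text[op_pos:op_pos + 1]
        if PRECEDENCE.get(op, 0) < min_prec:
            return left, pos                       # leave any ws unconsumed
        rhs_pos = WS.match(text, op_pos + 1).end() # optional ws after operator
        # Assumed left associativity: the right operand must bind tighter.
        right, pos = _parse(text, rhs_pos, PRECEDENCE[op] + 1)
        left = left * right if op == "*" else left + right
```

Under these assumptions `evaluate("2+3*4")` yields 14 and `evaluate("2 * 3 + 4")` yields 10. Because `ws?` allows at most one whitespace character, and only around an operator, inputs such as `"1  + 2"` or `"1 "` are rejected.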

Guide to the syntax

seamless and token are keywords. seamless indicates the top of a seamless grammar -- one that is lexical, with only one lexeme. Lexers in Kollos allow semantics, and the semantics of a seamless grammar will usually be important.

Rules are in the form

lhs (rhs)

They are treated as functions, a la recursive descent and/or Perl 6. But the implementation is Marpa, and lhs (rhs) is equivalent to lhs ::= rhs. For example, unlike functions, there can be more than one rule with the same "lhs".
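The equivalence can be sketched in a few lines of Python. This is only an illustration: `to_bnf` and `group_by_lhs` are hypothetical helpers, and the `item` rules are an invented example, not part of any grammar in this note. The point is that two rules with the same LHS accumulate as alternatives instead of the second definition replacing the first, as it would with ordinary functions.

```python
from collections import defaultdict


def to_bnf(rules):
    """Render (lhs, rhs) pairs, one per `lhs (rhs)` rule, as `lhs ::= rhs`."""
    return [f"{lhs} ::= {rhs}" for lhs, rhs in rules]


def group_by_lhs(rules):
    """Collect every RHS under its LHS; repeated LHSs add alternatives."""
    grouped = defaultdict(list)
    for lhs, rhs in rules:
        grouped[lhs].append(rhs)
    return dict(grouped)


# Hypothetical rules, written in LUIF as `item (number)` and `item (name)`.
rules = [("item", "number"), ("item", "name")]
print(to_bnf(rules))        # two separate BNF rules for the same LHS
print(group_by_lhs(rules))  # both RHSs survive under `item`
```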

A precedenced rule is written by separating its RHS alternatives with ||, as is done in the SLIF.

Semantics is Lua code, either in curly braces ({}) or preceded by a "does" operator (->). The curly brace form is not shown in the example grammar. When a "does" operator is used, what follows it must be a single Lua expression.

-> E:1 * E:2

is the same as

{ return E:1 * E:2 }

The E:1 and E:2 variables refer to the values of the RHS symbols, that is, the child values. E:1 is the value of the first instance of E on the RHS, and E:2 is the value of the second. Currently E+E would mean E:1+E:1, but it might be nice to have it mean E:1+E:2 -- that is, if there is more than one E on the RHS, each E in the semantics corresponds to the RHS instances of E in order. If there are more instances of E in the semantics than there are on the RHS of the rule, the last RHS instance is repeated.
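The two numbering rules for bare references can be compared with a small Python sketch. The function `number_bare_refs` is hypothetical and works on the semantics as plain text; a real implementation would presumably work on parsed Lua. References that already carry an explicit index are left untouched, and how explicit and bare references should mix is not specified in this note.

```python
import re


def number_bare_refs(semantics, symbol, rhs, proposed=True):
    """Rewrite bare uses of `symbol` in `semantics` as `symbol:N`.

    rhs      -- list of RHS symbols of the rule
    proposed -- True: number the uses in order, repeating the last RHS
                instance; False: current behaviour, every use is `symbol:1`
    """
    instances = rhs.count(symbol)
    if instances == 0:
        raise ValueError(f"{symbol} does not occur on the RHS")
    # A bare use is the symbol as a whole word, not followed by `:digit`.
    pattern = re.compile(rf"\b{re.escape(symbol)}\b(?!:\d)")
    seen = 0

    def replace(match):
        nonlocal seen
        seen += 1
        index = min(seen, instances) if proposed else 1
        return f"{symbol}:{index}"

    return pattern.sub(replace, semantics)


rhs = ["E", "ws?", "'+'", "ws?", "E"]
print(number_bare_refs("E+E", "E", rhs))                  # E:1+E:2
print(number_bare_refs("E+E", "E", rhs, proposed=False))  # E:1+E:1
print(number_bare_refs("E+E+E", "E", rhs))                # E:1+E:2+E:2
```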

Where the LHS is qualified -- for example, l0.E -- that indicates symbol E in the l0 grammar. The LUIF will allow several grammars to be defined at once. Qualifying the LHS with a grammar name affects the entire rule. If no grammar name is specified, the last one explicitly specified is used.
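The defaulting rule can be sketched as follows. `resolve_grammars` is a hypothetical helper that takes the LHS of each rule in source order. Raising an error when an unqualified LHS appears before any grammar has been named is an assumption of the sketch, since this note does not cover that case.

```python
def resolve_grammars(lhs_list):
    """Return (grammar, symbol) for each LHS, applying the defaulting rule."""
    current = None  # last grammar name that was given explicitly
    resolved = []
    for lhs in lhs_list:
        grammar, dot, symbol = lhs.rpartition(".")
        if dot:
            current = grammar  # an explicit qualifier becomes the default
        elif current is None:
            # Assumption: the note does not define this case.
            raise ValueError(f"no grammar specified before {lhs!r}")
        resolved.append((current, symbol))
    return resolved


# The LHSs of the example grammar, in order: `ws` inherits `l0`.
print(resolve_grammars(["l0.E", "ws", "l0.number"]))
```

For the example grammar this places all three symbols (E, ws and number) in the l0 grammar.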

Rough SLIF equivalent

There is no exact equivalent in the SLIF, but this grammar is sort of "in the same spirit":

   E ::= Number
       || E '*' E action => do_multiply
       || E '+' E action => do_add
   Number ~ [\d]+

   :discard ~ whitespace
   whitespace ~ [\s]+