Simplify grammar #22

QuarticCat · 2023-05-29T14:39:19Z

Here are some possible ergonomic improvements.

- Set active as default so that users only need to mark silent rules. For silent rules, we can use a special character or naming style to keep them clean.
- ~~Combine lexer definitions and tokens.~~ (implemented in "unify lexical def and lexical token" #56)
  1. ~~To ensure all tokens are terminal, we only need to check that the reference graph is a DAG, and then inline all rules.~~
  2. ~~To avoid generating extra lexers, we can delay lexer generation until after parser generation, and inline and generate lexers on demand.~~
- ~~Combine parser definitions and fixpoints.~~ (implemented in "Auto infer fixpoints" #47)
  1. ~~We may automatically infer fixpoints. A possible algorithm is to find cycles in the reference graph and then mark all vertices in cycles as fixpoints.~~
- Remove `~` (sequence operator). Instead of writing `e1 ~ e2`, we can simply write `e1 e2`.
- Ad-hoc lexical rules. For example, `"(" ~ sexprs ~ ")"`.
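
The cycle-based idea in the list above (mark every vertex that lies on a cycle as a fixpoint, and treat a lexer reference graph as a DAG when no such vertex exists) could be sketched roughly as follows. This is only an illustration, not the code from #47 or #56: the `Graph` alias, the function names, and the representation of rules as strings are all assumptions made for the example. A real implementation would more likely use a strongly connected components pass instead of one search per vertex.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical reference graph: rule name -> names of the rules it references.
type Graph<'a> = BTreeMap<&'a str, Vec<&'a str>>;

/// True if `target` can be reached from `from` by following at least one edge.
fn reaches(graph: &Graph<'_>, from: &str, target: &str) -> bool {
    let mut seen = BTreeSet::new();
    // Start from the direct references of `from`, so a path of length zero does not count.
    let mut stack: Vec<&str> = graph.get(from).cloned().unwrap_or_default();
    while let Some(node) = stack.pop() {
        if node == target {
            return true;
        }
        // Expand each vertex at most once.
        if seen.insert(node) {
            if let Some(next) = graph.get(node) {
                stack.extend(next.iter().copied());
            }
        }
    }
    false
}

/// Every rule that lies on some cycle, i.e. every rule that can reach itself.
/// Under the proposal these would be the inferred fixpoints.
fn rules_on_cycles<'a>(graph: &Graph<'a>) -> BTreeSet<&'a str> {
    graph
        .keys()
        .copied()
        .filter(|&rule| reaches(graph, rule, rule))
        .collect()
}

/// The reference graph is a DAG exactly when no rule lies on a cycle.
fn is_dag(graph: &Graph<'_>) -> bool {
    rules_on_cycles(graph).is_empty()
}
```

For a graph where `sexp` references `atom` and `sexprs`, and `sexprs` references `sexp`, this sketch would mark `sexp` and `sexprs` as fixpoints and leave `atom` alone.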


SchrodingerZhu · 2023-06-06T16:53:43Z

We are thinking of extending our system so that parsers can output arbitrary data types, not only trees.

However, this makes it difficult to apply tail-call optimisation (TCO), so it is still not clear to me how the design should go.

SchrodingerZhu · 2023-06-06T16:59:50Z

It seems to me that we can separate rules into two parts (not counting offset and src):

- A negative rule that accepts a `&mut Consumer` and returns `Result<(), Error>`. This can be tail-call optimised. The `&mut Consumer` can be, for example, a `&mut Vec<T>`.
- A positive rule that accepts nothing and returns `Result<T, Error>`.
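
To make the two shapes concrete, here is a minimal sketch of what such rules might look like. Everything beyond the signatures quoted above is an assumption for illustration: the `Consumer` trait and its `consume` method, the `Error` struct, the rule names `digit` and `digits`, and the way `src` and `offset` are threaded through. Note also that Rust does not guarantee tail-call elimination, so the tail call here only shows where the optimisation could apply.

```rust
/// Hypothetical error type; the thread only names it `Error`.
#[derive(Debug, PartialEq)]
struct Error {
    offset: usize,
}

/// Hypothetical sink for parse results.
trait Consumer<T> {
    fn consume(&mut self, item: T);
}

/// `Vec<T>` is one possible consumer, as suggested in the thread.
impl<T> Consumer<T> for Vec<T> {
    fn consume(&mut self, item: T) {
        self.push(item);
    }
}

/// "Positive" rule: takes no consumer and returns the parsed value directly.
fn digit(src: &[u8], offset: &mut usize) -> Result<u8, Error> {
    match src.get(*offset) {
        Some(&b) if b.is_ascii_digit() => {
            *offset += 1;
            Ok(b - b'0')
        }
        _ => Err(Error { offset: *offset }),
    }
}

/// "Negative" rule: writes results into the consumer and returns `Result<(), Error>`.
/// Since nothing has to be built after the recursive call returns, the call sits in
/// tail position.
fn digits<C: Consumer<u8>>(src: &[u8], offset: &mut usize, out: &mut C) -> Result<(), Error> {
    if *offset >= src.len() {
        return Ok(());
    }
    let d = digit(src, offset)?;
    out.consume(d);
    digits(src, offset, out)
}
```

Calling `digits` on the input `b"123"` with an empty `Vec<u8>` as the consumer would leave `[1, 2, 3]` in the vector, while the positive rule `digit` has to hand its value back to the caller before anything else can happen.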

However, it is not clear what will happen when we need to expand actions. We also need to figure out a way to specify such rules properly.

SchrodingerZhu added the "help wanted" label Jun 6, 2023

QuarticCat self-assigned this Jun 12, 2023
