Handle groups and repeating groups #3

Closed
johnedquinn opened this issue Mar 29, 2023 · 2 comments

@johnedquinn
Owner

We need to handle groups and repeating groups:

expr
    : expr (PLUS | MINUS | ASTERISK) atomic --> arithmeticExpr
    | expr ( PERIOD expr )+ --> indexingExpr
    ;
@johnedquinn
Owner Author

Background

So far, I've been getting around this by manually writing out the expansion of * and + as separate rules.

For example, to accomplish:

expr: BRACKET_LEFT expr* BRACKET_RIGHT --> exprArray;
expr: INT --> exprInt;

The way around this is isolating the expr* into its own rule:

exprs: EPSILON --> emptyExprs;
exprs: exprs expr --> manyExprs;

expr: BRACKET_LEFT exprs BRACKET_RIGHT --> exprArray;
expr: INT --> exprInt;

Now, if arrays required at least one expression (expr+), the EPSILON alternative becomes a single expr:

exprs: expr --> singleExpr;
exprs: exprs expr --> manyExprs;

expr: BRACKET_LEFT exprs BRACKET_RIGHT --> exprArray;
expr: INT --> exprInt;

Similarly, we need to handle ?. I haven't done this yet, but an optional item should just become its own rule with two alternatives: the item itself, or EPSILON.
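As a rough sketch of the ? case described above (not how the generator actually does it), the rewrite could look like the following. The `Rule` tuple, the `expand_optional` function, the helper rule name, and the `empty_`/`some_` alias scheme are all hypothetical names made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class Rule(NamedTuple):
    """One production: lhs: rhs... --> alias (hypothetical model)."""
    lhs: str
    rhs: Tuple[str, ...]
    alias: str


def expand_optional(rule: Rule, index: int, helper: str) -> List[Rule]:
    """Replace the optional symbol at rhs[index] (written `x?`) with a helper rule.

    Returns the rewritten rule followed by the two helper alternatives:
    EPSILON (the item is absent) and the item itself (the item is present).
    """
    symbol = rule.rhs[index]
    if not symbol.endswith("?"):
        raise ValueError(f"{symbol!r} is not an optional symbol")
    inner = symbol[:-1]
    new_rhs = rule.rhs[:index] + (helper,) + rule.rhs[index + 1:]
    return [
        rule._replace(rhs=new_rhs),
        Rule(helper, ("EPSILON",), f"empty_{helper}"),
        Rule(helper, (inner,), f"some_{helper}"),
    ]
```

This mirrors the manual * and + expansions: the quantified item moves into its own rule, and the original rule only references that rule by name.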

@johnedquinn
Owner Author

In my opinion, we should introduce the concept of "generated", or "hidden", rules:

Groups

expr: (IDENT) --> exprIdent;

Should turn into:

expr: _0 --> exprIdent;

_0: IDENT --> _0_0 { generated: true; };

Zero or more (*)

expr: BRACKET_LEFT expr* BRACKET_RIGHT --> exprArray;
expr: INT --> exprInt;

Should turn into:

expr: BRACKET_LEFT _0 BRACKET_RIGHT --> exprArray;
expr: INT --> exprInt;

_0: EPSILON --> _0_0 { generated: true; };
_0: _0 expr --> _0_1 { generated: true; };

One or more (+)

expr: BRACKET_LEFT expr+ BRACKET_RIGHT --> exprArray;
expr: INT --> exprInt;

Should turn into:

expr: BRACKET_LEFT _0 BRACKET_RIGHT --> exprArray;
expr: INT --> exprInt;

_0: expr --> _0_0 { generated: true; };
_0: _0 expr --> _0_1 { generated: true; };

Optionals (?)

expr: SELECT expr FROM expr (AS IDENT)? --> exprSfw;
expr: IDENT --> exprIdent;

Should turn into:

expr: SELECT expr FROM expr _0 --> exprSfw;
expr: IDENT --> exprIdent;

_0: EPSILON --> _0_0 { generated: true; };
_0: AS IDENT --> _0_1 { generated: true; };

Alternatives (|)

expr: expr IS (INT | STR | expr) --> exprIsFunction;

Should turn into:

expr: expr IS _0 --> exprIsFunction;

_0: INT --> _0_0 { generated: true; };
_0: STR --> _0_1 { generated: true; };
_0: expr --> _0_2 { generated: true; };
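The five rewrites above follow one pattern, so here is a minimal sketch of a single pass that covers all of them. Everything in it is an assumption for illustration: the `Group`, `Rule`, and `Desugarer` types, the `render` helper, and the choice to number hidden rules with one counter shared across the whole grammar (each example above starts at _0 because it stands alone).

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Group:
    """A parenthesised (or quantified) item: alternatives plus "", "*", "+" or "?"."""
    alternatives: Tuple[Tuple["Item", ...], ...]
    quantifier: str = ""


Item = Union[str, Group]


@dataclass(frozen=True)
class Rule:
    lhs: str
    rhs: Tuple[str, ...]
    alias: str
    generated: bool = False


class Desugarer:
    def __init__(self) -> None:
        self.rules: List[Rule] = []
        self._counter = 0

    def add(self, lhs: str, items: Tuple[Item, ...], alias: str) -> None:
        slot = len(self.rules)  # keep the user rule ahead of its hidden rules
        rhs = self._flatten(items)
        self.rules.insert(slot, Rule(lhs, rhs, alias))

    def _flatten(self, items: Tuple[Item, ...]) -> Tuple[str, ...]:
        return tuple(self._hide(i) if isinstance(i, Group) else i for i in items)

    def _hide(self, group: Group) -> str:
        """Emit the hidden rules for one group and return the hidden rule's name."""
        name = f"_{self._counter}"
        self._counter += 1
        bodies = [self._flatten(alt) for alt in group.alternatives]
        if group.quantifier == "":
            variants = bodies
        elif group.quantifier == "?":
            variants = [("EPSILON",)] + bodies
        elif group.quantifier == "*":
            variants = [("EPSILON",)] + [(name,) + b for b in bodies]
        elif group.quantifier == "+":
            variants = bodies + [(name,) + b for b in bodies]
        else:
            raise ValueError(f"unknown quantifier {group.quantifier!r}")
        for i, rhs in enumerate(variants):
            self.rules.append(Rule(name, rhs, f"{name}_{i}", generated=True))
        return name


def render(rule: Rule) -> str:
    suffix = " { generated: true; };" if rule.generated else ";"
    return f"{rule.lhs}: {' '.join(rule.rhs)} --> {rule.alias}{suffix}"
```

For example, `expr*` would be passed as `Group((("expr",),), "*")` and `(AS IDENT)?` as `Group((("AS", "IDENT"),), "?")`. Nested groups are handled by the recursive `_flatten` call, with the inner group's hidden rules emitted before the outer group's.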

Generation & Parsing

Then, as part of the runtime, we can expose a GeneratedRuleNode, which acts as a temporary holder of children.

For code generation, the CreateNodes of generated rules can just create GeneratedRuleNodes.

Then, when we reduce during parsing, we can check whether any of the children popped from the stack are GeneratedRuleNodes. If they are, we add their children (recursively, in case those are GeneratedRuleNodes too) to the newly created node instead of the holders themselves.
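To make the reduce step concrete, here is a small sketch of the flattening, assuming a simple tree node type. Only the GeneratedRuleNode name comes from the proposal; `Node`, `splice`, and `create_node` are hypothetical stand-ins for whatever the runtime ends up exposing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class GeneratedRuleNode(Node):
    """Temporary holder of children for a generated (hidden) rule."""


def splice(nodes: List[Node]) -> List[Node]:
    """Replace every GeneratedRuleNode with its children, recursively."""
    flat: List[Node] = []
    for node in nodes:
        if isinstance(node, GeneratedRuleNode):
            flat.extend(splice(node.children))
        else:
            flat.append(node)
    return flat


def create_node(alias: str, popped: List[Node], generated: bool) -> Node:
    """Build the node for a reduction from the children popped off the stack."""
    cls = GeneratedRuleNode if generated else Node
    return cls(alias, splice(popped))
```

With the zero-or-more rules, parsing `[ 1 2 ]` would reduce `_0_0` once and `_0_1` twice, and the final `exprArray` node would end up with the brackets and both `exprInt` nodes as direct children, with no holder node left in the tree.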
