Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question] How to create grammar to parse unary post increment and decrement AST? #379

Closed
DanielMazurkiewicz opened this issue Mar 9, 2023 · 6 comments

Comments

@DanielMazurkiewicz
Copy link

Hey!

Got a little stuck while trying to parse to AST post inc(++) and dec(--). Wherever it seem logic to me to put expression_unary_post rule, once I do it compilation throws:

Error parsing grammar
Maximum call stack size exceeded

How to make it work? (assuming that you can pre and post increment anything)

start = precedence_11

precedence_11_operator = "+" / "-"
precedence_11
  = left:precedence_12 empty? operator:precedence_11_operator empty? right:precedence_11 {
        const { start, end } = location();
        return {
            type: "math",
            subtype: operator,
            left,
            right,
            start: start.offset, end: end.offset,
        };
    }
  / precedence_12

precedence_12_operator = "*" / "/"
precedence_12
  = left:precedence_13 empty? operator:precedence_12_operator empty? right:precedence_12 {
        const { start, end } = location();
        return {
            type: "math",
            subtype: operator,
            left,
            right,
            start: start.offset, end: end.offset,
        };
    }
  / precedence_13


precedence_13
  = left:expression_primary empty? "**" empty? right:precedence_13 {
        const { start, end } = location();
        return {
            type: "math",
            subtype: "**",
            left,
            right,
            start: start.offset, end: end.offset,
        };
    }
  / expression_primary



expression_unary_post_operator = "++" / "--"
expression_unary_post
  = left:expression_primary empty? operator:expression_unary_post_operator {
        const { start, end } = location();

        return {
            type: "post",
            subtype: operator,
            isUnary: true,
            left,
            right: null,
            start: start.offset, end: end.offset,
        };
    }


expression_group
  = sign:sign? "(" empty? operation:expression empty? ")" {
        const { start, end } = location();
        return {
            type: "group",
            sign,
            operation,
            start: start.offset, end: end.offset,
        };
    }


expression_unary_pre_operator = "!" / "~" / "++" / "--"
expression_unary_pre
  = operator:expression_unary_pre_operator empty? right:expression_primary {
        const { start, end } = location();
        let type;
        switch (operator) {
            case "--": type = "pre"; break;
            case "++": type = "pre"; break;
            case "!": type = "logic"; break;
            case "~": type = "bit"; break;
        }

        return {
            type,
            subtype: operator,
            isUnary: true,
            left: null,
            right,
            start: start.offset, end: end.offset,
        };
    }

expression_primary = 
    number / 
    expression_unary_pre /
    expression_group 
@Mingun
Copy link
Member

Mingun commented Mar 10, 2023

Tip: if you need only start and end offset, more performant solution is use range() function.

Typical solution to parse arithmetic expressions:

//{ The higher the priority, the larger the rule number.
Expr// assignment, minimal priority
  = l:Expr0 t:(p1 Expr0)* {return rightAssotiative(l, t);};
Expr0// value lists
  = h:(@Expr1 comma)* t:Expr1? {return list(h, t);};
Expr1// Logical OR
  = l:Expr2 t:(p2 Expr2)* {return leftAssotiative(l, t);};
Expr2// Logical AND
  = l:Expr3 t:(p3 Expr3)* {return leftAssotiative(l, t);};
Expr3// Bitwise OR
  = l:Expr4 t:(p4 Expr4)* {return leftAssotiative(l, t);};
Expr4// Bitwise XOR
  = l:Expr5 t:(p5 Expr5)* {return leftAssotiative(l, t);};
Expr5// bitwise AND
  = l:Expr6 t:(p6 Expr6)* {return leftAssotiative(l, t);};
Expr6// comparison operators
  = l:Expr7 t:(p7 Expr7)* {return leftAssotiative(l, t);};
Expr7// relational operators
  = l:Expr8 t:(p8 Expr8)* {return leftAssotiative(l, t);};
Expr8// bitwise shift
  = l:Expr9 t:(p9 Expr9)* {return leftAssotiative(l, t);};
Expr9// addition and substraction
  = l:Expr10 t:(p10 Expr10)* {return leftAssotiative(l, t);};
Expr10// multiplication, division, remainder
  = l:Expr11 t:(p11 Expr11)* {return leftAssotiative(l, t);};
Expr11// pow
  = l:Expr12 t:(p12 Expr12)* {return leftAssotiative(l, t);};
Expr12// Logical and bitwise NOT, unary operators, maximum priority
  = o:p13 r:Expr12 {return node({kind: o, expr: r});}
  / Expr13
  ;
Expr13// prefix increment / decrement
  = o:p14 r:Expr13 {return node({kind: o, expr: r});}
  / Expr14
  ;
Expr14// postfix increment / decrement
  = r:Expr15 o:p15 {return node({kind: o, expr: r});}
  / Expr15
  ;
Expr15// function call, indexation, dot syntax
  = /*l:Expr16 t:suffix* {return suffix(l, t);};
Expr16
  = funcdef
  / block
  / literal
  / constant
  / IfExpr
  / ForExpr
  / WhileExpr
  / BreakExpr
  / YieldExpr
  / ReturnExpr
  / ContinueExpr
  / define
  / name
  / */"(" @Expr ")"
  ;
//}

//{ Operators
comma = @',' _;

p1 = @('='/'+='/'-='/'*='/'/='/'%='/'<<='/'>>='/'&='/'^='/'|='/'&&='/'||=') _;
p2 = @'||'_;
p3 = @'&&'_;
p4 = @'|' _;
p5 = @'^' _;
p6 = @'&' _;
p7 = @('=='/'!=') _;
p8 = @('<='/'<'/'>='/'>') _;
p9 = @('<<'/'>>') _;
p10= @[+-] _;
p11= @[*/%]_;
p12= @'**' _;
p13= !p14 @[!~+-] _;
p14= @('++'/'--') _;
p15= p14;
//}
_ 'Whitespace' = [ \r\n\t]*;

Error parsing grammar
Maximum call stack size exceeded

This is bad. Could you post the full minimal reproducible example of such grammar? It could be bug in peggy looks like some infinity recursion was not detected

@DanielMazurkiewicz
Copy link
Author

DanielMazurkiewicz commented Mar 10, 2023

Thanks for tip and grammar pattern!

@DanielMazurkiewicz
Copy link
Author

DanielMazurkiewicz commented Mar 10, 2023

@Mingun

This is bad. Could you post the full minimal reproducible example of such grammar? It could be bug in peggy looks like some infinity recursion was not detected

Narrowed it, this is minimal, or close to minimal that causes maximum stack exceeded error:

start = expr* 

expr
  = expr "++"

@hildjj
Copy link
Contributor

hildjj commented Mar 21, 2023

@Mingun do you have a suggestion for how to catch this as an infinite loop?

@Mingun
Copy link
Member

Mingun commented Mar 21, 2023

These are unintended consequences of #160 -- because we now run checks of each stage without throwing exceptions, the AST in reportInfiniteRepetition pass can have infinity recursive rules which we try to check. The fast solution is to factor out the reportInfiniteRecursion pass to its own stage before the current check stage. The more clever solution is to tweak all passes in a check stage so that they will not assume that AST contains no recursive rules.

@hildjj
Copy link
Contributor

hildjj commented Feb 8, 2024

That was part of it. The other part is that the reportInfiniteRecursion recursion rule keeps going after it finds an error, in this case. I'm moving the rule into the new prepare phase, and sprinkling some checks around looking to see if we've gotten an error yet. This means that we will only find the first infinite recursion, but I'm fine with that in order to fix this class of bugs.

@hildjj hildjj closed this as completed in 2a6623f Feb 10, 2024
hildjj added a commit that referenced this issue Feb 10, 2024
hildjj added a commit to hildjj/peggy that referenced this issue Feb 13, 2024
* main: (107 commits)
  Version update.  Check npmignore.  Audit CHANGELOG.md.
  4.0.0
  Update dependencies final time before release.
  Fix indentation in one part of examples/javascript.pegjs.  Fixes peggyjs#445.
  Code review issues
  Add developer docs.  Fixes peggyjs#466.
  Add directories when they don't exist, rather than throwing an error.  Fixes peggyjs#440.
  Update changelog.  Include peggyjs#463 as well.
  Fix peggyjs#379.  Move reportInfiniteRecursion to prepare phase, ensure that it doesn't keep going when it finds an error.
  Typo: 'rutimes'
  Mark IE as explicitly unsupported
  Switch to flat eslint config.  Lint minified output for web compat.
  Code review nits
  Add tests, fix up fromMem to not need gross hack.
  Adds support for running tests against es module output.  Fixes peggyjs#399.
  Slight wording tweak.  Double-checked that peggyjs#415 is fixed.
  Fixes peggyjs#434
  Update bundles as well
  Fixes peggyjs#450
  Update dependencies, including new lint rules.  Move to simpler tsconfig so that eslint can easily find the correct file.  Fix small lint issues with new rules.
  ...
hildjj added a commit that referenced this issue Feb 13, 2024
* main: (107 commits)
  Version update.  Check npmignore.  Audit CHANGELOG.md.
  4.0.0
  Update dependencies final time before release.
  Fix indentation in one part of examples/javascript.pegjs.  Fixes #445.
  Code review issues
  Add developer docs.  Fixes #466.
  Add directories when they don't exist, rather than throwing an error.  Fixes #440.
  Update changelog.  Include #463 as well.
  Fix #379.  Move reportInfiniteRecursion to prepare phase, ensure that it doesn't keep going when it finds an error.
  Typo: 'rutimes'
  Mark IE as explicitly unsupported
  Switch to flat eslint config.  Lint minified output for web compat.
  Code review nits
  Add tests, fix up fromMem to not need gross hack.
  Adds support for running tests against es module output.  Fixes #399.
  Slight wording tweak.  Double-checked that #415 is fixed.
  Fixes #434
  Update bundles as well
  Fixes #450
  Update dependencies, including new lint rules.  Move to simpler tsconfig so that eslint can easily find the correct file.  Fix small lint issues with new rules.
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants