Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[plsql] Rework grammar for better maintainability #1500

Open
adangel opened this issue Dec 3, 2018 · 0 comments
Open

[plsql] Rework grammar for better maintainability #1500

adangel opened this issue Dec 3, 2018 · 0 comments
Labels
an:enhancement An improvement on existing features / rules in:grammar About the grammar of a lexer or parser, eg, a parse/lex exception
Milestone

Comments

@adangel
Copy link
Member

adangel commented Dec 3, 2018

Current problems of the plsql grammar:

  • Not really complying to oracle's plsql definition. The grammar is able to parse most plsql source files, but it might parse it wrongly (e.g. choosing wrong productions)
  • There is often a mixture of using specific AST nodes or image-strings to transport information. This image creation (via StringBuilders) makes the grammar hard to maintain and hard to read. E.g.
    ASTVariableOrConstantDeclarator VariableOrConstantDeclarator() :
    {
    PLSQLNode simpleNode = null ;
    StringBuilder sb = new StringBuilder();
    }
    {
    (
    simpleNode = VariableOrConstantDeclaratorId() { sb.append(simpleNode.getImage());}
    [LOOKAHEAD(2) <CONSTANT> {sb.append(" " + token.image);} ] simpleNode = Datatype() { sb.append(" " + simpleNode.getImage());}
    [[<NOT> {sb.append(" " + token.image);} ] <NULL> {sb.append(" " + token.image);} ]
    [ ( ":" "=" {sb.append(" :=");}| <_DEFAULT> {sb.append(" " + token.image);})
    simpleNode = VariableOrConstantInitializer() { sb.append(" " + simpleNode.getImage());}
    ]
    )
    { jjtThis.setImage(sb.toString()) ; return jjtThis ; }
    }
  • Since changing this will definitively change the resulting AST, most rules will need to be adjusted.

Comment from #1493 (comment) copied here:

Doesn't this production try to do too much? I mean dumping the whole operator to a string makes it harder to analyse later on, whereas reifying the structure of the operator into an AST node would avoid losing this information.
E.g.

RelationalExpression ::= 
    AdditiveExpression RelationalOperator AdditiveExpression EscapeClause?

RelationalOperator  ::= // a #void interface, to keep the AST flat
      ComparisonOperator
    | NegatedOperator
    | NegatableOperator

NegatedOperator ::=
   "NOT" NegatableOperator
     
NegatableOperator ::= // a #void interface
      BetweenOp   // These nodes would each correspond to their respective keyword
    | InOp
    | LikeOp
    | SetInclusionOp
    | MultisetOp

ComparisonOperator ::=
    "<>" | "<=" | ...

SetInclusionOp ::= 
    (MemberOp | SubmultisetOp) "OF"?

MultisetOp ::=
   "MULTISET" ("EXCEPT | "INTERSECT" | "UNION") [ "DISTINCT" | "ALL"]

EscapeClause ::=
  "ESCAPE" (CharLiteral | StringLiteral)

Having one node for each operator like

    | InOp
    | LikeOp

may seem verbose at first, but JJTree allows to write that inline as e.g.

      <IN>   #InOp
    | <LIKE> #LikeOp

instead of splitting this in very many productions. (I mention this because I think I've never come across this syntax anywhere in our grammars so maybe you're unaware of it.) This makes the AST structure very informative, but has the downside to increase the number of node classes significantly. If this is a problem, we may want to use a single common node type, e.g. ASTPlsqlKeyword, to represent any one-keyword AST node, with the image selecting the actual keyword used.

This is obviously out of the scope of this PR, and maybe to be left for 7.0.0, but something to think about anyway.
I can open an issue to track it separately if you think this change is worth it.

@adangel adangel mentioned this issue May 23, 2020
6 tasks
@oowekyala oowekyala added in:grammar About the grammar of a lexer or parser, eg, a parse/lex exception an:enhancement An improvement on existing features / rules labels Apr 26, 2021
@adangel adangel added this to the 7.x milestone Jan 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
an:enhancement An improvement on existing features / rules in:grammar About the grammar of a lexer or parser, eg, a parse/lex exception
Projects
None yet
Development

No branches or pull requests

2 participants