You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Compilation of EQL expressions involves several phases of analysis and transformation. The focus of
this issue is on the first 2 phases: tokenisation and Stage 0 transformation.
The result of method calls on EQL's fluent API is represented as a sequence of tokens
corresponding to the called methods, which may also be seen as a sequence of transitions between
states of an automaton manifested in the form of the fluent API. This phase corresponds to
the operation of a lexer -- lexical analysis.
The tokens are then subject to an analysis (namely, Stage 0 transformation) that groups them
according to the rules of EQL's grammar, thereby building an AST for the EQL expression. This
phase corresponds to syntactic analysis, to some extent, since it makes use of semantic analysis as well.
The following drawbacks have been identified in the course of work with this approach:
The tokenisation phase is not strictly necessary, since method calls on the fluent API (state
transitions) perform the role of a lexer implicitly: the called method directly corresponds to
the token. In other words, we get tokens for free.
As a consequence of the first issue, existence of a separate phase dedicated to syntactic
analysis (Phase 2) leads to duplication of the EQL's grammar rules which are already encoded in
the automaton that is the fluent API. This duplication occcurs since Phase 2 needs to know how
to build an AST out of incoming tokens.
Separate existence of two phases, coupled with duplication of grammar rules, requires strict
synchronisation between them which has its own disadvantages. One example is the current defect
involving caseWhen with multiple conditions: it is accepted by the fluent API but rejected
during Stage 0 transformation (Phase 2).
It is proposed to incorporate Phase 2 into Phase 1, effectively moving syntactic analysis into the
implementation of the fluent API. The result of method calls on the fluent API would then be an AST
for the EQL expression. Such an implementation would effectively act as an LR parser (with superpowers, due to the ability of semantic analysis).
Expected outcome
Simplified implementation of Stage 0 transformation due to elimination of grammar rules duplication and the need for synchronisation.
Elimination of potential errors that could arise because of lack of synchronisation between phases.
The text was updated successfully, but these errors were encountered:
After further discussion and an experimental spike it was concluded that the duplication of grammar rules is inherent to any parsing process. At the moment the only general approach seems to be the usage of a generator for the fluent API and the parser so that grammar rules have to be defined only once.
It was decided that it is better to focus on the construction of a formal grammar for EQL and improvements to the existing parser, until the possibility of fluent API and parser generation is better understood.
@01es, yes, and they should be captured in a separate issue. Among them:
The defect with caseWhen used in conjunction with a combination of conditions.
Insufficient separation of representations of token categories (the abstract syntax tree) and encountered values (the concrete syntax tree), which complicates informal reasoning.
Description
Compilation of EQL expressions involves several phases of analysis and transformation. The focus of
this issue is on the first 2 phases: tokenisation and Stage 0 transformation.
corresponding to the called methods, which may also be seen as a sequence of transitions between
states of an automaton manifested in the form of the fluent API. This phase corresponds to
the operation of a lexer -- lexical analysis.
according to the rules of EQL's grammar, thereby building an AST for the EQL expression. This
phase corresponds to syntactic analysis, to some extent, since it makes use of semantic analysis as well.
The following drawbacks have been identified in the course of work with this approach:
transitions) perform the role of a lexer implicitly: the called method directly corresponds to
the token. In other words, we get tokens for free.
analysis (Phase 2) leads to duplication of the EQL's grammar rules which are already encoded in
the automaton that is the fluent API. This duplication occcurs since Phase 2 needs to know how
to build an AST out of incoming tokens.
synchronisation between them which has its own disadvantages. One example is the current defect
involving
caseWhen
with multiple conditions: it is accepted by the fluent API but rejectedduring Stage 0 transformation (Phase 2).
It is proposed to incorporate Phase 2 into Phase 1, effectively moving syntactic analysis into the
implementation of the fluent API. The result of method calls on the fluent API would then be an AST
for the EQL expression. Such an implementation would effectively act as an LR parser (with superpowers, due to the ability of semantic analysis).
Expected outcome
The text was updated successfully, but these errors were encountered: