Consider a scenario where we want strings surfaced with their quotation marks removed, so that "hello" or 'hello' reaches the target language as a STRING token with value hello, rather than with the quotes still attached:
fragment SingleQuote: '\'';
fragment DoubleQuote: '"';
STRING
    : SingleQuote ~[']*? SingleQuote   // ANTLR 4 cannot negate a rule reference, so the set is spelled out
    | DoubleQuote ~["]*? DoubleQuote
    ;
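For comparison, here is a minimal sketch of the action-based workaround available today. It assumes the Java target: `setText` and `getText` are methods of the Java runtime's `Lexer` class, and other targets would need their own equivalent. The rule body is an illustration, not taken from any existing grammar.

```antlr
// Sketch only: strips the surrounding quotes with a Java-target action.
STRING
    : ( '\'' ~[']* '\''
      | '"' ~["]* '"'
      )
      // Runs after the whole token is matched; drops the first and last character.
      { setText(getText().substring(1, getText().length() - 1)); }
    ;
```

The embedded action ties the grammar to a single target language, which is the limitation this request is about.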
You'd currently need language-specific actions or careful use of hidden/skip to pull this off, and it's not especially intuitive when doing so. I'd like to suggest a couple of possible alternatives that are language-independent:
<tokenname>=(term)
Antlr will currently warn about a collision if you do STRING=... so I used angled brackets to make it more distinctive, and I thought requiring the parens added to that clarity, but I can live without them.
Replacement
A gross simplification of https://github.com/antlr/antlr4/blob/master/doc/faq/lexical.md becomes possible by allowing a second equal sign for literal substitution:
becomes:
Alternative syntax: (lit "AS" sub)
Fairly succinct and clean, with a special case for discarding tokens by reducing them to the empty string ('').
Passthru/Inline
The first syntax could also allow for passthru, a case where the user wants a named production but doesn't want it to appear in the parse tree:
SQ_STRING and DQ_STRING would be entirely inlined to STRING, making them transparent (and in languages like Python reducing the function call overhead).
The second syntax could also do this:
STRING : (SQ_STRING | DQ_STRING) AS STRING // 'as' term must match token name.
Nota Bene
Syntax candidates provided without prejudice, one demonstrating an extrapolation of current antlr syntax and the other probably triggered by a neuron that knows sql.