Refactoring of grammar literals parsing #3780

KvanTTT · 2022-07-09T11:02:04Z

The code for parsing grammar string literals ('abc') and grammar char sets ([abc]) had a lot in common, so I unified it. I extracted the functionality into GrammarLiteralParser, which is responsible for parsing both string literals and char sets. It has the following methods:

parseStringFromStringLiteral
parseCharFromStringLiteral (returns CharParseResult)
parseChar (returns CharParseResult)
parseNextChar (returns CharParseResult)

CharParseResult can be INVALID, CODE_POINT, or PROPERTY. PROPERTY is supported only by char sets.
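To make the three result kinds concrete, here is a minimal sketch of how such a result type could look. It is not the code from this PR: the field names, the factory methods, the `inCharSet` flag, and the small set of escapes handled are all assumptions made for illustration.

```java
final class CharParseResultSketch {
    enum Kind { INVALID, CODE_POINT, PROPERTY }

    /** Outcome of reading one logical character; all member names are hypothetical. */
    static final class CharParseResult {
        final Kind kind;
        final int codePoint;        // meaningful only when kind == CODE_POINT
        final String propertyName;  // meaningful only when kind == PROPERTY
        final int length;           // number of UTF-16 units consumed

        private CharParseResult(Kind kind, int codePoint, String propertyName, int length) {
            this.kind = kind;
            this.codePoint = codePoint;
            this.propertyName = propertyName;
            this.length = length;
        }

        static CharParseResult invalid(int length) {
            return new CharParseResult(Kind.INVALID, -1, null, length);
        }

        static CharParseResult ofCodePoint(int cp, int length) {
            return new CharParseResult(Kind.CODE_POINT, cp, null, length);
        }

        static CharParseResult ofProperty(String name, int length) {
            return new CharParseResult(Kind.PROPERTY, -1, name, length);
        }
    }

    /** Reads one character starting at pos; \p{...} is accepted only when inCharSet is true. */
    static CharParseResult parseNextChar(String s, int pos, boolean inCharSet) {
        int c = s.codePointAt(pos);
        if (c != '\\') {
            return CharParseResult.ofCodePoint(c, Character.charCount(c));
        }
        if (pos + 1 >= s.length()) {
            return CharParseResult.invalid(1); // dangling backslash
        }
        switch (s.charAt(pos + 1)) {
            case 'n': return CharParseResult.ofCodePoint('\n', 2);
            case 't': return CharParseResult.ofCodePoint('\t', 2);
            case '\\': return CharParseResult.ofCodePoint('\\', 2);
            case 'p': {
                int close = s.indexOf('}', pos + 2);
                boolean wellFormed = inCharSet && close > pos + 3 && s.charAt(pos + 2) == '{';
                if (!wellFormed) {
                    return CharParseResult.invalid(2);
                }
                return CharParseResult.ofProperty(s.substring(pos + 3, close), close - pos + 1);
            }
            default: return CharParseResult.invalid(2);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseNextChar("\\n", 0, false).kind);     // escape in a string literal
        System.out.println(parseNextChar("\\p{Lu}", 0, true).kind);  // property inside a char set
        System.out.println(parseNextChar("\\p{Lu}", 0, false).kind); // property inside a string literal
    }
}
```

In this sketch, rejecting `\p{...}` outside a char set is what "PROPERTY is supported only by char sets" amounts to: the same parsing routine serves both contexts and only the flag differs.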

Also, I untangled the char-processing functionality of the ANTLR tool from that of the ANTLR runtime.

KvanTTT · 2022-07-09T11:09:10Z

@parrt it's also ready. Tool tests will be fine after #3779

parrt · 2022-07-09T16:46:37Z

Hmm...pretty risky modifying the ATN construction...

parrt · 2022-07-09T16:53:57Z

runtime/Java/src/org/antlr/v4/runtime/misc/IntervalSet.java

-				if ( a==Token.EOF ) buf.append("<EOF>");
-				else if ( elemAreChar ) buf.append("'").appendCodePoint(a).append("'");
-				else buf.append(a);
+				if ( a==Token.EOF ) {


Seems like a lot of cosmetic changes here, and flipping of strings to characters for no real gain in performance. Yet it costs me mental effort to review.

KvanTTT · 2022-07-12

It's not only a cosmetic change but a clarity change as well. toString should print a clear representation of the object instead of raw strings. In the old version, a newline prints as a quote, a literal line break, and another quote:

'
'

In the new version it prints '\n'. I think that's the more expected behavior.
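The difference can be sketched as follows. This is not the IntervalSet code from the PR: `quoteRaw` mirrors the removed diff line above, while `quote` and the particular set of escapes it handles are hypothetical and chosen only to illustrate the idea.

```java
final class CodePointQuoting {
    /** Old style, as in the removed diff line: the raw code point between quotes. */
    static String quoteRaw(int codePoint) {
        return new StringBuilder("'").appendCodePoint(codePoint).append("'").toString();
    }

    /** Hypothetical escaped style: control characters are rendered as visible escapes. */
    static String quote(int codePoint) {
        StringBuilder buf = new StringBuilder("'");
        switch (codePoint) {
            case '\n': buf.append("\\n"); break;
            case '\r': buf.append("\\r"); break;
            case '\t': buf.append("\\t"); break;
            case '\'': buf.append("\\'"); break;
            case '\\': buf.append("\\\\"); break;
            default:
                if (Character.isISOControl(codePoint)) {
                    // Other control characters become \\uXXXX-style escapes.
                    buf.append(String.format("\\u%04X", codePoint));
                } else {
                    buf.appendCodePoint(codePoint);
                }
        }
        return buf.append("'").toString();
    }

    public static void main(String[] args) {
        System.out.println(quoteRaw('\n')); // a quote, a real line break, a quote
        System.out.println(quote('\n'));    // prints '\n'
        System.out.println(quote('a'));     // prints 'a'
    }
}
```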

parrt · 2022-07-09T16:54:23Z

Can you help me understand the prime motivation here? There's a lot to review and a lot of changes that could affect all users, because it touches the tool itself.

KvanTTT · 2022-07-09T17:35:52Z

> Can you help me understand the prime motivation here? There's a lot to review and a lot of changes that could affect all users, because it touches the tool itself.

I decided to rebase and submit my old commits since they already exist. They remove code duplication (more lines removed than added) and fix inaccuracies in error messages (see tests). Our test coverage is good enough, especially for the tool. Error messages without a location look awkward.

I've touched the tool several times, and it has caused almost no trouble.

parrt · 2022-07-15T17:18:24Z

I'm going to leave this active as it could be useful, but I'm allocating my time to the antlr-as-a-service project at the moment. Thanks!

KvanTTT added 3 commits July 8, 2022 22:43

47b5587 Fix position and message for CHARACTERS_COLLISION_IN_SET error
Split CharSupport on CharSupport and AntlrCharSupport
Signed-off-by: Ivan Kochurkin <kvanttt@gmail.com>

22e7e8b Replace EscapedCharValue array on HashMap
Signed-off-by: Ivan Kochurkin <kvanttt@gmail.com>

09f7108 Merge AntlrCharSupport and EscapeSequenceParsing into GrammarLiteralParser
Signed-off-by: Ivan Kochurkin <kvanttt@gmail.com>

parrt reviewed Jul 9, 2022

parrt added error-handling comp:tool type:cleanup labels Jul 15, 2022

KvanTTT mentioned this pull request Aug 27, 2022

Head towards 4.11 release #3840

Closed

10 tasks

KvanTTT mentioned this pull request Feb 10, 2024

Refactoring of grammar literals parsing antlr/antlr5#31

Merged
