Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring of grammar literals parsing #3780

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from

Conversation

KvanTTT
Copy link
Member

@KvanTTT KvanTTT commented Jul 9, 2022

Code for parsing grammar string literals 'abc' and grammar char set [abc] have a lot of in common and I unified it. I extracted the functionality into GrammarLiteralParser that is responsible for parsing from string literals and char sets. It has the following methods:

  • parseStringFromStringLiteral
  • parseCharFromStringLiteral (returns CharParseResult)
  • parseChar (returns CharParseResult)
  • parseNextChar (returns CharParseResult)

CharParseResult can be either INVALID or CODE_POINT or PROPERTY. PROPERTY is supported only by char set.

Also, I unraveled ANTLR tool and ANTLR runtime functionality of char processing.

Split CharSupport on CharSupport and AntlrCharSupport

Signed-off-by: Ivan Kochurkin <kvanttt@gmail.com>
Signed-off-by: Ivan Kochurkin <kvanttt@gmail.com>
…arser

Signed-off-by: Ivan Kochurkin <kvanttt@gmail.com>
@KvanTTT
Copy link
Member Author

KvanTTT commented Jul 9, 2022

@parrt it's also ready. Tool tests will be fine after #3779

@parrt
Copy link
Member

parrt commented Jul 9, 2022

Hmm...pretty risky modifying the ATN construction...

if ( a==Token.EOF ) buf.append("<EOF>");
else if ( elemAreChar ) buf.append("'").appendCodePoint(a).append("'");
else buf.append(a);
if ( a==Token.EOF ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a lot of cosmetic changes here and flipping of strings two characters for no real game in performance. Yet it costs me mental effort to review

Copy link
Member Author

@KvanTTT KvanTTT Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not only cosmetic changes but clarity changes as well. toString should print clear representation of object instead of plain strings. In the old version it prints

'
'

In the new version it prints '\n'. It's more expected behaior I think.

@parrt
Copy link
Member

parrt commented Jul 9, 2022

Can you help me understand the prime motivation here? There's a lot to review and a lot of changes that could affect allow users, because it touches the tool itself.

@KvanTTT
Copy link
Member Author

KvanTTT commented Jul 9, 2022

Can you help me understand the prime motivation here? There's a lot to review and a lot of changes that could affect allow users, because it touches the tool itself.

I decided to rebase and suggest my old commits since they exist, remove code duplication (more removed lines than added) and fix inaccurancy in error messages (see tests). Our test coverage is good enough especially for the tool. Error messages without location look akward.

I've touched the tool several times, and it almost hasn't cause any trouble.

@parrt
Copy link
Member

parrt commented Jul 15, 2022

I'm going to leave this active as it could be useful but I'm allocating my time to antlr-as-a-service proj at moment. thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants