Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TOML Cleanup and Improvements #4565

Merged
merged 3 commits into from
Sep 3, 2022
Merged

Conversation

lkishalmi
Copy link
Contributor

Well, this one got a bit fatter I've initially planned.

It has the same cleanup applied for Yaml and Dockerfile support.

Additionally I've found the root cause of the IndexOutOfBoundException as the Lexer state was not fully saved and restored. Had to do an ugly reflection hack for that, but seems to work.

As a second commit I added the parser Syntax Error output to the UI.

@lkishalmi lkishalmi added this to the NB16 milestone Aug 30, 2022
),
@ActionReference(
path = "Loaders/text/x-toml/Actions",
id = @ActionID(category = "Edit", id = "org.openide.actions.PushAction"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo?

Suggested change
id = @ActionID(category = "Edit", id = "org.openide.actions.PushAction"),
id = @ActionID(category = "Edit", id = "org.openide.actions.PasteAction"),

}
return c;
}

//Marks are for buffering in ANTLR4, we do not really need them
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the documentation correctly, we actually need buffering/seekability of the consumed stream (the documentation of CharStream declares this as required for lookahead), but the LexerInput already handles that - isn't it?

Reading further you actually buffer the whole file in memory, which kind of defeats the purpose of the CharStream implementation. getText can only look back to places, that are protected by a mark, so the buffer is limited (assuming limited lookahead and marking).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LexerInput readText, readLength is scoped for the actual token being processed. CharStream getIndex and getText are work on stream level. Also getText is kind of an optional method. Fortunately it used in all Lexers. Just discovered to have a problem with the previos implementation on the Antlr lexer.

I'm tempted to look around the Lexing API and probably add a more ANTLR friendly interface. Will, see. I do not think that this would be the final implementation. It is kind of good enough for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing, that ANTLR does not bother to implement its own interface (ANTLRInputStream) in an efficient way, I can't argue, that this implementation is inefficient, so ignore that.
For the tempation to change the lexer API to be "ANTLR" friedly: before going there, make sure you have a very good reason, at some point ANTLR will go away and we will retain the fallout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, throwing out unmarked sections of StringBuilder could be implemented one day with mark supported.

Looked around the Lexing API yesterday. Accessing the underlying buffers is not as easy as I've thought.


LexerState(org.tomlj.internal.TomlLexer lexer) {
LexerState(org.tomlj.internal.TomlLexer lexer) throws IllegalArgumentException, IllegalAccessException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using reflection is an implementation detail here - I would catch IllegalAccessException and wrap it into a RuntimeException. In the future you hopefully have access to the lexer state and thus the exceptions will not be thrown.

@lkishalmi lkishalmi merged commit 52409ea into apache:master Sep 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants