Cannot parse "ExponentDecimalReal" numbers #144

Closed
szarnyasg opened this issue Sep 10, 2016 · 1 comment

@szarnyasg
Contributor

Using the generated ANTLR4 grammar, I cannot parse exponent decimal real numbers. Parsing the string 1E2 with the exponentDecimalReal rule produces the following error:

line 1:0 mismatched input '1E2' expecting {'.', DecimalInteger, Digit}

I am not yet very familiar with ANTLR but I think the issue is either the ordering of the rules or the lexer/parser nature of the related rules. This leads us to the shouldBeLexerRule method and the Antlr4Massager classes.
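To illustrate the suspected lexer/parser interaction, here is a minimal Python sketch of longest-match tokenization, which is how ANTLR lexers choose tokens (the longest match wins; ties go to the rule defined first). The token rules below, in particular the catch-all `Word` rule, are hypothetical and are not taken from the generated grammar. They only show how `1E2` can reach the parser as a single token that a rule expecting `'.'`, `DecimalInteger` or `Digit` cannot accept.

```python
import re

# Hypothetical token rules in definition order. These are assumptions made
# for illustration, not the rules of the generated ANTLR4 grammar.
TOKEN_RULES = [
    ("Dot", r"\."),
    ("Digit", r"[0-9]"),
    ("DecimalInteger", r"[1-9][0-9]+"),
    ("Word", r"[0-9A-Za-z_]+"),  # hypothetical catch-all rule
]

# Token types the (hypothetical) parser rule accepts as its first token,
# mirroring the set reported in the error message.
EXPECTED_FIRST = {"Dot", "DecimalInteger", "Digit"}


def tokenize(text):
    """Split text into (type, lexeme) pairs using longest match.

    Ties between rules of equal match length go to the rule listed first.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        best = None
        for name, pattern in TOKEN_RULES:
            match = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the earlier rule when lengths are equal.
            if match and (best is None or len(match.group()) > len(best[1])):
                best = (name, match.group())
        if best is None:
            raise ValueError(f"no token rule matches at position {pos}")
        tokens.append(best)
        pos += len(best[1])
    return tokens


def first_token_accepted(text):
    """Return True if the first token has a type the parser rule expects."""
    return tokenize(text)[0][0] in EXPECTED_FIRST


print(tokenize("1E2"))              # one token swallowing the whole input
print(first_token_accepted("1E2"))  # the parser rule would report a mismatch
```

Under these assumed rules the lexer never emits a separate `Digit` token for the leading `1`, because a longer match exists. The parser rule is therefore handed a token type it does not expect, regardless of how the parser rule itself is written.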

In issue #56 @Mats-SX stated that it's not a goal to provide a good ANTLR parser, so I think a good approach would be to patch the Antlr4 and the Antlr4Massager classes. Do you agree?

Also, @thobe can you please elaborate on how to proceed with the // TODO: melt these snow flakes. task?

@Mats-SX
Member

Mats-SX commented Sep 12, 2016

Yes, these issues are caused by the lexer/parser division of the parsing paradigm that Antlr is using. Our source grammar doesn't (yet) have a notion of lexing, so the conversion to the Antlr side of things has so far been quite crude, and with it has come a number of issues. We have cards on our wall that tell us to clean all of this up:

The current ANTLR grammar suffers from having to go to great lengths to wrestle a lexer out of a source grammar that does not understand that concept. It is quite foreseeable that other consumers of the grammar will hit the same issue. We should restructure the source grammar to be based on a core set of non-overlapping productions that can be translated directly to lexing rules.

A consequence of this is that the code that generates the Antlr4 grammar contains a sizeable piece of custom code to make the lexing work, which is technical debt.
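As a rough sketch of what "translated directly to lexing rules" could mean for this case, the Python snippet below treats the whole exponent real as a single lexer-level pattern, so that `1E2` arrives as one token of the expected type. The rule names, their order and the regular expressions are assumptions made for illustration. They are not the project's grammar and not a planned fix.

```python
import re

# Hypothetical lexer rules in definition order (assumptions for illustration).
# The exponent real is a lexer rule of its own and is listed before the
# catch-all rule, so it wins ties on match length.
LEXER_RULES = [
    ("ExponentDecimalReal", r"(?:[0-9]+\.?[0-9]*|\.[0-9]+)[eE]-?[0-9]+"),
    ("DecimalInteger", r"[0-9]+"),
    ("Word", r"[0-9A-Za-z_]+"),  # hypothetical catch-all rule
]


def first_token(text):
    """Return the (type, lexeme) of the first token using longest match.

    Ties between rules of equal match length go to the rule listed first.
    """
    best = None
    for name, pattern in LEXER_RULES:
        match = re.match(pattern, text)
        # Strict '>' keeps the earlier rule when lengths are equal.
        if match and (best is None or len(match.group()) > len(best[1])):
            best = (name, match.group())
    if best is None:
        raise ValueError("no lexer rule matches the input")
    return best


for sample in ["1E2", "1.5e-3", "42", "1E"]:
    print(sample, first_token(sample))
```

The trade-off in this sketch is that the number's internal structure (mantissa, exponent) is no longer visible to the parser. A consumer that needs it would have to re-parse the lexeme.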

The problem right now is that we're short on resources. Hopefully we will be able to come back to these issues soon enough.
