add scannerless tokenizer to runtime jar #814

Open
parrt opened this issue Feb 8, 2015 · 5 comments

Comments

@parrt
Member

parrt commented Feb 8, 2015

I see an improved version here: https://github.com/parrt/mini-markdown/blob/master/src/org/antlr/md/CharsAsTokens.java
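For readers who have not opened the link, the general idea of a chars-as-tokens source is that the parser receives one token per input character, so no lexer rules are needed. The following is only a minimal, dependency-free sketch of that idea. It does not reproduce the linked class and does not implement ANTLR's TokenSource interface; the names CharToken and CharTokenStream are hypothetical, and using the UTF-16 code unit value as the token type is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** One token per input character (hypothetical type, not an ANTLR class). */
final class CharToken {
    static final int EOF = -1;   // end-of-input marker

    final int type;     // assumption: the UTF-16 code unit value doubles as the token type
    final String text;  // the single character, or "<EOF>"
    final int index;    // position of the code unit in the input

    CharToken(int type, String text, int index) {
        this.type = type;
        this.text = text;
        this.index = index;
    }
}

/** Turns a string into a token sequence without any lexer rules. */
final class CharTokenStream {
    private final String input;
    private int pos = 0;

    CharTokenStream(String input) {
        this.input = input;
    }

    /** Returns the next character as a token, or an EOF token once the input is exhausted. */
    CharToken nextToken() {
        if (pos >= input.length()) {
            return new CharToken(CharToken.EOF, "<EOF>", pos);
        }
        char c = input.charAt(pos);
        CharToken token = new CharToken(c, String.valueOf(c), pos);
        pos++;
        return token;
    }

    /** Convenience helper: collects every token, including the trailing EOF token. */
    static List<CharToken> tokenize(String input) {
        CharTokenStream stream = new CharTokenStream(input);
        List<CharToken> tokens = new ArrayList<>();
        CharToken token;
        do {
            token = stream.nextToken();
            tokens.add(token);
        } while (token.type != CharToken.EOF);
        return tokens;
    }
}
```

Calling CharTokenStream.tokenize("a*") yields three tokens: one for 'a', one for '*', and the EOF token. A real integration would wrap the same loop in the runtime's token interfaces.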

@KvanTTT
Member

KvanTTT commented Dec 19, 2016

Not only to the Java runtime, but to the other runtimes too :)

@sharwell
Member

If we add this to the runtime library, I think we should take care in implementing certain details:

  1. Define an implicit token vocab in the tool, so the CharVocab.tokens file does not need to be provided separately
  2. Allow the parser to reference any UTF-16 code unit using the '\u00FF' syntax
  3. Define a new implementation of the Vocabulary interface which understands this vocabulary

If we add a new parser option scannerless=true, we would be able to generate better code for the Vocabulary instance, and possibly in the future even generate parser constructors that avoid the need for a user to create an explicit instance of CharsAsTokens.
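To make point 3 concrete, here is a rough sketch of what a character-aware vocabulary could look like. It is an illustration only: the class name CharVocabulary is hypothetical, the class deliberately does not implement the runtime's Vocabulary interface so that it stays dependency-free, and it only mirrors that interface's method names (getMaxTokenType, getLiteralName, getSymbolicName, getDisplayName). It assumes that a token type is simply the UTF-16 code unit value, and it prints non-printable or non-ASCII code units in the escaped form mentioned in point 2.

```java
/** Hypothetical vocabulary for a grammar whose token types are UTF-16 code units. */
final class CharVocabulary {
    static final int EOF = -1;

    /** Every UTF-16 code unit is a valid token type in this sketch. */
    int getMaxTokenType() {
        return 0xFFFF;
    }

    /** Quoted literal for a code unit, or null when the type is outside the code-unit range. */
    String getLiteralName(int tokenType) {
        if (tokenType < 0 || tokenType > 0xFFFF) {
            return null;
        }
        char c = (char) tokenType;
        if (c == '\'' || c == '\\') {
            return "'\\" + c + "'";          // escape quote and backslash
        }
        if (c >= 0x20 && c < 0x7F) {
            return "'" + c + "'";            // printable ASCII is shown as-is
        }
        // Everything else is shown as a four-digit hexadecimal escape.
        return String.format("'\\u%04X'", tokenType);
    }

    /** Only EOF has a symbolic name here; characters are identified by their literal. */
    String getSymbolicName(int tokenType) {
        return tokenType == EOF ? "EOF" : null;
    }

    /** Prefers the literal name, then the symbolic name, then the numeric type. */
    String getDisplayName(int tokenType) {
        String literal = getLiteralName(tokenType);
        if (literal != null) {
            return literal;
        }
        String symbolic = getSymbolicName(tokenType);
        if (symbolic != null) {
            return symbolic;
        }
        return Integer.toString(tokenType);
    }
}
```

Because the names are computed from the token type, nothing has to be stored per token, which is the kind of simplification a dedicated scannerless mode would allow the tool to generate.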

@parrt
Member Author

parrt commented Dec 19, 2016

Or perhaps scannerless grammar T; rather than an option.

@sharwell
Member

sharwell commented Dec 19, 2016

@parrt That would work too, but any new keywords raise compatibility concerns for me (in this case, we'd have to either make a special provision for scannerless as a parser rule name, or have it be a new disallowed reserved word).

The channels keyword was the last added keyword, but it's special because (I believe) it's only considered a keyword when it's followed by {.
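The "keyword only in context" behavior described above can be sketched as a small lookahead check. This is an illustration of the general technique, not the ANTLR tool's actual lexer: the class and method names are hypothetical, and the sketch ignores comments that might appear between the word and the brace.

```java
/** Hypothetical helper showing a context-sensitive keyword check. */
final class ContextualKeyword {
    private static final String WORD = "channels";

    /**
     * Returns true when the text at position start is the word "channels"
     * and the next non-whitespace character after it is an opening brace.
     * In any other context the word would be treated as an ordinary identifier.
     */
    static boolean isChannelsKeyword(String src, int start) {
        if (!src.startsWith(WORD, start)) {
            return false;
        }
        int i = start + WORD.length();
        // Reject longer identifiers that merely begin with the word.
        if (i < src.length() && Character.isJavaIdentifierPart(src.charAt(i))) {
            return false;
        }
        // Skip whitespace, then look for the brace.
        while (i < src.length() && Character.isWhitespace(src.charAt(i))) {
            i++;
        }
        return i < src.length() && src.charAt(i) == '{';
    }
}
```

With this check, "channels { A, B }" is recognized as the keyword form, while "channels : 'x' ;" leaves channels usable as a rule name. A scannerless keyword placed directly before grammar would need a comparable context test to avoid reserving the word outright.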

@parrt
Member Author

parrt commented Dec 19, 2016

Yeah, we'd have to consider all this carefully. Currently, it works great and all we need is an example. We wouldn't even need to include that class in ANTLR proper.
