Multiple possible token types #36
Comments
The lack of context does make some things a bit more difficult, that’s for sure. In the specific example of the SQL parser I’d probably just not make SELECT a keyword. But generally, I’m not sure. I have solved this in one parser by pushing the statefulness down into the lexer. This was a parser for a nested string interpolation language eg. I’m not against non-CFG but I don’t know if there’s an elegant way to express the grammar like EBNF?
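To illustrate what "pushing the statefulness down into the lexer" can look like, here is a minimal sketch of a lexer that keeps a stack of modes. It is not taken from the parser mentioned above: the `${...}` interpolation syntax, the token kinds, and the names `Token`, `Lex` and `isIdentChar` are all assumptions made for the example.

```go
package main

import "strings"

// Token is a hypothetical token: a kind plus the matched text.
type Token struct {
	Kind  string
	Value string
}

// isIdentChar reports whether c may appear in an identifier.
func isIdentChar(c byte) bool {
	return c == '_' || (c >= '0' && c <= '9') ||
		(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// Lex tokenizes input while tracking a stack of modes ("expr" or "string").
// The same character (for example '"') produces a different token depending
// on the mode on top of the stack, so the parser never sees the ambiguity.
func Lex(input string) []Token {
	var out []Token
	modes := []string{"expr"}
	i := 0
	for i < len(input) {
		if modes[len(modes)-1] == "string" {
			switch {
			case strings.HasPrefix(input[i:], "${"):
				// Assumed interpolation syntax: switch back to expression mode.
				out = append(out, Token{"InterpStart", "${"})
				modes = append(modes, "expr")
				i += 2
			case input[i] == '"':
				out = append(out, Token{"StringEnd", `"`})
				modes = modes[:len(modes)-1]
				i++
			default:
				// Consume literal characters up to the next '"' or "${".
				j := i
				for j < len(input) && input[j] != '"' && !strings.HasPrefix(input[j:], "${") {
					j++
				}
				out = append(out, Token{"Chars", input[i:j]})
				i = j
			}
			continue
		}
		// Expression mode.
		c := input[i]
		switch {
		case c == '"':
			out = append(out, Token{"StringStart", `"`})
			modes = append(modes, "string")
			i++
		case c == '}' && len(modes) > 1:
			// Closes an interpolation and returns to the enclosing string.
			out = append(out, Token{"InterpEnd", "}"})
			modes = modes[:len(modes)-1]
			i++
		case c == ' ':
			i++
		case isIdentChar(c):
			j := i
			for j < len(input) && isIdentChar(input[j]) {
				j++
			}
			out = append(out, Token{"Ident", input[i:j]})
			i = j
		default:
			out = append(out, Token{"Punct", string(c)})
			i++
		}
	}
	return out
}
```

With this sketch, the quote inside the interpolation in `"a ${f("b")}"` is lexed as `StringStart` rather than `StringEnd`, because the mode stack records that the lexer is back in expression mode at that point. The grammar itself can then stay context-free.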
Well, it's still nice to have a separate keyword type, in particular for case-insensitive keyword matching (btw, I've seen that pointlander/peg is using Regarding how this type ambiguity can be expressed in the grammar, it seems to me that no changes are needed here: if the parser fails to parse the current token with a given type, it could ask the lexer for other options. Or else lexer can have
I don't think that's a general solution to contextual-grammars, it just solves one specific example. For example that would not work for the example of the nested interpolation grammar I gave.
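For reference, the "ask for other token types" idea can be sketched as below. This is only an illustration of the keyword/identifier case from the SQL example, not participle's API: `Token`, `Lex`, `ParseSelect` and the rule that every keyword may also be read as an `Ident` are assumptions made for the example, and the lexer simply splits on whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// Token carries every type the lexer considers possible, most specific first.
type Token struct {
	Types []string
	Value string
}

// keywords are matched case-insensitively; Value keeps the original spelling.
var keywords = map[string]bool{"SELECT": true, "FROM": true}

// Lex splits on whitespace (a simplification) and tags each word.
// Keywords get "Ident" as a fallback type.
func Lex(input string) []Token {
	var out []Token
	for _, w := range strings.Fields(input) {
		switch {
		case keywords[strings.ToUpper(w)]:
			out = append(out, Token{[]string{"Keyword", "Ident"}, w})
		case w == "*":
			out = append(out, Token{[]string{"Star"}, w})
		default:
			out = append(out, Token{[]string{"Ident"}, w})
		}
	}
	return out
}

// is reports whether the token may be read as the given type.
func (t Token) is(typ string) bool {
	for _, c := range t.Types {
		if c == typ {
			return true
		}
	}
	return false
}

// isKeyword reports whether the token can be read as the named keyword.
func (t Token) isKeyword(name string) bool {
	return t.is("Keyword") && strings.EqualFold(t.Value, name)
}

// ParseSelect parses "SELECT * FROM <table>" and returns the table name.
// The table position asks for an Ident, so a keyword-looking word is
// accepted there through its fallback type.
func ParseSelect(input string) (string, error) {
	toks := Lex(input)
	if len(toks) != 4 {
		return "", fmt.Errorf("expected 4 tokens, got %d", len(toks))
	}
	if !toks[0].isKeyword("SELECT") || !toks[1].is("Star") || !toks[2].isKeyword("FROM") {
		return "", fmt.Errorf("expected SELECT * FROM, got %q", input)
	}
	if !toks[3].is("Ident") {
		return "", fmt.Errorf("expected table name, got %q", toks[3].Value)
	}
	return toks[3].Value, nil
}
```

Here `ParseSelect("SELECT * FROM select")` succeeds with the table name `select`. As the comment above points out, this handles only the keyword-versus-identifier ambiguity and does nothing for grammars such as nested interpolation, where the meaning of a character depends on lexer state.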
I think this is a good idea; the current approach of case normalisation isn't really a good one. I think I'll add a parser option to do this, eg.
I've been thinking about this for a while. Yeah, it looks like the subject has too narrow a scope; I also have no other ideas, unfortunately, so I'd rather close the issue.
I've added the
Hello, and thank you for your work :)
I ran into an interesting problem while using participle to parse a (well, to be honest, poorly designed) grammar: some tokens may be treated differently, for example as a keyword or as an identifier.
The simplest example is the SQL parser from _examples: it cannot parse "SELECT * FROM select", because the table name "select" is identified by the lexer as a keyword, not an ident. Of course that's invalid SQL and the table name in this case should be quoted as `select`, but anyway, in my case the grammar does not have those quotes.
I'm not sure what the best solution would be here. The only idea I have is to allow multiple lexers for a parser: if the parser fails to parse with one lexer, try another, and so on.
What do you think? Probably there are no plans to support anything except CFG grammars, so there is no such problem at all.