Allow designation of tokens as garbage #46
Comments
A few people have asked for the ability to work with pre-tokenized
Your request falls into the second category. So this may resolve itself
In a classic shift/reduce parser as generated by Yacc the entire
What I'm proposing is simply a special case of such a rule
Let's assume that I have some grammar
I can trivially provide a grammar for macros as they are a
If "junk" happens to conflict with the primary production of the
My apologies if I'm misunderstanding the exact mechanism that
I've added this as an experimental feature in 1.2.3. Check it out and let me know what you think:
Awesome! After reading, while I agree that whitespace obliteration will be the primary use case for this feature, I would personally use a more general name than :auto-whitespace. Other than that, looks awesome! Thanks for spending the time to add it to this toolkit.
Thanks for suggesting the feature.
After some pondering, a better keyword doesn't really spring to mind. In either case I'd retain the demo above as part of the documentation, which should serve to make its behavior plain no matter the given name. Within reason, of course.
At present, one can emulate the behaviour of a lex/yacc grammar in instaparse only by modifying the source grammar to explicitly account for the whitespace characters that are traditionally ignored. Over the course of the C or Pascal language grammar this can easily amount to hundreds of rule changes, as one must explicitly provide for the possibility of whitespace in every nonterminal concatenation, whereas the standard source grammars simply state "discard whitespace" and assume a separate lexer.
It would be awesome if the top-level parser took an option such as
:ignored "ignored-forms-rule"
which would be implicitly used as a token sink throughout the grammar.
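To make the request concrete, here is a minimal, language-neutral sketch in Python of what a "token sink" could mean for a scannerless parser. It is not instaparse code and says nothing about how instaparse implements the feature. The names make_terminal_matcher and parse_sum, and the ignored parameter, are hypothetical and exist only for illustration. The assumption is that text matching the ignored pattern is silently skipped before every terminal, so the grammar itself (here sum = number ('+' number)*) never has to mention whitespace.

```python
import re


def make_terminal_matcher(ignored=None):
    """Return match(pattern, text, pos) -> (lexeme, new_pos) or None.

    If `ignored` (a regex string, hypothetical parameter) is given, any text
    matching it is skipped before each terminal. This plays the role of the
    proposed token sink.
    """
    sink = re.compile(ignored) if ignored else None

    def match(pattern, text, pos):
        if sink:
            skipped = sink.match(text, pos)
            if skipped:
                pos = skipped.end()
        m = re.compile(pattern).match(text, pos)
        return (m.group(), m.end()) if m else None

    return match


def parse_sum(text, ignored=None):
    """Parse  sum = number ('+' number)*  and return the numbers, or None."""
    match = make_terminal_matcher(ignored)
    result = match(r"[0-9]+", text, 0)
    if result is None:
        return None
    lexeme, pos = result
    numbers = [int(lexeme)]
    while True:
        plus = match(r"\+", text, pos)
        if plus is None:
            break
        result = match(r"[0-9]+", text, plus[1])
        if result is None:
            return None
        lexeme, pos = result
        numbers.append(int(lexeme))
    # Trailing ignored text is allowed, but the whole input must be consumed.
    return numbers if match(r"\Z", text, pos) is not None else None


# Without a sink, the unmodified grammar rejects input containing spaces.
print(parse_sum("1+2+3"))
print(parse_sum("1 + 2 + 3"))
# With a sink, the same grammar accepts whitespace and '#' comments.
print(parse_sum("1 + 2 + 3", ignored=r"\s+"))
print(parse_sum("1 + # first\n 2", ignored=r"(?:\s+|#[^\n]*)+"))
```

The point of the sketch is that the sink is supplied once, at parser construction, instead of being threaded through every concatenation in the grammar. A real implementation would also have to decide what happens when the ignored rule overlaps with a terminal of the main grammar, which this sketch does not address.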