New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Grammars with EPSILON issues #6
Comments
@gitmh thank you for your observation! Also, thanks a lot for the minimalistic examples of grammar, which were rather helpful for the troubleshooting. The bug was inside method "completer": in my previous implementation, I have forgotten to write the return statement, with the information about whether parsing chart was changed or not (here is the corresponding line of code: https://github.com/lagodiuk/earley-parser-js/blob/master/earley-oop.js#L267), and in such case, by default JavaScript interpreter considers that the return value is "false". Here is a link to the commit with fix: fb31a9e (in this case I am confident about the positive effect of the change, so I have committed the fix directly to the master branch). Below presented the links to the demonstrations of parsing the grammars, which you have mentioned:
https://jsfiddle.net/2mb3w9c1/36/
https://jsfiddle.net/2mb3w9c1/37/ Also, inside my fix I have adapted a bit the logic of internal method "combinations", which is helpful for the proper displaying of the parsing trees, which contains epsilon transitions among non-epsilon symbols in the right-hand-side (however in this case, epsilon nodes are not displayed in the tree).
|
Fixed. |
Also, I understand that somewhen it would be really useful to have some integration tests for the implementation of the parser. But, as far as parser could handle potentially all possible context free grammars (CFG), and can parse all possible valid statements of the corresponding CFG, it was a bit unclear how to find the proper and exhaustive approach for testing the implementation of the parser. And recently I've came up with an idea regarding the exhaustive testing of the parser: #8 |
I just found an example where parsing with the new epsilon version breaks:
S -> T
T -> a T E | z
E -> EPSILON
the word "aaaaz" cannot be parsed successfully but if I use only "T -> a T | z" it works fine. This was the smallest grammar I could think of showing the problem.
Rules with single epsilon work great otherwise like this on with input "a" is fine:
S -> a A
A -> EPSILON
also switching "T" and "a" works with "zaaaa":
S -> T
T -> T a E | z
E -> EPSILON
The nonterminal E is at the same place with the same rule but this time it works. Here is another broken one which might be the same problem:
N0 -> N1 b | b
N1 -> b X | a X | a N1 X
X -> b X | EPSILON
the word "aaab" does not work but should. I know the grammars do not look useful at all :-)
I tried to figure out why this happens but I wasn't able to find it.
The text was updated successfully, but these errors were encountered: