Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse error due to select option order #7

Closed
tippfelher opened this issue Mar 8, 2023 · 8 comments
Closed

Parse error due to select option order #7

tippfelher opened this issue Mar 8, 2023 · 8 comments
Assignees
Labels
bug Something isn't working invalid This doesn't seem right

Comments

@tippfelher
Copy link

tippfelher commented Mar 8, 2023

// OK
let result = BNF.parse(`program ::= "aa" | "a";`);
let tree = Compile(result);
let syntax = tree.parse("aa");
console.log("result", result, "syntax", syntax);
// Not OK
let result = BNF.parse(`program ::= "a" | "aa";`);
let tree = Compile(result);
let syntax = tree.parse("aa");
console.log("result", result, "syntax", syntax);

Edit by @AjaniBilby:
Answer

@AjaniBilby
Copy link
Owner

What version of the package are you using?
Because both of those parse for me

@AjaniBilby
Copy link
Owner

I'm adding this as a automated test case just in case for future builds, however NPM is down for me and I want to make sure the two syntaxes from the different BNFs should make identical output syntax trees

@AjaniBilby AjaniBilby self-assigned this Mar 8, 2023
@AjaniBilby AjaniBilby added bug Something isn't working invalid This doesn't seem right labels Mar 8, 2023
@tippfelher
Copy link
Author

"bnf-parser": "^3.1.1",

I will try to make a better report. (already deleted everything after it failed xD)

@AjaniBilby
Copy link
Owner

I managed to replicate your error - I have no idea how I didn't run into this when I have an actual programming language running off this as the syntax parser 😅

@tippfelher
Copy link
Author

tippfelher commented Mar 8, 2023

All good. I needed a bnf parser yesterday and started to built one. Then I realized that, what I reported above, won't be handled by my parser and so I started to look out for existing solutions. (yours is the first I checked)

@AjaniBilby
Copy link
Owner

AjaniBilby commented Mar 8, 2023

I thought it was a bug just like you, but I realized it's actually expected behavior - the library assumes not parsing the whole string is a failure. So in your second case it would match the "a", then the program ends, but there is still a remaining character so it fails.

For both cases to be valid you would need to write it like this:

case1 ::= ( "a" | "aa" )+
case2 ::= ( "aa" | "a" )+

Ideally you should order your selects so options are ordered with the most unique and complex elements first.
I have literally thought my parser was broken before, then realizing actually I was just telling it to do the wrong thing.

@AjaniBilby
Copy link
Owner

AjaniBilby commented Mar 8, 2023

All good. I needed a bnf parser yesterday and started to built one. Then I realized that, what I reported above, won't be handled by my parser and so I started to look out for existing solutions. (yours is the first I checked)

Yeh, that's basically why I started mine as well, plus why I adapted BNF syntax to have the ommit character for easy clean up and stuff - the only thing I'm really unhappy with atm is the fact that sometimes it gives the wrong reference range for a syntax error.

i.e. if you have a syntax error in a function, it complains it can't parse at the start of the function body, instead of at the statement in the function

@AjaniBilby AjaniBilby changed the title This library cannot even handle the simplest test cases: Parse error due to select option order Mar 8, 2023
@AjaniBilby
Copy link
Owner

AjaniBilby commented Mar 11, 2023

@tippfelher, this might be a bit late, but I was thinking there are probably other people like you who expect a more non-deterministic select statement.
I was wondering do you think it would be worth adding a variant of || form |, where if you use || it will match all cases then return the best (consumes the most tokens) result, and | always just returns the first match. Thoughts?

It wouldn't be that hard to implement, since I am already comparing 'best result' of errors on a switch, so it returns the error from the option that got the furthest in consuming

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working invalid This doesn't seem right
Projects
None yet
Development

No branches or pull requests

2 participants