parsing of concatenation with initial variable repetition fails incorrectly #6

Closed
matthewleon opened this issue Nov 22, 2021 · 1 comment

@matthewleon
Contributor

This is a hard one (for me) to explain, so let's take an example. Take the following minimal grammar:

const GRAMMAR = String.raw`full = *ab b
ab = "a" / "b"
b = "b"
`;

The grammar should successfully parse the following string:

const EXAMPLE = "b";

like so:

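// Assumes apgApi and apgLib have already been imported from the apg-js package.
// Generate a grammar object from the ABNF source.
const api = new apgApi(GRAMMAR);
api.generate();
if (api.errors.length) {
  console.error(api.errorsToAscii());
  throw new Error(`grammar has errors: ${GRAMMAR}`);
}
const schema = api.toObject();
// Parse EXAMPLE with start rule 0 (`full`, the first rule in the grammar).
const parser = new apgLib.parser();
const result = parser.parse(schema, 0, EXAMPLE);
console.log(result);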

This fails, though, with the following result:

{
  success: false,
  state: 103,
  length: 1,
  matched: 0,
  maxMatched: 1,
  maxTreeDepth: 6,
  nodeHits: 9,
  inputLength: 1,
  subBegin: 0,
  subEnd: 1,
  subLength: 1
}

The failure seems to happen because the initial repetition, *ab, is applied greedily and swallows the single character b. The parser then never backtracks to try zero matches of *ab followed by a successful match of rule b.
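
To make the difference concrete, here is a minimal sketch of the two strategies for this one grammar. It is a toy model, not APG's implementation: lit, alt, fullGreedy and fullBacktracking are hypothetical helpers written only for this illustration, and each matcher returns the next input position or -1 on failure.

```javascript
// Toy matchers for the grammar in this issue: full = *ab b
// Hypothetical helpers for illustration only; these are not APG internals.

// Match a single literal character at `pos`.
const lit = (ch) => (input, pos) => (input[pos] === ch ? pos + 1 : -1);

// Ordered alternation: the first alternative that matches wins.
const alt = (...matchers) => (input, pos) => {
  for (const m of matchers) {
    const next = m(input, pos);
    if (next !== -1) return next;
  }
  return -1;
};

const ab = alt(lit("a"), lit("b")); // ab = "a" / "b"
const b = lit("b");                 // b  = "b"

// Greedy, no backtracking: consume as many `ab` as possible, then require `b`.
function fullGreedy(input) {
  let pos = 0;
  for (let next = ab(input, pos); next !== -1; next = ab(input, pos)) {
    pos = next;
  }
  return b(input, pos) === input.length;
}

// Backtracking: remember every position where the repetition could stop,
// then try the trailing `b` from the longest repetition down to zero matches.
function fullBacktracking(input) {
  const stops = [0];
  for (let next = ab(input, 0); next !== -1; next = ab(input, next)) {
    stops.push(next);
  }
  return stops.reverse().some((pos) => b(input, pos) === input.length);
}

console.log(fullGreedy("b"));       // false
console.log(fullBacktracking("b")); // true
```

With the input "b", the greedy version uses up the only character inside the repetition and leaves nothing for the trailing b, which matches the failure reported above. The backtracking version succeeds by falling back to zero repetitions.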

This appears to be inconsistent with RFC 5234.

@ldthomas
Owner

Answered in #7.