Syntactic predicates cause nonsensical error messages #194

KamilaBorowska · 2013-07-31T07:25:27Z

With the following grammar (that matches stuff like "abc", "aabbcc", "aaabbbccc"...), empty test code shows 'Line 1, column 1: Expected end of input but end of input found'. I think it should say 'Expected "a"' instead, as lookahead is looking for 'a'.

S = &(A !'b') 'a'+ B
A = 'a' A? 'b'
B = 'b' B? 'c'

The text was updated successfully, but these errors were encountered:

dmajda · 2013-08-16T13:00:32Z

Thanks for the report. I plan to look into polishing error messages in 0.9 timeframe. I'll also have a look at this problem at that time.

dmajda · 2016-06-28T12:03:22Z

In 0.9.0 and current master, the message changed to “Expected undefined but end of input found.”.

Minimal grammar reproducing the problem (with empty input):

start = &"a"

The bug is caused by a combination of two issues:

& and ! suppress error reporting, which means they don’t produce any expectations on match failure.
The message-building code doesn’t expect an empty expectation list and can’t deal with it correctly.

I’m not sure about the right solution yet.

Mingun · 2016-06-28T16:57:26Z

I assume that it can be fixed if in &-predicates the expectations will be produced, but they will still not generate an error. For ! predicates, perhaps, it is necessary to negative captured expectations (!"a" must produce expectation expected not "a"), but I can't invent an example in which this behavior would be required.

dmajda · 2016-07-03T08:47:18Z

I assume that it can be fixed if in &-predicates the expectations will be produced, but they will still not generate an error.

Yes.

For ! predicates, perhaps, it is necessary to negative captured expectations (!"a" must produce expectation expected not "a"), but I can't invent an example in which this behavior would be required.

I’m thinking along similar lines — adding a negative reporting flag flipped by the ! predicate. It would cause the expectations generated by matching inside the ! predicate to be reported negated, except that instead of “not X” I’d use “something else than X”, which works better grammatically. Using ! inside ! would flip the flag again, so that e.g. a failed match in !(!"a")) would get reported positively, which corresponds to the semantics of this expression.

One thing I’m not sure about is how to express negative expectations in the expected property of syntax errors. For example, what expectations should matching an empty input with this grammar produce?

start = "a" / !"b" . / "c"

One approach is to embed the value of the negative reporting flag into the expectations:

[
  { type: "literal", text: "a", ignoreCase: false, negative: false },
  { type: "literal", text: "b", ignoreCase: false, negative: true },
  { type: "literal", text: "c", ignoreCase: false, negative: false }
]

Another approach is to introduce a new expectation type to express negation and use expectation nesting:

[
  { type: "literal", text: "a", ignoreCase: false },
  {
    type: "not",
    expectation: { type: "literal", text: "b", ignoreCase: false }
  },
  { type: "literal", text: "c", ignoreCase: false }
]

The first approach is simpler but it complicates the common case of positive expectations and it would probably lead to more complicated and slower code. The second approach introduces the complexity of nested expectations but keeps the common case as it is now and the code implementing it would likely be smaller and faster.

As for the usefulness, negative reporting will make some error messages more accurate. For example, consider this grammar:

string = "'" (!"'" .)* "'"

With input ’a it currently reports “Expected "'" or any character but end of input found.”, which is slightly incorrect. Reporting “Expected "'" or something else than "'" but end of input found.” is more precise (although it sounds a bit like Vulcan logic).

In any case, if we enable reporting for &, some kind of reporting for ! is needed for consistency across both predicates.

dmajda · 2016-07-15T14:36:06Z

There is a bigger problem with negative reporting than how to express negative expectations in the expected property of syntax errors. Consider this trivial grammar:

start = !"a"

Now let’s use it on input a. Given current architecture with the negative reporting flag added (toggled on the ! boundary) and silent failures for predicates discarded, "a" will be matched and the match will succeed, producing no expectations at all. But we’d like the match to produce a negative expectation, otherwise the error message and the expected property won’t be right. Reversed situation will occur with input not matching "a" — an expectation will be produced while we don’t want to have any.

To implement the desired behavior, we have to do one of two things:

Check for the value of the negative reporting flag in the successful path of matching code (implementation of ACCEPT_N and ACCEPT_STRING bytecode instructions) and produce an expectation if the flag is set. Similarly, check for the flag value in the unsuccessful path (implementation of FAIL) and don’t produce an expectation if the flag is set.
Track the value of the negative reporting flag statically during compilation and compile the code accordingly. Note this means two versions of the code would need to exist for any expression used both in and out of !.

Option 1 is relatively straightforward but will likely result in severe performance degradation (a need to check a flag on every match, successful or not), which makes it unviable in my eyes. Option 2 is somewhat complex to implement, but it won’t result in performance degradation compared to the current state (except when inside the ! predicate), which makes it a preferred option.

Given the relative complexity of implementation of option 2 and given that I plan to implement other static tracking code as part of performance optimization work going into 1.0.0, I’m moving this issue to 1.0.0 as well.

Mingun · 2017-11-07T18:41:39Z

The problem is more difficult, than it seems. The fact that the current test set contains the wrong tests hints at it at least and this error still remained unnoticed:

the positive predicate test only by chance returns the correct set of expectations. However absence of any expectations from &'b' shall come not because they are discarded, and just because expression &'b' in principle does not match any input, therefore, and you should not expect from it any expectations. See more below.
the negative predicate test shall return following 3 expectations, however the second variant is missing, though as it is easily possible to be convinced, it is accepted by a parser. The fact that in the generator there is a bug because of which the input c is not accepted, despite assurances is unexpected however:
- { type: "literal", text: "a", ignoreCase: false }
- { type: "end" }
- { type: "literal", text: "c", ignoreCase: false }

To get such results, it is necessary to remember that predicates do not move a parse position. It means that all expectations received from expression under a predicate shall be executed along with expectations which turn out from further expressions. Below analysis of test cases.

Case &'b' actually means &'b' EOF and generates 2 expectations required at the same time:

{ type: "literal", text: "b", ignoreCase: false }
{ type: "end" }

It is obvious that they are contradictory -- expression cannot be at the same time to correspond to the end of stream and the b character. Thus, it just is deleted from a result array of expectations.

Case !'b' turns in !'b' EOF, that gives the following expectations:

{ not: true, type: "literal", text: "b", ignoreCase: false } (assume that we use not key to invert expected value)
{ type: "end" }

Their combining and simplification of expression gives that the end of stream just expected. From this, in particular, follows that grammar

start = 'a' / !'b' / 'c'

is equivalent to

start = 'a' / !'b' EOF / 'c'
EOF = !.

and even grammar

start = 'a' / EOF / 'c'
EOF = !.

however only last 2 parse an input c as well as is expected.

…me FAIL to EXPECT. FAILED constant now pushed to the stack by the regular PUSH_FAILED opcode.

…tch result of expression

…med` node

…me FAIL to EXPECT. FAILED constant now pushed to the stack by the regular PUSH_FAILED opcode. Conflicts: lib/compiler/passes/generate-bytecode.js lib/compiler/passes/generate-js.js test/spec/unit/compiler/passes/generate-bytecode.spec.js

…tch result of expression Conflicts: lib/compiler/passes/generate-bytecode.js test/spec/unit/compiler/passes/generate-bytecode.spec.js

…med` node Conflicts: lib/compiler/passes/generate-bytecode.js test/spec/unit/compiler/passes/generate-bytecode.spec.js

Conflicts: lib/compiler/passes/generate-js.js test/spec/unit/compiler/passes/generate-bytecode.spec.js

Conflicts: test/spec/unit/compiler/passes/generate-bytecode.spec.js

dmajda changed the title ~~Lookahead makes nonsensical error messages~~ Syntactic predicates cause nonsensical error messages Aug 14, 2015

dmajda modified the milestones: 0.10.0, 0.9.0 Aug 14, 2015

dmajda modified the milestones: 1.0.0, 0.10.0 Jul 15, 2016

Mingun mentioned this issue Nov 12, 2017

State where the rule failed #538

Open

Mingun added a commit to Mingun/pegjs that referenced this issue Dec 20, 2017

Issue pegjs#194: Remove stack manipulations from FAIL opcode and rena…

0365edb

…me FAIL to EXPECT. FAILED constant now pushed to the stack by the regular PUSH_FAILED opcode.

Mingun added a commit to Mingun/pegjs that referenced this issue Dec 20, 2017

Issue pegjs#194: Always save the expected values regardless of the ma…

e507ed1

…tch result of expression

Mingun added a commit to Mingun/pegjs that referenced this issue Dec 20, 2017

Issue pegjs#194: Remove unnecessary check for match result in the `na…

b8cd196

…med` node

Mingun added a commit to Mingun/pegjs that referenced this issue Dec 20, 2017

Issue pegjs#194: Collect expectations from predicates

6e7566f

Mingun added a commit to Mingun/pegjs that referenced this issue Dec 20, 2017

Issue pegjs#194: Process expectations from predicates

4912536

Mingun added a commit to Mingun/pegjs that referenced this issue Dec 20, 2017

Issue pegjs#194: Regenerate parser

1be65e8

Mingun mentioned this issue Dec 20, 2017

Optimize silent fails: remove checks that always false #399

Merged

Mingun added a commit to Mingun/pegjs that referenced this issue Dec 20, 2017

Issue pegjs#194: Process expectations from predicates

fc56aba

Mingun added a commit to Mingun/pegjs that referenced this issue Dec 20, 2017

Issue pegjs#194: Regenerate parser

ad8bc25

futagoza closed this as completed in 669f782 Dec 31, 2017

Mingun added a commit to Mingun/pegjs that referenced this issue Jan 5, 2018

Issue pegjs#194: Remove unnecessary check for match result in the `na…

f334a26

…med` node Conflicts: lib/compiler/passes/generate-bytecode.js test/spec/unit/compiler/passes/generate-bytecode.spec.js

Mingun added a commit to Mingun/pegjs that referenced this issue Jan 5, 2018

Issue pegjs#194: Collect expectations from predicates

c7c91c6

Conflicts: lib/compiler/passes/generate-js.js test/spec/unit/compiler/passes/generate-bytecode.spec.js

Mingun added a commit to Mingun/pegjs that referenced this issue Jan 5, 2018

Issue pegjs#194: Process expectations from predicates

8f0bc7c

Conflicts: test/spec/unit/compiler/passes/generate-bytecode.spec.js

Mingun added a commit to Mingun/pegjs that referenced this issue Jan 5, 2018

Issue pegjs#194: Regenerate parser

5df4d07

Mingun referenced this issue Oct 31, 2018

Updated changelog for upcoming 0.11

87dcc13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Syntactic predicates cause nonsensical error messages #194

Syntactic predicates cause nonsensical error messages #194

KamilaBorowska commented Jul 31, 2013

dmajda commented Aug 16, 2013

dmajda commented Jun 28, 2016

Mingun commented Jun 28, 2016

dmajda commented Jul 3, 2016

dmajda commented Jul 15, 2016

Mingun commented Nov 7, 2017

Syntactic predicates cause nonsensical error messages #194

Syntactic predicates cause nonsensical error messages #194

Comments

KamilaBorowska commented Jul 31, 2013

dmajda commented Aug 16, 2013

dmajda commented Jun 28, 2016

Mingun commented Jun 28, 2016

dmajda commented Jul 3, 2016

dmajda commented Jul 15, 2016

Mingun commented Nov 7, 2017