Superfluous rule needed to make parser work (4.7.2, Java target) #2689

andreasabel · 2019-11-24T13:55:11Z

The parser generated from the following grammar fails, unless I activate the commented-out superfluous rule:

parser grammar RightRecParser;
options { tokenVocab = RightRecLexer; }

listInteger returns [ int result ]
  : /* empty */
      { $result = 0; }

  | n=INTEGER m=listInteger
      { $result = Integer.parseInt($n.getText()) + $m.result; }
;

// // The following irrelevant rule makes the parser succeed:
// foo returns [ int result ]
//   : m=listInteger
//       { $result = $m.result; }
// ;

Behavior without foo:

echo "1 2" | java -cp .:/Users/abel/bnfc/testing/data/antlr-4.7.2-complete.jar RightRec.Test
line 1:0 no viable alternative at input '1'

Behavior with foo:

echo "1 2" | java -cp .:/Users/abel/bnfc/testing/data/antlr-4.7.2-complete.jar RightRec.Test
3

I would expect that an unused rule is not needed to generate a working parser. These grammars are originally auto-generated by the BNFC tool, --antlr backend. Adding redundant rules by hand is thus not a viable workaround.

Lexer is unsurprisingly:

lexer grammar RightRecLexer;

// Whitespace
WS             : (' ' | '\r' | '\t' | '\n' | '\f')+ ->  skip;

// Numbers
fragment DIGIT : [0-9] ;
INTEGER        : DIGIT+;

The full MWE is attached (including Makefile and Test wrapper).
bug.zip

The text was updated successfully, but these errors were encountered:

sharwell · 2019-11-25T00:00:07Z

Your entry rule is missing an explicit EOF, so this is likely a duplicate of #118.

ericvergnaud · 2019-11-25T00:38:12Z

Hi The place for support is the google discussion group Éric Envoyé de mon iPhone

…

Le 24 nov. 2019 à 21:55, Andreas Abel ***@***.***> a écrit : The parser generated from the following grammar fails, unless I activate the commented-out superfluous rule: parser grammar RightRecParser; options { tokenVocab = RightRecLexer; } listInteger returns [ int result ] : /* empty */ { $result = 0; } | n=INTEGER m=listInteger { $result = Integer.parseInt($n.getText()) + $m.result; } ; // // The following irrelevant rule makes the parser succeed: // foo returns [ int result ] // : m=listInteger // { $result = $m.result; } // ; Behavior without foo: echo "1 2" | java -cp .:/Users/abel/bnfc/testing/data/antlr-4.7.2-complete.jar RightRec.Test line 1:0 no viable alternative at input '1' Behavior with foo: echo "1 2" | java -cp .:/Users/abel/bnfc/testing/data/antlr-4.7.2-complete.jar RightRec.Test 3 I would expect that an unused rule is not needed to generate a working parser. These grammars are originally auto-generated by the BNFC tool, --antlr backend. Adding redundant rules by hand is thus not a viable workaround. Lexer is unsurprisingly: lexer grammar RightRecLexer; // Whitespace WS : (' ' | '\r' | '\t' | '\n' | '\f')+ -> skip; // Numbers fragment DIGIT : [0-9] ; INTEGER : DIGIT+; The full MWE is attached (including Makefile and Test wrapper). bug.zip — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

sharwell · 2019-11-25T04:56:41Z

@ericvergnaud this issue is related to a bug in the code.

andreasabel · 2019-11-25T10:02:13Z

@sharwell Thanks for the pointer!

Typical LR parser generators add, in a preprocessing step, a rule

   StartX : X EOF;

to the grammar if X is supposed to be usable as an entrypoint. (See, e.g., the happy parser generator for Haskell, or just standard textbooks on LR parsing, such as the Dragon Book.)
Users never talk about EOF in the LR parser specifications, it is handled automatically.

Thanks to your pointer I found the section in the ANTLR documentation that talks about EOF and start rules, https://github.com/antlr/antlr4/blob/master/doc/parser-rules.md#start-rules-and-eof . However, it does not say that adding an EOF rule might be necessary to give successful parses, only to reports errors in case only a proper prefix of the input matches the grammar.

I can work around this issue by emitting EOF rules for entrypoints when translating a grammar to ANTLR. But it comes as a surprise that ANTLR deviates so much from standard conventions of parser generators.

Maybe ANTLR could also create these new entrypoint symbols StartX automatically?

And the OP still exists: why does adding a bogus rule mentioning X fix the problem for X? There is still no mention of EOF.
(Maybe I should not be so surprised, since typical LR parser generators are also non-compositional in ways one only understands when studying the parsing algorithm in detail.)

sharwell · 2019-11-25T15:35:46Z

Technically it's a bug that the missing EOF scenario doesn't work. The bug isn't fixed because we haven't found a way to make the scenario work without a big performance impact that would affect scenarios where the EOF is present (i.e. the bug fix slows down cases that don't need the bug fix).

This works around an issue in ANTLR that parsers sometimes do not work if not created with such a start rule. See antlr/antlr4#2689

andreasabel · 2019-11-25T15:42:39Z

I fixed the ANTLR backend of BNFC to always emit EOF rules for entrypoints now (BNFC/bnfc#272), meaning that I am not dependent on the outcome of this issue any more.

andreasabel mentioned this issue Nov 24, 2019

List entrypoints reverse parsed lists (Haskell/C) BNFC/bnfc#163

Closed

andreasabel mentioned this issue Nov 25, 2019

ANTLR backend: start rules needed BNFC/bnfc#272

Closed

andreasabel added a commit to BNFC/bnfc that referenced this issue Nov 25, 2019

[ fixed #272 ] Generate start rules "start_X : X EOF" for ANTLR

2e99683

This works around an issue in ANTLR that parsers sometimes do not work if not created with such a start rule. See antlr/antlr4#2689

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Superfluous rule needed to make parser work (4.7.2, Java target) #2689

Superfluous rule needed to make parser work (4.7.2, Java target) #2689

andreasabel commented Nov 24, 2019

sharwell commented Nov 25, 2019

ericvergnaud commented Nov 25, 2019 via email

sharwell commented Nov 25, 2019

andreasabel commented Nov 25, 2019 •

edited

sharwell commented Nov 25, 2019

andreasabel commented Nov 25, 2019

Superfluous rule needed to make parser work (4.7.2, Java target) #2689

Superfluous rule needed to make parser work (4.7.2, Java target) #2689

Comments

andreasabel commented Nov 24, 2019

sharwell commented Nov 25, 2019

ericvergnaud commented Nov 25, 2019 via email

sharwell commented Nov 25, 2019

andreasabel commented Nov 25, 2019 • edited

sharwell commented Nov 25, 2019

andreasabel commented Nov 25, 2019

andreasabel commented Nov 25, 2019 •

edited