Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bison crash on counterexamples report #71

Closed
zmajeed opened this issue Jan 23, 2021 · 6 comments
Closed

Bison crash on counterexamples report #71

zmajeed opened this issue Jan 23, 2021 · 6 comments

Comments

@zmajeed
Copy link

zmajeed commented Jan 23, 2021

This bug was originally reported on the bug-bison mailing list by Michal Bartkowiak on 5 Jan 2021 - https://lists.gnu.org/archive/html/bug-bison/2021-01/msg00000.html

I've investigated it and am opening an issue here for ease of tracking - many thanks Akim btw for creating this github mirror!

Here's what happens with the grammar file parser.yc.gz provided by Michal:

$ bison --report=counterexamples parser.yc
parser.yc: warning: 2 shift/reduce conflicts [-Wconflicts-sr]
parser.yc: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
parser.yc: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
Segmentation fault (core dumped)

This is bison built from master

$ bison --version
bison (GNU Bison) 3.7.4.284-fb14

Here's the stack at the crash:

Thread 1 "bison" received signal SIGSEGV, Segmentation fault.
0x000000010041cfc0 in eligible_state_items (target=0x800061460) at lssi.c:141
141           BITSET_FOR_EACH (biter, rsi, sin, 0)
#0  eligible_state_items (target=0x800061460) at lssi.c:141
#1  0x000000010041d087 in shortest_path_from_start (target=28, next_sym=8) at lssi.c:156
#2  0x000000010040bf6d in counterexample_report (itm1=28, itm2=29, next_sym=8, shift_reduce=false, out=0x800050318,
    prefix=0x1004fe958 <__func__.0+384> "    ") at counterexample.c:1255
#3  0x000000010040c7d4 in counterexample_report_reduce_reduce (Reading in symbols for print.c...
itm1=28, itm2=29, conflict_syms=0x800066b80, out=0x800050318, prefix=0x1004fe958 <__func__.0+384> "    ")
    at counterexample.c:1350
#4  0x000000010040cb3c in counterexample_report_state (s=0x80005ea00, out=0x800050318,
    prefix=0x1004fe958 <__func__.0+384> "    ") at counterexample.c:1394
#5  0x000000010043ee02 in print_state (Reading in symbols for main.c...
out=0x800050318, s=0x80005ea00) at print.c:366
#6  0x000000010043f42d in print_results () at print.c:473
#7  0x000000010041e0fc in main (argc=3, argv=0xffffcc10) at main.c:179

This is the source in lssi.c

(gdb) l
136           bitset_set (result, si - state_items);
137           // search all reverse edges.
138           bitset rsi = si->revs;
139           bitset_iterator biter;
140           state_item_number sin;
141           BITSET_FOR_EACH (biter, rsi, sin, 0)
142             gl_list_add_last (queue, &state_items[sin]);

The reason is the bitset rsi passed to BITSET_FOR_EACH is from a disabled state_item si - meaning the bitset was previously freed and has garbage vtable pointers - in particular the list function pointer which is accessed by gl_list_add_last causing the segv

This is the disabled state_item

(gdb) p si.trans
$2 = -2
(gdb) p *si.item
$3 = 15
(gdb) p *si.state
$4 = {
  number = 13,
  accessing_symbol = 4,
  consistent = false,
  solved_conflicts = 0x0,
  solved_conflicts_xml = 0x0,
  nitems = 2,
  items = {22}
}

The state_item was disabled and the bitset was freed by prune_backward in state-items.c:

#0  disable_state_item (si=0x8000615e0) at state-item.c:381
#1  0x000000010045a726 in prune_backward (si=0x800061520) at state-item.c:453
#2  0x000000010045a7c8 in prune_disabled_paths () at state-item.c:471
#3  0x000000010045adf0 in state_items_init () at state-item.c:557
#4  0x000000010040bf00 in counterexample_init () at counterexample.c:1229
#5  0x000000010041e07c in main (argc=3, argv=0xffffcc10) at main.c:154

(gdb) p *si.state
$9 = {
  number = 13,
  accessing_symbol = 4,
  consistent = false,
  solved_conflicts = 0x0,
  solved_conflicts_xml = 0x0,
  nitems = 2,
  items = {22}
}

(gdb) p *si.item
$7 = 15

Clearly the loop in eligible_state_items should not be accessing a disabled state_item - but I don't know enough to say if the loop should avoid adding a disabled state_item to the queue in the first place

I have some comments on the grammar below

@zmajeed
Copy link
Author

zmajeed commented Jan 23, 2021

This is the grammar file:

// parser.yc
%left IDENTIFIER
%left '-'

%%

Start
    : '(' FunctionFormalParList IDENTIFIER {}
    ;

FunctionFormalParList
    : FunctionFormalPar                            {}
    | FunctionFormalParList ',' FunctionFormalPar  {}
    ;

FunctionFormalPar
    : IDENTIFIER IDENTIFIER ":=" SingleExpression            {}
    | IDENTIFIER IDENTIFIER ":=" '-'                       {}
    ;

IdOrIdId
    : IDENTIFIER                 {}
    ;

TypeParams_front
    : '<' IDENTIFIER     {}
    ;

ActualParams_back
    : ActualParList ')'     {}
    ;

ActualParList
    : SingleExpression {}
    | ActualParList ',' SingleExpression {}
    ;

SingleExpression
    : SingleExpression '<' SingleExpression        {}
    | '-' SingleExpression {}
    | IDENTIFIER                                  {}
    | IdOrIdId TypeParams_front ">(" ActualParams_back {}
    ;

I noticed couple things

There is no crash if either of the %left directives is commented out - also there is no crash if the IdOrIdId rule is removed - of course both these changes require some minor adjustments for bison to work

$ bison --report=counterexamples -Wcounterexamples parser.yc
parser.yc: warning: 3 shift/reduce conflicts [-Wconflicts-sr]
parser.yc: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
parser.yc: warning: reduce/reduce conflict on token '<' [-Wcounterexamples]
  Example: IDENTIFIER •
  First reduce derivation
    IdOrIdId
    ↳ 6: IDENTIFIER •
  Second reduce derivation
    SingleExpression
    ↳ 13: IDENTIFIER •
parser.yc: warning: shift/reduce conflict on token IDENTIFIER [-Wcounterexamples]
Productions leading up to the conflict state found.  Still finding a possible unifying counterexample...time limit exceeded: 6.000000
  First example: ":=" '-' • IDENTIFIER SingleExpression IDENTIFIER
  Shift derivation
    Start
    ↳ 1: ":=" FunctionFormalParList                            IDENTIFIER
              ↳ 2: FunctionFormalPar
                   ↳ 4: '-' IdOrIdId          SingleExpression
                            ↳ 6: • IDENTIFIER
  Second example: IDENTIFIER IDENTIFIER ":=" '-' • IDENTIFIER
  Reduce derivation
    Start
    ↳ 1: FunctionFormalParList                      IDENTIFIER
         ↳ 2: FunctionFormalPar
              ↳ 5: IDENTIFIER IDENTIFIER ":=" '-' •
parser.yc: warning: shift/reduce conflict on token IDENTIFIER [-Wcounterexamples]
Productions leading up to the conflict state found.  Still finding a possible unifying counterexample...time limit exceeded: 6.000000
  First example: IDENTIFIER IDENTIFIER ":=" '-' • IDENTIFIER IDENTIFIER
  Shift derivation
    Start
    ↳ 1: FunctionFormalParList                                             IDENTIFIER
         ↳ 2: FunctionFormalPar
              ↳ 4: IDENTIFIER IDENTIFIER ":=" SingleExpression
                                              ↳ 12: '-' SingleExpression
                                                        ↳ 13: • IDENTIFIER
  Second example: IDENTIFIER IDENTIFIER ":=" '-' • IDENTIFIER
  Reduce derivation
    Start
    ↳ 1: FunctionFormalParList                      IDENTIFIER
         ↳ 2: FunctionFormalPar
              ↳ 5: IDENTIFIER IDENTIFIER ":=" '-' •
parser.yc: warning: shift/reduce conflict on token '<' [-Wcounterexamples]
  Example: '-' SingleExpression • '<' SingleExpression
  Shift derivation
    SingleExpression
    ↳ 12: '-' SingleExpression
              ↳ 11: SingleExpression • '<' SingleExpression
  Reduce derivation
    SingleExpression
    ↳ 11: SingleExpression             '<' SingleExpression
          ↳ 12: '-' SingleExpression •
parser.yc: warning: shift/reduce conflict on token '<' [-Wcounterexamples]
  Example: SingleExpression '<' SingleExpression • '<' SingleExpression
  Shift derivation
    SingleExpression
    ↳ 11: SingleExpression '<' SingleExpression
                               ↳ 11: SingleExpression • '<' SingleExpression
  Reduce derivation
    SingleExpression
    ↳ 11: SingleExpression                              '<' SingleExpression
          ↳ 11: SingleExpression '<' SingleExpression •
Productions leading up to the conflict state found.  Still finding a possible unifying counterexample...time limit exceeded: 6.000000
Productions leading up to the conflict state found.  Still finding a possible unifying counterexample...time limit exceeded: 6.000000

@akimd
Copy link
Owner

akimd commented Jan 23, 2021

Hi @zmajeed. Thanks for the analysis. I reached the same conclusion. With a bit of luck, @Xaec6, the original author of cex in Bison, will be able to spend some time on this issue.
Cheers!

@zmajeed
Copy link
Author

zmajeed commented Jan 23, 2021

This fix seems reasonable - can I make a PR to this repo? Or do you prefer a patch to bug-bison?

$ git diff lssi.c
diff --git a/src/lssi.c b/src/lssi.c
index 245461a9..045772d3 100644
--- a/src/lssi.c
+++ b/src/lssi.c
@@ -139,7 +139,11 @@ eligible_state_items (state_item *target)
       bitset_iterator biter;
       state_item_number sin;
       BITSET_FOR_EACH (biter, rsi, sin, 0)
-        gl_list_add_last (queue, &state_items[sin]);
+        {
+          if (SI_DISABLED(sin))
+            continue;
+          gl_list_add_last (queue, &state_items[sin]);
+        }
     }
   gl_list_free (queue);
   return result;

No crash

$ bison --report=counterexamples -Wcounterexamples parser.yc
parser.yc: warning: 2 shift/reduce conflicts [-Wconflicts-sr]
parser.yc: warning: 1 reduce/reduce conflict [-Wconflicts-rr]
parser.yc: warning: reduce/reduce conflict on token '<' [-Wcounterexamples]
  Example: IDENTIFIER •
  First reduce derivation
    IdOrIdId
    ↳ 6: IDENTIFIER •
  Second reduce derivation
    SingleExpression
    ↳ 13: IDENTIFIER •
parser.yc: warning: shift/reduce conflict on token '<' [-Wcounterexamples]
  Example: '-' SingleExpression • '<' SingleExpression
  Shift derivation
    SingleExpression
    ↳ 12: '-' SingleExpression
              ↳ 11: SingleExpression • '<' SingleExpression
  Reduce derivation
    SingleExpression
    ↳ 11: SingleExpression             '<' SingleExpression
          ↳ 12: '-' SingleExpression •
parser.yc: warning: shift/reduce conflict on token '<' [-Wcounterexamples]
  Example: SingleExpression '<' SingleExpression • '<' SingleExpression
  Shift derivation
    SingleExpression
    ↳ 11: SingleExpression '<' SingleExpression
                               ↳ 11: SingleExpression • '<' SingleExpression
  Reduce derivation
    SingleExpression
    ↳ 11: SingleExpression                              '<' SingleExpression
          ↳ 11: SingleExpression '<' SingleExpression •

@Xaec6
Copy link
Contributor

Xaec6 commented Jan 23, 2021

That's not the issue. The state-item pruning is misbehaving, I'm working on it

@zmajeed
Copy link
Author

zmajeed commented Jan 23, 2021

ok - good to hear

akimd pushed a commit that referenced this issue Jan 24, 2021
There were several bugs in pruning that would leave the state-item
graph in an inconsistent state which could cause crashes later on:

- Pruning now happens in one pass instead of two.

- Disabled state-items no longer prune the state-items they transition
  to if that state-item has other states that transition to it.

- State-items that transition to disabled state-items are always
  pruned even if they have productions.

Reported by Michal Bartkowiak <michal.bartkowiak@nokia.com>
https://lists.gnu.org/r/bug-bison/2021-01/msg00000.html
and Zartaj Majeed
#71

* src/state-item.c (prune_forward, prune_backward): Fuse into...
(prune_state_item): this.
Adjust callers.
akimd pushed a commit that referenced this issue Jan 24, 2021
There were several bugs in pruning that would leave the state-item
graph in an inconsistent state which could cause crashes later on:

- Pruning now happens in one pass instead of two.

- Disabled state-items no longer prune the state-items they transition
  to if that state-item has other states that transition to it.

- State-items that transition to disabled state-items are always
  pruned even if they have productions.

Reported by Michal Bartkowiak <michal.bartkowiak@nokia.com>
https://lists.gnu.org/r/bug-bison/2021-01/msg00000.html
and Zartaj Majeed
#71

* src/state-item.c (prune_forward, prune_backward): Fuse into...
(prune_state_item): this.
Adjust callers.
@akimd
Copy link
Owner

akimd commented Jan 24, 2021

Released in 3.7.5.

@akimd akimd closed this as completed Jan 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants