Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code generated is unpredictably incorrect #93

Closed
harpocrates opened this issue Aug 3, 2017 · 3 comments · Fixed by #102
Closed

Code generated is unpredictably incorrect #93

harpocrates opened this issue Aug 3, 2017 · 3 comments · Fixed by #102

Comments

@harpocrates
Copy link
Contributor

I've been running across what I think is the same bug several times in the last couple of months while working on a pretty big grammar. Here are some more extreme things I've witnessed that seem to indicate the code generated is incorrect:

  1. moving a production (for example to the top of the file) changes the behaviour of the generated code
  2. the presence or absence of a production (that is used nowhere!) changes the behaviour of the generated code

I've made a branch here which is as reduced an example as I've managed to get. There is one test suite included which has exactly one test case. Right now, I think the code generated by Happy (1.19.5 and HEAD) is incorrect (the test should pass). However, commenting out the pat_slice production (which is used nowhere) makes the test pass again. In fact, as I've tried to reduce this grammar over the past weeks, I'm now finding it difficult to comment out anything without the test passing again.

I realize this is a huge test case, but I am at a complete loss at what to do to reduce the grammar since doing so seems to remove the bug.

@harpocrates
Copy link
Contributor Author

FYI: I just updated the linked branch to circumvent the bug #91 (which you would hit first otherwise). Now, to replicate the behaviour I described above, just try building everything on with Happy 1.19.5 or HEAD, then compare the output of the test suite with or without the pat_slice production commented out.

Again, since this is such a big test case, I'd be glad to help step through anything that is unclear in this bug report...

@harpocrates
Copy link
Contributor Author

harpocrates commented Aug 12, 2017

The branch I linked in m first comment is definitely producing incorrect code. Here is the debug output (happy run with --debug --info=info.txt -agc) I get when running the test:

state: 1,	token: 118,	action: shift, enter state 58
state: 58,	token: 29,	action: reduce (rule 4), goto state 4
state: 4,	token: 29,	action: reduce (rule 282), goto state 22
state: 22,	token: 29,	action: reduce (rule 22), goto state 9
state: 9,	token: 29,	action: reduce (rule 19), goto state 7
state: 7,	token: 29,	action: reduce (rule 427), goto state 24
state: 24,	token: 29,	action: reduce (rule 127), goto state 11
state: 11,	token: 29,	action: shift, enter state 206
state: 206,	token: 118,	action: shift, enter state 58
state: 58,	token: 47,	action: reduce (rule 4), goto state 4
state: 4,	token: 47,	action: reduce (rule 282), goto state 22
state: 22,	token: 47,	action: reduce (rule 22), goto state 9
state: 9,	token: 47,	action: reduce (rule 19), goto state 7
state: 7,	token: 47,	action: reduce (rule 427), goto state 24
state: 24,	token: 47,	action: reduce (rule 127), goto state 268
state: 268,	token: 47,	action: reduce (rule 463), goto state 24     -- !!! Reduce
state: 24,	token: 47,	action: reduce (rule 127), goto state 11
state: 11,	token: 47,	action: shift, enter state 224
state: 224,	token: 50,	action: reduce (rule 919), goto state 250
state: 250,	token: 50,	action: shift, enter state 758
state: 758,	token: 137,	action: reduce (rule 431), goto state 24
state: 24,	token: 137,	action: reduce (rule 127), goto state 11
state: 11,	token: 137,	action: accept.

Yet, in info.txt state 268, token 47 is supposed to be a shift, not a reduce:

{- ... snipped ... -}

State 268

  {- ... snipped ... -}

  '('            shift, and enter state 224

  {- ... snipped ... -}

{- ... snipped ... -}

Note that with the pat_slice production commented out, I get the same info file (modulo some renumbering of rules since there are an extra two rules used nowhere), but the --debug output matches the info file with action: shift, enter state 224 at state: 268, token: 47 instead of a action: reduce (rule 463), goto state 24.


Here is my attempt at debugging so far:

  • bug is likely in produceActionTable TargetArrayBased (since everything works with -gc but not -agc)

  • bug is likely in mkTables: getting the action associated with state 268, token 47 (using the pseudocode in the comments above mkTables) returns

    LR'Shift 224 (Prio None 25)        with 'pat_slice' present   (correct shift rule)
    LR'Reduce 461 (Prio LeftAssoc 12)  with 'pat_slice' absent    (reduction rule for 'expr || expr')
    

    yet the action table passed in is always correct (it always has LR'Shift 224 (Prio None 25)).

  • the divergence is somewhere in the first and third entries of the tuple returned by mkTables
    since just those two are enough to extract the two actions of the previous bullet point in this
    list.

This is as far as I managed to get so far with the time I've had. Hope it helps! :)

@harpocrates
Copy link
Contributor Author

Alright. I've found what the issue is. I don't think it is directly related to anything above (apologies for the wild goose-chase) and I'm really not sure what the best way is to deal with this (if there is even something to do).

My happyTable has over 2^15 entries in it (my grammar has around 1800 states). That isn't a problem by itself, if it weren't for the fact that happyActOffsets stores indices pointing into happyTable in signed 16 bit numbers. Then, the issue is that the indices that are being written into happyActOffsets sometimes get too big. When that happens, they wrap around to the vicinity of -32000. However, Happy also uses negative indices for something else.

Briefly, here are some thoughts about this:

  • there ought to be a check in Happy telling you when you've wrapped around (and the code you will be generating is bogus)
  • the code in ProduceCode and GenericTemplate could do with some rewriting around wider width types (maybe 32 bit integers?)
  • I may have a trick for still using (almost all) of the negative numbers in the (existing) 16 bit integers

I'll probably open a PR sometime in the next couple of days for the first and third points.

harpocrates added a commit to harpocrates/happy that referenced this issue Sep 13, 2017
This fails on versions of Happy before PR haskell#102 was merged and passes afterwards.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant