Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: On parse error - show the next possible tokens #46

Merged
merged 3 commits into from
May 26, 2016

Conversation

da-x
Copy link
Contributor

@da-x da-x commented Apr 7, 2015

People asked for a patch upstream - so here's the place to start a discussion.

In this change, I've made sure that 'cabal test' still passes, and the interface to unmodified users is preserved. But you would probably have more comments.

@da-x
Copy link
Contributor Author

da-x commented Apr 16, 2015

I've added an additional commit to the branch, per your review. Feel free to squash/amend send more comments.

As for your request, this seems to be taking compiler/stage1/build/Parser.o from 2.2MB to 7.5MB. I'm guessing that if happy_explist_per_state was Int -> [ByteString], and with OverloadedStrings, along with unpack around its use perhaps would optimize it better? Or, maybe we can add a huge where clause instead of duplicating string literals?

@simonmar
Copy link
Member

300%+ increase in code size to too much (a lot too much!). In Happy-generated code we can use whatever tricks we like, since it's not meant to be user-editable code. For instance, we use unboxed values all over the place. I suggest there should be a single array of token names (each an unboxed string), and an unboxed array, indexed by state, with a bitmap of the allowed token values.

@da-x
Copy link
Contributor Author

da-x commented Apr 20, 2015

Followed your suggestion - now compiler/stage1/build/Parser.o is 2.4MB. Please see the commit added to the branch.

@da-x
Copy link
Contributor Author

da-x commented Jun 28, 2015

Just rechecking - is there anything more to be done for this?

@simonmar
Copy link
Member

simonmar commented Jul 2, 2015

This looks good. I'll find some time to test it and merge.

@da-x
Copy link
Contributor Author

da-x commented Sep 9, 2015

Rebased against simonmar/happy#master

However, cabal test seems to show some failures even with simonmar/happy#master, under ghc 7.10.2. With da-x/happy#master I get the same failures.

@da-x
Copy link
Contributor Author

da-x commented Nov 22, 2015

Rebased against simonmar/happy#master. Rebased my ghc changes to use the new happy feature on top of ghc#master.

Original branch simonmar/happy#master still fails cabal test with ghc 7.10.2:

monad002.g.hs:293:22:
    Couldn't match kind ‘*’ with ‘#’
    When matching types
      t0 :: *
      Happy_GHC_Exts.Int# :: #
    Expected type: t0
                   -> t0
                   -> Token
                   -> HappyState Token (t -> [Char] -> Int -> ParseResult a)
                   -> [HappyState Token (t -> [Char] -> Int -> ParseResult a)]
                   -> t
                   -> [Char]
                   -> Int
                   -> ParseResult a
      Actual type: Happy_GHC_Exts.Int#
                   -> Happy_GHC_Exts.Int#
                   -> Token
                   -> HappyState Token (t -> [Char] -> Int -> ParseResult a)
                   -> [HappyState Token (t -> [Char] -> Int -> ParseResult a)]
                   -> t
                   -> [Char]
                   -> Int
                   -> ParseResult a
    Relevant bindings include
      i :: t0 (bound at monad002.g.hs:293:18)
      cont :: t0 -> [Char] -> Int -> ParseResult a
        (bound at monad002.g.hs:293:13)
    In the expression: action i i tk (HappyState action) sts stk
    In an equation for ‘cont’:
        cont i = action i i tk (HappyState action) sts stk
    In the expression:
      let cont i = action i i tk (HappyState action) sts stk
      in
        case tk of {
          TokenEOF -> action 19# 19# tk (HappyState action) sts stk
          TokenLet -> cont 8#
          TokenIn -> cont 9#
          TokenInt happy_dollar_dollar -> cont 10#
          TokenVar happy_dollar_dollar -> cont 11#
          TokenEq -> cont 12#
          TokenPlus -> cont 13#
          TokenMinus -> cont 14#
          TokenTimes -> cont 15#
          TokenDiv -> cont 16#
          TokenOB -> cont 17#
          TokenCB -> cont 18#
          _ -> happyError' tk }

monad002.g.hs:296:26:
    Couldn't match kind ‘*’ with ‘#’
    When matching types
      t0 :: *
      Happy_GHC_Exts.Int# :: #
    Relevant bindings include
      cont :: t0 -> [Char] -> Int -> ParseResult a
        (bound at monad002.g.hs:293:13)
    In the first argument of ‘cont’, namely ‘8#’
    In the expression: cont 8#
    In a case alternative: TokenLet -> cont 8#

This change introduces the %errorsig attribute, that changes
the API of the provided handler function. A [String] of
possible next token can be passed to the function in order
to provide a better error message to the user.

We need to pass over the list of possible token names to the
happyFail function. For example:

    action_12 (12) = happyShift action_9
    action_12 (13) = happyShift action_10
    action_12 (14) = happyShift action_11
    action_12 (15) = happyShift action_12
    action_12 (5) = happyGoto action_15
    action_12 (6) = happyGoto action_5
    action_12 (7) = happyGoto action_6
    action_12 (8) = happyGoto action_7
    action_12 (9) = happyGoto action_8
    action_12 _ = happyFail ["idT","numT","boolT","\"(\""]

Signed-off-by: Dan Aloni <dan@kernelim.com>
 * Rename %errorsig to %errorhandletype
 * Detect invalid values for %errorhandletype at parsing stage,
   not code generation stage.
 * Document %errorhandlertype
 * Refactoring at the code generation function.

Signed-off-by: Dan Aloni <dan@kernelim.com>
Long list of literal strings, one string for each possible token
in a state, increases parser object size significantly.

Instead,

 * Use a bit matrix of the possiblility of a token in a state,
   generated along with the other parser tables.
 * Data.Array is now always imported, for storing the array.
 * Instead of inlining the result of possibleShifts, we now
   have a compact produceExpListArray that we can use in both
   the array-based parser and the non-array-based parser.

Signed-off-by: Dan Aloni <dan@kernelim.com>
@sinelaw
Copy link

sinelaw commented May 16, 2016

+1

Is there anything missing for this to be merged?

@simonmar
Copy link
Member

simonmar commented May 17, 2016

I got test failures, e.g.

../dist/build/happy/happy --strict --template=.. -g monad002.ly -o monad002.g.hs
ghc -hide-all-packages -package base -package array -package mtl -Werror -fforce-recomp  monad002.g.hs -o monad002.g.bin
[1 of 1] Compiling Main             ( monad002.g.hs, monad002.g.o )

monad002.g.hs:311:22:
    Couldn't match kind ‘*’ with ‘#’
    When matching types
      t0 :: *
      Happy_GHC_Exts.Int# :: #

(edited, had the wrong error message)

@da-x
Copy link
Contributor Author

da-x commented May 18, 2016

Seem related to something else - you get the same errors without this pull request, using yesterday's 1.19.6 tag, due to GHC 7.10.

With GHC 7.8.4 the tests pass (and separately, also with the changes here).

@da-x
Copy link
Contributor Author

da-x commented May 21, 2016

Opened a separate pull request for that test error.

@simonmar simonmar merged commit 9c11e1c into haskell:master May 26, 2016
@simonmar
Copy link
Member

This broke the Travis build, could you take a look please? https://travis-ci.org/simonmar/happy/builds/133044562

@da-x
Copy link
Contributor Author

da-x commented May 26, 2016

Looking into this

@da-x
Copy link
Contributor Author

da-x commented May 26, 2016

Fixed in a new pull request #68

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants