Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report an error without tokens #208

Closed
UnkindPartition opened this issue May 9, 2017 · 14 comments
Closed

Report an error without tokens #208

UnkindPartition opened this issue May 9, 2017 · 14 comments

Comments

@UnkindPartition
Copy link
Contributor

Thanks to #191, I can prohibit one-element tuples in this way:

mb_pos <- MP.getNextTokenPosition
elems <- parens $ comp `MP.sepBy` (theSymbol ",")
if length elems == 1
  then do
    let Just pos = mb_pos -- isn't Nothing because we already parsed a tuple
    MP.setPosition pos
    fail "single-element tuples are not supported"
  else return $ SHTuple elems

This produces the following message:

test/examples/one-tuple.nml:3:20:
unexpected ‘(’
expecting identifier
single-element tuples are not supported

The error location is now correct, which is great. However, the unexpected ‘(’ and expecting identifier pieces are wrong and confusing, so I'd like to get rid of them.

I tried to remove them explicitly by setting the token sets to mempty:

MP.failure mempty mempty
  (Set.singleton $ MP.DecFail "single-element tuples are not supported")

But it didn't have any effect.

Ideally, there would be a fail-like function to only issue an error message without the unexpected-expected bit.

@mrkkrp
Copy link
Owner

mrkkrp commented May 9, 2017

Thanks for opening the issue. This looks weird, I would normally expect to see only fail message in this case. I will investigate this.

@mrkkrp
Copy link
Owner

mrkkrp commented May 9, 2017

Trynig to reproduce:

module Main (main) where

import Text.Megaparsec
import Text.Megaparsec.String
import qualified Text.Megaparsec.Lexer as L

sc :: Parser ()
sc = space

symbol :: String -> Parser String
symbol = L.symbol sc

parens :: Parser a -> Parser a
parens = between (symbol "(") (symbol ")")

comp :: Parser String
comp = L.lexeme sc (some letterChar)

parser :: Parser ()
parser = do
  mb_pos <- getNextTokenPosition
  elems <- parens $ comp `sepBy` (symbol ",")
  if length elems == 1
    then do
      let Just pos = mb_pos -- isn't Nothing because we already parsed a tuple
      setPosition pos
      fail "single-element tuples are not supported"
    else return ()

main :: IO ()
main = parseTest parser "(a)"

Run:

1:1:
single-element tuples are not supported

You probably provided not full context of what's going on there?

@mrkkrp
Copy link
Owner

mrkkrp commented May 9, 2017

I think you don't tell me that this parser with fail is an alternative that is tried after you try to parse an identifier without parentheses. This is an interesting case becasue normally you want to combine error messages from different branches that happen at the same spot.

@mrkkrp
Copy link
Owner

mrkkrp commented May 9, 2017

Perhaps the fix is to better render error messages in these cases. Becasue unexpected/expected stuff is obivously also valid, it just should be displayed in such a way that the user sees:

  1. There could be an identifier here, without parentheses.
  2. Single-tuples are not allowed.

Another thing to think about for version 6.

@mrkkrp
Copy link
Owner

mrkkrp commented May 9, 2017

What if you add the word “note” to the beginning of the fail message:

test/examples/one-tuple.nml:3:20:
unexpected ‘(’
expecting identifier
note: single-element tuples are not supported

This is a minor tweak but it starts to make a lot more sense this way.

@UnkindPartition
Copy link
Contributor Author

I think your diagnosis is correct, sorry for not providing a reproducible example from the beginning.

This is an interesting case becasue normally you want to combine error messages from different branches that happen at the same spot.

I would argue that this shouldn't apply to fail. These seem to me as two different types of errors:

  • I saw token X which is not allowed here; I'm not sure if you mean Y or Z
  • I know exactly what you mean; it's just not allowed

And so when you combine them, you end up with confusing error messages.

@mrkkrp
Copy link
Owner

mrkkrp commented May 9, 2017

This makes sense. Perhaps there should be a way to mark an error message as "overwriting" or something. Also when we have both expected tokens and custom messages, we should perhaps render the word "note:" between these sections. I'm consireding improving how error messages are represented and work in version 6. I'll keep you updated ask your opinion when I start on that.

@UnkindPartition
Copy link
Contributor Author

Cool, thanks!

@mrkkrp
Copy link
Owner

mrkkrp commented Jun 28, 2017

I've a solution in #223. I'll need to think a bit more about it though.

@mrkkrp
Copy link
Owner

mrkkrp commented Jun 28, 2017

What I don't like about that solution is that there is too much complexity in parse error merging. I wonder if ParseError should be a sum type with two possibilities: 1) unexpected/expected stuff 2) something custom/fancy. (2) should probably win over (1), it already includes fail messages and indentation-related messages which are hardly "mergeable" things. (2) also includes custom error messages with user defined data, we cannot know anything about those.

@mrkkrp
Copy link
Owner

mrkkrp commented Jun 28, 2017

Perhaps custom components should be merged and expected/unexpected too, but when there are two ParseErrors, one with custom message and another one with expected/unexpected stuff, custom should override it. This is consistent with existing logic when we don't add expected tokens from hints when we only have custom components, such a change would make the rule much more natural. Would it exclude useful error messages? How good error messages with mixed unexpected/expected stuff and custom components anyway? Hmm.

@UnkindPartition
Copy link
Contributor Author

UnkindPartition commented Jun 28, 2017

Yeah, as I wrote above, I don't think these two types of errors should be combined.

Or are you talking about something different?

@mrkkrp
Copy link
Owner

mrkkrp commented Jun 28, 2017

Well, in the current solution from #223, we have:

data ParseError t e = ParseError
  { errorPos :: NonEmpty SourcePos
    -- ^ Stack of source positions
  , errorUnexpected :: Set (ErrorItem t)
    -- ^ Unexpected items
  , errorExpected :: Set (ErrorItem t)
    -- ^ Expected items
  , errorFancy :: Set (ErrorFancy e)
    -- ^ Fancier errors
    --
    -- @since 6.0.0
  , errorOverride :: Bool
    -- ^ Whether the error should override other error instead of merging
    --
    -- @since 6.0.0
  } deriving (Show, Read, Eq, Data, Typeable, Generic)

mergeError :: (Ord t, Ord e)
  => ParseError t e
  -> ParseError t e
  -> ParseError t e
mergeError e1@(ParseError s1 u1 p1 x1 o1) e2@(ParseError s2 u2 p2 x2 o2) =
  case s1 `compare` s2 of
    LT -> e2
    EQ ->
      case (o1, o2) of
        (False, False) ->
          ParseError s1 (E.union u1 u2) (E.union p1 p2) (E.union x1 x2) o1
        (True,  _)    -> e1
        (False, True) -> e2
    GT -> e1

I'm thinking about making it:

data ParseError t e
  = TrivialError (NonEmpty SourcePos) (Set (ErrorItem t)) (Set (ErrorItem t))
    -- ^ unexpected\/expected
  | FancyError (NonEmpty SourcePos) (Set (ErrorFancy e))

And then changing mergeError to follow the logic I've described. I think it's worth trying. I can't remember a case when unexpected/expected stuff mixed with custom stuff were useful.

@UnkindPartition
Copy link
Contributor Author

Yes, agreed.

edwardgeorge pushed a commit to nstack/hschedule that referenced this issue May 23, 2018
There have been a few useful additions to megaparsec recently
(like mrkkrp/megaparsec#208);
plus, this includes the MonadFix ParsecT instance
(mrkkrp/megaparsec#225).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants