More expressive functions in expression parsers #22

mrkkrp · 2015-08-22T08:33:54Z

(Title is a sort of pan.) See haskell/parsec#32 for more information. This may be good thing to implement.

mrkkrp · 2015-08-22T13:44:38Z

The proposed change amounts to rewriting of Operator data type:

data Operator s u m a
  = Infix   (ParsecT s u m (a -> a -> a)) Assoc
  | Prefix  (ParsecT s u m (a -> a))
  | Postfix (ParsecT s u m (a -> a))

will become

data Operator s u m a
  = Infix   (a -> a -> ParsecT s u m a) Assoc
  | Prefix  (a -> ParsecT s u m a)
  | Postfix (a -> ParsecT s u m a)

I think this is doable and is actually a good idea that will make expression parsers more powerful. Apart from this change Text.Megaparsec.Expr shouldn't need any enhancements, maybe some refactoring.

If you have some comments regarding this particular change or would like to propose other changes related to Text.Megaparsec.Expr, now is the right time to discuss them.

mrkkrp · 2015-08-22T13:56:34Z

I think I'll rename buildExpressionParser → exprParser, since the former is too long.

abooij · 2015-08-22T14:02:01Z

There also exist permute in Text.Megaparsec.Perm, and makeTokenParser in Text.Megaparsec.Token - all of which are functions that take some input data to generate a new parser. Perhaps this naming scheme can be unified?

mrkkrp · 2015-08-22T14:03:46Z

@tulcod, great idea. Do you have any specific suggestions?

mrkkrp · 2015-08-22T14:09:55Z

I think we should use make as part of the names to highlight the fact that these functions generate new parsers. So, maybe scheme makeXxxParser will work?

Then we can rename the functions this way:

permute → makePermParser
buildExpressionParser → makeExprParser
makeTokenParser remains the same

abooij · 2015-08-22T14:14:53Z

Actually, I was wrong. makeTokenParser takes a language definition and generates a GenTokenParser, which is a collection of Parsec-style parsers. But I think that's not a big problem. But to fit the naming scheme, it would have to be named makeLanguageTokenParser or something, which I think is wrong.

Here are some permutations:

expressionParser, permutationParser / permParser, tokenParser
vice versa: parseExpression, parsePermutation / parsePerm, parseToken
buildExpressionParser, buildPermutationParser / buildPermParser, buildTokenParser (I am personally not against long names if they are more descriptive)
ditto: makeExpressionParser, makePermutationParser / makePermParser, makeTokenParser
expression, permute, token (this is not quite grammatically uniform)
express, permute, tokenize (not sure if the latter expresses what we want)

So in retrospect I'm not quite sure about the token parser thing, but the expression and permutation parser combinators seem to fit this scheme quite well.

mrkkrp · 2015-08-22T14:20:55Z

Well, Text.Megaparsec.Token now doesn't have GenTokenParser, it's called simply TokenParser, so in a way makeTokenParser doesn't lie.

mrkkrp · 2015-08-22T14:26:20Z

Although result of makeTokenParser is often called lexer. Maybe we should rename the function and the data type too? After all LanguageDef is a good name, but TokenParser is rather misleading. It's not parser per se, it's collection of parsers that parse various tokens in a language.

abooij · 2015-08-22T14:31:34Z

That last bit definitely has my support.

So what about, say, makeExpressionParser, makePermutationParser, makeLexer, with TokenParser in Text.Megaparsec.Token replaced by Lexer?

Great work on megaparsec so far, by the way - thanks for all the effort, I'm sure this will attract more attention over time.

mrkkrp · 2015-08-22T14:39:46Z

I like this your proposition, although I would call it shorter, so makeExprParser and makePermParser — this is certainly the same, but easier to type and you can put more information on a line (theoretically, at least).

Great work on megaparsec so far, by the way - thanks for all the effort, I'm sure this will attract more attention over time.

Thank you very much! Most interesting thing so far is all the improvement in error messages, there is whole lot of cases when Megaparsec's error messages are more descriptive and precise.

ahmadsalim · 2015-08-23T15:27:10Z

This will also make things like parsing things like mixfix operators easier, since one could use additional information during further parsing.
E.g., one could make ? a post-fix operator and then ensure that the following expression must be e_1 : e_2 by setting a flag in the parsing state; this makes it possible to parse the ternary operator e_0 ? e_1 : e_2 without much hassel.

abooij · 2015-08-23T15:44:45Z

@ahmadsalim, wouldn't you normally do that using backtracking? where the syntax is

[expression] ? [expression] : [expression]

ahmadsalim · 2015-08-23T16:17:16Z

@tulcod This would avoid having backtracking, and provide better error messages! Also, having the syntax that way makes it left-recursive and you have to do factorization of the grammar to get things working.

mrkkrp · 2015-08-24T15:45:07Z

This seems to be not that easy to implement. Actually to parse operator with something like a -> a -> ParsecT s u m a you need to apply two first arguments. This means you need to parse arguments before parsing of operator itself, so infix operators don't fit this scheme for example.

Maybe I'm missing something? @ahmadsalim, can you clarify?

mrkkrp · 2015-08-25T11:34:09Z

I'm closing this for now, since it's not clear how to implement it. Feel free to reopen this issue if you can clarify implementation details.

For now work on Text.Megaparsec.Expr amounts to:

refactoring (makeExprParser should be written as combination of several little functions, each function should be documented, so future readers of code have easier time to understand how the whole thing works);
tests.

ahmadsalim · 2015-08-25T11:57:00Z

Of course, you are right.
I think that the signatures I originally provided were wrong, what I really want is that it can accept operations which result changes the underlying monad m, not generate a parser depending on the arguments it get.

So, actually they should be (or similar):

data Operator s u m a
  = Infix   (ParsecT s u m (a -> a -> m a)) Assoc
  | Prefix  (ParsecT s u m (a -> m a))
  | Postfix (ParsecT s u m (a -> m a))

mrkkrp · 2015-08-25T12:23:15Z

@ahmadsalim, how this new definition of Operator is better than the original one? To use backtracking user state (e.g. to set some flag) you need to be inside ParsecT monad, not inside m, which often is just Indentity.

You can of course use something like StateMonad as m, but this would be rather strange since ParsecT is already instance of StateMonad and that state provided by m won't be backtracking.

Another option is:

data Operator s u m a
  = Infix   (ParsecT s u m (a -> a -> ParsecT s u m a)) Assoc
  | Prefix  (ParsecT s u m (a -> ParsecT s u m a))
  | Postfix (ParsecT s u m (a -> ParsecT s u m a))

In this case you can parse argument, operator, then second argument and apply the arguments to the operator, which will be able to fail or perform arbitrary actions inside ParsecT monad.

However, in vast majority of cases this system will be overcomplicated for end user.

ahmadsalim · 2015-08-25T13:23:00Z

@mrkkrp It allows me to record something in the state, and then check that in the context that calls the parser which can do the backtracking (which the pure thing cannot do).
But yeah, the option you mentioned is definitely better to use; although it is complicated, one could have the simpler ones as I mentioned in my original proposal and put an M suffix on the ones proposed here.

* [mrkkrp#6] Check for usage of 'head' for lists Resolves mrkkrp#6 * Fix after review

mrkkrp mentioned this issue Aug 22, 2015

Allow more expressive functions to be given as input in operator tables for buildExpressionParser haskell/parsec#32

Closed

mrkkrp added the enhancement label Aug 22, 2015

mrkkrp added this to the 4.0.0 milestone Aug 22, 2015

mrkkrp self-assigned this Aug 22, 2015

mrkkrp closed this as completed Aug 25, 2015

tomjaguarpaw pushed a commit to tomjaguarpaw/megaparsec that referenced this issue Sep 29, 2022

[mrkkrp#6] Check for usage of 'head' for lists (mrkkrp#22)

ebebd66

* [mrkkrp#6] Check for usage of 'head' for lists Resolves mrkkrp#6 * Fix after review

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More expressive functions in expression parsers #22

More expressive functions in expression parsers #22

mrkkrp commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

abooij commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

abooij commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

abooij commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

ahmadsalim commented Aug 23, 2015

abooij commented Aug 23, 2015

ahmadsalim commented Aug 23, 2015

mrkkrp commented Aug 24, 2015

mrkkrp commented Aug 25, 2015

ahmadsalim commented Aug 25, 2015

mrkkrp commented Aug 25, 2015

ahmadsalim commented Aug 25, 2015

More expressive functions in expression parsers #22

More expressive functions in expression parsers #22

Comments

mrkkrp commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

abooij commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

abooij commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

abooij commented Aug 22, 2015

mrkkrp commented Aug 22, 2015

ahmadsalim commented Aug 23, 2015

abooij commented Aug 23, 2015

ahmadsalim commented Aug 23, 2015

mrkkrp commented Aug 24, 2015

mrkkrp commented Aug 25, 2015

ahmadsalim commented Aug 25, 2015

mrkkrp commented Aug 25, 2015

ahmadsalim commented Aug 25, 2015