Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More expressive functions in expression parsers #22

Closed
mrkkrp opened this issue Aug 22, 2015 · 18 comments
Closed

More expressive functions in expression parsers #22

mrkkrp opened this issue Aug 22, 2015 · 18 comments
Assignees
Milestone

Comments

@mrkkrp
Copy link
Owner

mrkkrp commented Aug 22, 2015

(Title is a sort of pan.) See haskell/parsec#32 for more information. This may be good thing to implement.

@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 22, 2015

The proposed change amounts to rewriting of Operator data type:

data Operator s u m a
  = Infix   (ParsecT s u m (a -> a -> a)) Assoc
  | Prefix  (ParsecT s u m (a -> a))
  | Postfix (ParsecT s u m (a -> a))

will become

data Operator s u m a
  = Infix   (a -> a -> ParsecT s u m a) Assoc
  | Prefix  (a -> ParsecT s u m a)
  | Postfix (a -> ParsecT s u m a)

I think this is doable and is actually a good idea that will make expression parsers more powerful. Apart from this change Text.Megaparsec.Expr shouldn't need any enhancements, maybe some refactoring.

If you have some comments regarding this particular change or would like to propose other changes related to Text.Megaparsec.Expr, now is the right time to discuss them.

@mrkkrp mrkkrp added this to the 4.0.0 milestone Aug 22, 2015
@mrkkrp mrkkrp self-assigned this Aug 22, 2015
@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 22, 2015

I think I'll rename buildExpressionParserexprParser, since the former is too long.

@abooij
Copy link
Contributor

abooij commented Aug 22, 2015

There also exist permute in Text.Megaparsec.Perm, and makeTokenParser in Text.Megaparsec.Token - all of which are functions that take some input data to generate a new parser. Perhaps this naming scheme can be unified?

@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 22, 2015

@tulcod, great idea. Do you have any specific suggestions?

@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 22, 2015

I think we should use make as part of the names to highlight the fact that these functions generate new parsers. So, maybe scheme makeXxxParser will work?

Then we can rename the functions this way:

  • permutemakePermParser
  • buildExpressionParsermakeExprParser
  • makeTokenParser remains the same

@abooij
Copy link
Contributor

abooij commented Aug 22, 2015

Actually, I was wrong. makeTokenParser takes a language definition and generates a GenTokenParser, which is a collection of Parsec-style parsers. But I think that's not a big problem. But to fit the naming scheme, it would have to be named makeLanguageTokenParser or something, which I think is wrong.

Here are some permutations:

  • expressionParser, permutationParser / permParser, tokenParser
  • vice versa: parseExpression, parsePermutation / parsePerm, parseToken
  • buildExpressionParser, buildPermutationParser / buildPermParser, buildTokenParser (I am personally not against long names if they are more descriptive)
  • ditto: makeExpressionParser, makePermutationParser / makePermParser, makeTokenParser
  • expression, permute, token (this is not quite grammatically uniform)
  • express, permute, tokenize (not sure if the latter expresses what we want)

So in retrospect I'm not quite sure about the token parser thing, but the expression and permutation parser combinators seem to fit this scheme quite well.

@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 22, 2015

Well, Text.Megaparsec.Token now doesn't have GenTokenParser, it's called simply TokenParser, so in a way makeTokenParser doesn't lie.

@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 22, 2015

Although result of makeTokenParser is often called lexer. Maybe we should rename the function and the data type too? After all LanguageDef is a good name, but TokenParser is rather misleading. It's not parser per se, it's collection of parsers that parse various tokens in a language.

@abooij
Copy link
Contributor

abooij commented Aug 22, 2015

That last bit definitely has my support.

So what about, say, makeExpressionParser, makePermutationParser, makeLexer, with TokenParser in Text.Megaparsec.Token replaced by Lexer?

Great work on megaparsec so far, by the way - thanks for all the effort, I'm sure this will attract more attention over time.

@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 22, 2015

I like this your proposition, although I would call it shorter, so makeExprParser and makePermParser — this is certainly the same, but easier to type and you can put more information on a line (theoretically, at least).

Great work on megaparsec so far, by the way - thanks for all the effort, I'm sure this will attract more attention over time.

Thank you very much! Most interesting thing so far is all the improvement in error messages, there is whole lot of cases when Megaparsec's error messages are more descriptive and precise.

@ahmadsalim
Copy link

This will also make things like parsing things like mixfix operators easier, since one could use additional information during further parsing.
E.g., one could make ? a post-fix operator and then ensure that the following expression must be e_1 : e_2 by setting a flag in the parsing state; this makes it possible to parse the ternary operator e_0 ? e_1 : e_2 without much hassel.

@abooij
Copy link
Contributor

abooij commented Aug 23, 2015

@ahmadsalim, wouldn't you normally do that using backtracking? where the syntax is

[expression] ? [expression] : [expression]

@ahmadsalim
Copy link

@tulcod This would avoid having backtracking, and provide better error messages! Also, having the syntax that way makes it left-recursive and you have to do factorization of the grammar to get things working.

@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 24, 2015

This seems to be not that easy to implement. Actually to parse operator with something like a -> a -> ParsecT s u m a you need to apply two first arguments. This means you need to parse arguments before parsing of operator itself, so infix operators don't fit this scheme for example.

Maybe I'm missing something? @ahmadsalim, can you clarify?

@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 25, 2015

I'm closing this for now, since it's not clear how to implement it. Feel free to reopen this issue if you can clarify implementation details.

For now work on Text.Megaparsec.Expr amounts to:

  • refactoring (makeExprParser should be written as combination of several little functions, each function should be documented, so future readers of code have easier time to understand how the whole thing works);
  • tests.

@mrkkrp mrkkrp closed this as completed Aug 25, 2015
@ahmadsalim
Copy link

Of course, you are right.
I think that the signatures I originally provided were wrong, what I really want is that it can accept operations which result changes the underlying monad m, not generate a parser depending on the arguments it get.

So, actually they should be (or similar):

data Operator s u m a
  = Infix   (ParsecT s u m (a -> a -> m a)) Assoc
  | Prefix  (ParsecT s u m (a -> m a))
  | Postfix (ParsecT s u m (a -> m a))

@mrkkrp
Copy link
Owner Author

mrkkrp commented Aug 25, 2015

@ahmadsalim, how this new definition of Operator is better than the original one? To use backtracking user state (e.g. to set some flag) you need to be inside ParsecT monad, not inside m, which often is just Indentity.

You can of course use something like StateMonad as m, but this would be rather strange since ParsecT is already instance of StateMonad and that state provided by m won't be backtracking.

Another option is:

data Operator s u m a
  = Infix   (ParsecT s u m (a -> a -> ParsecT s u m a)) Assoc
  | Prefix  (ParsecT s u m (a -> ParsecT s u m a))
  | Postfix (ParsecT s u m (a -> ParsecT s u m a))

In this case you can parse argument, operator, then second argument and apply the arguments to the operator, which will be able to fail or perform arbitrary actions inside ParsecT monad.

However, in vast majority of cases this system will be overcomplicated for end user.

@ahmadsalim
Copy link

@mrkkrp It allows me to record something in the state, and then check that in the context that calls the parser which can do the backtracking (which the pure thing cannot do).
But yeah, the option you mentioned is definitely better to use; although it is complicated, one could have the simpler ones as I mentioned in my original proposal and put an M suffix on the ones proposed here.

tomjaguarpaw pushed a commit to tomjaguarpaw/megaparsec that referenced this issue Sep 29, 2022
* [mrkkrp#6] Check for usage of 'head' for lists

Resolves mrkkrp#6

* Fix after review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants