Precedence parsing #1362

cenodis · 2021-08-15T15:48:30Z

This PR introduces a new module with parsers that can handle operator precedence in expressions. It was written with the intent to allow quick and easy translation of a usual operator precedence table into a nom parser. Individual operators and operands are parsed via nom parsers allowing for easy integration with existing parsers.

Example usage (very simple integer calculator):

fn parser(i: &str) -> IResult<&str, i64> {
  precedence(
    unary_op(1, tag("-")), //prefix operators
    fail, //postfix operators
    alt(( //binary operators
      binary_op(2, Assoc::Left, tag("*")),
      binary_op(2, Assoc::Left, tag("/")),
      binary_op(3, Assoc::Left, tag("+")),
      binary_op(3, Assoc::Left, tag("-")),
    )),
    alt(( //operands
      map_res(digit1, |s: &str| s.parse::<i64>()),
      delimited(tag("("), parser, tag(")")), //subexpression handled via recursion
    )),
    |op: Operation<&str, &str, &str, i64>| { //evaluating the expression step by step
      use nom::precedence::Operation::*;
      match op {
        Prefix("-", o) => Ok(-o),
        Binary(lhs, "*", rhs) => Ok(lhs * rhs),
        Binary(lhs, "/", rhs) => Ok(lhs / rhs),
        Binary(lhs, "+", rhs) => Ok(lhs + rhs),
        Binary(lhs, "-", rhs) => Ok(lhs - rhs),
        _ => Err("Invalid combination"),
      }
    }
  )(i)
}
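
For readers who want to follow the evaluation without building this branch, here is a minimal std-only sketch of precedence climbing over the same operator table. It is an illustration, not the implementation in this PR: `Cursor`, `operand`, `expr` and `eval` are hypothetical names, and it assumes (as the table in the example suggests) that a lower precedence number binds tighter.

```rust
// Hypothetical, std-only sketch; none of these names come from nom.
struct Cursor<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> Cursor<'a> {
    fn peek(&self) -> Option<u8> {
        self.bytes.get(self.pos).copied()
    }

    // Operand: an integer literal, a parenthesised subexpression,
    // or a prefix minus applied to another operand.
    fn operand(&mut self) -> Result<i64, String> {
        match self.peek() {
            Some(b'-') => {
                self.pos += 1;
                // Prefix minus (precedence 1) binds tighter than any binary
                // operator, so it only applies to the next operand.
                Ok(-self.operand()?)
            }
            Some(b'(') => {
                self.pos += 1;
                let value = self.expr(u8::MAX)?;
                if self.peek() != Some(b')') {
                    return Err("expected ')'".to_string());
                }
                self.pos += 1;
                Ok(value)
            }
            Some(c) if c.is_ascii_digit() => {
                let start = self.pos;
                while matches!(self.peek(), Some(d) if d.is_ascii_digit()) {
                    self.pos += 1;
                }
                let text = std::str::from_utf8(&self.bytes[start..self.pos])
                    .map_err(|e| e.to_string())?;
                text.parse::<i64>().map_err(|e| e.to_string())
            }
            _ => Err(format!("expected operand at byte {}", self.pos)),
        }
    }

    // Folds binary operators whose precedence number is <= max_prec.
    fn expr(&mut self, max_prec: u8) -> Result<i64, String> {
        let mut lhs = self.operand()?;
        while let Some(op) = self.peek() {
            let prec: u8 = match op {
                b'*' | b'/' => 2,
                b'+' | b'-' => 3,
                _ => break,
            };
            if prec > max_prec {
                break;
            }
            self.pos += 1;
            // Left associativity: the right-hand side may only contain
            // operators that bind strictly tighter than the current one.
            let rhs = self.expr(prec - 1)?;
            lhs = match op {
                b'*' => lhs * rhs,
                b'/' => {
                    if rhs == 0 {
                        return Err("division by zero".to_string());
                    }
                    lhs / rhs
                }
                b'+' => lhs + rhs,
                _ => lhs - rhs,
            };
        }
        Ok(lhs)
    }
}

fn eval(input: &str) -> Result<i64, String> {
    let mut cursor = Cursor { bytes: input.as_bytes(), pos: 0 };
    let value = cursor.expr(u8::MAX)?;
    if cursor.pos != input.len() {
        return Err(format!("unexpected input at byte {}", cursor.pos));
    }
    Ok(value)
}
```

In this sketch `eval("1+2*3")` gives `Ok(7)` and `eval("8-3-2")` gives `Ok(3)`, which is the grouping the table in the example is meant to produce.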

TODOs

Better documentation.
All elements of this module are documented and I can't think of anything more to add.
~~I have done my best to write good documentation but some feedback would be appreciated. I still feel like some parts could be improved but I have no concrete ideas right now.~~
Negative tests.
Tests for parser failures now exist.
~~Currently the tests for precedence only check for positive results (i.e. successful parses). I would like to get some cases with invalid input as well. I have looked at how the existing tests handle this but the current error handling in nom escapes me. Help with this would be nice.~~
Improving API.
A "fail" parser now exists in nom and can be used to specify "no operators of this kind". I see no other significant problems with the API.
~~The current API I have come up with feels solid for the most part, but I still think there's some room for improvement, especially when handling languages that may not have certain classes of operators (for example, a language with no postfix operators). This necessitates the use of a parser that always fails, but a parser with that functionality does not exist in nom, so workarounds like verify(tag(""), |_: &str| false) are needed.~~
Recipes/Examples.
The tests now have an example containing AST generation, function calls and ternary operators using this parser.
~~I would like to add more examples into the recipes or example section. Especially for more involved expression containing things like function calls and ternary operators.~~
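
To make the "no operators of this kind" point from the API item concrete, here is a small std-only sketch of an always-failing parser filling an operator slot. All names (`ParseResult`, `never`, `literal`, `count_postfix`) are hypothetical and only stand in for the idea; nom's own `fail` and `IResult` have different signatures.

```rust
// Hypothetical stand-in for a parser result; nom's real `IResult` is richer.
type ParseResult<'a, O> = Result<(&'a str, O), String>;

// Always fails without consuming input, so it can fill a slot such as
// "postfix operators" for a language that has none.
fn never<'a, O>(_input: &'a str) -> ParseResult<'a, O> {
    Err("no operator of this kind".to_string())
}

// Matches a literal prefix and returns the remaining input plus the match.
fn literal<'a>(expected: &'static str, input: &'a str) -> ParseResult<'a, &'a str> {
    match input.strip_prefix(expected) {
        Some(rest) => Ok((rest, &input[..expected.len()])),
        None => Err(format!("expected {:?}", expected)),
    }
}

// Counts leading operators using whatever operator parser is supplied.
fn count_postfix<'a, O>(
    mut input: &'a str,
    postfix: impl Fn(&'a str) -> ParseResult<'a, O>,
) -> (usize, &'a str) {
    let mut count = 0;
    while let Ok((rest, _)) = postfix(input) {
        if rest.len() == input.len() {
            break; // guard against parsers that succeed without consuming
        }
        input = rest;
        count += 1;
    }
    (count, input)
}
```

Passing `never::<()>` makes the loop stop immediately and leave the input untouched, so the caller does not need a special case for "this language has no such operators".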

Open Questions

How should this parser handle "ambiguous precedences"? ~~(see this comment for more details about this)~~
Resolution: The current behaviour is sufficient. See here for reasoning.

cenodis · 2021-08-19T13:53:54Z

Due to the lack of contrary opinions I have decided to label the current behaviour with ambiguous expressions as intentional.

This was done for the following reasons:

Nom makes no guarantees regarding the handling of ambiguous grammars.
Other parsers exhibit similar behaviour (alt will just pick whichever parser was specified first and swallow any ambiguity).
The parser is consistent with its handling of ambiguous grammars. So there are no surprises once the user is aware of it.
I see little practical value in adding additional special handling for ambiguous grammars. Such handling would be extremely specific to the language being implemented and at that point it would most likely be easier to just manually write an expression parser that is purpose-built for that language.
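
The "first one wins" behaviour mentioned for alt can be sketched with a std-only helper. `first_match` is a hypothetical function written for this illustration, not part of nom; it only mimics the ordering rule described above.

```rust
// Hypothetical first-match choice: candidates are tried in order and the
// first one that matches wins, even if a later candidate would also match.
fn first_match<'a>(
    input: &'a str,
    candidates: &[&'static str],
) -> Option<(&'static str, &'a str)> {
    candidates
        .iter()
        .find_map(|&op| input.strip_prefix(op).map(|rest| (op, rest)))
}
```

With the candidates `["<", "<="]` the input `"<=1"` is split as `"<"` followed by `"=1"`; swapping the order to `["<=", "<"]` yields `"<="` followed by `"1"`. The result depends only on the order the user wrote, which is the consistency the comment relies on.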

cenodis · 2021-08-19T20:42:24Z

@Stargateur

> Why put this in nom ?

There are no guidelines on which parsers specifically belong in nom (or if they exist, I haven't found them in the contribution guide).
To quote Geal:

> more support for text parsing [...]
> handling precedence in expressions (something like https://hackage.haskell.org/package/parsec-3.1.14.0/docs/Text-Parsec-Expr.html would be nice) [...]
> This could live in nom, or in a separate crate with a faster release cycle [...]

To me, the separate crate still sounds like an open decision. Since I don't know of any such crate existing yet, I figured I would take my chances and try to get it into standard nom.

> specially you force complete input

Could you tell me where I'm forcing complete input? If one of the operand or operator parsers returns Needed, it should just bubble up to the caller. As far as I can see, this parser should be compatible with both complete and streaming parsers; which ones are used depends solely on what type of parsers are passed to precedence.
If you could give me specific lines via a review, I will gladly try to fix anything that causes an incompatibility with streaming parsers.
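
The bubbling argument can be illustrated with a std-only sketch, assuming a combinator that only ever calls the parsers it is given. `ParseError`, `ParseResult`, `streaming_number` and `sum` are hypothetical names made up for this sketch and are not nom types.

```rust
// Hypothetical error type loosely modelled on "need more input" versus an
// ordinary parse error; it is not nom's `Err`.
#[derive(Debug, PartialEq)]
enum ParseError {
    Incomplete,
    Mismatch,
}

type ParseResult<'a, O> = Result<(&'a str, O), ParseError>;

// Streaming-style number parser: if the input ends while still inside the
// number, it cannot know whether more digits follow, so it asks for more.
fn streaming_number(input: &str) -> ParseResult<'_, i64> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .ok_or(ParseError::Incomplete)?;
    if end == 0 {
        return Err(ParseError::Mismatch);
    }
    let value = input[..end]
        .parse::<i64>()
        .map_err(|_| ParseError::Mismatch)?;
    Ok((&input[end..], value))
}

// A combinator written purely in terms of its operand parser. It never
// inspects the error, so `Incomplete` reaches the caller unchanged via `?`.
fn sum<'a>(
    input: &'a str,
    operand: impl Fn(&'a str) -> ParseResult<'a, i64>,
) -> ParseResult<'a, i64> {
    let (mut rest, mut total) = operand(input)?;
    while let Some(after_plus) = rest.strip_prefix('+') {
        let (next, value) = operand(after_plus)?;
        total += value;
        rest = next;
    }
    Ok((rest, total))
}
```

Here `sum("1+2;", streaming_number)` succeeds with `Ok((";", 3))`, while `sum("1+2", streaming_number)` returns `Err(ParseError::Incomplete)` because the last number might continue. Whether the combinator behaves as "complete" or "streaming" is decided entirely by the operand parser passed in.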

Stargateur · 2021-08-19T20:47:17Z

> Could you tell me where I'm forcing complete input? If one of the operand or operator parsers returns Needed, it should just bubble up to the caller. As far as I can see, this parser should be compatible with both complete and streaming parsers; which ones are used depends solely on what type of parsers are passed to precedence.
> If you could give me specific lines via a review, I will gladly try to fix anything that causes an incompatibility with streaming parsers.

Never mind, I misunderstood.

eaglgenes101 · 2021-08-19T21:55:58Z

That's a lot to pass into a single function at once, especially since some of the arguments correspond to aspects of the precedence parser that not everyone will use. Maybe the API should use the builder pattern instead?

cenodis · 2021-08-19T22:05:33Z

@eaglgenes101

> That's a lot to pass into a single function at once.

5 parameters in total, 3 of which are always required.

> builder pattern

I don't think making a builder pattern for 2 optional parameters would be that great. A builder pattern would also either replicate the current API pretty closely (i.e. a single parser for each category) or I would have to use dynamic dispatch (to maintain a list of parsers for each category). And I would really like to avoid dynamic dispatch if possible.
I did some small experiments with builders while developing this and ultimately abandoned them because the builder didn't really help and just added bloat to the module.
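
To illustrate the dynamic dispatch concern, here is a std-only sketch of what a list-keeping builder might look like. It is a guess at the shape of such an API, not the experiment mentioned above; `OpParser`, `PrecedenceBuilder`, `binary_op` and `parse_binary` are hypothetical names.

```rust
// Each registered operator becomes a closure of a different concrete type,
// so a list of them has to store boxed trait objects (dynamic dispatch).
// A parser returns the number of bytes consumed and the precedence.
type OpParser = Box<dyn Fn(&str) -> Option<(usize, u32)>>;

#[derive(Default)]
struct PrecedenceBuilder {
    binary: Vec<OpParser>,
}

impl PrecedenceBuilder {
    // Registers one binary operator; calls can be chained.
    fn binary_op(mut self, precedence: u32, symbol: &'static str) -> Self {
        self.binary.push(Box::new(move |input: &str| {
            input.starts_with(symbol).then(|| (symbol.len(), precedence))
        }));
        self
    }

    // Tries the registered operators in order; every call is a virtual call.
    fn parse_binary<'a>(&self, input: &'a str) -> Option<(&'a str, u32)> {
        self.binary
            .iter()
            .find_map(|parser| parser(input))
            .map(|(len, precedence)| (&input[len..], precedence))
    }
}
```

A chain such as `PrecedenceBuilder::default().binary_op(2, "*").binary_op(3, "+")` reads nicely, but every operator lookup goes through a `Box<dyn Fn>`, whereas passing a single statically typed parser per category keeps everything monomorphised.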

Also there is nothing preventing you from extracting the parsers for the individual operand or operator groups into their own functions and then just passing those to the precedence parser. Same with the fold function. I just put it all in one place to give a comprehensive example of how it works.

Edit: Looking over the docs, 5 parameters don't seem that out of line. There are plenty of parsers that take 3 or 4 parameters, not counting alt and permutation, which can (kind of) take up to 21 parameters. There is even an already existing parser with 5 parameters, fold_many_m_n.

Geal · 2021-08-24T14:47:05Z

hi! just fyi, I'll review that on saturday or sunday, it looks like a great addition to nom :)

cenodis · 2021-09-01T07:51:10Z

@Geal Any update on this?

mullr · 2022-03-02T23:21:24Z

This looks very useful, it would certainly clean up some of my parsers. @cenodis would you consider releasing this as a separate crate, if it doesn't look like it's making it into nom proper?

LoganDark · 2022-03-14T08:29:51Z

Great, a feature that I need is stuck in a PR that's been sitting around for 6 months. @Geal can you please take a look at this?

Geal · 2022-03-14T09:07:29Z

It's been a weird 6 months for my open source work honestly 😅
But it's time for a new major version and precedence parsing will definitely go there

mullr · 2022-04-19T18:27:47Z

In case anybody else needs it: I turned the contents of this PR into a little crate that works with nom 7: https://github.com/mullr/nom-7-precedence. I'm looking forward to being able to delete it in the future, once nom 8 is a thing!

cenodis mentioned this pull request Aug 15, 2021

Add fail parser #1363

Merged

cenodis added 12 commits August 17, 2021 14:05

Initial prototype

043bb50

Update docs

6230ff9

Remove unused code

More doc updates

4ef9c79

Add feature flags for Vec

6e210f1

Add basic tests

4d7b464

Fix formatting

e350e49

Add precedence to choosing_a_combinator.md

302b021

Fix typo

98df3b6

Minor refractoring

859b23e

Update docs

b412fb7

Change parameter order

68135e3

Add alloc feature to the entire precedence module

7f99f1a

The parser really cant work without it and the helpers dont make much sense without the parser.

cenodis force-pushed the feature_precedence branch from 24ef4a1 to 7f99f1a Compare August 17, 2021 12:06

cenodis added 15 commits August 17, 2021 14:16

Use fail parser to express "no operators of this type"

1480d0f

Document evaluation order

3f6f2b2

Better documentation for parameters

1018888

Fix precedence in documentation

8d50cf6

Fix doc formatting

1d64103

Fix typos

3770929

Use map_res when parsing integers

46c7039

Example test for expressions with function calls and AST generation

bcd6fd0

Typo

10bdd3d

Make evaluation a bit easier to read

e5b722d

Update expression_ast

7dcd9bc

Update expression_ast doc

1a4c423

Implement ternary operator in expression_ast

ea761e6

Shorten ast nodes

cd37e16

Implement some tests for parser failures

c21b00f

Update feature flags for docs

ae7c6cf

cenodis marked this pull request as ready for review August 19, 2021 14:16

cenodis changed the title ~~[WIP] Precedence parsing~~ Precedence parsing Aug 19, 2021

cenodis added 2 commits August 19, 2021 17:00

Properly append errors

355a74b

Properly bubble up non Error errors

36e9182

Split operators into 3 distinct types to help with exhaustiveness checks.

d5cabef

Geal added this to the 8.0 milestone Mar 14, 2022

cenodis mentioned this pull request Jul 18, 2023

makeExprParser + precedence climbing #1625

Open
