many_m_n can succeed when min > max #1333

carado · 2021-06-26T17:10:21Z

nom version: 6.2.1
nom compilation features used: none

The following code:

fn main() {
  let res: nom::IResult<&str, Vec<char>> =
    nom::multi::many_m_n(4, 2, nom::character::complete::char('a'))("aaa");
  dbg!(res);
}

will succeed, consuming two 'a':

[src/main.rs:4] res = Ok(
    (
        "a",
        [
            'a',
            'a',
        ],
    ),
)

While it is unlikely that someone would hardcode a minimum greater than the maximum, such values can result from more complex code, and a call to many_m_n under those conditions should probably always fail (the constraint of parsing something at least min but at most max times is impossible to satisfy).
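For illustration, here is a std-only sketch of how a bounded repetition loop can end up succeeding when min > max. It is a simplified model and not nom's actual source: the names many_m_n_unchecked and many_m_n_checked, the single-character matcher, and the String error type are all assumptions made for this example.

```rust
/// Simplified model of a bounded repetition loop (NOT nom's real code).
/// Matches the character `c` between `min` and `max` times.
fn many_m_n_unchecked(
    min: usize,
    max: usize,
    c: char,
    input: &str,
) -> Result<(&str, Vec<char>), String> {
    let mut rest = input;
    let mut out = Vec::new();
    for count in 0..max {
        match rest.strip_prefix(c) {
            Some(tail) => {
                out.push(c);
                rest = tail;
            }
            // `min` is only consulted when a match attempt misses.
            None if count < min => {
                return Err(format!("needed {} matches, got {}", min, count));
            }
            None => return Ok((rest, out)),
        }
    }
    // The loop ran `max` times without a miss, so `min` was never checked.
    Ok((rest, out))
}

/// Same loop with an up-front guard (hypothetical helper for this sketch).
fn many_m_n_checked(
    min: usize,
    max: usize,
    c: char,
    input: &str,
) -> Result<(&str, Vec<char>), String> {
    if min > max {
        return Err(format!("min ({}) is greater than max ({})", min, max));
    }
    many_m_n_unchecked(min, max, c, input)
}
```

Calling many_m_n_unchecked(4, 2, 'a', "aaa") in this model returns two matches with "a" left over, the same shape as the output reported above, while many_m_n_checked rejects the same arguments.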

Have a nice day


Geal · 2021-08-09T16:21:29Z

thanks for the report, it will be fixed in nom 7 with ab42ced

cenodis · 2021-08-10T21:21:17Z

@Geal looking at ab42ced, it looks like the many_m_n parser just fails instantly with a standard nom error. This makes it look like the parser works normally but didn't match anything. I think a panic would be more appropriate, since min > max is nonsensical input and can never create a working parser. The panic should also occur in the function and not inside the closure, since the parser creation itself should fail.

Additionally, a nom error might never reach the user, since certain parsers such as alt() might swallow it. This could lead to unintended behaviour, since the parser runs successfully but doesn't produce the desired output. Since no error is reported, finding the source of such an issue would be potentially difficult and time consuming.

Geal · 2021-08-11T09:08:57Z

Parsers should never panic, and since parsers can be created dynamically from the output of other parsers, parser creation should not panic either (that could result in easy DoS attacks by messing with the format).
I think that error should be a Failure instead, which would ensure it bubbles up through the parser chain and is not blocked by alt.
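The difference between the two variants can be sketched with a toy model. Everything below (ParseErr, PResult, alt2, broken_many_m_n, fallback) is a hypothetical stand-in written for this example and is not nom's API; it only mirrors the behaviour described above, where alt recovers from an Error but lets a Failure through.

```rust
/// Toy stand-in for a parser error enum, reduced to the two variants
/// discussed here (assumption: not nom's real type).
#[derive(Debug, PartialEq)]
enum ParseErr {
    /// Recoverable: an `alt`-style combinator may try another branch.
    Error(&'static str),
    /// Unrecoverable: passed to the caller unchanged.
    Failure(&'static str),
}

type PResult<'a, T> = Result<(&'a str, T), ParseErr>;

/// Hypothetical parser built with min > max; `as_failure` selects
/// which variant reports the invalid configuration.
fn broken_many_m_n(_input: &str, as_failure: bool) -> PResult<'_, Vec<char>> {
    if as_failure {
        Err(ParseErr::Failure("many_m_n: min > max"))
    } else {
        Err(ParseErr::Error("many_m_n: min > max"))
    }
}

/// Fallback branch that always succeeds without consuming input.
fn fallback(input: &str) -> PResult<'_, Vec<char>> {
    Ok((input, Vec::new()))
}

/// Two-branch `alt`: retries on `Error`, propagates everything else.
fn alt2<'a, T>(
    input: &'a str,
    first: impl Fn(&'a str) -> PResult<'a, T>,
    second: impl Fn(&'a str) -> PResult<'a, T>,
) -> PResult<'a, T> {
    match first(input) {
        Err(ParseErr::Error(_)) => second(input),
        other => other,
    }
}
```

In this model, alt2("aaa", |i| broken_many_m_n(i, false), fallback) quietly returns the fallback's result, whereas passing true makes the Failure reach the caller.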

carado · 2021-08-11T09:09:31Z

@cenodis it's not like all parameters to parsers are always hardcoded; I made this issue because I myself wrote a parser where min and max were the result of some computations, and it indeed made sense that, when those results happen to lead to min > max, parsing silently fails.

If this function wasn't many_m_n(min: usize, max: usize) but was instead many_m_n<const MIN: usize, const MAX: usize>() I would agree with you because that would force the parameters to be always the same, and so a piece of code compiled with MIN > MAX would indeed necessarily be wrong.

It's also consistent with e.g. std::ops::RangeBounds::contains, which simply returns false instead of panicking — a range with min > max is simply seen as an empty range.

fn main() {
    dbg!((4..=2).is_empty());   // => true
    dbg!((4..=2).contains(&3)); // => false
}

cenodis · 2021-08-11T17:25:09Z

@Geal I wasn't aware parsers should never panic, my apologies.

However, I would like to argue for Failure over Error in such cases. Especially with regards to @carado's post.

As already stated, Error can easily be swallowed by a chain of parsers, making it difficult to diagnose when it happens unintentionally. This is especially true when working with dynamic data, where it can be easy to miss the fact that min might be greater than max.

The fact that the parser fails silently is also not really obvious from the outside. After all, the function is called many_m_n and not many_m_n_except_if_m_greater_than_n. This makes the behaviour of this parser more complex than it needs to be and makes following the program flow more difficult.

I don't disagree that there are cases where having it fail silently is useful. But that should not be part of the many_m_n parser itself. I would recommend solving it with parser combinations:

cond_err(min <= max, many_m_n(min, max, parser))

Side note: I actually thought cond did this, but it returns Ok when the condition is false. Hence the "cond_err" pseudoparser: basically just cond, but it returns Error if the condition is false.
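A minimal sketch of what such a cond_err could look like, using toy types rather than nom's own. cond_err does not exist in nom; ParseErr, PResult and many_m_n_char are likewise assumptions made so the example compiles on its own.

```rust
/// Toy recoverable error (assumption: stands in for a parser Error).
#[derive(Debug, PartialEq)]
enum ParseErr {
    Error(&'static str),
}

type PResult<'a, T> = Result<(&'a str, T), ParseErr>;

/// Toy repetition parser: counts between `min` and `max` leading `c`s.
/// It performs no min > max validation of its own.
fn many_m_n_char<'a>(
    min: usize,
    max: usize,
    c: char,
) -> impl Fn(&'a str) -> PResult<'a, usize> {
    move |input: &'a str| {
        let count = input.chars().take(max).take_while(|&x| x == c).count();
        if count < min {
            return Err(ParseErr::Error("many_m_n: too few matches"));
        }
        Ok((&input[count * c.len_utf8()..], count))
    }
}

/// Hypothetical `cond_err`: runs `parser` only when `condition` holds,
/// otherwise returns a recoverable error without touching the input.
fn cond_err<'a, T>(
    condition: bool,
    parser: impl Fn(&'a str) -> PResult<'a, T>,
) -> impl Fn(&'a str) -> PResult<'a, T> {
    move |input: &'a str| {
        if condition {
            parser(input)
        } else {
            Err(ParseErr::Error("cond_err: condition is false"))
        }
    }
}
```

With this shape, the silent-failure behaviour is opted into at the call site, e.g. cond_err(min <= max, many_m_n_char(min, max, 'a')), instead of being built into the repetition parser.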

This is also more in line with the parser combinator principle (combining multiple parsers with simple, obvious behaviour instead of having one parser with complex behaviour).
Additionally, it documents the fact that min > max in the surrounding code is desired and not an accident.

I don't see much practical reason to be consistent with the RangeBounds of the std library, especially since ranges are not consistent within Rust itself. For example, you can't create a slice over a range with min > max. If it were really consistent with RangeBounds, shouldn't it be an empty slice, since the bound is empty? In the same way, I don't see what meaningful information such a range would convey when used with this parser. It would just make it more confusing and easier to make mistakes.

Edit: fold_many_m_n should probably be handled the same way, since it's basically the same parser, just organized a bit differently.

Geal · 2021-08-13T09:00:40Z

4a04c56 changes the Error to Failure and 37eedf3 adds the check to fold_many_m_n

Geal · 2021-08-21T11:14:44Z

fix released in nom 7: https://crates.io/crates/nom/7.0.0

Stargateur · 2021-08-21T18:48:32Z

I do not think the solution here is good: the check should be outside the closure and, in my opinion, panic in such a case, or return an IResult<Parser> if we want to allow dynamic behavior, but I don't see any real-world problem where this could be used dynamically.

Actually, why not take an impl RangeBounds?

cenodis · 2021-08-21T19:03:22Z

@Stargateur

> in my opinion panic

As already stated by Geal, parser creation should never fail. A panic is therefore unacceptable.

> return a IResult

IResult makes no sense here since it encodes a parser result. I'm going to assume you meant Result<Parser>. While possible, it could make composition more difficult since you have to somehow unwrap the created parser.

> not take a impl RangeBounds

To my knowledge, RangeBounds does not forbid the creation of ranges with min > max. So it doesn't help in this case.
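To illustrate: with std alone, an inverted range constructs without complaint, so a combinator taking a range would still need its own validation step. count_bounds below is a hypothetical helper written for this example; it is not part of nom or std.

```rust
use std::ops::{Bound, RangeBounds};

/// Hypothetical helper: converts any `RangeBounds<usize>` into inclusive
/// `(min, max)` repetition counts, or `None` if no count can satisfy it.
fn count_bounds(range: impl RangeBounds<usize>) -> Option<(usize, usize)> {
    let min = match range.start_bound() {
        Bound::Included(&n) => n,
        Bound::Excluded(&n) => n.checked_add(1)?,
        Bound::Unbounded => 0,
    };
    let max = match range.end_bound() {
        Bound::Included(&n) => n,
        // An exclusive end of 0 admits no count at all.
        Bound::Excluded(&n) => n.checked_sub(1)?,
        Bound::Unbounded => usize::MAX,
    };
    // The range type itself never rejects min > max; this check has to.
    if min <= max {
        Some((min, max))
    } else {
        None
    }
}
```

Here count_bounds(2..=4) yields Some((2, 4)), while count_bounds(4..=2) and count_bounds(42..21) both yield None. What the parser does with that None (Error, Failure, or something else) remains the same design question as before.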

In #1356 I suggested replacing the Failure with a separate error type specifically meant to represent a "broken" parser. That would be more in line with how parsers work in nom and would allow handling of such errors via combinators (or bubbling up to the caller if not handled).
I feel like that would be the most appropriate way of handling such scenarios without breaking everything (most parsers already bubble up any errors they don't explicitly handle, to cover cases like Failure and Incomplete), and it stays in line with the "normal" way of nom error handling.

Stargateur · 2021-08-21T19:18:38Z

> IResult makes no sense here since it encodes a parser result. I'm going to assume you meant Result<Parser>. While possible, it could make composition more difficult since you have to somehow unwrap the created parser.

Indeed, one would need something like many_m_n(5, 4).convert()?(input), but I believe it makes sense: the error is about the call to many_m_n, not the Parser created. Anyway, that's a detail, I guess, so maybe not very important, and the current solution is OK.

Stargateur · 2021-08-21T19:22:08Z

> To my knowledge, RangeBounds does not forbid the creation of ranges with min > max. So it doesn't help in this case.

No, but https://doc.rust-lang.org/core/ops/trait.RangeBounds.html#method.contains would correctly report false:

fn main() {
    let r = 42..21;
    
    assert_eq!(r.contains(&30), false);
}

cenodis · 2021-08-21T21:27:29Z

@Stargateur

> No, but https://doc.rust-lang.org/core/ops/trait.RangeBounds.html#method.contains would correctly report false:

What does that have to do with anything? min > max is invalid for this parser, and RangeBounds does not prevent such a range from being constructed.

Stargateur · 2021-08-21T22:10:59Z

> @Stargateur
>
> > No, but https://doc.rust-lang.org/core/ops/trait.RangeBounds.html#method.contains would correctly report false:
>
> What does that have to do with anything? min > max is invalid for this parser, and RangeBounds does not prevent such a range from being constructed.

That would solve the original problem, but it would not allow your request to make it a hard error. Well, yes, that could depend on how you use it.

carado · 2021-08-22T08:27:44Z

I think (42..21).contains(&30) returning false is pretty consistent with many_m_n(42, 21, parser)("…") returning a failure; neither panics, and both accept the range and treat it as de facto empty (refusing all inputs).

Geal added this to the 7.0 milestone Aug 9, 2021

Geal added the needs testing label Aug 13, 2021

Geal closed this as completed Aug 21, 2021

cenodis mentioned this issue Aug 21, 2021

error type names are confusing and hard to search and missing way to make them owned #1356

Closed

Stargateur mentioned this issue Sep 19, 2021

[WIP] Feature multi range #1402

Closed

8 tasks

Stargateur mentioned this issue Oct 23, 2021

Implement fold and try_fold using NomBounds #1436

Closed

7 tasks
