usability: address changes introduced with Complete* and many* combinators #839

Geal · 2018-10-06T16:13:22Z

references:

nom4: issues with slices and many0/many1/fold_many #764 nom4: issues with slices and many0/many1/fold_many
nom4: take_while broken ? #766 nom4: take_while broken ?
Update breaks 11-bit parser- always Incomplete #768 Update breaks 11-bit parser- always Incomplete
many0 returns Incomplete in weird cases #790 many0 returns Incomplete in weird cases
CompleteByteSlice/CompleteStr are invasive and make parsers not reusable #795 CompleteByteSlice/CompleteStr are invasive and make parsers not reusable

There are especially issues around transforming to and from the Complete* types, when reusing parsers or passing the results through flat_map! and other combinators.

There's also the problem of making sure basic parser functions work with all types. As an example, the hex_u32, be_* and le_* only work on &[u8], not CompleteByteSlice

The text was updated successfully, but these errors were encountered:

zaphar · 2018-10-10T01:18:19Z

Just want to say that this particular issue is a blocker for my moving a project over to nom4. I was really excited to see the improvements in error handling in nom4 but this change and the invasiveness of the fix are giving me pause to finishing the conversion.

Most of the other changes are pretty mechanical and easy to do. This one does not fall in that category though.

Geal · 2018-10-13T08:56:53Z

@zaphar can you provide examples of code that is affected by the changes? This will help me think about the issue

leshow · 2018-10-14T05:34:25Z

Since my issue is mentioned, I'll add I'm blocked form upgrading from nom 2 as well. @khernyo appears to have written a PR to fix my particular problem (referenced in the issue)

Geal · 2019-02-06T10:38:17Z

I experimented a bit in snack with edition 2018, to see if the edition 2018 could change things.
My first idea was to make different submodules, one with parsers that work in streaming (ex: nom::streaming::bytes), one for parsers that don't nom::complete::bytes) and let the user choose which macro to import with use nom::complete::bytes::tag.
Unfortunately, the #[macro_export] attribute puts macro at the top level, not at the module level: rust-lang/rust#54112

I have another idea though. It is possible to rename macros when importing them. So we could have use nom::many0 if we want to use the streaming version, but we could do use nom::many0_complete as many0 if we wanted a version that stops once it reaches the end of the buffer.
It makes imports a bit more annoying, but will keep the parsers more readable (at the cost of a lot of code duplication). More importantly, we would not need the CompleteStr and CompleteByteSlice types anymore.

I could go further with this and making nom a very small crates with only the core types, and separate the streaming or not streaming macros in different crates. But that might be a bit more annoying to maintain.

athei · 2019-02-06T20:54:53Z

That is nice! With that change is there even a use case for many0 and similar macros? They always return Incomplete and everyone just tries to come up with hacks to make them behave like the many0_complete you proposed. When you think about it they make only sense as complete version because how would a streaming parser know when many0 is done: The amount or length was transmitted and they are usually called through something like delimited! or length_bytes! through which we know for sure that the input must me complete.

That means the Incomplete is always returned by a caller of many0 but never by many0 itself and therefore there is no reason to have non-complete versions of them.

Geal · 2019-02-07T09:20:14Z

the current behaviour of many0! is actually correct. It should stop and return the Vec of values once its underlying parser fails, but the issue is in defining what makes that parser fails.
Imagine you have the following: named!(parser<Vec<&[u8]>>, many0!(tag!("abc")))
if you pass abcabcabc! the parser will return ["abc", "abc", "abc"] as expected, because the tag will fail on the ! character.
But if you pass abcabc, the parser has no way of knowing if that's the whole input, or if it's missing data and the whole input is actually abcabcabc!.
But as you mentioned, often when working in streaming mode with length delimited format, we get fields for which we know the exact length (and probably have everything in the buffer already) so we should be able to switch to another behaviour.

athei · 2019-02-07T09:27:16Z

Ahh okay I was under the impression that many0! returns an error when the underlying parser fails. My mistake. It makes much more sense now in my head. Yes, complete versions of the macros would be much appreciated.

leshow · 2019-03-11T16:06:54Z

I noticed my issue was closed, but doesn't work on 4.2. These changes will be included in 5?

Geal · 2019-06-17T14:30:14Z

closing this now, as nom 5 will fix those usability issues with CompleteByteSlice and CompleteStr by check notes REMOVING THEM ENTIRELY :D

Now there are specialized versions of parsers for streaming or complete input data in the streaming or complete submodules for some parser categories (see for example nom::character or nom::number`)

Geal added this to the 5.0 milestone Dec 14, 2018

Geal pinned this issue Dec 26, 2018

Geal closed this as completed Jun 17, 2019

Geal unpinned this issue Apr 6, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

usability: address changes introduced with Complete* and many* combinators #839

usability: address changes introduced with Complete* and many* combinators #839

Geal commented Oct 6, 2018

zaphar commented Oct 10, 2018

Geal commented Oct 13, 2018

leshow commented Oct 14, 2018 •

edited

Geal commented Feb 6, 2019

athei commented Feb 6, 2019

Geal commented Feb 7, 2019

athei commented Feb 7, 2019

leshow commented Mar 11, 2019 •

edited

Geal commented Jun 17, 2019

usability: address changes introduced with Complete* and many* combinators #839

usability: address changes introduced with Complete* and many* combinators #839

Comments

Geal commented Oct 6, 2018

zaphar commented Oct 10, 2018

Geal commented Oct 13, 2018

leshow commented Oct 14, 2018 • edited

Geal commented Feb 6, 2019

athei commented Feb 6, 2019

Geal commented Feb 7, 2019

athei commented Feb 7, 2019

leshow commented Mar 11, 2019 • edited

Geal commented Jun 17, 2019

leshow commented Oct 14, 2018 •

edited

leshow commented Mar 11, 2019 •

edited