Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Feature multi range #1402

Closed
wants to merge 38 commits into from
Closed

Conversation

cenodis
Copy link
Contributor

@cenodis cenodis commented Sep 18, 2021

This PR intends to fix #1393.

Description

It replaces existing parser variations that only differ by how many times a subparser may run (like the different many_* variants) with a single parser that takes a range.

In addition to a range parsers can also take a single usize value as parameter which evaluates to a range containing that value only (value..=value). This is primarily done for convenience.

Migration remains fairly simple as the new parsers are a full replacement and just require the respective parameters to be rephrased as a single range.

Reasoning

Using ranges makes the API cleaner by removing unnecessary function duplication. It also allows nom to integrate with more of Rusts language features (specifically the range syntax).

Example

many(2..=4,
  tag("Abcd")
)(i)

With the old parsers this would be expressed using the many_m_n parser:

many_m_n(2, 4,
  tag("Abcd")
)(i)

Special attention should be drawn to the fact that the other variations, such as many0 are also possible by using open ended ranges (0.., 1..).

Notes

The current implementation allows individual parsers to take ranges as parameters but requires them to document how they interpret open ended ranges. This was done to maximize flexibility and allow multiple, potentially contradicting, interpretations to coexist within the codebase.

A good example of this in action is the difference between many and fold:

  • many (a.. -> a..=usize::MAX)
    many packages its results into a Vec. Therefore going beyond the usize limit makes no sense and any open ended range is capped at usize::MAX.
  • fold (a.. -> a..=∞)
    fold has static memory requirements since there is only one accumulator. This accumulator can be used indefinitely and as such the amount of iterations can go beyond even the usize::MAX limit (the current implementations of fold_* also support such cases). Because of this the parser interprets open ended ranges to mean "can run for an infinite number of times".

TODOs

Candidates

The following parsers have been proposed for merging.

  • many_* (many0, many1, many_m_n)
    Resolution: Done
  • fold_* (fold_many0, fold_many1, fold_many_m_n)
    Resolution: Done
  • take_till* (take_till, take_till1)
  • take_while* (take_while, take_while1, take_while_m_n)
  • take_until* (take_until, take_until1)
  • many_till

Open questions

  • How should the obsolete parsers be handled?
    Currently the obsolete parsers (like many_m_n) are deprecated using #[deprecated=""] with a message pointing the developer to the replacement. (Usually in the form of a simple Replaced by <new_parser>). This allows the changes to remain backwards compatible while steering developers towards using the replacement parsers.
    Tests for the deprecated parsers are left as is but are annotated with #[allow(deprecated)] to suppress the warnings. I would recommend keeping them in for as long as the parsers still exist to make sure that they arent broken accidentally.

    Resolution: No deprecations for this release. The old parsers and the new range based parsers will exist side-by-side.
  • Resolve open questions of the individual candidates.
    See here

Copy link
Contributor

@Stargateur Stargateur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I don't like much the saturating range. This would considerably slow down most use of fold like fold(0.., ), thus I think we should keep specialization variant of many0 many1 fold_many0 and fold_many1. Cause it's more clear what they do, and would allow us to remove this saturation thing. I very concern with that affecting benchmark significantly.

src/multi/mod.rs Outdated Show resolved Hide resolved
src/multi/mod.rs Outdated Show resolved Hide resolved
@cenodis
Copy link
Contributor Author

cenodis commented Sep 18, 2021

Current state of candidate implementation.

fold

Open questions: None

  • Implementation
  • Tests
  • Documentation

while

Open questions: None

  • Implementation
  • Tests
  • Documentation

take_till*

Open questions: It is not possible to just remove the number suffix as with many or fold because the trimmed version take_till is already taken. Only take_till1 exists with suffix. As such, a different naming scheme must be choosen for such cases.

  • Implementation
  • Tests
  • Documentation

take_while*

Open questions: Same problem as take_till*

  • Implementation
  • Tests
  • Documentation

take_until*

Open questions: Same problem as take_till*

  • Implementation
  • Tests
  • Documentation

many_till

Open questions: In contrast to all other parsers on this list many_till does not have any variants. Because of this, should this parser be augmented with ranges at all? If yes, what name should that parser have?

  • Implementation
  • Tests
  • Documentation

@cenodis
Copy link
Contributor Author

cenodis commented Sep 18, 2021

This would considerably slow down most use of fold like fold(0.., )

Im not sure if this would have a significant performance impact. The performance of parsers with both lower and upper bounds should remain identical since they also used ranges internally. And I dont think the impact of a relatively simple iterator in the unbounded cases is that severe.
That being said I could remove the deprecation notice on the 0 and 1 variants. Im still not fully sure how to handle the old parsers (which is why its still an open point in the PR).

we should keep specialization variant of many0 many1 fold_many0 and fold_many1. Cause it's more clear what they do, and would allow us to remove this saturation thing.

I disagree. Ranges are a first class type in Rust (regardless of how unwieldy their structs can be) and they even have their own syntax. By using that syntax for all cases we can cut down on the amount of functions which makes the API easier to understand because now there are fewer functions whose documentation you have to look up.

Im not sure what you mean with "that saturation thing".
If you mean the interpretation of a.., thats a necessary consequence of the inherently conflicting nature of unbounded ranges. In a many we are allocating within a Vec which has a certain size limit we must never go beyond. As such a truly "endless" range is not and can never be valid. In contrast fold can saturate in order to be truly "endless" (which doesnt cause panics and is therefore allowed).
Because of these contradicting interpretation the parsers must be able to decide how they need to behave in these cases. And because this is decided for every parser individually the best place to document that behaviour is in the documentation of the parser. With all that, if you think the documentation can be improved Im all ears.

Actually now that I think about it. This doubles as a bug fix. The current implementation of many0/many1 can panic if the amount of elements exceeds isize::MAX. And parsers should never panic so that is definetely a bug.
But in order to check for that many* needs to keep track of how many elements it has allocated at which point we are back at the same point as our range iterators. So I think (based on gut feeling, not tests) that the performance penalty is negligible (or potentially even non-existent).

I misread the Vec docs as meaning elements. Its actually about the byte size of the Vec. That doesnt change the fact that many0/many1 could still panic, but guarding against that is a bit more complicated.

@Stargateur
Copy link
Contributor

And I dont think the impact of a relatively simple iterator in the unbounded cases is that severe.

Well, I think not

I disagree. Ranges are a first class type in Rust (regardless of how unwieldy their structs can be) and they even have their own syntax. By using that syntax for all cases we can cut down on the amount of functions which makes the API easier to understand because now there are fewer functions whose documentation you have to look up.

That could be used to justify use range anywhere cause you know they are first class thing... not a good argument. fold_many0 express better than fold(0.. period.

Your idea of using Range for fold was my idea in the behind #1333 (comment), but use range imply we should use it naturally, in this PR you add strange check and add a saturate feature to bend range to what you want. That not what we should do, having a hard limit for overflow is the way range are suppose to be use. And that why we must keep a not using range fold to allow to express a not bounded fold. 0.. doesn't mean 0 to infinity, it's mean 0 to what is max according to context. For example vector my_vec[..] doesn't mean infinite to infinite, but 0 to len of vec. nom should be 0..max we can.

@cenodis
Copy link
Contributor Author

cenodis commented Sep 19, 2021

That could be used to justify use range anywhere cause you know they are first class thing... not a good argument.

No, but being a first class type is a good argument to justify using it where it makes sense.

fold_many0 express better than fold(0.. period.

Ok. So how exactly does the former "express" better? A range tends to be more expressive because it makes clear that we are talking about a range of possibilities (reads as "fold that many times", in this case "fold 0 or more times"). In fold_many0 the 0 is just part of the name and could mean virtually anything.

use range imply we should use it naturally

And a "natural" range is what exactly? I am unfamiliar with any such concept (except natural numbers) and the docs make no mention of that as far as I know.

you add strange check

The check is absolutely necessary. Your comment on #1333 seems to imply that you think ranges somehow prevent the max > min condition. That is not the case. 2..0 is a perfectly valid range in Rust (but makes no sense for any parser in nom, so we have to guard against it).

add a saturate feature to bend range to what you want

I am not "bending" the range to anything. Saturation arithmetic is a perfectly valid concept and the terminology is even used within Rust. All it means is that instead of terminating after reaching the representable maximum it, it continues to return that maximum indefinitely.

To quote the Rust docs for RangeFrom:

Overflow in the Iterator implementation (when the contained data type reaches its numerical limit) is allowed to panic, wrap, or saturate.

So saturating is valid even for normal unbounded ranges.
Note: This would actually mean that limiting an unbounded range is invalid. Luckily we are not using the actual Iterator implementation and can therefore make up our own behaviour. But thats fine for as long as its properly documented (which it is). This doc snippet is just to illustrate my point.

having a hard limit for overflow is the way range are suppose to be use.

Untrue, see the doc snippet above. If anything, a "hard limit" for an unbounded range is the only thing it is not allowed to do.

0.. doesn't mean 0 to infinity, it's mean 0 to what is max according to context.

You are contradicting yourself. Previously you tell me I shouldnt "bend the range to what you want" and I am adding "strange saturation behaviour" but here you tell me there is no globally defined behaviour and it should instead be interpreted based on the context?

To reiterate, interpreting based on the context is exactly what we are doing right now:

  • many: the range is used in context with a Vec. Therefore it is limited to an upper bound of usize::MAX because going past that would be impossible. So an unbounded upper bound is interpreted as "at most usize::MAX" in this context.
  • fold: the range is merely used to check for how often the parser has run. It does not have any limiting factors in its context and can therefore go even beyond the normal usize::MAX limit. So an unbounded upper bound is interpreted as "truly infinite" in this context.

@Stargateur
Copy link
Contributor

You doesn't seem open to discussion at all, you read but don't listen. You don't look like you code in Rust from long. That not how you are suppose to implement or use unbound range. Your solution is not good. I don't have your level of english so everything I say look stupid so I will end talk and just say, I opposite to this PR.

@cenodis
Copy link
Contributor Author

cenodis commented Sep 19, 2021

You doesn't seem open to discussion at all, you read but don't listen.

I am very much open to discussion. This is what discussions are: two people exchanging their (possibly opposing) viewpoints in hopes of coming to an understanding. And I do read and try very hard to listen. But I am having a tough time understanding what exactly your problems are.

That not how you are suppose to implement or use unbound range.

Then how exactly would you implement this? Be aware that the restriction is that the parser must be able to perform all functions of the parsers that it replaces. If you can give me an alternative to how the ranges are supposed to be interpreted while staying within that restriction I would be happy to hear it. Because I dont really see any other option.

Your solution is not good.

Ok. But its a solution, it works and it solves a rather annoying issue (the parser duplication). To me thats all a solution really needs.

I opposite to this PR

Thats nice. Unfortunately I still have absolutely no idea why. So Im afraid I cant really take that vote into account as anything more than "some dude opposes this PR because reasons".

@Stargateur
Copy link
Contributor

see what I mean, I said why but you don't even acknowledge this, you are not open to talk, you talk that all.

@Stargateur
Copy link
Contributor

Stargateur commented Sep 19, 2021

Then how exactly would you implement this? Be aware that the restriction is that the parser must be able to perform all functions of the parsers that it replaces. If you can give me an alternative to how the ranges are supposed to be interpreted while staying within that restriction I would be happy to hear it. Because I dont really see any other option.

use end_bound directly to handle Bound::* directly do a while loop for Unbounded.

src/multi/mod.rs Outdated Show resolved Hide resolved
src/traits.rs Outdated Show resolved Hide resolved
@Geal
Copy link
Collaborator

Geal commented Sep 25, 2021

do we have concrete evidence of the overhead of the saturating iterator? Like, is it actually slower, or is LLVM smart enough to see that it will loop indefinitely?

Honestly I'm not a huge fan of the saturating iterator trick for fold, I prefer the approach in #1402 (comment) even if it results in some code duplication. But it can be a good tradeoff (especially since it gives some info at compile time). BTW the match could be done in the execution of the combinator instead of the parser closure that is returned.

@cenodis
Copy link
Contributor Author

cenodis commented Sep 25, 2021

do we have concrete evidence of the overhead of the saturating iterator?

I have looked at the output of the bench job and compared them to the results on master. Im not sure how reliable those are but the difference seems to drown in the noise. So from that perspective the saturating iterator seems to remain competitive.

I prefer the approach in #1402 (comment)

The solution in #1402 (comment) is incomplete. The match would need to be extended with two additional checks to see if the lower bound is 0 or 1. If its not then we still need a saturating counter to compare against the lower bound later. So I think that would need at least 5 arms.

@cenodis
Copy link
Contributor Author

cenodis commented Sep 25, 2021

So I tried my implementation (or at least a good approximation without needing to import actual nom types) on godbolt. Looking at the assembler output, the compiler actually translates my implementation to the exact same binary as stars example:

example::main:
        sub     rsp, 56
        mov     qword ptr [rsp + 8], 42
        lea     rax, [rsp + 8]
        mov     rax, qword ptr [rsp + 8]
        mov     qword ptr [rsp], rax
        cmp     rax, 42
        jne     .LBB9_2
        add     rsp, 56
        ret
.LBB9_2:
        mov     qword ptr [rsp + 8], 0
        mov     rdi, rsp
        lea     rsi, [rsp + 8]
        call    core::panicking::assert_failed
        ud2

So to me that looks like the compiler can easily optimize its way around that. At least for constant values, but dynamic ranges need to be handled via the m_n variants anyway and those do counting in the old implementation as well, so no loss there.

Here are the two examples for comparison:
Current implementation
Stargateurs example

@Stargateur
Copy link
Contributor

The solution in #1402 (comment) is incomplete. The match would need to be extended with two additional checks to see if the lower bound is 0 or 1. If its not then we still need a saturating counter to compare against the lower bound later. So I think that would need at least 5 arms.

Why you need to check 0 or 1 ?

@cenodis
Copy link
Contributor Author

cenodis commented Sep 25, 2021

Why you need to check 0 or 1 ?

Assume the wanted range is 4... That is a range without an upper bound, which would therefore fall into the Unbounded match arm. Now tell me how you want to parse that with a simple loop and no count variable to keep track of the number of iterations.

Edit: In addition the ranges in your comment are wrong. 0..max and 0..=max for exclusive and inclusive respectively would mean the loop runs for one loop too many.

@Stargateur
Copy link
Contributor

Stargateur commented Sep 25, 2021

Edit: In addition the ranges in your comment are wrong. 0..max and 0..=max for exclusive and inclusive respectively would mean the loop runs for one loop too many.

What ? no ? That not clear. fold(0..18) mean I want run at least 0 time and max 18 time. fold(0..=18) I want at least 0 and max 19.

Assume the wanted range is 4... That is a range without an upper bound, which would therefore fall into the Unbounded match arm. Now tell me how you want to parse that with a simple loop and no count variable to keep track of the number of iterations.

There is nothing wrong to just use a simple mutable counter or you could use:

let iter = std::iter::successors(Some(0usize), |n| n.checked_add(1));
for i in iter {
}

If you really want to handle overflow.

But I think I would do:

match (range.start_bound(), range.end_bound) {
  (Bound::Included(start), Bound::Included(end)) => for 0..=end {}, // need to check included start
  (Bound::Included(start), Bound::Excluded(end)) => for 0..end {}, // need to check included start same code then ^
  (Bound::Included(start), Bound::Unbounded) => { for 0..=start { } then loop { } } // the for loop is not allow to error
  (Bound::Excluded(start), Bound::Included(end) => for 0..=end {}, // need to check excluded start
  (Bound::Excluded(start), Bound::Excluded(end) => for 0..=end {}, // need to check excluded start same code then ^
  (Bound::Excluded(start), Bound::Unbounded) => { for 0..start { } then loop { } } // the for loop is not allow to error same code then ^^^
  (Bound::Unbounded, Bound::Included(end)) => for 0..=end {}, // no check
  (Bound::Unbounded, Bound::Excluded(end)) => for 0..end {},  // no check
  (Bound::Unbounded, Bound::Unbounded) => loop {},  // no check

This will allow compiler to compile check the code, there is no error possible. And I see a lot of sharing code possible.

@cenodis
Copy link
Contributor Author

cenodis commented Sep 25, 2021

What ? no ? That not clear. fold(0..18) mean I want run at least 0 time and max 18 time. fold(0..=18) I want at least 0 and max 19.

That breaks the way ranges are interpreted in the old parsers. The range was always meant to represent the minimum and maximum amount of iterations the inner loop is allowed to do. So 0..=18 is anything between 0 and 18 iterations. Your implementation is the one that is "unclear".
Even if there wasnt any established precedence. 0..18 means "anything between 0 and 18 exclusively". So why would you suddenly handle it like it was inclusive? I would suggest reading up on how ranges work.

If you really want to handle overflow.

That also breaks compatibility with existing parsers. And with checked arithmetic it still needs to check if an overflow has occured. So I dont really see how that is supposed to be much better compared to just saturating, which is also a check for overflow that just returns the max amount instead of None. Doesnt seem that different (performance wise).

This will allow compiler to compile check the code, there is no error possible.

(warning: sarcasm)
Yes, because there is absolutely no way for anyone to make a single mistake while writing out that monstrosity of a match statement. The compiler cant check for your logical errors.

My implementation exposes all the necessary type information about each range as well. It just encapsulates it in the respective trait implementation so all the necessary edge case handling doesnt need to happen in the parser.
My experiment with godbolt has shown that the compiler is perfectly able to optimize the traits and iterators away (to the point that it literally produces the exact same output for both versions).

I want to be careful that this doesnt end in another 20+ post argument because of the language barrier.
So unless you give me something with actual practical impact Im going to call it quits here. If you want to you can write your own "ideal" implementation and then we can compare.

@Stargateur
Copy link
Contributor

Stargateur commented Sep 25, 2021

That breaks the way ranges are interpreted in the old parsers. The range was always meant to represent the minimum and maximum amount of iterations the inner loop is allowed to do. So 0..=18 is anything between 0 and 18 iterations. Your implementation is the one that is "unclear".

That a new parser so I don't care of old one. Your interpretation is misleading with what user expect from range. I mean the fact that I didn't understand you ignore included from the code prove my point that your code is unclear.

Even if there wasnt any established precedence. 0..18 means "anything between 0 and 18 exclusively". So why would you suddenly handle it like it was inclusive? I would suggest reading up on how ranges work.

What ?

That also breaks compatibility with existing parsers. And with checked arithmetic it still needs to check if an overflow has occured. So I dont really see how that is supposed to be much better compared to just saturating, which is also a check for overflow that just returns the max amount instead of None. Doesnt seem that different (performance wise).

choice whatever you want

My implementation exposes all the necessary type information about each range as well. It just encapsulates it in the respective trait implementation so all the necessary edge case handling doesnt need to happen in the parser.

Your code is unclear, use magic code like where does come from this 4 ? and hide logic far away from the code. Yes my idea is way less error prone in the long run. It's may look bad but it's very rusty.

@cenodis
Copy link
Contributor Author

cenodis commented Sep 26, 2021

What ?

Please read this and the documentation for the individual types here. Then please understand why what you are proposing is fundamentally wrong not just from a programming perspective but also from a mathematical perspective.

choice whatever you want
Your code is unclear
where does come from this 4
hide logic far away from the code
Yes my idea is way less error prone in the long run.

Im going to assume this is once again the language barrier and/or you just flat out dont understand the problem. As such, this is not a good use of my time and Im going to call it quits.

Your code doesnt work, breaks assumptions about the basic types and breaks compatibility with the old parsers. One of these points alone would be enough to disqualify it.

My code only looks "magic" and unclear if you dont understand how traits and their implementations work.
Its not "hiding" anything. The proper term is "organizing" which is pretty important if you actually want to maintain your codebase for any reasonable amount of time. Anyone that has a basic understanding of rust would be able to tell how the compiler resolves this (and how that resolution has no real performance penalty since it happens at compile time).

And no, your idea is many things, but definitely not "less error prone". My implementation has a single point of failure (the trait implementations). Your implementation has a point of failure for every parser that is implemented that way (or possibly even for every single match arm if you want to be exact).

In addition to all of the above your implementation has yet another downside: dynamic data. The compiler is not guaranteed to infer anything about dynamic data and may (at worst) include the entire match statement in the final output. With my implementation the compiler will at the very least eliminate conditional checks since the type is known at compile time and guaranteed to be resolved to a concrete implementation.

As for the "rustyness" of the code. I would say taking full advantage of rusts type and trait system makes my implementation fairly "rusty". Because apparently thats a solid, objective metric now.

This is the last post I will make about this until you prove to me that you actually understand what Im trying to tell you as I am fairly convinced that this is just going to end in another dispute that solves nothing and just wastes time that I dont have.

@Stargateur
Copy link
Contributor

Stargateur commented Sep 26, 2021

Please read this and the documentation for the individual types here. Then please understand why what you are proposing is fundamentally wrong not just from a programming perspective but also from a mathematical perspective.

use std::ops::{Bound, RangeBounds};

struct RangeFromInclusive {
    pub start: usize,
}

impl RangeBounds<usize> for RangeFromInclusive {
    fn start_bound(&self) -> Bound<&usize> {
        Bound::Included(&self.start)
    }
    fn end_bound(&self) -> Bound<&usize> {
        Bound::Unbounded
    }
}

fn main() {
    let range = RangeFromInclusive { start: 42 };

    assert_eq!(range.start_bound(), Bound::Included(&42));
}

That not because core syntax didn't include this that user can't do it.

You should really take into account that I'm a rust expert. I probably still have to learn but I pretty good in Rust. I code in Rust every day in my work, and have know Rust pre 1.0. My advice are not random, what I see is you probably a very good C++ dev or whatever OOP language, but sometime in Rust one should not use trait. My match reveal we handle 5 different behavior (we can probably share the code between fold, many, etc...) , there is no way to factorize the 5 behaviors without complicated code that would need an heavy optimization that will be hard to track down. We should prefer KISS.

@Stargateur
Copy link
Contributor

Stargateur commented Sep 26, 2021

my example is stupid cause I should have done for excluded, anyway, same point.

The user can also simply use a tuple (Bound::Excluded(42), Bound::Unbounded)

@Stargateur
Copy link
Contributor

Stargateur commented Sep 26, 2021

if you don't want use range like range are mean to be use, maybe just take two parameters min: Option<usize>, max: Option<usize> that would reduce the number of behavior to 4.

@Stargateur
Copy link
Contributor

So unless you give me something with actual practical impact Im going to call it quits here. If you want to you can write your own "ideal" implementation and then we can compare.

I will do that I think I code better than I write, but I would have prefer we work together.

@cenodis
Copy link
Contributor Author

cenodis commented Sep 26, 2021

there is no way to factorize the 5 behaviors without complicated code

Im not sure what your standards for "complicated" are, but at least for me the following is pretty simple:

fn bounded_iter(&self) -> impl Iterator<usize> {
  0..core::usize::MAX
}

All implementations are like this, they just return a single value which gets used as an iterator.
The only complexity arises from the fact that each range type needs to be handled slightly differently. I did not abstract this because I thought of any OOP pattern. I abstracted it because why instantiate the ranges manually in the parser when we can have the compiler do it?
Ultimately this is no different than the ranges found in the individual match arms in your example. Except we save ourselves the work of having to write it out in every individual parser. Instead we let the compiler do the matching.

would need an heavy optimization that will be hard to track down.

Again, not sure what "hard to track down" is for you. But its nothing different from a normal trait implementation which are used everywhere in Rust.

Monomorphisation replaces the generic function call with one of the concrete implementations.
For example, with an inclusive range 0..=12 this:

for count in range.bounded_iter() {

gets substituted by the compiler using the trait implementation for RangeInclusive<usize> to this:

for count in 0..12 {

which the exact same thing as the harcoded m_n variant, except the compiler can substitue for other range types as well. This of course gets further optimized the same way in either case.

There is really nothing hard to track down in this chain. The functions are annotated with the trait generic and from there its trivial to find the respective implementation for whatever concrete type you pass to it.

And if you look at the comparison between our two godbolt examples above you will see the compiler applies the same optimizations to the point that the binary output is literally identical. So we are not really loosing anything because of this abstraction.

The only real argument for performance is the use of a saturating counter in the case of 0.. and 1.. (all other ranges with an unbounded upper bound require some kind of counter). But going of the bench tests on github any difference (if it exists) drowns in the noise. So unless proven otherwise Im going to say it has no practical impact.

We should prefer KISS

This is KISS. At least not anymore complicated than having to deal with a dozen ranges and loops in a single function. I dont think looking up a trait implementation counts as "complicated".

@Xiretza
Copy link
Contributor

Xiretza commented Jan 16, 2022

Hi, just checking in to ask about the current status here. From what I can see there has been a lot of discussion with few actual results - are there any specific open questions other than the naming of the take_* parsers at this time?

@cenodis
Copy link
Contributor Author

cenodis commented Feb 23, 2022

Hi, just checking in to ask about the current status here. From what I can see there has been a lot of discussion with few actual results

The status can be summed up as "a working implementation exists". Not more, not less.

are there any specific open questions other than the naming of the take_* parsers at this time?

Geal has also raised the question of wether the saturating counter used in the trait implementations has any overhead over the explicit match with 3 different loops. Based on the benchmark tests I couldnt see any significant difference but I havent stress tested it in great detail or inspected the dissasembly.

I personally havent really worked on this recently as I have a second PR open here that was supposed to be reviewed ~6 months ago. I assume Geal is busy but I cant really finish either until I get some proper feedback from a maintainer.

@Geal Geal added this to the 8.0 milestone Mar 14, 2022
@Geal Geal mentioned this pull request Jan 2, 2023
@Geal
Copy link
Collaborator

Geal commented Jan 2, 2023

This is now merged as part of #1608 🥳
Sorry it took so long. I might revisit some points later, but for now it's better to integrate it with the rest and iterate. Thanks a lot for implementing this complex feature!

@Geal Geal closed this Jan 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consolidate parser variants using ranges (e.g. many0, many_m_n)
5 participants