Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add the `group_by` and `group_by_mut` methods to slice #2477

Open
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
@Kerollmops
Copy link
Contributor

Kerollmops commented Jun 15, 2018

This RFC propose to add two new methods to the slice, the group_by and group_by_mut. These two will provide a way to iterate over non-overlapping sub-slices of a base slice that are separated by the predicate given by the user (e.g. Partial::eq, |a, b| a < b).

The predicate is called on two elements following themselves, it means the predicate is called on slice[0] and slice[1] then on slice[1] and slice[2]...

let slice = &[1, 1, 1, 3, 3, 2, 2, 2];

let mut iter = slice.group_by(|a, b| a == b);

assert_eq!(iter.next(), Some(&[1, 1, 1][..]));
assert_eq!(iter.next(), Some(&[3, 3][..]));
assert_eq!(iter.next(), Some(&[2, 2, 2][..]));
assert_eq!(iter.next(), None);

Pending Pull Request

Work around temporary library

Rendered

@CodesInChaos

This comment has been minimized.

Copy link

CodesInChaos commented Jun 15, 2018

I don't like the naming. group by in SQL and C# work quite differently from what you propose, they map each item to a key and then group all items having the same key.

Something like split_between would fit the behaviour better.

@Lokathor

This comment has been minimized.

Copy link

Lokathor commented Jun 15, 2018

I guess, but this is exactly what groupBy does in Haskell, so the name should stay.

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Jun 15, 2018

@CodesInChaos the function you describe has the same behavior as GroupWith in Haskell.

@ssokolow

This comment has been minimized.

Copy link

ssokolow commented Jun 15, 2018

I don't like the naming. group by in SQL and C# work quite differently from what you propose, they map each item to a key and then group all items having the same key.

Something like split_between would fit the behaviour better.

I guess, but this is exactly what groupBy does in Haskell, so the name should stay.

I don't like the "My language is more important than yours" tone of this exchange.

That said, from my perspective, a name like split_between still wins out because I haven't seen it anywhere and I consider "no preconceptions" to be better for learnability than being torn between the incompatible preconceptions of two different languages, both significant.

Assuming that group_by turns out to be a non-starter for this reason, is there another name with prior art that could be proposed?

@rvolgers

This comment has been minimized.

Copy link

rvolgers commented Jun 15, 2018

This seems pretty similar to the existing take_while and skip_while functions. But it's even more specific, so I'd want to see some clear use cases. Maybe it's obvious to Haskell programmers, but it isn't to me 😄

I can imagine some cases where you iterate over a list and you can skip some work if some predicate hasn't changed. But in that case, it seems clearer to me to write it as a single loop with that part of the behavior written out with a prev_value variable and in imperative form. It may be a couple more lines of code but it makes the shape of the code a lot more representative of what is being accomplished (a single iteration where the processing has a special fast path case).

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 15, 2018

@ssokolow

I don't like the "My language is more important than yours" tone of this exchange.

I don't think that's it. There's a good reason to favor a Haskell based name in the case of iterator methods, functions, and related things because Rust already uses Haskell based naming in such cases. For example: map, zip, fold, take, take_while, scan, partition, all, any, find, unzip, cycle, sum, product, repeat, splitAt. There should be some consistency in naming, and group_by is consistent.

There's also precedent from itertools in naming this operation group_by.

I also think it's a telling and descriptive name. You are grouping elements by where the predicate applies. (Yes, a split happens every time the predicate doesn't match, so the order matters, but if you sort, then it does not. The minor difference can be cleared up in documentation..)

@rvolgers

But it's even more specific, so I'd want to see some clear use cases. Maybe it's obvious to Haskell programmers, but it isn't to me 😄

Grouping is a pretty common operation; Think of GROUP BY in SQL -- a lot of those cases can be expressed as .group_by(..).

It may be a couple more lines of code but it makes the shape of the code a lot more representative of what is being accomplished (a single iteration where the processing has a special fast path case).

Generally speaking, at least to me, iterator style composition makes data flow clearer :)

@Centril Centril added the T-libs label Jun 15, 2018

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Jun 15, 2018

A related by different function in D language:
https://dlang.org/phobos/std_algorithm_iteration.html#group

And Python:
https://docs.python.org/2/library/itertools.html#itertools.groupby

So what semantics do we prefer?

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 15, 2018

@leonardo-m It seems to me that the semantics of D and Python can be recovered from the proposed semantics in this RFC (which is the same as the semantics in the itertools crate) so I would say that the proposed solution is more general.

@Ixrec

This comment has been minimized.

Copy link
Contributor

Ixrec commented Jun 15, 2018

To me, it's not even sufficiently clear what the RFC is proposing. The name group_by makes me expect an array like [1, 1, 2, 1] to get turned into the groups [1, 1, 1] and [2]. But the fact that this is an iterator method makes me expect a more linear-time-y behavior like [1, 1], [2] then [1]. The RFC's summary sentence does use the term "runs", which I am just barely familiar with so that also seems to imply [1, 1], [2] then [1]. But the RFC doesn't contain a single example of expected inputs and outputs. And in the ~200 lines of tests in the linked implementation, not one test covers this distinction!

The reference implementation of next() seems like it would in fact produce [1, 1], [2] then [1], so I assume that was the author's intent. But still, that should be made utterly and idiot-proof-ly obvious in the RFC, without having to read an implementation.

Another problem with providing an implementation and no specification is that I have no idea what the guarantees/requirements are for the predicate. Maybe this is a silly question, but am I guaranteed that it gets invoked with (a[0], a[1]) then (a[1], a[2]) and so on in that order exactly once each, or is the implementation allowed to do things like (a[0], a[1]) then (a[0], a[2]) and so on? I assume everyone's assuming the former, but since there isn't a single example or test of a predicate other than equality (panic_param_ord doesn't count), it's not super clear what more interesting predicates I should or should not be able to pass to this algorithm.

Which is all a really long way of saying I don't think we can properly bikeshed the name quite yet. In particular, I think we need to see some examples of more interesting predicates (I can't come up with any) to figure out whether "groups", "runs" or "splitting" is the least misleading term for what's going on.

@phaazon

This comment has been minimized.

Copy link
Contributor

phaazon commented Jun 15, 2018

Assuming that group_by turns out to be a non-starter for this reason, is there another name with prior art that could be proposed?

regroup_by? contiguous_group_by? 💃

But the RFC doesn't contain a single example of expected inputs and outputs. And in the ~200 lines of tests in the linked implementation, not one test covers this distinction!

Have a look at the few tests I wrote in my allocation version, here.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 15, 2018

@Ixrec I would expect the behavior to be equivalent to https://docs.rs/itertools/0.7.8/itertools/trait.Itertools.html#method.group_by except that you get a slice as the element of the returned iterator instead of an IntoIterator.

@phaazon What's the prior art for regroup_by?

@Kerollmops Small library additions such as the one proposed in this RFC have historically been accepted with a PR against rust-lang/rust, so I would recommend filing a PR against that repo so that we can see the concrete implementation, tests, etc. and then we can have a much clearer idea.

@phaazon

This comment has been minimized.

Copy link
Contributor

phaazon commented Jun 15, 2018

@Centril About regroup_by, none, and I think:

  • regroup_by semantics would sort upfront, which is not the case of this suggested implementation.
  • group_by is fine enough but if people don’t like it, it shouldn’t be very hard to find a proper name based on the current Iterator other methods.
@phaazon

This comment has been minimized.

Copy link
Contributor

phaazon commented Jun 15, 2018

continuous_chunks and continuous_chunks_by?

@Ixrec

This comment has been minimized.

Copy link
Contributor

Ixrec commented Jun 15, 2018

@Centril Interesting, the itertools method expects an element -> key callback and strongly implies the keys will be compared via PartialEq to determine the groups.

What other use cases does this proposal hope to support by generalizing to a (a, b) -> bool callback? If equality and key equality are all we know of, I'd rather have the more obvious itertools signature.

@phaazon

This comment has been minimized.

Copy link
Contributor

phaazon commented Jun 15, 2018

(Also, not related to the actual discussion, this is more meta about the commit: @Kerollmops, you seem to have pushed this commit with a company email address. I think you should check twice and possibly rebase with your OSS identity – your commit is also not verified).

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 15, 2018

@Ixrec I would provide the following hierarchy:

  • The general operation group_by using (T, T) -> bool
  • The half-general operation group_by_key or group_key using T -> Key, Key: PartialEq
  • The specialized operation group using T: PartialEq

The two latter ones can be implemented in terms of the first one. I would not use the name group_by for T -> Key, Key: PartialEq to leave room for the most general version even if we don't add it now.

EDIT: The use cases for the most general version is for when you have some different notion of equality than the natural one ((Partial)Eq) and it is not field based; for example: you want to group all alpha-equivalent terms and you have some function is_alpha_equivalent : (&Term, &Term) -> bool. There are many other interesting equivalences in this space.

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Jun 15, 2018

It could probably be named split_by.
@Ixrec I will update the RFC to add output examples.
@Centril Ok I will create a PR against rust-lang/rust.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 15, 2018

I think it would be confusing to have split and then split_by with entirely different semantics.

@Lokathor

This comment has been minimized.

Copy link

Lokathor commented Jun 15, 2018

Yeah, split is absolutely not the verb, even if group is also not the verb (though i think group is the best verb here)

@uberjay

This comment has been minimized.

Copy link

uberjay commented Jun 15, 2018

Ruby calls this chunk — https://ruby-doc.org/core-2.5.1/Enumerable.html#method-i-chunk

Of course, that’s already taken. Would it be too confusing if it were chunk_by? Perhaps...

group_by makes me think it’ll behave like the (obviously more expensive) ruby method of the same name, which enumerates and groups/collects (up front) into a hash.

Now... that’s probably not a huge issue, because (in rust :) nobody will accidentally treat the result of group_by as a HashMap?

@Ixrec

This comment has been minimized.

Copy link
Contributor

Ixrec commented Jun 15, 2018

It seems like I'm the only one that finds it unintuitive for group to only produce contiguous runs of equivalent elements rather than gathering up all equivalent elements. But just in case I'm not alone: runs, runs_by_key and runs_by seem like acceptable names to me.

@uberjay

This comment has been minimized.

Copy link

uberjay commented Jun 15, 2018

@lxrec - I also find group_by unintuitive. It's possible my view is colored by ruby use, but my feeling is that group doesn't convey the sequential nature of the operation.

For fun, I asked someone nearby who isn't familiar with any of the languages in question, and they felt pretty strongly that chunk was a far better word than group, for this operation. Totally un-scientific survey of one person! 😬

Now, there's already chunks, which is for fixed length chunks -- maybe chunks_by wouldn't be that odd? seq_group_by?

@ssokolow

This comment has been minimized.

Copy link

ssokolow commented Jun 15, 2018

I'd go with the runs-based naming because "run" is existing jargon (See, for example, Run-length encoding) and, if you're familiar with that jargon, it's intuitive for runs_by to have a similar meaning to passing a key-derivation function into a sorting routine.

(That is, assuming the "contiguous runs" interpretation which would preserve lazy evaluation of the iterator. Otherwise, perhaps something in the vein of collect_by to draw attention to the eager way it must necessarily operate to account for lists not sorted by the derived key first.)

@Lokathor

This comment has been minimized.

Copy link

Lokathor commented Jun 15, 2018

Actually group_by will produce all equivalent elements in each output batch if (and only if) the slice is already sorted and the closure used is an equality test. Otherwise you'll see some other sort of result.

Remember that group_by takes &self (for some [T]), so you can't re-arrange any of the values (since you can't mutate the slice's memory backing), and you also can't move the values into temporary memory and sort there (since you don't bound on T: Copy). Thus, the only possible output is a series of sub-slices of the starting slice that don't have any order changes.

The only question left is one of the ones you already said: assuming the first slice starts with (a[0], a[1]), do we keep checking the next element against the start of the slice or against the latest element of the slice?

@uberjay

This comment has been minimized.

Copy link

uberjay commented Jun 16, 2018

Ok, a few interesting alternatives from the thesaurus:

segment_by - this actually reads quite nicely to me.
snip_by - ok, but a little weird, and probably too similar to split.

@ssokolow

This comment has been minimized.

Copy link

ssokolow commented Jun 16, 2018

@uberjay

I'd go with segment_by. snip_by has a "too clever" feel to it.

Aside from having no precedent I'm aware of, it's too focused on process rather than effect. "Snipping" is what you do to accomplish a goal like "trimming" or "splitting" rather than a goal in itself and says nothing about which snipped pieces, if any, will be retained or discarded. (ie. Are you "snipping [something] off" or "snipping [things] apart"?)

Therefore, it's neither as intuitive as is ideal when looking through a list of methods with a goal in mind, nor obvious about what it does when looking at a use of it in code.

It also generally has a "too informal to fit in with the other terminology" feel to it.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Jun 16, 2018

The motivation here reads to me like "this makes it easier to do what this does", which I don't find persuasive. Why is it common to have this problem, to have the input in exactly the form needed for this proposed implementation, and to not just want the eager HashMap-producing version?

@Ixrec You're definitely not alone; I would absolutely expect "group by" to have the relational algebra meaning as well.

@Lokathor

This comment has been minimized.

Copy link

Lokathor commented Jun 16, 2018

well this works in a no_alloc situation, for one

@burdges

This comment has been minimized.

Copy link

burdges commented Jun 16, 2018

I suppose split and split_mut excluding the matched element was a dubious design decision, given that one can shorten slices easily enough. Anyways you can implement a version that gives slightly short slices like:

let mut b = None;
slice.split(|a| { let c = b; b = Some(a);  b != c })
@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 16, 2018

@Kerollmops

it asks for a key to map the element

True, but the itertools group_by method can be derived from using the same projection on both sides.
To me, this is a detail and not that significant.

and does not return contiguous elements.

It does:

extern crate itertools;
use itertools::Itertools;

fn main() {
    let arr = [1, 1, 2, 2, 1, 1, 3, 3];
    for (key, group) in arr.iter().group_by(|&&elt| elt).into_iter() {
        println!("{:?}", group.collect::<Vec<_>>());
    }
}

outputs:

[1, 1]
[2, 2]
[1, 1] // Notice this!
[3, 3]
@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Jun 16, 2018

@Centril, as you propose above, I have done the work to add to the rust-lang/rust repo.

The problem is that it will not compile if I do not specify an issue on the unstable stuff but there is no traking issue for the moment, so I put "42" as a placeholder.

Right, the itertools group_by method can be named group_by_key for the slice, having the same behavior, it's a detail.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Jun 16, 2018

@Kerollmops Leaving some preliminary notes here on your work (ill leave more as inline comments later when you file the PR):

  • You can use 0 as the issue number until a real tracking issue is made.
  • I'd make the feature name a bit more specific: slice_group_by.
  • All unsafe { .. } elimination blocks should have comments explaining why it is safe.
@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Jun 17, 2018

Why everyone immediately starts bikeshedding the name 😕

I agreed with #2477 (comment) that the RFC doesn't demonstrate why group_by needs to be a part of std, instead of using the existing Itertools::group_by. (Usable in no_std is a good argument but please show why no_std code needs to group_by generic slices.)

Furthermore, checking sourcegraph, after removing the repositories itertools (which defined group_by) and diesel (which is a false positive usage, referring to SQL's GROUP BY clause), we find that only clippy is using this function, suggesting this function isn't in high demand.

You almost always need to sort the slice before using group_by to get some meaningful result (similar to Vec::dedup). This makes it less convenient to use:

let mut packages = metadata.packages;
packages.sort_by(|a, b| a.name.cmp(&b.name));    // <------
for group in packages.group_by(|a, b| a.name == b.name) {
    let name = &group[0].name;
    if group.len() > 1 { ... }
}

The Map-based refactoring isn't that bad compared with this.

let mut grouped_packages: HashMap<_, Vec<_>> = HashMap::new();
for package in &metadata.packages {
    grouped_packages.entry(&package.name).or_default().push(package);
}
for (name, group) in grouped_packages {
    if group.len() > 1 { ... }
}
@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Jun 17, 2018

The real advantage of this group_by implementation over the one from itertools is that it doesn't allocate anything and so it has better performances (something like 10 times faster compared to an Iterator based implementation).

The itertools group_by iterator does not implement DoubleEndedIterator.

You almost always need to sort the slice before using group_by to get some meaningful result (similar to Vec::dedup). This makes it less convenient to use.

It depends on how you insert your data in the Vec before calling group_by on it, it is related to your algorithm, in some of my use cases the sort was not mandatory.

Before I propose this iterator I wrote a custom implementation for a project of mine that looked like this. So it will probably not be visible in the sourcegraph you linked.

let mut previous = None;
let mut iter = slice.iter();
while let Some(elem) = iter.next() {
    if previous.is_none() || previous != Some(elem) {
        previous = Some(elem);

        // do something here with `elem`: the first element of each group
    }
}

@Centril I have opened a PR as you asked for.

@phaazon

This comment has been minimized.

Copy link
Contributor

phaazon commented Jun 17, 2018

Now I start to get why @kennytm and @scottmcm feel this way. The RFC should be modified – especially its motivation – to explain why this should be merged to std instead of a third-party library, like itertools.

@Kerollmops Kerollmops force-pushed the Kerollmops:group-by branch from 2b744bb to e14e0a6 Jun 17, 2018

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Jun 17, 2018

I updated the motivation section as requested.

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Jun 17, 2018

@kennytm @scottmcm Perhaps this data point is useful: I have had 3 distinct usecases for this in the past couple of months. In each case I opted to write a small utility function rather than add a dependency. In two out of the three cases I did not sort the slice first.

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Jun 17, 2018

I have multiple usecases in one of my project and I wrote custom loops just as temporary fixes.

I do not think that this method fit in the itertools library, as the itertools description say: Extra iterator adaptors, functions and macros. And this function is really optimized for slices/contiguous data.

@Kerollmops Kerollmops force-pushed the Kerollmops:group-by branch from e14e0a6 to 5ee7a72 Jun 17, 2018

@burdges

This comment has been minimized.

Copy link

burdges commented Jun 17, 2018

Arguably, these should be called split and split_mut and the existing split and split_mut should be renamed to lines_by or separated or whatever, but presumably that's impossible now.

You can almost implement split with this, assuming you capture the closure's lifetimes.

pub fn split<'a F>(&'a self, pred: F) -> impl Iterator<Item=&'a [T]>
  where F: FnMut(&T) -> bool + 'a {
    self.group_by(|a,_| pred(a)).map(|x| x.split_last()).1
}
pub fn split_mut<'a,F>(&'a self, pred: F) -> impl Iterator<Item=&'a mut [T]>
  where F: FnMut(&T) -> bool + 'a {
    self.group_by_mut(|a,_| pred(a)).map(|x| x.split_last_mut()).1
}

It doesn't quite work however since pred(self[self.len()-1]) never gets evaluates and pred cannot be used in the map. #[async] can express this easily though I guess.

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Jun 18, 2018

I think something like this can correctly emulated the current slice::split method.

pub fn split<F>(&self, pred: F) -> impl Iterator<Item=&[T]>
where F: FnMut(&T) -> bool
{
    self.group_by(|a, _| pred(a))
        .enumerate()
        .map(|(i, slice)| {
            match i {
                0 => slice,
                _ => slice[1..],
            }
        })
}

EDIT: this code ^ is wrong !

But this is not really the subject here, split will be kept as you say, the name for the method I propose will be something between group_by and segment_by.

For the real discussion here is: do we really want to add this method to the standard library ?
@mark-i-m, @Centril and me are likely to merge it in because we faced use cases.

And I did not understand what you mean with the #[async].

@burdges

This comment has been minimized.

Copy link

burdges commented Jun 18, 2018

I think our discussion of deriving split from group_by reflects positively on including it in the standard library, well group_by sounds more fundamental.

As I said, it's unfortunate split looks soo similar to split_at while behaving so radically different. :(

Afaik, you'd need the lifetime so the return can borrow both self and pred, but you'd also need pred to appear in both closures, so you must wrap pred in a RefCell or Arc making it look more like:

pub fn split<'a F>(&'a self, pred: F) -> impl Iterator<Item=&'a [T]>
  where F: FnMut(&T) -> bool + 'a {
    let p = RefCell::new(pred);
    self.group_by(|a,_| p.borrow_mut()(a)).map(|x| {
        { let (h,t) = x.split_last();  if pred(*h) { return t; } }
        x
    })
}
pub fn split_mut<'a,F>(&'a self, pred: F) -> impl Iterator<Item=&'a mut [T]>
  where F: FnMut(&T) -> bool + 'a {
    let p = RefCell::new(pred);
    self.group_by_mut(|a,_| p.borrow_mut()(a)).map(|x| {
        { let (h,t) = x.split_last_mut();  if pred(*h) { return t; } }
        x
    })
}

In principle, the new #[async] generator function would permit borrowing pred across a yield point, but of course that's not enough to pass it to two separate closures, so made a mistake there.

@uberjay

This comment has been minimized.

Copy link

uberjay commented Jun 18, 2018

Apologies about the immediate bike-shedding.

In terms of the actual functionality, I've found the corresponding functionality pretty useful in Ruby-land for segmenting (or chunking, or grouping, if you will :) line-based output into logical chunks more easily. It's not a great day when you're stuck parsing the output of some random command line tool, but Enumerable#chunk made it a bit more pleasant.

So, it's been useful for me in situations where sorting beforehand doesn't make sense.

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Jun 18, 2018

I think the immediate bikeshedding is a sign that people don't have any larger issues with the proposal...

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Jun 22, 2018

Ok so the only blocking thing about this RFC seems to be the name chosen (group_by), as @Lokathor says this is the function name that is used in Haskell, and the Rust standard library is mostly inspired by it.

@uberjay propose segment_by and I think nobody is opposed to this name.

It seems that the only way I can make this RFC move forward is by changing the name from group_by to segment_by.

I am not convinced about this renaming, as @mark-i-m says:

I think the immediate bikeshedding is a sign that people don't have any larger issues with the proposal...

So I propose to tag this RFC as proposed-final-comment-period, publish it in this-week-in-rust to make it more visible and see other oppositions than naming.

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Aug 26, 2018

Ping ! ⛵️

@uberjay

This comment has been minimized.

Copy link

uberjay commented Aug 26, 2018

Totally agree -- it'd be great to have this, regardless of what it's named! I've run across a couple times where it would have come in handy over the past couple months, even!

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Sep 8, 2018

As the RFC seems to get stuck, I will ping someone from the Library team to take a decision.
I choose @Kimundi and @sfackler as he has already been assigned to the "final" pull request.

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Sep 8, 2018

I propose to rename the method group and the returned iterator Group, this way it will follow the same rules as the split methods familly follows. The slice::split method returns a Split struct. Same pattern for rsplit, splitn, rsplitn and all the mutable versions.

Note that the Split struct is generic on the function so the Group struct must not be based on partial equality. In other word, no GroupBy or GroupByKey structs, only the Group one which is invoqued by the slice::group method with a generic function.

The same pattern can be found on other types like str::matches which returns a Matches struct that is generic on the predicate, no MatchesBy struct here.

@joshtriplett

This comment has been minimized.

Copy link
Member

joshtriplett commented Sep 26, 2018

Nominating for discussion in the next libs team meeting, to get the process un-stuck after the various bikeshed-painting. :)

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Nov 13, 2018

@joshtriplett When does the next libs team meeting occurs ?

@Kerollmops

This comment has been minimized.

Copy link
Contributor Author

Kerollmops commented Dec 27, 2018

For those interrested, I have made a temporary library that I will not publish to crates.io, providing a temporary workaround to this RFC/PR that has not been merged. It compiles on stable rust, the version that will be merged will use unstable functions to improve performance and clarity (i.e. offset_from).

let slice = &[1, 1, 1, 3, 3, 2, 2, 2];

let mut iter = GroupBy::new(slice, |a, b| a == b);

assert_eq!(iter.next(), Some(&[1, 1, 1][..]));
assert_eq!(iter.next(), Some(&[3, 3][..]));
assert_eq!(iter.next(), Some(&[2, 2, 2][..]));
assert_eq!(iter.next(), None);
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.