Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement interleave() #405

Merged
merged 10 commits into from Sep 22, 2017

Conversation

Projects
None yet
2 participants
@laumann
Copy link
Contributor

laumann commented Sep 3, 2017

For parity with itertools, interleave() should alternately produce
elements from two given iterators. Once one of the iterators run out,
elements should just be drawn from the other until both are exhausted.

This is from #384

@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 11, 2017

I'm not sure why this fails, some of the errors are that the ? operator is not stable. Is there something I need to do?

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 11, 2017

It's not you, but futures update -- see #412.

At a glance, the PR looks great. I'll try to give it a proper review soon!

@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 12, 2017

Ok, thank you :-) Let me know if there's anything I can do...

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 17, 2017

Can you now please rebase, to see how CI fares?

@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 18, 2017

Sure thing

@laumann laumann force-pushed the laumann:add-interleave branch from 0c3cfd2 to e0f18b4 Sep 18, 2017

laumann added some commits Aug 27, 2017

Implement interleave()
For parity with itertools, interleave() should alternately produce
elements from two given iterators. Once one of the iterators run out,
elements should just be drawn from the other until both are exhausted.

This is from #384
interleave: A little cleanup
Remove unused import, remove commented test code and add more test cases

@laumann laumann force-pushed the laumann:add-interleave branch from e0f18b4 to bb6017b Sep 18, 2017

@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 18, 2017

Sorry about the noise, I rebased on a different machine, and forgot to set core.name and core.email - should be fixed now.

laumann added some commits Sep 18, 2017

interleave: Don't use shorthand field initialization
This feature wasn't present in Rust 1.12.
@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 18, 2017

@cuviper It should build now, I installed a rust 1.12.0 locally. The commitment to backwards compatibility is pretty cool, I think. 🤞 for tests passing

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 18, 2017

Looks like the order is getting mixed up:

---- iter::test::check_interleave_eq stdout ----
	thread 'iter::test::check_interleave_eq' panicked at 'assertion failed: `(left == right)`
  left: `[0, 10, 1, 2, 11, 12, 3, 13, 4, 14, 5, 15, 6, 7, 16, 17, 8, 18, 9, 19]`,
 right: `[0, 10, 1, 11, 2, 12, 3, 13, 4, 14, 5, 15, 6, 16, 7, 17, 8, 18, 9, 19]`', src/iter/test.rs:1735:4
@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 18, 2017

Yeah, I'm not sure why though - I can't get it to fail locally, and it seems to be only on nightly that it fails :-/

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 18, 2017

it seems to be only on nightly that it fails

nightly is the only one that runs tests, if only because the dev-dependency compiletest_rs requires it (and dev-deps can't be conditional).

@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 18, 2017

Hmm, that makes sense - am I doing something wrong then? I run cargo +nightly test, and I tried setting RUSTFLAGS='--cfg=rayon_unstable' as well, but so far no dice.

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 18, 2017

I think Travis runs with 2-cpus, so setting RAYON_NUM_THREADS=2 locally may get you closer. I suggest adding a temporary debug-print with all your index, i_idx, j_idx, i_split, j_split, flag, even etc. so you have a trace of how things go awry.

@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 18, 2017

Oh, that did it, thanks! I'll do some debugging now :-)

interleave: Carry self.flag through in InterleaveSeq
The flag indicating the next iterator to pull from was almost carried
through all the way, except for InterleaveSeq's constructor, which
always set it to false.
@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 18, 2017

Thanks for the help @cuviper! I had forgotten to carry the flag through to InterleaveSeq's constructor.

{
i: I,
j: J,
flag: bool,

This comment has been minimized.

@cuviper

cuviper Sep 19, 2017

Member

Do we really need the flag here? I think it's not used until we get to the Producer.

This comment has been minimized.

@laumann

laumann Sep 20, 2017

Author Contributor

I think you're right

j: J,
i_len: usize,
j_len: usize,
flag: bool,

This comment has been minimized.

@cuviper

cuviper Sep 19, 2017

Member

I think we need a more informative name for this flag, and a doc-comment explaining its purpose as well. Think of the poor developer who may have to revisit this code many months later... :)

This comment has been minimized.

@laumann

laumann Sep 20, 2017

Author Contributor

I'll try to think of a better name... names are hard :-)

This comment has been minimized.

@laumann

laumann Sep 20, 2017

Author Contributor

I'm open for suggestions on the name :-) Any of these you'd prefer?

  • toggle_flag
  • next_iter/next_producer

This comment has been minimized.

@cuviper

cuviper Sep 21, 2017

Member

It's indicating which iterator will produce the next item, so maybe i_next or j_next? (depending on what you want true to mean)

This comment has been minimized.

@laumann

laumann Sep 21, 2017

Author Contributor

I went with i_next.

self.flag = !self.flag;
if self.flag {
match self.i.next() {
None => self.j.next(),

This comment has been minimized.

@cuviper

cuviper Sep 19, 2017

Member

This sort of assumes that i will be fused (continue returning None), as the flag flips back and forth trying this iterator again. We may want a done: bool to remember and stop flipping the flag, or I guess len() can tell us this too.

This comment has been minimized.

@laumann

laumann Sep 20, 2017

Author Contributor

Yes, the itertools version is fused. I'm not really sure of the best way to handle it, I would just add a done bool I think

This comment has been minimized.

@cuviper

cuviper Sep 21, 2017

Member

Ah, it's literally fused! You could just use that too. Fuse has a done bool itself, but it also specializes on the unstable FusedIterator trait to skip that check for naturally-fused iterators.

This comment has been minimized.

@laumann

laumann Sep 21, 2017

Author Contributor

Ok, I added Fuse to both iterators in InterleaveSeq

pub struct InterleaveSeq<I, J> {
i: I,
j: J,
flag: bool

This comment has been minimized.

@cuviper

cuviper Sep 19, 2017

Member

Again, better name and comments please.


fn size_hint(&self) -> (usize, Option<usize>) {
let (ih, jh) = (self.i.size_hint(), self.j.size_hint());
let min = ih.0.checked_add(jh.0).unwrap_or(usize::MAX);

This comment has been minimized.

@cuviper

cuviper Sep 19, 2017

Member

Perhaps saturating_add instead?

This comment has been minimized.

@laumann

laumann Sep 20, 2017

Author Contributor

Done.

{
#[inline]
fn next_back(&mut self) -> Option<I::Item> {
if self.i.len() <= self.j.len() {

This comment has been minimized.

@cuviper

cuviper Sep 19, 2017

Member

I think if they are equal length, then it needs to take the flag into account.

This comment has been minimized.

@laumann

laumann Sep 20, 2017

Author Contributor

I agree, done.

@@ -809,6 +811,14 @@ pub trait IndexedParallelIterator: ParallelIterator {
zip::new(self, zip_op.into_par_iter())
}

/// Interleave elements of this iterator and the other given iterator.

This comment has been minimized.

@cuviper

cuviper Sep 19, 2017

Member

Your PR is more descriptive than these docs -- please elaborate! An example would be nice too.

This comment has been minimized.

@laumann

laumann Sep 20, 2017

Author Contributor

Elaborated and added example.

xs.par_iter().interleave(&ys).map(|&i| i).collect_into(&mut res);
assert_eq!(expected, res, "Case {} failed", i+1);
}
}

This comment has been minimized.

@cuviper

cuviper Sep 19, 2017

Member

How about a test under rev(), for the next_back behavior?

This comment has been minimized.

@laumann

laumann Sep 20, 2017

Author Contributor

I added another assert_eq that tests each case when using rev()

laumann added some commits Sep 20, 2017

interleave: Handle most review comments
Documentation in a few places, add tests with `rev()` to exercise
`next_back()` and fix the `next_back()` implementation.
interleave: More review changes
 - Remove ExactSizeIterator from Iterator impl for InterleaveSeq
 - Use `saturating_add()` instead of `checked_add().unwrap_or(MAX)`
interleave: Review changes
 - Rename `flag` to `i_next`
 - Use `Fuse` for the iterators embedded in `InterleaveSeq`
@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 22, 2017

Let's merge -- thanks!

@cuviper cuviper merged commit 2cf5ac2 into rayon-rs:master Sep 22, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 22, 2017

Thanks! It should be relatively easy to implement interleave_shortest, I think, so I could give that a shot if you're interested.

@laumann laumann deleted the laumann:add-interleave branch Sep 22, 2017

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 22, 2017

Yeah, I think interleave_shortest will be a simple wrapper, much like zip_eq and zip. Go for it!

@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 22, 2017

Hmm, it may not be as trivial as I thought - the zip_eq is simple, because it is just zip with an added assertion in front. interleave_shortest should terminate once one of the iterators runs out (interleave exhausts them both).

But by their example, it is not simply take()ing the first n elements where n = cmp::min(i.len(), j.len()):

let it = (1..7).interleave_shortest(vec![-1, -2]);
itertools::assert_equal(it, vec![1, -1, 2, -2, 3]);

(source)

Observe that the last element is drawn from the first iterator, ie three elements are drawn from that one while the second only provides two elements.

I can see how I'd change with_producer for Interleave to select the right lengths for creating the producers, but I'd like to avoid copying too much code if possible. Maybe I'll try to pull that code out in interleave.rs and implement InterleaveShortest in interleave.rs? But I'm for better suggestions on how to implement this :-)

@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 22, 2017

Something like this?

if i.len() <= j.len() {
    // take equal lengths from both iterators
    let n = i.len();
    i.take(n).interleave(j.take(n))
} else {
    // take one extra item from the first iterator
    let n = j.len();
    i.take(n + 1).interleave(j.take(n))
}

(using take even when it's redundant, to get consistent types)

@laumann

This comment has been minimized.

Copy link
Contributor Author

laumann commented Sep 22, 2017

Thanks for the suggestion! I was trying something similar, but run into:

error[E0308]: mismatched types
  --> src/iter/interleave.rs:49:9
   |
42 | pub fn new_shortest<I, J>(i: I, j: J) -> Interleave<I, J>
   |                                          ---------------- expected `iter::interleave::Interleave<I, J>` because of return type
...
49 |         i.take(n).interleave(j.take(n))
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter, found struct `iter::take::Take`
   |
   = note: expected type `iter::interleave::Interleave<I, J>`
              found type `iter::interleave::Interleave<iter::take::Take<I>, iter::take::Take<J>>`

error[E0308]: mismatched types
  --> src/iter/interleave.rs:53:9
   |
42 | pub fn new_shortest<I, J>(i: I, j: J) -> Interleave<I, J>
   |                                          ---------------- expected `iter::interleave::Interleave<I, J>` because of return type
...
53 |         i.take(n + 1).interleave(j.take(n))
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter, found struct `iter::take::Take`
   |
   = note: expected type `iter::interleave::Interleave<I, J>`
              found type `iter::interleave::Interleave<iter::take::Take<I>, iter::take::Take<J>>`
@cuviper

This comment has been minimized.

Copy link
Member

cuviper commented Sep 22, 2017

pub fn new_shortest<I, J>(i: I, j: J) -> Interleave<I, J>

I think we'll want this to return a distinct InterleaveShortest<I, J> type, so we're not bound to any particular implementation details. This custom type can contain an Interleave<Take<I>, Take<J>>, and forward all methods to that. (as ZipEq does)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.