Skip to content

Add comm iterator (inspired by Unix comm command)? #963

@VorpalBlade

Description

@VorpalBlade

I recently had the need for comparing sorted iterators to figure out which elements were only in the left, right or both iterators. This is basically the same operation as comm(1) but generalised to iterators.

So I wrote an implementation that works for my purposes. However, this is a side thing in the crate I'm working on, it doesn't fit as an exposed API from it. So I wonder if there is any interest in cleaning up this and going through the effort of making a PR for itertools (where it seems like a good fit).

Below is the basic implementation (excluding tests, Clone, Debug, FusedIterator and other things that aren't core to the API). I left the doctest in to show how it is used. There are probably ways to improve all of this, but lets see if there is any interest in this at all before I spend time on that.

/// Describes which iterator(s) the item is present in
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy, Hash)]
pub enum Comm<T> {
    /// The item is only present in the left iterator
    LeftOnly(T),
    /// The item is present in both iterator
    Both(T),
    /// The item is only present in the right iterator
    RightOnly(T),
}

/// An iterator that compares two sorted iterators of items
///
/// See [`comm`] for more information.
pub struct CommIterator<L: Iterator, R: Iterator> {
    left_iter: std::iter::Peekable<L>,
    right_iter: std::iter::Peekable<R>,
}

impl<L, R> Iterator for CommIterator<L, R>
where
    L: Iterator,
    R: Iterator<Item = L::Item>,
    L::Item: Ord,
    L::Item: PartialEq,
    L: FusedIterator,
    R: FusedIterator,
{
    type Item = Comm<L::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        // We need to run next after the peek to placate the borrow checker.
        match (self.left_iter.peek(), self.right_iter.peek()) {
            (None, None) => None,
            (None, Some(_)) => {
                let val = self.right_iter.next().expect("We just peeked");
                Some(Comm::RightOnly(val))
            }
            (Some(_), None) => {
                let val = self.left_iter.next().expect("We just peeked");
                Some(Comm::LeftOnly(val))
            }
            (Some(l), Some(r)) => match l.cmp(r) {
                Ordering::Equal => {
                    let val = self.left_iter.next().expect("We just peeked");
                    self.right_iter.next().expect("We just peeked");
                    Some(Comm::Both(val))
                }
                Ordering::Less => {
                    let val = self.left_iter.next().expect("We just peeked");
                    Some(Comm::LeftOnly(val))
                }
                Ordering::Greater => {
                    let val = self.right_iter.next().expect("We just peeked");
                    Some(Comm::RightOnly(val))
                }
            },
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let left_hint = self.left_iter.size_hint();
        let right_hint = self.right_iter.size_hint();

        // At least the longer of the two iterators
        let min = left_hint.0.max(right_hint.0);
        // At most the sum of the lengths of the two iterators
        let max = left_hint.1.and_then(|l| right_hint.1.map(|r| l + r));

        (min, max)
    }
}

/// Compare two sorted slices of items
///
/// Example use case
/// ```
/// use itertools::{comm, Comm};
/// let results: Vec<Comm<&i32>> = comm(
///     [1, 2, 3, 4, 5, 8].iter(),
///     [3, 4, 5, 6, 7].iter()).collect();
/// assert_eq!(results, vec![
///    Comm::LeftOnly(&1),
///    Comm::LeftOnly(&2),
///    Comm::Both(&3),
///    Comm::Both(&4),
///    Comm::Both(&5),
///    Comm::RightOnly(&6),
///    Comm::RightOnly(&7),
///    Comm::LeftOnly(&8),
/// ]);
/// ```
pub fn comm<L, R>(left: L, right: R) -> CommIterator<L, R>
where
    L: Iterator,
    R: Iterator<Item = L::Item>,
    L::Item: Ord,
    L::Item: PartialEq,
    L: FusedIterator,
    R: FusedIterator,
{
    let left_iter = left.peekable();
    let right_iter = right.peekable();

    CommIterator {
        left_iter,
        right_iter,
    }
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions