Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefer partition_point to look up assoc items #86392

Merged
merged 1 commit into from
Jun 17, 2021
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
52 changes: 5 additions & 47 deletions compiler/rustc_data_structures/src/sorted_map/index_map.rs
Original file line number Diff line number Diff line change
Expand Up @@ -94,13 +94,15 @@ impl<I: Idx, K: Ord, V> SortedIndexMultiMap<I, K, V> {
Q: Ord + ?Sized,
K: Borrow<Q>,
{
// FIXME: This should be in the standard library as `equal_range`. See rust-lang/rfcs#2184.
match self.binary_search_idx(key) {
Err(_) => self.idxs_to_items_enumerated(&[]),

Ok(idx) => {
let start = self.find_lower_bound(key, idx);
let end = self.find_upper_bound(key, idx);
let start = self.idx_sorted_by_item_key[..idx]
.partition_point(|&i| self.items[i].0.borrow() != key);
let end = idx
+ self.idx_sorted_by_item_key[idx..]
.partition_point(|&i| self.items[i].0.borrow() == key);
self.idxs_to_items_enumerated(&self.idx_sorted_by_item_key[start..end])
Comment on lines -97 to 106
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this is now doing a binary_search_by to use a binary search to find any matching key, and then does two more binary searches (left and right of the item that was found), to find the first one. Why not just use a partition_point(|| .. < ..) to find the first key directly?

That is, if you just remove the match and the Err case, and change the != to <, it still works and does the same thing, right?

Copy link
Member

@m-ou-se m-ou-se Jun 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And unless there's tons of duplicated keys, it can be a lot faster to only use partition_point for finding the lower bound, and just search linearly for the upper bound. See also the comment on line 128 before the change.

Edit: Oh, since this just returns an impl Iterator, it can just not look for the upper_bound at all and return .take_while(..) on the slice starting at the lower bound.)

Copy link
Member Author

@JohnTitor JohnTitor Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh neat, indeed we could improve it more. I'll open a PR to follow up it, it'd be great if you could review it, thanks!

}
}
Expand All @@ -114,50 +116,6 @@ impl<I: Idx, K: Ord, V> SortedIndexMultiMap<I, K, V> {
self.idx_sorted_by_item_key.binary_search_by(|&idx| self.items[idx].0.borrow().cmp(key))
}

/// Returns the index into the `idx_sorted_by_item_key` array of the first item equal to
/// `key`.
///
/// `initial` must be an index into that same array for an item that is equal to `key`.
fn find_lower_bound<Q>(&self, key: &Q, initial: usize) -> usize
where
Q: Ord + ?Sized,
K: Borrow<Q>,
{
debug_assert!(self.items[self.idx_sorted_by_item_key[initial]].0.borrow() == key);

// FIXME: At present, this uses linear search, meaning lookup is only `O(log n)` if duplicate
// entries are rare. It would be better to start with a linear search for the common case but
// fall back to an exponential search if many duplicates are found. This applies to
// `upper_bound` as well.
let mut start = initial;
while start != 0 && self.items[self.idx_sorted_by_item_key[start - 1]].0.borrow() == key {
start -= 1;
}

start
}

/// Returns the index into the `idx_sorted_by_item_key` array of the first item greater than
/// `key`, or `self.len()` if no such item exists.
///
/// `initial` must be an index into that same array for an item that is equal to `key`.
fn find_upper_bound<Q>(&self, key: &Q, initial: usize) -> usize
where
Q: Ord + ?Sized,
K: Borrow<Q>,
{
debug_assert!(self.items[self.idx_sorted_by_item_key[initial]].0.borrow() == key);

// See the FIXME for `find_lower_bound`.
let mut end = initial + 1;
let len = self.items.len();
while end < len && self.items[self.idx_sorted_by_item_key[end]].0.borrow() == key {
end += 1;
}

end
}

fn idxs_to_items_enumerated(&'a self, idxs: &'a [I]) -> impl 'a + Iterator<Item = (I, &'a V)> {
idxs.iter().map(move |&idx| (idx, &self.items[idx].1))
}
Expand Down