Max & Min on BTreeMap are unexpectedly slow #59947

jeremycochoy · 2019-04-13T20:08:36Z

Hello,

The use of iterator's max and min function do not seams to take advantage of the specific ordering of BTreeMap and looks like it is browsing the whole container.
Although it is still possible to have fast access to minimum and maximum element by using first and first_back, this is not semantically explicite and you can expect that developers may not know the subtilities regarding BTreeMap's implementation and could assume the complexity should be the same.

Here is the benchmark result

running 4 tests
test bench_first ... bench:           9 ns/iter (+/- 0)
test bench_last  ... bench:           8 ns/iter (+/- 0)
test bench_max   ... bench:       3,603 ns/iter (+/- 536)
test bench_min   ... bench:       3,755 ns/iter (+/- 328)

and here is the benchmark code:

#![feature(test)]

extern crate test;
#[macro_use]
extern crate lazy_static;

use test::Bencher;
use std::collections::BTreeMap;

lazy_static! {
    static ref MAP: BTreeMap<i32, i32> = {
        let mut map = BTreeMap::new();

        for i in 0..1000 {map.insert(i, i);}

        map
    };
}
#[bench]
fn bench_max(b: &mut Bencher) {

    b.iter(|| {MAP.iter().max()});
}

#[bench]
fn bench_first(b: &mut Bencher) {

    b.iter(|| {MAP.iter().next()});
}

#[bench]
fn bench_last(b: &mut Bencher) {
    b.iter(|| {MAP.iter().next_back()});
}

#[bench]
fn bench_min(b: &mut Bencher) {
    b.iter(|| {MAP.iter().min()});
}

I don't know the details of the trait implementation, but I think it would be very nice for the community if you can make std::collections::btree_map::Iter take advantage of the ordering. :)

The text was updated successfully, but these errors were encountered:

jonas-schievink · 2019-04-13T21:16:13Z

Correct me if I'm wrong, but it seems like min just needs to forward to next, while max can call next_back. This is possible only because tuples are lexicographically ordered and a map cannot contain the same key twice.

This should be done for Iter, IterMut, IntoIter and Keys. While we're at it, it can also be done for Range and RangeMut.

Given that BTreeSet just uses BTreeMap internally the same likely applies there and its iterators should forward min and max to the BTreeMap iterators.

lambda-fairy · 2019-04-14T00:11:35Z

That change would technically break backward compatibility, since the original implementations drained the whole iterator whereas the new implementation will leave most of the elements in place. I think that's fixable though – swap in an empty range before returning the result.

lambda-fairy · 2019-04-14T02:13:47Z

I think there's another possible issue with this change.

Currently, .min() and .map(|x| x).min() and .min_by_key(|x| x) work the same way. But with this specialization, they will have wildly different performance characteristics.

Do we have a policy on when specializing an iterator method is appropriate? My concern is that if we optimize some special cases, then users might expect other similar cases to be optimized in the same way, leading to confusion when that doesn't happen. I wonder what we can do to prevent such issues from popping up here.

the8472 · 2019-04-14T14:45:03Z

There are other optimizations that will let you run off a performance cliff if you deviate from the happy path. E.g. vec allocation relies on the OS giving you zeroed pages to avoid initializing it, making even very large empty vecs fairly cheap. If you allocate a vec but pre-init to some other value it can suddenly get a lot slower.

Leaving performance on the table to reduce surprises does not seem like a great tradeoff. And add ing map to the iterator chain is a far bigger change than changing the vec initialization value.

cuviper · 2019-04-14T17:21:38Z

That change would technically break backward compatibility, since the original implementations drained the whole iterator whereas the new implementation will leave most of the elements in place.

Iterator::min() and max() consume self, so the remains of the iterator will just drop normally. That's a no-op for the iterators by reference, and IntoIter should properly drop the remaining items. So I don't see a problem as far as that goes.

I think the only observable change would be if the key type has side effects in Ord/PartialOrd, but I doubt we should commit to exact guarantees in how that's called.

lambda-fairy · 2019-04-15T03:27:14Z

~~@cuviper My concern wasn't with whether the elements of the collection get dropped or not, because (as you note) it works either way.~~

~~The difference I was thinking of was with this code:~~

let mut it = my_map.into_iter();
it.by_ref().min();
println!("{:?}", it.next());

With the old implementation, this code would print None as all the items have been consumed. But a specialized implementation, if written in a naive way, would only consume the first element, leaving the last line to print Some(<second element>) instead.

~~Technically the Iterator::min/max docs don't guarantee how many elements get consumed, though, so maybe we can get away with not addressing this.~~

EDIT: never mind I'm dumb

cuviper · 2019-04-15T04:01:52Z

Calling .by_ref().min() uses the generic implementation for &mut I where I: Iterator, which wouldn't be affected by changes on the specific types here.

scottmcm · 2019-04-18T03:24:05Z

I recall a conversation about why we don't make .last() just call .next_back() for DEI, but I can't seem to find it to go look at what the reasoning was...

the8472 · 2019-06-02T13:19:31Z

@scottmcm that was #60130

…Simulacrum Shortcuts for min/max on double-ended BTreeMap/BTreeSet iterators Closes rust-lang#59947: a performance tweak that might benefit some. Optimizes `min` and `max ` on all btree double-ended iterators that do not drop, i.e. the iterators created by: - `BTreeMap::iter` - `BTreeMap::iter_mut` - `BTreeMap::keys` and `BTreeSet::iter` - `BTreeMap::range` and `BTreeSet::range` - `BTreeMap::range_mut` Also in these (currently) single-ended iterators, but obviously for `min` only: - `BTreeSet::difference` - `BTreeSet::intersection` - `BTreeSet::symmetric_difference` - `BTreeSet::union` Did not do this in iterators created by `into_iter` to preserve drop order, as outlined in rust-lang#62316. Did not do this in iterators created by `drain_filter`, possibly to preserve drop order, possibly to preserve predicate invocation, mostly to not have to think about it too hard (I guess maybe it wouldn't be a change for `min`, which is the only shortcut possible in this single-ended iterator).

jonas-schievink added A-collections Area: std::collections. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Apr 13, 2019

cuviper mentioned this issue Jul 24, 2019

Tracking issue for map_first_last: first/last methods on BTreeSet and BTreeMap #62924

Closed

6 tasks

ssomers mentioned this issue Jun 22, 2020

Shortcuts for min/max on ordinary BTreeMap/BTreeSet iterators #73627

Merged

bors closed this as completed in dfbba65 Jun 27, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Max & Min on BTreeMap are unexpectedly slow #59947

Max & Min on BTreeMap are unexpectedly slow #59947

jeremycochoy commented Apr 13, 2019

jonas-schievink commented Apr 13, 2019

lambda-fairy commented Apr 14, 2019 •

edited

Loading

lambda-fairy commented Apr 14, 2019

the8472 commented Apr 14, 2019

cuviper commented Apr 14, 2019

lambda-fairy commented Apr 15, 2019 •

edited

Loading

cuviper commented Apr 15, 2019

scottmcm commented Apr 18, 2019

the8472 commented Jun 2, 2019

Max & Min on BTreeMap are unexpectedly slow #59947

Max & Min on BTreeMap are unexpectedly slow #59947

Comments

jeremycochoy commented Apr 13, 2019

jonas-schievink commented Apr 13, 2019

lambda-fairy commented Apr 14, 2019 • edited Loading

lambda-fairy commented Apr 14, 2019

the8472 commented Apr 14, 2019

cuviper commented Apr 14, 2019

lambda-fairy commented Apr 15, 2019 • edited Loading

cuviper commented Apr 15, 2019

scottmcm commented Apr 18, 2019

the8472 commented Jun 2, 2019

lambda-fairy commented Apr 14, 2019 •

edited

Loading

lambda-fairy commented Apr 15, 2019 •

edited

Loading