BTreeSet intersection, is_subset & difference optimizations #64820

Merged
merged 1 commit into from Oct 2, 2019


@ssomers
Contributor

commented Sep 26, 2019

...based on the range of values contained; in particular, a massive improvement when these ranges are disjoint (or merely touching), like in the neg-vs-pos benchmarks already in liballoc. Inspired by #64383 but none of the ideas there worked out.

I introduced another variant in IntersectionInner and in DifferenceInner, because I couldn't find a way to initialize these iterators as empty if there's no empty set around.

Also, reduced the size of "large" sets in test cases - if Miri can't handle it, it was needlessly slowing down everyone.
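To illustrate the idea of deciding from the range of values alone, here is a minimal sketch built on the public `BTreeSet` API. It is not the code in this PR, which changes the iterator internals; the helper names `bounds` and `intersection_vec` are hypothetical and exist only for this illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: smallest and largest element, or None for an empty set.
fn bounds<T: Ord>(set: &BTreeSet<T>) -> Option<(&T, &T)> {
    let mut iter = set.iter();
    let min = iter.next()?;
    // In a one-element set the minimum is also the maximum.
    let max = iter.next_back().unwrap_or(min);
    Some((min, max))
}

/// Hypothetical sketch: intersect two sets, answering from the value ranges
/// alone whenever those ranges are disjoint or merely touching.
fn intersection_vec<T: Ord + Clone>(a: &BTreeSet<T>, b: &BTreeSet<T>) -> Vec<T> {
    match (bounds(a), bounds(b)) {
        (Some((a_min, a_max)), Some((b_min, b_max))) => {
            if a_max < b_min || b_max < a_min {
                // Disjoint ranges: nothing can be shared.
                Vec::new()
            } else if a_max == b_min {
                // Touching ranges: the boundary value is the only shared element.
                vec![a_max.clone()]
            } else if b_max == a_min {
                vec![b_max.clone()]
            } else {
                // Overlapping ranges: fall back to the general algorithm.
                a.intersection(b).cloned().collect()
            }
        }
        // At least one set is empty.
        _ => Vec::new(),
    }
}
```

With a set of negative integers and a set of non-negative integers, the first branch answers without walking either set, which is the situation the neg-vs-pos benchmarks exercise.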

@rust-highfive

Collaborator

commented Sep 26, 2019

r? @sfackler

(rust_highfive has picked a reviewer for you, use r? to override)

@Centril

Member

commented Sep 26, 2019

Contributor Author

left a comment

Meanwhile, I tweaked the order in both match expressions to move first/min before last/max.

@ssomers ssomers force-pushed the ssomers:master branch from 0ed38b5 to f16fa72 Sep 30, 2019
@ssomers

Contributor Author

commented Sep 30, 2019

Property-based tests and performance comparison by Travis are now cleaned up and as complete as I can think of.

let mut other_iter = other.iter();
let other_min = other_iter.next().unwrap();
let other_max = other_iter.next_back().unwrap();
let mut self_iter = match (self_min.cmp(other_min), self_max.cmp(other_max)) {

@bluss

bluss Sep 30, 2019

Member

In the previous method you use the Ord::cmp(x, y) style and here x.cmp(y). Either is fine but consistency is best.

@ssomers

ssomers Sep 30, 2019

Author Contributor

I never noticed that. Let's count: we have 4 x Ord::cmp, 3 x cmp (counting pairs as one). Before I started messing about in this code, there was just 1 Ord::cmp and 2 cmp. Notice that cmp_opt acts as a replacement for Ord::cmp but uses cmp itself.

So I say, use the shorter, member cmp.
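For reference, a small sketch showing that the two spellings discussed here are the same call, so the choice is purely one of style. The function name `compare_both_styles` is hypothetical and not part of the PR.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: compares `x` and `y` with both spellings.
/// The two results are always identical.
fn compare_both_styles<T: Ord>(x: &T, y: &T) -> (Ordering, Ordering) {
    // Fully qualified, function-call style.
    let qualified = Ord::cmp(x, y);
    // Method-call style on the same trait method.
    let method = x.cmp(y);
    (qualified, method)
}
```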

@bluss

Member

commented Sep 30, 2019

Nice! Cool benchmark setup. I only had nitpicks to contribute to the review. Would love if there was a way to write this without .unwrap() (using discriminants for control flow instead), but it is clear enough that they can never panic here. r=me when nitpicks are fixed to taste

@Centril

Member

commented Sep 30, 2019

Property-based tests and performance comparison by Travis are now cleaned up and as complete as I can think of.

Oh nice! -- Could we add the proptests to the test suite? cc @alexcrichton @nikomatsakis

@ssomers ssomers force-pushed the ssomers:master branch from f16fa72 to d132a70 Oct 1, 2019
@ssomers

Contributor Author

commented Oct 1, 2019

write this without .unwrap

I tried several times, but always hit unsavory amounts of indentation, remote else clauses or eRFC 2497. But now I think I saw the light, resulting in a little less code that is more readable (mostly by dropping some of the micro-optimization). Peculiar indentation courtesy of cargo fmt.

r=bluss

{
(other_min, other_max)
} else {
return false; // other is empty

@ssomers

ssomers Oct 1, 2019

Author Contributor

This else-part cannot be reached, due to the performance shortcut on top. It's possible to:

  • merge this let if with the if let above, but then it's not at all clear to the casual reader that it should return true
  • write a panic! explaining this, better than raw unwrap I guess, but pointless extra code

@bluss

bluss Oct 1, 2019

Member

unreachable!("message") is the panic for that, but since false is the correct answer anyway, we don't need a panic; it seems this works just as well.
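The shape under discussion can be sketched as follows. This is a free-function approximation, not the code in the PR: the name `is_subset_sketch` is hypothetical, and the length comparison at the top is an assumption about what the "performance shortcut" looks like.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the method under review, written without `.unwrap()`.
fn is_subset_sketch<T: Ord>(sub: &BTreeSet<T>, sup: &BTreeSet<T>) -> bool {
    // Assumed shortcut: a larger set can never be a subset of a smaller one.
    if sub.len() > sup.len() {
        return false;
    }
    let (sub_min, sub_max) =
        if let (Some(min), Some(max)) = (sub.iter().next(), sub.iter().next_back()) {
            (min, max)
        } else {
            return true; // sub is empty
        };
    let (sup_min, sup_max) =
        if let (Some(min), Some(max)) = (sup.iter().next(), sup.iter().next_back()) {
            (min, max)
        } else {
            // Cannot be reached: sub is non-empty and no larger than sup.
            // Returning false is still the right answer, so no panic is needed.
            return false;
        };
    // Range check: sub must lie within the value range of sup.
    if sub_min < sup_min || sub_max > sup_max {
        return false;
    }
    // General case: every element of sub must be present in sup.
    sub.iter().all(|x| sup.contains(x))
}
```

Replacing the second `return false` with `unreachable!("message")` would compile equally well; the sketch keeps the plain return because that value is correct even if the branch were somehow taken.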

@ssomers

Contributor Author

commented Oct 1, 2019

Could we add the proptests to the test suite

I don't know what test suites there are, but seeing `if cfg!(miri) { // Miri is too slow` appear in the unit tests tells me not everyone would welcome proptests in the standard test suite. I could easily write a bunch of small unit tests covering every corner, but not in the current scheme with 1 test function testing every kind of intersection in 1 file covering everything about sets.
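As a rough idea of what small corner-case checks with a Miri-aware size could look like, here is a sketch. The names `large_len`, `intersect` and `corner_cases_hold`, as well as the sizes 8 and 1_000, are hypothetical; in a real test suite each check would more likely be its own `#[test]` function.

```rust
use std::collections::BTreeSet;

/// Hypothetical size for "large" test sets: much smaller when running under Miri.
fn large_len() -> i32 {
    if cfg!(miri) { 8 } else { 1_000 }
}

/// Hypothetical helper: intersects two slices via BTreeSet, returning sorted output.
fn intersect(a: &[i32], b: &[i32]) -> Vec<i32> {
    let a: BTreeSet<i32> = a.iter().copied().collect();
    let b: BTreeSet<i32> = b.iter().copied().collect();
    a.intersection(&b).copied().collect()
}

/// One small check per corner: empty inputs, disjoint ranges,
/// touching ranges, a single shared element, and identical sets.
fn corner_cases_hold() -> bool {
    let n = large_len();
    let neg: Vec<i32> = (-n..0).collect();
    let pos: Vec<i32> = (0..n).collect();
    intersect(&[], &[]).is_empty()
        && intersect(&[1, 2, 3], &[]).is_empty()
        && intersect(&neg, &pos).is_empty()
        && intersect(&[-1, 0], &[0, 1]) == [0]
        && intersect(&[1, 2, 3], &[2]) == [2]
        && intersect(&pos, &pos) == pos
}
```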

@bluss bluss changed the title from "BTreeSet intersection, is_subnet & difference optimizations" to "BTreeSet intersection, is_subset & difference optimizations" Oct 1, 2019
@bluss

Member

commented Oct 1, 2019

@bors r+ rollup

Thanks!

@bors

Contributor

commented Oct 1, 2019

📌 Commit d132a70 has been approved by bluss

tmandry added a commit to tmandry/rust that referenced this pull request Oct 1, 2019
@tmandry tmandry referenced this pull request Oct 1, 2019
bors added a commit that referenced this pull request Oct 1, 2019
Rollup of 8 pull requests

Successful merges:

 - #63416 (apfloat: improve doc comments)
 - #64722 (Make all alt builders produce parallel-enabled compilers)
 - #64820 (BTreeSet intersection, is_subset & difference optimizations)
 - #64840 (SelfProfiler API refactoring and part one of event review)
 - #64910 (syntax: cleanup param, method, and misc parsing)
 - #64912 (Remove unneeded `fn main` blocks from docs)
 - #64933 (Fixes #64919. Suggest fix based on operator precendence.)
 - #64952 (Update cargo.)

Failed merges:

r? @ghost
Centril added a commit to Centril/rust that referenced this pull request Oct 1, 2019
@Centril Centril referenced this pull request Oct 1, 2019
bors added a commit that referenced this pull request Oct 1, 2019
Rollup of 7 pull requests

Successful merges:

 - #63416 (apfloat: improve doc comments)
 - #64820 (BTreeSet intersection, is_subset & difference optimizations)
 - #64910 (syntax: cleanup param, method, and misc parsing)
 - #64912 (Remove unneeded `fn main` blocks from docs)
 - #64933 (Fixes #64919. Suggest fix based on operator precendence.)
 - #64943 (Add lower bound doctests for `saturating_{add,sub}` signed ints)
 - #64950 (Simplify interners)

Failed merges:

r? @ghost
@bors bors merged commit d132a70 into rust-lang:master Oct 2, 2019
4 checks passed
pr Build #20191001.42 succeeded
pr (Linux mingw-check) Linux mingw-check succeeded
pr (Linux x86_64-gnu-llvm-6.0) Linux x86_64-gnu-llvm-6.0 succeeded
pr (LinuxTools) LinuxTools succeeded
6 participants