New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up `nearest_common_ancestor`. #50106

Merged
merged 2 commits into from Apr 25, 2018

Conversation

Projects
None yet
9 participants
@nnethercote
Contributor

nnethercote commented Apr 20, 2018

nearest_common_ancestor can be made faster.

Here are all the benchmarks where one of the measurements improved by at least 1%.

clap-rs-check
	avg: -4.5%	min: -8.8%	max: -0.3%
clap-rs
	avg: -2.6%	min: -4.5%	max: 0.5%
script-servo
	avg: -1.7%	min: -3.6%	max: 0.0%
regression-31157
	avg: -1.5%	min: -2.6%	max: -0.4%
hyper
	avg: -1.2%	min: -2.5%	max: -0.0%
piston-image
	avg: -1.6%	min: -2.5%	max: 0.1%
regex
	avg: -1.2%	min: -2.2%	max: 0.0%
issue-46449
	avg: -1.8%	min: -2.1%	max: -0.7%
crates.io
	avg: -1.2%	min: -2.1%	max: 0.0%
hyper-check
	avg: -1.0%	min: -2.1%	max: -0.1%
clap-rs-opt
	avg: -1.4%	min: -2.0%	max: -0.3%
piston-image-check
	avg: -1.2%	min: -1.9%	max: -0.1%
regex-check
	avg: -0.5%	min: -1.8%	max: -0.1%
syn
	avg: -1.1%	min: -1.7%	max: -0.1%
tokio-webpush-simple-check
	avg: -1.1%	min: -1.6%	max: -0.3%
tokio-webpush-simple
	avg: -1.2%	min: -1.6%	max: -0.0%
helloworld-check
	avg: -1.4%	min: -1.6%	max: -1.2%
deeply-nested
	avg: -1.2%	min: -1.4%	max: -0.8%
encoding-check
	avg: -0.8%	min: -1.3%	max: -0.3%
unify-linearly-check
	avg: -1.0%	min: -1.3%	max: -0.8%
script-servo-check
	avg: -0.6%	min: -1.3%	max: 0.0%
regression-31157-check
	avg: -0.9%	min: -1.2%	max: -0.7%
script-servo-opt
	avg: -0.5%	min: -1.2%	max: 0.1%
deeply-nested-check
	avg: -0.8%	min: -1.2%	max: -0.7%
encoding
	avg: -0.7%	min: -1.1%	max: -0.3%
issue-46449-check
	avg: -0.9%	min: -1.1%	max: -0.6%
parser-check
	avg: -0.9%	min: -1.1%	max: -0.8%
html5ever
	avg: -0.5%	min: -1.0%	max: -0.0%

nnethercote added some commits Apr 20, 2018

Fix a copy-and-paste bug in nearest_common_ancestor.
This code path is rarely hit, which likely explains why this bug hasn't
been detected before now. (I only noticed it via code inspection.)
Speed up `nearest_common_ancestor()`.
`nearest_common_ancestor()` uses an algorithm that requires computing
the full scope chain for both scopes, which is expensive because each
element involves a hash table lookup, and then looking for a common
tail.

This patch changes `nearest_common_ancestor()` to use a different
algorithm, which starts at the given scopes and works outwards (i.e. up
the scope tree) until a common ancestor is found. This is much faster
because in most cases the common ancestor is found well before the end
of the scope chains. Also, the use of a SmallVec avoids the need for any
allocation most of the time.
@@ -706,7 +706,7 @@ impl<'tcx> ScopeTree {
// "Modeling closures" section of the README in
// infer::region_constraints for more details.
let a_root_scope = a_ancestors[a_index];
let b_root_scope = a_ancestors[a_index];
let b_root_scope = b_ancestors[b_index];

This comment has been minimized.

@michaelwoerister

michaelwoerister Apr 20, 2018

Contributor

Nice catch!

@michaelwoerister

This comment has been minimized.

Contributor

michaelwoerister commented Apr 20, 2018

Thanks, @nnethercote! I think @nikomatsakis had some ideas on how to exploit some domain knowledge to speed this. He probably should take a look at this.

r? @nikomatsakis

@nnethercote

This comment has been minimized.

Contributor

nnethercote commented Apr 23, 2018

Here are things I've found from my measurements:

  • Scope chains can be long, usually at least dozens of entries, and often 100+ entries.
  • The vast majority of the time, the the nearest common ancestor is much closer to the front of the scope chain than the back. I.e. the two scope chains have a handful of elements that differ at the front and then a long common tail. This is why the new algorithm is a much better fit.

I'm not satisfied with the representation of parent_map, which is a HashMap mapping child scopes to parent scopes. It's hot and it feels to me like we should be able to use a different representation that doesn't require a hash table lookup for every child-to-parent lookup. I have some half-baked ideas that I am investigating, but they might be more suitable for a follow-up PR. Even if those ideas work out, I'm confident that this new algorithm will continue to be a better fit than the old one.

@nnethercote

This comment has been minimized.

Contributor

nnethercote commented Apr 24, 2018

I'm not satisfied with the representation of parent_map, which is a HashMap mapping child scopes to parent scopes. It's hot and it feels to me like we should be able to use a different representation that doesn't require a hash table lookup for every child-to-parent lookup. I have some half-baked ideas that I am investigating, but they might be more suitable for a follow-up PR.

I've done some work on this, and I haven't yet managed to make things any faster with the new representation. So I think it's definitely worth evaluating this PR as is.

@nikomatsakis

This looks like the algorithm I wanted to do too -- thanks @nnethercote !

@nikomatsakis

This comment has been minimized.

Contributor

nikomatsakis commented Apr 24, 2018

@bors r+

@bors

This comment has been minimized.

Contributor

bors commented Apr 24, 2018

📌 Commit cccd51c has been approved by nikomatsakis

@bors

This comment has been minimized.

Contributor

bors commented Apr 25, 2018

⌛️ Testing commit cccd51c with merge 38eb9cd...

bors added a commit that referenced this pull request Apr 25, 2018

Auto merge of #50106 - nnethercote:nearest_common_ancestor, r=nikomat…
…sakis

Speed up `nearest_common_ancestor`.

`nearest_common_ancestor` can be made faster.

Here are all the benchmarks where one of the measurements improved by at least 1%.
```
clap-rs-check
	avg: -4.5%	min: -8.8%	max: -0.3%
clap-rs
	avg: -2.6%	min: -4.5%	max: 0.5%
script-servo
	avg: -1.7%	min: -3.6%	max: 0.0%
regression-31157
	avg: -1.5%	min: -2.6%	max: -0.4%
hyper
	avg: -1.2%	min: -2.5%	max: -0.0%
piston-image
	avg: -1.6%	min: -2.5%	max: 0.1%
regex
	avg: -1.2%	min: -2.2%	max: 0.0%
issue-46449
	avg: -1.8%	min: -2.1%	max: -0.7%
crates.io
	avg: -1.2%	min: -2.1%	max: 0.0%
hyper-check
	avg: -1.0%	min: -2.1%	max: -0.1%
clap-rs-opt
	avg: -1.4%	min: -2.0%	max: -0.3%
piston-image-check
	avg: -1.2%	min: -1.9%	max: -0.1%
regex-check
	avg: -0.5%	min: -1.8%	max: -0.1%
syn
	avg: -1.1%	min: -1.7%	max: -0.1%
tokio-webpush-simple-check
	avg: -1.1%	min: -1.6%	max: -0.3%
tokio-webpush-simple
	avg: -1.2%	min: -1.6%	max: -0.0%
helloworld-check
	avg: -1.4%	min: -1.6%	max: -1.2%
deeply-nested
	avg: -1.2%	min: -1.4%	max: -0.8%
encoding-check
	avg: -0.8%	min: -1.3%	max: -0.3%
unify-linearly-check
	avg: -1.0%	min: -1.3%	max: -0.8%
script-servo-check
	avg: -0.6%	min: -1.3%	max: 0.0%
regression-31157-check
	avg: -0.9%	min: -1.2%	max: -0.7%
script-servo-opt
	avg: -0.5%	min: -1.2%	max: 0.1%
deeply-nested-check
	avg: -0.8%	min: -1.2%	max: -0.7%
encoding
	avg: -0.7%	min: -1.1%	max: -0.3%
issue-46449-check
	avg: -0.9%	min: -1.1%	max: -0.6%
parser-check
	avg: -0.9%	min: -1.1%	max: -0.8%
html5ever
	avg: -0.5%	min: -1.0%	max: -0.0%
```
@bors

This comment has been minimized.

Contributor

bors commented Apr 25, 2018

💔 Test failed - status-appveyor

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Apr 25, 2018

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[00:01:39] warning: spurious network error (1 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:39] error: unable to get packages from source
[00:01:39] 
[00:01:39] Caused by:
[00:01:39]   failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:39] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:39] Build completed unsuccessfully in 0:00:51
[00:01:39] Makefile:81: recipe for target 'prepare' failed
[00:01:39] make: *** [prepare] Error 1
[00:01:39]  Downloading serde_json v1.0.15
[00:01:40] error: unable to get packages from source
[00:01:40] 
[00:01:40] Caused by:
[00:01:40] Caused by:
[00:01:40]   failed to get 200 response from `https://crates.io/api/v1/crates/serde_json/1.0.15/download`, got 404
[00:01:40] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:40] Build completed unsuccessfully in 0:00:00
[00:01:40] Makefile:81: recipe for target 'prepare' failed
[00:01:40] make: *** [prepare] Error 1
[00:01:40]  Downloading getopts v0.2.17
[00:01:40] error: unable to get packages from source
[00:01:40] 
[00:01:40] Caused by:
[00:01:40] Caused by:
[00:01:40]   failed to get 200 response from `https://crates.io/api/v1/crates/getopts/0.2.17/download`, got 404
[00:01:40] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:40] Build completed unsuccessfully in 0:00:00
[00:01:40] Makefile:81: recipe for target 'prepare' failed
[00:01:40] make: *** [prepare] Error 1
[00:01:40]  Downloading serde_json v1.0.15
[00:01:41] error: unable to get packages from source
[00:01:41] 
[00:01:41] Caused by:
[00:01:41] Caused by:
[00:01:41]   failed to get 200 response from `https://crates.io/api/v1/crates/serde_json/1.0.15/download`, got 404
[00:01:41] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:41] Build completed unsuccessfully in 0:00:00
[00:01:41] Makefile:81: recipe for target 'prepare' failed
[00:01:41] make: *** [prepare] Error 1
[00:01:41]  Downloading filetime v0.1.15
[00:01:41] warning: spurious network error (2 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:41] warning: spurious network error (1 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:41] error: unable to get packages from source
[00:01:41] error: unable to get packages from source
[00:01:41] 
[00:01:41] Caused by:
[00:01:41]   failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:41] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:41] Build completed unsuccessfully in 0:00:00
[00:01:41] make: *** [prepare] Error 1
[00:01:41] Makefile:81: recipe for target 'prepare' failed
[00:01:41] The command has failed after 5 attempts.

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 1.
travis_time:start:1914da01
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@nnethercote

This comment has been minimized.

Contributor

nnethercote commented Apr 25, 2018

This looks like an infrastructure failure. Can someone please retry?

@nagbot-rs

This comment has been minimized.

nagbot-rs commented Apr 25, 2018

@bors

This comment has been minimized.

Contributor

bors commented Apr 25, 2018

@nagbot-rs: 🔑 Insufficient privileges: not in try users

@aturon

This comment has been minimized.

Member

aturon commented Apr 25, 2018

@bors: retry

@bors

This comment has been minimized.

Contributor

bors commented Apr 25, 2018

⌛️ Testing commit cccd51c with merge cc79420...

bors added a commit that referenced this pull request Apr 25, 2018

Auto merge of #50106 - nnethercote:nearest_common_ancestor, r=nikomat…
…sakis

Speed up `nearest_common_ancestor`.

`nearest_common_ancestor` can be made faster.

Here are all the benchmarks where one of the measurements improved by at least 1%.
```
clap-rs-check
	avg: -4.5%	min: -8.8%	max: -0.3%
clap-rs
	avg: -2.6%	min: -4.5%	max: 0.5%
script-servo
	avg: -1.7%	min: -3.6%	max: 0.0%
regression-31157
	avg: -1.5%	min: -2.6%	max: -0.4%
hyper
	avg: -1.2%	min: -2.5%	max: -0.0%
piston-image
	avg: -1.6%	min: -2.5%	max: 0.1%
regex
	avg: -1.2%	min: -2.2%	max: 0.0%
issue-46449
	avg: -1.8%	min: -2.1%	max: -0.7%
crates.io
	avg: -1.2%	min: -2.1%	max: 0.0%
hyper-check
	avg: -1.0%	min: -2.1%	max: -0.1%
clap-rs-opt
	avg: -1.4%	min: -2.0%	max: -0.3%
piston-image-check
	avg: -1.2%	min: -1.9%	max: -0.1%
regex-check
	avg: -0.5%	min: -1.8%	max: -0.1%
syn
	avg: -1.1%	min: -1.7%	max: -0.1%
tokio-webpush-simple-check
	avg: -1.1%	min: -1.6%	max: -0.3%
tokio-webpush-simple
	avg: -1.2%	min: -1.6%	max: -0.0%
helloworld-check
	avg: -1.4%	min: -1.6%	max: -1.2%
deeply-nested
	avg: -1.2%	min: -1.4%	max: -0.8%
encoding-check
	avg: -0.8%	min: -1.3%	max: -0.3%
unify-linearly-check
	avg: -1.0%	min: -1.3%	max: -0.8%
script-servo-check
	avg: -0.6%	min: -1.3%	max: 0.0%
regression-31157-check
	avg: -0.9%	min: -1.2%	max: -0.7%
script-servo-opt
	avg: -0.5%	min: -1.2%	max: 0.1%
deeply-nested-check
	avg: -0.8%	min: -1.2%	max: -0.7%
encoding
	avg: -0.7%	min: -1.1%	max: -0.3%
issue-46449-check
	avg: -0.9%	min: -1.1%	max: -0.6%
parser-check
	avg: -0.9%	min: -1.1%	max: -0.8%
html5ever
	avg: -0.5%	min: -1.0%	max: -0.0%
```
@bors

This comment has been minimized.

Contributor

bors commented Apr 25, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing cc79420 to master...

@bors bors merged commit cccd51c into rust-lang:master Apr 25, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details

@nnethercote nnethercote deleted the nnethercote:nearest_common_ancestor branch Apr 26, 2018

@kirillkh

This comment has been minimized.

kirillkh commented Apr 30, 2018

Two remarks :

  1. In the worst case, the new algorithm is quadratic in the combined length of the two stacks (seen). Did you try to use a hash table instead of SmallVec?
  2. It is possible to ~halve the number of comparisons being done in each iteration of the outer loop. Maintain two seen vecs instead of one. Add items from stack a into seena and items from stack b into seenb. Look for items from stack a in seenb and vice versa.
@nnethercote

This comment has been minimized.

Contributor

nnethercote commented Apr 30, 2018

I did try a hash table. SmallVec was slightly faster for the rustc-perf suite.

Two Vecs would halve the number of comparisons, but it would require an extra heap allocation. I don't know which effect would be greater.

@kirillkh

This comment has been minimized.

kirillkh commented Apr 30, 2018

Another option is to maintain a single seen vec and use stride 2 for search.

@kirillkh

This comment has been minimized.

kirillkh commented Apr 30, 2018

And actually I expect 2 SmallVecs to be a win, considering they are stack-allocated most of the time.

@nnethercote

This comment has been minimized.

Contributor

nnethercote commented May 2, 2018

I filed #50365 for the two vectors change. Thank you for the suggestion, @kirillkh.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment