Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upGitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
Use scope tree depths to speed up `nearest_common_ancestor`. #51394
Conversation
This patch adds depth markings to all entries in the `ScopeTree`'s `parent_map`. This change increases memory usage somewhat, but permits a much faster algorithm to be used: - If one scope has a greater depth than the other, the deeper scope is moved upward until they are at equal depths. - Then we move the two scopes upward in lockstep until they match. This avoids the need to keep track of which scopes have already been seen, which was the major part of the cost of the old algorithm. It also reduces the number of child-to-parent moves (which are hash table lookups) when the scopes start at different levels, because it never goes past the nearest common ancestor the way the old algorithm did. Finally, the case where one of the scopes is the root is now handled in advance, because that is moderately common and lets us skip everything. This change speeds up runs of several rust-perf benchmarks, the best by 6%.
|
@bors r+ |
|
|
Use scope tree depths to speed up `nearest_common_ancestor`.
This patch adds depth markings to all entries in the `ScopeTree`'s
`parent_map`. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:
- If one scope has a greater depth than the other, the deeper scope is
moved upward until they are at equal depths.
- Then we move the two scopes upward in lockstep until they match.
This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.
Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.
This change speeds up runs of several rust-perf benchmarks, the best by
6%.
A selection of the bigger improvements:
```
clap-rs-check
avg: -2.6% min: -6.6% max: 0.0%
syn-check
avg: -2.2% min: -5.0% max: 0.0%
style-servo-check
avg: -2.9%? min: -4.8%? max: 0.0%?
cargo-check
avg: -1.3% min: -2.8% max: 0.0%
sentry-cli-check
avg: -1.0% min: -2.1% max: 0.0%
webrender-check
avg: -0.9% min: -2.0% max: 0.0%
style-servo
avg: -0.9%? min: -1.8%? max: -0.0%?
ripgrep-check
avg: -0.7% min: -1.8% max: 0.1%
clap-rs
avg: -0.9% min: -1.6% max: -0.2%
regex-check
avg: -0.2% min: -1.3% max: 0.1%
syn
avg: -0.6% min: -1.3% max: 0.1%
hyper-check
avg: -0.5% min: -1.1% max: 0.0%
```
The idea came from multiple commenters on my blog and on Reddit. Thank you!
r? @nikomatsakis
Use scope tree depths to speed up `nearest_common_ancestor`.
This patch adds depth markings to all entries in the `ScopeTree`'s
`parent_map`. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:
- If one scope has a greater depth than the other, the deeper scope is
moved upward until they are at equal depths.
- Then we move the two scopes upward in lockstep until they match.
This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.
Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.
This change speeds up runs of several rust-perf benchmarks, the best by
6%.
A selection of the bigger improvements:
```
clap-rs-check
avg: -2.6% min: -6.6% max: 0.0%
syn-check
avg: -2.2% min: -5.0% max: 0.0%
style-servo-check
avg: -2.9%? min: -4.8%? max: 0.0%?
cargo-check
avg: -1.3% min: -2.8% max: 0.0%
sentry-cli-check
avg: -1.0% min: -2.1% max: 0.0%
webrender-check
avg: -0.9% min: -2.0% max: 0.0%
style-servo
avg: -0.9%? min: -1.8%? max: -0.0%?
ripgrep-check
avg: -0.7% min: -1.8% max: 0.1%
clap-rs
avg: -0.9% min: -1.6% max: -0.2%
regex-check
avg: -0.2% min: -1.3% max: 0.1%
syn
avg: -0.6% min: -1.3% max: 0.1%
hyper-check
avg: -0.5% min: -1.1% max: 0.0%
```
The idea came from multiple commenters on my blog and on Reddit. Thank you!
r? @nikomatsakis
Rollup of 7 pull requests Successful merges: - #51186 (Remove two redundant .nll.stderr files) - #51298 (Stabilize unit tests with non-`()` return type) - #51359 ([fuchsia] Migrate from launchpad to fdio_spawn_etc) - #51368 (Fix the use of closures within #[panic_implementation]) - #51389 (rustdoc: Fix missing stability and src links for inlined external macros) - #51394 (Use scope tree depths to speed up `nearest_common_ancestor`.) - #51399 (NLL performance boost) Failed merges:
Use scope tree depths to speed up `nearest_common_ancestor`.
This patch adds depth markings to all entries in the `ScopeTree`'s
`parent_map`. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:
- If one scope has a greater depth than the other, the deeper scope is
moved upward until they are at equal depths.
- Then we move the two scopes upward in lockstep until they match.
This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.
Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.
This change speeds up runs of several rust-perf benchmarks, the best by
6%.
A selection of the bigger improvements:
```
clap-rs-check
avg: -2.6% min: -6.6% max: 0.0%
syn-check
avg: -2.2% min: -5.0% max: 0.0%
style-servo-check
avg: -2.9%? min: -4.8%? max: 0.0%?
cargo-check
avg: -1.3% min: -2.8% max: 0.0%
sentry-cli-check
avg: -1.0% min: -2.1% max: 0.0%
webrender-check
avg: -0.9% min: -2.0% max: 0.0%
style-servo
avg: -0.9%? min: -1.8%? max: -0.0%?
ripgrep-check
avg: -0.7% min: -1.8% max: 0.1%
clap-rs
avg: -0.9% min: -1.6% max: -0.2%
regex-check
avg: -0.2% min: -1.3% max: 0.1%
syn
avg: -0.6% min: -1.3% max: 0.1%
hyper-check
avg: -0.5% min: -1.1% max: 0.0%
```
The idea came from multiple commenters on my blog and on Reddit. Thank you!
r? @nikomatsakis
|
|
|
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
|
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
|
Looks like an infra failure. |
|
@bors retry - network failure |
Use scope tree depths to speed up `nearest_common_ancestor`.
This patch adds depth markings to all entries in the `ScopeTree`'s
`parent_map`. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:
- If one scope has a greater depth than the other, the deeper scope is
moved upward until they are at equal depths.
- Then we move the two scopes upward in lockstep until they match.
This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.
Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.
This change speeds up runs of several rust-perf benchmarks, the best by
6%.
A selection of the bigger improvements:
```
clap-rs-check
avg: -2.6% min: -6.6% max: 0.0%
syn-check
avg: -2.2% min: -5.0% max: 0.0%
style-servo-check
avg: -2.9%? min: -4.8%? max: 0.0%?
cargo-check
avg: -1.3% min: -2.8% max: 0.0%
sentry-cli-check
avg: -1.0% min: -2.1% max: 0.0%
webrender-check
avg: -0.9% min: -2.0% max: 0.0%
style-servo
avg: -0.9%? min: -1.8%? max: -0.0%?
ripgrep-check
avg: -0.7% min: -1.8% max: 0.1%
clap-rs
avg: -0.9% min: -1.6% max: -0.2%
regex-check
avg: -0.2% min: -1.3% max: 0.1%
syn
avg: -0.6% min: -1.3% max: 0.1%
hyper-check
avg: -0.5% min: -1.1% max: 0.0%
```
The idea came from multiple commenters on my blog and on Reddit. Thank you!
r? @nikomatsakis
Rollup of 13 pull requests Successful merges: - #50143 (Add deprecation lint for duplicated `macro_export`s) - #51099 (Fix Issue 38777) - #51276 (Dedup auto traits in trait objects.) - #51298 (Stabilize unit tests with non-`()` return type) - #51360 (Suggest parentheses when a struct literal needs them) - #51391 (Use spans pointing at the inside of a rustdoc attribute) - #51394 (Use scope tree depths to speed up `nearest_common_ancestor`.) - #51396 (Make the size of Option<NonZero*> a documented guarantee.) - #51401 (Warn on `repr` without hints) - #51412 (Avoid useless Vec clones in pending_obligations().) - #51427 (compiletest: autoremove duplicate .nll.* files (#51204)) - #51436 (Do not require stage 2 compiler for rustdoc) - #51437 (rustbuild: generate full list of dependencies for metadata) Failed merges:
This patch adds depth markings to all entries in the
ScopeTree'sparent_map. This change increases memory usage somewhat, but permits amuch faster algorithm to be used:
If one scope has a greater depth than the other, the deeper scope is
moved upward until they are at equal depths.
Then we move the two scopes upward in lockstep until they match.
This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.
Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.
This change speeds up runs of several rust-perf benchmarks, the best by
6%.
A selection of the bigger improvements:
The idea came from multiple commenters on my blog and on Reddit. Thank you!
r? @nikomatsakis