Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upGitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
Use SmallVec for DepNodeIndex within dep_graph. #50565
Conversation
This avoids a decent number of allocations, enough to speed up incremental runs of many rustc-benchmarks, the best by 2%.
|
r? @cramertj (rust_highfive has picked a reviewer for you, use r? to override) |
|
Thanks, @nnethercote! @bors r+ You might also be interested in #45577, which tried to get rid of allocating the |
|
|
…elwoerister Use SmallVec for DepNodeIndex within dep_graph. This avoids a decent number of allocations, enough to speed up incremental runs of many rustc-benchmarks, the best by 2%. Here are the rustc-perf benchmarks that showed an improvement of at least 1% on one run: ``` unused-warnings-check avg: -1.7% min: -2.4% max: 0.0% unused-warnings-opt avg: -1.4% min: -2.0% max: 0.0% unused-warnings avg: -1.4% min: -2.0% max: -0.0% tokio-webpush-simple-check avg: -1.0% min: -1.7% max: 0.0% futures-opt avg: -0.9% min: -1.6% max: 0.0% encoding avg: -1.2% min: -1.6% max: -0.6% encoding-check avg: -0.9% min: -1.6% max: 0.0% encoding-opt avg: -0.8% min: -1.6% max: -0.1% futures avg: -0.9% min: -1.5% max: 0.0% futures-check avg: -0.9% min: -1.5% max: 0.1% regression-31157-check avg: -0.9% min: -1.5% max: 0.0% regex avg: -0.6% min: -1.4% max: 0.0% regression-31157-opt avg: -0.5% min: -1.4% max: 0.1% regression-31157 avg: -0.7% min: -1.4% max: 0.2% regex-opt avg: -0.6% min: -1.4% max: 0.1% hyper-check avg: -0.8% min: -1.4% max: -0.1% regex-check avg: -1.0% min: -1.4% max: 0.0% hyper-opt avg: -0.7% min: -1.4% max: -0.1% hyper avg: -0.7% min: -1.3% max: 0.1% piston-image-opt avg: -0.4% min: -1.3% max: 0.0% tokio-webpush-simple-opt avg: -0.3% min: -1.3% max: 0.0% piston-image-check avg: -0.5% min: -1.3% max: -0.0% syn-opt avg: -0.5% min: -1.3% max: 0.0% clap-rs-check avg: -0.3% min: -1.3% max: 0.2% piston-image avg: -0.5% min: -1.2% max: 0.1% syn avg: -0.5% min: -1.2% max: 0.1% syn-check avg: -0.6% min: -1.2% max: -0.1% issue-46449-opt avg: -0.4% min: -1.2% max: -0.1% parser-check avg: -0.7% min: -1.2% max: 0.1% issue-46449 avg: -0.5% min: -1.2% max: -0.0% ```
…elwoerister Use SmallVec for DepNodeIndex within dep_graph. This avoids a decent number of allocations, enough to speed up incremental runs of many rustc-benchmarks, the best by 2%. Here are the rustc-perf benchmarks that showed an improvement of at least 1% on one run: ``` unused-warnings-check avg: -1.7% min: -2.4% max: 0.0% unused-warnings-opt avg: -1.4% min: -2.0% max: 0.0% unused-warnings avg: -1.4% min: -2.0% max: -0.0% tokio-webpush-simple-check avg: -1.0% min: -1.7% max: 0.0% futures-opt avg: -0.9% min: -1.6% max: 0.0% encoding avg: -1.2% min: -1.6% max: -0.6% encoding-check avg: -0.9% min: -1.6% max: 0.0% encoding-opt avg: -0.8% min: -1.6% max: -0.1% futures avg: -0.9% min: -1.5% max: 0.0% futures-check avg: -0.9% min: -1.5% max: 0.1% regression-31157-check avg: -0.9% min: -1.5% max: 0.0% regex avg: -0.6% min: -1.4% max: 0.0% regression-31157-opt avg: -0.5% min: -1.4% max: 0.1% regression-31157 avg: -0.7% min: -1.4% max: 0.2% regex-opt avg: -0.6% min: -1.4% max: 0.1% hyper-check avg: -0.8% min: -1.4% max: -0.1% regex-check avg: -1.0% min: -1.4% max: 0.0% hyper-opt avg: -0.7% min: -1.4% max: -0.1% hyper avg: -0.7% min: -1.3% max: 0.1% piston-image-opt avg: -0.4% min: -1.3% max: 0.0% tokio-webpush-simple-opt avg: -0.3% min: -1.3% max: 0.0% piston-image-check avg: -0.5% min: -1.3% max: -0.0% syn-opt avg: -0.5% min: -1.3% max: 0.0% clap-rs-check avg: -0.3% min: -1.3% max: 0.2% piston-image avg: -0.5% min: -1.2% max: 0.1% syn avg: -0.5% min: -1.2% max: 0.1% syn-check avg: -0.6% min: -1.2% max: -0.1% issue-46449-opt avg: -0.4% min: -1.2% max: -0.1% parser-check avg: -0.7% min: -1.2% max: 0.1% issue-46449 avg: -0.5% min: -1.2% max: -0.0% ```
Rollup of 18 pull requests Successful merges: - #49423 (Extend tests for RFC1598 (GAT)) - #50010 (Give SliceIndex impls a test suite of girth befitting the implementation (and fix a UTF8 boundary check)) - #50447 (Fix update-references for tests within subdirectories.) - #50514 (Pull in a wasm fix from LLVM upstream) - #50524 (Make DepGraph::previous_work_products immutable) - #50532 (Don't use Lock for heavily accessed CrateMetadata::cnum_map.) - #50538 ( Make CrateNum allocation more thread-safe. ) - #50564 (Inline `Span` methods.) - #50565 (Use SmallVec for DepNodeIndex within dep_graph.) - #50569 (Allow for specifying a linker plugin for cross-language LTO) - #50572 (Clarify in the docs that `mul_add` is not always faster.) - #50574 (add fn `into_inner(self) -> (Idx, Idx)` to RangeInclusive (#49022)) - #50575 (std: Avoid `ptr::copy` if unnecessary in `vec::Drain`) - #50588 (Move "See also" disambiguation links for primitive types to top) - #50590 (Fix tuple struct field spans) - #50591 (Restore RawVec::reserve* documentation) - #50598 (Remove unnecessary mutable borrow and resizing in DepGraph::serialize) - #50606 (Retry when downloading the Docker cache.) Failed merges: - #50161 (added missing implementation hint) - #50558 (Remove all reference to DepGraph::work_products)
Rollup of 18 pull requests Successful merges: - #49423 (Extend tests for RFC1598 (GAT)) - #50010 (Give SliceIndex impls a test suite of girth befitting the implementation (and fix a UTF8 boundary check)) - #50447 (Fix update-references for tests within subdirectories.) - #50514 (Pull in a wasm fix from LLVM upstream) - #50524 (Make DepGraph::previous_work_products immutable) - #50532 (Don't use Lock for heavily accessed CrateMetadata::cnum_map.) - #50538 ( Make CrateNum allocation more thread-safe. ) - #50564 (Inline `Span` methods.) - #50565 (Use SmallVec for DepNodeIndex within dep_graph.) - #50569 (Allow for specifying a linker plugin for cross-language LTO) - #50572 (Clarify in the docs that `mul_add` is not always faster.) - #50574 (add fn `into_inner(self) -> (Idx, Idx)` to RangeInclusive (#49022)) - #50575 (std: Avoid `ptr::copy` if unnecessary in `vec::Drain`) - #50588 (Move "See also" disambiguation links for primitive types to top) - #50590 (Fix tuple struct field spans) - #50591 (Restore RawVec::reserve* documentation) - #50598 (Remove unnecessary mutable borrow and resizing in DepGraph::serialize) - #50606 (Retry when downloading the Docker cache.) Failed merges: - #50161 (added missing implementation hint) - #50558 (Remove all reference to DepGraph::work_products)
Rollup of 18 pull requests Successful merges: - #49423 (Extend tests for RFC1598 (GAT)) - #50010 (Give SliceIndex impls a test suite of girth befitting the implementation (and fix a UTF8 boundary check)) - #50447 (Fix update-references for tests within subdirectories.) - #50514 (Pull in a wasm fix from LLVM upstream) - #50524 (Make DepGraph::previous_work_products immutable) - #50532 (Don't use Lock for heavily accessed CrateMetadata::cnum_map.) - #50538 ( Make CrateNum allocation more thread-safe. ) - #50564 (Inline `Span` methods.) - #50565 (Use SmallVec for DepNodeIndex within dep_graph.) - #50569 (Allow for specifying a linker plugin for cross-language LTO) - #50572 (Clarify in the docs that `mul_add` is not always faster.) - #50574 (add fn `into_inner(self) -> (Idx, Idx)` to RangeInclusive (#49022)) - #50575 (std: Avoid `ptr::copy` if unnecessary in `vec::Drain`) - #50588 (Move "See also" disambiguation links for primitive types to top) - #50590 (Fix tuple struct field spans) - #50591 (Restore RawVec::reserve* documentation) - #50598 (Remove unnecessary mutable borrow and resizing in DepGraph::serialize) - #50606 (Retry when downloading the Docker cache.) Failed merges: - #50161 (added missing implementation hint) - #50558 (Remove all reference to DepGraph::work_products)
…small `reserve_and_rehash` takes up 1.4% of the runtime on the `packed-simd` benchmark which I believe is due to the number of reads are very low in many cases (see rust-lang#50565 for instance). This avoids allocating the set until we start allocating the `reads` `SmallVec` but it is possible that a lower limit might be better (not tested since the improvement will be hard to spot either way).
perf(dep_graph): Avoid allocating a set on when the number reads are … …small `reserve_and_rehash` takes up 1.4% of the runtime on the `packed-simd` benchmark which I believe is due to the number of reads are very low in many cases (see #50565 for instance). This avoids allocating the set until we start allocating the `reads` `SmallVec` but it is possible that a lower limit might be better (not tested since the improvement will be hard to spot either way).
perf(dep_graph): Avoid allocating a set on when the number reads are … …small `reserve_and_rehash` takes up 1.4% of the runtime on the `packed-simd` benchmark which I believe is due to the number of reads are very low in many cases (see rust-lang#50565 for instance). This avoids allocating the set until we start allocating the `reads` `SmallVec` but it is possible that a lower limit might be better (not tested since the improvement will be hard to spot either way).
This avoids a decent number of allocations, enough to speed up
incremental runs of many rustc-benchmarks, the best by 2%.
Here are the rustc-perf benchmarks that showed an improvement of at least 1% on one run: