This week, there were a number of cases where the incr-unchanged variants of inflate went up or down by 5% to 6%; we believe these are instances of increased noise in benchmarks documented on rustc-perf#1105. I was tempted to remove these from the report, but it's non-trivial to reconstruct the report "as if" some benchmark were omitted.
Otherwise, there were some nice wins for performance. For example, PR #90996 more than halved the compilation time for full builds of diesel by revising how we hash ObligationCauseData. If anyone is interested, it might be good to follow up on the effects of PR #90352, "Simplify for loop desugar", where we have hypothesized that the increased compilation time is due to more LLVM optimizations being applied.
Triage done by @pnkfelix. Revision range: 934624fe..22c2d9dd
1 Regression, 3 Improvements, 8 Mixed; 3 of them in rollups. 34 comparisons made in total.
rustdoc: Make two small cleanups #91073
- Very large regression in instruction counts (up to 5.3% on incr-unchanged builds of inflate)
- This was previously triaged as spurious by the PR author.
- (Only inflate debug incr-unchanged was affected significantly.)
Rollup of 7 pull requests #90966
- Very large improvement in instruction counts (up to -5.9% on incr-unchanged builds of inflate)
- (This is bouncing around in the same manner that the above regression bounced around; i.e., this is a spurious improvement.)
Optimize impl Hash for ObligationCauseData by not hashing ObligationCauseCode variant fields #90996
- Very large improvement in instruction counts (up to -58.1% on full builds of diesel)
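To illustrate the idea named in the title of #90996, here is a minimal sketch of a Hash impl that hashes only which enum variant is present, not the data inside it. The types `CauseData` and `CauseCode` and the helper `hash_of` are hypothetical stand-ins, not the compiler's actual definitions, and the real impl may differ in its details.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::mem;

// Hypothetical stand-in for a cause code whose variants can carry large payloads.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq, Eq)]
enum CauseCode {
    Misc,
    MatchArm { arm_spans: Vec<u32> },
    Derived(Box<CauseCode>),
}

// Hypothetical stand-in for the cause data being hashed.
#[derive(Clone, Debug, PartialEq, Eq)]
struct CauseData {
    span: u32,
    body_id: u32,
    code: CauseCode,
}

impl Hash for CauseData {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.span.hash(state);
        self.body_id.hash(state);
        // Hash only the variant tag; the variant's fields are skipped.
        mem::discriminant(&self.code).hash(state);
    }
}

// Illustrative helper: hash a value with the standard library's default hasher.
fn hash_of(data: &CauseData) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}
```

Equal values still hash equally, so the Hash/Eq contract holds. The trade-off is that two values differing only in their variant fields now collide and must be told apart by the equality check, which is presumably acceptable when such collisions are rare compared to the cost of hashing the fields.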
Avoid documenting top-level private imports #91094
- Very large improvement in instruction counts (up to -45.1% on full builds of webrender-wrench)
- This appears to counteract a regression that was introduced in PR #90769, a rollup where the root cause is believed to be PR #88447, "Use computed visibility in rustdoc".
rustdoc: Cleanup DocFragment #91034
- Moderate improvement in instruction counts (up to -0.7% on full builds of serde)
- It has been pointed out that this was actually a nice -0.8% to -1.4% win on max-rss as well.
- (This is solely for rustdoc benchmarking, to be clear.)
libcore: assume the input of next_code_point and next_code_point_reverse is UTF-8-like #89611
- Small improvement in instruction counts (up to -0.8% on full builds of encoding)
Remove DropArena. #90919
- Small improvement in instruction counts (up to -0.3% on incr-patched: println builds of clap-rs)
- Very large regression in instruction counts (up to 6.1% on incr-unchanged builds of inflate)
- This was triaged as noise in the PR comments both before it landed and after it landed.
std: Get the standard library compiling for wasm64 #90382
- Large improvement in instruction counts (up to -4.3% on incr-unchanged builds of clap-rs)
- Moderate regression in instruction counts (up to 1.0% on incr-unchanged builds of issue-88862)
- There were many varied changes in this PR, such as updates to dependencies (compiler_builtins and dlmalloc). We probably should have done a pre-merge rust-timer run on this PR.
- The flagged regressions of magnitude greater than 0.5% are isolated to "issue-88862 check" and "style-servo check". The improvements tended to outweigh the regressions. Almost all significant performance effects are isolated to check builds.
Rollup of 8 pull requests #91019
- Moderate improvement in instruction counts (up to -0.6% on incr-unchanged builds of style-servo)
- Moderate regression in instruction counts (up to 0.8% on full builds of await-call-tree)
- The regression is entirely associated with doc builds, which led the PR author to flag PR #90750 as the root cause.
- It seems to me like the extra work injected by PR #90750 may be unavoidable; but was it expected to be significant?
Implement clone_from for State #90535
- Small improvement in instruction counts (up to -0.2% on full builds of many-assoc-items)
- Very large regression in instruction counts (up to 6.1% on incr-unchanged builds of inflate)
- This was evaluated for its effect on performance prior to merge; that run returned no relevant changes.
- As noted elsewhere, for this report we should probably treat "inflate" as noisy.
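For readers unfamiliar with why implementing clone_from can help, the following is a minimal sketch of the general pattern, using a hypothetical `State` type with a single `bits` buffer. It is not the compiler's actual type. The default `clone_from` is equivalent to `*self = source.clone()`, which allocates a fresh buffer each time, while a hand-written one can overwrite the destination in place.

```rust
// Hypothetical state type that owns a heap-allocated buffer.
#[derive(Debug, PartialEq, Eq)]
struct State {
    bits: Vec<u64>,
}

impl Clone for State {
    fn clone(&self) -> Self {
        State { bits: self.bits.clone() }
    }

    // Overwrite `self` in place; `Vec::clone_from` can reuse the
    // destination's existing allocation instead of making a new one.
    fn clone_from(&mut self, source: &Self) {
        self.bits.clone_from(&source.bits);
    }
}
```

A caller that repeatedly resets one state to another, for example `scratch.clone_from(&entry_state)` inside a loop, would then avoid an allocation per iteration whenever the scratch buffer is already large enough.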
Update stdarch #91052
- Very large improvement in instruction counts (up to -5.8% on incr-unchanged builds of inflate)
- Small regression in instruction counts (up to 0.3% on incr-patched: println builds of html5ever)
- The only benchmark that regressed is "html5ever debug" ("incr-patched: println" and "incr-unchanged"), and only by a relatively small amount. This seems acceptable to me, compared to the effort involved in figuring out how this change could be related to that effect.
Point at source of trait bound obligations in more places #89580
- Large improvement in instruction counts (up to -5.0% on incr-unchanged builds of inflate)
- Small regression in instruction counts (up to 1.1% on incr-unchanged builds of wg-grammar)
- (Again, we can probably ignore the change to inflate.)
- Other than that, there was a broad set of small regressions. Putting aside the ones tagged with ? ("noisy"), there are 19 benchmarks that regressed by 0.10% to 0.42%. This seems like an acceptable cost.
Simplify for loop desugar #90352
- Very large improvement in instruction counts (up to -6.2% on incr-full builds of clap-rs)
- Large regression in instruction counts (up to 2.3% on full builds of regex)
- This was triaged in the PR comments both before it landed and after it landed with the justification "The regressions seem to all be in -opt builds and solely part of the time spent in LLVM, so I'm hoping it's that more optimizations apply now (and worst case some optimizations require more work but don't result in better code)."
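As a rough illustration of what "simplifying the for loop desugar" could mean, the sketch below hand-writes two expansions of `for x in values { total += x; }`. Both shapes are assumptions written from memory for illustration, not compiler output quoted from the PR: the older shape routes each item through a temporary before binding the loop pattern, and the simpler shape binds the pattern directly in the match arm. The function names are hypothetical.

```rust
// Assumed older shape: store the item in a temporary, then bind the pattern.
fn sum_old_shape(values: Vec<i32>) -> i32 {
    let mut total = 0;
    match IntoIterator::into_iter(values) {
        mut iter => loop {
            let next;
            match Iterator::next(&mut iter) {
                Some(val) => next = val,
                None => break,
            };
            let x = next;
            total += x;
        },
    }
    total
}

// Assumed simpler shape: bind the loop pattern directly in the `Some` arm.
fn sum_new_shape(values: Vec<i32>) -> i32 {
    let mut total = 0;
    match IntoIterator::into_iter(values) {
        mut iter => loop {
            match Iterator::next(&mut iter) {
                None => break,
                Some(x) => {
                    total += x;
                }
            }
        },
    }
    total
}
```

Both functions compute the same result; the second simply hands less intermediate structure to later compiler stages, which is compatible with the hypothesis that LLVM now finds more optimizations to apply.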
Manually outline error on incremental_verify_ich #89883
- Small improvement in instruction counts (up to -0.8% on incr-unchanged builds of clap-rs)
- Large regression in instruction counts (up to 1.1% on incr-unchanged builds of coercions)
- The pre-merge rust-timer run did not predict such a significant impact on coercions.
- The driving force for this change was to reduce the critical path in bootstrap time, so the most important thing to look at is the bootstrap timing data. Specifically: while there is a big mix of ups and downs in the percentages column, what matters is the crate that takes the longest to compile (rustc_query_impl, the laggard at over 80 seconds of compilation time).
- This PR brings the compilation time of rustc_query_impl from 87.1 seconds down to 85.6 seconds, a -1.8% improvement.
- That is consistent with the predicted effect of the PR, and justifies the isolated impact on instruction counts.
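"Manually outlining" an error path generally means moving rarely executed reporting code out of a hot generic function into a separate non-generic function, so that it is compiled once and not once per instantiation. The sketch below shows that general pattern only; `verify` and `report_mismatch` are hypothetical names, and the actual change to incremental_verify_ich may be structured differently.

```rust
use std::fmt::Debug;

// Cold, non-generic error path: compiled once, never inlined into callers.
#[cold]
#[inline(never)]
fn report_mismatch(value: &dyn Debug, expected: u64, actual: u64) -> ! {
    panic!(
        "hash mismatch for {:?}: expected {:#x}, got {:#x}",
        value, expected, actual
    );
}

// Hot generic path: each instantiation contains only a comparison and a call.
fn verify<T: Debug>(value: &T, expected: u64, actual: u64) {
    if expected != actual {
        report_mismatch(value, expected, actual);
    }
}
```

Because the formatting machinery lives in one place, each monomorphized copy of `verify` stays small, which is the kind of effect that would plausibly shorten compilation of a crate with many instantiations.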
- #91052 Update stdarch
- #91019 Rollup of 8 pull requests
- #90883 Rollup of 3 pull requests
- #90839 Generate documentation in rustc rustc_index::newtype_index macro
- #90821 MIRI says reverse is UB, so replace it with something LLVM can vectorize
- #90769 Rollup of 5 pull requests
- #90746 Optimize pattern matching
- #90695 Rollup of 4 pull requests
- #90684 Change paths for dist command to match the components they generate
- #90645 Implement diagnostic for String conversion
- #90559 Optimize bidi character detection.
- #90542 Make RawVec private to alloc
- #90535 Implement clone_from for State
- #90489 rustdoc: Go back to loading all external crates unconditionally
- #90485 Don't destructure args tuple in format_args!
- #90462 [master] Fix CVE-2021-42574
- #90443 Merge DocContext.{ty,lt,ct}_substs into one map
- #90422 Rollup of 5 pull requests
- #90382 std: Get the standard library compiling for wasm64
- #90235 Rollup of 6 pull requests
- #90183 Show all Deref implementations recursively
- #90067 Rollup of 10 pull requests
- #89939 Rollup of 10 pull requests
- #89858 Rollup of 6 pull requests
- #89695 Move top part of print_item to Tera templates
- #89608 Rollup of 12 pull requests
- #89580 Point at source of trait bound obligations in more places
- #89551 Stabilize const_raw_ptr_deref for *const T
- #89534 Introduce tcx.get_diagnostic_name
- #89435 Rollup of 6 pull requests