Skip to content

rustdoc: cleanups relating to allocations #141573

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 30, 2025

Conversation

nnethercote
Copy link
Contributor

@nnethercote nnethercote commented May 26, 2025

These commits generally clean up the code a bit and also reduce allocation rates a bit.

r? @camelid

This reverts part of rust-lang#91948, going back to returning a
`UrlPartsBuilder`. It makes the code simpler, and also avoids some
allocations.
Currently it is passed an `fqp` slice which it calls `to_vec` on and
returns. This is a bit odd. It's better to let the call site clone if
necessary. (One call site does, one does not).
To avoids the early return and duplication of `Ok((url_parts, shortty,
fqp))`.
Most of the methods returning `impl Display` have `print` in their name.
This commit renames a few that didn't follow that convention.
It never fails, so it doesn't need to return `Result`. And the
`ItemType` in the result is just a copy of the one passed in via the
`shortty` arg, so it can also be removed.
Instead of a `Vec`, to avoid some allocations.
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output. labels May 26, 2025
@rustbot
Copy link
Collaborator

rustbot commented May 26, 2025

⚠️ Warning ⚠️

  • There are issue links (such as #123) in the commit messages of the following commits.
    Please remove them as they will spam the issue with references to the commit.

@nnethercote
Copy link
Contributor Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 26, 2025
@bors
Copy link
Collaborator

bors commented May 26, 2025

⌛ Trying commit c51f167 with merge 9656ba7...

bors added a commit that referenced this pull request May 26, 2025
rustdoc: cleanups relating to allocations

These commits generally clean up the code a bit and also reduce allocation rates a bit.

r? `@camelid`
@bors
Copy link
Collaborator

bors commented May 26, 2025

☀️ Try build successful - checks-actions
Build commit: 9656ba7 (9656ba7f36cc1c5221657e832f653689644cb6b3)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (9656ba7): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.6% [-2.9%, -0.2%] 11
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.6% [-2.9%, -0.2%] 11

Max RSS (memory usage)

Results (primary -1.2%, secondary -2.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.2% [-1.2%, -1.2%] 1
Improvements ✅
(secondary)
-2.7% [-2.7%, -2.7%] 1
All ❌✅ (primary) -1.2% [-1.2%, -1.2%] 1

Cycles

Results (primary -2.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.8% [-2.8%, -2.8%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -2.8% [-2.8%, -2.8%] 1

Binary size

Results (primary -1.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.1% [-1.1%, -1.1%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.1% [-1.1%, -1.1%] 1

Bootstrap: 776.359s -> 777.253s (0.12%)
Artifact size: 366.28 MiB -> 366.23 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 26, 2025
Copy link
Contributor

@lolbinarycat lolbinarycat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 potential further optimization, 1 nit about readability, and 1 question about rationale.

@@ -1419,7 +1415,7 @@ pub(crate) fn visibility_print_with_space(item: &clean::Item, cx: &Context<'_>)
debug!("path={path:?}");
// modified from `resolved_path()` to work with `DefPathData`
let last_name = path.data.last().unwrap().data.get_opt_name().unwrap();
let anchor = anchor(vis_did, last_name, cx);
let anchor = print_anchor(vis_did, last_name, cx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would be worth it to use fmt::from_fn in visibility_print_with_space to eliminate another allocation? This function does get called fairly often.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does use fmt::from_fn, towards the bottom.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but do you think it would be worth it to move the main function body into the fmt::from_fn call? it should save at least an allocation, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not worth it from a perf perspective: this function doesn't show up as hot in profiles (I've been using profiles to guide all of this work) and I think it will only allocate in the else branch containing s.

But: worth it from a code cleanliness perspective, because it makes the code shorter and nicer. I've added a new commit doing it, thanks for the suggestion.

Comment on lines +656 to +662
let trait_ = impl_
.inner_impl()
.trait_
.as_ref()
.map(|trait_| format!("{:#}", trait_.print(cx)));
ret = Some(AliasSerializableImpl {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably document what the alternate display flag does (disables html escaping of < and >) and why we are using it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a comment explaining the "what", but not the "why" (because I don't know why).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disables html escaping of < and >

That's not really the core of it. The alternate format prints things (in this case, the trait's path) as plain text rather than as rich HTML. So as part of that, it doesn't use HTML escaping. But the main thing is that here we're generating plaintext rather than HTML.

I'm not exactly sure of the reason, though likely we're doing some kind of matching against the path in JS to know which aliased impl to load. Anyway, it's beside the point of this PR.

@@ -682,7 +680,7 @@ impl TypeAliasPart {
));

let part = OrderedJson::array_sorted(
impls.iter().map(OrderedJson::serialize).collect::<Result<Vec<_>, _>>().unwrap(),
impls.map(OrderedJson::serialize).collect::<Result<Vec<_>, _>>().unwrap(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI that this collect is also redundant,
this will work too:

Suggested change
impls.map(OrderedJson::serialize).collect::<Result<Vec<_>, _>>().unwrap(),
impls.map(|impl_| OrderedJson::serialize(impl_).unwrap()),

(though doesn't seem to affect perf on my local tests)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice suggestion! I have applied it.

- `ret` only ever gets at most one entry, so it can be an `Option`
  instead of a `Vec`.
- Which means we can use `filter_map` instead of `flat_map`.
- Move `trait_` next to the `ret` assignment, which can only happen
  once.
- No need for `impls` to be a `Vec`, it can remain an iterator.
- Avoid `Result` when collecting `impls`.
@nnethercote nnethercote force-pushed the rustdoc-alloc-cleanups branch from c51f167 to 8dde6d5 Compare May 26, 2025 22:27
Moving the visibility stuff into the `from_fn` avoids the `Cow` and
makes the code a little shorter and simpler.
@nnethercote
Copy link
Contributor Author

BTW, this is a case where selecting "Show non-relevant results" in the perf results is worthwhile. There are a number of sub-threshold improvements that I think are real, because they are so consistent.

@camelid
Copy link
Member

camelid commented May 28, 2025

Thanks! Looks good at a first glance, but will do a proper review by tomorrow or the next day.

Copy link
Member

@camelid camelid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks! Just one nit to clean up the iterator code.

Comment on lines +656 to +662
let trait_ = impl_
.inner_impl()
.trait_
.as_ref()
.map(|trait_| format!("{:#}", trait_.print(cx)));
ret = Some(AliasSerializableImpl {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disables html escaping of < and >

That's not really the core of it. The alternate format prints things (in this case, the trait's path) as plain text rather than as rich HTML. So as part of that, it doesn't use HTML escaping. But the main thing is that here we're generating plaintext rather than HTML.

I'm not exactly sure of the reason, though likely we're doing some kind of matching against the path in JS to know which aliased impl to load. Anyway, it's beside the point of this PR.

@camelid camelid removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 30, 2025
@camelid camelid added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label May 30, 2025
@nnethercote
Copy link
Contributor Author

@camelid: Thanks for the review. I've added a commit that addresses your comments.

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 30, 2025
@camelid
Copy link
Member

camelid commented May 30, 2025

@bors r+

@bors
Copy link
Collaborator

bors commented May 30, 2025

📌 Commit 68f3216 has been approved by camelid

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 30, 2025
@bors
Copy link
Collaborator

bors commented May 30, 2025

⌛ Testing commit 68f3216 with merge e6152cd...

@bors
Copy link
Collaborator

bors commented May 30, 2025

☀️ Test successful - checks-actions
Approved by: camelid
Pushing e6152cd to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 30, 2025
@bors bors merged commit e6152cd into rust-lang:master May 30, 2025
10 checks passed
@rustbot rustbot added this to the 1.89.0 milestone May 30, 2025
Copy link
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 6de3a73 (parent) -> e6152cd (this PR)

Test differences

Show 33327 test diffs

Stage 1

  • errors::verify_ast_passes_optional_trait_object_48: pass -> [missing] (J0)
  • errors::verify_codegen_ssa_bare_instruction_set_22: pass -> [missing] (J0)
  • errors::verify_codegen_ssa_expected_one_argument_29: pass -> [missing] (J0)
  • errors::verify_codegen_ssa_unable_to_exe_linker_50: pass -> [missing] (J0)
  • errors::verify_incremental_create_dep_graph_37: pass -> [missing] (J0)
  • errors::verify_incremental_undefined_clean_dirty_assertions_7: pass -> [missing] (J0)
  • errors::verify_metadata_failed_create_file_56: pass -> [missing] (J0)
  • errors::verify_parse_modifier_lifetime_145: pass -> [missing] (J0)
  • errors::verify_passes_ignored_attr_with_macro_6: pass -> [missing] (J0)
  • errors::verify_resolve_trait_impl_mismatch_58: pass -> [missing] (J0)
  • parser::tests::multiple_labels_without_message_2: pass -> [missing] (J0)
  • session_diagnostics::verify_attr_parsing_empty_confusables_30: pass -> [missing] (J0)
  • spec::tests::xtensa_esp32s2_none_elf: pass -> [missing] (J0)
  • array::array_try_from: pass -> [missing] (J1)
  • collections::binary_heap::test_in_place_iterator_specialization: pass -> [missing] (J1)
  • collections::btree::map::tests::test_range_inclusive_max_value: pass -> [missing] (J1)
  • collections::vec_deque::tests::test_drain: pass -> [missing] (J1)
  • collections::vec_deque::tests::test_try_reserve: pass -> [missing] (J1)
  • floats::f64::test_float_bits_conv: pass -> [missing] (J1)
  • fmt::num::test_format_int_exp_precision: pass -> [missing] (J1)
  • intrinsics::carrying_mul_add_fallback_i128: pass -> [missing] (J1)
  • io::tests::read_until: pass -> [missing] (J1)
  • num::i128::test_le: pass -> [missing] (J1)
  • num::i32::test_borrowing_sub: pass -> [missing] (J1)
  • num::i64::test_isolate_least_significant_one: pass -> [missing] (J1)
  • num::test_try_i64usize: pass -> [missing] (J1)
  • pattern::test_forward_search_shared_bytes: pass -> [missing] (J1)
  • sync::test_dead: pass -> [missing] (J1)
  • sys_common::wtf8::tests::wtf8_slice_from: pass -> [missing] (J1)
  • test_is_power_of_two_u8: pass -> [missing] (J1)
  • thread::tests::test_spawn_sched: pass -> [missing] (J1)
  • tuple::test_partial_ord: pass -> [missing] (J1)
  • ascii::short::case07_fake_simd_u32: pass -> [missing] (J2)
  • f16::test_powf: pass -> [missing] (J2)
  • path::bench_path_cmp_fast_path_buf_sort: pass -> [missing] (J2)
  • slice::sort_unstable_small_big: pass -> [missing] (J2)
  • sort::tests::stable::deterministic_i32_saw_mixed: pass -> [missing] (J2)
  • sort::tests::stable::observable_is_less_saw_mixed: pass -> [missing] (J2)
  • sort::tests::stable::stability_legacy: pass -> [missing] (J2)
  • sort::tests::unstable::panic_observable_is_less_saw_mixed: pass -> [missing] (J2)
  • string::bench_push_char_one_byte: pass -> [missing] (J2)
  • vec_deque::bench_into_iter_try_fold: pass -> [missing] (J2)
  • sort::tests::stable::violate_ord_retain_orig_set_cell_i32_random_d20: ignore -> [missing] (J3)

Stage 2

  • back::apple::tests::lookup_developer_dir: [missing] -> pass (J0)
  • errors::verify_expand_resolve_relative_path_6: [missing] -> pass (J0)
  • errors::verify_expand_wrong_fragment_kind_26: [missing] -> pass (J0)
  • errors::verify_metadata_symbol_conflicts_current_62: [missing] -> pass (J0)
  • errors::verify_resolve_tool_only_accepts_identifiers_57: [missing] -> pass (J0)
  • errors::verify_session_hexadecimal_float_literal_not_supported_39: [missing] -> pass (J0)
  • html::render::ordered_json::tests::escape_json_string_escaped: [missing] -> pass (J0)
  • lints::verify_lint_builtin_ellipsis_inclusive_range_patterns_34: [missing] -> pass (J0)
  • lints::verify_lint_metavariable_wrong_operator_120: [missing] -> pass (J0)
  • lints::verify_lint_tykind_59: [missing] -> pass (J0)
  • session_diagnostics::verify_attr_parsing_missing_since_3: [missing] -> pass (J0)
  • spec::tests::aarch64_apple_ios_sim: [missing] -> pass (J0)
  • spec::tests::sparc64_unknown_netbsd: [missing] -> pass (J0)
  • tests::test_unstable_options_tracking_hash: [missing] -> pass (J0)
  • atomic::int_and: [missing] -> pass (J1)
  • collections::binary_heap::test_in_place_iterator_specialization: [missing] -> pass (J1)
  • collections::btree::map::tests::test_values_mut_mutation: [missing] -> pass (J1)
  • floats::f16::test_min_nan: [missing] -> pass (J1)
  • floats::f32::test_infinity: [missing] -> pass (J1)
  • floats::f64::test_clamp_min_is_nan: [missing] -> pass (J1)
  • fmt::builders::debug_struct::test_nested: [missing] -> pass (J1)
  • io::buffered::tests::line_vectored_partial_and_errors: [missing] -> pass (J1)
  • mpmc::test_recv_into_iter_owned: [missing] -> pass (J1)
  • mpsc::drop_full_shared: [missing] -> pass (J1)
  • num::i128::test_unbounded_shl: [missing] -> pass (J1)
  • num::i16::test_saturating_abs: [missing] -> pass (J1)
  • num::i64::test_bitwise_operators: [missing] -> pass (J1)
  • num::int_sqrt::i128::isqrt: [missing] -> pass (J1)
  • num::test_try_usizei128: [missing] -> pass (J1)
  • rc::into_from_weak_raw: [missing] -> pass (J1)
  • rc::test_weak_count: [missing] -> pass (J1)
  • simd::testing: [missing] -> pass (J1)
  • slice::test_move_iterator: [missing] -> pass (J1)
  • sort::tests::stable::observable_is_less_random_z1: [missing] -> pass (J1)
  • str::slice_index::overflow::rangeinclusive::index_fail: [missing] -> pass (J1)
  • str::test_join_for_different_types: [missing] -> pass (J1)
  • str::test_trim_matches: [missing] -> pass (J1)
  • test_next_power_of_two_u64: [missing] -> pass (J1)
  • test_pop: [missing] -> pass (J1)
  • vec_deque::test_hash: [missing] -> pass (J1)
  • vec_deque::test_rotate_right_parts: [missing] -> pass (J1)
  • fmt::write_str_macro_debug_ascii: [missing] -> pass (J2)
  • fs::tests::recursive_mkdir_failure: [missing] -> pass (J2)
  • fs::tests::sync_doesnt_kill_anything: [missing] -> pass (J2)
  • fs::tests::test_eq_direntry_metadata: [missing] -> pass (J2)
  • iter::bench_cycle_take_ref_sum: [missing] -> pass (J2)
  • num::flt2dec::strategy::dragon::bench_big_exact_12: [missing] -> pass (J2)
  • num::flt2dec::strategy::dragon::bench_small_exact_12: [missing] -> pass (J2)
  • slice::rotate_huge_by9199_bytes: [missing] -> pass (J2)
  • sort::tests::unstable::correct_f128_saw_mixed: [missing] -> pass (J2)
  • str::split_ad_str::short_ascii: [missing] -> pass (J2)
  • sys::fd::unix::tests::limit_vector_count: [missing] -> pass (J2)
  • term::terminfo::parm::tests::test_multiple_int_constants: [missing] -> pass (J2)
  • term::terminfo::searcher::tests::test_get_dbpath_for_term: [missing] -> ignore (buildbots don't have ncurses installed and I can't mock everything I need) (J2)
  • vec::bench_in_place_xu32_0100_i0: [missing] -> pass (J2)
  • vec_deque::bench_from_array_1000: [missing] -> pass (J2)
  • vec_deque::test_try_reserve: [missing] -> ignore (J3)

(and 16606 additional test diffs)

Additionally, 16621 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard e6152cdf5b31bd844a4cc1049433859d54863602 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. x86_64-apple-2: 4358.1s -> 2000.6s (-54.1%)
  2. x86_64-apple-1: 9944.6s -> 5277.2s (-46.9%)
  3. test-various: 4282.5s -> 2429.4s (-43.3%)
  4. x86_64-gnu-aux: 5906.4s -> 3933.8s (-33.4%)
  5. x86_64-gnu-nopt: 5679.1s -> 7055.8s (24.2%)
  6. aarch64-apple: 5299.0s -> 6484.2s (22.4%)
  7. dist-apple-various: 7770.7s -> 6055.8s (-22.1%)
  8. dist-aarch64-linux: 7778.6s -> 6411.0s (-17.6%)
  9. dist-aarch64-apple: 5886.4s -> 6492.6s (10.3%)
  10. mingw-check: 1250.8s -> 1350.6s (8.0%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (e6152cd): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.4% [-0.6%, -0.2%] 8
Improvements ✅
(secondary)
-0.6% [-0.6%, -0.6%] 1
All ❌✅ (primary) -0.4% [-0.6%, -0.2%] 8

Max RSS (memory usage)

Results (primary -2.6%, secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.6% [0.4%, 0.9%] 6
Improvements ✅
(primary)
-2.6% [-2.6%, -2.6%] 1
Improvements ✅
(secondary)
-1.1% [-2.1%, -0.4%] 3
All ❌✅ (primary) -2.6% [-2.6%, -2.6%] 1

Cycles

Results (secondary 0.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
1.0% [0.5%, 3.0%] 8
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.7% [-0.7%, -0.7%] 1
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: missing data
Artifact size: 370.26 MiB -> 370.19 MiB (-0.02%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants