Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid zero-length memcpy in formatting #85391

Merged
merged 2 commits into from
May 20, 2021

Conversation

Mark-Simulacrum
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum commented May 17, 2021

This has two separate and somewhat orthogonal commits. The first change adjusts the ToString general impl for all types that implement Display; it no longer uses the full format machinery, rather directly falling onto a std::fmt::Display::fmt call. The second change directly adjusts the general core::fmt::write function which handles the production of format_args! to avoid zero-length push_str calls.

Both changes target the fact that push_str will still call memmove internally (or a similar function), as it doesn't know the length of the passed string. For zero-length strings in particular, this is quite expensive, and even for very short (several bytes long) strings, this is also expensive. Future work in this area may wish to have us fallback to write_char or similar, which may be cheaper on the (typically) short strings between the interpolated pieces in format_args!.

@rust-highfive
Copy link
Collaborator

r? @scottmcm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 17, 2021
@Mark-Simulacrum
Copy link
Member Author

@bors try @rust-timer queue

It may be advantageous to consider guarding the write_str calls in the formatting machinery here with a nonzero length check too, or trying to rework formatting machinery to avoid empty pieces entirely, but those are less clear wins I suspect.

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 17, 2021
@bors
Copy link
Contributor

bors commented May 17, 2021

⌛ Trying commit 4bcad44febd1d5bed6d720c6fa314865bbf7e251 with merge 021776915c55787521fe6b579f88425c7d9bff19...

@bors
Copy link
Contributor

bors commented May 17, 2021

☀️ Try build successful - checks-actions
Build commit: 021776915c55787521fe6b579f88425c7d9bff19 (021776915c55787521fe6b579f88425c7d9bff19)

@rust-timer
Copy link
Collaborator

Queued 021776915c55787521fe6b579f88425c7d9bff19 with parent fe72845, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (021776915c55787521fe6b579f88425c7d9bff19): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 17, 2021
@joshtriplett
Copy link
Member

Nice performance win!

It seems unfortunate to have to have a public function just because it lives in core and needs to be called from alloc. Is there any way we could avoid exposing that function?

This avoids a zero-length write_str call, which boils down to a zero-length
memmove and ultimately costs quite a few instructions on some workloads.

This is approximately a 0.33% instruction count win on diesel-check.
@Mark-Simulacrum
Copy link
Member Author

I don't think exposing the function is necessarily avoidable, but it is worth noting there have been several requests to create/store Formatters; we may wish to expose that functionality publicly eventually - #74870, #46591. For now I've moved it to a method and marked it as unstable and doc(hidden) -- so we're barely exposing it.

@bors try @rust-timer queue -- added an is_empty guard on the push_str calls, we'll see if that helps or hurts here. Maybe that would be sufficient and we don't need the extra function at all, though I'd guess to_string() is a bit more efficient without going through the format machinery at all.

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 17, 2021
@bors
Copy link
Contributor

bors commented May 17, 2021

⌛ Trying commit c7c9336 with merge 1dcdf9bc01bf15dfd254ba6643d4747b20b4cb16...

@Mark-Simulacrum Mark-Simulacrum changed the title [WIP] Optimize default ToString impl Optimize default ToString impl May 17, 2021
@Mark-Simulacrum Mark-Simulacrum changed the title Optimize default ToString impl Avoid zero-length memcpy in formatting May 17, 2021
@bors
Copy link
Contributor

bors commented May 17, 2021

☀️ Try build successful - checks-actions
Build commit: 1dcdf9bc01bf15dfd254ba6643d4747b20b4cb16 (1dcdf9bc01bf15dfd254ba6643d4747b20b4cb16)

@rust-timer
Copy link
Collaborator

Queued 1dcdf9bc01bf15dfd254ba6643d4747b20b4cb16 with parent 44ec846, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (1dcdf9bc01bf15dfd254ba6643d4747b20b4cb16): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 17, 2021
@Mark-Simulacrum
Copy link
Member Author

Alright, that seemed to just make this more of an improvement. I'll try an intermediate version (i.e., just the second commit) later too, and then we can decide which pieces to land.

@Mark-Simulacrum
Copy link
Member Author

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 17, 2021
@bors
Copy link
Contributor

bors commented May 17, 2021

⌛ Trying commit 42f56563152f3ce79b0dffe7b45fb531d6bce19e with merge 3caeb07804c31d2ad5de523df7e6348f65c4cf07...

@bors
Copy link
Contributor

bors commented May 17, 2021

☀️ Try build successful - checks-actions
Build commit: 3caeb07804c31d2ad5de523df7e6348f65c4cf07 (3caeb07804c31d2ad5de523df7e6348f65c4cf07)

@rust-timer
Copy link
Collaborator

Queued 3caeb07804c31d2ad5de523df7e6348f65c4cf07 with parent 9964284, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (3caeb07804c31d2ad5de523df7e6348f65c4cf07): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 17, 2021
@Mark-Simulacrum
Copy link
Member Author

Mark-Simulacrum commented May 17, 2021

Seems like both optimizations are complementary, so I think we should do both. Combined perf results are here, the first commit is evaluated individually here and the second one here.

Edit: Also updated the summary.

@scottmcm
Copy link
Member

This change seems reasonable, given the perf results, so
@bors r+

Does make me wonder what it is about the *-doc builds that they're hitting this so consistently, though...

@bors
Copy link
Contributor

bors commented May 18, 2021

📌 Commit c7c9336 has been approved by scottmcm

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 18, 2021
@Mark-Simulacrum
Copy link
Member Author

-doc builds are just generating documentation which is naturely more dominated by string formatting; this is likely to improve the performance of any string formatting which has a zero-length string in between or before the first interpolated (e.g., {}) piece.

@bors
Copy link
Contributor

bors commented May 19, 2021

⌛ Testing commit c7c9336 with merge d8f7764a4ffcb67338b1152661f9f318c3e16b35...

@bors
Copy link
Contributor

bors commented May 19, 2021

💔 Test failed - checks-actions

@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 19, 2021
@scottmcm
Copy link
Member

MacOS builder lost networking.

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 19, 2021
@bors
Copy link
Contributor

bors commented May 20, 2021

⌛ Testing commit c7c9336 with merge a426fc3...

@bors
Copy link
Contributor

bors commented May 20, 2021

☀️ Test successful - checks-actions
Approved by: scottmcm
Pushing a426fc3 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 20, 2021
@bors bors merged commit a426fc3 into rust-lang:master May 20, 2021
@rustbot rustbot added this to the 1.54.0 milestone May 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants