Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leverage fmt::Arguments::as_str in the write! macro #100700

Closed
wants to merge 1 commit into from

Conversation

Kobzol
Copy link
Contributor

@Kobzol Kobzol commented Aug 17, 2022

As noted by @CAD97 here, we could use fmt::Argument::as_str() to slightly optimize the performance of write!(wr, "constant") by using as_str(). It's a small band-aid to the problem of lackluster write! performance, but it could be worth it.

I did some experiments with as_str in io and fmt's write_fmt implementation, and it seems that for this specific use-case it's enough to include it in fmt (benchmarks are located in library/core/benches/fmt.rs):

Original code
test fmt::write_str_macro1                                         ... bench:      13,047 ns/iter (+/- 1,114)
test fmt::write_str_macro2                                         ... bench:      14,810 ns/iter (+/- 602)
test fmt::write_str_macro_debug                                    ... bench:     338,593 ns/iter (+/- 26,282)
test fmt::write_str_macro_debug_ascii                              ... bench:     134,188 ns/iter (+/- 12,341)
test fmt::write_str_ref                                            ... bench:       5,739 ns/iter (+/- 502)
test fmt::write_str_value                                          ... bench:       5,684 ns/iter (+/- 362)
as_str in io
test fmt::write_str_macro1                                         ... bench:      12,602 ns/iter (+/- 497)
test fmt::write_str_macro2                                         ... bench:      14,743 ns/iter (+/- 315)
test fmt::write_str_macro_debug                                    ... bench:     333,080 ns/iter (+/- 13,721)
test fmt::write_str_macro_debug_ascii                              ... bench:     132,502 ns/iter (+/- 5,207)
test fmt::write_str_ref                                            ... bench:       5,276 ns/iter (+/- 169)
test fmt::write_str_value                                          ... bench:       5,120 ns/iter (+/- 113)
as_str in fmt
test fmt::write_str_macro1                                         ... bench:       6,898 ns/iter (+/- 298)
test fmt::write_str_macro2                                         ... bench:      15,166 ns/iter (+/- 284)
test fmt::write_str_macro_debug                                    ... bench:     366,461 ns/iter (+/- 9,659)
test fmt::write_str_macro_debug_ascii                              ... bench:     130,946 ns/iter (+/- 6,068)
test fmt::write_str_ref                                            ... bench:       5,262 ns/iter (+/- 204)
test fmt::write_str_value                                          ... bench:       5,267 ns/iter (+/- 161)
as_str in io + fmt
test fmt::write_str_macro1                                         ... bench:       6,644 ns/iter (+/- 151)
test fmt::write_str_macro2                                         ... bench:      14,595 ns/iter (+/- 309)
test fmt::write_str_macro_debug                                    ... bench:     371,741 ns/iter (+/- 11,651)
test fmt::write_str_macro_debug_ascii                              ... bench:     130,821 ns/iter (+/- 3,440)
test fmt::write_str_ref                                            ... bench:       5,062 ns/iter (+/- 209)
test fmt::write_str_value                                          ... bench:       5,058 ns/iter (+/- 151)

I included more benchmark results here, but the interesting one is fmt::write_str_macro1, which becomes ~twice as fast with this change. I benchmarked the code with disabled hyper-threading, ASLR and turbo-boost and enabled performance mode, but it's still quite noisy. Results are from (laptop) Zen 3.

Related issue: #10761

r? @m-ou-se

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Aug 17, 2022
@rustbot
Copy link
Collaborator

rustbot commented Aug 17, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 17, 2022
@CAD97
Copy link
Contributor

CAD97 commented Aug 17, 2022

for this specific use-case it's enough to include it in fmt

Likely this is due to fmt::Write::write getting inlined into io::Write::write... though it may also be because we're just measuring fmt::Write and don't have a benchmark for going through io::Write. Either way, it's a small bit of code to include, so it shouldn't hurt to include it.

@m-ou-se
Copy link
Member

m-ou-se commented Sep 28, 2022

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 28, 2022
@bors
Copy link
Contributor

bors commented Sep 28, 2022

⌛ Trying commit 23904e5 with merge 175bd086b773bcf4505022b8ae1192cb831763de...

@bors
Copy link
Contributor

bors commented Sep 28, 2022

☀️ Try build successful - checks-actions
Build commit: 175bd086b773bcf4505022b8ae1192cb831763de (175bd086b773bcf4505022b8ae1192cb831763de)

@rust-timer
Copy link
Collaborator

Queued 175bd086b773bcf4505022b8ae1192cb831763de with parent 09ae784, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (175bd086b773bcf4505022b8ae1192cb831763de): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
0.7% [0.3%, 1.4%] 5
Regressions ❌
(secondary)
1.1% [0.4%, 1.6%] 9
Improvements ✅
(primary)
-0.6% [-0.6%, -0.6%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.5% [-0.6%, 1.4%] 6

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
7.2% [2.1%, 12.4%] 2
Regressions ❌
(secondary)
2.7% [2.6%, 2.8%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 7.2% [2.1%, 12.4%] 2

Footnotes

  1. the arithmetic mean of the percent change 2

  2. number of relevant changes 2

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 28, 2022
@m-ou-se m-ou-se added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 29, 2022
@m-ou-se
Copy link
Member

m-ou-se commented Sep 29, 2022

Given these slightly-worse-than-expected results, do you still think it's worth merging this?

@Kobzol
Copy link
Contributor Author

Kobzol commented Sep 29, 2022

The doc regressions look real, which tells that this is not worth it. Would be nice to confirm this with a runtime benchmark, but they are still WIP. I'll close this for now.

@Kobzol Kobzol closed this Sep 29, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 15, 2024
perf: improve write_fmt to handle simple strings

Per `@dtolnay` suggestion in serde-rs/serde#2697 (comment) - attempt to speed up performance in the cases of a simple string format without arguments:

```rust
write!(f, "text")  ->  f.write_str("text")
```

```diff
+ #[inline]
  pub fn write_fmt(&mut self, f: fmt::Arguments) -> fmt::Result {
+     if let Some(s) = f.as_str() {
+         self.buf.write_str(s)
+     } else {
          write(self.buf, f)
+     }
  }
```

* Hopefully it will improve the simple case for the rust-lang#99012
* Another related (original?) issues rust-lang#10761
* Previous similar attempt to fix it by by `@Kobzol` rust-lang#100700

CC: `@m-ou-se` as probably the biggest expert in everything `format!`
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 20, 2024
perf: improve write_fmt to handle simple strings

In case format string has no arguments, simplify its implementation with a direct call to `output.write_str(value)`. This builds on `@dtolnay` original [suggestion](serde-rs/serde#2697 (comment)). This does not change any expectations because the original `fn write()` implementation calls `write_str` for parts of the format string.

```rust
write!(f, "text")  ->  f.write_str("text")
```

```diff
 /// [`write!`]: crate::write!
+#[inline]
 #[stable(feature = "rust1", since = "1.0.0")]
 pub fn write(output: &mut dyn Write, args: Arguments<'_>) -> Result {
+    if let Some(s) = args.as_str() { output.write_str(s) } else { write_internal(output, args) }
+}
+
+/// Actual implementation of the [`write`], but without the simple string optimization.
+fn write_internal(output: &mut dyn Write, args: Arguments<'_>) -> Result {
     let mut formatter = Formatter::new(output);
     let mut idx = 0;
```

* Hopefully it will improve the simple case for the rust-lang#99012
* Another related (original?) issues rust-lang#10761
* Previous similar attempt to fix it by by `@Kobzol` rust-lang#100700

CC: `@m-ou-se` as probably the biggest expert in everything `format!`
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 21, 2024
perf: improve write_fmt to handle simple strings

In case format string has no arguments, simplify its implementation with a direct call to `output.write_str(value)`. This builds on `@dtolnay` original [suggestion](serde-rs/serde#2697 (comment)). This does not change any expectations because the original `fn write()` implementation calls `write_str` for parts of the format string.

```rust
write!(f, "text")  ->  f.write_str("text")
```

```diff
 /// [`write!`]: crate::write!
+#[inline]
 #[stable(feature = "rust1", since = "1.0.0")]
 pub fn write(output: &mut dyn Write, args: Arguments<'_>) -> Result {
+    if let Some(s) = args.as_str() { output.write_str(s) } else { write_internal(output, args) }
+}
+
+/// Actual implementation of the [`write`], but without the simple string optimization.
+fn write_internal(output: &mut dyn Write, args: Arguments<'_>) -> Result {
     let mut formatter = Formatter::new(output);
     let mut idx = 0;
```

* Hopefully it will improve the simple case for the rust-lang#99012
* Another related (original?) issues rust-lang#10761
* Previous similar attempt to fix it by by `@Kobzol` rust-lang#100700

CC: `@m-ou-se` as probably the biggest expert in everything `format!`
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 5, 2024
perf: improve write_fmt to handle simple strings

In case format string has no arguments, simplify its implementation with a direct call to `output.write_str(value)`. This builds on `@dtolnay` original [suggestion](serde-rs/serde#2697 (comment)). This does not change any expectations because the original `fn write()` implementation calls `write_str` for parts of the format string.

```rust
write!(f, "text")  ->  f.write_str("text")
```

```diff
 /// [`write!`]: crate::write!
+#[inline]
 #[stable(feature = "rust1", since = "1.0.0")]
 pub fn write(output: &mut dyn Write, args: Arguments<'_>) -> Result {
+    if let Some(s) = args.as_str() { output.write_str(s) } else { write_internal(output, args) }
+}
+
+/// Actual implementation of the [`write`], but without the simple string optimization.
+fn write_internal(output: &mut dyn Write, args: Arguments<'_>) -> Result {
     let mut formatter = Formatter::new(output);
     let mut idx = 0;
```

* Hopefully it will improve the simple case for the rust-lang#99012
* Another related (original?) issues rust-lang#10761
* Previous similar attempt to fix it by by `@Kobzol` rust-lang#100700

CC: `@m-ou-se` as probably the biggest expert in everything `format!`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants