Implementation of fmt::FormattingOptions #118159

Open
wants to merge 18 commits into
base: master

Conversation

EliasHolzmann
Contributor

@EliasHolzmann EliasHolzmann commented Nov 22, 2023

Tracking issue: #118117

Public API:

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct FormattingOptions {}
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum Sign {
    Plus, 
    Minus
}
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum DebugAsHex {
    Lower,
    Upper
}

impl FormattingOptions {
    pub fn new() -> Self;
    pub fn sign(&mut self, sign: Option<Sign>) -> &mut Self;
    pub fn sign_aware_zero_pad(&mut self, sign_aware_zero_pad: bool) -> &mut Self;
    pub fn alternate(&mut self, alternate: bool) -> &mut Self;
    pub fn fill(&mut self, fill: char) -> &mut Self;
    pub fn align(&mut self, alignment: Option<Alignment>) -> &mut Self;
    pub fn width(&mut self, width: Option<usize>) -> &mut Self;
    pub fn precision(&mut self, precision: Option<usize>) -> &mut Self;
    pub fn debug_as_hex(&mut self, debug_as_hex: Option<DebugAsHex>) -> &mut Self;

    pub fn get_sign(&self) -> Option<Sign>;
    pub fn get_sign_aware_zero_pad(&self) -> bool;
    pub fn get_alternate(&self) -> bool;
    pub fn get_fill(&self) -> char;
    pub fn get_align(&self) -> Option<Alignment>;
    pub fn get_width(&self) -> Option<usize>;
    pub fn get_precision(&self) -> Option<usize>;
    pub fn get_debug_as_hex(&self) -> Option<DebugAsHex>;

    pub fn create_formatter<'a>(self, write: &'a mut (dyn Write + 'a)) -> Formatter<'a>;
}

impl<'a> Formatter<'a> {
    pub fn new(write: &'a mut (dyn Write + 'a), options: FormattingOptions) -> Self;
    pub fn with_options<'b>(&'b mut self, options: FormattingOptions) -> Formatter<'b>;
    pub fn sign(&self) -> Option<Sign>;

    pub fn options(&self) -> FormattingOptions;
}
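
For orientation, most of the getters listed above correspond to accessors that a `Formatter` already exposes on stable Rust. The following is only a small sketch of that existing behaviour, not part of the PR; `Probe`, `describe_default` and `describe_custom` are hypothetical names introduced for illustration.

```rust
use std::fmt;

/// Hypothetical helper: its `Display` impl reports the options it was formatted with.
pub struct Probe;

impl fmt::Display for Probe {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Read the options through the stable accessors on `Formatter`.
        let fill = f.fill();
        let width = f.width();
        let precision = f.precision();
        let zero_pad = f.sign_aware_zero_pad();
        let alternate = f.alternate();
        let plus = f.sign_plus();
        write!(
            f,
            "fill={:?} width={:?} precision={:?} zero_pad={} alternate={} plus={}",
            fill, width, precision, zero_pad, alternate, plus
        )
    }
}

/// Formats `Probe` with no format spec, so every option has its default value.
pub fn describe_default() -> String {
    format!("{}", Probe)
}

/// Formats `Probe` with fill `*`, right alignment, `+`, `#`, width 8 and precision 3.
pub fn describe_custom() -> String {
    format!("{:*>+#8.3}", Probe)
}
```

With no format spec, the probe reports a fill of `' '` and `None` for width and precision, which is the default that `fill`/`get_fill` is meant to mirror.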

Relevant changes from the public API in the tracking issue (I'm leaving out some stuff I consider obvious mistakes, like missing #[derive(..)]s and pub specifiers):

  • enum DebugAsHex/FormattingOptions::debug_as_hex/FormattingOptions::get_debug_as_hex: To support {:x?} as well as {:X?}. I had completely missed these options in the ACP. I'm open for any and all bikeshedding, not married to the name.
  • fill/get_fill now takes/returns char instead of Option<char>. This simply mirrors what Formatter::fill returns (with default being ' ').
  • Changed zero_pad/get_zero_pad to sign_aware_zero_pad/get_sign_aware_zero_pad. This also mirrors Formatter::sign_aware_zero_pad. While I'm not a fan of this quite verbose name, I do believe that having the interface of Formatter and FormattingOptions be compatible is more important.
  • For the same reason, renamed alignment/get_alignment to align/get_align.
  • Deviating from my initial idea, Formatter::with_options returns a Formatter which has the lifetime of the self reference as its generic lifetime parameter (in the original API spec, the generic lifetime of the returned Formatter was the generic lifetime used by self instead). Otherwise, one could construct two Formatters that both mutably borrow the same underlying buffer, which would be unsound. This solution still has performance benefits over simply using Formatter::new, so I believe it is worthwhile to keep this method.
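
The lifetime choice for `with_options` described in the last bullet can be illustrated with a reduced model. This is a sketch under assumptions, not the code from the PR: `MiniOptions` and `MiniFormatter` are hypothetical stand-ins, and the buffer is a plain `String` rather than a `dyn Write`.

```rust
use std::fmt::Write;

/// Hypothetical stand-in for `FormattingOptions`, reduced to a single field.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct MiniOptions {
    pub width: Option<usize>,
}

/// Hypothetical stand-in for `Formatter<'a>`: it mutably borrows its output buffer.
pub struct MiniFormatter<'a> {
    buf: &'a mut String,
    options: MiniOptions,
}

impl<'a> MiniFormatter<'a> {
    pub fn new(buf: &'a mut String, options: MiniOptions) -> Self {
        MiniFormatter { buf, options }
    }

    /// The result borrows `self` for `'b`, so `self` cannot be used while it is alive.
    pub fn with_options<'b>(&'b mut self, options: MiniOptions) -> MiniFormatter<'b> {
        // Reborrow the buffer for the shorter lifetime `'b`.
        MiniFormatter { buf: &mut *self.buf, options }
    }

    /// Writes `s` right-aligned in the configured width (no padding if unset).
    pub fn write_padded(&mut self, s: &str) {
        let width = self.options.width.unwrap_or(0);
        // Writing to a `String` does not fail.
        write!(self.buf, "{:>width$}", s, width = width).unwrap();
    }
}

pub fn demo() -> String {
    let mut out = String::new();
    let mut parent = MiniFormatter::new(&mut out, MiniOptions { width: None });
    {
        let mut child = parent.with_options(MiniOptions { width: Some(5) });
        child.write_padded("ab");
        // parent.write_padded("x"); // rejected here: `parent` is still borrowed by `child`
    }
    // Once `child` is gone, `parent` is usable again.
    parent.write_padded("|cd");
    out
}
```

If the returned value carried `'a` instead of `'b`, nothing would tie it to the borrow of `self`, and both values could write to the same buffer at once. That is the unsoundness the bullet refers to.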

@rustbot
Collaborator

rustbot commented Nov 22, 2023

r? @thomcc

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 22, 2023
@rust-log-analyzer

This comment has been minimized.

@EliasHolzmann
Contributor Author

Looks like this broke some compiler tests. Will fix this evening, converting to draft PR for now.

@EliasHolzmann EliasHolzmann marked this pull request as draft November 22, 2023 10:36
@rust-log-analyzer

This comment has been minimized.

@EliasHolzmann
Contributor Author

CI is green -> Converting back to "normal" (non-draft) PR.

@EliasHolzmann EliasHolzmann marked this pull request as ready for review November 22, 2023 23:27
@thomcc
Member

thomcc commented Nov 24, 2023

I already have a pretty big review backlog and haven't been following this change (so I'd have to read the ACP to figure out how it's supposed to behave), so I'm going to reroll for now, sorry.

r? libs

@rustbot rustbot assigned m-ou-se and unassigned thomcc Nov 24, 2023
@rust-log-analyzer

This comment has been minimized.

@EliasHolzmann
Contributor Author

Fixed two more issues I've stumbled onto while playing around with the PR changes (see commit descriptions for details). I'll fix the PR description to represent the current API.

@bors
Contributor

bors commented Dec 30, 2023

☔ The latest upstream changes (presumably #116012) made this pull request unmergeable. Please resolve the merge conflicts.

@ThePuzzlemaker

This comment was marked as resolved.

@m-ou-se
Member

m-ou-se commented Feb 15, 2024

enum DebugAsHex/FormattingOptions::debug_as_hex/FormattingOptions::get_debug_as_hex: To support {:x?} as well as {:X?}. I had completely missed these options in the ACP. I'm open for any and all bikeshedding, not married to the name.

We haven't decided yet what to do with the "debug as hex" flags. It's fine to have them as unstable, but we shouldn't stabilize an interface for those flags without a separate discussion. I've added it as an unresolved question to the tracking issue.

Comment on lines -247 to +513
flags: u32,
fill: char,
align: rt::Alignment,
width: Option<usize>,
precision: Option<usize>,
options: FormattingOptions,
Member

This makes a Formatter object bigger than before, because the flags field is now stored in the new FormattingOptions as separate fields.

We should check that this doesn't have any negative impact on performance.

Contributor Author

This makes a Formatter object bigger than before

Are you sure? Before, flags was a u32, so 4 bytes. With each flag in its own field, there are two bools and two small enums wrapped into Option – both should be one byte per field, so 4 bytes in total as well. Unless I'm not seeing something here, both the original code and the new implementation have the same size.

However, the benchmark shows a performance regression, and when I realized that Formatter didn't get bigger and therefore, flags may not be the culprit, I was already halfway through refactoring FormattingOptions to use a bitmask for flags (like it did before). So, either way, let's test that, it might help even if a difference in size is not the cause of the performance regression. Could you please rerun the benchmark?
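
The size argument above can be checked with `std::mem::size_of`. This is a minimal sketch: `SeparateFlags` is a hypothetical struct that contains only the four flag fields under discussion, and Rust does not formally guarantee the default struct layout, so the result reflects what current compilers are expected to produce.

```rust
use std::mem::size_of;

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum Sign {
    Plus,
    Minus,
}

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum DebugAsHex {
    Lower,
    Upper,
}

/// Hypothetical struct holding only the four flags as separate fields.
pub struct SeparateFlags {
    pub sign: Option<Sign>,                // two-variant enum in an Option: expected 1 byte
    pub sign_aware_zero_pad: bool,         // 1 byte
    pub alternate: bool,                   // 1 byte
    pub debug_as_hex: Option<DebugAsHex>,  // expected 1 byte
}

/// Returns (size of the old `u32` flags field, size of the separate-field variant).
pub fn sizes() -> (usize, usize) {
    (size_of::<u32>(), size_of::<SeparateFlags>())
}
```

The `Option` wrappers are expected to cost no extra byte because a two-variant enum leaves unused bit patterns that can encode `None`.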

Member

Are you sure?

No :)

Before, flags was a u32, so 4 bytes. With each flag in its own field, there are two bools and two small enums wrapped into Option – both should be one byte per field, so 4 bytes in total as well.

Good point!

I think that in the near future we might want to put more things in the flags field though. At least the alignment can easily fit in there, and the Some/None flag of the width and precision could fit in there as well:

pub struct FormattingOptions {
    flags: u32, // sign, zero pad, alt, debug-as-hex, width flag, precision flag, alignment
    fill: char,
    width: usize, // only used if width flag is set.
    precision: usize, // only used if precision flag is set.
}

But maybe that's overkill.
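
To make the idea concrete, here is a rough sketch of how setters and getters could sit on top of such a layout. The bit assignments and the name `PackedOptions` are assumptions made for this example only; they are not taken from the PR.

```rust
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum Sign {
    Plus,
    Minus,
}

// Hypothetical bit assignments, chosen only for this sketch.
const SIGN_PLUS: u32 = 1 << 0;
const SIGN_MINUS: u32 = 1 << 1;
const WIDTH_SET: u32 = 1 << 2;
const PRECISION_SET: u32 = 1 << 3;

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct PackedOptions {
    flags: u32,
    fill: char,
    width: usize,     // meaningful only if WIDTH_SET is set
    precision: usize, // meaningful only if PRECISION_SET is set
}

impl PackedOptions {
    pub fn new() -> Self {
        PackedOptions { flags: 0, fill: ' ', width: 0, precision: 0 }
    }

    pub fn sign(&mut self, sign: Option<Sign>) -> &mut Self {
        // Clear both sign bits, then set at most one of them.
        self.flags &= !(SIGN_PLUS | SIGN_MINUS);
        self.flags |= match sign {
            None => 0,
            Some(Sign::Plus) => SIGN_PLUS,
            Some(Sign::Minus) => SIGN_MINUS,
        };
        self
    }

    pub fn width(&mut self, width: Option<usize>) -> &mut Self {
        match width {
            Some(w) => {
                self.flags |= WIDTH_SET;
                self.width = w;
            }
            None => {
                self.flags &= !WIDTH_SET;
                self.width = 0;
            }
        }
        self
    }

    pub fn precision(&mut self, precision: Option<usize>) -> &mut Self {
        match precision {
            Some(p) => {
                self.flags |= PRECISION_SET;
                self.precision = p;
            }
            None => {
                self.flags &= !PRECISION_SET;
                self.precision = 0;
            }
        }
        self
    }

    pub fn get_sign(&self) -> Option<Sign> {
        if self.flags & SIGN_PLUS != 0 {
            Some(Sign::Plus)
        } else if self.flags & SIGN_MINUS != 0 {
            Some(Sign::Minus)
        } else {
            None
        }
    }

    pub fn get_width(&self) -> Option<usize> {
        if self.flags & WIDTH_SET != 0 { Some(self.width) } else { None }
    }

    pub fn get_precision(&self) -> Option<usize> {
        if self.flags & PRECISION_SET != 0 { Some(self.precision) } else { None }
    }
}
```

The public interface still speaks in `Option`s. Only the storage changes, so a caller would not notice the difference.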

Member

pub struct FormattingOptions {
    flags: u32, // sign, zero pad, alt, debug-as-hex, width flag, precision flag, alignment
    fill: char,
    width: usize, // only used if width flag is set.
    precision: usize, // only used if precision flag is set.
}

well, fill has lots of spare bits, you could probably merge it with flags, where flags is just the upper bits and fill is the lower 21 bits. though this would only be beneficial on <= 32-bit systems unless width and precision were changed to be u32 or the struct was #[repr(packed(4))].
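
A rough sketch of that packing, assuming the fill occupies the low 21 bits and the flags the remaining 11 upper bits. `FillAndFlags` and its methods are hypothetical names for this example. A `char` is at most `0x10FFFF`, which fits in 21 bits.

```rust
const FILL_BITS: u32 = 21;
const FILL_MASK: u32 = (1 << FILL_BITS) - 1; // 0x1F_FFFF

/// Hypothetical packed representation: flags in the upper 11 bits, fill in the lower 21.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct FillAndFlags(u32);

impl FillAndFlags {
    pub fn new(fill: char, flags: u32) -> Self {
        // Only 32 - 21 = 11 bits are left over for flags.
        debug_assert!(flags < (1 << (32 - FILL_BITS)));
        FillAndFlags((flags << FILL_BITS) | fill as u32)
    }

    pub fn fill(self) -> char {
        // The low bits were produced from a valid `char`, so the fallback is not expected to trigger.
        char::from_u32(self.0 & FILL_MASK).unwrap_or(' ')
    }

    pub fn flags(self) -> u32 {
        self.0 >> FILL_BITS
    }
}
```

The pair then takes a single `u32`. As the comment above notes, on 64-bit targets the `usize` fields next to it would still dominate the struct size.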

Contributor Author

I agree that there is definitely potential for further optimization here. I'll leave the solution using flags – while it is a bit harder to read, the possible performance benefits in the future seem more important to me.

The benchmark still reports a small performance regression (~0.1 %). While I don't think this is a big problem, it would still be nice if this PR did not introduce any performance regressions. To make the code more maintainable, I had changed all internal accesses to formatting options to use the getters/setters on FormattingOptions instead of direct field access. I suspect this is at least a major factor in the performance regression. I have reverted this now; can you please rerun the benchmark, @m-ou-se?

@m-ou-se
Member

m-ou-se commented Feb 15, 2024

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@m-ou-se m-ou-se added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 15, 2024
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 15, 2024
@bors
Contributor

bors commented Feb 15, 2024

⌛ Trying commit 385944a with merge 5085172...

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 15, 2024
…<try>

@bors
Contributor

bors commented Feb 15, 2024

☀️ Try build successful - checks-actions
Build commit: 5085172 (508517284df836989dd91522841c917d69eac778)

@m-ou-se m-ou-se added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Feb 20, 2024
EliasHolzmann added a commit to EliasHolzmann/rust that referenced this pull request Mar 3, 2024
@EliasHolzmann
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 3, 2024
@bors
Contributor

bors commented Mar 5, 2024

☔ The latest upstream changes (presumably #121001) made this pull request unmergeable. Please resolve the merge conflicts.

EliasHolzmann added a commit to EliasHolzmann/rust that referenced this pull request Mar 5, 2024
EliasHolzmann added a commit to EliasHolzmann/rust that referenced this pull request Mar 5, 2024
@bors
Contributor

bors commented Mar 8, 2024

☔ The latest upstream changes (presumably #122059) made this pull request unmergeable. Please resolve the merge conflicts.

EliasHolzmann and others added 18 commits March 8, 2024 15:57
The idea behind this is to make implementing `fmt::FormattingOptions` (as well
as any future changes to `std::fmt::Formatter`) easier.

In theory, this might have a negative performance impact because of the
additional function calls. However, I strongly believe that those will be
inlined anyway, thereby producing assembly code that has comparable performance.

This allows building custom `std::fmt::Formatter`s at runtime.

Also added some related enums and two related methods on `std::fmt::Formatter`.

`Formatter::with_options` takes `self` as a mutable reference (`&'a mut
Formatter<'b>`). `'a` and `'b` need to be different lifetimes. Just taking `&'a
mut Formatter<'a>` and trusting Rust to implicitly convert from
`&'a mut Formatter<'b>` if necessary (after all, `'a` must be smaller than `'b`
anyway) fails because `'b` is behind a *mutable* reference. For background
on this behavior, see https://doc.rust-lang.org/nomicon/subtyping.html#variance.

Likewise for `get_alignment`. This is how the method is named on `Formatter`, I
want to keep it consistent.

Co-authored-by: Mara Bos <m-ou.se@m-ou.se>
@bors
Contributor

bors commented Apr 23, 2024

☔ The latest upstream changes (presumably #124271) made this pull request unmergeable. Please resolve the merge conflicts.

Labels
perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.

9 participants