Add fmt size_hints #583
erickt
commented
Jan 14, 2015
|
I'm all for adding a hint for formatting. It would be nice in this RFC to mention unifying this with iterator size hints. cc-ing @kballard and @thestinger, who I believe collaborated on adding Does anyone know of anything using the #46 is somewhat related to this RFC. |
|
I don't know offhand of anyone using the All that said, if the |
|
A drawback to this proposal: all manual implementations of Something to consider: A macro like |
lilyball
reviewed
Jan 14, 2015
| SizeHint { | ||
| min: self.min + other.min, | ||
| max: match (self.max, other.max) { | ||
| (Some(left), Some(right)) => Some(left + right), |
lilyball
Jan 14, 2015
Contributor
This implementation doesn't handle overflow. It needs to look something like
SizeHint {
    min: self.min.saturating_add(other.min),
    max: if let (Some(left), Some(right)) = (self.max, other.max) {
        // checked_add already returns an Option; None on overflow means no known upper bound
        left.checked_add(right)
    } else {
        None
    }
}
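For reference, the overflow-safe addition suggested above can be sketched as a complete `Add` implementation. This is only an illustration: the `SizeHint` struct here is a stand-in whose field names follow the quoted diff, not the RFC's final definition.

```rust
use std::ops::Add;

/// Stand-in for the RFC's `SizeHint`; field names follow the quoted diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SizeHint {
    pub min: usize,
    pub max: Option<usize>,
}

impl Add for SizeHint {
    type Output = SizeHint;

    fn add(self, other: SizeHint) -> SizeHint {
        SizeHint {
            // The lower bound saturates instead of wrapping or panicking.
            min: self.min.saturating_add(other.min),
            // `checked_add` returns an `Option`, so an overflowing upper
            // bound degrades to `None` ("no known upper bound").
            max: match (self.max, other.max) {
                (Some(left), Some(right)) => left.checked_add(right),
                _ => None,
            },
        }
    }
}

fn main() {
    let small = SizeHint { min: 3, max: Some(5) };
    let huge = SizeHint { min: usize::MAX, max: Some(usize::MAX) };
    let sum = small + huge;
    assert_eq!(sum.min, usize::MAX);
    assert_eq!(sum.max, None);
}
```

Treating an overflowed upper bound as `None` keeps the hint conservative: a consumer that sees `None` simply falls back to growing the buffer on demand.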
|
Offhand, I am in favor of making formatting strings more efficient, especially if that makes |
|
That's exactly what I had thought with the range: if the range is large, it I looked at Iters size_hint for design, but they're a little different. @kballard perhaps debug_asserts could be added to ensure at the end that
|
alexcrichton
reviewed
Jan 14, 2015
| } | ||
| ``` | ||
|
|
||
| Add a `SizeHint` type, with named properties, instead of using tuple indexing. Include an `Add` implementation for `SizeHint`, so they can be easily added together from nested properties. |
alexcrichton
Jan 14, 2015
Member
This is an interesting deviation from the size hints of iterators. I would personally expect the two to have the same signature, and this may want to at least discuss the discrepancy between the two return values.
seanmonstar
Jan 14, 2015
Author
Contributor
It started because I wanted to implement Add, and it felt odd to impl on such a generic tuple.
alexcrichton
reviewed
Jan 14, 2015
|
|
||
| # Drawbacks | ||
|
|
||
| I can't think of a reason to stop this. |
alexcrichton
Jan 14, 2015
Member
There's always at least a drawback of "this is extra backwards-compatible work before 1.0" as this RFC will require both implementation effort as well as design/review effort to push through.
alexcrichton
reviewed
Jan 14, 2015
| test bench_short_memcpy ... bench: 33 ns/iter (+/- 3) | ||
| ``` | ||
|
|
||
| # Detailed design |
alexcrichton
Jan 14, 2015
Member
I think this section is currently missing details about how this is actually going to be implemented in relation to format_args! unless we're only optimizing the .to_string() case. For example, these two calls are fairly different in the paths that they take:
foo.to_string();
format!("{}", foo);
In the first case, we call to_string on the ToString trait which has a reference to the concrete type T, allowing an invocation of size_hint to preallocate the String. In the latter case we call std::fmt::format with an Arguments structure. In Arguments the type T has been erased, and there is not an obvious location to call .size_hint().
I would be curious to see more details about how this is handled and how far these size hints are being propagated.
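To make the difference between the two paths concrete, here is a minimal sketch. The `HintedDisplay` trait, the `Point` type, and both helper functions are hypothetical names invented for illustration; only `fmt::Arguments`, `format_args!`, and `String::with_capacity` are real standard-library items.

```rust
use std::fmt::{self, Write};

/// Hypothetical trait: a `Display` type that can estimate its output length.
trait HintedDisplay: fmt::Display {
    fn size_hint(&self) -> usize;
}

struct Point {
    x: i32,
    y: i32,
}

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

impl HintedDisplay for Point {
    fn size_hint(&self) -> usize {
        // "(", ", " and ")" take 4 bytes; an i32 prints in at most 11 bytes.
        4 + 11 + 11
    }
}

/// Concrete path: `T` is known, so the hint can be read before allocating.
fn to_string_hinted<T: HintedDisplay>(value: &T) -> String {
    let mut buf = String::with_capacity(value.size_hint());
    write!(buf, "{}", value).expect("formatting into a String failed");
    buf
}

/// Erased path: only `fmt::Arguments` is available, which exposes no hint.
fn format_erased(args: fmt::Arguments<'_>) -> String {
    let mut buf = String::new();
    buf.write_fmt(args).expect("formatting into a String failed");
    buf
}

fn main() {
    let p = Point { x: 1, y: -2 };
    let hinted = to_string_hinted(&p);
    let erased = format_erased(format_args!("{}", p));
    assert_eq!(hinted, erased);
    assert!(hinted.capacity() >= 26);
}
```

Both functions produce the same text; the only difference is that the generic one can size its buffer up front, which is the gap the comment above asks the RFC to address.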
pczarn
Feb 5, 2015
The format_args! macro can get size hints of all template pieces and pass the sum to Arguments. I think it's not worthwhile. format! is often used for making short-lived strings, for which size hints would only add considerable bloat. In other cases, better write! to a preallocated buffer or call shrink_to_fit afterwards.
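The two workarounds mentioned here look roughly like the following. This is a sketch only; `render_row` and `render_compact` are made-up helper names, and the 16 bytes of slack is an arbitrary estimate chosen for the example.

```rust
use std::fmt::Write;

/// Formats into a buffer the caller sized up front.
fn render_row(id: u32, name: &str) -> String {
    // Assumption: 16 bytes of slack covers the id (at most 10 digits) and ": ".
    let mut buf = String::with_capacity(name.len() + 16);
    write!(buf, "{}: {}", id, name).expect("formatting into a String failed");
    buf
}

/// Formats with `format!` and gives back any excess capacity afterwards.
fn render_compact(id: u32, name: &str) -> String {
    let mut s = format!("{}: {}", id, name);
    s.shrink_to_fit();
    s
}

fn main() {
    assert_eq!(render_row(7, "seven"), "7: seven");
    assert_eq!(render_compact(7, "seven"), "7: seven");
}
```

Neither variant needs any change to the `fmt` traits, which is the trade-off being argued: the caller has to supply the estimate by hand.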
seanmonstar
Feb 6, 2015
Author
Contributor
@pczarn the benchmarks say differently. Even smaller strings receive a noticeable improvement.
|
I've got a different approach that does not change the API of With this approach I'm seeing similar numbers for your
|
seanmonstar
added some commits
Jan 15, 2015
|
@kballard @alexcrichton I've updated the RFC with your comments. |
lilyball
commented on text/0000-fmt-size-hint.md in 5e0ad26
Jan 15, 2015
|
Well, it focuses on improvements only when the formatting operation ends up making a single call to Ultimately, the idea is that, if the format destination is going to |
|
Please benchmark number formatting. rust-lang/rust#19218 |
alexcrichton
reviewed
Jan 16, 2015
| value: &'a Void, | ||
| formatter: fn(&Void, &mut Formatter) -> Result, | ||
| hint: fn(&Void) -> SizeHint, | ||
| } |
alexcrichton
Jan 16, 2015
Member
It may be worth profiling the impact of this representation as it's inflating the size from 2 words to 3 words. Formatting is normally not necessarily on hot code paths, but it does affect code size and stack size and we've had to optimize it in the past. In theory an Argument is just a trait object so the formatter/hint pair could point to a "vtable" which could just be constructed manually instead of as a trait object itself.
Regardless though, it's a pretty minor point, just something to think about :)
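The size difference can be checked with a small layout experiment. Everything below is a stand-in written for illustration (`Opaque` plays the role of `Void`, and `ArgVTable` is a hypothetical hand-built table); it only compares struct sizes and does not show how such a table would be populated.

```rust
use std::fmt;
use std::mem::size_of;

/// Stand-in for the opaque `Void` type in the quoted diff.
#[allow(dead_code)]
struct Opaque;

/// Stand-in for the RFC's `SizeHint`.
#[allow(dead_code)]
struct SizeHint {
    min: usize,
    max: Option<usize>,
}

type FmtFn = fn(&Opaque, &mut fmt::Formatter<'_>) -> fmt::Result;
type HintFn = fn(&Opaque) -> SizeHint;

/// Layout from the diff: a value pointer plus two function pointers.
#[allow(dead_code)]
struct InlineArgument<'a> {
    value: &'a Opaque,
    formatter: FmtFn,
    hint: HintFn,
}

/// Hypothetical manual "vtable" shared by all arguments of one type.
#[allow(dead_code)]
struct ArgVTable {
    formatter: FmtFn,
    hint: HintFn,
}

/// Alternative layout: a value pointer plus one pointer to the shared table.
#[allow(dead_code)]
struct VTableArgument<'a> {
    value: &'a Opaque,
    vtable: &'static ArgVTable,
}

fn inline_words() -> usize {
    size_of::<InlineArgument<'static>>() / size_of::<usize>()
}

fn vtable_words() -> usize {
    size_of::<VTableArgument<'static>>() / size_of::<usize>()
}

fn main() {
    assert_eq!(inline_words(), 3);
    assert_eq!(vtable_words(), 2);
}
```

The vtable layout trades one extra pointer dereference per call for a smaller per-argument footprint on the stack.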
seanmonstar
Jan 16, 2015
Author
Contributor
I was thinking about this as well. Currently, any function can be passed here, but if we didn't want that flexibility, we could create an enum Format { Show, String, LowerHex, ... } and change Argument to be struct Argument { formatter: Format, value: &Void }.
alexcrichton
Jan 16, 2015
Member
At some point we do need to force rustc to monomorphize an implementation of Show as how would you actually call the fmt function given just formatter and value? (e.g. a function pointer needs to be somewhere)
seanmonstar
Jan 16, 2015
Author
Contributor
I imagined being able to look them up from the enum:
impl Format {
    fn get_fns(&self) -> (fn(&Void, &mut Formatter) -> Result, fn(&Void) -> SizeHint) {
        match *self {
            Format::Show => (Show::fmt, Show::size_hint),
            // ...
        }
    }
}
alexcrichton
Jan 16, 2015
Member
You'll need type inference to drive inference for the Self type of Show::fmt:
use std::fmt::Show;
fn main() {
    let f = Show::fmt;
}
foo.rs:4:13: 4:22 error: type annotations required: cannot resolve `_ : core::fmt::Show`
foo.rs:4 let f = Show::fmt;
^~~~~~~~~
foo.rs:4:13: 4:22 note: required by `core::fmt::Show::fmt`
foo.rs:4 let f = Show::fmt;
^~~~~~~~~
error: aborting due to previous error
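The error above goes away once the `Self` type is named, which is why a function pointer has to be produced per concrete type somewhere. A minimal sketch in current Rust follows; it uses today's `fmt::Display` and `fmt::LowerHex` rather than the 2015-era `Show`, and `FmtFn`, `Adapter`, and `render_with` are hypothetical helpers for the example.

```rust
use std::fmt;

/// Function-pointer type for formatting a concrete `T`.
type FmtFn<T> = fn(&T, &mut fmt::Formatter<'_>) -> fmt::Result;

/// Pairs a value with the formatting function chosen for it.
struct Adapter<'a, T> {
    value: &'a T,
    f: FmtFn<T>,
}

impl<'a, T> fmt::Display for Adapter<'a, T> {
    fn fmt(&self, out: &mut fmt::Formatter<'_>) -> fmt::Result {
        (self.f)(self.value, out)
    }
}

/// Renders `value` through an explicitly supplied function pointer.
fn render_with<T>(value: &T, f: FmtFn<T>) -> String {
    Adapter { value, f }.to_string()
}

fn main() {
    // Naming `i32` as the `Self` type resolves which impl's `fmt` is meant.
    let display: FmtFn<i32> = <i32 as fmt::Display>::fmt;
    let hex: FmtFn<i32> = <i32 as fmt::LowerHex>::fmt;
    assert_eq!(render_with(&42, display), "42");
    assert_eq!(render_with(&255, hex), "ff");
}
```

An enum tag such as `Format::Show` carries no `Self` type, so a lookup keyed on the tag alone cannot select one of these pointers.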
|
Thanks for the additions @seanmonstar! Could you also add a part explicitly saying that everything will be |
reem
commented
Jan 16, 2015
|
I've also wondered if we could just add length hinting methods to |
|
@reem By the time you can call that, the buffer has already been allocated. And if the object has properties that are also going to be formatted, those properties can't tell the Formatter their hint until part way through the formatting. |
|
@alexcrichton I added unstable attributes to the struct and method. |
seanmonstar
reviewed
Jan 17, 2015
| ```rust | ||
| impl<T: fmt::String> ToString for T { | ||
| fn to_string(&self) -> String { | ||
| format!("{}", self) |
seanmonstar
Jan 17, 2015
Author
Contributor
I started implementing this, and ran into that the format macro is defined in libstd, and this trait is in libcollections. The reason it's in std is because it needs String to use as a buffer.
What if I moved this macro to libcollections, and had it call String::format instead of std::fmt::format?
alexcrichton
Jan 18, 2015
Member
I think it's fine to define the format! macro in libcollections and then use macro_reexport in the standard library (like it does for vec! already)
nrc
assigned
pnkfelix
Jan 22, 2015
alexcrichton
referenced this pull request
Jan 28, 2015
Merged
std: Stabilize the std::fmt module #21713
seanmonstar
referenced this pull request
Feb 4, 2015
Closed
RFC: Deprecate std::fmt::format in favor of String::format #810
Though really, I don't think either of these modifications really needs to block this RFC. I'll try to ping @alexcrichton about getting it in front of the core team. |
|
@pnkfelix I added the drawback. Regarding numbers, I imagine a minor improvement, since this reduces the number of allocations for the buffer. However, it seems the slowness in formatting numbers has more to do with that implementation, no? |
|
@seanmonstar yeah probably |
|
ping @alexcrichton |
|
Hm I've been thinking about this recently, and I'm somewhat worried about the code size implications here. For example the With formatting, however, trait objects are its bread and butter, so we're forced to keep all
I don't think that this will turn up any showstoppers, but I'd like to have a handle on what we're getting into. Otherwise I think that this is basically good to go so I'll see what we can do to get it merged soon afterwards. |
rprichard
commented
Apr 12, 2015
|
The RFC's current If there are format parameters, then the I wonder if I'm inclined to think the size hint should be |
pczarn
commented
Apr 12, 2015
|
I tried to sum size hints in the expansion of |
|
ping @alexcrichton |
|
@seanmonstar do you have any updates on the impact on code size here? |
|
I do not. The branch I started is several months old, and I haven't had time to rebase. @rprichard raises some points worth considering, though. |
aturon
added
the
T-libs
label
May 22, 2015
nikomatsakis
added
the
final-comment-period
label
May 26, 2015
|
Note: this RFC entered its Final Comment Period as of yesterday; 6 days remain before a final decision will be made. |
bluss
commented
May 30, 2015
|
Don't make the decision until the discussion about |
|
Accepting this RFC does not require setting the name in stone. Though we all love to bikeshed names, I think we can all agree they're not really the most important part of this proposal. Implementing this of course takes time, and landing it in stable even more. |
|
Another alternative would be instead of including this method, for fmt to use a thread-local LruCache of sizes when formatting types. This would result in the first case not getting the speed benefit, but would mean that all implementations of a fmt trait can use a size hint, without requiring the |
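A rough sketch of the cache idea, to show what it would involve. This is not an LRU: a plain per-thread `HashMap` keyed by `TypeId` stands in for it, and `to_string_cached` and `cached_len` are hypothetical names, not anything proposed in the thread.

```rust
use std::any::TypeId;
use std::cell::RefCell;
use std::collections::HashMap;
use std::fmt::{Display, Write};

thread_local! {
    // Last observed output length per type; a simplification of an LRU cache.
    static LAST_LEN: RefCell<HashMap<TypeId, usize>> = RefCell::new(HashMap::new());
}

/// Returns the length recorded the last time a `T` was formatted on this thread.
fn cached_len<T: 'static>() -> Option<usize> {
    LAST_LEN.with(|cache| cache.borrow().get(&TypeId::of::<T>()).copied())
}

/// Formats `value`, sizing the buffer from the previous output of the same type.
fn to_string_cached<T: Display + 'static>(value: &T) -> String {
    // The first call for a type has nothing to go on and starts empty.
    let guess = cached_len::<T>().unwrap_or(0);
    let mut buf = String::with_capacity(guess);
    write!(buf, "{}", value).expect("formatting into a String failed");
    LAST_LEN.with(|cache| {
        cache.borrow_mut().insert(TypeId::of::<T>(), buf.len());
    });
    buf
}

fn main() {
    assert_eq!(cached_len::<u64>(), None);
    assert_eq!(to_string_cached(&12345u64), "12345");
    assert_eq!(cached_len::<u64>(), Some(5));
    let second = to_string_cached(&67890u64);
    assert!(second.capacity() >= 5);
}
```

As the comment notes, the first formatting of each type gets no benefit, and the sketch also assumes that values of one type tend to format to similar lengths.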
|
Having now gone through the final comment period, it's time to make a decision on this RFC. For now I'm going to close this RFC for the following reasons:
Overall it seems best to start this RFC fresh again with a new look beyond microbenchmarks to the statistics measured here, and also perhaps await the outcome of #1034. Thanks regardless for the RFC @seanmonstar! |
alexcrichton
closed this
Jun 2, 2015
alexcrichton
referenced this pull request
Jun 2, 2015
Open
Investigate size hints for formatting #1145
bluss
commented
Jun 2, 2015
|
like gankro said, the name is a non-issue. |
seanmonstar commented Jan 14, 2015
Add a size_hint method to each of the fmt traits, allowing a buffer to allocate with the correct size before writing.
Rendered
cc @alexcrichton