Attempt to make Box::default() construct in-place + Box::new_in_place() #64062

Open
wants to merge 2 commits into
base: master
from

@petertodd
Contributor

commented Sep 1, 2019

To start with, I'll say straight up this is more of a sketch of a possible API than something I'm sure we should merge.

The first part here is the Box::new_in_place() API. The idea is to provide a way to reliably create large boxed values in place via return-value optimization. Today the optimizer often cannot do this, even in theory: allocation is fallible, and if the creation of the value is itself fallible, the optimizer can't move the allocation to happen first because that would be an observable change.

Unfortunately this doesn't quite work if the closure can't be inlined sufficiently due to optimizer limitations. But hopefully this at least shows a way forward that might be interesting to others.

Secondly, I use this Box::new_in_place() API to optimize Box::default(). I'm not sure this is a good idea to actually merge: it is an observable change in behavior for things like large arrays that would otherwise blow up the stack, so people might come to rely on it, and the optimization is currently unreliable.

Finally, this could of course be extended to Rc and Arc as well in much the same fashion.
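
To make the idea concrete, here is a minimal sketch of what such a helper could look like as a free function on stable Rust. It is not the code from this PR: the names `box_new_in_place` and `box_default_in_place` are hypothetical, and the sketch goes through `std::alloc` directly. The allocation happens before the closure runs, so the optimizer is at least permitted to build the value straight into the heap slot; nothing here guarantees that it will.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};
use std::ptr::NonNull;

/// Hypothetical stand-in for the proposed `Box::new_in_place()`.
/// Allocates first, then runs `f` and writes its result into the allocation.
#[inline(always)]
pub fn box_new_in_place<T>(f: impl FnOnce() -> T) -> Box<T> {
    let layout = Layout::new::<T>();
    let ptr: *mut T = if layout.size() == 0 {
        // Zero-sized types must not go through the allocator.
        NonNull::<T>::dangling().as_ptr()
    } else {
        // SAFETY: `layout` has a non-zero size.
        let raw = unsafe { alloc(layout) }.cast::<T>();
        if raw.is_null() {
            handle_alloc_error(layout);
        }
        raw
    };
    // SAFETY: `ptr` is aligned and valid for a write of `T`. After the write
    // it holds an initialized `T` in memory obtained from the global
    // allocator with `T`'s layout (or a dangling pointer for a zero-sized
    // `T`), which is what `Box::from_raw` expects.
    // Note: if `f` panics, the allocation is leaked (safe, but wasteful).
    unsafe {
        ptr.write(f());
        Box::from_raw(ptr)
    }
}

/// Hypothetical `Box::default()` replacement built on the helper above.
pub fn box_default_in_place<T: Default>() -> Box<T> {
    box_new_in_place(T::default)
}
```

Whether the result of `f()` is really constructed in the heap slot, rather than on the stack and then copied, still depends on inlining and on the optimizer.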

@rust-highfive

Collaborator

commented Sep 1, 2019

r? @kennytm

(rust_highfive has picked a reviewer for you, use r? to override)

/// FIXME: This is intended to work via return-value-optimization, but due to compiler
/// limitations can't reliably do that yet.
#[inline(always)]
fn new_in_place(f: impl FnOnce() -> T) -> Box<T> {

@Centril

Centril Sep 1, 2019

Member

This is semantically more like new_please_try_to_do_it_in_place_but_I_am_not_expecting_any_guarantee?
(Such a guarantee is not something we can reliably make, or would make, at the language (as opposed to LLVM) level.)

@petertodd

petertodd Sep 1, 2019

Author Contributor

Point is, if we had guaranteed copy elision, I think this is as close as you could get with an API that looked something like this given Rust's semantics. Even with out-references it's not clear to me you could get semantics that nicely guarantee no copies in an easy way, as you could still mess up and wind up with a copy prior to writing it to the out-reference.

Like I said above, it's not clear to me this is actually a good idea. But it does seem novel. :)

edit: fixed thinko: we do have return-value-optimization; it's the guaranteed copy elision that we're missing.
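
The point about out-references can be illustrated with a small sketch. `Big`, `init_direct` and `init_via_temporary` are hypothetical names used only for illustration. Both functions have the same out-reference signature, yet only the first avoids a whole-value temporary by construction; the second builds the value first and then moves it into the slot, which is exactly the kind of copy that is easy to introduce by accident.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Hypothetical large type used only for illustration.
pub struct Big {
    pub data: [u8; 4096],
    pub len: usize,
}

/// Initializes `out` field by field through raw pointers, so no whole
/// `Big` temporary is needed.
pub fn init_direct(out: &mut MaybeUninit<Big>) {
    let p = out.as_mut_ptr();
    // SAFETY: `p` points to writable, properly aligned storage for a `Big`;
    // the fields are only written, never read, while uninitialized.
    unsafe {
        // The count is in units of `[u8; 4096]`, so 1 zeroes the whole array.
        addr_of_mut!((*p).data).write_bytes(0, 1);
        addr_of_mut!((*p).len).write(0);
    }
}

/// Same signature, but a `Big` temporary is built first and then moved
/// into `out`. Whether that copy is elided is up to the optimizer.
pub fn init_via_temporary(out: &mut MaybeUninit<Big>) {
    let tmp = Big { data: [0; 4096], len: 0 };
    out.write(tmp);
}
```

Both functions leave `out` fully initialized, so a caller may use `assume_init()` afterwards; the difference lies only in how the bytes get there.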

@Centril

Centril Sep 1, 2019

Member

Given that we do not guarantee copy elision, we'll need at minimum to call this something other than new_in_place, which suggests that there actually is a guarantee. For example: new_hint_in_place.

It might be a good idea to just use this internally for now... let's see how this measures up in terms of perf for fn default.

@petertodd

petertodd Sep 1, 2019

Author Contributor

> It might be a good idea to just use this internally for now

Yup, it is private right now, as I didn't want to give people false hope even in nightly. I'd suggest that we leave it that way until RVO is more reliable (specifically named-return-value-optimization if I understand the terminology right?).

@Centril

Member

commented Sep 1, 2019

@bors try

@bors

Contributor

commented Sep 1, 2019

⌛️ Trying commit b0674d3 with merge 9d1bd60...

bors added a commit that referenced this pull request Sep 1, 2019
Auto merge of #64062 - petertodd:2019-new-in-place, r=<try>
Attempt to make Box::default() construct in-place + Box::new_in_place()

@bors

Contributor

commented Sep 1, 2019

☀️ Try build successful - checks-azure
Build commit: 9d1bd60

@Centril

Member

commented Sep 1, 2019

@rust-timer

commented Sep 1, 2019

Success: Queued 9d1bd60 with parent e29faf0, comparison URL.

@rust-timer

commented Sep 2, 2019

Finished benchmarking try commit 9d1bd60, comparison URL.

@petertodd

Contributor Author

commented Sep 2, 2019

Looks like this is on average very slightly slower than before.

I'll bet you this is a function of caching: by allocating first, you have to access allocation-related memory, flushing both registers and cache, and only then do the work to create the object. As most objects are fairly small anyway, having to reload registers and cache to create the object may be slightly more costly on average than the extra copy.

@bjorn3

Contributor

commented Sep 2, 2019

box expr already allocates a box first and then evaluates expr. That means that Box::default would already have been subject to RVO where possible.

#![feature(box_syntax)]

fn main() {
    let _: Box<String> = box String::new();
}
// [...]
        _2 = Box(std::string::String);
        (*_2) = const std::string::String::new() -> [return: bb2, unwind: bb4];
// [...]
@petertodd

Contributor Author

commented Sep 5, 2019

> box expr already allocates a box first and then evaluates expr. That means that Box::default would already have been subject to RVO where possible.

Ha, that's embarrassing! Yeah, I wrote the initial version of this for Rc/Arc, and I was sure I'd double-checked the assembly for the Box version... But I guess not.

@Centril Just pushed another commit which adds Rc/Arc if you want to try that benchmarking thing again.

I suppose that, since the performance isn't significantly changed, Box::new_in_place() might still be worth merging, if only to provide a way to remove the old box syntax.
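
For the Rc/Arc variant, a rough sketch of the same allocate-first pattern is shown below. It is not the commit pushed to this PR: the function names are hypothetical, and it assumes a recent stable toolchain on which `Rc::new_uninit`, `Arc::new_uninit` and `assume_init` are available (they were still unstable when this thread was written).

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical helper: allocate the `Rc` first, then write `f()` into it.
pub fn rc_new_in_place<T>(f: impl FnOnce() -> T) -> Rc<T> {
    let mut slot = Rc::<T>::new_uninit();
    // A freshly created `Rc` has no other owners, so `get_mut` succeeds.
    Rc::get_mut(&mut slot)
        .expect("freshly created Rc is uniquely owned")
        .write(f());
    // SAFETY: the value was initialized by the `write` above.
    unsafe { slot.assume_init() }
}

/// Hypothetical helper: the same pattern for `Arc`.
pub fn arc_new_in_place<T>(f: impl FnOnce() -> T) -> Arc<T> {
    let mut slot = Arc::<T>::new_uninit();
    // A freshly created `Arc` has no other owners, so `get_mut` succeeds.
    Arc::get_mut(&mut slot)
        .expect("freshly created Arc is uniquely owned")
        .write(f());
    // SAFETY: the value was initialized by the `write` above.
    unsafe { slot.assume_init() }
}
```

As with the Box case, this only arranges for the allocation to exist before the value is produced; whether the copy is actually elided remains up to the optimizer.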

@Centril

Member

commented Sep 5, 2019

> @Centril Just pushed another commit which adds Rc/Arc if you want to try that benchmarking thing again.

Sure; can't hurt.

@bors try

@bors

Contributor

commented Sep 5, 2019

⌛️ Trying commit 9b57963 with merge a683cc0...

bors added a commit that referenced this pull request Sep 5, 2019
Auto merge of #64062 - petertodd:2019-new-in-place, r=<try>
Attempt to make Box::default() construct in-place + Box::new_in_place()

@bors

Contributor

commented Sep 5, 2019

☀️ Try build successful - checks-azure
Build commit: a683cc0

@Centril

Member

commented Sep 5, 2019

@rust-timer

commented Sep 5, 2019

Success: Queued a683cc0 with parent f257c40, comparison URL.

@rust-timer

commented Sep 5, 2019

Finished benchmarking try commit a683cc0, comparison URL.

@petertodd

Contributor Author

commented Sep 5, 2019

Heh, I wonder if that benchmark suite even has a single case of Rc::default() or Arc::default() being used?

Anyway, might be too rare to be worth optimizing.

@Centril

Member

commented Sep 5, 2019

@petertodd "The benchmark suite" is not those crates having Arc/Rc::default(), but rather the performance of the compiler type checking, generating code, etc. for those crates. In other words, we are measuring compile-time perf.

@petertodd

Contributor Author

commented Sep 10, 2019

> @petertodd "The benchmark suite" is not those crates having Arc/Rc::default(), but rather the performance of the compiler type checking, generating code, etc. for those crates. In other words, we are measuring compile-time perf.

Ah ok, so basically my question boils down to whether or not the compiler itself uses Rc/Arc default().

Anyway, if you guys think this is worth merging, let me know and I'll fix up the comments appropriately.

@Centril

Member

commented Sep 10, 2019

> Ah ok, so basically my question boils down to whether or not the compiler uses Rc/Arc default()

Basically yeah.

> Anyway, if you guys think this is worth merging, let me know and I'll fix up the comments appropriately.

I'd drop the Box changes. There is potentially an improvement for Rc and Arc, but it could also just be noise.

r? @nnethercote

@nnethercote

Contributor

commented Sep 11, 2019

The general comments I can make are as follows.

  • The performance improvement on the benchmarks is vanishingly small, right at the edge of what can be reliably detected.
  • The lack of certainty about whether the optimization is even working as expected is discouraging.
  • The commits introduce multiple uses of unsafe.

Based on those observations, my inclination is that these commits do not have enough of a benefit to warrant landing.

I am not comfortable reviewing the specifics of the code, because my understanding of things like MaybeUninit is almost zero. @pnkfelix might be a better reviewer, if you want to push forward.

r? @pnkfelix

@pietroalbini

Member

commented Sep 11, 2019

r? @pietroalbini

Just testing highfive.

@pietroalbini

Member

commented Sep 11, 2019

@pietroalbini

Member

commented Sep 11, 2019

Sorry y'all for the useless pings, trying to figure out why highfive didn't work earlier.

r? @nnethercote

@pietroalbini

Member

commented Sep 11, 2019

Ok, apparently it's working now? The weird thing is I can't find any errors that happened when the last two r? failed, everything is green...

r? @pnkfelix
