Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add APIs for uninitialized Box, Rc, and Arc. (Plus get_mut_unchecked) #62451

Open
wants to merge 14 commits into
base: master
from

Conversation

@SimonSapin
Copy link
Contributor

commented Jul 6, 2019

Assigning MaybeUninit::<Foo>::uninit() to a local variable is usually free, even when size_of::<Foo>() is large. However, passing it for example to Arc::new causes at least one copy (from the stack to the newly allocated heap memory) even though there is no meaningful data. It is theoretically possible that a Sufficiently Advanced Compiler could optimize this copy away, but this is reportedly unlikely to happen soon in LLVM.

This PR proposes two sets of features:

  • Constructors for containers (Box, Rc, Arc) of MaybeUninit<T> or [MaybeUninit<T>] that do not initialized the data, and unsafe conversions to the known-initialized types (without MaybeUninit). The constructors are guaranteed not to make unnecessary copies.

  • On Rc and Arc, an unsafe get_mut_unchecked method that provides &mut T access without checking the reference count. Arc::get_mut involves multiple atomic operations whose cost can be non-trivial. Rc::get_mut is less costly, but we add Rc::get_mut_unchecked anyway for symmetry with Arc.

    These can be useful independently, but they will presumably be typical when the new constructors of Rc and Arc are used.

    An alternative with a safe API would be to introduce UniqueRc and UniqueArc types that have the same memory layout as Rc and Arc (and so zero-cost conversion to them) but are guaranteed to have only one reference. But introducing entire new types feels “heavier” than new constructors on existing types, and initialization of MaybeUninit<T> typically requires unsafe code anyway.

Summary of new APIs (all unstable in this PR):

impl<T> Box<T> { pub fn new_uninit() -> Box<MaybeUninit<T>> {…} }
impl<T> Box<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Box<T> {…} }
impl<T> Box<[T]> { pub fn new_uninit_slice(len: usize) -> Box<[MaybeUninit<T>]> {…} }
impl<T> Box<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Box<[T]> {…} }

impl<T> Rc<T> { pub fn new_uninit() -> Rc<MaybeUninit<T>> {…} }
impl<T> Rc<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Rc<T> {…} }
impl<T> Rc<[T]> { pub fn new_uninit_slice(len: usize) -> Rc<[MaybeUninit<T>]> {…} }
impl<T> Rc<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Rc<[T]> {…} }

impl<T> Arc<T> { pub fn new_uninit() -> Arc<MaybeUninit<T>> {…} }
impl<T> Arc<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Arc<T> {…} }
impl<T> Arc<[T]> { pub fn new_uninit_slice(len: usize) -> Arc<[MaybeUninit<T>]> {…} }
impl<T> Arc<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Arc<[T]> {…} }

impl<T: ?Sized> Rc<T> { pub unsafe fn get_mut_unchecked(this: &mut Self) -> &mut T {…} }
impl<T: ?Sized> Arc<T> { pub unsafe fn get_mut_unchecked(this: &mut Self) -> &mut T {…} }
@rust-highfive

This comment has been minimized.

Copy link
Collaborator

commented Jul 6, 2019

r? @bluss

(rust_highfive has picked a reviewer for you, use r? to override)

@SimonSapin

This comment has been minimized.

Copy link
Contributor Author

commented Jul 6, 2019

CC @rust-lang/libs for new unstable APIs
CC @RalfJung for MaybeUninit
CC @jrmuizel and @kvark for some overlap in functionality with https://crates.io/crates/copyless

@SimonSapin SimonSapin force-pushed the SimonSapin:new_uninit branch 2 times, most recently from d7d3352 to 24fd456 Jul 6, 2019

Show resolved Hide resolved src/liballoc/boxed.rs Outdated
@RalfJung

This comment has been minimized.

Copy link
Member

commented Jul 13, 2019

We have a codegen test for allocating uninitialized boxes at https://github.com/rust-lang/rust/blob/master/src/test/codegen/box-maybe-uninit.rs. Seems like a good idea to update that to test Box::new_uninit?

Show resolved Hide resolved src/liballoc/rc.rs Outdated
Show resolved Hide resolved src/liballoc/rc.rs Outdated
Show resolved Hide resolved src/liballoc/rc.rs Outdated
Show resolved Hide resolved src/liballoc/rc.rs Outdated

SimonSapin added a commit to SimonSapin/rust that referenced this pull request Jul 16, 2019

@SimonSapin

This comment has been minimized.

Copy link
Contributor Author

commented Jul 16, 2019

We have a codegen test for allocating uninitialized boxes at https://github.com/rust-lang/rust/blob/master/src/test/codegen/box-maybe-uninit.rs. Seems like a good idea to update that to test Box::new_uninit?

If you insist I don’t really mind adding it, but I think it wouldn’t be useful.

The existing test uses Box::new which uses the box keyword which is lowered to MIR here:

// malloc some memory of suitable type (thus far, uninitialized):
let box_ = Rvalue::NullaryOp(NullOp::Box, value.ty);
this.cfg
.push_assign(block, source_info, &Place::from(result), box_);
// initialize the box contents:
unpack!(
block = this.into(
&Place::from(result).deref(),
block, value
)
);
block.and(Rvalue::Use(Operand::Move(Place::from(result))))

The NullOp::Box operation (which will itself be lowered to a call to exchange_malloc) is followed by a Operand::Move operation.

At some later point (I’m not sure where exactly) an optimization eliminates this move, presumably because the source is recognized to be entirely uninitialized. But that optimization is not reliable, the test links to #58201.

A codegen test is a good way to ensure that the optimization keeps removing that move operation. But the point of Box::new_uninit is that there is no such move/copy at all in the first place.

@RalfJung

This comment has been minimized.

Copy link
Member

commented Jul 16, 2019

But the point of Box::new_uninit is that there is no such move/copy at all in the first place.

My thinking was that a codegen test would demonstrate that this is indeed the case.

But if you think the chances of anyone every accidentally changing Box::new_uninit to have a copy that needs eliding are slim, I am fine with not adding a test.

@Mark-Simulacrum

This comment has been minimized.

Copy link
Member

commented Aug 5, 2019

Here from the tracking issue for get_mut_unchecked -- would it perhaps make sense to make those APIs instead be fn (*mut Arc<T>) -> *mut T? Presumably they could then even be made safe and possibly more useful, I'm not sure. It might be easier to use such an API in a safe manner (at least per stacked borrows) since you avoid creating the &mut T at all which could be invalid.

@SimonSapin

This comment has been minimized.

Copy link
Contributor Author

commented Aug 5, 2019

I don’t understand how this would help. Making a method safe by having it return a raw pointer only moves the unsafe over to wherever the pointer will be dereferenced.

As to cases where creating the &mut T would be invalid, how could it be valid to use this method at all?

Show resolved Hide resolved src/liballoc/rc.rs Outdated
@RalfJung

This comment has been minimized.

Copy link
Member

commented Aug 5, 2019

I don't think the *mut Arc<T> helps, not sure why anyone would have a raw pointer on the outside.

Returning *mut could help if it enables code like this:

let a = Arc::new(0);
let r1 = Arc::get_mut_unchecked();
let r2 = Arc::get_mut_unchecked();
// Two aliasing raw pointers, no problem.
*r1 = 5;
assert_eq!(*r2, 5);

To make this work with Stacked Borrows, Arc would internally have to be careful to use raw pointers, not references.

Show resolved Hide resolved src/liballoc/rc.rs Outdated
Show resolved Hide resolved src/liballoc/rc.rs
Show resolved Hide resolved src/liballoc/rc.rs Outdated

@SimonSapin SimonSapin force-pushed the SimonSapin:new_uninit branch from 85555be to 7a641f7 Aug 16, 2019

Show resolved Hide resolved src/liballoc/rc.rs Outdated
@RalfJung
Copy link
Member

left a comment

Mostly just doc nits, but I think one existing method should be renamed to account for the new way in which it is used.

Show resolved Hide resolved src/liballoc/sync.rs Outdated
Show resolved Hide resolved src/liballoc/rc.rs Outdated
Show resolved Hide resolved src/liballoc/rc.rs Outdated
Show resolved Hide resolved src/liballoc/boxed.rs Outdated
Show resolved Hide resolved src/liballoc/boxed.rs Outdated
Show resolved Hide resolved src/liballoc/sync.rs Outdated
Show resolved Hide resolved src/liballoc/sync.rs Outdated
Show resolved Hide resolved src/liballoc/sync.rs Outdated
Show resolved Hide resolved src/libcore/ptr/unique.rs Outdated
Show resolved Hide resolved src/liballoc/rc.rs Outdated
@RalfJung

This comment has been minimized.

Copy link
Member

commented Aug 17, 2019

@SimonSapin you know you can batch all suggested changes into one commit? Go to https://github.com/rust-lang/rust/pull/62451/files for that.

@SimonSapin

This comment has been minimized.

Copy link
Contributor Author

commented Aug 17, 2019

Ah, is going to that view what I need to do to have that button not be disabled? Good to know for next time. I’ll squash these.

Doc nits
Co-Authored-By: Ralf Jung <post@ralfj.de>

@SimonSapin SimonSapin force-pushed the SimonSapin:new_uninit branch from faec738 to 54cfe3e Aug 17, 2019

@SimonSapin SimonSapin force-pushed the SimonSapin:new_uninit branch from 54cfe3e to b79ce1b Aug 17, 2019

@RalfJung
Copy link
Member

left a comment

r=me with the last nit fixed.

Show resolved Hide resolved src/libcore/ptr/unique.rs Outdated
Doc nit
Co-Authored-By: Ralf Jung <post@ralfj.de>
@SimonSapin

This comment has been minimized.

Copy link
Contributor Author

commented Aug 17, 2019

@bors r=RalfJung

@bors

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2019

📌 Commit 9bd7083 has been approved by RalfJung

Centril added a commit to Centril/rust that referenced this pull request Aug 17, 2019

Rollup merge of rust-lang#62451 - SimonSapin:new_uninit, r=RalfJung
Add APIs for uninitialized Box, Rc, and Arc. (Plus get_mut_unchecked)

Assigning `MaybeUninit::<Foo>::uninit()` to a local variable is usually free, even when `size_of::<Foo>()` is large. However, passing it for example to `Arc::new` [causes at least one copy](https://youtu.be/F1AquroPfcI?t=4116) (from the stack to the newly allocated heap memory) even though there is no meaningful data. It is theoretically possible that a Sufficiently Advanced Compiler could optimize this copy away, but this is [reportedly unlikely to happen soon in LLVM](https://youtu.be/F1AquroPfcI?t=5431).

This PR proposes two sets of features:

* Constructors for containers (`Box`, `Rc`, `Arc`) of `MaybeUninit<T>` or `[MaybeUninit<T>]` that do not initialized the data, and unsafe conversions to the known-initialized types (without `MaybeUninit`). The constructors are guaranteed not to make unnecessary copies.

* On `Rc` and `Arc`, an unsafe `get_mut_unchecked` method that provides `&mut T` access without checking the reference count. `Arc::get_mut` involves multiple atomic operations whose cost can be non-trivial. `Rc::get_mut` is less costly, but we add `Rc::get_mut_unchecked` anyway for symmetry with `Arc`.

  These can be useful independently, but they will presumably be typical when the new constructors of `Rc` and `Arc` are used.

  An alternative with a safe API would be to introduce `UniqueRc` and `UniqueArc` types that have the same memory layout as `Rc` and `Arc` (and so zero-cost conversion to them) but are guaranteed to have only one reference. But introducing entire new types feels “heavier” than new constructors on existing types, and initialization of `MaybeUninit<T>` typically requires unsafe code anyway.

Summary of new APIs (all unstable in this PR):

```rust
impl<T> Box<T> { pub fn new_uninit() -> Box<MaybeUninit<T>> {…} }
impl<T> Box<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Box<T> {…} }
impl<T> Box<[T]> { pub fn new_uninit_slice(len: usize) -> Box<[MaybeUninit<T>]> {…} }
impl<T> Box<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Box<[T]> {…} }

impl<T> Rc<T> { pub fn new_uninit() -> Rc<MaybeUninit<T>> {…} }
impl<T> Rc<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Rc<T> {…} }
impl<T> Rc<[T]> { pub fn new_uninit_slice(len: usize) -> Rc<[MaybeUninit<T>]> {…} }
impl<T> Rc<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Rc<[T]> {…} }

impl<T> Arc<T> { pub fn new_uninit() -> Arc<MaybeUninit<T>> {…} }
impl<T> Arc<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Arc<T> {…} }
impl<T> Arc<[T]> { pub fn new_uninit_slice(len: usize) -> Arc<[MaybeUninit<T>]> {…} }
impl<T> Arc<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Arc<[T]> {…} }

impl<T: ?Sized> Rc<T> { pub unsafe fn get_mut_unchecked(this: &mut Self) -> &mut T {…} }
impl<T: ?Sized> Arc<T> { pub unsafe fn get_mut_unchecked(this: &mut Self) -> &mut T {…} }
```

bors added a commit that referenced this pull request Aug 17, 2019

Auto merge of #63671 - Centril:rollup-zufavt5, r=Centril
Rollup of 5 pull requests

Successful merges:

 - #62451 (Add APIs for uninitialized Box, Rc, and Arc. (Plus get_mut_unchecked))
 - #63487 (Remove meaningless comments in src/test)
 - #63657 (Crank up invalid value lint)
 - #63667 (resolve: Properly integrate derives and `macro_rules` scopes)
 - #63669 (fix typos in mir/interpret)

Failed merges:

r? @ghost
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.