Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for uninitialized constructors for Box, Rc, Arc #63291

Open
SimonSapin opened this issue Aug 5, 2019 · 35 comments
Open

Tracking issue for uninitialized constructors for Box, Rc, Arc #63291

SimonSapin opened this issue Aug 5, 2019 · 35 comments

Comments

@SimonSapin
Copy link
Contributor

@SimonSapin SimonSapin commented Aug 5, 2019

Assigning MaybeUninit::<Foo>::uninit() to a local variable is usually free, even when size_of::<Foo>() is large. However, passing it for example to Arc::new causes at least one copy (from the stack to the newly allocated heap memory) even though there is no meaningful data. It is theoretically possible that a Sufficiently Advanced Compiler could optimize this copy away, but this is reportedly unlikely to happen soon in LLVM.

This issue tracks constructors for containers (Box, Rc, Arc) of MaybeUninit<T> or [MaybeUninit<T>] that do not initialized the data, and unsafe conversions to the known-initialized types (without MaybeUninit). The constructors are guaranteed not to make unnecessary copies.

PR #62451 adds:

impl<T> Box<T> { pub fn new_uninit() -> Box<MaybeUninit<T>> {…} }
impl<T> Box<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Box<T> {…} }
impl<T> Box<[T]> { pub fn new_uninit_slice(len: usize) -> Box<[MaybeUninit<T>]> {…} }
impl<T> Box<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Box<[T]> {…} }

impl<T> Rc<T> { pub fn new_uninit() -> Rc<MaybeUninit<T>> {…} }
impl<T> Rc<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Rc<T> {…} }
impl<T> Rc<[T]> { pub fn new_uninit_slice(len: usize) -> Rc<[MaybeUninit<T>]> {…} }
impl<T> Rc<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Rc<[T]> {…} }

impl<T> Arc<T> { pub fn new_uninit() -> Arc<MaybeUninit<T>> {…} }
impl<T> Arc<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Arc<T> {…} }
impl<T> Arc<[T]> { pub fn new_uninit_slice(len: usize) -> Arc<[MaybeUninit<T>]> {…} }
impl<T> Arc<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Arc<[T]> {…} }

PR #66128 adds:

impl<T> Box<T> { pub fn new_zeroed() -> Box<MaybeUninit<T>> {…} }
impl<T> Arc<T> { pub fn new_zeroed() -> Arc<MaybeUninit<T>> {…} }
impl<T> Rc<T> { pub fn new_zeroed() -> Rc<MaybeUninit<T>> {…} }

Unresolved question:

  • The constructor that returns for example Box<MaybeUninit<T>> might “belong” more as an associated function of that same type, rather than Box<T>. (And similarly for other constructors.) However this would make a call like Box::<u32>::new_uninit() becomes Box::<MaybeUnint<u32>>::new_uninit() which feels unnecessarily verbose. I suspect that this turbofish will be needed in a lot of cases to appease type inference.
@TimDiekmann
Copy link
Member

@TimDiekmann TimDiekmann commented Oct 5, 2019

Just from looking at the current source code:

pub fn new_uninit() -> Box<mem::MaybeUninit<T>> {
    let layout = alloc::Layout::new::<mem::MaybeUninit<T>>();
    let ptr = unsafe {
        Global.alloc(layout)
            .unwrap_or_else(|_| alloc::handle_alloc_error(layout))
    };
    Box(ptr.cast().into())
}

When mem::size_of::<T>() == 0, a zero-sized layout is passed to Global.alloc but alloc mentions:

This function is unsafe because undefined behavior can result if the caller does not ensure that layout has non-zero size.

Did I have overseen something or is this a bug?

@SimonSapin
Copy link
Contributor Author

@SimonSapin SimonSapin commented Oct 5, 2019

Good point! I copied this pattern from Rc and Arc, but there the header (refcounts) cause the allocation to never be zero-size.

Box::new_uninit_slice has a similar bug with size_of() == 0 or len == 0.

@TimDiekmann
Copy link
Member

@TimDiekmann TimDiekmann commented Oct 5, 2019

I have noticed this when implementing Box for custom allocators with non-zero layouts. Turns out, that this is indeed useful 🙂

This is my implementation for the sliced version.

SimonSapin added a commit to SimonSapin/rust that referenced this issue Oct 6, 2019
Requesting a zero-size allocation is not allowed,
return a dangling pointer instead.

CC rust-lang#63291 (comment)
@SimonSapin
Copy link
Contributor Author

@SimonSapin SimonSapin commented Oct 6, 2019

Centril added a commit to Centril/rust that referenced this issue Oct 16, 2019
Fix zero-size uninitialized boxes

Requesting a zero-size allocation is not allowed, return a dangling pointer instead.

CC rust-lang#63291 (comment)
Centril added a commit to Centril/rust that referenced this issue Oct 19, 2019
Fix zero-size uninitialized boxes

Requesting a zero-size allocation is not allowed, return a dangling pointer instead.

CC rust-lang#63291 (comment)
Centril added a commit to Centril/rust that referenced this issue Oct 19, 2019
Fix zero-size uninitialized boxes

Requesting a zero-size allocation is not allowed, return a dangling pointer instead.

CC rust-lang#63291 (comment)
Centril added a commit to Centril/rust that referenced this issue Oct 19, 2019
Fix zero-size uninitialized boxes

Requesting a zero-size allocation is not allowed, return a dangling pointer instead.

CC rust-lang#63291 (comment)
XiangQingW added a commit to XiangQingW/rust that referenced this issue Oct 21, 2019
Requesting a zero-size allocation is not allowed,
return a dangling pointer instead.

CC rust-lang#63291 (comment)
kevgrasso added a commit to kevgrasso/rust that referenced this issue Oct 23, 2019
Requesting a zero-size allocation is not allowed,
return a dangling pointer instead.

CC rust-lang#63291 (comment)
tmandry added a commit to tmandry/rust that referenced this issue Nov 26, 2019
alloc: Add new_zeroed() versions like new_uninit().

MaybeUninit has both uninit() and zeroed(), it seems reasonable to have the same
surface on Box/Rc/Arc.

Needs tests.

cc rust-lang#63291
@Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Dec 13, 2019

What are the requirements for stabilizing the Box API?

I don't think using Rc<MaybeUninit<T>> (and Arc) is currently that useful, as Arc happens to be shared (no mutation). However, there's this pattern where one wants to create a value while it's uniquely owned and then share it later. Doing it with Box would mean copying, so I was thinking about having RcMut and ArcMut that work exactly as Box, but reserve space for reference counters, so that when you call share() on them, they just initialize ref count and convert to Rc/Arc. This is however another topic. Did anyone had this idea before or should I start discussion somewhere?

@SimonSapin
Copy link
Contributor Author

@SimonSapin SimonSapin commented Dec 13, 2019

there's this pattern where one wants to create a value while it's uniquely owned and then share it later

Yes, that’s exactly the idea with having Rc::new_uninit together with Rc::get_mut_unchecked #63292. It would also work with Rc::new_uninit + Rc::get_mut + unwrap, with an unnecessary run-time check (assuming sharing only after this).

@Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Dec 16, 2019

Great! I still think it'd be better to provide safe abstraction for it, but at least it'd be possible to do in a library.

@rickvanprim
Copy link

@rickvanprim rickvanprim commented Mar 28, 2020

Similar to new_zeroed() it would be nice to have a new_zeroed_slice().

@mahkoh
Copy link
Contributor

@mahkoh mahkoh commented Dec 26, 2020

The current documentation of assume_init for Rc allows unsound operations:

let mut rc_uninit: Rc<MaybeUninit<NonNull<()>>> = Rc::new(MaybeUninit::new(t));
let rc: Rc<NonNull<()>> = unsafe {
    // SAFETY:
    // > As with MaybeUninit::assume_init, it is up to the caller to guarantee
    //   that the inner value really is in an initialized state.
    // The contents of rc_uninit are initialized.
    rc_uninit.clone().assume_init()
};
unsafe {
    // SAFETY:
    // > Any other `Rc` or [`Weak`] pointers to the same allocation must not be
    //    dereferenced for the duration of the returned borrow.
    // The borrow only lasts for this unsafe block and no other Rc is dereferenced.
    *Rc::get_mut_unchecked(&mut rc_uninit) = MaybeUninit::uninit();
}

@Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Sep 13, 2021

I wonder if there should be more methods:

impl<T> Box<MaybeUninit<T>> {
    fn write(self, T) -> Box<T>;
}

impl<T> Arc<MaybeUninit<T>> {
    fn write(self, T) -> Arc<T>;
}

impl<T> Rc<MaybeUninit<T>> {
    fn write(self, T) -> Rc<T>;
}

As far as I understand it could optimize the code to avoid copying large T from stack if written like Box::new_uninit().write(create_value()) instead of Box::new(create_value()) without need to use unsafe.

@SimonSapin
Copy link
Contributor Author

@SimonSapin SimonSapin commented Sep 13, 2021

Yes, especially if we can ensure (with codegen tests?) that the copy is actually elided.

@jplatte
Copy link
Contributor

@jplatte jplatte commented Sep 13, 2021

@Kixunil Doesn't that work only for Box? Or is the idea for Rc / Arc that you panic or create an entirely new Rc / Arc if the strong count is not 1? (to avoid being able to invalidate the new Rc<T> by writing uninit to it the inner value again through a Rc<MaybeUninit<T>>)

@Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Sep 13, 2021

@jplatte Heh, just started implementing it and realized it.

I guess it could return Result<Rc<T>, (Rc<MaybeUninit<T>, T)>. However it seems it'd fit better into rc-box crate.

Kixunil added a commit to Kixunil/rust that referenced this issue Sep 13, 2021
This adds method similar to `MaybeUninit::write` main difference being
it returns owned `Box`. This can be used to elide copy from stack
safely, however it's not currently tested that the optimization actually
occurs.

Analogous methods are not provided for `Rc` and `Arc` as those need to
handle the possibility of sharing. Some version of them may be added in
the future.

This was discussed in rust-lang#63291 which this change extends.
Kixunil added a commit to Kixunil/rust that referenced this issue Sep 13, 2021
This adds method similar to `MaybeUninit::write` main difference being
it returns owned `Box`. This can be used to elide copy from stack
safely, however it's not currently tested that the optimization actually
occurs.

Analogous methods are not provided for `Rc` and `Arc` as those need to
handle the possibility of sharing. Some version of them may be added in
the future.

This was discussed in rust-lang#63291 which this change extends.
Kixunil added a commit to Kixunil/rust that referenced this issue Sep 13, 2021
This adds method similar to `MaybeUninit::write` main difference being
it returns owned `Box`. This can be used to elide copy from stack
safely, however it's not currently tested that the optimization actually
occurs.

Analogous methods are not provided for `Rc` and `Arc` as those need to
handle the possibility of sharing. Some version of them may be added in
the future.

This was discussed in rust-lang#63291 which this change extends.
@kornelski
Copy link
Contributor

@kornelski kornelski commented Oct 4, 2021

There's new_uninit_slice, and try_new_uninit (#32838), but I'm missing a combination of the two: try_new_uninit_slice for fallible allocation of Arc<[_]>.

@Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Oct 25, 2021

One more thing I realized could be useful:

impl<T> Box<T> {
    fn take(this: Self) -> (T, Box<MaybeUninit<T>>) { ... }
}

This would allow one to reuse allocation. (Would be even cooler to allow casting between same-sized T, U within Box<MaybeUninit<T>>)

This is currently used in std as I found when implementing Box::write() and I guess it is more widely applicable.

@WaffleLapkin
Copy link
Contributor

@WaffleLapkin WaffleLapkin commented Oct 25, 2021

@Kixunil I once had an idea of having an Allocation type that would allow reusing allocations. I had even sketched up something like this:

pub struct Allocation {
    ptr: NonNull<u8>,
    layout: Layout
}

// Some methods to convert Vec/Box/etc -> Allocation and back (if the layout matches)

Not sure if this is a good idea, but it surely allows "casting between same-sized (and aligned) T, U", so I'm just throwing it here :)

@Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Oct 25, 2021

@WaffleLapkin that one seems to be dynamic which I'm unsure if it's applicable for Box. Maybe there are some interesting cases when it's useful though. Can't think of any right now.

Kixunil added a commit to Kixunil/rust that referenced this issue Dec 2, 2021
This adds method similar to `MaybeUninit::write` main difference being
it returns owned `Box`. This can be used to elide copy from stack
safely, however it's not currently tested that the optimization actually
occurs.

Analogous methods are not provided for `Rc` and `Arc` as those need to
handle the possibility of sharing. Some version of them may be added in
the future.

This was discussed in rust-lang#63291 which this change extends.
Kixunil added a commit to Kixunil/rust that referenced this issue Dec 2, 2021
This adds method similar to `MaybeUninit::write` main difference being
it returns owned `Box`. This can be used to elide copy from stack
safely, however it's not currently tested that the optimization actually
occurs.

Analogous methods are not provided for `Rc` and `Arc` as those need to
handle the possibility of sharing. Some version of them may be added in
the future.

This was discussed in rust-lang#63291 which this change extends.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 3, 2021
…tolnay

Implement write() method for Box<MaybeUninit<T>>

This adds method similar to `MaybeUninit::write` main difference being
it returns owned `Box`. This can be used to elide copy from stack
safely, however it's not currently tested that the optimization actually
occurs.

Analogous methods are not provided for `Rc` and `Arc` as those need to
handle the possibility of sharing. Some version of them may be added in
the future.

This was discussed in rust-lang#63291 which this change extends.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Dec 3, 2021
…tolnay

Implement write() method for Box<MaybeUninit<T>>

This adds method similar to `MaybeUninit::write` main difference being
it returns owned `Box`. This can be used to elide copy from stack
safely, however it's not currently tested that the optimization actually
occurs.

Analogous methods are not provided for `Rc` and `Arc` as those need to
handle the possibility of sharing. Some version of them may be added in
the future.

This was discussed in rust-lang#63291 which this change extends.
@Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Dec 4, 2021

@SimonSapin regarding the unresolved question I just realized there's a precedent: Box::<T>::pin() returns Pin<Box<T>> - it's not a method of Pin<Box<T>>. I agree that it's less annoying to do this way even though a bit surprising for newbies and I think it's net-benefit. However, re-reading this thread, @m-ou-se mentioned people at meeting thought differently. I wonder if the argument about Pin was brought up/if it's worth reconsidering.

Arc/Rc of MaybeUninit seem a bit awkward to work with, so maybe there's a better way for the cases where these apis would be useful.

Is this meant to imply that writing these APIs (RcMut) is desired and required for this issue to progress? I think I'd find it a fun side project to work on if yes. :)

OT: my PR to add write() method was merged. 🎉

@SimonSapin
Copy link
Contributor Author

@SimonSapin SimonSapin commented Jan 9, 2022

If awkward initialization of {Arc,Rc}<MaybeUninit<_>> is the remaining blocker, I think we should split up this feature to let Box APIs make progress separately.

As to RcMut, that seems like a big enough feature with enough design space that experimenting on crates.io would be good. Or survey what crates already exist, I vaguely recall there’s at least one.

@WaffleLapkin
Copy link
Contributor

@WaffleLapkin WaffleLapkin commented Jan 9, 2022

There is rc_box

@SimonSapin
Copy link
Contributor Author

@SimonSapin SimonSapin commented Jan 9, 2022

@Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Jan 9, 2022

Is the design space that large though? We can pretty much copy Box methods and add efficient conversions to Rc/Arc. I don't see a reason why would it be different given that it is also an owning pointer - the only difference being the layout.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet