Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to safely initialize an array of Vec with MaybeUninit doc example #54542

Closed
leonardo-m opened this issue Sep 24, 2018 · 18 comments
Closed

How to safely initialize an array of Vec with MaybeUninit doc example #54542

leonardo-m opened this issue Sep 24, 2018 · 18 comments
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools P-medium Medium priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@leonardo-m
Copy link

In this page:
https://doc.rust-lang.org/nightly/std/mem/union.MaybeUninit.html

I suggest to add an equivalent example of "how to safely initialize an array of Vecs" using MaybeUninit instead of uninitialized, if possible:

https://doc.rust-lang.org/nightly/std/mem/fn.uninitialized.html

@cramertj
Copy link
Member

What would the example look like? Something like this?

let arr: [MaybeUninit<Vec<u8>>; 5] = [MaybeUninit::uninitialized(); 5];
for v in &mut arr {
    v.set(vec![]);
}
let arr: [Vec<u8>; 5] = mem::transmute(arr);

cc @RalfJung

@leonardo-m
Copy link
Author

As arr array length is better to use 50 instead of 5, because Default:default goes up to about 30.

@frewsxcv frewsxcv added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools labels Sep 25, 2018
@FenrirWolf
Copy link
Contributor

One gotcha is that the example doesn't even work as-is because MaybeUninit isn't Copy:

https://play.rust-lang.org/?gist=1d2d5fa2a49bbbbceb8a3b113587f87a&version=nightly&mode=debug&edition=2015

If MaybeUninit is intended to be compatible with arrays then maybe that was overlooked?

@mbrubeck
Copy link
Contributor

mbrubeck commented Sep 27, 2018

In 7c37c6d#diff-322c42fd5e8ec8e8f8093261d59a5782R87, @eddyb and @RalfJung mention that MaybeUninit “distributes over products” such as arrays, so you could instead do something like:

let mut arr: MaybeUninit<[Vec<u8>; 5]> = MaybeUninit::uninitialized();
for i in 0..5 {
    ptr::write(&mut arr.get_mut()[i], vec![]);
}
let arr: [Vec<u8>; 5] = arr.into_inner();

...if it is safe to get a pointer or reference to the uninitialized inner value or its fields/elements, as long as this pointer isn't used to read from uninitialized portions of the value. It would only be safe to read from an element after it is initialized, and only safe to read from the whole (e.g. using into_inner) when all the elements are initialized.

Note that the expression arr.get_mut() above creates an &mut T that points to an uninitialized T, so this can only work if we decide that such references are “unsafe but valid” (i.e., not instant UB). Also it may be a bit risky passing this to the safe Index::index function, so a more conservative version might use only raw pointers:

    (arr.as_mut_ptr() as *mut Vec<u8>).offset(i).write(vec![]);

@FenrirWolf
Copy link
Contributor

So would this be a valid example of how to initialize an array of Vecs then?

#![feature(maybe_uninit)]

use std::mem::{self, MaybeUninit};

fn main() {
    let mut arr: [MaybeUninit<Vec<u8>>; 50] = {
        // The compiler complains about `Copy` bounds if we try to directly
        // create a `[MaybeUninit<Vec<u8>>; 50]`, so we use the fact that
        // MaybeUninit<[T; N]> == [MaybeUninit<T>; N]  to work around that
        let arr: MaybeUninit<[Vec<u8>; 50]> = MaybeUninit::uninitialized();
        unsafe { mem::transmute(arr) }
    };

    // Initialize the elements one at a time. The `Iterator` trait isn't
    // implemented for arrays over size 32 (integer generics when!?),
    // so let's use good ol' indexing instead.
    for idx in 0..arr.len() {
        arr[idx].set(vec![]);
    }

    // Now that everything is initialized (and they'd better be fully
    // initialized or things WILL blow up here), we can transmute to
    // a fully valid array of Vecs
    let arr: [Vec<u8>; 50] = unsafe { mem::transmute(arr) };

    // `Debug` is also unimplemented for arrays of this size, so let's
    // index-loop again to show our completed work!
    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}

https://play.rust-lang.org/?gist=95140594f74290a294a7cea06d37c597&version=nightly&mode=release&edition=2015

@leonardo-m
Copy link
Author

Is it worth adding to the Rust std lib a helper macro like this, to simplify the creation of arrays of noncopy values?

https://crates.io/crates/array-init

@mbrubeck
Copy link
Contributor

mbrubeck commented Sep 27, 2018

So would this be a valid example of how to initialize an array of Vecs then?

My code from above was meant to be an entire solution, without any transmuting necessary. Here's a complete example program:

use std::mem::MaybeUninit;

fn main() {
    const N: usize = 5;

    let arr: [Vec<u8>; N] = unsafe {
        let mut arr = MaybeUninit::uninitialized();
        for i in 0..N {
            (arr.as_mut_ptr() as *mut Vec<u8>).add(i).write(vec![]);
        }
        arr.into_inner()
    };

    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}

@FenrirWolf
Copy link
Contributor

Ah nice, that's way more concise.

@RalfJung
Copy link
Member

@mbrubeck One extremely subtle point about your code is the Nice to have no transmute, though as *mut Vec<u8> is pretty much the same thing. But yeah, I agree that is correct. Might be nicer to do the arr.as_mut_ptr() as *mut Vec<u8> only once? Also I think arr should have a type ascribed, inference is not entirely trivial here.

use std::mem::MaybeUninit;

fn main() {
    const N: usize = 5;

    let arr: [Vec<u8>; N] = unsafe {
        let mut arr: MaybeUninit<[Vec<u8>; N]> = MaybeUninit::uninitialized();
        let arr_ptr = arr.as_mut_ptr() as *mut Vec<u8>; // pointer to 1st element
        for i in 0..N {
            arr_ptr.add(i).write(vec![]);
        }
        arr.into_inner()
    };

    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}

@steveklabnik steveklabnik added the P-medium Medium priority label Dec 27, 2018
@DevQps
Copy link
Contributor

DevQps commented Mar 30, 2019

@mbrubeck One extremely subtle point about your code is the Nice to have no transmute, though as *mut Vec<u8> is pretty much the same thing. But yeah, I agree that is correct. Might be nicer to do the arr.as_mut_ptr() as *mut Vec<u8> only once? Also I think arr should have a type ascribed, inference is not entirely trivial here.

use std::mem::MaybeUninit;

fn main() {
    const N: usize = 5;

    let arr: [Vec<u8>; N] = unsafe {
        let mut arr: MaybeUninit<[Vec<u8>; N]> = MaybeUninit::uninitialized();
        let arr_ptr = arr.as_mut_ptr() as *mut Vec<u8>; // pointer to 1st element
        for i in 0..N {
            arr_ptr.add(i).write(vec![]);
        }
        arr.into_inner()
    };

    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}

I think the ascribed type is also very helpful!

When I try to run this on the Playground I get an error however:

error[E0599]: no method named `into_inner` found for type `std::mem::MaybeUninit<[std::vec::Vec<u8>; 5]>` in the current scope

When I try to run this on my own machine with nightly-1.35.0 I get the same error. Am I missing something?

Ultimately I think we should somehow create a good safe abstraction for this. I passed this to the security workgroup to see if there is something we can come up with.

@RalfJung
Copy link
Member

The API has changed, the following should work now:

#![feature(maybe_uninit)]
use std::mem::MaybeUninit;

fn main() {
    const N: usize = 5;

    let arr: [Vec<u8>; N] = unsafe {
        let mut arr: MaybeUninit<[Vec<u8>; N]> = MaybeUninit::uninit();
        let arr_ptr = arr.as_mut_ptr() as *mut Vec<u8>; // pointer to 1st element
        for i in 0..N {
            arr_ptr.add(i).write(vec![]);
        }
        arr.assume_init()
    };

    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}

But that's basically a hack. Better interaction of MaybeUninit with Box/arrays/slices is definitely on the roadmap, but let's first land the basics.

@eddyb
Copy link
Member

eddyb commented Mar 30, 2019

Isn't it simpler and safer to turn &mut MaybeUninit<[Vec<u8>; N]> into &mut [MaybeUninit<Vec<u8>>; N] and iterate that normally, instead of iterating with raw pointers?

@RalfJung
Copy link
Member

RalfJung commented Mar 30, 2019

I am not saying the above is the API we should use.^^

Probably it makes most sense to start with [MaybeUninit<Vec<u8>>; N] created by uninitialized_array![Vec<u8>; N]. But then we need a good way to go from that to [Vec<u8>; N]. We cannot write that as a generic function yet, unfortunately (nor your array pointer conversion function, FWIW).

@eddyb
Copy link
Member

eddyb commented Mar 30, 2019

So, here's a trick...
&mut MaybeUninit<[T; N]> should coerce to &mut MaybeUninit<[T]> (do we have to solve union / custom DST unresolved questions for this?)

Then you don't need const generics to turn it into &mut [MaybeUninit<T>].

Btw, we should just make [MaybeUninit::uninit(); N] and even [vec![]; N] work in the NLL borrowck, we were waiting for Rust 2018 to ship, but it should be easy now.

We can even bypass const fn promotability for vec![] by special-casing it to use a perma-unstable Vec::EMPTY associated const! (but Vec::new() working would be nice)

cc @oli-obk (all we have to do is promote Rvalue::Repeat's element, where possible, and move the Copy check from typeck to old & new borrowck)

There is an issue open for this, I think, but I can't find it easily right now.

@RalfJung
Copy link
Member

I suggest to add an equivalent example of "how to safely initialize an array of Vecs"

The docs have been expanded to contain such an example:

use std::mem::{self, MaybeUninit};
use std::ptr;

let data = {
    // Create an uninitialized array of `MaybeUninit`. The `assume_init` is
    // safe because the type we are claiming to have initialized here is a
    // bunch of `MaybeUninit`s, which do not require initialization.
    let mut data: [MaybeUninit<Vec<u32>>; 1000] = unsafe {
        MaybeUninit::uninit().assume_init()
    };

    // Dropping a `MaybeUninit` does nothing, so if there is a panic during this loop,
    // we have a memory leak, but there is no memory safety issue.
    for elem in &mut data[..] {
        unsafe { ptr::write(elem.as_mut_ptr(), vec![42]); }
    }

    // Everything is initialized. Transmute the array to the
    // initialized type.
    unsafe { mem::transmute::<_, [Vec<u32>; 1000]>(data) }
};

assert_eq!(&data[0], &[42]);

@iago-lito
Copy link
Contributor

@RalfJung This pattern is a way to avoid unnecessary initialization of the array. And it is crystal clear :)

In a context where unnecessary initialization is not wanted, it is likely that unnecessary moves are also unwanted. Is it somehow guaranteed by the compiler that returning unsafe { mem::transmute::(...) } from the block in this case will not copy all bytes to another location owned by the new variable data? If not, how to ensure that results will not be moved around?

Should this be more explicit in this example of the doc, or is it the wrong place to bring up a move-elision-related topic?

@RalfJung
Copy link
Member

RalfJung commented Apr 8, 2020

AFAIK Rust has no guaranteed move elision, and I am afraid I don't know how well the optimizer can elide such moves in practice. This is below the layers of Rust that I am familiar with.

@RalfJung
Copy link
Member

@iago-lito I think the feature you are asking for was also requested at #61011.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools P-medium Medium priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants