Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upRFC: Direct and Partial Initialization using &uninit T #2534
Conversation
This comment has been minimized.
This comment has been minimized.
|
Earlier conversation can be found in the Pre-RFC |
Centril
added
the
T-lang
label
Aug 29, 2018
Centril
changed the title
Initial Commit for Write Pointer
RFC: Partial Initialization and Write Pointers using &out T and &uninit T
Aug 29, 2018
This comment has been minimized.
This comment has been minimized.
newpavlov
commented
Aug 29, 2018
•
|
I think it's better to write a separate RFC for EDIT: |
KrishnaSannasi
changed the title
RFC: Partial Initialization and Write Pointers using &out T and &uninit T
RFC: Write Pointers for Direct and Partial Initialization using &out T and &uninit T
Aug 29, 2018
This comment has been minimized.
This comment has been minimized.
|
Did you mean |
KrishnaSannasi
changed the title
RFC: Write Pointers for Direct and Partial Initialization using &out T and &uninit T
RFC: Write References for Direct and Partial Initialization using &out T and &uninit T
Aug 29, 2018
This comment has been minimized.
This comment has been minimized.
|
Can you add some discussion of the advantages of |
This comment has been minimized.
This comment has been minimized.
|
This example is broken, see my comment below for the udpated version #[derive(Clone, Copy)]
pub struct Outer {
pub public_field: u32
private_field: u32
}
impl Outer {
pub fn new(private_field: u32) -> Self {
Outer { public_field: 0, private_field }
}
}
// The code below lives in a different crate
// relevant imports here
let outer = Outer::new(20);
foo(&out outer);
let outer = MaybeUninit::new(Outer::new(30));
bar(&mut outer);
fn foo(outer: &out Outer) {
// println!("{}", outer.should_only_write); // Compile time error
outer.should_only_write = 20; // Fine, because we only write
}
fn bar(outer: &mut MaybeUninit<Outer>) {
outer.set(
// what to put in here?
)
}
This example had a few problems, see my below for the updated version. |
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
@KrishnaSannasi Reading values from a |
This comment has been minimized.
This comment has been minimized.
|
@cramertj How about values on the heap? (Such as in a |
This comment has been minimized.
This comment has been minimized.
|
The process works no differently on the heap, but the specific API for placement-new inside of
|
This comment has been minimized.
This comment has been minimized.
|
A impl<T> Vec<T> {
fn emplace_back(put: impl FnOnce(&uninit T)) {
// ... allocate magic ...
put(&uninit end);
}
}and we could have some-placement-new like sugar for ease of use, but that's not necessary. |
This comment has been minimized.
This comment has been minimized.
|
@KrishnaSannasi Yeah, that works out nicely. A similar thing using pub struct Uninit<'a, T, Tag> {
location: &'a mut MaybeUninit<T>,
marker: PhantomData<Tag>,
}
pub struct UninitFilled<Tag>(PhantomData<Tag>);
impl<'a, T, Tag> Uninit<'a, T, Tag> {
pub fn new(location: &'a mut MaybeUninit<T>) -> Self {
...
}
#[inline(always)]
pub fn fill(self, value: T) -> UninitFilled<UniqToken> {
self.location.set(value);
UninitFilled(PhantomData)
}
}
impl<T> Vec<T> {
pub fn emplace_back(put: impl for<Tag> FnOnce(Uninit<'_, T, Tag>) -> UninitFilled<Tag>) {
...
}
}The |
This comment has been minimized.
This comment has been minimized.
|
@cramertj Cool, why do we need Also, how about this: #[derive(Clone, Copy)]
pub struct Outer {
pub public_field: u32
private_field: u32
}
impl Outer {
pub fn new(private_field: u32) -> Self {
Outer {
public_field: 0,
private_field
}
}
}
#[repr(transparent)]
struct MaybeUninitWrite<'a, T: 'a>(&mut MaybeUninit<T>);
impl<'a, T: 'a> MaybeUninitWrite<'a, T> {
pub fn new(inner: &'a mut MaybeUninit) -> Self {
MaybeUninitWrite(inner)
}
pub fn set(&mut self, value: T) {
(self.0).set(value);
}
}
// The code below lives in a different crate
// relevant imports here
let outer = Outer::new(20);
foo(&out outer);
let mut outer = MaybeUninit::new(Outer::new(30));
let outer = MaybeUninitWrite::new(&mut outer);
bar(&mut outer);
let mut outer = MaybeUninit::new(Outer::new(40));
unsafe { baz(&mut outer); }
// These functions should not be able to read any value in Outer. It could cause a segfault.
// (obviously this code won't but we can imagine code very similiar to this
// that might actually segfault, using hw interfaces managed by the OS)
// This enforces that contract statically
fn foo(outer: &out Outer) {
// println!("{}", outer.public_field); // Compile time error
outer.public_field = 20; // Fine, because we only write
}
// This cannot do the same thing as foo
fn bar(outer: &mut MaybeUninitWrite<Outer>) {
// not possible to change public_field on outer
}
// This cannot enforce a write-only contract,
// and might segfualt if outer holds an uninitialized value
unsafe fn baz(outer: &mut MaybeUninit<Outer>) {
outer.get_mut().public_field = 20;
}Doing something this simple, should not require unsafe code, or complex type usage. |
Centril
assigned
cramertj
Aug 30, 2018
This comment has been minimized.
This comment has been minimized.
|
Another idea for the bounds of The downside to this approach is that we add a new trait to the standard library, and adding a |
This comment has been minimized.
This comment has been minimized.
bjorn3
commented
Sep 5, 2018
•
Why not use a lifetime, instead of a type as tag? |
This comment has been minimized.
This comment has been minimized.
|
@bjorn3 It's possible that an invariant lifetime would be sufficient (so that the user couldn't create a |
This comment has been minimized.
This comment has been minimized.
YaLTeR
commented
Sep 6, 2018
|
In the examples with vector |
This comment has been minimized.
This comment has been minimized.
|
@YaLTeR You would be forgetting the pointer, not the underlying value. In the closure case, you would not be initializing the value, so it would break this rule and get a compiler error
and you cannot return a
I put this rule in for the exact reason you pointed out. If we create an uninitialized value in a That way, only
How would you define an invariant lifetime, using references will always give a covariant lifetime, and there doesn't seem to be another way to constrain lifetimes, if I'm understanding the nomicon section of variance correctly. |
This comment has been minimized.
This comment has been minimized.
ExpHP
commented
Sep 9, 2018
This assumes the compiler can identify that ...and this is certainly something it can do; any function with the signature |
This comment has been minimized.
This comment has been minimized.
derekdreery
commented
Sep 10, 2018
|
Would this help to solve the problem where only part of a struct has defaults? e.g. struct Person {
given_name: String, // no default
family_name: String, // no default
other_names: Vec<String>, // default = Vec::new()
email: Option<String>, // default = None
//...
}There could be some function that returns an Would this be possible? |
This comment has been minimized.
This comment has been minimized.
|
@ExpHP Yes, that sounds right. @derekdreery That would require global knowledge of the entire program in the worst case, and that is undesirable, so no, this will not solve that problem. This is fine for the compiler to do, but the human programming would find this confusing, and hard to reason about. This also makes changing the implementation of a function a breaking change, which is also undesirable. |
Centril
added
A-uninit
A-typesystem
A-expressions
A-syntax
I-nominated
labels
Nov 22, 2018
This comment has been minimized.
This comment has been minimized.
|
@rfcbot fcp postpone Per my previous comment, I am moving to postpone this RFC. My reasoning remains unchanged:
(Per that same comment, we are still working through the NLL and Polonius transition and will be for some time.) |
This comment has been minimized.
This comment has been minimized.
rfcbot
commented
Dec 20, 2018
•
|
Team member @nikomatsakis has proposed to postpone this. The next step is review by the rest of the tagged team members:
No concerns currently listed. Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Centril
removed
the
I-nominated
label
Dec 20, 2018
fstirlitz
reviewed
Dec 31, 2018
| # Unresolved questions | ||
| [unresolved-questions]: #unresolved-questions | ||
|
|
||
| - What is the correct bounds for `T` in `&out T`? conservatively, `T: Copy` works and is sound. But as [@TechnoMancer](https://internals.rust-lang.org/u/TechnoMancer) pointed out, `T: !Drop` is not the correct bound. For example, `Wrapper(Vec<String>)`, clearly cannot be overwritten safely, because `Vec<String>`, must drop do deallocate the `Vec`, but `Wrapper` itself does not implement drop. Therefore either a new trait is needed (but unwanted), or we must keep the `Copy` bound. |
This comment has been minimized.
This comment has been minimized.
fstirlitz
Dec 31, 2018
Failing to call destructors is safe; better find another word. This is mostly a technicality, however; resource leaks are still something Rust would rather avoid, and I agree not accounting for this here at all would make them too easy.
But besides this, I've been long thinking that Drop needs an overhaul. I'd replace it with something which looks like this:
trait Drop { /* ... */ }: same interface as beforeunsafe trait Destruct: Drop {}: marker trait, assures that values of a given type can be disposed of by merely callingDropimplementations of their constituent fields; the type has no disposal logic of its own and can be safely destructured (this corresponds to what is now informally expressed as!Drop)unsafe auto trait Forget: Destruct {}: marker trait, assures that allDropimplementations of transitively constituent fields are no-ops; in other words,mem::dropon the type is completely equivalent tomem::forget.
Now, automatic deriving behaviour:
- for any given type
T, unless the containing module provides an explicitDropimplementation, the compiler automatically derives a no-opDropandDestructforT; - if all constituent fields are also
Forget, the compiler derivesForgetas well.
This system would remove a major motivation for introducing negative traits/bounds (which have been pointed out to be rather problematic, especially in the context of automatic impls for !), as it would create a way to formalise 'trivial disposal behaviour' in the type system without them; in this case, the correct bound would be Forget. It would also open the doors to adding true linear typing to Rust: those could simply explicitly opt-out of implementing Drop. In other words, lacking an impl Drop would no longer mean 'trivial disposal logic', but 'no automatic disposal at all'.
However, I'm not sure how backwards compatible this would be. Obviously, introducing true linear typing would incur some major changes.
This comment has been minimized.
This comment has been minimized.
KrishnaSannasi
Dec 31, 2018
Author
The Drop overhaul sounds nice on paper, I would still still want some form of negative trait impls, because there are some APIs to need this to be work at all. But that is beside the point. The Drop overhaul doesn't fit in with this RFC, and you could post it to internals if you want more feedback.
I did note that failing to call destructors is safe as the very next sub bullet. This is an unresolved question because I wasn't sure if we wanted to allow leaks this way.
This comment has been minimized.
This comment has been minimized.
graydon
commented
Jan 12, 2019
|
Strong oppose. Cognitive load and risk of writing wrong code is high, motivation is marginal. |
This comment has been minimized.
This comment has been minimized.
@graydon Could you perhaps elaborate with examples of where it could go wrong and examples of the risks? |
This comment has been minimized.
This comment has been minimized.
|
@graydon I'm currently in the middle of stripping down this proposal. No more I can see how it can be a large cognitive load, but dealing with uninitialized memory is hard and so the rules around
struct Out<'a, T> { ptr: &'a mut T }
impl<'a, T> Out<'a, T> {
// to set without dropping old value
fn set(&mut self, value: T) { unsafe { std::ptr::write(self.ptr, value); } }
fn reborrow<'b>(out: &mut Self) where 'a: 'b { Out { ptr: out.ptr } }
}
impl<'a, T> From<&'a mut T> for Out<'a, T> {
fn from(ptr: &'a mut T) -> Self {
Self { ptr }
}
}edit: I know this isn't exactly what I put out for |
KrishnaSannasi
added some commits
Jan 12, 2019
KrishnaSannasi
changed the title
RFC: Write References for Direct and Partial Initialization using &out T and &uninit T
RFC: Direct and Partial Initialization using &uninit T
Jan 12, 2019
This comment has been minimized.
This comment has been minimized.
graydon
commented
Jan 12, 2019
Users obtaining pointers to uninitialized memory casually because "it's faster", which they then convert to a raw pointer. Presently casually taking a ref to an uninit variable gets you an error; you have to work somewhat to get yourself some uninit memory. We worked hard to paint uninitialized memory into a very small corner in this language you have to do some ceremony to get at. Now it'll be a chapter heading in a book that's very easy to see, learn, get interested in, tinker with, use in your own libraries. Additional unclear risk: drop semantics. Drops execute top-down, so this has to not work on drop types. The proposal isn't entirely clear if it's going to try to allow that. Cognitive load I think is obvious: it's a whole new qualifier with a complex set of rules that people will naturally sprinkle all over codebases, especially if it's associated with speed. We worked super hard to minimize the number of qualifiers. It looks like it can all be done with a library type, so it should be. It's also a new state of initialized-ness for users to think about, for every memory cell. Currently the user's mental model does not include (unless they're off in unsafe land) the initialized-ness of the substructure of a struct. Now it will. |
This comment has been minimized.
This comment has been minimized.
SoniEx2
commented
Jan 12, 2019
|
PITs are supposed to put stricter requirements on this thing. a pointer to a PIT is valid, but is a pointer to a PIT, not to a full object, because PITs encode initialization state in the type. you can still access uninitialized or invalid memory by going through usize but we have that issue with many existing types anyway. |
This comment has been minimized.
This comment has been minimized.
I am not convinced by this argument, to do anything with with raw pointers you need let x: String = unsafe { std::mem::unintialized() };
let y: *mut _ = &mut x; // raw pointer to uninit memoryBut now that you have a raw pointer to uninit memory you need to do pretty much all the same work that I have laid out to make sure that you don't do anything unsound with it. This is cognitive load that could be shifted onto the compiler to have it validate our program. Which is what this proposal is trying to do
Currently this will be allowed, because it logically identical to storing the data in a bunch of temporaries, then constructing the
I agree that this is a large cognitive load, but I would like to make dealing with uninitialized code safe. Also dealing with uninitialized code is entirely local, with absolutely no global analysis, so it shouldn't be too bad. |
This comment has been minimized.
This comment has been minimized.
|
FWIW, my inclination would be to close, but I can live with "postpone" for now. @rfcbot reviewed |
rfcbot
added
final-comment-period
and removed
proposed-final-comment-period
labels
Jan 18, 2019
This comment has been minimized.
This comment has been minimized.
rfcbot
commented
Jan 18, 2019
|
|
rfcbot
added
the
finished-final-comment-period
label
Jan 28, 2019
This comment has been minimized.
This comment has been minimized.
rfcbot
commented
Jan 28, 2019
|
The final comment period, with a disposition to postpone, as per the review above, is now complete. By the power vested in me by Rust, I hereby postpone this RFC. |
KrishnaSannasi commentedAug 29, 2018
•
edited
Use
&uninit Tto initialize your memory directly!This can have performance improvements and makes dealing with uninitialized memory completely safe.
Rendered
this RFC has been edited This RFC used to also have
&out, but I removed them due to insufficient motivation, and a relatively easy workaround to&out.