Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MaybeUninit: Document UnsafeCell byte ranges #121215

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

joshlf
Copy link
Contributor

@joshlf joshlf commented Feb 17, 2024

No description provided.

@rustbot
Copy link
Collaborator

rustbot commented Feb 17, 2024

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 17, 2024
Comment on lines +246 to +251
/// ## `UnsafeCell`
///
/// As a result of having the same layout as `T`, `MaybeUninit<T>` has [`UnsafeCell`]s covering
/// the same byte ranges as `T`.
///
/// [`UnsafeCell`]: crate::cell::UnsafeCell
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eh? what does this mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's another way of expressing the same thing: for every byte offset, o, it is either the case that the byte at offset o in T is nested in an UnsafeCell and the byte at offset o in MaybeUninit<T> is nested in an UnsafeCell, or neither is. I'd definitely welcome suggestions on how to better word this.

The reason this is an important property is that if you operate on a given memory location using an UnsafeCell-typed reference at one point, and using a non-UnsafeCell-typed reference at another point, it's UB. In unsafe code, converting from &T to &MaybeUninit<T> (or vice versa) requires this property in order to prove that it's a sound conversion.

For example, consider this code:

#[repr(C)]
struct A {
    one: u8,
    two: UnsafeCell<[u8; 2]>,
}

#[repr(C)]
struct B {
    one: UnsafeCell<[u8; 2]>,
    two: u8,
}

fn main() {
    let a = A { one: 1, two: UnsafeCell::new([2, 3]) };
    let a_ptr: *const A = &a;
    let b = unsafe { &*(a_ptr as *const B) };
    println!("{:?}", unsafe { *b.one.get() });
}

If we run this under Miri, it detects UB:

error: Undefined Behavior: trying to retag from <1723> for SharedReadWrite permission at alloc816[0x0], but that tag only grants SharedReadOnly permission for this location
  --> src/main.rs:18:22
   |
18 |     let b = unsafe { &*(a_ptr as *const B) };
   |                      ^^^^^^^^^^^^^^^^^^^^^
   |                      |
   |                      trying to retag from <1723> for SharedReadWrite permission at alloc816[0x0], but that tag only grants SharedReadOnly permission for this location
   |                      this error occurs as part of retag at alloc816[0x0..0x3]
   |
   = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
   = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1723> was created by a SharedReadOnly retag at offsets [0x0..0x1]
  --> src/main.rs:17:27
   |
17 |     let a_ptr: *const A = &a;
   |                           ^^
   = note: BACKTRACE (of the first span):
   = note: inside `main` at src/main.rs:18:22: 18:43

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's another way of expressing the same thing: for every byte offset, o, it is either the case that the byte at offset o in T is nested in an UnsafeCell and the byte at offset o in MaybeUninit is nested in an UnsafeCell, or neither is. I'd definitely welcome suggestions on how to better word this.

OH! I see. I had to read that a few times but I think I get what you mean.

Perhaps something like:

Suggested change
/// ## `UnsafeCell`
///
/// As a result of having the same layout as `T`, `MaybeUninit<T>` has [`UnsafeCell`]s covering
/// the same byte ranges as `T`.
///
/// [`UnsafeCell`]: crate::cell::UnsafeCell
/// ## `MaybeUninit<UnsafeCell<Type>>`
///
/// For any `T` that has interior mutability due to having [`UnsafeCell`] wrapping some
/// or all parts of its layout, then `MaybeUninit<T>` also has interior mutability for
/// the same parts of its layout.
///
/// [`UnsafeCell`]: crate::cell::UnsafeCell

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@workingjubilee's suggestion looks good to me.

Comment on lines +212 to +213
/// `MaybeUninit<T>` is guaranteed to have the same layout (size, alignment, and field offsets and layouts)
/// and ABI as `T`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "field offsets and layouts" part is redundant because layouts are compositional. If the size and alignment is the same as T then by necessity the offset of T within MaybeUninit is zero. And any specific T is a T, so its field offsets are its field offsets.

This is no different than #[repr(transparent)] struct Wrapper(MyStruct)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is overly complicating things and feels more confusing to me than just saying that the size and alignment are the same. MaybeUninit is usually treated as having a single field itself (T) so it's confusing to me to talk about field offsets and layouts, especially after size/alignment (which is all a layout is).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem as I see it is twofold:

  • While we technically define "layout" to refer to size, alignment, and field offset/layout [1], it's colloquially often used to refer only to size and alignment, and so I think being verbose may help here

  • While const generics aren't stable today, you could define MaybeUninit as follows, which would be consistent with the size and alignment properties, but would not be consistent with the field offset/layout properties:

    union MaybeUninit<T> {
        uninit: (),
        value: [u8; size_of::<T>()],
    }

[1] I know this was officially documented within the past year, but I can't figure out where we did that...

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 24, 2024
@joshlf
Copy link
Contributor Author

joshlf commented Mar 11, 2024

I'm going to hold off on working on this/trying to land it until rust-lang/unsafe-code-guidelines#495 is resolved. There are a number of questions there which will affect what wording is useful here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants