Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clarify rules for ZST Boxes #77844

Merged
merged 3 commits into from
Nov 21, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions library/alloc/src/boxed.rs
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,11 @@
//! T` obtained from [`Box::<T>::into_raw`] may be deallocated using the
//! [`Global`] allocator with [`Layout::for_value(&*value)`].
//!
//! For zero-sized values, the `Box` pointer still has to be [valid] for reads and writes and
//! sufficiently aligned. In particular, casting any aligned non-zero integer literal to a raw
//! pointer produces a valid pointer, but a pointer pointing into previously allocated memory that
//! since got freed is not valid.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any limitations here -- i.e., it's only a problem if the integer value is somehow derived from a freed value via some trackable provenance -- or is it UB even if you just pull an integer out of thin air that happens to be the same as a freed value?

If the latter, can we give some concrete advice about what value people should use for zero-sized values? I at least am feeling a bit confused about what kind of value is valid to use, given that it can't be null but must be aligned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are good questions. Unfortunately, LLVM is not specified precisely enough to properly answer them. We are building on quicksand here and can just take a guess about how it will behave. ;)

My personal interpretation is that it is okay to do this if the pointer has no provenance. That should always be the case for pointers created from an integer literal, or any other computations where there are no pointers involved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you feel about adding some text to that effect, even if it is quite tentative? What does the compiler do if we create a Box::new(T) where struct T; has some kind of high alignment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are saying I should use the term "provenance"? I had avoided that, because I don't think we are defining it anywhere outside the UCG glossary.

What does the compiler do if we create a Box::new(T) where struct T; has some kind of high alignment?

This is implemented in the stdlib actually, not the compiler. It uses the alignment as ptr value:

0 => Ok(NonNull::slice_from_raw_parts(layout.dangling(), 0)),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am not suggesting you should use the word provenance necessarily. What I was suggesting is saying something like "The recommended way to build a Box to a ZST is to use an integer equal to the alignment as the pointer value."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And/or we could mention ptr::NonNull::dangling() as a canonically-valid-and-always-up-to-date implementation 🙂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be even better, good point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment along these lines.

//!
//! So long as `T: Sized`, a `Box<T>` is guaranteed to be represented
//! as a single pointer and is also ABI-compatible with C pointers
//! (i.e. the C type `T*`). This means that if you have extern "C"
Expand Down Expand Up @@ -125,6 +130,7 @@
//! [`Global`]: crate::alloc::Global
//! [`Layout`]: crate::alloc::Layout
//! [`Layout::for_value(&*value)`]: crate::alloc::Layout::for_value
//! [valid]: ptr#safety

#![stable(feature = "rust1", since = "1.0.0")]

Expand Down Expand Up @@ -385,7 +391,10 @@ impl<T: ?Sized> Box<T> {
/// memory problems. For example, a double-free may occur if the
/// function is called twice on the same raw pointer.
///
/// The safety conditions are described in the [memory layout] section.
///
/// # Examples
///
/// Recreate a `Box` which was previously converted to a raw pointer
/// using [`Box::into_raw`]:
/// ```
Expand Down
7 changes: 5 additions & 2 deletions library/core/src/ptr/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,12 +16,15 @@
//! provided at this point are very minimal:
//!
//! * A [null] pointer is *never* valid, not even for accesses of [size zero][zst].
//! * All pointers (except for the null pointer) are valid for all operations of
//! [size zero][zst].
//! * For a pointer to be valid, it is necessary, but not always sufficient, that the pointer
//! be *dereferenceable*: the memory range of the given size starting at the pointer must all be
//! within the bounds of a single allocated object. Note that in Rust,
//! every (stack-allocated) variable is considered a separate allocated object.
//! * Even for operations of [size zero][zst], the pointer must not be "dangling" in the sense of
//! pointing to deallocated memory. However, casting any non-zero integer literal to a pointer is
//! valid for zero-sized accesses. This corresponds to writing your own allocator; allocating
//! zero-sized objects is not very hard. In contrast, when you use the standard allocator, after
//! memory got deallocated, even zero-sized accesses to that memory are invalid.
//! * All accesses performed by functions in this module are *non-atomic* in the sense
//! of [atomic operations] used to synchronize between threads. This means it is
//! undefined behavior to perform two concurrent accesses to the same location from different
Expand Down