Should we / can we make MaybeUninit<T> always preserve all bytes of T? #518

RalfJung · 2024-07-26T06:30:38Z

It is a frequent source of confusion that MaybeUninit<T> does not simply preserve all the underlying bytes of storage: if T has padding, then those padding bytes are lost on copies/moves of MaybeUninit<T>.
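To make the padding concrete, here is a minimal sketch. `Pair`, `padding_bytes`, `roundtrip_bytes` and `roundtrip_pair` are hypothetical names chosen for illustration, and the sizes assume a target where `u16` has alignment 2. The sketch only shows where the padding is and which copies are guaranteed to keep every byte; it does not try to observe a lost padding byte, since reading one as an integer would itself be undefined behavior.

```rust
use std::mem::{size_of, MaybeUninit};

/// Hypothetical example type. With `repr(C)`, `b` must start at an even
/// offset, so one padding byte sits between `a` and `b`.
#[repr(C)]
#[derive(Clone, Copy)]
struct Pair {
    a: u8,
    b: u16,
}

/// Number of bytes in `Pair` that belong to no field.
fn padding_bytes() -> usize {
    size_of::<Pair>() - (size_of::<u8>() + size_of::<u16>())
}

/// Moves raw storage around as a byte array. `[u8; 4]` has no padding,
/// so a typed copy of `MaybeUninit<[u8; 4]>` keeps all four bytes.
fn roundtrip_bytes(src: [u8; 4]) -> [u8; 4] {
    let slot: MaybeUninit<[u8; 4]> = MaybeUninit::new(src);
    let moved = slot; // typed copy, but there is no padding to lose
    // SAFETY: `slot` was created from a fully initialized array.
    unsafe { moved.assume_init() }
}

/// Moves a `MaybeUninit<Pair>` by value. Only the field values are
/// guaranteed to arrive; the byte at offset 1 is padding and may not.
fn roundtrip_pair(src: Pair) -> Pair {
    let slot: MaybeUninit<Pair> = MaybeUninit::new(src);
    let moved = slot; // typed copy of a type with padding
    // SAFETY: `slot` was created from an initialized `Pair`.
    unsafe { moved.assume_init() }
}
```

Code that needs every byte to survive a move can carry the storage as a byte array, as in `roundtrip_bytes`, at the cost of losing the alignment of `Pair`.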

This is currently pretty much a necessary consequence of the promise that MaybeUninit<T> is ABI-compatible with T: some ABIs don't preserve the padding of T when it is passed to a function. However, this was not part of the intention with MaybeUninit at all, it is something we discovered later.

Maybe we should try to take this back, and make the guarantee only for types without padding?

I am not even sure why we made this a guarantee. We made the type repr(transparent) because for performance it is quite important that MaybeUninit<$int> becomes just an iN in LLVM. But that doesn't require a stable guarantee. And in fact it seems like it would almost always be a bug if the caller and callee disagree about whether the value has to be initialized. So I would be curious about real-world examples where this guarantee is needed.


elichai · 2024-07-26T09:38:36Z

I'll add that a typed copy of an uninitialized variable is UB in C, so there's no need to promise any ABI for FFI compatibility.
So this leaves us with the "Rust" ABI, which isn't stable anyway.

RalfJung · 2024-07-26T10:34:27Z

In C I think this is not true for char types. But yeah for most types you cannot pass them uninit by value.

carbotaniuman · 2024-07-26T12:11:54Z

The PR that introduced the guarantees does not talk about padding, and it seems like that wasn't really understood back then. The t-lang minutes discussing this are lost to time and reorganizations, but it seems doubtful that such a consideration was raised. Discussions from 2018 raise a lack of real-world use cases for ABI compatibility, and I agree with such a sentiment in the present.

I don't think this would be approved nowadays, but I am incredibly apprehensive about removing it. There are few places in the Rust documentation that use "always" for guarantees like this, and the use cases for some weird FFI thunks or bindings would be nigh-impossible to properly test with crater or similar...

RalfJung · 2024-07-26T12:28:24Z

@carbotaniuman I think we should consider removing it. If we can't come up with any legitimate usecase, I think we should definitely remove it. I don't like going back on a promise like this, but if we don't have a usecase that could be broken by taking back this promise, then the chances that someone is affected should be very slim.

@Diggsey thanks for explaining why you think this belongs in this thread. But I disagree. "MaybeUninit preserves provenance" is not relevant here. You will note that provenance does not appear in the issue description. Furthermore, provenance on CHERI works like it does everywhere else, so even if provenance were relevant, CHERI wouldn't change anything. It is true that you can write code with MaybeUninit that will work everywhere but not on CHERI; discussing that is off-topic here as the reasons are completely different from what this thread is about and also different from what you mentioned -- it is caused not by padding and not by provenance, but by capabilities. So please take this elsewhere, e.g. Zulip or a new issue where you explain why you think MaybeUninit's provenance behavior is incompatible with CHERI, but I'd prefer not to see yet another thread derailed by CHERI. I like CHERI and want to see it work in Rust, but our main task here is to figure out Rust for the existing targets we already support. CHERI support is a nice extra that I'll happily discuss, as long as it doesn't distract from our core task.

carbotaniuman · 2024-07-26T13:17:33Z

If we do agree to remove the guarantee, I expect it to break 0 uses in practice. My only other concern would be the performance impact of having to copy more bytes. It probably won't affect SIMD or buffers though, so I don't think it's really an issue.

bjorn3 · 2024-07-26T13:23:31Z

I think we should preserve the memory layout compatibility, but drop the calling convention compatibility. That could be done using repr(C) instead of repr(transparent) I think.
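As a rough sketch of what a `repr(C)` version could look like: `ReprCMaybeUninit` below is a hypothetical stand-in written for this illustration, not the standard library's definition. The check only covers the memory layout part (size and alignment). Whether `repr(C)` actually changes the calling convention on a given target is the claim from the comment above and is not verified by this code.

```rust
use std::mem::{align_of, size_of, ManuallyDrop};

/// Hypothetical stand-in for a `repr(C)` MaybeUninit.
/// NOT the standard library's definition.
#[repr(C)]
union ReprCMaybeUninit<T> {
    #[allow(dead_code)]
    uninit: (),
    value: ManuallyDrop<T>,
}

impl<T> ReprCMaybeUninit<T> {
    /// Creates a slot without initializing any byte of `T`.
    fn uninit() -> Self {
        Self { uninit: () }
    }

    /// Creates a slot holding an initialized `T`.
    fn new(value: T) -> Self {
        Self { value: ManuallyDrop::new(value) }
    }

    /// SAFETY: the caller must guarantee the slot holds an initialized `T`.
    unsafe fn assume_init(self) -> T {
        ManuallyDrop::into_inner(unsafe { self.value })
    }
}

/// True if the hypothetical union has the same size and alignment as `T`.
fn same_layout<T>() -> bool {
    size_of::<ReprCMaybeUninit<T>>() == size_of::<T>()
        && align_of::<ReprCMaybeUninit<T>>() == align_of::<T>()
}
```

A `repr(C)` union is as large as its largest field (rounded up to its alignment) and as aligned as its most aligned field, which is why the `()` field does not change the layout.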

chorman0773 · 2024-07-26T13:24:11Z

FTR, I use MaybeUninit<T> in the signatures of lccc's libatomic and libsfp ABI-level routines. This is because they get called from xlang's codegen, and xlang allows uninit for those operations. Though in these cases, they are types without padding in the signature.

However, for compatibility with gcc/clang, they have to expose an ABI equal to the routines using primitives.

chorman0773 · 2024-07-26T13:27:22Z

(And in general, I agree with @carbotaniuman - unless crater is testing all kinds of targets, I'm betting it primarily tests x86_64, where aggregate-of-one-field will get passed the same way as that one field*, so without using miri-crater, the ABI checks won't be found by crater. If the code is used on something like arm32 though, it's going to be very visibly broken.)

RalfJung · 2024-07-26T13:28:12Z

Yes, this is super hard to test for. I wonder if it's worth having a blog post asking people whether they need this guarantee...

> I think we should preserve the memory layout compatibility, but drop the calling convention compatibility. That could be done using repr(C) instead of repr(transparent) I think.

Yes, concretely the proposal would be:

- T and MaybeUninit<T> always have the same size and alignment
- they have the same ABI if T has no padding
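The first point of the proposal can be checked directly for any concrete type. The sketch below uses a hypothetical helper `same_size_and_align` and a hypothetical padded type `Padded`. The second point, about ABI, cannot be observed with `size_of`/`align_of` and is not tested here.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

/// Checks, for one `T`, that `MaybeUninit<T>` has the same size and
/// the same alignment as `T`.
fn same_size_and_align<T>() -> bool {
    size_of::<MaybeUninit<T>>() == size_of::<T>()
        && align_of::<MaybeUninit<T>>() == align_of::<T>()
}

/// Hypothetical type with padding between `a` and `b`.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    a: u8,
    b: u32,
}
```

Under the proposal, `same_size_and_align::<Padded>()` would still be required to hold, while the by-value ABI guarantee would only cover types such as `u32` that have no padding.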

Or maybe "no padding" should be restricted a bit further, like "if T is a primitive integer/float/pointer type" or so. Note that some non-power-of-2 SIMD types have padding so we have to be careful if we want to talk about those types.

Ddystopia · 2024-07-26T13:55:00Z

Is the only motivation for backing out of that promise the fact that this is a frequent source of confusion? What benefits besides clarity can Rust gain?

RalfJung · 2024-07-26T14:12:36Z

We never intended MaybeUninit<(u8, u16)> to have a padding byte that would be lost on copies. This is a complete accident. It's not just a source of confusion, it's not the semantics we want. It came up in #517 where it means that returning a MaybeUninit<T> from an atomic compare-exchange actually doesn't work since padding bytes still get lost so we won't end up having the same bit pattern as what is stored in the atomic location.

jamesmunns · 2024-07-26T14:16:16Z

I'm pretty sure this is still the case, but it might be worth it to enumerate things that ARE still allowed for this wrt FFI/ABI concerns. My primary use of MaybeUninit<T> in FFI is for "outptr" usages (edit: specifically &mut MaybeUninit<T> or *mut MaybeUninit<T>), which seems to be still good (because we never copy/pass by value - the part that is discussed by this issue), but for folks like myself might be worth spelling out clearly/contrasting what is no longer allowed.
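For reference, an "outptr" use looks roughly like the sketch below. `Pair`, `init_pair` and `make_pair` are hypothetical names, and the callee is written in Rust here only to keep the example self-contained; in real FFI it would typically be a foreign function. The `MaybeUninit<Pair>` value is only ever reached through a pointer, so it is never passed or copied by value.

```rust
use std::mem::MaybeUninit;

/// Hypothetical FFI-style struct with one padding byte.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Pair {
    a: u8,
    b: u16,
}

/// Hypothetical callee that fills in its out-pointer.
///
/// SAFETY: `out` must be non-null, aligned, and valid for writes.
unsafe extern "C" fn init_pair(out: *mut MaybeUninit<Pair>) {
    // Write through the pointer; the `MaybeUninit<Pair>` itself is
    // never moved or passed by value.
    unsafe { out.cast::<Pair>().write(Pair { a: 1, b: 2 }) };
}

/// Caller side: reserve uninitialized storage, let the callee fill it.
fn make_pair() -> Pair {
    let mut slot = MaybeUninit::<Pair>::uninit();
    // SAFETY: `&mut slot` is a valid, aligned pointer to writable storage.
    unsafe { init_pair(&mut slot) };
    // SAFETY: `init_pair` wrote a complete `Pair` into `slot`.
    unsafe { slot.assume_init() }
}
```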

RalfJung · 2024-07-26T14:22:09Z

Yes, ABI compatibility is about the "by-value" part of a function argument or return type. That's how we've consistently been using this term for a while now, also see our glossary and the documentation on ABI compatibility.
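A small sketch of what the by-value guarantee permits today: calling a function declared with `u32` through a function pointer whose signature uses `MaybeUninit<u32>`. The names `double`, `MaybeFn` and `call_through_maybeuninit` are hypothetical. The code relies on the currently documented guarantee that `MaybeUninit<T>` has the same ABI as `T`, which is exactly the guarantee under discussion; `u32` has no padding, so this case would stay covered under the narrower proposal.

```rust
use std::mem::{transmute, MaybeUninit};

/// Hypothetical routine declared with a plain primitive signature.
extern "C" fn double(x: u32) -> u32 {
    x.wrapping_mul(2)
}

/// The same routine viewed through a `MaybeUninit` signature.
type MaybeFn = extern "C" fn(MaybeUninit<u32>) -> MaybeUninit<u32>;

fn call_through_maybeuninit(x: u32) -> u32 {
    // SAFETY: relies on `MaybeUninit<u32>` being ABI-compatible with
    // `u32` for by-value arguments and return values.
    let f: MaybeFn = unsafe { transmute(double as extern "C" fn(u32) -> u32) };
    // The argument is initialized, so `double` never sees uninit data.
    let out = f(MaybeUninit::new(x));
    // SAFETY: `double` always returns an initialized `u32`.
    unsafe { out.assume_init() }
}
```

Passing `&mut MaybeUninit<u32>` or `*mut MaybeUninit<u32>` instead would not depend on this guarantee, since only the pointer is passed by value.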

In public communication we'll obviously spell out the details more than in internal discussion. ("Internal" not as in "private" but as in "among the team members and anyone else who's willing to participate".)

chorman0773 · 2024-07-26T14:40:47Z

> Or maybe "no padding" should be restricted a bit further, like "if T is a primitive integer/float/pointer type" or so. Note that some non-power-of-2 SIMD types have padding so we have to be careful if we want to talk about those types.

I at the very least need target simd types as well - for floating-point types that aren't directly supported by rust (e.g. f2x64_t), I wrap them in a target-specific ABI type, which on x86_64, is mostly __m128.

RalfJung · 2024-07-26T14:59:26Z

Since that is a compiler-internal concern, you could also do this by providing more ABI guarantees than what Rust provides in general.

But that case would be covered by "types without padding", or we could explicitly mention the stdarch SIMD types (since they are all powers of 2).

chorman0773 · 2024-07-26T15:17:28Z

> Since that is a compiler-internal concern, you could also do this by providing more ABI guarantees than what Rust provides in general.

Not fully - you don't necessarily need to compile the rtlibs with lccc themselves, they're written in mostly portable rust, and quite deliberately. I'd like to be able to continue providing that guarantee.

> Yes, ABI compatibility is about the "by-value" part of a function argument or return type. That's how we've consistently been using this term for a while now, also see our glossary and the documentation on ABI compatibility.

You can also now see a formalization in reference#1545, as a note.

RalfJung mentioned this issue Jul 26, 2024

Packing pointers into double-word width atomics #517

Open

RalfJung mentioned this issue Sep 11, 2024

interpret: make typed copies lossy wrt provenance and padding rust-lang/rust#129778

Merged
