Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is repr(transparent) completely transparent within repr(Rust) types? #298

Closed
Manishearth opened this issue Jul 9, 2021 · 15 comments
Closed

Comments

@Manishearth
Copy link
Member

Manishearth commented Jul 9, 2021

repr(transparent) exists which makes it safe to cast references between e.g. MaybeUninit<T> and T

Does this extend to repr(Rust) types containing repr(transparent)? I.e., are we guaranteed that the following pairs of types:

struct Struct {
  x: String,
  y: Foo
}
struct StructUninit {
  x: String,
  y: MaybeUninit<Foo>
}

enum Enum {
   A(String),
   B(Foo, String)
}
enum EnumUninit {
   A(String),
   B(MaybeUninit<Foo>, String)
}

are fully layout compatible? The primary guarantee of repr(transparent) is that it does not affect the size and alignment, but my understanding is that at the moment repr(Rust) has no guarantees; it would be legal for repr(Rust) to say "add a padding of 2 in every field whose type starts with an 'M'".

This guarantee would be very nice to have; because otherwise there's no way to do in-place initialization of Rust enums without repr(C).

cc @rust-lang/lang (unsure if this is the right place to file this)

@thomcc
Copy link
Member

thomcc commented Jul 9, 2021

I've asked this question before and the answer at the time was no. It would be nice for a lot of things, enums or not.

(One example of why it would be nice is that there's certainly code out there that assumes this is true already).

@Manishearth
Copy link
Member Author

I'm curious, was there any reasoning given behind that no answer?

@Manishearth
Copy link
Member Author

ah, @mystor pointed out a flaw: Option<&T> and Option<MaybeUninit<&T>> are not layout compatible and can't be

I forgot about niche optimizations.

I wonder if there's still a way to get something here that allows for in-place initialization of Rust enums

@digama0
Copy link

digama0 commented Jul 9, 2021

At the very least it can affect enum layout optimization, since MaybeUninit<T> doesn't have a niche even if T does

@thomcc
Copy link
Member

thomcc commented Jul 9, 2021

I think just that repr(Rust) has absolutely no guarantees, and people intend to add flags to randomize padding/offsets. As mentioned, there's also the union case. I think that when I asked MaybeUninit wasn't yet repr(transparent), but changing this for the non-union case would still arguably be useful.

I suspect changing this for a repr(transparent) (sans-unions) would take an RFC, but hopefully not a particularly contentious one.

@digama0
Copy link

digama0 commented Jul 9, 2021

In particular, a slight variation on your example shows that it is affected by the niche optimization:

enum Enum {
   A,
   B(Foo)
}
enum EnumUninit { // not layout compatible with Enum!
   A,
   B(MaybeUninit<Foo>)
}

How feasible is it to just use MaybeUninit<Enum> in your application? The main disadvantage is that you have to have all the data for the variant available at the same time, but I'm not sure it's even possible to partially initialize an enum with arbitrary layout optimizations applied: you would need to be able to say "I have set the discriminant to B, so it's not all zeroes, but it is otherwise uninitialized (?)".

@RalfJung
Copy link
Member

RalfJung commented Jul 9, 2021

I wonder if there's still a way to get something here that allows for in-place initialization of Rust enums

n-place initialization of something like Option<&T> seems fundamentally... if not impossible then very very hard. You basically have to now how the niche of &T is exploited, don't you?

I can't really imagine how that would work.

@Diggsey
Copy link

Diggsey commented Jul 9, 2021

@RalfJung well an Option<T> must still contain a valid T somewhere in its representation, so you could have a two-step process.

We'd just need a new intrinsic that's sort of the opposite of discriminant_value - you would give it a discriminant and a raw pointer, and it would write the discriminant. Any part of the discriminant that's part of a niche for that variant could simply be ignored.

Then you do two steps:

  • Write the discrimnant via the above intrinsic.
  • Write the contained value(s) via ptr::write.

As long as you write the correct types for the variant you want then any niches would be taken care of automatically.

@Lokathor
Copy link
Contributor

Lokathor commented Jul 9, 2021

Option<char>
Option<NonZeroU8>
Option<bool>
Option<Vec<i32>>

all of these do not contain a valid instance of the inner type in their bits if the value is currently None.

@Manishearth
Copy link
Member Author

n-place initialization of something like Option<&T> seems fundamentally... if not impossible then very very hard. You basically have to now how the niche of &T is exploited, don't you?

I don't see a problem?

I'm imagining having a MaybeUninit<Option<&T>>, and having:

  • an intrinsic to set it to the Some variant
  • an intrinsic to get the relevant offset of the &T in that Some variant so that it can be written to

Neither of these can be structured as "regular" rust function-style intrinsics, but there's definitely something doable there.

The compiler knows how that niche is exploited, and it should be possible to ask the compiler how. It's not great that currently there is no safe way to do this; if uninitialized values are going to be UB there has got to be a defined way to manipulate uninitialized enums.

@RalfJung
Copy link
Member

RalfJung commented Jul 9, 2021

Something with a lot of intrinsics could work, but then repr(transparent) is not terribly relevant any more, is it? So maybe it's just good ol' XY problem.

For structs, what we do is we can take a raw ptr to Struct and turn it into a raw ptr to Field (using ptr::addr_of!). So what we really need is addr_of! support for enum variants.

@mystor
Copy link

mystor commented Jul 9, 2021

Something with a lot of intrinsics could work, but then repr(transparent) is not terribly relevant any more, is it? So maybe it's just good ol' XY problem.

I think repr(transperent) was originally proposed as a way to achieve the end goal of initializing an enum, but isn't really relevant anymore, yeah.

For structs, what we do is we can take a raw ptr to Struct and turn it into a raw ptr to Field (using ptr::addr_of!). So what we really need is addr_of! support for enum variants.

Perhaps a solution might be to have a built-in pat_offsets! macro which accepts a type and a fallible pattern, and binds the offsets of fields in that pattern to their relevant names, like:

pat_offsets!(Option<String>, Some(x) => {
  // in this scope, `x` is a `usize` containing the offset of a binding `x` of
  // the `Some` variant of a value of type `Option<String>`
});

There'd still need to be a mechanism for setting the discriminant (possibly to be called after the value payloads are written in, I suppose, to handle niche optimizations properly), but that could probably be done with a normal intrinsic and the Discriminant type, like:

unsafe fn write_discriminant<T: DiscriminantKind>(to: *mut T, discr: <T as DiscriminantKind>::Discriminant);

Getting this to somehow work with how we do offset_of! on top of addr_of! today feels a bit trickier, though, as there's no syntax for naming an unmatched enum's variant right now. I imagine it might require explicit variant types like Option<T>::{Some} (made up syntax), which could theoretically support .0 but would be coerceable to Option<T>, to express with normal member syntax. With something like that, and each variant type having a const DISCRIMINANT: ... associated const, one could do something vaguely like this (using all fake syntax and everything, naturally):

let mut out: MaybeUninit<Option<T>> = ...;
let value: T = ...;

let offset = unsafe {
  let base = <MaybeUninit<Option<T>::{Some}>>::uninit();
  addr_of!((*base.as_ptr()).0).offset_from(base.as_ptr()) as usize
};

unsafe {
  ptr::write(out.as_mut_ptr().add(offset) as *mut T, value);
  write_discriminant(out.as_mut_ptr(), <Option<T>::{Some}>::DISCRIMINANT);
}

@RalfJung
Copy link
Member

I think repr(transperent) was originally proposed as a way to achieve the end goal of initializing an enum, but isn't really relevant anymore, yeah.

In that case I think it'd be better to close this issue and open a new one focused on the problem. This issue contains a bunch of irrelevant discussion.

I'm not even sure if this is a UCG thing, since we will need language extensions for this -- UCG is mostly about "how do I best sue the unsafe part of the language", this is about "how can we extend the language to support this use-case that we currently do not support".

@Manishearth
Copy link
Member Author

Yeah, I opened this here because I thought clarifying repr(transparent) was a UCG thing but that's not the issue at hand.

@RalfJung
Copy link
Member

Yes, that would indeed have been a UCG thing. :)

alercah added a commit to alercah/unsafe-code-guidelines that referenced this issue Oct 8, 2022
I have tried to document everything I believe we have consensus on. I've
left some things open that I possibly could have closed, but because
this PR is very big, I would like to focus on getting it in as quickly
as possible and worrying about whatever's left aftwards.

I strongly encourage others to submit follow up PRs to close out the
other open issues.

Closes rust-lang#156.
Closes rust-lang#298.
Closes rust-lang#352.
alercah added a commit to alercah/unsafe-code-guidelines that referenced this issue Oct 8, 2022
I have tried to document everything I believe we have consensus on. I've
left some things open that I possibly could have closed, but because
this PR is very big, I would like to focus on getting it in as quickly
as possible and worrying about whatever's left aftwards.

I strongly encourage others to submit follow up PRs to close out the
other open issues.

Closes rust-lang#156.
Closes rust-lang#298.
Closes rust-lang#352.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants