Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify #[repr(transparent)] #1758

Merged
merged 2 commits into from
Jul 3, 2017
Merged

Specify #[repr(transparent)] #1758

merged 2 commits into from
Jul 3, 2017

Conversation

nox
Copy link
Contributor

@nox nox commented Sep 26, 2016

Summary: Extend the existing #[repr] attribute on newtypes with a transparent option
specifying that the type representation is the representation of its only field.
This matters in FFI context where struct Foo(T) might not behave the same
as T.

Cc @eddyb @Manishearth

rendered

[edited to update rendered link]

# Alternatives
[alternatives]: #alternatives

None.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One obvious alternative would be to do this automatically for all structs that have just a single field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is done automatically for all newtypes, but then you can't pass that to FFI stuff without it becoming a struct on the C side too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a tooling solution here? Could a sufficiently smart bindgen work around this problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you explain why please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because there is no tooling that can be invented to do something that is not possible without this attribute. Either you do struct Foo(f64) and it can't be used in FFI, or you do #[repr(C)] struct Foo(f64) and it won't behave as a bare f64 in FFI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like something like bindgen could pretty easily create wrappers that unbox transparent structs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means an additional type, additional transmutes and/or additional boxing/unboxing functions. That's code bloat to me and I don't see why I would want that. It also doesn't help for structures with UnsafeCell<T> fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like something like bindgen could pretty easily create wrappers that unbox transparent structs.

This would mean that the fields can't be private, however. Not sure how much that matters in practice.

@Amanieu
Copy link
Member

Amanieu commented Sep 26, 2016

I feel that the RFC could do with a note about transmute, for example specifying that transmute<TransparentNewtype, f64> should be a no-op since the two types have the same representation.

@Amanieu
Copy link
Member

Amanieu commented Sep 26, 2016

I can also see this attribute being useful for atomic types such as AtomicUsize, which RFC #1649 specifies should have the same representation as the underlying type. This attribute could also be added to Cell and UnsafeCell for the same reasons.

@nox
Copy link
Contributor Author

nox commented Sep 26, 2016

This attribute could also be added to Cell and UnsafeCell for the same reasons.

Yep the whole reason I'm writing this RFC is for UnsafeCell<T> to be used in some structures that are passed to Gecko from Servo.

@nox
Copy link
Contributor Author

nox commented Sep 26, 2016

@Amanieu Added a note about FFI and AtomicUsize use cases.

Copy link
Member

@nagisa nagisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this RFC is quite underspecified, especially around interactions with some more obscure features.

Overall the idea seems okay to me, though.

the same type as the single field. For example on ARM64, functions returning
a structure with a single `f64` field return nothing and take a pointer to be
filled with the return value, whereas functions returning a `f64` return the
floating-point number directly.
Copy link
Member

@nagisa nagisa Sep 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see why the (unspecified) Rust ABI {w,sh}ould share the same behaviour, as opposed to being “transparent” by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+, transparent seems to be relevant only for repr(C) stuff, as a tweak to the default C ABI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"+"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @petrochenkov meant "and"


Syntactically, the `repr` meta list will be extended to accept a meta item
with the name "transparent". This attribute can be placed only on newtypes,
which means structures (and structure tuples) with a single field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not univariant unions and enums?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to be conservative for now given I don't have a use case for univariant unions and enums in FFI context.


This new representation is mostly useful when the structure it is put on must be
used in FFI context as a wrapper to the underlying type without actually being
affected by any ABI semantics.
Copy link
Member

@nagisa nagisa Sep 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, so you expect to use this when passing stuff into foreign functions… Then a multiple questions arise, for example:

#[repr(transparent)] struct Transp<T>(T);

extern { fn banana(x: Transp<SomeRustType>); } // does the non-c-types lint warn? probably yes, but not specified in the RFC
extern { fn banana(x: Transp<SomeCType>); } // does the non-c-types lint warn? probably no, but not specified in the RFC

#[repr(transparent)] struct TranspPack(PackedTy);  // is packed? Probably yes.
#[repr(packed,transparent)] struct TransPack(SomeTy); // is packed? No idea.
#[repr(transparent, align="128")] struct TransU32(u32); // is valid? If so, is alignment 128 or `align_of::<u32>()`? What happens if native alignment of u32 is greater than alignment specified for the transparent newtype?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This RFC specifies that Transp<SomeRustType> should warn because it says the representation of Transp is whatever the representation of its single field is. So if the single field is not a proper C type, using it in FFI should warn.

But yeah, it should mention that no other repr should be used with it, so no align nor packed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned in the RFC that these are illegal.

@nox
Copy link
Contributor Author

nox commented Sep 26, 2016

I feel like this RFC is quite underspecified, especially around interactions with some more obscure features.

I have no idea how to specify more, feel free to amend or give me directions.

@ahicks92
Copy link

I agree with the people saying this should only apply to repr(C). If this is an optimization worth having, then Rust should do it automatically for repr(Rust). This doesn't sound like it would be too terribly hard to implement, though I'm not entirely sure of that offhand. Probably just adding a Newtype variant to Layout (alternatives end up in the territory of adding flags to Univariant, which feels like a hack to me).

I am working with @eddyb on optimizing struct layout, the first PR of which just merged into the compiler. While we could apply repr(transparent) to anything containing one field, it's only useful if that field also has repr(c) on its type. Otherwise, you eliminate the outer layer but may not have the fields in the order you want or expect. This would mostly match the C ABI now, but won't in future if I have my way.

This feels really hacky and temporary, and hacky/temporary-feeling things that we have a need for seem like a good time to revisit the general pattern. There is never any reason to leave repr(transparent) off a newtype that I can see, so perhaps someone should work out the semantics of newtype Foo = Bar instead of just leaving this is an ill-defined pattern that everyone eventually learns on their own.

@nox
Copy link
Contributor Author

nox commented Sep 26, 2016

I agree with the people saying this should only apply to repr(C).

What? No, f64 is not #[repr(C)] explicitly but it is nonetheless usable in C functions.

If I have a newtype without an explicit representation, I can't pass it to C functions. If I have a newtype with #[repr(C)], I can't use it in place of its inner type in a C function because it may be handled differently on the ABI level.

@nox
Copy link
Contributor Author

nox commented Sep 26, 2016

There is never any reason to leave repr(transparent) off a newtype that I can see, so perhaps someone should work out the semantics of newtype Foo = Bar instead of just leaving this is an ill-defined pattern that everyone eventually learns on their own.

That doesn't help to use UnsafeCell<T> in a field of a #[repr(C)] structure to signal that the C side of the code may change this field, which is what is needed to make Gecko and Servo work hand in hand in some cases.

@durka
Copy link
Contributor

durka commented Sep 26, 2016

What about extra zero-sized fields? Out of scope?

#[repr(transparent)]
struct Wrapper<T: SomeKindaMarker> {
    wrapped: i32,
    _ghost: PhantomData<T>
}

@eddyb
Copy link
Member

eddyb commented Sep 26, 2016

@camlorn This is exactly the newtype unpacking optimization I had in mind, which requires writing a not-entirely-recursive algorithm to handle cycles. It's pretty nasty stuff and I was waiting for old trans to die.
@durka At least for the Rust default, I would support that, too, since it affects ~all smart pointers.

For everyone else: there's some confusion here: #[repr(C)] is not the requirement of the FFI warning.
It's "you must only use types which are either some primitive with a cross-ABI representation or an user-defined type marked with #[repr(C)]". This RFC expands that rule to recurse into newtypes marked with #[repr(transparent)].

The reason that doing it automatically is not enough, and which @nox should most definitely make it the prime motivation of this RFC is that depending on it leaks implementation details by default.

If I have:

pub struct Opaque(T);

Then why should Opaque be usable with FFI if T is a primitive, or some #[repr(C)] struct?
Why should that detail be exposed and depended upon, if that field is private?

This is why you need #[repr(transparent)], to indicate that ABI properties of the wrapped private type, although not the type itself, are exposed by the public structure.

@ahicks92
Copy link

So shouldn't we just do a proper newtype, then say that repr(transparent) is only allowed on newtypes?

If things like UnsafeCell need to be exposed through this method, then we change them to newtype in the standard library and put repr(transparent) on them. This shouldn't cause a backwards compatibility problem. As things stand, the case of UnsafeCell even being a newtype isn't documented by the std docs,

My problem here is that it feels like this takes a step in the direction of enshrining the single-field struct as a newtype forever. This is workable if we then do stuff like accept #1406 in some form, which may have some uses anyway. But it feels like this is one of those decision points: accepting this without a dedicated newtype makes it more likely that we never will.

Also, unless it's already named, I hereby define FFI-capable type to refer to any type which can be safely used with the C FFI. Run with it or not as you like.

@eddyb
Copy link
Member

eddyb commented Sep 26, 2016

@camlorn But these types with private fields are not newtypes in the classical sense: it's not just a type disambiguator, but the field is opaque modulo the ABI implications of #[repr(transparent)] (if present).

"FFI-capable" sounds good, cc @steveklabnik.

@nagisa
Copy link
Member

nagisa commented Sep 26, 2016

@nox

No, f64 is not #[repr(C)] explicitly but it is nonetheless usable in C functions.

f64 et al are POD and none of the repr stuff really applies to it.


@eddyb

leaks implementation details by default

Sure, nobody ought to rely on default behaviour if it involves repr(Rust) stuff, whereas repr(transparent) mandates a concrete behaviour, similarly to how repr(C) does…

@nox

That doesn't help to use UnsafeCell in a field of a #[repr(C)] structure to signal that the C side

…that being said, @camlorn’s critique seems very valid to me. There’s no reason why some alternative approach (e.g. newtype Peach<T> = Banana<T>?) couldn’t be functionally equivalent to #[repr(transparent)] struct Peach<T>(Banana<T>);.

At this point I would like to note that adding (in some cases) and removing (in all cases) repr(transparent) would become a breaking change. However, it being a plain attribute (OTOH, repr(C) shares the same problem) does not really signal that weight.

Interestingly, AFAIR rustdoc does not in any way make note of any representational particularities. These are part of the API and probably should be prominently presented to everyone interested. Perhaps fixing that would help to add that weight I was speaking of in the previous paragraph.


Bikeshed time. repr(newtype). Yay? Nay? Less to type, is clearer.

@arielb1
Copy link
Contributor

arielb1 commented Sep 26, 2016

@nagisa

You mean an ABI-breaking change? Isn't that the entire point of the repr attribute?

@eddyb

repr(Rust) is intentionally unspecified (but should be equivalent to repr(transparent) when that is valid). We want to hide that implementation detail.

OTOH, repr(transparent) only means that your type has the same ABI as the wrapped type - it's a specific kind of ABI declaration. IMO it should cause a compilation error if the wrapped type has more than 1 non-zero-sized fields.

@camlorn

The problem is that nobody agrees what a newtype is. Is an Rc a newtyped raw pointer? I can certainly imagine it being a #[repr(transparent)] wrapped pointer, for easier interoperability with C.

@eddyb
Copy link
Member

eddyb commented Sep 26, 2016

@nagisa @arielb1 I agree fully. While guaranteeing pub struct Foo(pub T); as having this behavior is technically possible, it's neither sufficient nor desirable (for the various reasons you've mentioned).

The RFC does need to account for all of these requirements which lead to #[repr(transparent)].

@ahicks92
Copy link

@arielb1
I can try to define newtype. Let the capability set of a type T be a set of all the logical actions a program can perform on T. A newtype U of some type T is a type whose capability set is a strict subset of T, plus any additional operations which can be written using only operations from the capability set of T. This may make sense only to me, is dense, and probably needs to be more rigorously defined and/or may have exceptions.

Under this definition, Rc is not a newtype because it also requires operations from the capability set of whatever type you use for the reference counted pointer in addition to the pointer itself.

Nevertheless, two things should be noted. A newtype meeting my definition should be adequate here because any way we do that is going to have to conceptually look like a struct with one field. And because newtypes are a subset, going from newtype Foo = Bar to some struct Foo or whatever isn't a breaking change, save that you have to lose repr(transparent) (which you would have to lose under this proposal anyway, should you suddenly need extra fields).

I don't have a problem with us keeping it as it is and never having newtype, but newtypes need machinery to forward some traits at least in order to be ergonomic. This machinery would have to apply to all structs and tuples without it. So either you gain the complexity of teaching people how to use newtype or the complexity of teaching people how to write good code that uses the trait forwarding machinery. I would personally prefer general-purpose trait forwarding machinery, but I can deal with a lot more complexity than most programmers. Consequently, I would begin by arguing against what I personally want in this case.

You could easily counter by saying "well, okay, trait forwarding will only work on single-field structs". Which is fine as far as it goes. But that's not a general feature, and I will argue for more general features over more specific ones. If we choose to leave newtypes as single-field structs then add newtype trait forwarding, someone has to answer the question of why we couldn't make it general.

All I'm trying to say is that I think it's necessary for someone with the power to make decisions to make a definitive decision as to what kind of newtypes we want longterm before we consider this RFC: single-field structs plus general-case machinery that affects more than newtypes or a dedicated mechanism for newtypes with newtype-specific functionality.

@nagisa
Copy link
Member

nagisa commented Sep 26, 2016

You mean an ABI-breaking change? Isn't that the entire point of the repr attribute?

With #[repr(transparent)] and it inheriting the repr(C)-ness it would become potentially API-breaking too.

@nox
Copy link
Contributor Author

nox commented Sep 26, 2016

@camlorn Replace every instance of "newtype" in that RFC by "structure with one field". This RFC doesn't need to be blocked by things that are completely unrelated like trait forwarding.

@ahicks92
Copy link

@nox
Got into a discussion on IRC about this. The problem is that I wasn't asking what the harm is in allowing repr(transparent) on single-field structs; instead, I was asking something along the lines of why we should allow it on such if it's only useful on newtype. The former question puts this in a much better light. Then possibly needing to allow this with PhantomData came up.

So, anyway, carry on.

@Mark-Simulacrum
Copy link
Member

Though the two are different in meaning as far as I can tell, it may be worth considering the similarity in naming to nrc's proposal for #[transparent] in #1744. (Just pointing it out, no suggestions).

@nrc nrc added the T-lang Relevant to the language team, which will review and decide on the RFC. label Sep 27, 2016
[summary]: #summary

Extend the existing `#[repr]` attribute on newtypes with a `transparent` option
specifying that the type representation is the representation of its only field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this RFC confusing, I think because I am unclear what you mean by representation. Rust uses interior allocation for structs (which while not guaranteed because of no formal ABI, you can definitely rely on). So in that respect, the in-memory representation does not change at all. But you seem to be proposing changing the how newtypes interact with FFI? It might be better to write the RFC with that focus.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#[repr(C)] can also influence how values are passed to and returned from FFI functions, notably struct Foo(f64) on ARM64.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @nrc here. Unless I'm confused, this has nothing to do with 'repr' but is purely a calling convention issue.

However, if you change the calling convention like this then it is incompatible with FFI. Could you possibly give a more detailed example how you would use this in a FFI case?

@nox
Copy link
Contributor Author

nox commented Sep 28, 2016

I'll write a FFI example, but calling convention is also affected by representation, cf. the example where f64 and Struct(f64) don't behave the same in C.

@nox
Copy link
Contributor Author

nox commented Sep 28, 2016

Added the PhantomData unresolved question and a paragraph explaining why it matters, with a f64 wrapper example.

@parched
Copy link

parched commented Sep 28, 2016

but calling convention is also affected by representation

or rather, calling convention is affected by the type, f64 and Struct(f64) have the same layout but are different types so (can) have different calling conventions.

Why not just implement std::convert::{From,Into} for your struct type?

@rfcbot
Copy link
Collaborator

rfcbot commented May 7, 2017

The final comment period is now complete.

@RalfJung
Copy link
Member

RalfJung commented Jun 24, 2017

An alternative that I expected to find discussed in the RFC but did not is to declare that in general, (i.e., no matter the repr), if a struct has a single non-ZST field, then it is guaranteed to have the same layout as that field. I would not be surprised if people are actually already relying on that.

In fact, #1789 does. That other RFC relies on Cell<T> having the same representation as T; it also sends a signal that we can actually rely on this kind of interconversion.

EDIT: I see this has been brought up in discussion, but not finally resolved yet. All right.

@Amanieu
Copy link
Member

Amanieu commented Jun 24, 2017

I agree with @RalfJung, I think this should be implicit as part of the Rust ABI. This would not apply to #[repr(C)] structs, which will have the layout of a C struct containing a single member. The main difference is that the latter will likely be passed as a function argument in a less efficient way.

@sfackler
Copy link
Member

@Amanieu the main motivation for this feature is FFI use cases where it is normally very bad to pass a repr(Rust) struct into C. It seems strange for it to be bad except for this one use case. A repr(transparent) allows the programmer to explicitly identify that they are defining a C struct with one field (#[repr(C)]), or just a transparent wrapper for the inner type #[repr(transparent)].

@aturon aturon merged commit 2c49bef into rust-lang:master Jul 3, 2017
@aturon
Copy link
Member

aturon commented Jul 3, 2017

Huzzah! This RFC has been merged. Tracking issue.

(Apologies for the delay!)

@briansmith
Copy link

Sorry, it's been forever since we had that conversation. Maybe I'm overlooking something, but I think #[repr(transparent, align=16)] would be very common. For example:

#[repr(transparent, align=16)]
struct Aes256Key([u8; 32]);

#[repr(transparent, align=16)]
struct AesNonce([u8; 16]);

// Keep this in sync with the C definition in aes.h
#[repr(C)]
struct Aes256Context {
    key: Aes256Key,
    ...
    nonce: AesNonce,
    ...
}

extern aes256Encrypt(context: &mut Aes256Context, ...);

I'm not sure how one would express this otherwise.

@briansmith
Copy link

(Sorry, the previous comment was intended for another issue.)

@arielb1
Copy link
Contributor

arielb1 commented Dec 16, 2017

@briansmith

What C declaration is this intended to match? Aka why can't you use

#[repr(C, align=16)]
struct AesNonce([u8; 16]);

@briansmith
Copy link

What C declaration is this intended to match? Aka why can't you use

#[repr(C, align=16)]
struct AesNonce([u8; 16]);

Because, for example, sizeof(AesNonce) isn't guaranteed to equal sizeof(AesNonce::0) because C allows the ABI to specify extra (even useless and arbitrary) padding at the end. More generally the're not guaranteed to have the same memory layout. See rust-lang/rust#43036 (comment)

@tbelaire
Copy link

tbelaire commented Jan 24, 2018

The RFC says that this should be illegal.

#[repr(transparent, align = "128")]
struct BogusAlign(f64);

But this should be legal

#[repr(transparent, align = "128")]
struct OverAligned(f64); // Behaves as a bare f64 with 128 bits alignment.

I think you've made an error somewhere.

@willmo
Copy link

willmo commented Jul 8, 2018

@tbelaire this RFC was amended to fix that in #2271.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-data-types RFCs about data-types A-machine Proposals relating to Rust's abstract machine. A-repr #[repr(...)] related proposals & ideas final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.