Implement unsized unions #47650
rust-highfive assigned eddyb Jan 22, 2018
eddyb reviewed Jan 22, 2018
@@ -141,7 +141,7 @@ impl<'a, 'gcx> CheckTypeWellFormedVisitor<'a, 'gcx> {
         self.check_variances_for_type_defn(item, ast_generics);
     }
     hir::ItemUnion(ref struct_def, ref ast_generics) => {
-        self.check_type_defn(item, true, |fcx| {
+        self.check_type_defn(item, false, |fcx| {
eddyb (Member) commented Jan 22, 2018
What's this argument and can it be removed from check_type_defn's definition (i.e. are there any non-false calls)?
mikeyhew (Author, Contributor) commented Jan 22, 2018
The argument is called all_sized and is currently true for enums
eddyb reviewed Jan 22, 2018
    |
    = help: the trait `std::marker::Sized` is not implemented for `T`
    = help: consider adding a `where T: std::marker::Sized` bound
-   = note: no field of a union may have a dynamically sized type
+   = note: only the last field of a union may have a dynamically sized type
eddyb (Member) commented Jan 22, 2018
This doesn't seem right. union doesn't have ordered fields. Maybe require that at most one field is unsized?
mikeyhew (Author, Contributor) commented Jan 22, 2018
It's not really right, I just did it this way for now to avoid having to refactor things. I can make it more general if I must :)
petrochenkov self-assigned this Jan 22, 2018
petrochenkov added the S-waiting-on-review label Jan 22, 2018
petrochenkov removed their assignment Jan 24, 2018
main-- commented Jan 26, 2018
Is there a specific reason why this should only be allowed for a single field? Just today I wanted to create a union with two unsized fields. I'm working with PostgreSQL, where this "varlena" format is how they represent variable-length data. This code is an exact translation of the C code I'm interacting with (of course the extern type is a 0-sized array there). I want to build
Wait, is
@eddyb the metadata is created during unsizing, and the actual value of the field being unsized is not needed – only the type. Once unsized, the metadata cannot change, and the unsized union field cannot be written to. There is no way of knowing which field type is active, so the return value of There are currently no types in Rust for which this isn't true, but there may be in the future, and there would have to be some way of disallowing calls to Single-field unions like
kennytm added the T-lang label Jan 27, 2018
Interesting question @eddyb. As @main-- asked, what happens if we support unions with ≥2 DST fields?
union U {
    slice: [u8],
    trobj: dyn Debug,
}
The problem is how we represent the metadata:
I think supporting unions with ≥2 DST fields requires an RFC to define the proper semantics, and is thus best avoided in this PR.
@mikeyhew Okay so what if I have this:
union OneOrManyBytes {
    one: u8,
    many: [u8]
}
size_of_val(&OneOrManyBytes { one: 0 })
Can I Just never create this? Does that mean a
main-- commented Jan 27, 2018
Given that rust-lang/rfcs#1897 wants to prohibit unsized unions altogether, that does seem a little strange. At least to me it was simple and obvious/intuitive that an unsized enum would represent its metadata as a union of the metadata of its members. Clearly, the unsized metadata of a sized type is empty (union). I understand that this probably warrants an RFC though. I don't think a union can ever logically be DynSized, given that you can never know in advance which variant is active. Even if only one field is unsized, the size metadata is garbage if a sized variant is active - I feel like the concept is just ill-defined like that.
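To make the "you can never know in advance which variant is active" point concrete, here is a minimal sketch using a sized stand-in, since unsized unions do not compile today. `OneOrFourBytes` and `first_byte` are hypothetical names, and the `[u8; 4]` field is an assumption standing in for an unsized `[u8]`.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical sized stand-in for a "one byte or many bytes" union.
// `#[repr(C)]` places every field at offset 0.
#[repr(C)]
union OneOrFourBytes {
    one: u8,
    many: [u8; 4],
}

// The union stores no tag, so the caller has to know which field is active.
// Reading `one` is fine in this sketch because both fields begin with an
// initialized `u8` at offset 0.
fn first_byte(u: &OneOrFourBytes) -> u8 {
    unsafe { u.one }
}

fn main() {
    let small = OneOrFourBytes { one: 7 };
    let large = OneOrFourBytes { many: [1, 2, 3, 4] };

    // The size comes from the type alone and does not depend on the active field.
    assert_eq!(size_of::<OneOrFourBytes>(), 4);
    assert_eq!(size_of_val(&small), size_of_val(&large));

    assert_eq!(first_byte(&small), 7);
    assert_eq!(first_byte(&large), 1);
}
```

For a sized union this is harmless because the size is a compile-time constant. The concern in this thread is that an unsized union would have to answer the same question from pointer metadata that may describe an inactive field.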
So the only way to get something useful is to borrow one of the fields, which would extract the metadata appropriate for that field, right? This definitely needs cc @nikomatsakis I like the idea of having a union of metadata but it should fit with custom DSTs.
It could be DynSized if the metadata is a struct, i.e. you provide the metadata for all fields whether or not they are active, e.g.
union U {
    a: [u8],
    b: [u16],
}
=>
struct U::Meta {
    a: usize,
    b: usize,
}
then
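One way to read the struct-of-metadata idea is sketched below. This is not an existing compiler feature: `UMeta` and `size_from_meta` are hypothetical names, and the rule used here (largest field size, rounded up to the union's alignment) is an assumption rather than something stated in the thread.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for `U::Meta`: one length per unsized field,
// recorded whether or not that field is currently active.
#[derive(Clone, Copy, Debug)]
struct UMeta {
    a: usize, // element count for `a: [u8]`
    b: usize, // element count for `b: [u16]`
}

// Assumed rule: the union must be able to hold whichever field is active,
// so take the largest field size and round it up to the union's alignment.
fn size_from_meta(meta: UMeta) -> usize {
    let size_a = meta.a * size_of::<u8>();
    let size_b = meta.b * size_of::<u16>();
    let align = align_of::<u8>().max(align_of::<u16>());
    let largest = size_a.max(size_b);
    (largest + align - 1) / align * align
}

fn main() {
    // 5 bytes for `a` versus 2 bytes for `b`: 5 rounded up to alignment 2.
    assert_eq!(size_from_meta(UMeta { a: 5, b: 1 }), 6);
    // 3 bytes for `a` versus 8 bytes for `b`.
    assert_eq!(size_from_meta(UMeta { a: 3, b: 4 }), 8);
}
```

Under this reading the size can be computed from metadata alone, without knowing which field is active, at the cost of a wider pointer.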
@kennytm How do you initialize those metadata fields? Especially when trait objects are involved.
@eddyb I think it can only be initialized through unsize coercion; or require that there is a type (
@kennytm Oh so the unsize coercion would have to do multiple fields in parallel or not be allowed at all, that makes a bit more sense.
FWIW, rust-lang/rfcs#1897 doesn't try to prohibit unsized unions (they are already "prohibited" aka not supported), it just doesn't try to introduce them (they are listed in future directions).
I agree, single-field unions are all we need for
Regarding unions with metadata for each unsized field:
Theoretically you could unsize one field, and then unsize the other – after the first coercion, you'd have pointer metadata for one field, and after the second coercion, metadata for both.
@mikeyhew Right, by "in parallel" I meant independently, which means you could allow multiple at once or one at a time, but you'd still be able to unsize more than one field (kind of neat, huh?).
@eddyb oh
@mikeyhew I did mean that, originally, but the important bit is that they're independent.
Huh. Re-reading this thread, I'm feeling a bit lost. I certainly agree that the concept of an "unsized union" with multiple fields has a lot of question marks.
I don't get it. =) How would you unsize multiple fields independently? Would they have to have compatible metadata?
@nikomatsakis The context is choosing
@nikomatsakis Just clarifying #47650 (comment) since the thread seems to start from my comment. Unsizing will never modify data. It just generates the necessary metadata from the original sized type, and reinterpret-casts the data into the DST. So unsizing a
The actual representation of (I don't think we should support this in this PR.)
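The "unsizing never modifies data" claim can be checked on stable Rust with an ordinary array-to-slice coercion. This is only an illustrative sketch: `same_address_after_unsizing` is a hypothetical helper name, and it covers slices, not the unsized unions proposed in this PR.

```rust
use std::mem::size_of_val;

// Coerces `&[u8; 4]` to `&[u8]` and reports whether the data address changed.
fn same_address_after_unsizing(sized: &[u8; 4]) -> bool {
    let before = sized as *const [u8; 4] as *const u8;
    // Unsizing coercion: the length 4 is taken from the type `[u8; 4]`,
    // not from the bytes stored in the array.
    let slice: &[u8] = sized;
    let after = slice.as_ptr();
    before == after
}

fn main() {
    let bytes = [1u8, 2, 3, 4];
    assert!(same_address_after_unsizing(&bytes));

    // After the coercion, the size is recovered from the pointer metadata.
    let slice: &[u8] = &bytes;
    assert_eq!(slice.len(), 4);
    assert_eq!(size_of_val(slice), 4);
}
```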
Hmm, maybe I was thinking about it wrong. I guess that unsizing doesn't require that the actual data be valid, at least not currently. That is, But this may not be true once we hit custom dst, right? I guess it depends on just how the trait looks.
Wait. With the custom DST proposal,
I think by this you mean: unless it can compute the size without a reference to the actual data, i.e., purely from the metadata...right? If so, that is precisely what I was trying to get at, yeah. =)
Ping from triage @mikeyhew! We haven't heard from you in a while, will you have time to work on this in the near future?
This is now on the unsafe code guidelines agenda: https://internals.rust-lang.org/t/proposal-reboot-the-unsafe-code-guidelines-team-as-a-working-group/7307
@pietroalbini thanks for the ping, and sorry for letting this slide. I'm still interested in finishing this, if it is desired.
I wasn't at the meeting, so I'm not sure what specific issues were discussed around unsized unions or how unions are dropped. But would it be OK to implement unsized unions anyway, with the intention of keeping them unstable at least until the details have been figured out?
The issue is that we may not want to make any restrictions for the bit patterns that are valid for a union. If we adopt that policy, then unsized unions with thin pointers do not work as they have to always be able to determine the size by dereferencing the union. This is not related to dropping.
I have no idea who's even in charge of such decisions. ;) They certainly have to be unstable at first. Is there an RFC for this feature? The unions RFC does not mention "unsized" at all.
RalfJung referenced this pull request Apr 17, 2018
Untagged unions (tracking issue for RFC 1444) #32836 (Open)
I don't think that this PR puts us into any corners that we can't get out of. My guess is that unions with thin-pointer DST fields would be completely unsized by default, since as you said you have to dereference the union to read the metadata and get the size/alignment of the union field. This PR does not support thin-pointer DSTs, but it doesn't rule them out either; it just supports the dynamically-sized types that currently exist in Rust, where the size and alignment are determined from the pointer metadata.
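For the dynamically-sized types that exist today, size and alignment can indeed be read through the fat pointer alone. The sketch below illustrates that with `std::mem::size_of_val` and `std::mem::align_of_val`. `size_and_align` is a hypothetical helper, and the two-word width of fat pointers is how current rustc lays them out, not a documented guarantee.

```rust
use std::fmt::Debug;
use std::mem::{align_of, align_of_val, size_of, size_of_val};

// Both numbers are looked up through the vtable half of the fat pointer.
// The concrete type behind `value` is not known statically here.
fn size_and_align(value: &dyn Debug) -> (usize, usize) {
    (size_of_val(value), align_of_val(value))
}

fn main() {
    assert_eq!(size_and_align(&0u8), (1, 1));
    assert_eq!(size_and_align(&[0u8; 3]), (3, 1));
    assert_eq!(size_and_align(&0u64), (8, align_of::<u64>()));

    // For slices the metadata is the element count.
    let words: &[u16] = &[1, 2, 3];
    assert_eq!(size_of_val(words), 6);

    // Pointers to these DSTs carry one extra word of metadata.
    assert_eq!(size_of::<&[u16]>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&dyn Debug>(), 2 * size_of::<usize>());
}
```

A thin-pointer DST, by contrast, would keep this information next to the data, which is why the discussion here treats it as a separate case.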
Interesting. I don't think this alternative came up in the discussions so far. But wouldn't that make them fairly useless?
@RalfJung by themselves, yeah. But AFAIK there is no safe way to get the size of such a union, so that's what it would have to be. What were the other options that came up?
The idea so far was that if DSTs work in unions, then they'd fully work. So This is in conflict with the idea that a union is just a bag of bytes and makes no assumptions about its contents being valid.
And keep in mind, with custom DST you could still implement
OK, I see. It looks like the conflict lies in one statement saying that the vtable will always be valid, and the other saying that no fields are guaranteed to be valid. That works for normal trait objects, because the vtable is stored in the pointer metadata, but not for thin trait objects, where the vtable is stored with the actual data. I always thought that at least one union field had to be active (and therefore valid). Is there an advantage to that not being the case?
This is a discussion we are currently having in another thread.
shepmaster added S-blocked and removed S-waiting-on-author S-blocked labels Apr 30, 2018
pietroalbini added S-blocked and removed S-blocked labels May 14, 2018
kennytm added the S-waiting-on-team label May 15, 2018
pietroalbini added S-blocked and removed S-blocked labels May 28, 2018
TimNN added A-allocators and removed A-allocators labels Jun 5, 2018
Ping from triage @RalfJung! What's the outcome of that discussion?
I don't think there is an outcome yet... Someone will have to make an RFC for settling this, I think.
Should this PR be closed in the meantime then?
I think that's for the lang team to decide.
stokhos commented Jun 29, 2018
Ping from triage @rust-lang/lang, will someone have time to review this PR?
mikeyhew commented Jan 22, 2018
r? @eddyb
This allows unions to have at most one unsized field/variant, in order to allow ManuallyDrop<T> where T: ?Sized (see #47034). For now, I made it so that the unsized field has to be the last field, which makes the implementation simpler because there are a lot of similarities to structs.
cc #32836
Questions for reviewers:
Sized bound on ManuallyDrop in this PR? That would be insta-stable, so would require an FCP.