#[repr(C)] C-like enums and out of range values #41
Comments
With the current implementation this is UB - we emit I think that if we want to allow "invalid" values in enums, we need to think about it more - we e.g. probably want to allow things like comparisons. |
From the programmer's (and FFI bindings) perspective, it would be ideal if C-like enums with explicit |
C++ seems to treat out-of-range casts to enum as "unspecified values", which means that after the cast the result can be any valid value of the target range. Casting back to int does not have to yield the same result; it could return anything. I couldn't find the relevant part in the C standard. |
It says in Annex I of N1570 (C11) that an implementation may generate warnings when
I couldn't find any other mention anywhere, and I checked the lists of implementation-defined and undefined behaviors. |
@RalfJung incorrect - http://eel.is/c++draft/expr.static.cast#10
http://eel.is/c++draft/dcl.enum#8
(basically, this all says that the range of an enum is enough to hold all of the variants or'd together, for bitflags) |
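To make the "or'd together" reading concrete, here is a minimal sketch in Rust of that range computation. It assumes the simplest case only: a C++ enum without a fixed underlying type whose enumerators are all non-negative. The function name `largest_in_range` is hypothetical and is not taken from any standard or library.

```rust
/// Sketch of the range rule for a C++ enum WITHOUT a fixed underlying type,
/// restricted (as an assumption of this example) to non-negative enumerators.
/// Given the largest enumerator `emax`, returns the largest value in range:
/// the smallest number of the form 2^M - 1 that is >= emax.
pub fn largest_in_range(emax: u32) -> u32 {
    if emax == 0 {
        // Shifting by 32 would overflow, so handle the all-zero case directly.
        0
    } else {
        // Keep exactly as many low bits set as `emax` needs.
        u32::MAX >> emax.leading_zeros()
    }
}
```

For bitflags with enumerators 1, 2 and 4, the largest enumerator is 4 and the function returns 7, which equals `1 | 2 | 4`. A value such as 8 would fall outside that range.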
I stand corrected. And it seems the rules changed, because the one I quoted above still says out-of-range is unspecified value, not UB. That was C++14, your link seems to be a draft of C++17. Thanks! |
AFAIU in C it's not undefined behaviour (or implementation-defined behaviour) to use values that are in range of the underlying integer type but not part of the enum. An enum there is basically just syntactic sugar around integers, giving names to specific ones. Compiler warnings for using values outside the range of the enum were only added relatively recently (a few years ago), too. For C++ the story seems to be a bit different, as @ubsan is quoting above. |
It's also worth noting that |
I consider the Rust FFI equivalent of a C enum to be a newtype integer with a set of constants. All sorts of "semi-invalid" values are basically armed bombs waiting to UB up your program, to be touched very carefully, but with C-style enums, out-of-range values are not "invalid" in any way or form - using an "old" header with a "new" library is a to-be-explicitly-supported use case. |
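A minimal sketch of the "newtype integer with a set of constants" pattern described above. The type `Color`, its constants and the `name` helper are hypothetical names chosen for illustration, and the sketch assumes the C side uses a plain `int`-sized enum.

```rust
use std::os::raw::c_int;

/// Hypothetical stand-in for a C `enum color { RED, GREEN, BLUE };`.
/// `repr(transparent)` gives it the same layout as the wrapped integer,
/// and every `c_int` bit pattern is a valid value of the type.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Color(pub c_int);

impl Color {
    pub const RED: Color = Color(0);
    pub const GREEN: Color = Color(1);
    pub const BLUE: Color = Color(2);

    /// Returns a name for the values these bindings know about,
    /// and `None` for anything a newer library version may have added.
    pub fn name(self) -> Option<&'static str> {
        match self {
            Color::RED => Some("red"),
            Color::GREEN => Some("green"),
            Color::BLUE => Some("blue"),
            _ => None,
        }
    }
}
```

With this shape, a value such as `Color(7)` coming back from a newer C library is just an unrecognised number: it can be compared, copied and passed back without any undefined behaviour, and `Color(7).name()` yields `None`. The cost is that `match` on it is never exhaustive without a wildcard arm.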
I guess one question is whether |
Or it could piggyback on the non-exhaustive enum RFC that's currently (hopefully) in the process of being accepted. We could define that |
What's the exact meaning of |
@sdroege as |
While that makes sense, it seems like a potential footgun, at least I expected it to also behave like in C :) Using |
As long as |
The question is -- how does it affect representation in memory? C doesn't have tagged unions, so "this is laid out like C would" doesn't make any sense for |
@sdroege you seem to misunderstand the section I quoted - this is only for enums with no fixed underlying type. Any enums with fixed underlying type are treated as if they are newtyped integers. @RalfJung I agree. I don't think people should be using |
I believe that the original question stated:
which we only accept if none of the variants in the enum have data associated with them. So @RalfJung this question:
is sort of beside the point, I think. That said, I think I would agree with @arielb1's original comment:
In other words, this is an error today, which presumably implies some speed win in some scenarios, and I see no reason not to keep this as UB. Enums in Rust are more meaningful than in C, where they are often used (rightly or wrongly) as glorified integers. I think that's ok. |
(But I do think we need to arrive at some better set of guiding principles by which to make such decisions.) |
I stand by my earlier comment. It's very hazardous to use C-like enums in FFI, and the danger is immensely counterintuitive to boot. Saying "eh, well, it shouldn't be used that way" is just hiding head in sand. The issue exists and it needs to be fixed somehow, even if just by finding a way to teach everyone the proper way to do enums in FFI. Personally, I'd consider pretty much any other alternative easier to achieve. |
I also disagree: if you see a repr(C) enum you would intuitively assume that it behaves like an enum in C. If it doesn't (i.e. is not a glorified integer), that is surprising, can result in hard-to-track-down bugs and, maybe worse, makes repr(C) enums basically useless. What else than C FFI would you use it for? For all the other cases you can just use plain enums, or any of the other repr(X) variants if you want to ensure that it's stored as some kind of integer. IMHO it would make sense to deprecate repr(C) enums in that case, and to make defining them a compiler warning at least. |
Is it possible to mark repr(C) enums as unsafe, so access to them will be allowed in unsafe blocks only, but keep C-like behaviour in unsafe blocks? Or maybe we should add This behaviour affects Prost: the i32 type is used instead of the enum type, because Protobuf defines that an enum variable must be able to hold values outside of the enum range to be compatible with future versions. As an alternative solution, it's proposed to use In turn, this problem hurts me as a user of Prost. I want to have an easy-to-use interface, out-of-the-box serialization/deserialization to JSON, and so on. |
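One way to keep a Rust enum while still tolerating values from future versions is to store the raw integer and convert it with a check. The following is a rough sketch of that idea only; it is not Prost's actual API, and `Status`, `Wire` and `decode` are hypothetical names invented for the example.

```rust
use std::convert::TryFrom;

/// Hypothetical enum as it might be declared in a schema.
#[repr(i32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Unknown = 0,
    Active = 1,
    Retired = 2,
}

impl TryFrom<i32> for Status {
    /// On failure, hand back the raw value that was not recognised.
    type Error = i32;

    fn try_from(raw: i32) -> Result<Self, Self::Error> {
        // Checked conversion: no transmute, so an invalid discriminant
        // is never materialised as a `Status`.
        match raw {
            0 => Ok(Status::Unknown),
            1 => Ok(Status::Active),
            2 => Ok(Status::Retired),
            other => Err(other),
        }
    }
}

/// Either a variant this build knows, or the raw number preserved as-is
/// so it can be written back out unchanged.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Wire {
    Known(Status),
    Unrecognized(i32),
}

/// Decodes a raw integer without ever producing an out-of-range enum value.
pub fn decode(raw: i32) -> Wire {
    match Status::try_from(raw) {
        Ok(status) => Wire::Known(status),
        Err(value) => Wire::Unrecognized(value),
    }
}
```

Here `decode(1)` gives `Wire::Known(Status::Active)`, while `decode(42)` gives `Wire::Unrecognized(42)`. The trade-off is an extra layer in the interface in exchange for never holding an enum with an invalid discriminant.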
Was there any update on this? It should probably also be mentioned in the unsafe code guidelines in the enums section. Edit: Created an issue there rust-lang/unsafe-code-guidelines#137 |
This is basically subsumed by rust-lang/unsafe-code-guidelines#69: any enum value must have a valid discriminant value. |
@sdroege asked me the following on IRC, and I .. wasn't sure of the answer:
I wasn't sure what I thought the answer should be.