Description
Background
In 0.15.1, Zig introduced new rules for arithmetic on undefined. So far these rules have been specified and implemented for most integer and float operations in #23177 and #24674. However, the langref states:
packed structs, like enum, are based on the concept of interpreting integers differently. All packed structs have a backing integer, which is implicitly determined by the total bit count of fields, or explicitly specified. Packed structs have well-defined memory layout - exactly the same ABI as their backing integer.
Field access and assignment can be understood as shorthand for bitshifts on the backing integer.
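The langref statement quoted above can be illustrated with a small sketch. The type name Flags and its field layout are made up for this example; the only language behaviour relied on is that packed struct fields are laid out starting from the least significant bit of the backing integer.

```zig
const std = @import("std");

// Hypothetical example type: 3 + 5 bits, backed explicitly by a u8.
const Flags = packed struct(u8) {
    a: u3, // occupies bits 0..2 (least significant)
    b: u5, // occupies bits 3..7
};

test "field reads are shorthand for shifts on the backing integer" {
    const f: Flags = .{ .a = 5, .b = 17 };
    const bits: u8 = @bitCast(f);

    // The backing integer is the fields concatenated, low bits first.
    try std.testing.expectEqual(@as(u8, 5 | (17 << 3)), bits);

    // Reading a field is equivalent to shifting and truncating.
    try std.testing.expectEqual(f.a, @as(u3, @truncate(bits)));
    try std.testing.expectEqual(f.b, @as(u5, @truncate(bits >> 3)));
}
```

Run with `zig test` on a file containing the snippet.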
This means that packed structs/unions are closely connected to their backing integer and must also obey these new semantics. At runtime, they effectively already behave the same as integers because they literally are, but at comptime they're currently much closer to structs than to integers.
The problem described in the accepted #24657 has so far been a major blocker for this proposal because some pointers can be comptime-known without their actual address being available at comptime (see the linked issue for more details). This means that packed types could not be modeled as actual integers at comptime but rather as structs with distinct fields to hold references to not-yet-resolvable values.
Now that that's no longer a problem (at least as soon as #24657 is implemented), packed types can trivially adopt the same semantics as integers.
Proposal
packed structs shall interact with undefined according to the following rules:
- any field is/becomes undefined -> the entire packed struct is/becomes undefined
- any operation on an undefined packed struct (i.e. setting fields) yields an undefined packed struct
- initialization needs to provide defined values for all fields, otherwise the result is undefined
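The three rules above can be sketched as a small model. This is not how the compiler represents values; it is an illustration in which null stands in for undefined, and the names ModelFlags, init and setA are hypothetical. The modeled layout is a u3 field a in the low bits and a u5 field b in the high bits of a u8.

```zig
const std = @import("std");

// Hypothetical model of the proposed rules: `null` plays the role of `undefined`.
const ModelFlags = struct {
    bits: ?u8, // the whole backing integer is either defined or not

    // Rule 3: initialization must provide defined values for all fields.
    fn init(a: ?u3, b: ?u5) ModelFlags {
        const av = a orelse return .{ .bits = null };
        const bv = b orelse return .{ .bits = null };
        return .{ .bits = @as(u8, av) | (@as(u8, bv) << 3) };
    }

    // Rules 1 and 2: an undefined operand or an undefined new field value
    // makes the entire result undefined.
    fn setA(self: ModelFlags, a: ?u3) ModelFlags {
        const old = self.bits orelse return .{ .bits = null };
        const av = a orelse return .{ .bits = null };
        return .{ .bits = (old & 0b1111_1000) | av };
    }
};

test "undefined is all-or-nothing in the model" {
    // A single undefined field poisons the whole value.
    try std.testing.expectEqual(@as(?u8, null), ModelFlags.init(1, null).bits);
    // Setting a field on an undefined value does not make it defined.
    try std.testing.expectEqual(@as(?u8, null), ModelFlags.init(null, 2).setA(7).bits);
    // Fully defined input gives a fully defined backing integer.
    try std.testing.expectEqual(@as(?u8, 141), ModelFlags.init(5, 17).bits);
}
```

In this model there is no state in which some bits are defined and others are not, which is the property the proposal asks for.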
#19754 has made these semantics trivial for packed unions as well.
Justification
These rules result in very clear semantics: packed types just behave like fancy integers and operations on them are effectively syntax sugar for bitwise manipulations, which is in line with what the langref already states anyway.
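As a minimal sketch of the "syntax sugar" reading, a field assignment can be compared with the equivalent mask-and-or update on the backing integer. The type Flags and the chosen values are made up for illustration.

```zig
const std = @import("std");

// Hypothetical example type, backed explicitly by a u8.
const Flags = packed struct(u8) {
    a: u3, // bits 0..2
    b: u5, // bits 3..7
};

test "field assignment is a masked update of the backing integer" {
    // Update through the packed struct.
    var f: Flags = .{ .a = 1, .b = 2 };
    f.b = 9;

    // The same update written as bitwise manipulation.
    var bits: u8 = @bitCast(Flags{ .a = 1, .b = 2 });
    bits = (bits & 0b0000_0111) | (@as(u8, 9) << 3); // clear b, then insert 9

    try std.testing.expectEqual(bits, @as(u8, @bitCast(f)));
}
```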
The semantics of @bitCast become obvious: the result is either 100% defined or 100% undefined, no ambiguities (#19755).
They're also more consistent with runtime behaviour: sanitizers don't have the concept of partially defined ints anyway, so whether a packed struct can be partially undefined would currently have to depend on its field sizes/alignments (not good).
Implementation-wise, the representation of packed types as integers, the detection of undefined values at comptime and the lowering of packed types all become trivial.
Also, these changes shouldn't have any significant overhead. packed types tend to be rather small anyway, so interning every permutation as an integer probably won't use more memory than interning every field individually, especially since interning comes with a certain overhead for very small types (e.g. bitflags or enums with few members).
packed container size is also capped at 64KiB so the worst case likely isn’t even that bad.
They will also be able to share interned values with regular integers, further reducing their memory footprint.
If undefined tracking at a bit level ever gets introduced, these semantics should be changed accordingly.