Define `_BitInt` ABI #175

jsm28 · 2022-10-25T20:22:33Z

C23 (most recent public draft: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3054.pdf ) defines "bit-precise integer types": types _BitInt(N) and unsigned _BitInt(N) with a given width. The Arm ABI (both AAPCS32 and AAPCS64) needs to define the ABI for these types; see the x86_64 ABI https://gitlab.com/x86-psABIs/x86-64-ABI for an example.

This means specifying the size, alignment and representation for objects of those types (including whether there are any requirements on the values of padding bits in the in-memory representation), and the interface for argument passing and return (including, again, any requirements on padding bits - both padding bits up to the size of an object of that type, and any further padding beyond that within the size of a register or stack slot used for argument passing or return).

`_BitInt(N)` and `unsigned _BitInt(N)` are new integral types added for C23. These types have a bit width of `N` and define a different type for each `N`. Here we define the language mapping between these language types and the machine data types used throughout AAPCS64 through which our calling standard is defined. Along with the AAPCS32 commit, this closes ARM-software#175. The rationale for the choices in this patch is presented in a separate commit with a rationale document.

Some points of note around the language in this commit:

-- This commit does not update any wording around bit-fields. The current wording under "Bit-fields subdivision" mentions that "A member of an aggregate that is a Fundamental Data Type may be subdivided into bit-fields". I do not believe this needs updating. While a _BitInt(N>128) can be subdivided into a bit-field at the *language level*, once it has been converted to a Machine Type such a subdivision would look like some number of quad-words which have not been subdivided (either fully used or fully unused) and either one or zero quad-words which have been subdivided. The alignment requirements of the _BitInt are also the same as the alignment requirements of this fundamental data type of a quad-word, which this paragraph uses to explain the resulting alignment requirements on the aggregate containing a bit-field.

In the C/C++ language mapping description of "Bit-fields" we mention that a bit-field may have any integral type, and since a bit-precise integer is an integral type this still holds. The explanation of where a field can be placed in this section relies on *alignment* requirements of the field type. For _BitInt this lines up with the discussion on *fundamental data types* at the machine level, since the alignment requirements of a _BitInt are those of the "chunk" it is made up of, which is the fundamental data type of a quad-word. Hence I believe the current language does not need updating for bit-precise integers.

-- _BitInt(N > 128) types and Homogeneous Aggregates. With this changeset, _BitInt(N>128) types are treated as arrays of __int128 values. Hence at the machine level they would be a Homogeneous Aggregate of quad-words. The wording in section 5.9.5 currently mentions "uniquely addressable Members". I am not sure what this is referring to, but would expect it refers to addressable members of the base type. If that is the case then I don't believe anything needs to be updated. If it were referring to addressable members at the language level (which would be strange given the context) then this may need updating, since one language-level _BitInt(256) type would not equate to one Fundamental Data Type.

-- Combination of unspecified bits in _BitInt and C.16 in PCS rules. The mapping this commit defines from a _BitInt to a Machine Type specifies that the bits of the relevant Machine Type that are unused in a _BitInt(N) have unspecified value. The rules C.12 and C.16 of our parameter passing standard specify that when there are unused bits of a structure and/or Integral fundamental data type that are passed in registers, those unused bits are unspecified. The combination of these two rules means that e.g. when passing a _BitInt(2) across a PCS boundary in a register, bits [2-63] inclusive are unspecified.

-- In-memory and in-register representations match. This commit only specifies the mapping from language-level type to machine type. The machine type is then treated as it currently is in memory and in registers.
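
As an illustration only (not part of the commit), the "array of quad-words" layout described above can be sketched as plain arithmetic. The helper names `bitint_quadwords` and `bitint_unused_bits` are hypothetical, and the sketch assumes a `_BitInt(N > 128)` occupies the smallest whole number of 128-bit quad-words that holds `N` bits.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the AAPCS64 text: number of 128-bit
   quad-words needed to hold n_bits value bits when a _BitInt(N > 128) is
   laid out as an array of quad-words, as the commit message describes. */
static size_t bitint_quadwords(size_t n_bits)
{
    return (n_bits + 127) / 128;
}

/* Bits left over in the last quad-word. Per the commit message, the value
   of these unused bits is unspecified. */
static size_t bitint_unused_bits(size_t n_bits)
{
    return bitint_quadwords(n_bits) * 128 - n_bits;
}
```

Under these assumptions a `_BitInt(257)` would take three quad-words, with 127 bits of the last one unused, while a `_BitInt(256)` would fill two quad-words exactly.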

`_BitInt(N)` and `unsigned _BitInt(N)` are new integral types added for C23. These types have a bit width of `N` and define a different type for each `N`. Here we define the language mapping between these language types and the machine data types used throughout AAPCS32 through which our calling standard is defined. Along with the AAPCS64 commit, this closes ARM-software#175. The rationale for the choices in this patch is presented in a separate commit with a rationale document.

Some points of note around the language in this commit:

-- This commit does not update any wording around bit-fields. The current wording under "Bit-fields" section 5.3.4 mentions that "A member of an aggregate that is a Fundamental Data Type may be subdivided into bit-fields". I do not believe this needs updating. While a _BitInt(N>64) can be subdivided into a bit-field at the *language level*, once it has been converted to a Machine Type, such a subdivision would look like some number of double-words which have not been subdivided (either fully used or fully unused) and either one or zero double-words which have been subdivided. The alignment requirements of the _BitInt are also the same as the alignment requirements of this fundamental data type of a double-word, which this paragraph uses to explain the resulting alignment requirements on the aggregate containing a bit-field.

In the C/C++ language mapping description of "Bit-fields" we mention that a bit-field may have any integral type, and since a bit-precise integer is an integral type this still holds. The explanation of where a field can be placed in this section relies on *alignment* requirements of the field type. For _BitInt this lines up with the discussion on *fundamental data types* at the machine level, since the alignment requirements of a _BitInt are those of the "chunk" it is made up of, which is the fundamental data type of a double-word. Hence I believe the current language does not need updating for bit-precise integers.

-- _BitInt(N > 64) types and Homogeneous Aggregates. With this changeset, _BitInt(N>64) types are treated as arrays of uint64_t values. Hence at the machine level they would be a Homogeneous Aggregate of double-words.

-- Combination of unspecified bits in _BitInt and B.2 in PCS rules. The mapping this commit defines from a _BitInt to a Machine Type specifies that the bits of the relevant Machine Type that are unused in a _BitInt(N) have unspecified value. Rule B.2 of our parameter passing standard specifies that when there are unused bits in an integral Fundamental Data Type that is passed in registers, those unused bits are zero- or sign-extended to a full-word. The combination of this rule with the fact that _BitInt types are zero- or sign-extended to the Fundamental Data Type which they are passed in means that e.g. when passing a _BitInt(2) across a PCS boundary in a register, bits [2-63] inclusive are sign-extended.

-- In-memory and in-register representations match. This commit only specifies the mapping from language-level type to machine type. The machine type is then treated as it currently is in memory and in registers.
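
The zero- or sign-extension that rule B.2 is said to require can be sketched as follows. This is a rough illustration, not text from the commit: the helper names are hypothetical, and the sketch assumes a 32-bit word as the container (the same arithmetic applies to a wider container).

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low n bits (1 <= n <= 32) of raw
   to a full 32-bit word, ignoring whatever was in the upper bits. */
static uint32_t sign_extend_word(uint32_t raw, unsigned n)
{
    if (n >= 32)
        return raw;
    uint32_t mask = (UINT32_C(1) << n) - 1;   /* low n bits set */
    uint32_t sign = UINT32_C(1) << (n - 1);   /* sign bit of the n-bit value */
    uint32_t v = raw & mask;
    return (v ^ sign) - sign;                 /* propagate the sign bit upwards */
}

/* Hypothetical helper: zero-extend the low n bits (1 <= n <= 32) of raw. */
static uint32_t zero_extend_word(uint32_t raw, unsigned n)
{
    if (n >= 32)
        return raw;
    return raw & ((UINT32_C(1) << n) - 1);
}
```

For a signed `_BitInt(2)` holding the bit pattern `0b11` (the value -1), the sign-extended word is all ones; for an `unsigned _BitInt(2)` with the same pattern, the zero-extended word is 3.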

jakubjelinek · 2023-05-26T17:02:23Z

Do you want to use 64-bit limbs even on 32-bit ARM? I guess that could be a challenge on the libgcc side: for example, my https://gcc.gnu.org/PR102989 WIP patch uses umul_ppmm for the limbbits * limbbits -> 2 x limbbits multiplication, which longlong.h defines for 32-bit arm only when W_TYPE_SIZE == 32.
And, for big endian, will you use in memory big-endian ordering of the limbs (first limb the most significant) or little endian (first limb the least significant, but bits within the limb big-endian)?

mmalcomson · 2023-05-30T11:30:48Z

@jakubjelinek As it stands yes, we were thinking of using 64-bit limbs for 32-bit ARM (and 128-bit limbs for 64-bit).
The idea being that this way _BitInt(64) would match int64_t on AArch32 and _BitInt(128) would match __int128 on AArch64.

W.r.t. the memory ordering we're currently suggesting the use of big-endian memory ordering of the limbs for big-endian systems, but AFAIR we don't have a strong rationale for this decision, so if you have any feedback here that would be welcome.
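
To make the two limb orderings under discussion concrete, here is a minimal sketch. The helper `limb_memory_index` is hypothetical and only models where a limb of a given significance sits in memory; it says nothing about the byte order inside a limb.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: memory index of the limb with significance rank
   `rank` (0 = least significant) in a value made of `nlimbs` limbs.
   With big-endian limb ordering the most significant limb comes first;
   with little-endian limb ordering the least significant limb comes first. */
static size_t limb_memory_index(size_t rank, size_t nlimbs,
                                bool big_endian_limb_order)
{
    return big_endian_limb_order ? nlimbs - 1 - rank : rank;
}
```

With three limbs, the least significant limb would be at index 2 under big-endian limb ordering and at index 0 under little-endian limb ordering.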

FWIW the current proposal (along with the rationale document to explain our decisions) can be seen here #191.

jakubjelinek · 2023-05-30T12:16:12Z

Can 64-bit ARM do 128-bit x 128-bit -> 256-bit multiplications or 256-bit / 128-bit -> 128-bit divisions or what is the rationale for such large limbs? Of course, under the hood there could be ABI limb and optimization limb which would be smaller than the ABI one, but then what is the advantage, just making the types larger?

jakubjelinek · 2023-05-30T12:21:41Z

If the reason is to have _BitInt(128) argument passing/return value compatible with __int128, then you can just say so in the list of exceptions for the smaller sizes; it doesn't need to imply the size of the limb for even larger sizes.
Shall _BitInt(257) have 3 * sizeof(__int128) size or just 5 * sizeof(long long)? And alignment can be yet another thing that could be independent from that.
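
The two candidate sizes in that question follow from rounding the bit width up to whole limbs. A small sketch of the arithmetic, with a hypothetical helper name and assuming 16-byte limbs for the `__int128` option and 8-byte limbs for the `long long` option:

```c
#include <stddef.h>

/* Hypothetical helper: storage size in bytes of an n_bits-wide value
   rounded up to a whole number of limbs of limb_bytes bytes each. */
static size_t bitint_storage_bytes(size_t n_bits, size_t limb_bytes)
{
    size_t limb_bits = limb_bytes * 8;
    size_t limbs = (n_bits + limb_bits - 1) / limb_bits;
    return limbs * limb_bytes;
}
```

Here `bitint_storage_bytes(257, 16)` gives 48 bytes (three 128-bit limbs) and `bitint_storage_bytes(257, 8)` gives 40 bytes (five 64-bit limbs), while both limb sizes give 16 bytes for a 128-bit value.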

mmalcomson · 2023-05-30T16:54:56Z

Yes, the reasoning so far was focused on ensuring the _BitInt(128) passing and return values were compatible with __int128.
The alignment and limb size had been suggested in order to make the description simple rather than to satisfy any fundamental need.

Having a smaller limb size while still maintaining that property seems reasonable on first blush (i.e. without having put much thought into it).
Will look into it (probably asking you a few questions along the way) and update the PR with a rationale addressing this (whether for or against).

To ensure I've understood your point correctly:
Is it right to say that the main positive you would see from using smaller limbs is that the implementation of multiplication for large sizes would be performing intermediate multiplications on limbsize chunks rather than 1/2 limbsize chunks, and hence the implementation would be simpler?
(Edit: I guess the simplification is around having all architectures do logically "the same thing" of working with a limbsize -- is that right?)
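
For readers unfamiliar with the distinction, the following sketch shows what working on half-limb chunks looks like. It is only an illustration under stated assumptions: the limb is taken to be 64 bits, the only multiplication used is 32 x 32 -> 64, and `mul_limb_by_halves` is a hypothetical name rather than anything in libgcc.

```c
#include <stdint.h>

/* Multiply two 64-bit limbs into a 128-bit result (hi:lo) using only
   32x32->64 multiplications, i.e. working on half-limb chunks. */
static void mul_limb_by_halves(uint64_t a, uint64_t b,
                               uint64_t *hi, uint64_t *lo)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes to bits 0..63   */
    uint64_t p1 = a_lo * b_hi;   /* contributes to bits 32..95  */
    uint64_t p2 = a_hi * b_lo;   /* contributes to bits 32..95  */
    uint64_t p3 = a_hi * b_hi;   /* contributes to bits 64..127 */

    /* Sum the middle column; three values below 2^32 cannot overflow. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    *lo = (mid << 32) | (uint32_t)p0;
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```

When the hardware offers a full limb x limb -> 2 x limb multiply, the four partial products and the carry handling collapse into a single operation, which is the simplification being asked about.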

mmalcomson · 2023-05-31T15:14:18Z

@jakubjelinek my current thoughts are not to change, based on the following reasoning:

* We really want `_BitInt(128)` to match `__int128` as we think it would be a footgun otherwise (especially since the "quad word" is an ABI-level data type and mapping these C level types to different fundamental data types seems like it could cause problems).
* If `_BitInt(128)` alignment is 16 bytes (to match `__int128`), I think having `_BitInt(N>128)` have lesser alignment would cause confusion to programmers.
* In order to ensure sizeof(Array)/sizeof(ElementType) gives the number of elements in the array, we need no padding between elements, so the size of the elements needs to be divisible by their alignment.

The combination of those points implies that we want the size of elements to always be divisible by 128.

Does this seem reasonable to you?
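
The size-versus-alignment argument above can be seen with an ordinary C11 struct. This is a sketch only: `struct wide257` is a hypothetical stand-in for a 257-bit value stored as five 64-bit chunks but aligned to 16 bytes, not the layout any ABI defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: 40 bytes of payload with 16-byte alignment.
   The compiler pads sizeof up to a multiple of the alignment so that
   array elements need no padding between them. */
struct wide257 {
    _Alignas(16) uint64_t chunk[5];
};

/* Hypothetical helper: round size up to a multiple of align, where
   align is a power of two. */
static size_t round_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}
```

With 16-byte alignment the 40-byte payload is padded to 48 bytes, so `sizeof(struct wide257[4]) / sizeof(struct wide257)` still yields 4. In other words, once the alignment is 16 bytes the size ends up divisible by 128 bits whether or not the limb is smaller.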

jakubjelinek · 2023-05-31T15:36:13Z

> @jakubjelinek my current thoughts are not to change, based on the following reasoning:
>
> * We really want `_BitInt(128)` to match `__int128` as we think it would be a footgun otherwise (especially since the "quad word" is an ABI-level data type and mapping these C level types to different fundamental data types seems like it could cause problems).

The programmers will need to be prepared for that already, x86-64 psABI behaves like that.
While sizeof (__int128) == sizeof (_BitInt(128)), alignof (__int128) > alignof (_BitInt(128)).
That also means that __int128 and _BitInt(128) are passed the same way there when passed in registers, but not necessarily when passed on the stack; for example,
__int128 foo (int, int, int, int, int, int, int, __int128 x) { return x; }
_BitInt(128) bar (int, int, int, int, int, int, int, _BitInt(128) x) { return x; }
results in different code (at least in GCC, in clang apparently foo is compiled like bar, which means I think that clang doesn't follow the psABI there -
https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/low-level-sys-info.tex#L601
).
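
A rough model of why the two functions can differ: the stack offset of `x` depends on its alignment. The sketch below is an assumption-laden illustration, not psABI text. It assumes the first six `int` arguments go in registers, the seventh occupies one 8-byte stack slot, and the alignments are 16 bytes for `__int128` and 8 bytes for `_BitInt(128)` as described in the comment above; the helper names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: round offset up to a multiple of align, where
   align is a power of two. */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical helper: offset at which the next stack argument starts,
   given the bytes already used and the argument's alignment. */
static size_t next_stack_offset(size_t used, size_t arg_align)
{
    return align_up(used, arg_align);
}
```

Under those assumptions, after the 8-byte slot for the seventh `int`, a 16-byte-aligned `x` would start at offset 16 while an 8-byte-aligned `x` would start at offset 8.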

> * If `_BitInt(128)` alignment is 16bytes (to match `__int128`), I think having `_BitInt(N>128)` have lesser alignment would cause confusion to programmers.

Why? _BitInt(128) and _BitInt(129) types are distinct types, user shouldn't make assumptions on the alignments or sizes of those types unless he/she knows the corresponding ABI.

Anyway, regarding the GCC implementation, as long as _BitInt uses the same endian ordering of limbs as of bits inside those limbs, we could have two separate limb modes, one used for the alignment and sizing and another used for the actual implementation of arithmetic on the type, perhaps including the libgcc implementation of the multiplication/division.
If the endianness is different, those two limb modes would obviously need to be the same.

nsz-arm · 2023-06-01T14:22:58Z

> > * If `_BitInt(128)` alignment is 16bytes (to match `__int128`), I think having `_BitInt(N>128)` have lesser alignment would cause confusion to programmers.
>
> Why? _BitInt(128) and _BitInt(129) types are distinct types, user shouldn't make assumptions on the alignments or sizes of those types unless he/she knows the corresponding ABI.

there were several cases when a small change to linux uapi structs (e.g. using bits from a previously reserved field) broke the abi because the alignment requirement changed unexpectedly. so weird alignment requirements can definitely cause problems. but i don't know if that's worse than mismatching _BitInt(128) and __int128_t alignment.

nsz-arm · 2023-06-01T18:04:39Z

it seems released versions of clang already implement _BitInt up to N=128 and it has 8byte alignment on both aarch64 and arm. so it is probably better to document the existing practice instead of doing something different that breaks ABI when -std=c23 is used. https://godbolt.org/z/4aTq5Ezeq

`_BitInt(N)` and `unsigned _BitInt(N)` are new integral types added for C23. These types have a bit width of `N` and define a different type for each `N`. Here we define the language mapping between these language types and the machine data types used throughout AAPCS64 through which our calling standard is defined. Along with the AAPCS32 commit, this closes ARM-software#175. The rationale for the choices in this patch are presented in a seperate commit with a rationale document. Some points of note around the language in this commit: -- This commit does not update any wording around bit-fields. The current wording under "Bit-fields subdivision" mentions that "A member of an aggregate that is a Fundamental Data Type may be subdivided into bit-fields". I do not believe this needs updating. While a _BitInt(N>128) can be subdivided into a bit-field at the *language level*, once it has been converted to a Machine Type such a subdivision would look like some number of quad-words which have not been subdivided (either fully used or fully unused) and either one or zero quad-words which have been subdivided. The alignment requirements of the _BitInt are also the same as the alignment requirements of this fundamental data type of a quad-word, which this paragraph uses to explain the resulting alignment requirements on the aggregate containing a bit-field. In the C/C++ language mapping description of "Bit-fields" we mention that a bit-field may have any integral type and since a bit-precise integer is an integral type this still holds. The explanation of where a field can be placed in this section relies on *alignment* requirements of the field type. For _BitInt this lines up with the discussion on *fundamental data types* at the machine-level, since the alignment requirements of a _BitInt are that of the "chunk" it is made up of, which is the fundamental data type of a quad-word. Hence I believe the current language does not need updating for bit-precise integers. 
-- _BitInt(N > 128) types and Homogeneous Aggregates. With this changeset, _BitInt(N>128) types are treated as arrays of __int128 values. Hence at the machine level they would be a Homogeneous Aggregate of quad-words. The wording in section 5.9.5 currently mentions "uniquely addressable Members". I am not sure what this is referring to, but would expect this is referring to addressable members of the base type. If that is the case then I don't believe anything needs to be updated. If it were referring to addressable members at the language level (which would be strange given the context) then this may need updating since one language-level _BitInt(256) type would not equat to one Fundamental Data Type. -- Combination of unspecified bits in _BitInt and C.16 in PCS rules. The mapping this commits defines from a _BitInt to a Machine Type specifies that the bits of the relevant Machine Type that are unused in a _BitInt(N) have unspecified value. The rules C.12 and C.16 of our parameter passing standard specify that when there are unused bits of a structure and/or Integral fundamental data type that are passed in registers, those unused bits are unspecified. The combination of these two rules means that e.g. when passing a _BitInt(2) across a PCS boundary in a register, bits [2-63] inclusive are unspecified. -- In-memory and in-register representations match. This commit only specifies the mapping from language level type to machine type. The machine type is then treated as it currently is in memory and in registers.

`_BitInt(N)` and `unsigned _BitInt(N)` are new integral types added for C23. These types have a bit width of `N` and define a different type for each `N`. Here we define the mapping between these language types and the machine data types used throughout AAPCS32, through which our calling standard is defined. Along with the AAPCS64 commit, this closes ARM-software#175.

The rationale for the choices in this patch is presented in a separate commit with a rationale document.

Some points of note around the language in this commit:

-- This commit does not update any wording around bit-fields. The current wording under "Bit-fields" (section 5.3.4) mentions that "A member of an aggregate that is a Fundamental Data Type may be subdivided into bit-fields". I do not believe this needs updating. While a _BitInt(N>64) can be subdivided into a bit-field at the *language level*, once it has been converted to a Machine Type, such a subdivision would look like some number of double-words which have not been subdivided (either fully used or fully unused) and either one or zero double-words which have been subdivided. The alignment requirements of the _BitInt are also the same as the alignment requirements of the double-word fundamental data type, which this paragraph uses to explain the resulting alignment requirements on the aggregate containing a bit-field.

In the C/C++ language mapping description of "Bit-fields" we mention that a bit-field may have any integral type, and since a bit-precise integer is an integral type this still holds. The explanation of where a field can be placed in this section relies on the *alignment* requirements of the field type. For _BitInt this lines up with the discussion on *fundamental data types* at the machine level, since the alignment requirements of a _BitInt are those of the "chunk" it is made up of, which is the double-word fundamental data type. Hence I believe the current language does not need updating for bit-precise integers.

-- _BitInt(N > 64) types and Homogeneous Aggregates. With this changeset, _BitInt(N>64) types are treated as arrays of uint64_t values. Hence at the machine level they would be a Homogeneous Aggregate of double-words.

-- Combination of unspecified bits in _BitInt and B.2 in PCS rules. The mapping this commit defines from a _BitInt to a Machine Type specifies that the bits of the relevant Machine Type that are unused in a _BitInt(N) have unspecified value. Rule B.2 of our parameter passing standard specifies that when there are unused bits in an integral Fundamental Data Type that is passed in registers, those unused bits are zero- or sign-extended to a full word. The combination of this rule with the fact that _BitInt types are zero- or sign-extended to the Fundamental Data Type in which they are passed means that, e.g., when passing a _BitInt(2) across a PCS boundary in a register, bits [2-63] inclusive are sign-extended.

-- In-memory and in-register representations match. This commit only specifies the mapping from language-level type to machine type. The machine type is then treated as it currently is in memory and in registers.
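The last few points of the commit message can be illustrated with a small sketch in plain C (no `_BitInt` support needed). It is not taken from the ABI text: the helper names `bitint_doublewords`, `sign_extend` and `zero_extend` are hypothetical, the chunk count assumes the array of `uint64_t` has the minimum number of elements needed to hold `N` bits, and the 64-bit container simply mirrors the "bits [2-63]" example above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of 64-bit double-words backing a
   _BitInt(N > 64), ASSUMING the array is as short as possible. */
static inline unsigned bitint_doublewords(unsigned n)
{
    assert(n > 64);
    return (n + 63u) / 64u;   /* ceil(n / 64) */
}

/* Hypothetical helper: keep the low n bits of raw and clear everything
   above them, as would be done for an unsigned _BitInt(n). */
static inline uint64_t zero_extend(uint64_t raw, unsigned n)
{
    assert(n >= 1 && n <= 64);
    if (n == 64)
        return raw;
    return raw & ((UINT64_C(1) << n) - 1u);
}

/* Hypothetical helper: keep the low n bits of raw and copy bit n-1 into
   every higher bit, as would be done for a signed _BitInt(n). The upper
   bits of raw model the "unspecified" unused bits and are ignored. */
static inline uint64_t sign_extend(uint64_t raw, unsigned n)
{
    assert(n >= 1 && n <= 64);
    if (n == 64)
        return raw;
    uint64_t sign = UINT64_C(1) << (n - 1);
    uint64_t low = zero_extend(raw, n);
    /* Flipping the sign bit and then subtracting it propagates it upwards. */
    return (low ^ sign) - sign;
}
```

For instance, `sign_extend(0x2, 2)` treats the two-bit pattern `10` as -2 and fills bits [2-63] with ones, whatever the caller left in those bits beforehand.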

mmalcomson · 2023-09-25T13:40:51Z

@jakubjelinek Just FYI, I've recently pushed an update to the bitint rationale document in the relevant PR (#191).

The point you raised, that x86-64 does something different and hence programmers will have to be ready for _BitInt(128) != __int128, is a good one, so I added it to the rationale and adjusted the rationale to mention that the decision is close.
That said, we're still leaning towards 128-bit alignment for _BitInt(128), as it does seem to fit better with the rest of our ABI, and it allows single-copy atomicity for LSE2 LDP and STP.

N.b. for completeness of documentation in the ticket: after Szabolcs mentioned that clang has released a compiler using 8-byte alignment for such types, I double-checked with the LLVM folk that their AArch64 _BitInt ABI is explicitly called out as unstable, so that shouldn't be a problem.

mmalcomson added the requires-toolchain-change label ("This ABI change would require a corresponding toolchain change") Mar 7, 2023

stuij closed this as completed in d621417 Oct 4, 2023
