Add target-specific NaN payloads for the missing tier 2 targets #138870

beetrees · 2025-03-23T23:24:56Z

This PR adds target-specific NaN payloads for the remaining tier 2 targets:

arm64ec: This target is a mix of x86_64 and aarch64. Since neither of those has extra payloads, arm64ec has none either.
loongarch64: Per LoongArch Reference Manual - Volume 1: Basic Architecture section 3.1.1.3, LoongArch does quieting NaN propagation with the Rust preferred NaN as its default NaN, meaning it has no extra payloads.
nvptx64: Per PTX ISA documentation section 9.7.3 (and section 9.7.4 for f16), all payloads are possible. The documentation explicitly states that f16 and f32 operations result in an unspecified NaN payload, while for f64 it states "NaN payloads are supported" without specifying how or what payload will be generated if there are no input NaNs.
powerpc and powerpc64: Per Power Instruction Set Architecture Book I section 4.3.2, PowerPC does quieting NaN propagation with the Rust preferred NaN being generated if there are no input NaNs, meaning it has no extra payloads.
s390x: Per IBM z/Architecture Principles of Operation page 9-3, s390x does quieting NaN propagation with Rust's preferred NaN as its default NaN, meaning it has no extra payloads.
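
For readers checking a target by hand, here is a minimal sketch of what "preferred NaN" means at the bit level, assuming the definition in the `f32`/`f64` primitive docs (quiet bit set, all-zero payload, sign bit unspecified). The helper names are hypothetical and not part of this PR.

```rust
use std::hint::black_box;

/// Returns true if `x` is a NaN with the quiet bit set and an all-zero
/// payload. The sign bit is masked off because it is unspecified.
fn is_preferred_nan_f64(x: f64) -> bool {
    const SIGN_MASK: u64 = 1 << 63;
    const PREFERRED: u64 = 0x7FF8_0000_0000_0000;
    (x.to_bits() & !SIGN_MASK) == PREFERRED
}

/// Same check for `f32`.
fn is_preferred_nan_f32(x: f32) -> bool {
    const SIGN_MASK: u32 = 1 << 31;
    const PREFERRED: u32 = 0x7FC0_0000;
    (x.to_bits() & !SIGN_MASK) == PREFERRED
}

fn main() {
    // black_box discourages the compiler from folding the division at
    // compile time, so the NaN is more likely to come from the hardware.
    let zero64 = black_box(0.0_f64);
    let zero32 = black_box(0.0_f32);
    println!("f64 0/0 is preferred NaN: {}", is_preferred_nan_f64(zero64 / zero64));
    println!("f32 0/0 is preferred NaN: {}", is_preferred_nan_f32(zero32 / zero32));
}
```

On a target with no extra payloads, an operation with no NaN inputs should only ever produce a NaN that passes this check.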

Tracking issue: #128288

cc @RalfJung
@rustbot label +T-lang

Also cc relevant target maintainers of tier 2 targets:

arm64ec: @dpaoliello
loongarch64: @heiher @xiangzhai @zhaixiaojuan @xen0n
nvptx64: @RDambrosio016 @kjetilkjeka
powerpc: the only documented maintainer is @BKPepe for the tier 3 powerpc-unknown-linux-muslspe.
powerpc64: @daltenty @gilamn5tr @Gelbpunkt @famfo @neuschaefer
s390x: @uweigand @cuviper

rustbot · 2025-03-23T23:25:01Z

r? @jhpratt

rustbot has assigned @jhpratt.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

beetrees · 2025-03-23T23:59:22Z

I was unable to find authoritative ISA documentation specifically for the tier 3 powerpc-*spe targets, which appear to use a different FPU from the regular powerpc targets. @BKPepe (or anyone else familiar with the SPE targets) do you know if the SPE targets conform to the same NaN handling as the regular powerpc targets? If not (or nobody knows) I'll update this PR to say `powerpc` (except when `target_abi = "spe"`) (I'm focusing on tier 2 targets for this PR).

jhpratt · 2025-03-24T00:53:32Z

I know I've seen some other discussions about NaN stuff pop up in my notifications recently. As such, passing off as Ralf likely knows better.

r? @RalfJung

RalfJung · 2025-03-24T11:30:16Z

This LGTM, but I can't fact-check it so let's wait a bit to give target maintainers time to take a look.

uweigand · 2025-03-24T11:46:46Z

For s390x this LGTM, thanks.

kjetilkjeka · 2025-03-24T13:38:01Z

Looks correct for nvptx as well

famfo · 2025-03-24T20:12:35Z

LGTM for powerpc64

For future reference, I came to the conclusion based on this part from Book I 4.3.2 on Not a Numbers:

Quiet NaNs are used to represent the results of certain invalid operations, such as invalid arithmetic operations on infinities or on NaNs, when Invalid Operation Exception is disabled (VE=0).
[...]
Any instruction that generates a QNaN as the result of a disabled Invalid Operation Exception generates this QNaN (i.e., 0x7FF8_0000_0000_0000).

There seems to be some documentation on QNaN values in Book I 7.3.2.2 (even though that's for VSX instructions):

Any instruction that generates a QNaN as the result of a disabled Invalid Operation exception generates the value,

0x7E00 for half-precision results,

0x7FC0 for bfloat16 results,

0x7FC0_0000 for single-precision results,

0x7FF8_0000_0000_0000 for double-precision results,

0x7FFF_8000_0000_0000_0000_0000_0000_0000 for quad-precision results.
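
As a quick sanity check, the listed values can be compared against the "all-ones exponent, quiet bit set, zero payload" pattern for each format. This is only a sketch: the `Format` struct and `is_default_qnan` helper are made up for illustration, and the field widths are the standard IEEE 754 ones plus the usual bfloat16 layout.

```rust
/// Describes a binary floating-point format by its field widths.
struct Format {
    name: &'static str,
    exp_bits: u32,
    mant_bits: u32,
}

/// True if `bits` (ignoring the sign bit) has an all-ones exponent, the
/// quiet bit (top mantissa bit) set, and every other mantissa bit clear.
fn is_default_qnan(bits: u128, f: &Format) -> bool {
    let exp_mask = ((1u128 << f.exp_bits) - 1) << f.mant_bits;
    let quiet_bit = 1u128 << (f.mant_bits - 1);
    let sign_mask = 1u128 << (f.exp_bits + f.mant_bits);
    (bits & !sign_mask) == (exp_mask | quiet_bit)
}

fn main() {
    // Values quoted from Book I 7.3.2.2 above.
    let cases: [(Format, u128); 5] = [
        (Format { name: "half", exp_bits: 5, mant_bits: 10 }, 0x7E00),
        (Format { name: "bfloat16", exp_bits: 8, mant_bits: 7 }, 0x7FC0),
        (Format { name: "single", exp_bits: 8, mant_bits: 23 }, 0x7FC0_0000),
        (Format { name: "double", exp_bits: 11, mant_bits: 52 }, 0x7FF8_0000_0000_0000),
        (
            Format { name: "quad", exp_bits: 15, mant_bits: 112 },
            0x7FFF_8000_0000_0000_0000_0000_0000_0000,
        ),
    ];
    for (format, bits) in &cases {
        println!("{}: default QNaN pattern = {}", format.name, is_default_qnan(*bits, format));
    }
}
```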

xen0n · 2025-03-25T04:02:54Z

We'll have to double-check for loongarch64, as the LoongArch Reference Manual - Volume 1: Basic Architecture section 3.1.1.3 also has:

Case 2: When there is no SNaN in the source operand but QNaN exists, the QNaN with the highest priority is selected as the result of this instruction. At this time, the way of judging the priority of the source operand is the same as in the above situation.

Whereas the "source operand priority" is based on the instruction format of individual relevant instructions. So QNaN propagation should be possible on LoongArch, and the preferred QNaN is only generated for operations whose inputs don't involve any NaN, but we'd like to do some experiments to definitely confirm.

RalfJung · 2025-03-25T07:04:29Z

> I was unable to find authoritative ISA documentation specifically for the tier 3 powerpc-*spe targets, which appear to use a different FPU from the regular powerpc targets. @BKPepe (or anyone else familiar with the SPE targets) do you know if the SPE targets conform to the same NaN handling as the regular powerpc targets? If not (or nobody knows) I'll update this PR to say `powerpc` (except when `target_abi = "spe"`) (I'm focusing on tier 2 targets for this PR).

The docs I found for this at https://www.nxp.com/docs/en/reference-manual/SPEPEM.pdf state some truly strange things: "Embedded floating-point operations do not produce +Inf, –Inf, NaN, or a denormalized number. If the result of an instruction overflows and floating-point overflow exceptions are disabled (SPEFSCR[FOVFE] is cleared), pmax or nmax is generated as the result of that instruction depending on the sign of the result. If the result of an instruction underflows and floating-point underflow exceptions are disabled (SPEFSCR[FUNFE] is cleared), +0 or -0 is generated as the result of that instruction based upon the sign of the result."

The docs also mention software routines can be used to override this behavior, but I don't know if this will happen on a typical instance of this target. If not, this is non-compliant with IEEE 754 and hence unsound for Rust.

So probably it'd be better for now to clarify that the statement in the table only refers to powerpc chips with the normal FPU.

heiher · 2025-03-25T07:33:28Z

> We'll have to double-check for loongarch64, as the LoongArch Reference Manual - Volume 1: Basic Architecture section 3.1.1.3 also has:
>
> Case 2: When there is no SNaN in the source operand but QNaN exists, the QNaN with the highest priority is selected as the result of this instruction. At this time, the way of judging the priority of the source operand is the same as in the above situation.
>
> Whereas the "source operand priority" is based on the instruction format of individual relevant instructions. So QNaN propagation should be possible on LoongArch, and the preferred QNaN is only generated for operations whose inputs don't involve any NaN, but we'd like to do some experiments to definitely confirm.

"Case 2" aligns with the Quieting NaN propagation defined in Rust's primitive NaN bit patterns, where QNaN is propagated from any input operand. For LoongArch, except for "Case 1" and "Case 2," all other cases follow the Preferred NaN. As a result, LoongArch does not have target-specific NaN payloads.
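
To make the three outcomes concrete, here is a rough software model of which NaN comes back. It is a sketch under assumptions, not a description of the hardware: it assumes "Case 1" picks a signaling NaN and returns it with the quiet bit set (that case is not quoted in this thread), and it uses plain left-to-right operand order as a stand-in for the per-instruction priority. The function names are hypothetical.

```rust
const QUIET_BIT: u64 = 1 << 51;
const PREFERRED_NAN: u64 = 0x7FF8_0000_0000_0000;

/// A NaN whose quiet bit is clear is a signaling NaN.
fn is_snan(x: f64) -> bool {
    x.is_nan() && (x.to_bits() & QUIET_BIT) == 0
}

/// Hypothetical model of the NaN returned by an operation that yields NaN.
/// Operand order stands in for the real, instruction-specific priority.
fn model_nan_result(operands: &[f64]) -> f64 {
    // Case 1 (assumed): an SNaN input is selected and quieted.
    if let Some(s) = operands.iter().copied().find(|x| is_snan(*x)) {
        return f64::from_bits(s.to_bits() | QUIET_BIT);
    }
    // Case 2: no SNaN, so a QNaN input is propagated unchanged.
    if let Some(q) = operands.iter().copied().find(|x| x.is_nan()) {
        return q;
    }
    // Otherwise: no NaN inputs, so the preferred NaN is generated.
    f64::from_bits(PREFERRED_NAN)
}

fn main() {
    let qnan = f64::from_bits(0x7FF8_0000_0000_1234);
    let snan = f64::from_bits(0x7FF0_0000_0000_0042);
    println!("{:#018x}", model_nan_result(&[1.0, qnan]).to_bits());
    println!("{:#018x}", model_nan_result(&[qnan, snan]).to_bits());
    println!("{:#018x}", model_nan_result(&[1.0, 2.0]).to_bits());
}
```

In every branch the payload is either taken from an input or all-zero, which is why no extra payloads need to be listed.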

beetrees · 2025-03-26T02:08:42Z

> The docs I found for this at https://www.nxp.com/docs/en/reference-manual/SPEPEM.pdf state some truly strange things: "Embedded floating-point operations do not produce +Inf, –Inf, NaN, or a denormalized number. If the result of an instruction overflows and floating-point overflow exceptions are disabled (SPEFSCR[FOVFE] is cleared), pmax or nmax is generated as the result of that instruction depending on the sign of the result. If the result of an instruction underflows and floating-point underflow exceptions are disabled (SPEFSCR[FUNFE] is cleared), +0 or -0 is generated as the result of that instruction based upon the sign of the result."
>
> The docs also mention software routines can be used to override this behavior, but I don't know if this will happen on a typical instance of this target. If not, this is non-compliant with IEEE 754 and hence unsound for Rust.
>
> So probably it'd be better for now to clarify that the statement in the table only refers to powerpc chips with the normal FPU.

I've updated the powerpc entry to say (except when `target_abi = "spe"`). I was getting confused by the three SPE targets using regular FPU instructions instead of SPE instructions as none of them have the spe target feature enabled; I've opened #138960 to track that.
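
For anyone who wants to mirror that exception in code, something like the following should express it with `cfg`. This is an illustrative sketch only: the function name is made up, and it assumes a toolchain where `cfg(target_abi)` is available.

```rust
/// True when compiled for powerpc or powerpc64 without the SPE ABI, i.e.
/// the configuration the updated table entry is meant to cover.
/// (Hypothetical helper, not part of this PR.)
const fn powerpc_entry_applies() -> bool {
    cfg!(all(
        any(target_arch = "powerpc", target_arch = "powerpc64"),
        not(target_abi = "spe")
    ))
}

fn main() {
    println!("powerpc NaN table entry applies: {}", powerpc_entry_applies());
}
```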

rustbot assigned jhpratt Mar 23, 2025

rustbot added S-waiting-on-review T-libs T-lang labels Mar 23, 2025

rustbot assigned RalfJung and unassigned jhpratt Mar 24, 2025

Add target-specific NaN payloads for the missing tier 2 targets

049bb26

beetrees force-pushed the tier-2-nans branch from af1d7e9 to 049bb26 Compare March 26, 2025 02:06
