Add target-specific NaN payloads for the missing tier 2 targets #138870

Open
wants to merge 1 commit into
base: master
from

Conversation

beetrees
Contributor

This PR adds target-specific NaN payloads for the remaining tier 2 targets:

  • arm64ec: This target is a mix of x86_64 and aarch64. Since neither has extra payloads, arm64ec has none either.
  • loongarch64: Per LoongArch Reference Manual - Volume 1: Basic Architecture section 3.1.1.3, LoongArch does quieting NaN propagation with the Rust preferred NaN as its default NaN, meaning it has no extra payloads.
  • nvptx64: Per PTX ISA documentation section 9.7.3 (and section 9.7.4 for f16), all payloads are possible. The documentation explicitly states that f16 and f32 operations result in an unspecified NaN payload, while for f64 it states "NaN payloads are supported" without specifying how or what payload will be generated if there are no input NaNs.
  • powerpc and powerpc64: Per Power Instruction Set Architecture Book I section 4.3.2, PowerPC does quieting NaN propagation with the Rust preferred NaN being generated if there are no input NaNs, meaning it has no extra payloads.
  • s390x: Per IBM z/Architecture Principles of Operation page 9-3, s390x does quieting NaN propagation with Rust's preferred NaN as its default NaN, meaning it has no extra payloads.
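
To make "no extra payloads" concrete, here is a minimal sketch of the check it implies for `f32`, written against raw bit patterns. It assumes my reading of the NaN rules in the primitive float docs: a NaN result is either the preferred NaN (quiet bit set, remaining payload zero), an input NaN with its quiet bit set, or an input NaN unchanged, with the sign left unconstrained. The helper names `is_preferred_nan` and `is_portable_nan_result` are hypothetical and not part of this PR or of std.

```rust
// Bit layout of an IEEE 754 binary32 value.
const SIGN_MASK: u32 = 0x8000_0000;
const EXP_MASK: u32 = 0x7F80_0000;
const MANT_MASK: u32 = 0x007F_FFFF;
const QUIET_BIT: u32 = 0x0040_0000;

/// True if the bit pattern encodes any NaN (exponent all ones, mantissa nonzero).
fn is_nan_bits(bits: u32) -> bool {
    (bits & EXP_MASK) == EXP_MASK && (bits & MANT_MASK) != 0
}

/// Preferred NaN: quiet bit set, every other mantissa bit zero, either sign.
fn is_preferred_nan(bits: u32) -> bool {
    (bits & !SIGN_MASK) == (EXP_MASK | QUIET_BIT)
}

/// Hypothetical helper: true if `result` is a NaN that can be explained
/// without any target-specific payload, given the operation's `inputs`.
/// The sign bit is ignored (assumption: the sign is not constrained).
fn is_portable_nan_result(result: u32, inputs: &[u32]) -> bool {
    if !is_nan_bits(result) {
        return false;
    }
    if is_preferred_nan(result) {
        return true;
    }
    let r = result & !SIGN_MASK;
    inputs
        .iter()
        .copied()
        .filter(|&i| is_nan_bits(i))
        .any(|i| {
            let i = i & !SIGN_MASK;
            // Unchanged propagation, or propagation with the quiet bit set.
            r == i || r == (i | QUIET_BIT)
        })
}
```

For a target such as nvptx64, where all payloads are possible, a check like this could legitimately fail, which is exactly what the extra table entry is meant to document.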

Tracking issue: #128288

cc @RalfJung
@rustbot label +T-lang

Also cc relevant target maintainers of tier 2 targets:

@rustbot
Collaborator

rustbot commented Mar 23, 2025

r? @jhpratt

rustbot has assigned @jhpratt.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Mar 23, 2025
@beetrees
Contributor Author

I was unable to find authoritative ISA documentation specifically for the tier 3 powerpc-*spe targets, which appear to use a different FPU from the regular powerpc targets. @BKPepe (or anyone else familiar with the SPE targets) do you know if the SPE targets conform to the same NaN handling as the regular powerpc targets? If not (or nobody knows) I'll update this PR to say `powerpc` (except when `target_abi = "spe"`) (I'm focusing on tier 2 targets for this PR).

@jhpratt
Member

jhpratt commented Mar 24, 2025

I know I've seen some other discussions about NaN stuff pop up in my notifications recently. As such, passing off as Ralf likely knows better.

r? @RalfJung

@rustbot rustbot assigned RalfJung and unassigned jhpratt Mar 24, 2025
@RalfJung
Member

This LGTM, but I can't fact-check it so let's wait a bit to give target maintainers time to take a look.

@uweigand
Contributor

For s390x this LGTM, thanks.

@kjetilkjeka
Contributor

Looks correct for nvptx as well

@famfo
Contributor

famfo commented Mar 24, 2025

LGTM for powerpc64

For future reference, I came to the conclusion based on this part from Book I 4.3.2 on Not a Numbers:

Quiet NaNs are used to represent the results of certain invalid operations, such as invalid arithmetic operations on infinities or on NaNs, when Invalid Operation Exception is disabled (VE=0).
[...]
Any instruction that generates a QNaN as the result of a disabled Invalid Operation Exception generates this QNaN (i.e., 0x7FF8_0000_0000_0000).

There seems to be some documentation on QNaN values in Book I 7.3.2.2 (even though that's for VSX instructions):

Any instruction that generates a QNaN as the result of a disabled Invalid Operation exception generates the value,

  • 0x7E00 for half-precision results,
  • 0x7FC0 for bfloat16 results,
  • 0x7FC0_0000 for single-precision results,
  • 0x7FF8_0000_0000_0000 for double-precision results,
  • 0x7FFF_8000_0000_0000_0000_0000_0000_0000 for quad-precision results.
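
The values in that list all follow one pattern: sign bit clear, exponent all ones, and only the most significant mantissa bit (the quiet bit) set. A small sketch that derives each value from the format's field widths, with `default_qnan` being a hypothetical name used only for illustration:

```rust
/// Default quiet NaN for an IEEE-754-style binary format with `exp_bits`
/// exponent bits and `mant_bits` explicit mantissa bits: sign 0, exponent
/// all ones, and only the top mantissa bit (the quiet bit) set.
fn default_qnan(exp_bits: u32, mant_bits: u32) -> u128 {
    let exponent = ((1u128 << exp_bits) - 1) << mant_bits;
    let quiet_bit = 1u128 << (mant_bits - 1);
    exponent | quiet_bit
}
```

Calling it with the (exponent, mantissa) widths (5, 10), (8, 7), (8, 23), (11, 52) and (15, 112) reproduces the half, bfloat16, single, double and quad values quoted above. The single and double results are the same bit patterns as the preferred NaN described in the PR description.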

@xen0n
Contributor

xen0n commented Mar 25, 2025

We'll have to double-check for loongarch64, as the LoongArch Reference Manual - Volume 1: Basic Architecture section 3.1.1.3 also has:

Case 2: When there is no SNaN in the source operand but QNaN exists, the QNaN with the highest priority is selected as the result of this instruction. At this time, the way of judging the priority of the source operand is the same as in the above situation.

Whereas the "source operand priority" is based on the instruction format of individual relevant instructions. So QNaN propagation should be possible on LoongArch, and the preferred QNaN is only generated for operations whose inputs don't involve any NaN, but we'd like to do some experiments to definitely confirm.
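
The selection rule described by the manual can be sketched as a small model over `f32` bit patterns. This is only an illustration under stated assumptions: operands are passed in priority order (highest first), since the real priority depends on each instruction's format; Case 1 is assumed to return the selected SNaN with its quiet bit set; and the function models only operations whose result is already known to be a NaN. The name `select_nan_result` is hypothetical.

```rust
const EXP_MASK: u32 = 0x7F80_0000;
const MANT_MASK: u32 = 0x007F_FFFF;
const QUIET_BIT: u32 = 0x0040_0000;
const PREFERRED_NAN: u32 = 0x7FC0_0000;

fn is_nan(bits: u32) -> bool {
    (bits & EXP_MASK) == EXP_MASK && (bits & MANT_MASK) != 0
}

fn is_snan(bits: u32) -> bool {
    is_nan(bits) && (bits & QUIET_BIT) == 0
}

fn is_qnan(bits: u32) -> bool {
    is_nan(bits) && (bits & QUIET_BIT) != 0
}

/// Hypothetical model of the NaN produced by an operation whose result is a NaN.
/// `operands` must be listed in priority order, highest priority first
/// (assumption: the actual order is defined per instruction).
fn select_nan_result(operands: &[u32]) -> u32 {
    // Case 1 (assumed): an SNaN is present, so the highest-priority SNaN
    // is returned with its quiet bit set.
    if let Some(s) = operands.iter().copied().find(|&b| is_snan(b)) {
        return s | QUIET_BIT;
    }
    // Case 2: no SNaN, but a QNaN is present, so the highest-priority
    // QNaN is returned as-is.
    if let Some(q) = operands.iter().copied().find(|&b| is_qnan(b)) {
        return q;
    }
    // No NaN among the inputs (for example inf - inf): the default NaN.
    PREFERRED_NAN
}
```

In this model every possible result is either the preferred NaN or a quieted or unchanged input NaN, so no additional payloads can appear. Whether real hardware matches the model is what the proposed experiments would confirm.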

@RalfJung
Member

> I was unable to find authoritative ISA documentation specifically for the tier 3 powerpc-*spe targets, which appear to use a different FPU from the regular powerpc targets. @BKPepe (or anyone else familiar with the SPE targets) do you know if the SPE targets conform to the same NaN handling as the regular powerpc targets? If not (or nobody knows) I'll update this PR to say `powerpc` (except when `target_abi = "spe"`) (I'm focusing on tier 2 targets for this PR).

The docs I found for this at https://www.nxp.com/docs/en/reference-manual/SPEPEM.pdf state some truly strange things:
"Embedded floating-point operations do not produce +Inf, –Inf, NaN, or a denormalized number. If the result of an instruction overflows and floating-point overflow exceptions are disabled (SPEFSCR[FOVFE] is cleared), pmax or nmax is generated as the result of that instruction depending on the sign of the result. If the result of an instruction underflows and floating-point underflow exceptions are disabled (SPEFSCR[FUNFE] is cleared), +0 or -0 is generated as the result of that instruction based upon the sign of the result."

The docs also mention software routines can be used to override this behavior, but I don't know if this will happen on a typical instance of this target. If not, this is non-compliant with IEEE 754 and hence unsound for Rust.

So probably it'd be better for now to clarify that the statement in the table only refers to powerpc chips with the normal FPU.
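
To illustrate why the quoted behavior is not IEEE 754 compliant, here is a rough model of it for `f32`, given the result an IEEE 754 implementation would have produced. It rests on several assumptions: pmax and nmax are taken to be the largest-magnitude finite values, any infinite IEEE result is treated as an overflow, a subnormal IEEE result is treated as an underflow, and NaN maps to `None` because the quoted passage does not say what is produced instead. The name `spe_embedded_result` is hypothetical.

```rust
/// Hypothetical model of the quoted SPE embedded floating-point behavior
/// with overflow and underflow exceptions disabled.
fn spe_embedded_result(ieee_result: f32) -> Option<f32> {
    if ieee_result.is_nan() {
        // The quoted text only says NaN is never produced; the replacement
        // value is not specified there, so this model leaves it open.
        None
    } else if ieee_result.is_infinite() {
        // Overflow: saturate to pmax or nmax (assumed to be +/- f32::MAX).
        Some(f32::MAX.copysign(ieee_result))
    } else if ieee_result.is_subnormal() {
        // Underflow: flush to a zero with the sign of the result.
        Some(0.0_f32.copysign(ieee_result))
    } else {
        Some(ieee_result)
    }
}
```

Under this model a computation that overflows yields `f32::MAX` rather than infinity, so code that relies on `is_infinite()` or on NaN propagation would observe different results than on an IEEE 754 compliant FPU.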

@heiher
Contributor

heiher commented Mar 25, 2025

> We'll have to double-check for loongarch64, as the LoongArch Reference Manual - Volume 1: Basic Architecture section 3.1.1.3 also has:
>
> > Case 2: When there is no SNaN in the source operand but QNaN exists, the QNaN with the highest priority is selected as the result of this instruction. At this time, the way of judging the priority of the source operand is the same as in the above situation.
>
> Whereas the "source operand priority" is based on the instruction format of individual relevant instructions. So QNaN propagation should be possible on LoongArch, and the preferred QNaN is only generated for operations whose inputs don't involve any NaN, but we'd like to do some experiments to definitely confirm.

"Case 2" aligns with the Quieting NaN propagation defined in Rust's primitive NaN bit patterns, where QNaN is propagated from any input operand. For LoongArch, except for "Case 1" and "Case 2," all other cases follow the Preferred NaN. As a result, LoongArch does not have target-specific NaN payloads.

@beetrees
Contributor Author

> The docs I found for this at https://www.nxp.com/docs/en/reference-manual/SPEPEM.pdf state some truly strange things: "Embedded floating-point operations do not produce +Inf, –Inf, NaN, or a denormalized number. If the result of an instruction overflows and floating-point overflow exceptions are disabled (SPEFSCR[FOVFE] is cleared), pmax or nmax is generated as the result of that instruction depending on the sign of the result. If the result of an instruction underflows and floating-point underflow exceptions are disabled (SPEFSCR[FUNFE] is cleared), +0 or -0 is generated as the result of that instruction based upon the sign of the result."
>
> The docs also mention software routines can be used to override this behavior, but I don't know if this will happen on a typical instance of this target. If not, this is non-compliant with IEEE 754 and hence unsound for Rust.
>
> So probably it'd be better for now to clarify that the statement in the table only refers to powerpc chips with the normal FPU.

I've updated the powerpc entry to say (except when `target_abi = "spe"`). I was getting confused by the three SPE targets using regular FPU instructions instead of SPE instructions as none of them have the spe target feature enabled; I've opened #138960 to track that.
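
For anyone who wants to mirror that exception in code, a minimal sketch using `cfg!` follows. The function name `powerpc_nan_entry_applies` is hypothetical, and the sketch assumes a toolchain on which `cfg(target_abi)` is available on stable (Rust 1.78 or later, as far as I recall).

```rust
/// Hypothetical helper: true when compiling for a powerpc or powerpc64
/// target that is covered by the table entry, i.e. not an SPE ABI target.
fn powerpc_nan_entry_applies() -> bool {
    let is_powerpc = cfg!(any(target_arch = "powerpc", target_arch = "powerpc64"));
    let is_spe = cfg!(target_abi = "spe");
    is_powerpc && !is_spe
}
```

On any non-PowerPC host this returns `false`; it is `true` only for the powerpc targets that use the regular FPU behavior documented in the table.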

Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
9 participants