riscv_fpu: pre-approved scalar FPU conformance fixes #241
Merged
The scalar FP path is interpreted, not JIT'd, so size-optimizing its dispatch only costs throughput. Dropping func_opt_size gives roughly 2x interpreter FP throughput at zero size cost on the rest of the build.
Signed-off-by: Sol Astrius Phoenix <sol@astrius.ink>
IEEE 754 squareRoot(-0) is -0 with no exception; only a negative non-zero (or -inf) operand is invalid. Gate the invalid path on a new FPU_LIB_FPxx_NEGATIVE_ZERO constant so the -0 carve-out reads clearly.
Signed-off-by: Sol Astrius Phoenix <sol@astrius.ink>
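The -0 carve-out can be sketched at the bit level as follows. This is only an illustration for binary32, not the code from this PR: every `example_` name is hypothetical, and `EXAMPLE_F32_NEGATIVE_ZERO` merely stands in for the role of FPU_LIB_FPxx_NEGATIVE_ZERO, whose actual definition is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the PR's negative-zero constant:
   sign bit set, exponent and mantissa all zero (binary32). */
#define EXAMPLE_F32_NEGATIVE_ZERO UINT32_C(0x80000000)

/* NaN: exponent all ones and a non-zero mantissa. */
static bool example_f32_is_nan(uint32_t bits)
{
    return (bits & UINT32_C(0x7F800000)) == UINT32_C(0x7F800000)
        && (bits & UINT32_C(0x007FFFFF)) != 0;
}

/* True when sqrt must signal invalid because the operand is negative.
   -0 is carved out explicitly; NaN operands are excluded here because
   they follow the separate NaN rules (a signaling NaN still raises
   invalid, but not through this predicate). */
static bool example_sqrt_negative_operand_invalid(uint32_t bits)
{
    bool sign_set = (bits >> 31) != 0;
    return sign_set
        && bits != EXAMPLE_F32_NEGATIVE_ZERO
        && !example_f32_is_nan(bits);
}

/* Usage: a few operands given as raw binary32 bit patterns. */
static void example_sqrt_selfcheck(void)
{
    assert(!example_sqrt_negative_operand_invalid(UINT32_C(0x80000000))); /* -0.0 */
    assert(!example_sqrt_negative_operand_invalid(UINT32_C(0x00000000))); /* +0.0 */
    assert(!example_sqrt_negative_operand_invalid(UINT32_C(0x40800000))); /* 4.0  */
    assert(example_sqrt_negative_operand_invalid(UINT32_C(0xBF800000)));  /* -1.0 */
    assert(example_sqrt_negative_operand_invalid(UINT32_C(0xFF800000)));  /* -inf */
}
```

Comparing against a named constant rather than testing the sign bit alone is what keeps -0 off the invalid path while -inf and every other negative operand stay on it.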
fnmadd computes -(rs1*rs2) - rs3 as a single fused operation. Negating the result of fma(rs1, rs2, rs3) instead rounds rs1*rs2 + rs3 and then flips the sign, which under the directed rounding modes can land on the wrong side of the exact result. Negate rs1 and rs3 so the one rounding sees the correctly-signed value.
Signed-off-by: Sol Astrius Phoenix <sol@astrius.ink>
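Why the order of negation and rounding matters can be seen in a toy fixed-point model. This is not real floating point and not the PR's code: exact values are counted in eighths, only whole numbers are representable, and the single rounding mode is round toward +infinity. All `example_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: exact values are integers counted in eighths; the only
   representable results are whole numbers (multiples of 8 eighths). */
#define EXAMPLE_UNIT 8

/* Round an exact value up (toward +infinity) to a representable one. */
static int64_t example_round_up(int64_t exact_eighths)
{
    int64_t q = exact_eighths / EXAMPLE_UNIT; /* C division truncates toward zero */
    int64_t r = exact_eighths % EXAMPLE_UNIT;
    if (r > 0)
        q += 1;                               /* positive remainder: bump upward */
    return q * EXAMPLE_UNIT;
}

/* Old approach: round rs1*rs2 + rs3 first, then flip the sign. */
static int64_t example_fnmadd_negate_after(int64_t rs1, int64_t rs2, int64_t rs3)
{
    return -example_round_up(rs1 * rs2 + rs3);
}

/* Fixed approach: negate rs1 and rs3 so the single rounding step
   sees the correctly-signed exact value -(rs1*rs2) - rs3. */
static int64_t example_fnmadd_negate_operands(int64_t rs1, int64_t rs2, int64_t rs3)
{
    return example_round_up((-rs1) * rs2 + (-rs3));
}

/* Usage: rs1 = 3/8, rs2 = 3, rs3 = 0, so the exact result is -9/8. */
static void example_fnmadd_selfcheck(void)
{
    /* Rounding 9/8 up gives 2, and negating yields -2: below the exact
       value, which round-toward-+infinity must never produce. */
    assert(example_fnmadd_negate_after(3, 3, 0) == -16);
    /* Rounding -9/8 up gives -1, the correct result in this mode. */
    assert(example_fnmadd_negate_operands(3, 3, 0) == -8);
}
```

Negating after rounding effectively turns a round-up into a round-down of the negated value, which is the same asymmetry the fix avoids for the real directed rounding modes.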
Per the RISC-V NaN-boxing rule, a narrow (f32) operand that is not correctly NaN-boxed in its 64-bit register must read as the canonical NaN. Route the value reads through riscv_read_s (the NaN-unboxing read) across arithmetic, conversions, sign-injection, min/max, compares, classification, and the FMA family. fmv.x.w keeps the raw riscv_view_s read, as it moves the register bits verbatim.
Signed-off-by: Sol Astrius Phoenix <sol@astrius.ink>
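The difference between the two kinds of read can be sketched on a raw 64-bit register value. The functions below only mirror the roles of riscv_read_s and riscv_view_s; their names and signatures are hypothetical and do not reproduce the project's API.

```c
#include <assert.h>
#include <stdint.h>

/* RISC-V canonical quiet NaN for binary32. */
#define EXAMPLE_F32_CANONICAL_NAN UINT32_C(0x7FC00000)
/* A valid NaN-boxed f32 has all upper 32 bits of the register set. */
#define EXAMPLE_NANBOX_MASK UINT64_C(0xFFFFFFFF00000000)

/* Box an f32 bit pattern into a 64-bit FP register value. */
static uint64_t example_box_s(uint32_t f32_bits)
{
    return EXAMPLE_NANBOX_MASK | f32_bits;
}

/* NaN-unboxing read (role of riscv_read_s): an improperly boxed
   register reads as the canonical NaN. */
static uint32_t example_read_s(uint64_t freg)
{
    if ((freg & EXAMPLE_NANBOX_MASK) != EXAMPLE_NANBOX_MASK)
        return EXAMPLE_F32_CANONICAL_NAN;
    return (uint32_t)freg;
}

/* Raw view (role of riscv_view_s): the low 32 bits verbatim,
   which is what fmv.x.w needs. */
static uint32_t example_view_s(uint64_t freg)
{
    return (uint32_t)freg;
}

/* Usage: 0x3F800000 is 1.0f. */
static void example_nanbox_selfcheck(void)
{
    uint64_t boxed   = example_box_s(UINT32_C(0x3F800000));
    uint64_t unboxed = UINT64_C(0x000000003F800000); /* upper bits not all ones */

    assert(example_read_s(boxed) == UINT32_C(0x3F800000));
    assert(example_read_s(unboxed) == EXAMPLE_F32_CANONICAL_NAN);
    assert(example_view_s(unboxed) == UINT32_C(0x3F800000));
}
```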
purplesyringa approved these changes on Jun 22, 2026
Summary
The four scalar-FPU fixes that were already reviewed and marked LGTM in #239 / #240.
What's here
- Drop func_opt_size on the FP op dispatch. The scalar FP path is interpreted, not JIT'd, so size-optimizing the dispatch only costs throughput; dropping it gives roughly 2x interpreter FP throughput at zero size cost elsewhere.
- sqrt(-0.0) returns -0.0 without raising invalid. IEEE 754 squareRoot(-0) is -0 with no exception; only a negative non-zero (or -inf) operand is invalid. The -0 carve-out is gated on a new FPU_LIB_FPxx_NEGATIVE_ZERO constant so the intent reads clearly (per review feedback).
- fnmadd computes -(rs1*rs2) - rs3 as a single fused op; negating fma(rs1, rs2, rs3) instead rounds rs1*rs2 + rs3 and then flips the sign, which under directed rounding modes can land on the wrong side of the exact result. Negate rs1 and rs3 so the one rounding sees the correctly-signed value.
- Route f32 value reads through riscv_read_s (the NaN-unboxing read) across arithmetic, conversions, sign-injection, min/max, compares, classification, and the FMA family. fmv.x.w keeps the raw riscv_view_s read (it moves the register bits verbatim).