[ARM] enable FENV_ACCESS pragma support for hard-float targets #137101

Varnike · 2025-04-24T01:08:21Z

Added support for FENV_ACCESS pragma on hard-float ARM platforms. Also changes were made to clang/test/Parser/pragma-fp-warn.c so that for thumbv7a only the soft-float-abi target case is checked.

github-actions · 2025-04-24T01:08:40Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-04-24T01:09:12Z

@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-arm

@llvm/pr-subscribers-clang

Author: Erik Enikeev (Varnike)

Changes

Added support for FENV_ACCESS pragma on hard-float ARM platforms. Also changes were made to clang/test/Parser/pragma-fp-warn.c so that for thumbv7a only the soft-float-abi target case is checked.

Full diff: https://github.com/llvm/llvm-project/pull/137101.diff

2 Files Affected:

(modified) clang/lib/Basic/Targets/ARM.cpp (+2)
(modified) clang/test/Parser/pragma-fp-warn.c (+1-1)

diff --git a/clang/lib/Basic/Targets/ARM.cpp b/clang/lib/Basic/Targets/ARM.cpp
index ca2c1ffbb0eb7..e3ab6e9abf78a 100644
--- a/clang/lib/Basic/Targets/ARM.cpp
+++ b/clang/lib/Basic/Targets/ARM.cpp
@@ -363,6 +363,8 @@ ARMTargetInfo::ARMTargetInfo(const llvm::Triple &Triple,
                            : "\01mcount";
 
   SoftFloatABI = llvm::is_contained(Opts.FeaturesAsWritten, "+soft-float-abi");
+  if (!SoftFloatABI)
+    HasStrictFP = true;
 }
 
 StringRef ARMTargetInfo::getABI() const { return ABI; }
diff --git a/clang/test/Parser/pragma-fp-warn.c b/clang/test/Parser/pragma-fp-warn.c
index c52bd4e4805ab..f743cb87997dc 100644
--- a/clang/test/Parser/pragma-fp-warn.c
+++ b/clang/test/Parser/pragma-fp-warn.c
@@ -1,6 +1,6 @@
 
 // RUN: %clang_cc1 -triple wasm32 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
-// RUN: %clang_cc1 -triple thumbv7 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
+// RUN: %clang_cc1 -triple thumbv7 -fsyntax-only -target-feature +soft-float-abi  -Wno-unknown-pragmas -Wignored-pragmas -verify %s
 // RUN: %clang_cc1 -DEXPOK -triple aarch64 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
 // RUN: %clang_cc1 -DEXPOK -triple x86_64 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
 // RUN: %clang_cc1 -DEXPOK -triple systemz -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s

davemgreen · 2025-04-24T13:28:57Z

I believe the backend would still need work to make sure this is supported, which has not been done yet. I was expecting it to fail more noisily, but it appears the strict nodes are lowered to generic nodes. That doesn't mean that strict-fp is supported by the Arm backend, as it might not handle the instructions correctly at the moment (moving instructions past one another, for example).

davemgreen · 2025-04-24T13:30:43Z

@john-brawn-arm did the work for AArch64, I'm not sure if he would have an idea how much work it would involve for the Arm backend.

john-brawn-arm · 2025-04-24T13:58:35Z

It's been several years since I looked at this, but from my notes what's needed before setting HasStrictFP=true is:

Instruction selection patterns for strict fp ops. This is probably just changing e.g. "fadd" to "any_fadd" as I expect for most or all of the instruction selection patterns work fine with strict fp.
Instructions that can raise fp exceptions need to have mayRaiseFPException=1. Without this transformations can move around or just remove instructions that raise fp exceptions.
We need proper modelling of reads/write of the FPSCR rounding control bits for controllable rounding modes to be handled correctly.

john-brawn-arm · 2025-04-24T14:09:21Z

Simple example where setting HasStrictFP=true without doing the above gives wrong results:

int fetestexcept( int excepts );
#define FE_DIVBYZERO 0x04

int fn(float x) {
#pragma STDC FENV_ACCESS ON
  x / 0;
  return fetestexcept(FE_DIVBYZERO);
}

With --target=aarch64-none-elf we get a divide instruction and so return FE_DIVBYZERO, with --target=arm-non-eabi we get no divide instruction and return 0.

Varnike · 2025-07-16T11:29:41Z

Hello!

Thank you for the comments and sorry for the delayed response.

I've now returned to this task and plan to work towards full support for strict fp on hard-float ARM targets.

Based on @john-brawn-arm's example and suggestions, I was able to add support for STRICT_FDIV. With such changes the example is compiled correctly (with divide instruction). Could you please review these changes and let me know if you spot any issues? Are there any additional cases that require special handling? Also, is there a test suite that would be sufficient to cover most strict fp usage scenarios?

If these changes are acceptable, I plan to proceed with support for the remaining strict fp ops.

davemgreen · 2025-07-17T10:21:33Z

Hi - I went looking at out internal tracker for what happened the last time we enabled strictfp for AArch64. This was the list of patches mentioned, there might have been some more either before these or some minor fixups.

94843ea7d7e5 [AArch64] Make machine combiner patterns preserve MIFlags
6f53960d6416 [AArch64] Adjust machine-combiner-reassociate.mir test
bca998ed3c9a [AArch64] Generate fcmps when appropriate for neon intrinsics
0d8092dd485a [AArch64] Fix legalization of v1f64 strict_fsetcc and strict_fsetccs
d4342efb6959 [AArch64] Add instruction selection for strict FP
9d68ed08178d [AArch64] Allow strict opcodes in fp->int->fp patterns
b670da798d35 [AArch64] Allow strict opcodes in indexed fmul and fma patterns
d916856bee11 [AArch64] Allow strict opcodes in faddp patterns
8e17c9613f36 [AArch64] Add some missing strict FP vector lowering
bbd7eac800e6 [AArch64] Remove an unused variable in my previous patch
12c1022679d4 [AArch64] Lowering and legalization of strict FP16
1b1466c34669 [AArch64] Adjust aarch64 constrained intrinsics tests and un-XFAIL
27a8735a444f [AArch64] Add mayRaiseFPException to appropriate instructions
88ac25b357aa [MachineCSE] Allow PRE of instructions that read physical registers
2d8c1597e51c [MIRVRegNamer] Avoid opcode hash collision
49510c50200c [AArch64] Mark all instructions that read/write FPCR as doing so
9e3264ab2021 [FPEnv] Enable strict fp for AArch64 in clang

We would probably need something similar for Arm - it is worth not underestimating the amount of work we would need to do, but if you are happy to try and push it through we would need to handle all the strict variants of the instructions and add mayRaiseFPException/FPCR to the relevant places.

Varnike · 2025-07-18T09:10:06Z

Now I have an approximate volume of work and plan to try to do it. I have provided these edits rather as an example of correction for a particular case (not taking tests into account), by analogy with which other cases can be considered. Therefore, I would like to know how correct and complete they are.

davemgreen · 2025-07-18T09:37:14Z

@john-brawn-arm any ideas? It looks OK to me. (It would need to handle all instructions before HasStrictFP was added though).

Varnike · 2025-08-01T19:16:11Z

Hi,

I added the processing of remaining strict fp ops for VFP and adjusted the tests. Could you please review these changes and tell what else needs to be done? I also have a question about whether it's necessary to change the instruction selection patterns for MVE and NEON (as was done for VFP), since, as far as I understand, they do not provide full ieee compliance on all ARM platforms.

Regarding HasStrictFP setting, I have left this commit for now; however, it can only be considered once the conditions you specified are met.

Varnike · 2025-08-06T08:01:18Z

@john-brawn-arm, @davemgreen can you please take a look at the patch?

davemgreen

I believe it looks sensible. We will need strict-fp tests for the new operations now supported, and there is quite a bit going on in this patch. Neon and MVE sometimes don't implement ieee-fp exactly, I would need to remind myself of the details.

I would not expect the tests to change (like llvm/test/CodeGen/Thumb2/mve-fmas.ll) - strictfp should not make the normal case worse. Do you know what might be causing those to be different?

Varnike · 2025-08-13T12:29:10Z

We will need strict-fp tests for the new operations now supported, and there is quite a bit going on in this patch.

Will it be enough to add strict fp tests similar to those made for AArch64?

Regarding the changes in some tests (e.g. Thumb2/mve-fmas.ll): they arise after updating patterns in ARMInstrVFP.td, which in turn reorders some instructions. For VFP tests, as far as I understand, such strict fp behavior can be expected, but I’m unsure to what extent the MVE tests should be affected.

github-actions · 2025-08-14T05:49:03Z

✅ With the latest revision this PR passed the C/C++ code formatter.

davemgreen · 2025-08-14T06:36:38Z

Will it be enough to add strict fp tests similar to those made for AArch64?

Yeah I think so for the most part. There is the added complication of soft-fp but it should mostly be a case of testing all the strict-fp operations under a couple of configs to make sure everything works OK.
Note that there is talk of changing how the constrained-fp intrinsics are represented in the IR https://discourse.llvm.org/t/rfc-change-of-strict-fp-operation-representation-in-ir/85021, but my understanding is that they will work the same for codegen.

Regarding the changes in some tests (e.g. Thumb2/mve-fmas.ll): they arise after updating patterns in ARMInstrVFP.td, which in turn reorders some instructions. For VFP tests, as far as I understand, such strict fp behavior can be expected, but I’m unsure to what extent the MVE tests should be affected.

I think we need to teach it that the fpscr bits of nofpexcept instructions and fpscr_nzcv do not interact and can be treated separately for scheduling. In general we need to be careful not to make perf worse for all the people who do not use strict-fp. Maybe in this case it is caused by the vmrs APSR_nzcv, fpscr instructions?

…s. Also changes were made to clang/test/Parser/pragma-fp-warn.c so that for thumbv7a only the soft-float-abi target case is checked.

…mayRaiseFPException to appropriate instructions

Varnike · 2025-08-27T18:28:37Z

Managed to remove the connection between fpscr bits of nofpexcept instructions and fpscr_nzcv, now the tests like Thumb2/mve-fmas.ll do not change.
Added strict fp tests.
Function calls marked as possibly changing FPSCR (by analogy with https://reviews.llvm.org/D143001).

Varnike · 2025-08-28T14:03:46Z

Fix test failures reported by CI.

davemgreen

It looks like this is progressing nicely. It's quite a large change and if there are any parts we can pull out and commit separately, that would help.

davemgreen · 2025-08-28T06:56:23Z

llvm/lib/Target/ARM/ARMRegisterInfo.td

 def SPSR       : ARMReg<2,  "spsr">;
 def FPSCR      : ARMReg<3,  "fpscr">;
-def FPSCR_NZCV : ARMReg<3,  "fpscr_nzcv"> {
-  let Aliases = [FPSCR];


Can we really say this doesn't alias FPSCR? If we set fpscr_nzcv it will set bits of fpscr and vice-versa. The exception status bits (and the rounding controls) are separate from nzcv, but that might not be the only use of the register.

Yes, in general they should be aliases. However, this behavior will lead to the test changes discussed above (e.g. Thumb2/mve-fmas.ll): they occur because some of the instructions begin to change the FPSCR (that is, FPSCR_NZCV too because they are aliases), so during scheduling it is no longer possible to leave everything as is.

I think we need to teach it that the fpscr bits of nofpexcept instructions and fpscr_nzcv do not interact and can be treated separately for scheduling.

My understanding is that while they alias, this approach won’t work (please correct me if I'm wrong). To keep the existing tests unchanged, it might be necessary to change the patterns for most instructions which can raise fp exception and change FPSCR: instead of using any_* add individual patterns fot strict_*. This can lead to significant complications in writing patterns for fp.

Can you tell if you see a simpler way to solve this? Also how critical is the fact that some tests (e.g. Thumb2/mve-fmas.ll) may change?

davemgreen · 2025-08-29T08:18:18Z

llvm/test/CodeGen/ARM/strict-fp-ops.ll

@@ -0,0 +1,202 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5


For reference when they are needed, this was a list of strict-fp ops that we would need to make sure are tested. Hopefully a lot of them are simple expanded into libcalls.

llvm.experimental.constrained.acos llvm.experimental.constrained.asin llvm.experimental.constrained.atan2 llvm.experimental.constrained.atan llvm.experimental.constrained.ceil llvm.experimental.constrained.cosh llvm.experimental.constrained.cos llvm.experimental.constrained.exp2 llvm.experimental.constrained.exp llvm.experimental.constrained.fadd llvm.experimental.constrained.fcmp llvm.experimental.constrained.fcmp llvm.experimental.constrained.fdiv llvm.experimental.constrained.floor llvm.experimental.constrained.fma llvm.experimental.constrained.fmuladd llvm.experimental.constrained.fmul llvm.experimental.constrained.fpext llvm.experimental.constrained.fptosi llvm.experimental.constrained.fptoui llvm.experimental.constrained.fptrunc llvm.experimental.constrained.frem llvm.experimental.constrained.fsub llvm.experimental.constrained.ldexp llvm.experimental.constrained.llrint llvm.experimental.constrained.llround llvm.experimental.constrained.log10 llvm.experimental.constrained.log2 llvm.experimental.constrained.log llvm.experimental.constrained.lrint llvm.experimental.constrained.lround llvm.experimental.constrained.maximum llvm.experimental.constrained.maxnum llvm.experimental.constrained.minimum llvm.experimental.constrained.minnum llvm.experimental.constrained.nearbyint llvm.experimental.constrained.powi llvm.experimental.constrained.pow llvm.experimental.constrained.rint llvm.experimental.constrained.roundeven llvm.experimental.constrained.round llvm.experimental.constrained.sinh llvm.experimental.constrained.sin llvm.experimental.constrained.sitofp llvm.experimental.constrained.sqrt llvm.experimental.constrained.tanh llvm.experimental.constrained.tan llvm.experimental.constrained.trunc llvm.experimental.constrained.uitofp

There are double. float and fp16 types that are of interest, both soft-fp and hard. I hope the others are handled automatically. I'm not sure what happens with vectors, both neon and mve. The frontend intrinsics I think should be considered as being in the current fp environment, strict-fp and all. Some do not do denormal flushing or set flags though.

Varnike · 2025-09-04T11:11:44Z

Added remaining tests for f16/32/64 intrinsic.
Fixed lowering for several f16 ops, which leaded to errors at selection stage (https://godbolt.org/z/G8E9Ksv4P).
The only untested intrinsics are maximum and minimum. At the moment they also lead to compilation fail (https://godbolt.org/z/cbdx5bj9v, same for llvm.maximum and minimum). Is there support for these intrinsics in the backend and should it be addressed within this patch?

I haven’t addressed FPSCR yet; before proceeding I'd like to hear your opinion on how should it be done.

Varnike · 2025-09-15T11:27:44Z

@davemgreen @john-brawn-arm, ping.

davemgreen · 2025-09-16T13:36:37Z

I will try and find some time to take a look at this, I haven't managed to yet. Are there any smaller parts we can split away into their own reviews to start moving it forward?

Varnike · 2025-09-16T16:00:41Z

Are there any smaller parts we can split away into their own reviews to start moving it forward?

Yes, I think it will be possible to divide the patch into several separate ones. However, they will depend on the patch with changes to intruction selection patterns (commit 9e5fbe0), which should be processed first. This patch affects changes in tests related to FPSCR (e.g. Thumb2/mve-fmas.ll). Therefore, at first it would be better to decide on the position regarding this issue, after which it will be possible to split the patch.

As for how to organize the patches after separation: should each individual patch have a separate PR?

davemgreen · 2025-09-21T10:47:52Z

Am I right that the bits of FPSCR that are read by the instruction are just the rounding mode? Can we add a new register for just those bits, like FPSCR_NZCV. FPSCR_RM would alias with FPSCR but not FPSCR_NZCV.

def FPSCR_RM   : ARMReg<3,  "fpscr_rm"> {
  let Aliases = [FPSCR];
}

I didn't try it past a few of the basic tests in llvm. The important part is that we don't hoist the instructions that use the rounding mode past an instructions that sets it. (And an instruction that sets FPSCR doesn't get moved past one that sets FPSCR_NZCV).

The alternative might be to try and make fpscr_rm and fpscr_nzcv SubRegs of fpscr, which might work but ran into other issues when I tried it.

As for how to organize the patches after separation: should each individual patch have a separate PR?

Ideally yeah, So long as they are individually testable. It is more difficult to review larger changes and get them to stick in llvm without causing issues and needing to be reverted.

Varnike · 2025-09-25T12:55:03Z

Can we add a new register for just those bits, like FPSCR_NZCV. FPSCR_RM would alias with FPSCR but not FPSCR_NZCV.

Thanks for the comment! I took this approach and it seems that everything works with it.

I managed to split the patch into 4 parts that can be tested independently:

[ARM] Add instruction selection for strict FP #160696
[ARM] Add mayRaiseFPException to appropriate instructions and mark all instructions that read/write fpscr rounding bits as doing so #160698
[ARM] Mark function calls as possibly changing FPSCR #160699
commit 06f4aea from this pr which enables strict fp support by setting HasStrictFP = true. It can only be applied after the previous patches have been committed. Would it be preferable to leave this as a single commit in the current PR, or to move it into a separate new one?

However, this split leaves three tests, added by analogy with AArch64, that I haven’t yet included in any patch. One of them (strict-fp-ops.ll) depends on both patches, which I still wanted to keep separate for review convenience. The other two (strict-fp-int-promote.ll, strictfp_f16_abi_promote.ll) are already supported correctly without my changes. Perhaps it would be better to add them after the other patches have been committed to LLVM? If this approach is not appropriate and the tests should be included in the patches above, please let me know and I will update them.

As shown in llvm#137101, fp16 lrint are not handled correctly on Arm. This adds soft-half promotion for them, reusing the function that promotes a value with operands (and can handle strict fp once that is added).

llvmbot added clang Clang issues not falling into any other category backend:ARM clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Apr 24, 2025

Varnike force-pushed the main branch 2 times, most recently from f57085e to 1b9b201 Compare June 9, 2025 16:08

Varnike force-pushed the main branch 2 times, most recently from 0231231 to 6010639 Compare July 16, 2025 11:18

davemgreen requested review from davemgreen and john-brawn-arm July 18, 2025 09:36

Varnike force-pushed the main branch from 6010639 to cc159d8 Compare August 1, 2025 18:45

llvmbot added the llvm:globalisel label Aug 1, 2025

davemgreen reviewed Aug 11, 2025

View reviewed changes

Varnike added 3 commits August 27, 2025 18:50

[ARM] Added support for FENV_ACCESS pragma on hard-float ARM platform…

06f4aea

…s. Also changes were made to clang/test/Parser/pragma-fp-warn.c so that for thumbv7a only the soft-float-abi target case is checked.

[ARM] Lowering and legalization of strict ops

01aeb80

[ARM] Change intruction selection patterns for strict fp ops and add …

9e5fbe0

…mayRaiseFPException to appropriate instructions

[ARM] remove FPSCR alias from FPSCR_NZCV

c48aee6

Varnike force-pushed the main branch from cc159d8 to e8b7fdf Compare August 27, 2025 18:10

Varnike added 3 commits August 28, 2025 16:48

[ARM] Adjust tests after adding strict fp

908a32d

[ARM] Mark function calls as possibly changing FPSCR

b203c39

[ARM] Add various strictfp tests

3a9ac7c

Varnike force-pushed the main branch from e8b7fdf to 3a9ac7c Compare August 28, 2025 14:00

davemgreen reviewed Aug 29, 2025

View reviewed changes

Varnike added 2 commits September 3, 2025 17:35

[ARM] Add various f32/f64 intrinsic tests

1359c8f

[ARM] Fix lowering for fp16 ops & add fp16 intrinsics test

700169d

davemgreen mentioned this pull request Sep 28, 2025

[ARM][SDAG] Half promote llvm.lrint nodes. #161088

Open

		@@ -0,0 +1,202 @@
		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5

[ARM] enable FENV_ACCESS pragma support for hard-float targets #137101

Are you sure you want to change the base?

[ARM] enable FENV_ACCESS pragma support for hard-float targets #137101

Uh oh!

Conversation

Varnike commented Apr 24, 2025

Uh oh!

github-actions bot commented Apr 24, 2025

Uh oh!

llvmbot commented Apr 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davemgreen commented Apr 24, 2025

Uh oh!

davemgreen commented Apr 24, 2025

Uh oh!

john-brawn-arm commented Apr 24, 2025

Uh oh!

john-brawn-arm commented Apr 24, 2025

Uh oh!

Varnike commented Jul 16, 2025

Uh oh!

davemgreen commented Jul 17, 2025

Uh oh!

Varnike commented Jul 18, 2025

Uh oh!

davemgreen commented Jul 18, 2025

Uh oh!

Varnike commented Aug 1, 2025

Uh oh!

Varnike commented Aug 6, 2025

Uh oh!

davemgreen left a comment

Choose a reason for hiding this comment

Uh oh!

Varnike commented Aug 13, 2025

Uh oh!

github-actions bot commented Aug 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davemgreen commented Aug 14, 2025

Uh oh!

Varnike commented Aug 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Varnike commented Aug 28, 2025

Uh oh!

davemgreen left a comment

Choose a reason for hiding this comment

Uh oh!

davemgreen Aug 28, 2025

Choose a reason for hiding this comment

Uh oh!

Varnike Sep 2, 2025

Choose a reason for hiding this comment

Uh oh!

davemgreen Aug 29, 2025

Choose a reason for hiding this comment

Uh oh!

Varnike commented Sep 4, 2025

Uh oh!

Varnike commented Sep 15, 2025

Uh oh!

davemgreen commented Sep 16, 2025

Uh oh!

Varnike commented Sep 16, 2025

Uh oh!

davemgreen commented Sep 21, 2025

Uh oh!

Varnike commented Sep 25, 2025

Uh oh!

Uh oh!

llvmbot commented Apr 24, 2025 •

edited

Loading

github-actions bot commented Aug 14, 2025 •

edited

Loading

Varnike commented Aug 27, 2025 •

edited

Loading