-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[ARM] enable FENV_ACCESS pragma support for hard-float targets #137101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-clang Author: Erik Enikeev (Varnike) ChangesAdded support for FENV_ACCESS pragma on hard-float ARM platforms. Also changes were made to clang/test/Parser/pragma-fp-warn.c so that for thumbv7a only the soft-float-abi target case is checked. Full diff: https://github.com/llvm/llvm-project/pull/137101.diff 2 Files Affected:
diff --git a/clang/lib/Basic/Targets/ARM.cpp b/clang/lib/Basic/Targets/ARM.cpp
index ca2c1ffbb0eb7..e3ab6e9abf78a 100644
--- a/clang/lib/Basic/Targets/ARM.cpp
+++ b/clang/lib/Basic/Targets/ARM.cpp
@@ -363,6 +363,8 @@ ARMTargetInfo::ARMTargetInfo(const llvm::Triple &Triple,
: "\01mcount";
SoftFloatABI = llvm::is_contained(Opts.FeaturesAsWritten, "+soft-float-abi");
+ if (!SoftFloatABI)
+ HasStrictFP = true;
}
StringRef ARMTargetInfo::getABI() const { return ABI; }
diff --git a/clang/test/Parser/pragma-fp-warn.c b/clang/test/Parser/pragma-fp-warn.c
index c52bd4e4805ab..f743cb87997dc 100644
--- a/clang/test/Parser/pragma-fp-warn.c
+++ b/clang/test/Parser/pragma-fp-warn.c
@@ -1,6 +1,6 @@
// RUN: %clang_cc1 -triple wasm32 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
-// RUN: %clang_cc1 -triple thumbv7 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
+// RUN: %clang_cc1 -triple thumbv7 -fsyntax-only -target-feature +soft-float-abi -Wno-unknown-pragmas -Wignored-pragmas -verify %s
// RUN: %clang_cc1 -DEXPOK -triple aarch64 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
// RUN: %clang_cc1 -DEXPOK -triple x86_64 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
// RUN: %clang_cc1 -DEXPOK -triple systemz -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
|
I believe the backend would still need work to make sure this is supported, which has not been done yet. I was expecting it to fail more noisily, but it appears the strict nodes are lowered to generic nodes. That doesn't mean that strict-fp is supported by the Arm backend, as it might not handle the instructions correctly at the moment (moving instructions past one another, for example). |
@john-brawn-arm did the work for AArch64, I'm not sure if he would have an idea how much work it would involve for the Arm backend. |
It's been several years since I looked at this, but from my notes what's needed before setting HasStrictFP=true is:
|
Simple example where setting HasStrictFP=true without doing the above gives wrong results:
With --target=aarch64-none-elf we get a divide instruction and so return FE_DIVBYZERO, with --target=arm-non-eabi we get no divide instruction and return 0. |
f57085e
to
1b9b201
Compare
0231231
to
6010639
Compare
Hello! Thank you for the comments and sorry for the delayed response. I've now returned to this task and plan to work towards full support for strict fp on hard-float ARM targets. Based on @john-brawn-arm's example and suggestions, I was able to add support for STRICT_FDIV. With such changes the example is compiled correctly (with divide instruction). Could you please review these changes and let me know if you spot any issues? Are there any additional cases that require special handling? Also, is there a test suite that would be sufficient to cover most strict fp usage scenarios? If these changes are acceptable, I plan to proceed with support for the remaining strict fp ops. |
Hi - I went looking at out internal tracker for what happened the last time we enabled strictfp for AArch64. This was the list of patches mentioned, there might have been some more either before these or some minor fixups.
We would probably need something similar for Arm - it is worth not underestimating the amount of work we would need to do, but if you are happy to try and push it through we would need to handle all the strict variants of the instructions and add mayRaiseFPException/FPCR to the relevant places. |
Now I have an approximate volume of work and plan to try to do it. I have provided these edits rather as an example of correction for a particular case (not taking tests into account), by analogy with which other cases can be considered. Therefore, I would like to know how correct and complete they are. |
@john-brawn-arm any ideas? It looks OK to me. (It would need to handle all instructions before HasStrictFP was added though). |
Hi, I added the processing of remaining strict fp ops for VFP and adjusted the tests. Could you please review these changes and tell what else needs to be done? I also have a question about whether it's necessary to change the instruction selection patterns for MVE and NEON (as was done for VFP), since, as far as I understand, they do not provide full ieee compliance on all ARM platforms. Regarding HasStrictFP setting, I have left this commit for now; however, it can only be considered once the conditions you specified are met. |
@john-brawn-arm, @davemgreen can you please take a look at the patch? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it looks sensible. We will need strict-fp tests for the new operations now supported, and there is quite a bit going on in this patch. Neon and MVE sometimes don't implement ieee-fp exactly, I would need to remind myself of the details.
I would not expect the tests to change (like llvm/test/CodeGen/Thumb2/mve-fmas.ll) - strictfp should not make the normal case worse. Do you know what might be causing those to be different?
Will it be enough to add strict fp tests similar to those made for AArch64? Regarding the changes in some tests (e.g. Thumb2/mve-fmas.ll): they arise after updating patterns in ARMInstrVFP.td, which in turn reorders some instructions. For VFP tests, as far as I understand, such strict fp behavior can be expected, but I’m unsure to what extent the MVE tests should be affected. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Yeah I think so for the most part. There is the added complication of soft-fp but it should mostly be a case of testing all the strict-fp operations under a couple of configs to make sure everything works OK.
I think we need to teach it that the fpscr bits of nofpexcept instructions and |
…s. Also changes were made to clang/test/Parser/pragma-fp-warn.c so that for thumbv7a only the soft-float-abi target case is checked.
…mayRaiseFPException to appropriate instructions
|
Fix test failures reported by CI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is progressing nicely. It's quite a large change and if there are any parts we can pull out and commit separately, that would help.
def SPSR : ARMReg<2, "spsr">; | ||
def FPSCR : ARMReg<3, "fpscr">; | ||
def FPSCR_NZCV : ARMReg<3, "fpscr_nzcv"> { | ||
let Aliases = [FPSCR]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we really say this doesn't alias FPSCR? If we set fpscr_nzcv it will set bits of fpscr and vice-versa. The exception status bits (and the rounding controls) are separate from nzcv, but that might not be the only use of the register.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in general they should be aliases. However, this behavior will lead to the test changes discussed above (e.g. Thumb2/mve-fmas.ll): they occur because some of the instructions begin to change the FPSCR (that is, FPSCR_NZCV too because they are aliases), so during scheduling it is no longer possible to leave everything as is.
I think we need to teach it that the fpscr bits of nofpexcept instructions and fpscr_nzcv do not interact and can be treated separately for scheduling.
My understanding is that while they alias, this approach won’t work (please correct me if I'm wrong). To keep the existing tests unchanged, it might be necessary to change the patterns for most instructions which can raise fp exception and change FPSCR: instead of using any_* add individual patterns fot strict_*. This can lead to significant complications in writing patterns for fp.
Can you tell if you see a simpler way to solve this? Also how critical is the fact that some tests (e.g. Thumb2/mve-fmas.ll) may change?
@@ -0,0 +1,202 @@ | |||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference when they are needed, this was a list of strict-fp ops that we would need to make sure are tested. Hopefully a lot of them are simple expanded into libcalls.
llvm.experimental.constrained.acos
llvm.experimental.constrained.asin
llvm.experimental.constrained.atan2
llvm.experimental.constrained.atan
llvm.experimental.constrained.ceil
llvm.experimental.constrained.cosh
llvm.experimental.constrained.cos
llvm.experimental.constrained.exp2
llvm.experimental.constrained.exp
llvm.experimental.constrained.fadd
llvm.experimental.constrained.fcmp
llvm.experimental.constrained.fcmp
llvm.experimental.constrained.fdiv
llvm.experimental.constrained.floor
llvm.experimental.constrained.fma
llvm.experimental.constrained.fmuladd
llvm.experimental.constrained.fmul
llvm.experimental.constrained.fpext
llvm.experimental.constrained.fptosi
llvm.experimental.constrained.fptoui
llvm.experimental.constrained.fptrunc
llvm.experimental.constrained.frem
llvm.experimental.constrained.fsub
llvm.experimental.constrained.ldexp
llvm.experimental.constrained.llrint
llvm.experimental.constrained.llround
llvm.experimental.constrained.log10
llvm.experimental.constrained.log2
llvm.experimental.constrained.log
llvm.experimental.constrained.lrint
llvm.experimental.constrained.lround
llvm.experimental.constrained.maximum
llvm.experimental.constrained.maxnum
llvm.experimental.constrained.minimum
llvm.experimental.constrained.minnum
llvm.experimental.constrained.nearbyint
llvm.experimental.constrained.powi
llvm.experimental.constrained.pow
llvm.experimental.constrained.rint
llvm.experimental.constrained.roundeven
llvm.experimental.constrained.round
llvm.experimental.constrained.sinh
llvm.experimental.constrained.sin
llvm.experimental.constrained.sitofp
llvm.experimental.constrained.sqrt
llvm.experimental.constrained.tanh
llvm.experimental.constrained.tan
llvm.experimental.constrained.trunc
llvm.experimental.constrained.uitofp
There are double. float and fp16 types that are of interest, both soft-fp and hard. I hope the others are handled automatically. I'm not sure what happens with vectors, both neon and mve. The frontend intrinsics I think should be considered as being in the current fp environment, strict-fp and all. Some do not do denormal flushing or set flags though.
I haven’t addressed FPSCR yet; before proceeding I'd like to hear your opinion on how should it be done. |
@davemgreen @john-brawn-arm, ping. |
I will try and find some time to take a look at this, I haven't managed to yet. Are there any smaller parts we can split away into their own reviews to start moving it forward? |
Yes, I think it will be possible to divide the patch into several separate ones. However, they will depend on the patch with changes to intruction selection patterns (commit 9e5fbe0), which should be processed first. This patch affects changes in tests related to FPSCR (e.g. Thumb2/mve-fmas.ll). Therefore, at first it would be better to decide on the position regarding this issue, after which it will be possible to split the patch. As for how to organize the patches after separation: should each individual patch have a separate PR? |
Am I right that the bits of FPSCR that are read by the instruction are just the rounding mode? Can we add a new register for just those bits, like FPSCR_NZCV. FPSCR_RM would alias with FPSCR but not FPSCR_NZCV.
I didn't try it past a few of the basic tests in llvm. The important part is that we don't hoist the instructions that use the rounding mode past an instructions that sets it. (And an instruction that sets FPSCR doesn't get moved past one that sets FPSCR_NZCV). The alternative might be to try and make fpscr_rm and fpscr_nzcv SubRegs of fpscr, which might work but ran into other issues when I tried it.
Ideally yeah, So long as they are individually testable. It is more difficult to review larger changes and get them to stick in llvm without causing issues and needing to be reverted. |
Thanks for the comment! I took this approach and it seems that everything works with it. I managed to split the patch into 4 parts that can be tested independently:
However, this split leaves three tests, added by analogy with AArch64, that I haven’t yet included in any patch. One of them (strict-fp-ops.ll) depends on both patches, which I still wanted to keep separate for review convenience. The other two (strict-fp-int-promote.ll, strictfp_f16_abi_promote.ll) are already supported correctly without my changes. Perhaps it would be better to add them after the other patches have been committed to LLVM? If this approach is not appropriate and the tests should be included in the patches above, please let me know and I will update them. |
As shown in llvm#137101, fp16 lrint are not handled correctly on Arm. This adds soft-half promotion for them, reusing the function that promotes a value with operands (and can handle strict fp once that is added).
As shown in llvm#137101, fp16 lrint are not handled correctly on Arm. This adds soft-half promotion for them, reusing the function that promotes a value with operands (and can handle strict fp once that is added).
As shown in llvm#137101, fp16 lrint are not handled correctly on Arm. This adds soft-half promotion for them, reusing the function that promotes a value with operands (and can handle strict fp once that is added).
Added support for FENV_ACCESS pragma on hard-float ARM platforms. Also changes were made to clang/test/Parser/pragma-fp-warn.c so that for thumbv7a only the soft-float-abi target case is checked.