[AArch64][CodeGen] Fix crash when fptrunc returns fp16 with +nofp attr #81724

nasherm · 2024-02-14T11:06:07Z

When performing lowering of the fptrunc opcode returning fp16 with the +nofp flag enabled we could trigger a compiler crash. This is because we had no custom lowering implemented. This patch implements a custom lowering for the case in which we need to promote an fp16 return type for fptrunc when the +nofp attr is enabled.

llvmbot · 2024-02-14T11:06:37Z

@llvm/pr-subscribers-backend-aarch64

Author: Nashe Mncube (nasherm)

Changes

When performing lowering of the fptrunc opcode returning fp16 with the +nofp flag enabled we could trigger a compiler crash. This is because we had no custom lowering implemented. This patch implements a custom lowering for the case in which we need to promote an fp16 return type for fptrunc when the +nofp attr is enabled.

Full diff: https://github.com/llvm/llvm-project/pull/81724.diff

2 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+9)
(added) llvm/test/CodeGen/AArch64/float16-promotion-with-nofp.ll (+21)

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index a3b7e3128ac1a4..c58600ae386730 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -25008,6 +25008,15 @@ void AArch64TargetLowering::ReplaceNodeResults(
       Results.push_back(
           LowerToPredicatedOp(SDValue(N, 0), DAG, AArch64ISD::MULHU_PRED));
     return;
+  case ISD::FP_ROUND:{
+    if (N->getValueType(0) == MVT::f16 && !Subtarget->hasFullFP16()) {
+        // Promote fp16 result to legal type
+        SDLoc DL(N);
+        EVT NVT = getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
+        Results.push_back(DAG.getNode(ISD::FP16_TO_FP, DL, NVT, N->getOperand(0)));
+    }
+    return;
+  }
   case ISD::FP_TO_UINT:
   case ISD::FP_TO_SINT:
   case ISD::STRICT_FP_TO_SINT:
diff --git a/llvm/test/CodeGen/AArch64/float16-promotion-with-nofp.ll b/llvm/test/CodeGen/AArch64/float16-promotion-with-nofp.ll
new file mode 100644
index 00000000000000..03426579131a1d
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/float16-promotion-with-nofp.ll
@@ -0,0 +1,21 @@
+; RUN: llc -mcpu=cortex-r82 -O1 -o - %s | FileCheck %s
+
+; Source used:
+; __fp16 f2h(float a) { return a; }
+; Compiled with: clang --target=aarch64-arm-none-eabi -march=armv8-r+nofp
+
+define hidden noundef nofpclass(nan inf) half @f2h(float noundef nofpclass(nan inf) %a) local_unnamed_addr #0 {
+;CHECK:      f2h:                                    // @f2h
+;CHECK-NEXT: // %bb.0:                               // %entry
+;CHECK-NEXT:     str x30, [sp, #-16]!                // 8-byte Folded Spill
+;CHECK-NEXT:     bl  __gnu_h2f_ieee
+;CHECK-NEXT:     ldr x30, [sp], #16                  // 8-byte Folded Reload
+;CHECK-NEXT:     ret
+entry:
+  %0 = fptrunc float %a to half
+  ret half %0
+}
+
+
+attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) "denormal-fp-math"="preserve-sign,preserve-sign" "no-infs-fp-math"="true" "no-nans-fp-math"="true" "no-signed-zeros-fp-math"="true" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+crc,+lse,+pauth,+ras,+rcpc,+sb,+ssbs,+v8r,-complxnum,-dotprod,-fmv,-fp-armv8,-fp16fml,-fullfp16,-jsconv,-neon,-rdm" }
+

github-actions · 2024-02-14T11:08:33Z

✅ With the latest revision this PR passed the C/C++ code formatter.

nasherm · 2024-02-14T11:32:19Z

Investigating why new test is failing

davemgreen · 2024-02-14T11:40:18Z

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

@@ -25008,6 +25008,16 @@ void AArch64TargetLowering::ReplaceNodeResults(
      Results.push_back(
          LowerToPredicatedOp(SDValue(N, 0), DAG, AArch64ISD::MULHU_PRED));
    return;
+  case ISD::FP_ROUND: {


This looks like it is doing something that should be handled by the generic legalizer.

Would it be better to not mark the FP_RUND as custom instead?

So I'm looking into not making this custom and offloading the work to the generic legalizer but it appears to cause another crash elsewhere. Currently debugging this

Using a generic legalizer we run into the following:

Initial selection DAG: %bb.0 'f2h:entry' SelectionDAG has 11 nodes: t0: ch,glue = EntryToken t2: i32,ch = CopyFromReg t0, Register:i32 %0 t3: f32 = bitcast t2 t5: f16 = fp_round t3, TargetConstant:i64<0> t6: i16 = bitcast t5 t7: i32 = any_extend t6 t9: ch,glue = CopyToReg t0, Register:i32 $w0, t7 t10: ch = AArch64ISD::RET_GLUE t9, Register:i32 $w0, t9:1 Optimized lowered selection DAG: %bb.0 'f2h:entry' SelectionDAG has 11 nodes: t0: ch,glue = EntryToken t2: i32,ch = CopyFromReg t0, Register:i32 %0 t3: f32 = bitcast t2 t5: f16 = fp_round t3, TargetConstant:i64<0> t6: i16 = bitcast t5 t7: i32 = any_extend t6 t9: ch,glue = CopyToReg t0, Register:i32 $w0, t7 t10: ch = AArch64ISD::RET_GLUE t9, Register:i32 $w0, t9:1 SoftenFloatResult #0: t15: f32 = INSERT_SUBREG undef:i32, t5, TargetConstant:i32<7> LLVM ERROR: Do not know how to soften the result of this operator!

After investigating a bit it appears that INSERT_SUBREG is an MIR instruction that is used to implement any_extend. This is confusing to me as I wouldn't expect the DAG to try and legalize an MIR instruction. Am I right in thinking this is a bug?

It is.. a very old bit of the code that probably would be better if it worked differently. But I'm not sure if it's easy to change.

I believe it comes from AArch64TargetLowering::LowerBITCAST? Could that function behave differently, or could the ISD::BITCAST be not marked Custom with no-fp?

My newest patch tries to just default to the generic legalizer when attempting to bitcast with the operand being an fp16. Do let me know your thoughts on this approach

llvm/test/CodeGen/AArch64/float16-promotion-with-nofp.ll

davemgreen · 2024-02-17T09:48:11Z

This looks like it's on the right track. I think it should be based on hasFPARMv8, not hasFullFP16.

If I change this to only be custom when we have fp16:

-  setOperationAction(ISD::FP_ROUND, MVT::f16, Custom);  
+  if (!Subtarget->hasFPARMv8())                         
+    setOperationAction(ISD::FP_ROUND, MVT::f16, Custom);

It runs into problems with BITCAST, so If I do the same thing there for the i16/f16/bf16 types then the test compiles, but produces multiple calls that might not be expected. I didn't look too deeply into why that was happening, and there is a chance it might cause problems in other places, but we don't have a lot of tests for "no-fp" codegen so far.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

llvm/test/CodeGen/AArch64/float16-promotion-with-nofp.ll

davemgreen

Thanks. This looks OK to me with a little bit of cleanup so long as others dont disagree.
Make sure you update the strictfp_f16_abi_promote.ll test.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

hvdijk · 2024-02-21T18:41:10Z

This effectively includes my PR #80576, which is ready to merge except for an unresolved question about where and how to document the change. If I can get some guidance on what documentation is wanted there, I can amend that, push it, so that when we then rebase this one, it shows fewer changes?

davemgreen · 2024-02-22T06:38:33Z

This effectively includes my PR #80576, which is ready to merge except for an unresolved question about where and how to document the change. If I can get some guidance on what documentation is wanted there, I can amend that, push it, so that when we then rebase this one, it shows fewer changes?

Oh I thought I had seen a patch like that, but didn't manage to find it in a quick search - only the Arm parts and the generic part, both of which I think have been committed. Thanks for the info, we can try and get that one in.

When performing lowering of the fptrunc opcode returning fp16 with the +nofp flag enabled we could trigger a compiler crash. This is because we had no custom lowering implemented. This patch implements a custom lowering for the case in which we need to promote an fp16 return type for fptrunc when the +nofp attr is enabled. Change-Id: Ibea20a676d40fde3f25e1ade365620071f46ff2b

Change-Id: Icf71578773b4c44fe8d79edd984661ec79fe1b09

Change-Id: Ic64506bb76bcc8f059b1de9ea6041fe8c0093b9c

Change-Id: Ib18e0ceeaab18d31d3fc43daab838ea95d62c2c1

Change-Id: I92af1dc9413486d2c20d90aebf0e377b077ad428

Change-Id: Ic1b7f8774ae529680597b4e032c4a63416ac927a

Change-Id: Icdb7c3742dc30da0652701ffc6c7468b8504e416

nasherm · 2024-02-22T13:18:41Z

Thanks. This looks OK to me with a little bit of cleanup so long as others dont disagree. Make sure you update the strictfp_f16_abi_promote.ll test.

Rebasing onto @hvdijk merged PR updates updates strictfp_f16_abi_promote.ll and reduces the size of this PR. Hopefully addressed all outstanding comments

davemgreen · 2024-02-22T14:36:54Z

Thanks. This looks good to me now. One last thing, can you uncomment the @f16_return test from strictfp_f16_abi_promote.ll, so that we test the strict-fp case too?

With that this LGTM if @ostannard doesn't disagree.

ostannard

LGTM too, assuming the strict-fp test case @davemgreen mentioned passes.

davemgreen

Thanks

Change-Id: I11779e4a7e4494b45c1a166ebc46ef5887b38770

llvmbot added the backend:AArch64 label Feb 14, 2024

nasherm requested a review from ostannard February 14, 2024 11:13

nasherm force-pushed the nashe/fp16-promotion-nofp branch from 6e09677 to d09478d Compare February 14, 2024 11:16

nasherm marked this pull request as draft February 14, 2024 11:40

davemgreen reviewed Feb 14, 2024

View reviewed changes

ostannard requested changes Feb 14, 2024

View reviewed changes

llvm/test/CodeGen/AArch64/float16-promotion-with-nofp.ll Outdated Show resolved Hide resolved

nasherm marked this pull request as ready for review February 15, 2024 13:47

davemgreen reviewed Feb 20, 2024

View reviewed changes

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp Outdated Show resolved Hide resolved

llvm/test/CodeGen/AArch64/float16-promotion-with-nofp.ll Outdated Show resolved Hide resolved

nasherm force-pushed the nashe/fp16-promotion-nofp branch 2 times, most recently from eba9ee8 to 02032f3 Compare February 21, 2024 16:44

davemgreen reviewed Feb 21, 2024

View reviewed changes

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp Show resolved Hide resolved

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp Outdated Show resolved Hide resolved

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp Show resolved Hide resolved

davemgreen requested review from hvdijk and efriedma-quic February 21, 2024 17:57

nasherm added 7 commits February 22, 2024 12:05

clang-format fixes

a70a40c

Change-Id: Icf71578773b4c44fe8d79edd984661ec79fe1b09

Responding to review comments

752605f

Change-Id: Ic64506bb76bcc8f059b1de9ea6041fe8c0093b9c

Responding to review comments

e88f793

Change-Id: Ib18e0ceeaab18d31d3fc43daab838ea95d62c2c1

Using hasFPARMv8 over hasFullFP16

ea9475e

Change-Id: I92af1dc9413486d2c20d90aebf0e377b077ad428

Responding to review comments

fb775fb

Change-Id: Ic1b7f8774ae529680597b4e032c4a63416ac927a

Rebasing onto similar PR

46a77e3

Change-Id: Icdb7c3742dc30da0652701ffc6c7468b8504e416

nasherm force-pushed the nashe/fp16-promotion-nofp branch from 02032f3 to 46a77e3 Compare February 22, 2024 13:14

davemgreen approved these changes Feb 22, 2024

View reviewed changes

ostannard approved these changes Feb 22, 2024

View reviewed changes

davemgreen approved these changes Feb 22, 2024

View reviewed changes

nasherm force-pushed the nashe/fp16-promotion-nofp branch from f2530cb to 53d140a Compare February 22, 2024 15:44

hvdijk approved these changes Feb 22, 2024

View reviewed changes

Uncommenting f16_return within strictfp_f16_abi_promote.ll

f7df5b2

Change-Id: I11779e4a7e4494b45c1a166ebc46ef5887b38770

nasherm force-pushed the nashe/fp16-promotion-nofp branch from 53d140a to f7df5b2 Compare February 22, 2024 15:58

nasherm merged commit 744c005 into llvm:main Feb 22, 2024
3 of 4 checks passed

nasherm deleted the nashe/fp16-promotion-nofp branch February 22, 2024 19:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AArch64][CodeGen] Fix crash when fptrunc returns fp16 with +nofp attr #81724

[AArch64][CodeGen] Fix crash when fptrunc returns fp16 with +nofp attr #81724

nasherm commented Feb 14, 2024

llvmbot commented Feb 14, 2024

github-actions bot commented Feb 14, 2024 •

edited

nasherm commented Feb 14, 2024

davemgreen Feb 14, 2024

nasherm Feb 14, 2024

nasherm Feb 14, 2024 •

edited

davemgreen Feb 14, 2024

nasherm Feb 15, 2024

davemgreen commented Feb 17, 2024

davemgreen left a comment •

edited

hvdijk commented Feb 21, 2024

davemgreen commented Feb 22, 2024

nasherm commented Feb 22, 2024

davemgreen commented Feb 22, 2024

ostannard left a comment

davemgreen left a comment

[AArch64][CodeGen] Fix crash when fptrunc returns fp16 with +nofp attr #81724

[AArch64][CodeGen] Fix crash when fptrunc returns fp16 with +nofp attr #81724

Conversation

nasherm commented Feb 14, 2024

llvmbot commented Feb 14, 2024

github-actions bot commented Feb 14, 2024 • edited

nasherm commented Feb 14, 2024

davemgreen Feb 14, 2024

Choose a reason for hiding this comment

nasherm Feb 14, 2024

Choose a reason for hiding this comment

nasherm Feb 14, 2024 • edited

Choose a reason for hiding this comment

davemgreen Feb 14, 2024

Choose a reason for hiding this comment

nasherm Feb 15, 2024

Choose a reason for hiding this comment

davemgreen commented Feb 17, 2024

davemgreen left a comment • edited

Choose a reason for hiding this comment

hvdijk commented Feb 21, 2024

davemgreen commented Feb 22, 2024

nasherm commented Feb 22, 2024

davemgreen commented Feb 22, 2024

ostannard left a comment

Choose a reason for hiding this comment

davemgreen left a comment

Choose a reason for hiding this comment

github-actions bot commented Feb 14, 2024 •

edited

nasherm Feb 14, 2024 •

edited

davemgreen left a comment •

edited