[LLVM] Slay undead copysign code #111269

workingjubilee · 2024-10-06T00:50:52Z

Frontends that emit LLVMIR would like to be able to count on float operations that can be encoded as simple bit-operations as lowering to machine-code routines embedded into the binary. This is desired so that they do not affect what one might think of as "freestanding" binaries. SelectionDAG¹ and GlobalISel² both explicitly handle this lowering for copysign.

However, later code implies that it is, at least in theory, possible to lower to runtime libcalls, as indicated by its entry in llvm/include/llvm/IR/RuntimeLibcalls.def³! Yet this code is largely impotent. We seem to need to preserve runtime libcalls for architecture-specific variants of these float operations that have architecture-specific semantics. However, ordinary IEEE754 floats have nonesuch: these are bitops.

So, purge this dead code, that it may plague us no more with its revenant half-life.

llvmbot · 2024-10-06T00:51:24Z

@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-backend-systemz
@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-webassembly

Author: Jubilee (workingjubilee)

Changes

Frontends that emit LLVMIR would like to be able to count on float operations that can be encoded as simple bit-operations as lowering to machine-code routines embedded into the binary. This is desired so that they do not affect what one might think of as "freestanding" binaries. GlobalISel explicitly handles this lowering for copysign¹.

Perhaps it is just my newness to the codebase but I was unable to find a case where SelectionDAG likewise guaranteed that copysign was lowered to machine code. Instead, it seemed to be, at least in theory, prone to lowering to runtime libcalls, as indicated by its entry in llvm/include/llvm/IR/RuntimeLibcalls.def²!

In removing this code, it may be necessary to preserve runtime libcalls only for architecture-specific variants of these float operations that may have architecture-specific semantics. However, ordinary IEEE754 floats have nonesuch, and these bitops are then guaranteed to be bitops.

As no tests seem to fail after doing so, it seems plausible to me this was actually-dead code that was only in-theory reachable. If that is so, purge it from the codebase, so this revenant may plague us no more.

Full diff: https://github.com/llvm/llvm-project/pull/111269.diff

6 Files Affected:

(modified) llvm/include/llvm/IR/RuntimeLibcalls.def (-3)
(modified) llvm/lib/CodeGen/IntrinsicLowering.cpp (+15-3)
(modified) llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp (+5-6)
(modified) llvm/lib/IR/RuntimeLibcalls.cpp (-1)
(modified) llvm/lib/Target/SystemZ/ZOSLibcallNames.def (-3)
(modified) llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp (-3)

diff --git a/llvm/include/llvm/IR/RuntimeLibcalls.def b/llvm/include/llvm/IR/RuntimeLibcalls.def
index 69cf43140ad4bd..b891dce07d60dd 100644
--- a/llvm/include/llvm/IR/RuntimeLibcalls.def
+++ b/llvm/include/llvm/IR/RuntimeLibcalls.def
@@ -284,10 +284,7 @@ HANDLE_LIBCALL(FLOOR_F64, "floor")
 HANDLE_LIBCALL(FLOOR_F80, "floorl")
 HANDLE_LIBCALL(FLOOR_F128, "floorl")
 HANDLE_LIBCALL(FLOOR_PPCF128, "floorl")
-HANDLE_LIBCALL(COPYSIGN_F32, "copysignf")
-HANDLE_LIBCALL(COPYSIGN_F64, "copysign")
 HANDLE_LIBCALL(COPYSIGN_F80, "copysignl")
-HANDLE_LIBCALL(COPYSIGN_F128, "copysignl")
 HANDLE_LIBCALL(COPYSIGN_PPCF128, "copysignl")
 HANDLE_LIBCALL(FMIN_F32, "fminf")
 HANDLE_LIBCALL(FMIN_F64, "fmin")
diff --git a/llvm/lib/CodeGen/IntrinsicLowering.cpp b/llvm/lib/CodeGen/IntrinsicLowering.cpp
index 256c081b46e262..61d5f75273fa2b 100644
--- a/llvm/lib/CodeGen/IntrinsicLowering.cpp
+++ b/llvm/lib/CodeGen/IntrinsicLowering.cpp
@@ -58,7 +58,9 @@ static Value *LowerBSWAP(LLVMContext &Context, Value *V, Instruction *IP) {
   IRBuilder<> Builder(IP);
 
   switch(BitSize) {
-  default: llvm_unreachable("Unhandled type size of value to byteswap!");
+  default:
+    assert("Unhandled type size of value to byteswap!");
+    break;
   case 16: {
     Value *Tmp1 = Builder.CreateShl(V, ConstantInt::get(V->getType(), 8),
                                     "bswap.2");
@@ -203,7 +205,9 @@ static void ReplaceFPIntrinsicWithCall(CallInst *CI, const char *Fname,
                                        const char *Dname,
                                        const char *LDname) {
   switch (CI->getArgOperand(0)->getType()->getTypeID()) {
-  default: llvm_unreachable("Invalid type in intrinsic");
+  default:
+    assert("Invalid type in intrinsic");
+    break;
   case Type::FloatTyID:
     ReplaceCallWith(Fname, CI, CI->arg_begin(), CI->arg_end(),
                     Type::getFloatTy(CI->getContext()));
@@ -438,7 +442,15 @@ void IntrinsicLowering::LowerIntrinsicCall(CallInst *CI) {
     break;
   }
   case Intrinsic::copysign: {
-    ReplaceFPIntrinsicWithCall(CI, "copysignf", "copysign", "copysignl");
+    switch (CI->getArgOperand(0)->getType()->getTypeID()) {
+      default:
+        assert("only need a copysign libcall for arch-specific floats");
+        break;
+      case Type::X86_FP80TyID:
+      case Type::PPC_FP128TyID:
+        ReplaceCallWith("copysignl", CI, CI->arg_begin(), CI->arg_end(),
+                        Type::getFloatTy(CI->getContext()));
+    }
     break;
   }
   case Intrinsic::get_rounding:
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
index 2c81c829e75cbb..0e4d364f69f216 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp
@@ -1649,12 +1649,11 @@ void DAGTypeLegalizer::ExpandFloatRes_FCEIL(SDNode *N,
 
 void DAGTypeLegalizer::ExpandFloatRes_FCOPYSIGN(SDNode *N,
                                                 SDValue &Lo, SDValue &Hi) {
-  ExpandFloatRes_Binary(N, GetFPLibCall(N->getValueType(0),
-                                        RTLIB::COPYSIGN_F32,
-                                        RTLIB::COPYSIGN_F64,
-                                        RTLIB::COPYSIGN_F80,
-                                        RTLIB::COPYSIGN_F128,
-                                        RTLIB::COPYSIGN_PPCF128), Lo, Hi);
+
+  auto VT = N->getValueType(0);
+  ExpandFloatRes_Binary(N, (VT == MVT::f80 ? RTLIB::COPYSIGN_F80 :
+                            VT == MVT::ppcf128 ? RTLIB::COPYSIGN_PPCF128 :
+                            RTLIB::UNKNOWN_LIBCALL), Lo, Hi);
 }
 
 void DAGTypeLegalizer::ExpandFloatRes_FCOS(SDNode *N,
diff --git a/llvm/lib/IR/RuntimeLibcalls.cpp b/llvm/lib/IR/RuntimeLibcalls.cpp
index d806f8093459ee..ca78892ec78838 100644
--- a/llvm/lib/IR/RuntimeLibcalls.cpp
+++ b/llvm/lib/IR/RuntimeLibcalls.cpp
@@ -61,7 +61,6 @@ void RuntimeLibcallsInfo::initLibcalls(const Triple &TT) {
     setLibcallName(RTLIB::ROUND_F128, "roundf128");
     setLibcallName(RTLIB::ROUNDEVEN_F128, "roundevenf128");
     setLibcallName(RTLIB::FLOOR_F128, "floorf128");
-    setLibcallName(RTLIB::COPYSIGN_F128, "copysignf128");
     setLibcallName(RTLIB::FMIN_F128, "fminf128");
     setLibcallName(RTLIB::FMAX_F128, "fmaxf128");
     setLibcallName(RTLIB::LROUND_F128, "lroundf128");
diff --git a/llvm/lib/Target/SystemZ/ZOSLibcallNames.def b/llvm/lib/Target/SystemZ/ZOSLibcallNames.def
index 12a01522a7e643..a53c9618696fcc 100644
--- a/llvm/lib/Target/SystemZ/ZOSLibcallNames.def
+++ b/llvm/lib/Target/SystemZ/ZOSLibcallNames.def
@@ -87,9 +87,6 @@ HANDLE_LIBCALL(EXP2_F128, "@@LXP2@B")
 HANDLE_LIBCALL(COS_F64, "@@SCOS@B")
 HANDLE_LIBCALL(COS_F32, "@@FCOS@B")
 HANDLE_LIBCALL(COS_F128, "@@LCOS@B")
-HANDLE_LIBCALL(COPYSIGN_F64, "@@DCPY@B")
-HANDLE_LIBCALL(COPYSIGN_F32, "@@FCPY@B")
-HANDLE_LIBCALL(COPYSIGN_F128, "@@LCPY@B")
 HANDLE_LIBCALL(CEIL_F64, "@@SCEL@B")
 HANDLE_LIBCALL(CEIL_F32, "@@FCEL@B")
 HANDLE_LIBCALL(CEIL_F128, "@@LCEL@B")
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp
index ba3ab5164af267..13c8a6fe1524fc 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyRuntimeLibcallSignatures.cpp
@@ -261,9 +261,6 @@ struct RuntimeLibcallSignatureTable {
     Table[RTLIB::FLOOR_F32] = f32_func_f32;
     Table[RTLIB::FLOOR_F64] = f64_func_f64;
     Table[RTLIB::FLOOR_F128] = i64_i64_func_i64_i64;
-    Table[RTLIB::COPYSIGN_F32] = f32_func_f32_f32;
-    Table[RTLIB::COPYSIGN_F64] = f64_func_f64_f64;
-    Table[RTLIB::COPYSIGN_F128] = i64_i64_func_i64_i64_i64_i64;
     Table[RTLIB::FMIN_F32] = f32_func_f32_f32;
     Table[RTLIB::FMIN_F64] = f64_func_f64_f64;
     Table[RTLIB::FMIN_F128] = i64_i64_func_i64_i64_i64_i64;

github-actions · 2024-10-06T00:54:20Z

✅ With the latest revision this PR passed the C/C++ code formatter.

llvm/lib/CodeGen/IntrinsicLowering.cpp

arsenm · 2024-10-06T05:44:21Z

Regardless of the usage, I think these should still be recorded in RuntimeLibcalls. Ideally it would be a complete database of all calls for the system, not just used

workingjubilee · 2024-10-06T05:51:20Z

Regardless of the usage, I think these should still be recorded in RuntimeLibcalls. Ideally it would be a complete database of all calls for the system, not just used

Of all calls for... what? All function calls that are hypothetically possible at runtime? Used for what purpose? Based on what conditions? What are we assuming is linked and therefore imposing on frontends that they must guarantee is linked?

Can we have some way of marking the ones that aren't actually going to, and should not, get spontaneously lowered to?

RalfJung · 2024-10-06T06:42:04Z

Regardless of the usage, I think these should still be recorded in RuntimeLibcalls. Ideally it would be a complete database of all calls for the system, not just used

It would be good to know which functions LLVM might actually use so that frontends have a chance of ensuring they are all provided. And it would be good for that list to be as short as possible so that frontends don't spend effort trying to provide some functions that LLVM will not actually use.

workingjubilee · 2024-10-07T21:39:56Z

If a true database of all function calls that might be made at runtime is desired, @arsenm, and not "all calls that LLVM might generate", then the .def file in question is hopelessly short. You have approximately the entirety of libc (minus some of math.h), Win32, and POSIX APIs to include. I don't think setting this list back by even a few more functions than this would be appreciable.

It doesn't even have:

abs
fdim
nan
remainder
remquo
hypot
atan2
acosh
asinh
atanh
erf
erfc
lgamma
tgamma
modf (not to be confused with fmodf)
scalbn
log1p
logb
ilogb
expm1
nextafter
nexttoward

...or all their differently-sized variants.

workingjubilee · 2024-10-07T21:52:11Z

@topperc This is using report_fatal_error now.

Also, good news: I seem to have found the spot where SelectionDAG expands copysign that catches this before we ever hit the libcall legalization:

llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

Lines 1646 to 1707 in 75103aa

    
           SDValue SelectionDAGLegalize::ExpandFCOPYSIGN(SDNode *Node) const { 
        
             SDLoc DL(Node); 
        
             SDValue Mag = Node->getOperand(0); 
        
             SDValue Sign = Node->getOperand(1); 
        
             // Get sign bit into an integer value. 
        
             FloatSignAsInt SignAsInt; 
        
             getSignAsIntValue(SignAsInt, DL, Sign); 
        
             EVT IntVT = SignAsInt.IntValue.getValueType(); 
        
             SDValue SignMask = DAG.getConstant(SignAsInt.SignMask, DL, IntVT); 
        
             SDValue SignBit = DAG.getNode(ISD::AND, DL, IntVT, SignAsInt.IntValue, 
        
                                           SignMask); 
        
             // If FABS is legal transform FCOPYSIGN(x, y) => sign(x) ? -FABS(x) : FABS(X) 
        
             EVT FloatVT = Mag.getValueType(); 
        
             if (TLI.isOperationLegalOrCustom(ISD::FABS, FloatVT) && 
        
                 TLI.isOperationLegalOrCustom(ISD::FNEG, FloatVT)) { 
        
               SDValue AbsValue = DAG.getNode(ISD::FABS, DL, FloatVT, Mag); 
        
               SDValue NegValue = DAG.getNode(ISD::FNEG, DL, FloatVT, AbsValue); 
        
               SDValue Cond = DAG.getSetCC(DL, getSetCCResultType(IntVT), SignBit, 
        
                                           DAG.getConstant(0, DL, IntVT), ISD::SETNE); 
        
               return DAG.getSelect(DL, FloatVT, Cond, NegValue, AbsValue); 
        
             } 
        
             // Transform Mag value to integer, and clear the sign bit. 
        
             FloatSignAsInt MagAsInt; 
        
             getSignAsIntValue(MagAsInt, DL, Mag); 
        
             EVT MagVT = MagAsInt.IntValue.getValueType(); 
        
             SDValue ClearSignMask = DAG.getConstant(~MagAsInt.SignMask, DL, MagVT); 
        
             SDValue ClearedSign = DAG.getNode(ISD::AND, DL, MagVT, MagAsInt.IntValue, 
        
                                               ClearSignMask); 
        
             // Get the signbit at the right position for MagAsInt. 
        
             int ShiftAmount = SignAsInt.SignBit - MagAsInt.SignBit; 
        
             EVT ShiftVT = IntVT; 
        
             if (SignBit.getScalarValueSizeInBits() < 
        
                 ClearedSign.getScalarValueSizeInBits()) { 
        
               SignBit = DAG.getNode(ISD::ZERO_EXTEND, DL, MagVT, SignBit); 
        
               ShiftVT = MagVT; 
        
             } 
        
             if (ShiftAmount > 0) { 
        
               SDValue ShiftCnst = DAG.getConstant(ShiftAmount, DL, ShiftVT); 
        
               SignBit = DAG.getNode(ISD::SRL, DL, ShiftVT, SignBit, ShiftCnst); 
        
             } else if (ShiftAmount < 0) { 
        
               SDValue ShiftCnst = DAG.getConstant(-ShiftAmount, DL, ShiftVT); 
        
               SignBit = DAG.getNode(ISD::SHL, DL, ShiftVT, SignBit, ShiftCnst); 
        
             } 
        
             if (SignBit.getScalarValueSizeInBits() > 
        
                 ClearedSign.getScalarValueSizeInBits()) { 
        
               SignBit = DAG.getNode(ISD::TRUNCATE, DL, MagVT, SignBit); 
        
             } 
        
             SDNodeFlags Flags; 
        
             Flags.setDisjoint(true); 
        
             // Store the part with the modified sign and convert back to float. 
        
             SDValue CopiedSign = 
        
                 DAG.getNode(ISD::OR, DL, MagVT, ClearedSign, SignBit, Flags); 
        
             return modifySignAsInt(MagAsInt, DL, CopiedSign); 
        
           }

workingjubilee · 2024-10-08T07:39:59Z

And as I suspected, trying to simplify the code further hits the PowerPC float case on some of the regression tests for that target, so this seems to be, unfortunately, exactly the correct scope.

I noticed this while following #111269. It makes little sense that FCOPYSIGN would look at the sign of `x`, right? Surely this must be `y`. Also fix the inconsistency where it's sometimes `x` and sometimes `X`.

arsenm · 2024-10-10T12:04:11Z

Regardless of the usage, I think these should still be recorded in RuntimeLibcalls. Ideally it would be a complete database of all calls for the system, not just used

Of all calls for... what? All function calls that are hypothetically possible at runtime? Used for what purpose? Based on what conditions? What are we assuming is linked and therefore imposing on frontends that they must guarantee is linked?

The current TargetLibraryInfo + RuntimeLibcalls is an unworkable mess for doing LTO where the libc/libm/compiler-rt-esque builtins are all getting compiled in the same module. We have to have some way of knowing what function names might need to be retained for usage elsewhere in the compilation. e.g. see 615b7ee, which is trying to make use of this list in LTO

If a true database of all function calls that might be made at runtime is desired, @arsenm, and not "all calls that LLVM might generate", then the .def file in question is hopelessly short

Yes, this is a huge mess which needs work.

llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp

workingjubilee · 2024-10-11T05:53:56Z

@arsenm I see. If you can coach me on some of the finer points of making a .td (I have the basic gist, but there will be some questions), I'd be happy to assemble something such that the data covered in those two files is in a tablegen, in a followup. Then I will be able to make sure that it or another file is supplied with the necessary information that frontends... frontends that aren't clang, anyways... need. And then it will cover libm properly, instead of the current state of affairs.

llvm/include/llvm/IR/RuntimeLibcalls.def

jhuber6

I like the idea of getting rid of target libcalls, linking in runtime libraries in the same TU is still an unsolved problem. Right now the first step was to only keep symbols alive if the backend actually emits them. Next step would probably have the IR transform LLVM IR intrinsics to libcalls during IR linking and then mark them as nobuiltin so they don't get transformed again.

Seems weird that this doesn't handle the x86 / ppc cases. They're weird but it should just be copying a bit I would think.

workingjubilee · 2024-10-11T15:37:51Z

I believe the PowerPC target code has setOperationAction for ppc_f128 in a way that somehow still finds this codepath. As it is logically the only user of this codepath, it seemed harmless to just leave it. If you want, I can go muck about inside llvm/lib/Target/PowerPC/PPCISelLowering.cpp so that the regression tests pass and we can completely remove this, but my initial inclination was to not bother because the Rust frontend doesn't use these types at all and I will be hacking pretty wildly at that point.

I believe the x87 case might actually be handled... "correctly". At least in the sense that the code produced by LLVM should be no more incorrect than it was before, for this target.

arsenm · 2024-10-11T16:32:41Z

I believe the PowerPC target code has setOperationAction for ppc_f128 in a way that somehow still finds this codepath.

It actually hits the default expand handling. ppc_fp128 surprisingly isn't a legal type:

Legalizing node: t20: ppcf128 = fcopysign t5, t4
Analyzing result type: ppcf128
Expand float result: t20: ppcf128 = fcopysign t5, t4
Creating new node: t23: i128 = bitcast t5
Creating new node: t24: i64 = extract_element t23, Constant:i64<1>
Creating new node: t25: i64 = extract_element t23, Constant:i64<0>
Creating new node: t26: f64 = bitcast t25
Creating new node: t27: f64 = bitcast t24

and it does result in the call:

	lfs 2, .LCPI0_0@toc@l(3)
	bl copysignl

workingjubilee · 2024-10-11T16:48:38Z

lmao PowerPC doesn't treat its own float as a legal type

workingjubilee · 2024-10-11T17:04:17Z

I checked and the x87 float seems fine if this codepath is pruned. No visible regressions in LLVM's test suite. I mean, clang-format whines about my formatting again, but

arsenm

I guess this is fine, although unfortunate to leave ppc_fp128 behind.

workingjubilee · 2024-10-12T00:32:32Z

...is getOperand actually getOperands?

topperc · 2024-10-12T02:08:56Z

...is getOperand actually getOperands?

No. What makes you think that?

workingjubilee · 2024-10-12T11:16:08Z

@topperc Momentary delirium, it seems?

@arsenm re: your remarks about not leaving the PowerPC libcall behind, I figured out how to remove the code in question, after some wrangling, by doing the conversion in question. See the recently pushed commits.

arsenm · 2024-10-14T05:22:35Z

llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp

+  SDLoc DL = SDLoc(N);
+  SDValue Tmp = SDValue();


Suggested change

SDLoc DL = SDLoc(N);

SDValue Tmp = SDValue();

SDLoc DL(N);

SDValue Tmp;

The LLVM style guide specifically said I should "Use auto Type Deduction to Make Code More Readable". So I did. Then I was advised to not. So I accepted that.

But I find these changes make the code significantly harder to read.

Are you aware of another programming language that implements this... most vexing constructor form?

llvm/test/CodeGen/PowerPC/copysignl.ll

llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp

arsenm · 2024-10-14T05:25:44Z

llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp

+  // a double-double is Hi + Lo, so if Hi flips sign, so must Lo
+  Lo = DAG.getSelectCC(DL, Tmp, Hi, Lo,
+                       DAG.getNode(ISD::FNEG, DL, Lo.getValueType(), Lo),
+                       ISD::SETEQ);


Using SETEQ for FP types is weird, since it's the case that may be oeq or ueq. Can this break the handling of nan sign bits?

Can this just be emitted as another copysign?

I am not aware of a rule that says that the two floats have to both be the same sign, which would be required for a second copysign to make sense. The only one that I am aware of that is relevant is that Hi has to be greater. That's why it has to be a flip of the existing sign of Lo. We are not trying to apply the logic of copysign to the individual parts, but to the "summed float".

For a given float it might go something like this (I suspect most real cases would not have these be so close in orders of magnitude but...):

dbldbl = { Hi = 27.0, Lo = -4.0 }; assert(dbldbl == 23.0); dbldbl = copysign(dbldbl, -1.0); assert(dbldbl.Hi == -27.0); assert(dbldbl.Lo == 4.0); assert(dbldbl == -23.0);

So no, it cannot be "another copysign", I think, as that would leave us with this possibility:

dbldbl = copysign(dbldbl, -1.0); assert(dbldbl.Hi == -27.0); assert(dbldbl.Lo == -4.0); assert(dbldbl == -27.0);

This pattern is also used in ExpandFloatRes_FABS, here:

llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp

Lines 1557 to 1560 in 4a0caf6

// Lo = Hi==fabs(Hi) ? Lo : -Lo;

Lo = DAG.getSelectCC(dl, Tmp, Hi, Lo,

DAG.getNode(ISD::FNEG, dl, Lo.getValueType(), Lo),

ISD::SETEQ);

We should only ever, when copying the sign from this, take the value from Hi, per here, since it is the sign that matters:

llvm-project/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp

Lines 2205 to 2208 in 367c3c9

// The ppcf128 value is providing only the sign; take it from the

// higher-order double (which must have the larger magnitude).

return DAG.getNode(ISD::FCOPYSIGN, SDLoc(N),

N->getValueType(0), N->getOperand(0), Hi);

Thus it should not break NaN float sign handling, as the sign of Lo should be of no consequence to a NaN ppc_fp128, at least "externally" to the float.

copysign needs to respect the sign bit of incoming nan values. This logic will flip the sign bit for Lo for a nan Hi. I don't really understand how ppcf128 is supposed to work for nans, but I would expect this logic to be inverted (e.g. Hi UEQ fabs(hi) ? Lo : neg(Lo))

I guess it's ok for fabs, or fabs is wrong?

Hmm. Because the rule is Hi + Lo == "the real float", if one is NaN then logically they both are, aren't they?

Though, to be clear:

If the double-double is NaN or Infinity, the Hi value must encode NaN or Infinity¹.

If the double-double is Infinity, then Lo must not be NaN.

Otherwise, the value of Lo is not significant.

Footnotes

This is basically an extension of the "Hi must be the greater magnitude" rule. ↩

Basically, the answer of "how does ppc_fp128 work for this" is "mmmostly by not really caring too hard, honestly".

Which doesn't mean I am against using a different encoding of the logic if it is preferred, it just can't be two copysigns.

Unlike a true floating point operation, the "not significant" bits of a nan shouldn't change. It would be better if we weren't flipping the bit of the insignificant half here (or for any of fneg/fabs/fcopysign)

jayfoad · 2024-10-14T09:19:37Z

Slay undead copysign code

Suggest writing this in plainer English, e.g. "Remove unused copysign code".

llvm/test/CodeGen/PowerPC/copysignl.ll

arsenm · 2024-10-31T21:49:20Z

llvm/test/CodeGen/PowerPC/ctrloop-cpsgn.ll

@@ -1,28 +0,0 @@
-; RUN: llc -verify-machineinstrs < %s -mcpu=ppc | FileCheck %s


Lost this test?

according to 1cf48ab

Don't form PPC CTR-based loops around a copysignl call
copysign/copysignf never become function calls (because the SDAG expansion code
does not lower to the corresponding function call, but rather directly
implements the associated logic), but copysignl almost always is lowered into a
call to the requested libm functon (and, thus, might clobber CTR).

So it seemed like it was now irrelevant.

Inserted a test with the baseline at start for this so you can examine the diff for it, if you want.

workingjubilee · 2024-10-31T22:58:32Z

llvm/test/CodeGen/PowerPC/ctrloop-cpsgn.ll

+; CHECK-NEXT:    lwz 4, 88(1)
+; CHECK-NEXT:    stw 4, 104(1)
+; CHECK-NEXT:    lfd 0, 104(1)
+; CHECK-NEXT:    mtctr 3


when I said "the test seems irrelevant", we now emit an MTCTR instruction here, but the copysignl call we were guarding against is nowhere to be found, sooo

…alls This makes real what is already true: Copysign does not ever need to lower to runtime libcalls! Its operation should be possible to always implement via bitops.

This reduces the burden on frontends that wish to support float ops without needing a C compiler to build LLVM's compiler-rt for that target, e.g. so that they can be a fully self-contained toolchain for bare-metal. All other floats are expanded for all current architectures just fine. PowerPC, however, does not efficiently legalize its very own float.

…ysignl This allows ripping out the remaining copysignl infra.

llvmbot added backend:SystemZ backend:WebAssembly llvm:SelectionDAG SelectionDAGISel as well llvm:ir labels Oct 6, 2024

workingjubilee force-pushed the remove-dead-copysign-code branch 2 times, most recently from d971a20 to fd73b72 Compare October 6, 2024 01:30

workingjubilee mentioned this pull request Oct 6, 2024

float types: move copysign, abs, signum to libcore rust-lang/rust#131304

Open

topperc reviewed Oct 6, 2024

View reviewed changes

llvm/lib/CodeGen/IntrinsicLowering.cpp Outdated Show resolved Hide resolved

llvm/lib/CodeGen/IntrinsicLowering.cpp Outdated Show resolved Hide resolved

llvm/lib/CodeGen/IntrinsicLowering.cpp Outdated Show resolved Hide resolved

workingjubilee force-pushed the remove-dead-copysign-code branch 3 times, most recently from 0699d47 to 0606b62 Compare October 6, 2024 06:37

RalfJung mentioned this pull request Oct 8, 2024

Fix comment typo in ExpandFCOPYSIGN #111489

Merged

arsenm reviewed Oct 10, 2024

View reviewed changes

llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp Outdated Show resolved Hide resolved

llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp Show resolved Hide resolved

workingjubilee force-pushed the remove-dead-copysign-code branch from 0606b62 to 444c9f7 Compare October 11, 2024 05:55

arsenm requested a review from jhuber6 October 11, 2024 12:09

arsenm reviewed Oct 11, 2024

View reviewed changes

llvm/include/llvm/IR/RuntimeLibcalls.def Outdated Show resolved Hide resolved

jhuber6 reviewed Oct 11, 2024

View reviewed changes

arsenm approved these changes Oct 11, 2024

View reviewed changes

workingjubilee force-pushed the remove-dead-copysign-code branch 2 times, most recently from 02b1a52 to f68c342 Compare October 11, 2024 22:45

llvmbot added the backend:PowerPC label Oct 12, 2024

workingjubilee mentioned this pull request Oct 12, 2024

Need .tds for libm and runtime libcall API for ~all LLVM targets #112093

Open

arsenm reviewed Oct 14, 2024

View reviewed changes

workingjubilee mentioned this pull request Oct 17, 2024

Implement copysign for libcalls rust-lang/compiler-builtins#710

Closed

workingjubilee force-pushed the remove-dead-copysign-code branch from 4a0caf6 to 32d117c Compare October 29, 2024 10:03

workingjubilee commented Oct 29, 2024

View reviewed changes

llvm/test/CodeGen/PowerPC/copysignl.ll Outdated Show resolved Hide resolved

RKSimon added a commit that referenced this pull request Oct 29, 2024

[PowerPC] copysignl.ll - regenerate to reduce the diff in #111269

32aa782

workingjubilee force-pushed the remove-dead-copysign-code branch from 32d117c to 11a7cf8 Compare October 30, 2024 03:36

arsenm reviewed Oct 31, 2024

View reviewed changes

workingjubilee force-pushed the remove-dead-copysign-code branch from 11a7cf8 to 8c2297d Compare October 31, 2024 22:56

workingjubilee commented Oct 31, 2024

View reviewed changes

workingjubilee force-pushed the remove-dead-copysign-code branch from f4e8a10 to 65e3cc5 Compare October 31, 2024 23:29

workingjubilee added 3 commits October 31, 2024 17:32

[PowerPC] regenerate ctrloop-cpsgn.ll test

7745181

[CodeGen] Demand llvm.copysign.f{16,32,64,80,128} lowers without libc…

5aaacd7

…alls This makes real what is already true: Copysign does not ever need to lower to runtime libcalls! Its operation should be possible to always implement via bitops.

workingjubilee force-pushed the remove-dead-copysign-code branch from 65e3cc5 to 74adfd2 Compare November 1, 2024 00:34

workingjubilee added 2 commits October 31, 2024 17:35

[PowerPC][SelectionDAG] Expand @llvm.copysign.ppc_fp128 without cop…

5fd0d86

…ysignl This allows ripping out the remaining copysignl infra.

[PowerPC][SelectionDAG] Use SETUEQ in copysign expansion

bd017cc

workingjubilee force-pushed the remove-dead-copysign-code branch from 74adfd2 to bd017cc Compare November 1, 2024 00:36

[PowerPC] Erase now-irrelevant test

2ddf2ef

NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024

[PowerPC] copysignl.ll - regenerate to reduce the diff in llvm#111269

be3d089

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LLVM] Slay undead copysign code #111269

[LLVM] Slay undead copysign code #111269

workingjubilee commented Oct 6, 2024 •

edited

Loading

llvmbot commented Oct 6, 2024 •

edited

Loading

github-actions bot commented Oct 6, 2024 •

edited

Loading

arsenm commented Oct 6, 2024

workingjubilee commented Oct 6, 2024

RalfJung commented Oct 6, 2024

workingjubilee commented Oct 7, 2024

workingjubilee commented Oct 7, 2024

workingjubilee commented Oct 8, 2024 •

edited

Loading

arsenm commented Oct 10, 2024 •

edited

Loading

workingjubilee commented Oct 11, 2024

jhuber6 left a comment

workingjubilee commented Oct 11, 2024

arsenm commented Oct 11, 2024

workingjubilee commented Oct 11, 2024

workingjubilee commented Oct 11, 2024

arsenm left a comment

workingjubilee commented Oct 12, 2024

topperc commented Oct 12, 2024

workingjubilee commented Oct 12, 2024

arsenm Oct 14, 2024

workingjubilee Oct 14, 2024

arsenm Oct 14, 2024

workingjubilee Oct 14, 2024

arsenm Oct 14, 2024

workingjubilee Oct 15, 2024 •

edited

Loading

workingjubilee Oct 15, 2024 •

edited

Loading

arsenm Oct 15, 2024

jayfoad commented Oct 14, 2024

arsenm Oct 31, 2024

workingjubilee Oct 31, 2024

workingjubilee Nov 1, 2024

workingjubilee Oct 31, 2024

	// Lo = Hi==fabs(Hi) ? Lo : -Lo;
	Lo = DAG.getSelectCC(dl, Tmp, Hi, Lo,
	DAG.getNode(ISD::FNEG, dl, Lo.getValueType(), Lo),
	ISD::SETEQ);

	// The ppcf128 value is providing only the sign; take it from the
	// higher-order double (which must have the larger magnitude).
	return DAG.getNode(ISD::FCOPYSIGN, SDLoc(N),
	N->getValueType(0), N->getOperand(0), Hi);

		@@ -1,28 +0,0 @@
		; RUN: llc -verify-machineinstrs < %s -mcpu=ppc \| FileCheck %s

[LLVM] Slay undead copysign code #111269

Are you sure you want to change the base?

[LLVM] Slay undead copysign code #111269

Conversation

workingjubilee commented Oct 6, 2024 • edited Loading

Footnotes

llvmbot commented Oct 6, 2024 • edited Loading

Footnotes

github-actions bot commented Oct 6, 2024 • edited Loading

arsenm commented Oct 6, 2024

workingjubilee commented Oct 6, 2024

RalfJung commented Oct 6, 2024

workingjubilee commented Oct 7, 2024

workingjubilee commented Oct 7, 2024

workingjubilee commented Oct 8, 2024 • edited Loading

arsenm commented Oct 10, 2024 • edited Loading

workingjubilee commented Oct 11, 2024

jhuber6 left a comment

Choose a reason for hiding this comment

workingjubilee commented Oct 11, 2024

arsenm commented Oct 11, 2024

workingjubilee commented Oct 11, 2024

workingjubilee commented Oct 11, 2024

arsenm left a comment

Choose a reason for hiding this comment

workingjubilee commented Oct 12, 2024

topperc commented Oct 12, 2024

workingjubilee commented Oct 12, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

workingjubilee Oct 15, 2024 • edited Loading

Choose a reason for hiding this comment

Footnotes

workingjubilee Oct 15, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jayfoad commented Oct 14, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

workingjubilee commented Oct 6, 2024 •

edited

Loading

llvmbot commented Oct 6, 2024 •

edited

Loading

github-actions bot commented Oct 6, 2024 •

edited

Loading

workingjubilee commented Oct 8, 2024 •

edited

Loading

arsenm commented Oct 10, 2024 •

edited

Loading

workingjubilee Oct 15, 2024 •

edited

Loading

workingjubilee Oct 15, 2024 •

edited

Loading