Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GISel][CombinerHelper] Push freeze through non-poison-producing operands #90618

Merged

Conversation

dc03-work
Copy link
Contributor

@dc03-work dc03-work commented Apr 30, 2024

This combine matches the existing fold in InstCombine, i.e. InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating.

It tries to push freeze through an operand if the operand has only one maybe-poison operand and all other operands are guaranteed non-poison, and if the operation itself cannot generate poison (eg. add with nsw can generate poison, even with non-poison operands).

This is beneficial because it can potentially enable other optimizations to occur that would otherwise be blocked because of the freeze.

@llvmbot
Copy link

llvmbot commented Apr 30, 2024

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-aarch64

Author: Dhruv Chawla (dc03-work)

Changes

This combine matches the existing fold in InstCombine, i.e. InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating.

It tries to push freeze through an operand if the operand has only one maybe-poison operand and all other operands are guaranteed non-poison.

This is beneficial because it can potentially enable other optimizations to occur that would otherwise be blocked because of the freeze.


Full diff: https://github.com/llvm/llvm-project/pull/90618.diff

8 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h (+2)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h (+29)
  • (modified) llvm/include/llvm/Target/GlobalISel/Combine.td (+8-1)
  • (modified) llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp (+59)
  • (modified) llvm/lib/CodeGen/GlobalISel/Utils.cpp (+16-1)
  • (modified) llvm/lib/Target/AArch64/AArch64Combine.td (+2-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/combine-select.mir (+21-21)
  • (modified) llvm/test/CodeGen/AArch64/pr58431.ll (+2-2)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
index 76e8d1166ae0cd..bd59aab03a6103 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
@@ -856,6 +856,8 @@ class CombinerHelper {
   /// Combine insert vector element OOB.
   bool matchInsertVectorElementOOB(MachineInstr &MI, BuildFnTy &MatchInfo);
 
+  bool matchFreezeOfSingleMaybePoisonOperand(MachineInstr &MI, BuildFnTy &MatchInfo);
+
 private:
   /// Checks for legality of an indexed variant of \p LdSt.
   bool isIndexedLoadStoreLegal(GLoadStore &LdSt) const;
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h b/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h
index 25e47114e4a39a..90a806df86226a 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h
@@ -780,6 +780,35 @@ class GFreeze : public GenericMachineInstr {
   }
 };
 
+/// Represents cast instructions.
+class GCastOp : public GenericMachineInstr {
+public:
+  Register getSourceReg() const { return getOperand(1).getReg(); }
+
+  static bool classof(const MachineInstr *MI) {
+    switch (MI->getOpcode()) {
+    case TargetOpcode::G_TRUNC:
+    case TargetOpcode::G_ANYEXT:
+    case TargetOpcode::G_SEXT:
+    case TargetOpcode::G_ZEXT:
+    case TargetOpcode::G_FPTOSI:
+    case TargetOpcode::G_FPTOUI:
+    case TargetOpcode::G_UITOFP:
+    case TargetOpcode::G_SITOFP:
+    case TargetOpcode::G_FPTRUNC:
+    case TargetOpcode::G_FPEXT:
+    case TargetOpcode::G_INTTOPTR:
+    case TargetOpcode::G_PTRTOINT:
+    case TargetOpcode::G_BITCAST:
+    case TargetOpcode::G_ADDRSPACE_CAST:
+      return true;
+
+    default:
+      return false;
+    }
+  }
+};
+
 } // namespace llvm
 
 #endif // LLVM_CODEGEN_GLOBALISEL_GENERICMACHINEINSTRS_H
diff --git a/llvm/include/llvm/Target/GlobalISel/Combine.td b/llvm/include/llvm/Target/GlobalISel/Combine.td
index dbbb3abaa83047..7185cd2bbcb136 100644
--- a/llvm/include/llvm/Target/GlobalISel/Combine.td
+++ b/llvm/include/llvm/Target/GlobalISel/Combine.td
@@ -218,6 +218,13 @@ def idempotent_prop : GICombineRule<
    (match (idempotent_prop_frags $dst, $src)),
    (apply (GIReplaceReg $dst, $src))>;
 
+// Convert freeze(Op(Op0, NonPoisonOps...)) to Op(freeze(Op0), NonPoisonOps...)
+// when Op0 is not guaranteed non-poison
+def push_freeze_to_prevent_poison_propagation : GICombineRule<
+  (defs root:$root, build_fn_matchinfo:$matchinfo),
+  (match (wip_match_opcode G_FREEZE):$root,
+         [{ return Helper.matchFreezeOfSingleMaybePoisonOperand(*${root}, ${matchinfo}); }]),
+  (apply [{ Helper.applyBuildFn(*${root}, ${matchinfo}); }])>;
 
 def extending_loads : GICombineRule<
   (defs root:$root, extending_load_matchdata:$matchinfo),
@@ -1662,7 +1669,7 @@ def all_combines : GICombineGroup<[trivial_combines, vector_ops_combines,
     sub_add_reg, select_to_minmax, redundant_binop_in_equality,
     fsub_to_fneg, commute_constant_to_rhs, match_ands, match_ors,
     combine_concat_vector, double_icmp_zero_and_or_combine, match_addos,
-    combine_shuffle_concat]>;
+    combine_shuffle_concat, push_freeze_to_prevent_poison_propagation]>;
 
 // A combine group used to for prelegalizer combiners at -O0. The combines in
 // this group have been selected based on experiments to balance code size and
diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index 653e7689b57743..0a79702bf7080d 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -223,6 +223,64 @@ void CombinerHelper::applyCombineCopy(MachineInstr &MI) {
   replaceRegWith(MRI, DstReg, SrcReg);
 }
 
+bool CombinerHelper::matchFreezeOfSingleMaybePoisonOperand(
+    MachineInstr &MI, BuildFnTy &MatchInfo) {
+  // Ported from InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating
+  Register DstOp = MI.getOperand(0).getReg();
+  Register OrigOp = MI.getOperand(1).getReg();
+
+  if (OrigOp.isPhysical() || !MRI.hasOneNonDBGUse(OrigOp))
+    return false;
+
+  MachineInstr *OrigDef = MRI.getUniqueVRegDef(OrigOp);
+  // Avoid trying to fold G_PHI, G_UNMERGE_VALUES, G_FREEZE (the latter is
+  // handled by idempotent_prop)
+  if (!OrigDef || OrigDef->isPHI() || isa<GUnmerge>(OrigDef) ||
+      isa<GFreeze>(OrigDef))
+    return false;
+
+  if (canCreateUndefOrPoison(OrigOp, MRI))
+    return false;
+
+  std::optional<MachineOperand> MaybePoisonOperand = std::nullopt;
+  for (MachineOperand &Operand : OrigDef->uses()) {
+    // Avoid working on non-register operands or physical registers.
+    if (!Operand.isReg() || Operand.getReg().isPhysical())
+      return false;
+
+    if (isGuaranteedNotToBeUndefOrPoison(Operand.getReg(), MRI))
+      continue;
+
+    if (!MaybePoisonOperand)
+      MaybePoisonOperand = Operand;
+    // We have more than one maybe-poison operand. Moving the freeze is unsafe.
+    else
+      return false;
+  }
+
+  // Eliminate freeze if all operands are guaranteed non-poison
+  if (!MaybePoisonOperand) {
+    MatchInfo = [=](MachineIRBuilder &B) { MRI.replaceRegWith(DstOp, OrigOp); };
+    return true;
+  }
+
+  if (!MaybePoisonOperand->isReg())
+    return false;
+
+  Register MaybePoisonOperandReg = MaybePoisonOperand->getReg();
+  LLT MaybePoisonOperandRegTy = MRI.getType(MaybePoisonOperandReg);
+
+  MatchInfo = [=](MachineIRBuilder &B) mutable {
+    auto Reg = MRI.createGenericVirtualRegister(MaybePoisonOperandRegTy);
+    B.setInsertPt(*OrigDef->getParent(), OrigDef->getIterator());
+    B.buildFreeze(Reg, MaybePoisonOperandReg);
+    replaceRegOpWith(
+        MRI, *OrigDef->findRegisterUseOperand(MaybePoisonOperandReg, TRI), Reg);
+    replaceRegWith(MRI, DstOp, OrigOp);
+  };
+  return true;
+}
+
 bool CombinerHelper::matchCombineConcatVectors(MachineInstr &MI,
                                                SmallVector<Register> &Ops) {
   assert(MI.getOpcode() == TargetOpcode::G_CONCAT_VECTORS &&
@@ -3060,6 +3118,7 @@ bool CombinerHelper::matchHoistLogicOpWithSameOpcodeHands(
   MachineInstr *RightHandInst = getDefIgnoringCopies(RHSReg, MRI);
   if (!LeftHandInst || !RightHandInst)
     return false;
+
   unsigned HandOpcode = LeftHandInst->getOpcode();
   if (HandOpcode != RightHandInst->getOpcode())
     return false;
diff --git a/llvm/lib/CodeGen/GlobalISel/Utils.cpp b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
index 4e3781cb4e9d59..3f9497fcb6855a 100644
--- a/llvm/lib/CodeGen/GlobalISel/Utils.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
@@ -1737,6 +1737,8 @@ static bool canCreateUndefOrPoison(Register Reg, const MachineRegisterInfo &MRI,
   case TargetOpcode::G_FREEZE:
     return false;
   default:
+    if (isa<GCastOp>(RegDef) || isa<GBinOp>(RegDef))
+      return false;
     return true;
   }
 }
@@ -1750,13 +1752,26 @@ static bool isGuaranteedNotToBeUndefOrPoison(Register Reg,
 
   MachineInstr *RegDef = MRI.getVRegDef(Reg);
 
+  auto OpCheck = [&](MachineOperand &Operand) {
+    if (!Operand.isReg())
+      return true;
+
+    return isGuaranteedNotToBeUndefOrPoison(Operand.getReg(), MRI, Depth + 1,
+                                            Kind);
+  };
+
   switch (RegDef->getOpcode()) {
   case TargetOpcode::G_FREEZE:
     return true;
   case TargetOpcode::G_IMPLICIT_DEF:
     return !includesUndef(Kind);
   default:
-    return false;
+    GenericMachineInstr *Opr = dyn_cast<GBinOp>(RegDef);
+    if (!Opr)
+      Opr = dyn_cast<GCastOp>(RegDef);
+
+    return Opr && !::llvm::canCreateUndefOrPoison(Reg, MRI) &&
+           all_of(Opr->operands(), OpCheck);
   }
 }
 
diff --git a/llvm/lib/Target/AArch64/AArch64Combine.td b/llvm/lib/Target/AArch64/AArch64Combine.td
index 10cad6d1924407..7229d4fe52261f 100644
--- a/llvm/lib/Target/AArch64/AArch64Combine.td
+++ b/llvm/lib/Target/AArch64/AArch64Combine.td
@@ -295,5 +295,6 @@ def AArch64PostLegalizerCombiner
                         ptr_add_immed_chain, overlapping_and,
                         split_store_zero_128, undef_combines,
                         select_to_minmax, or_to_bsp, combine_concat_vector,
-                        commute_constant_to_rhs]> {
+                        commute_constant_to_rhs,
+                        push_freeze_to_prevent_poison_propagation]> {
 }
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/combine-select.mir b/llvm/test/CodeGen/AArch64/GlobalISel/combine-select.mir
index 2bf7e84a379ba0..90d0896d168930 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/combine-select.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/combine-select.mir
@@ -117,9 +117,9 @@ body:             |
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x2
     ; CHECK-NEXT: %c:_(s1) = G_TRUNC [[COPY]](s64)
-    ; CHECK-NEXT: %f:_(s1) = G_TRUNC [[COPY1]](s64)
-    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s1) = G_FREEZE %f
-    ; CHECK-NEXT: %sel:_(s1) = G_OR %c, [[FREEZE]]
+    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s64) = G_FREEZE [[COPY1]]
+    ; CHECK-NEXT: %f:_(s1) = G_TRUNC [[FREEZE]](s64)
+    ; CHECK-NEXT: %sel:_(s1) = G_OR %c, %f
     ; CHECK-NEXT: %ext:_(s32) = G_ANYEXT %sel(s1)
     ; CHECK-NEXT: $w0 = COPY %ext(s32)
     %0:_(s64) = COPY $x0
@@ -144,9 +144,9 @@ body:             |
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x2
     ; CHECK-NEXT: %c:_(s1) = G_TRUNC [[COPY]](s64)
-    ; CHECK-NEXT: %f:_(s1) = G_TRUNC [[COPY1]](s64)
-    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s1) = G_FREEZE %f
-    ; CHECK-NEXT: %sel:_(s1) = G_OR %c, [[FREEZE]]
+    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s64) = G_FREEZE [[COPY1]]
+    ; CHECK-NEXT: %f:_(s1) = G_TRUNC [[FREEZE]](s64)
+    ; CHECK-NEXT: %sel:_(s1) = G_OR %c, %f
     ; CHECK-NEXT: %ext:_(s32) = G_ANYEXT %sel(s1)
     ; CHECK-NEXT: $w0 = COPY %ext(s32)
     %0:_(s64) = COPY $x0
@@ -172,9 +172,9 @@ body:             |
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $d0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(<2 x s32>) = COPY $d2
     ; CHECK-NEXT: %c:_(<2 x s1>) = G_TRUNC [[COPY]](<2 x s32>)
-    ; CHECK-NEXT: %f:_(<2 x s1>) = G_TRUNC [[COPY1]](<2 x s32>)
-    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(<2 x s1>) = G_FREEZE %f
-    ; CHECK-NEXT: %sel:_(<2 x s1>) = G_OR %c, [[FREEZE]]
+    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(<2 x s32>) = G_FREEZE [[COPY1]]
+    ; CHECK-NEXT: %f:_(<2 x s1>) = G_TRUNC [[FREEZE]](<2 x s32>)
+    ; CHECK-NEXT: %sel:_(<2 x s1>) = G_OR %c, %f
     ; CHECK-NEXT: %ext:_(<2 x s32>) = G_ANYEXT %sel(<2 x s1>)
     ; CHECK-NEXT: $d0 = COPY %ext(<2 x s32>)
     %0:_(<2 x s32>) = COPY $d0
@@ -201,9 +201,9 @@ body:             |
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
     ; CHECK-NEXT: %c:_(s1) = G_TRUNC [[COPY]](s64)
-    ; CHECK-NEXT: %t:_(s1) = G_TRUNC [[COPY1]](s64)
-    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s1) = G_FREEZE %t
-    ; CHECK-NEXT: %sel:_(s1) = G_AND %c, [[FREEZE]]
+    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s64) = G_FREEZE [[COPY1]]
+    ; CHECK-NEXT: %t:_(s1) = G_TRUNC [[FREEZE]](s64)
+    ; CHECK-NEXT: %sel:_(s1) = G_AND %c, %t
     ; CHECK-NEXT: %ext:_(s32) = G_ANYEXT %sel(s1)
     ; CHECK-NEXT: $w0 = COPY %ext(s32)
     %0:_(s64) = COPY $x0
@@ -229,9 +229,9 @@ body:             |
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
     ; CHECK-NEXT: %c:_(s1) = G_TRUNC [[COPY]](s64)
-    ; CHECK-NEXT: %t:_(s1) = G_TRUNC [[COPY1]](s64)
-    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s1) = G_FREEZE %t
-    ; CHECK-NEXT: %sel:_(s1) = G_AND %c, [[FREEZE]]
+    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s64) = G_FREEZE [[COPY1]]
+    ; CHECK-NEXT: %t:_(s1) = G_TRUNC [[FREEZE]](s64)
+    ; CHECK-NEXT: %sel:_(s1) = G_AND %c, %t
     ; CHECK-NEXT: %ext:_(s32) = G_ANYEXT %sel(s1)
     ; CHECK-NEXT: $w0 = COPY %ext(s32)
     %0:_(s64) = COPY $x0
@@ -257,11 +257,11 @@ body:             |
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x1
     ; CHECK-NEXT: %c:_(s1) = G_TRUNC [[COPY]](s64)
-    ; CHECK-NEXT: %t:_(s1) = G_TRUNC [[COPY1]](s64)
+    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s64) = G_FREEZE [[COPY1]]
+    ; CHECK-NEXT: %t:_(s1) = G_TRUNC [[FREEZE]](s64)
     ; CHECK-NEXT: %one:_(s1) = G_CONSTANT i1 true
     ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(s1) = G_XOR %c, %one
-    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s1) = G_FREEZE %t
-    ; CHECK-NEXT: %sel:_(s1) = G_OR [[XOR]], [[FREEZE]]
+    ; CHECK-NEXT: %sel:_(s1) = G_OR [[XOR]], %t
     ; CHECK-NEXT: %ext:_(s32) = G_ANYEXT %sel(s1)
     ; CHECK-NEXT: $w0 = COPY %ext(s32)
     %0:_(s64) = COPY $x0
@@ -287,11 +287,11 @@ body:             |
     ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s64) = COPY $x0
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY $x2
     ; CHECK-NEXT: %c:_(s1) = G_TRUNC [[COPY]](s64)
-    ; CHECK-NEXT: %f:_(s1) = G_TRUNC [[COPY1]](s64)
+    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s64) = G_FREEZE [[COPY1]]
+    ; CHECK-NEXT: %f:_(s1) = G_TRUNC [[FREEZE]](s64)
     ; CHECK-NEXT: [[C:%[0-9]+]]:_(s1) = G_CONSTANT i1 true
     ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(s1) = G_XOR %c, [[C]]
-    ; CHECK-NEXT: [[FREEZE:%[0-9]+]]:_(s1) = G_FREEZE %f
-    ; CHECK-NEXT: %sel:_(s1) = G_AND [[XOR]], [[FREEZE]]
+    ; CHECK-NEXT: %sel:_(s1) = G_AND [[XOR]], %f
     ; CHECK-NEXT: %ext:_(s32) = G_ANYEXT %sel(s1)
     ; CHECK-NEXT: $w0 = COPY %ext(s32)
     %0:_(s64) = COPY $x0
diff --git a/llvm/test/CodeGen/AArch64/pr58431.ll b/llvm/test/CodeGen/AArch64/pr58431.ll
index dcd97597ae4093..e87d8f7874d62e 100644
--- a/llvm/test/CodeGen/AArch64/pr58431.ll
+++ b/llvm/test/CodeGen/AArch64/pr58431.ll
@@ -4,8 +4,8 @@
 define i32 @f(i64 %0) {
 ; CHECK-LABEL: f:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    mov w8, #10
-; CHECK-NEXT:    mov w9, w0
+; CHECK-NEXT:    mov w8, #10 // =0xa
+; CHECK-NEXT:    and x9, x0, #0xffffffff
 ; CHECK-NEXT:    udiv x10, x9, x8
 ; CHECK-NEXT:    msub x0, x10, x8, x9
 ; CHECK-NEXT:    // kill: def $w0 killed $w0 killed $x0

Copy link

github-actions bot commented Apr 30, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.


MatchInfo = [=](MachineIRBuilder &B) mutable {
auto Reg = MRI.createGenericVirtualRegister(MaybePoisonOperandRegTy);
B.setInsertPt(*OrigDef->getParent(), OrigDef->getIterator());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be necessary anymore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new freeze has to be inserted before the use which will come before the old freeze that is being removed, so the insert point has to be moved back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OrigDef is the root freeze instruction. The insert point for the apply is supposed to be set to before the root instruction, so this is just setting it to what it already should be

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, OrigDef is the instruction that is being frozen. This function gets called on freeze(op(op0, ops...)), where OrigDef is op and the insert point is before the freeze but after op. We have to transform this to op(freeze(op0), ops...), so it necessarily requires moving the insert point before op.

Removing the setInsertPt call causes the following tests to fail:

Failed Tests (2):
  LLVM :: CodeGen/AArch64/GlobalISel/combine-select.mir
  LLVM :: CodeGen/AMDGPU/div_v2i128.ll

Where the first test has freezes inserted in the wrong place, for example:

        683:   
        684:  %0:_(s64) = COPY $x0 
        685:  %2:_(s64) = COPY $x2 
        686:  %c:_(s1) = G_TRUNC %0(s64) 
        687:  %f:_(s1) = G_TRUNC %11(s64) 
        688:  %11:_(s64) = G_FREEZE %2 
next:147      !~~~~~~~~~~~~~~~~~~~~~~~  error: match on wrong line
        689:  %sel:_(s1) = G_OR %c, %f 
        690:  %ext:_(s32) = G_ANYEXT %sel(s1) 
        691:  $w0 = COPY %ext(s32) 
        692:  
        693: ... 

and the second test fails in the machine verifier:

*** Bad machine code: Virtual register defs don't dominate all uses. ***
- function:    v_sdiv_v2i128_vv
- v. register: %964

std::optional<MachineOperand> MaybePoisonOperand = std::nullopt;
for (MachineOperand &Operand : OrigDef->uses()) {
// Avoid working on non-register operands or physical registers.
if (!Operand.isReg() || Operand.getReg().isPhysical())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should never encounter physical registers on a G_* operation

if (canCreateUndefOrPoison(OrigOp, MRI))
return false;

std::optional<MachineOperand> MaybePoisonOperand = std::nullopt;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the initializer

return true;
}

if (!MaybePoisonOperand->isReg())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't have relevant non-register operands

// when Op0 is not guaranteed non-poison
def push_freeze_to_prevent_poison_propagation : GICombineRule<
(defs root:$root, build_fn_matchinfo:$matchinfo),
(match (wip_match_opcode G_FREEZE):$root,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wip_match_opcode is forbidden.

def freeze_of_poison : GICombineRule<
   (defs root:$root),
   (match (G_FREEZE $root, $src),
          [{ return !isGuaranteedNotToBePoison(${src}.getReg(), MRI) && Helper.matchFreezeOf ....]),
   (apply (Helper.apply...)>;

return false;
GenericMachineInstr *Opr = dyn_cast<GBinOp>(RegDef);
if (!Opr)
Opr = dyn_cast<GCastOp>(RegDef);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

casts and binops can be poison

@@ -1737,6 +1737,8 @@ static bool canCreateUndefOrPoison(Register Reg, const MachineRegisterInfo &MRI,
case TargetOpcode::G_FREEZE:
return false;
default:
if (isa<GCastOp>(RegDef) || isa<GBinOp>(RegDef))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

casts and binops can create poison

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed with my latest revision.

@tschuett
Copy link
Member

I would prefer to not touch Util.cpp. We can do that in separate PRs.

isa<GFreeze>(OrigDef))
return false;

if (canCreateUndefOrPoison(OrigOp, MRI))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (canCreateUndefOrPoison(cast<Operator>(OrigOp),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are missing the flags parameter.

@tschuett
Copy link
Member

OrigOpInst->dropPoisonGeneratingAnnotations();

is missing.

@dc03-work
Copy link
Contributor Author

OrigOpInst->dropPoisonGeneratingAnnotations();

is missing.

This does not appear to be supported in GlobalISel.

@dc03-work
Copy link
Contributor Author

dc03-work commented May 1, 2024

I would prefer to not touch Util.cpp. We can do that in separate PRs.

TBH I would prefer to do a stacked review but support on GitHub is suboptimal.

return getFlag(FmNoNans) || getFlag(FmNoInfs);

default:
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return the conservative default of true (or maybe just test if any flags are set at all). I'm also not sure we need to consider the opcodes. From construction, we won't have other flags on the MachineInstr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

}

void dropPoisonGeneratingFlags() {
switch (getOpcode()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just drop all the flags that could emit poison at once. All the flags are in the same bitfield in the MachineInstr unlike in the IR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

@dc03-work dc03-work requested a review from arsenm May 6, 2024 04:37
Comment on lines 39 to 41
return getFlag(NoUWrap) || getFlag(NoSWrap) || getFlag(IsExact) ||
getFlag(Disjoint) || getFlag(NonNeg) || getFlag(FmNoNans) ||
getFlag(FmNoInfs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can also just do this in one bit test on getFlags

Register DstOp = MI.getOperand(0).getReg();
Register OrigOp = MI.getOperand(1).getReg();

if (OrigOp.isPhysical() || !MRI.hasOneNonDBGUse(OrigOp))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OrigOp cannot be physical

MachineInstr *OrigDef = MRI.getUniqueVRegDef(OrigOp);
// Avoid trying to fold G_PHI, G_UNMERGE_VALUES, G_FREEZE (the latter is
// handled by idempotent_prop).
if (!OrigDef || OrigDef->isPHI() || isa<GUnmerge>(OrigDef) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to assume OrigDef is non-null

// Avoid trying to fold G_PHI, G_UNMERGE_VALUES, G_FREEZE (the latter is
// handled by idempotent_prop).
if (!OrigDef || OrigDef->isPHI() || isa<GUnmerge>(OrigDef) ||
isa<GFreeze>(OrigDef))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to check if it's freeze(freeze)? It's not wrong to do the replacement, and freeze of freeze should be a separate idempotent fold

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Freeze(freeze) was causing issues when I was initially making this patch, however the issues seem to have gone away now. Fixed.

return false;

if (canCreateUndefOrPoison(OrigOp, MRI,
/*ConsiderFlagsAndMetadata*/ false))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*ConsiderFlagsAndMetadata*/ false))
/*ConsiderFlagsAndMetadata=*/false))

MatchInfo = [=](MachineIRBuilder &B) mutable {
auto Reg = MRI.createGenericVirtualRegister(MaybePoisonOperandRegTy);
B.setInsertPt(*OrigDef->getParent(), OrigDef->getIterator());
B.buildFreeze(Reg, MaybePoisonOperandReg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto Freeze = B.buildFreeze(MaybePoisonOperandRegTy, MaybePoisonOperandReg);

B.setInsertPt(*OrigDef->getParent(), OrigDef->getIterator());
B.buildFreeze(Reg, MaybePoisonOperandReg);
replaceRegOpWith(
MRI, *OrigDef->findRegisterUseOperand(MaybePoisonOperandReg, TRI), Reg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MRI, *OrigDef->findRegisterUseOperand(MaybePoisonOperandReg, TRI), Reg);
MRI, *OrigDef->findRegisterUseOperand(MaybePoisonOperandReg, TRI), Freeze.getReg(0));

LLT MaybePoisonOperandRegTy = MRI.getType(MaybePoisonOperandReg);

MatchInfo = [=](MachineIRBuilder &B) mutable {
auto Reg = MRI.createGenericVirtualRegister(MaybePoisonOperandRegTy);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto Reg = MRI.createGenericVirtualRegister(MaybePoisonOperandRegTy);

GMI->hasPoisonGeneratingFlags())
return true;
}
// Conservatively return true.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird comment placement, move with the actual case with braces

switch (RegDef->getOpcode()) {
case TargetOpcode::G_FREEZE:
return false;
default:
if (isa<GCastOp>(RegDef) || isa<GBinOp>(RegDef))
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return bool expression

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done to all the above.

@dc03-work dc03-work force-pushed the fold-push-freeze-to-prevent-poison-propagation branch from 8a5473c to b2d6339 Compare May 14, 2024 05:19
@dc03-work dc03-work requested a review from arsenm May 14, 2024 05:21
Comment on lines 44 to 50
clearFlag(NoUWrap);
clearFlag(NoSWrap);
clearFlag(IsExact);
clearFlag(Disjoint);
clearFlag(NonNeg);
clearFlag(FmNoNans);
clearFlag(FmNoInfs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to do this at once as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a good way to, clearFlag takes a MIFlag and checks its value, so this would require doing a bitwise-and with getFlags() within a setFlags() call which may still not work and will likely be much uglier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add a clearFlags to MachineInstr to bulk clear flags

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -220,6 +220,13 @@ def idempotent_prop : GICombineRule<
(match (idempotent_prop_frags $dst, $src)),
(apply (GIReplaceReg $dst, $src))>;

// Convert freeze(Op(Op0, NonPoisonOps...)) to Op(freeze(Op0), NonPoisonOps...)
// when Op0 is not guaranteed non-poison
def push_freeze_to_prevent_poison_propagation : GICombineRule<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe change to more closely match InstCombine's pushFreezeToPreventPoisonFromPropagating? i.e. push_freeze_to_prevent_poison_from_propagating

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that name was too long, so I had made it push_freeze_to_prevent_poison_propagation. But okay, done.

return false;

MachineInstr *OrigDef = MRI.getUniqueVRegDef(OrigOp);
// Avoid trying to fold G_PHI and G_UNMERGE_VALUES.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't state what this is doing, state why it is skipping these

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@dc03-work
Copy link
Contributor Author

@davemgreen Could you please take a look at this as well? Thanks.

Opr = dyn_cast<GCastOp>(RegDef);

return Opr && !::llvm::canCreateUndefOrPoison(Reg, MRI) &&
all_of(Opr->operands(), OpCheck);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RegDef

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where? We need to check if the dyn_cast returns null, which won't be true if just using RegDef. canCreateUndefOrPoison accepts a Register, not a MachineInstr *.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opr->operands() -> RegDef->uses()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  default: {
    auto MOCheck = [&](const MachineOperand &MO) {
      if (!MO.isReg())
        return true;
      return ::isGuaranteedNotToBeUndefOrPoison(MO.getReg(), MRI, Depth + 1,
                                                Kind);
    };
    return !::canCreateUndefOrPoison(Reg, MRI,
                                     /*ConsiderFlagsAndMetadata=*/true, Kind) &&
           all_of(RegDef->uses(), MOCheck);
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, done.

@dc03-work dc03-work force-pushed the fold-push-freeze-to-prevent-poison-propagation branch from 318b707 to 6fcc9f6 Compare May 20, 2024 11:30
@dc03-work
Copy link
Contributor Author

Pre-commit CI is failing on a broken clang test.

@dc03-work dc03-work force-pushed the fold-push-freeze-to-prevent-poison-propagation branch from 4291abd to 38e2d68 Compare May 23, 2024 03:51
@dc03-work
Copy link
Contributor Author

Rebased on main.

operands

This combine matches the existing fold in InstCombine, i.e.
InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating.

It tries to push freeze through an operand if the operand has only one
maybe-poison operand and all other operands are guaranteed non-poison.

This is beneficial because it can potentially enable other optimizations
to occur that would otherwise be blocked because of the freeze.
@dc03-work dc03-work force-pushed the fold-push-freeze-to-prevent-poison-propagation branch from 38e2d68 to 929c0f2 Compare May 23, 2024 06:10
@dc03-work
Copy link
Contributor Author

dc03-work commented May 23, 2024

@arsenm Rebased again to get a buildkite run on this PR, however it doesn't seem to be added as an action to be run?

@dc03-work dc03-work merged commit 4c48b3c into llvm:main May 23, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants