[Reassociate] Drop weight reduction to fix issue 91417 (llvm#91469)

See the following case: https://alive2.llvm.org/ce/z/A-fBki ``` define i3 @src(i3 %0) { %2 = mul i3 %0, %0 %3 = mul i3 %2, %0 %4 = mul i3 %3, %0 %5 = mul nsw i3 %4, %0 ret i3 %5 } define i3 @tgt(i3 %0) { %2 = mul i3 %0, %0 %5 = mul nsw i3 %2, %0 ret i3 %5 } ``` llvm@d7aeefe introduced weight reduction during weights combination of the same operand. As the weight of `%0` changes from 5 to 3, the nsw flag in `%5` should be dropped. However, the nsw flag isn't cleared by `RewriteExprTree` since `%5 = mul nsw i3 %0, %4` is not included in the range of `[ExpressionChangedStart, ExpressionChangedEnd)`. ``` Calculated Rank[] = 3 Combine negations for: %2 = mul i3 %0, %0 Calculated Rank[] = 4 Combine negations for: %3 = mul i3 %0, %2 Calculated Rank[] = 5 Combine negations for: %4 = mul i3 %0, %3 Calculated Rank[] = 6 Combine negations for: %5 = mul nsw i3 %0, %4 LINEARIZE: %5 = mul nsw i3 %0, %4 OPERAND: i3 %0 (1) ADD USES LEAF: i3 %0 (1) OPERAND: %4 = mul i3 %0, %3 (1) DIRECT ADD: %4 = mul i3 %0, %3 (1) OPERAND: i3 %0 (1) OPERAND: %3 = mul i3 %0, %2 (1) DIRECT ADD: %3 = mul i3 %0, %2 (1) OPERAND: i3 %0 (1) OPERAND: %2 = mul i3 %0, %0 (1) DIRECT ADD: %2 = mul i3 %0, %0 (1) OPERAND: i3 %0 (1) OPERAND: i3 %0 (1) RAIn: mul i3 [ %0, llvm#3] [ %0, llvm#3] [ %0, llvm#3] RAOut: mul i3 [ %0, llvm#3] [ %0, llvm#3] [ %0, llvm#3] RAOut after CSE reorder: mul i3 [ %0, llvm#3] [ %0, llvm#3] [ %0, llvm#3] RA: %5 = mul nsw i3 %0, %4 TO: %5 = mul nsw i3 %4, %0 RA: %4 = mul i3 %0, %3 TO: %4 = mul i3 %0, %0 ``` The best way to fix this is to inform `RewriteExprTree` to clear flags of the whole expr tree when weight reduction happens. But I find that weight reduction based on Carmichael number never happens in practice. See the coverage result https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/home/dtcxzyw/llvm-project/llvm/lib/Transforms/Scalar/Reassociate.cpp.html#L323 I think it would be better to drop `IncorporateWeight`. Fixes llvm#91417
vg0204 · May 29, 2024 · 0da0e21 · 0da0e21
1 parent 9749d4f
commit 0da0e21
Show file tree

Hide file tree

Showing 2 changed files with 126 additions and 173 deletions.
diff --git a/llvm/lib/Transforms/Scalar/Reassociate.cpp b/llvm/lib/Transforms/Scalar/Reassociate.cpp
@@ -302,97 +302,6 @@ static BinaryOperator *LowerNegateToMultiply(Instruction *Neg) {
   return Res;
 }
 
-/// Returns k such that lambda(2^Bitwidth) = 2^k, where lambda is the Carmichael
-/// function. This means that x^(2^k) === 1 mod 2^Bitwidth for
-/// every odd x, i.e. x^(2^k) = 1 for every odd x in Bitwidth-bit arithmetic.
-/// Note that 0 <= k < Bitwidth, and if Bitwidth > 3 then x^(2^k) = 0 for every
-/// even x in Bitwidth-bit arithmetic.
-static unsigned CarmichaelShift(unsigned Bitwidth) {
-  if (Bitwidth < 3)
-    return Bitwidth - 1;
-  return Bitwidth - 2;
-}
-
-/// Add the extra weight 'RHS' to the existing weight 'LHS',
-/// reducing the combined weight using any special properties of the operation.
-/// The existing weight LHS represents the computation X op X op ... op X where
-/// X occurs LHS times.  The combined weight represents  X op X op ... op X with
-/// X occurring LHS + RHS times.  If op is "Xor" for example then the combined
-/// operation is equivalent to X if LHS + RHS is odd, or 0 if LHS + RHS is even;
-/// the routine returns 1 in LHS in the first case, and 0 in LHS in the second.
-static void IncorporateWeight(APInt &LHS, const APInt &RHS, unsigned Opcode) {
-  // If we were working with infinite precision arithmetic then the combined
-  // weight would be LHS + RHS.  But we are using finite precision arithmetic,
-  // and the APInt sum LHS + RHS may not be correct if it wraps (it is correct
-  // for nilpotent operations and addition, but not for idempotent operations
-  // and multiplication), so it is important to correctly reduce the combined
-  // weight back into range if wrapping would be wrong.
-
-  // If RHS is zero then the weight didn't change.
-  if (RHS.isMinValue())
-    return;
-  // If LHS is zero then the combined weight is RHS.
-  if (LHS.isMinValue()) {
-    LHS = RHS;
-    return;
-  }
-  // From this point on we know that neither LHS nor RHS is zero.
-
-  if (Instruction::isIdempotent(Opcode)) {
-    // Idempotent means X op X === X, so any non-zero weight is equivalent to a
-    // weight of 1.  Keeping weights at zero or one also means that wrapping is
-    // not a problem.
-    assert(LHS == 1 && RHS == 1 && "Weights not reduced!");
-    return; // Return a weight of 1.
-  }
-  if (Instruction::isNilpotent(Opcode)) {
-    // Nilpotent means X op X === 0, so reduce weights modulo 2.
-    assert(LHS == 1 && RHS == 1 && "Weights not reduced!");
-    LHS = 0; // 1 + 1 === 0 modulo 2.
-    return;
-  }
-  if (Opcode == Instruction::Add || Opcode == Instruction::FAdd) {
-    // TODO: Reduce the weight by exploiting nsw/nuw?
-    LHS += RHS;
-    return;
-  }
-
-  assert((Opcode == Instruction::Mul || Opcode == Instruction::FMul) &&
-         "Unknown associative operation!");
-  unsigned Bitwidth = LHS.getBitWidth();
-  // If CM is the Carmichael number then a weight W satisfying W >= CM+Bitwidth
-  // can be replaced with W-CM.  That's because x^W=x^(W-CM) for every Bitwidth
-  // bit number x, since either x is odd in which case x^CM = 1, or x is even in
-  // which case both x^W and x^(W - CM) are zero.  By subtracting off multiples
-  // of CM like this weights can always be reduced to the range [0, CM+Bitwidth)
-  // which by a happy accident means that they can always be represented using
-  // Bitwidth bits.
-  // TODO: Reduce the weight by exploiting nsw/nuw?  (Could do much better than
-  // the Carmichael number).
-  if (Bitwidth > 3) {
-    /// CM - The value of Carmichael's lambda function.
-    APInt CM = APInt::getOneBitSet(Bitwidth, CarmichaelShift(Bitwidth));
-    // Any weight W >= Threshold can be replaced with W - CM.
-    APInt Threshold = CM + Bitwidth;
-    assert(LHS.ult(Threshold) && RHS.ult(Threshold) && "Weights not reduced!");
-    // For Bitwidth 4 or more the following sum does not overflow.
-    LHS += RHS;
-    while (LHS.uge(Threshold))
-      LHS -= CM;
-  } else {
-    // To avoid problems with overflow do everything the same as above but using
-    // a larger type.
-    unsigned CM = 1U << CarmichaelShift(Bitwidth);
-    unsigned Threshold = CM + Bitwidth;
-    assert(LHS.getZExtValue() < Threshold && RHS.getZExtValue() < Threshold &&
-           "Weights not reduced!");
-    unsigned Total = LHS.getZExtValue() + RHS.getZExtValue();
-    while (Total >= Threshold)
-      Total -= CM;
-    LHS = Total;
-  }
-}
-
 using RepeatedValue = std::pair<Value*, APInt>;
 
 /// Given an associative binary expression, return the leaf
@@ -562,26 +471,7 @@ static bool LinearizeExprTree(Instruction *I,
                "In leaf map but not visited!");
 
         // Update the number of paths to the leaf.
-        IncorporateWeight(It->second, Weight, Opcode);
-
-#if 0   // TODO: Re-enable once PR13021 is fixed.
-        // The leaf already has one use from inside the expression.  As we want
-        // exactly one such use, drop this new use of the leaf.
-        assert(!Op->hasOneUse() && "Only one use, but we got here twice!");
-        I->setOperand(OpIdx, UndefValue::get(I->getType()));
-        Changed = true;
-
-        // If the leaf is a binary operation of the right kind and we now see
-        // that its multiple original uses were in fact all by nodes belonging
-        // to the expression, then no longer consider it to be a leaf and add
-        // its operands to the expression.
-        if (BinaryOperator *BO = isReassociableOp(Op, Opcode)) {
-          LLVM_DEBUG(dbgs() << "UNLEAF: " << *Op << " (" << It->second << ")\n");
-          Worklist.push_back(std::make_pair(BO, It->second));
-          Leaves.erase(It);
-          continue;
-        }
-#endif
+        It->second += Weight;
 
         // If we still have uses that are not accounted for by the expression
         // then it is not safe to modify the value.