[IR] Add zext nneg flag #67982

nikic · 2023-10-02T13:00:25Z

Add an nneg flag to the zext instruction, which specifies that the argument is non-negative. Otherwise, the result is a poison value.

The primary use-case for the flag is to preserve information when sext gets replaced with zext due to range-based canonicalization. The nneg flag allows us to convert the zext back into an sext later. This is useful for some optimizations (e.g. a signed icmp can fold with sext but not zext), as well as some targets (e.g. RISCV prefers sext over zext).
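
To make the semantics concrete, here is a minimal standalone sketch (not LLVM code). It assumes a toy model in which a value of width W lives in the low W bits of a `uint64_t` and `std::nullopt` stands in for poison; the helper names `lowMask`, `signBitSet`, `zext`, `sext` and `zextNNeg` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy model, not LLVM code: std::nullopt stands in for a poison result.

// Mask covering the low Width bits.
inline uint64_t lowMask(unsigned Width) {
  assert(Width >= 1 && Width <= 64 && "width out of range for this model");
  return Width == 64 ? ~0ULL : ((1ULL << Width) - 1);
}

// True if the sign bit of a Width-bit value is set.
inline bool signBitSet(uint64_t Bits, unsigned Width) {
  return ((Bits >> (Width - 1)) & 1) != 0;
}

// zext: keep the low SrcWidth bits and fill the rest with zeros.
inline uint64_t zext(uint64_t Bits, unsigned SrcWidth, unsigned DstWidth) {
  assert(SrcWidth < DstWidth && "extension must widen");
  return Bits & lowMask(SrcWidth);
}

// sext: fill bits [SrcWidth, DstWidth) with copies of the sign bit.
inline uint64_t sext(uint64_t Bits, unsigned SrcWidth, unsigned DstWidth) {
  assert(SrcWidth < DstWidth && "extension must widen");
  uint64_t Low = Bits & lowMask(SrcWidth);
  if (!signBitSet(Low, SrcWidth))
    return Low;
  return (Low | ~lowMask(SrcWidth)) & lowMask(DstWidth);
}

// zext nneg: poison if the operand is negative, otherwise a plain zext.
inline std::optional<uint64_t> zextNNeg(uint64_t Bits, unsigned SrcWidth,
                                        unsigned DstWidth) {
  if (signBitSet(Bits & lowMask(SrcWidth), SrcWidth))
    return std::nullopt;
  return zext(Bits, SrcWidth, DstWidth);
}
```

In this model, whenever `zextNNeg` returns a value, that value equals `sext` of the same operand, which is why a later pass may rewrite `zext nneg` to `sext`.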

This patch is based on https://reviews.llvm.org/D156444 by @Panagiotis156, with some implementation simplifications and additional tests.

llvmbot · 2023-10-02T13:01:39Z

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-transforms

Full diff: https://github.com/llvm/llvm-project/pull/67982.diff

16 Files Affected:

(modified) llvm/docs/LangRef.rst (+6)
(modified) llvm/include/llvm/AsmParser/LLToken.h (+1)
(modified) llvm/include/llvm/Bitcode/LLVMBitCodes.h (+3)
(modified) llvm/include/llvm/IR/InstrTypes.h (+14)
(modified) llvm/include/llvm/IR/Instruction.h (+7)
(modified) llvm/lib/AsmParser/LLLexer.cpp (+1)
(modified) llvm/lib/AsmParser/LLParser.cpp (+10-1)
(modified) llvm/lib/Bitcode/Reader/BitcodeReader.cpp (+8-3)
(modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+22)
(modified) llvm/lib/IR/AsmWriter.cpp (+3)
(modified) llvm/lib/IR/Instruction.cpp (+24)
(modified) llvm/lib/IR/Operator.cpp (+4)
(modified) llvm/test/Assembler/flags.ll (+6)
(modified) llvm/test/Bitcode/flags.ll (+4)
(modified) llvm/test/Transforms/InstCombine/freeze.ll (+11)
(modified) llvm/test/Transforms/SimplifyCFG/HoistCode.ll (+30)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index c8f74c19bd6b3cf..4c4f3cbbaf0d541 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -11156,6 +11156,9 @@ until it reaches the size of the destination type, ``ty2``.
 
 When zero extending from i1, the result will always be either 0 or 1.
 
+If the ``nneg`` flag is set, and the ``zext`` argument is negative, the result
+is a poison value.
+
 Example:
 """"""""
 
@@ -11165,6 +11168,9 @@ Example:
       %Y = zext i1 true to i32              ; yields i32:1
       %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
 
+      %a = zext nneg i8 127 to i16 ; yields i16 127
+      %b = zext nneg i8 -1 to i16  ; yields i16 poison
+
 .. _i_sext:
 
 '``sext .. to``' Instruction
diff --git a/llvm/include/llvm/AsmParser/LLToken.h b/llvm/include/llvm/AsmParser/LLToken.h
index 673dc58ce6451e3..ae566672b97986b 100644
--- a/llvm/include/llvm/AsmParser/LLToken.h
+++ b/llvm/include/llvm/AsmParser/LLToken.h
@@ -110,6 +110,7 @@ enum Kind {
   kw_nsw,
   kw_exact,
   kw_inbounds,
+  kw_nneg,
   kw_inrange,
   kw_addrspace,
   kw_section,
diff --git a/llvm/include/llvm/Bitcode/LLVMBitCodes.h b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
index 52e76356a892e45..6f11bb0b30c379a 100644
--- a/llvm/include/llvm/Bitcode/LLVMBitCodes.h
+++ b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
@@ -505,6 +505,9 @@ enum FastMathMap {
   AllowReassoc    = (1 << 7)
 };
 
+/// Flags for serializing NonNegInstruction's SubclassOptionalData contents.
+enum NonNegInstructionOptionalFlags { NNI_NON_NEG = 0 };
+
 /// PossiblyExactOperatorOptionalFlags - Flags for serializing
 /// PossiblyExactOperator's SubclassOptionalData contents.
 enum PossiblyExactOperatorOptionalFlags { PEO_EXACT = 0 };
diff --git a/llvm/include/llvm/IR/InstrTypes.h b/llvm/include/llvm/IR/InstrTypes.h
index 6095b0a1be69cb3..b6da2bf836cb56f 100644
--- a/llvm/include/llvm/IR/InstrTypes.h
+++ b/llvm/include/llvm/IR/InstrTypes.h
@@ -692,6 +692,20 @@ class CastInst : public UnaryInstruction {
   }
 };
 
+/// Instruction that can have a nneg flag (only zext).
+class NonNegInstruction : public CastInst {
+public:
+  enum { NonNeg = (1 << 0) };
+
+  static bool classof(const Instruction *I) {
+    return I->getOpcode() == Instruction::ZExt;
+  }
+
+  static bool classof(const Value *V) {
+    return isa<Instruction>(V) && classof(cast<Instruction>(V));
+  }
+};
+
 //===----------------------------------------------------------------------===//
 //                               CmpInst Class
 //===----------------------------------------------------------------------===//
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 69c3af5b76103f6..781a69e586a9e6b 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -407,12 +407,19 @@ class Instruction : public User,
   /// which supports this flag. See LangRef.html for the meaning of this flag.
   void setIsExact(bool b = true);
 
+  /// Set or clear the nneg flag on this instruction, which must be a zext
+  /// instruction.
+  void setNonNeg(bool b = true);
+
   /// Determine whether the no unsigned wrap flag is set.
   bool hasNoUnsignedWrap() const LLVM_READONLY;
 
   /// Determine whether the no signed wrap flag is set.
   bool hasNoSignedWrap() const LLVM_READONLY;
 
+  /// Determine whether the nneg flag is set.
+  bool hasNonNeg() const LLVM_READONLY;
+
   /// Return true if this operator has flags which may cause this instruction
   /// to evaluate to poison despite having non-poison inputs.
   bool hasPoisonGeneratingFlags() const LLVM_READONLY;
diff --git a/llvm/lib/AsmParser/LLLexer.cpp b/llvm/lib/AsmParser/LLLexer.cpp
index 466bdebc001f589..02e1a1dce3c01b5 100644
--- a/llvm/lib/AsmParser/LLLexer.cpp
+++ b/llvm/lib/AsmParser/LLLexer.cpp
@@ -565,6 +565,7 @@ lltok::Kind LLLexer::LexIdentifier() {
   KEYWORD(nsw);
   KEYWORD(exact);
   KEYWORD(inbounds);
+  KEYWORD(nneg);
   KEYWORD(inrange);
   KEYWORD(addrspace);
   KEYWORD(section);
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index 04eabc94cfc6abe..2416814c53c0156 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -6381,8 +6381,17 @@ int LLParser::parseInstruction(Instruction *&Inst, BasicBlock *BB,
   }
 
   // Casts.
+  case lltok::kw_zext: {
+    bool NonNeg = EatIfPresent(lltok::kw_nneg);
+    bool Res = parseCast(Inst, PFS, KeywordVal);
+    if (Res != 0) {
+      return Res;
+    }
+    if (NonNeg)
+      Inst->setNonNeg();
+    return 0;
+  }
   case lltok::kw_trunc:
-  case lltok::kw_zext:
   case lltok::kw_sext:
   case lltok::kw_fptrunc:
   case lltok::kw_fpext:
diff --git a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
index e56291859022eec..28ba79e48e7697d 100644
--- a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+++ b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
@@ -4875,12 +4875,13 @@ Error BitcodeReader::parseFunctionBody(Function *F) {
       Value *Op;
       unsigned OpTypeID;
       if (getValueTypePair(Record, OpNum, NextValueNo, Op, OpTypeID, CurBB) ||
-          OpNum+2 != Record.size())
+          OpNum + 1 > Record.size())
         return error("Invalid record");
 
-      ResTypeID = Record[OpNum];
+      ResTypeID = Record[OpNum++];
       Type *ResTy = getTypeByID(ResTypeID);
-      int Opc = getDecodedCastOpcode(Record[OpNum + 1]);
+      int Opc = getDecodedCastOpcode(Record[OpNum++]);
+
       if (Opc == -1 || !ResTy)
         return error("Invalid record");
       Instruction *Temp = nullptr;
@@ -4892,10 +4893,14 @@ Error BitcodeReader::parseFunctionBody(Function *F) {
         }
       } else {
         auto CastOp = (Instruction::CastOps)Opc;
+
         if (!CastInst::castIsValid(CastOp, Op, ResTy))
           return error("Invalid cast");
         I = CastInst::Create(CastOp, Op, ResTy);
       }
+      if (OpNum < Record.size() && isa<NonNegInstruction>(I) &&
+          (Record[OpNum] & (1 << bitc::NNI_NON_NEG)))
+        I->setNonNeg(true);
       InstructionList.push_back(I);
       break;
     }
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index e991d055f33474b..75ddc058c5bcd79 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -123,6 +123,7 @@ enum {
   FUNCTION_INST_BINOP_ABBREV,
   FUNCTION_INST_BINOP_FLAGS_ABBREV,
   FUNCTION_INST_CAST_ABBREV,
+  FUNCTION_INST_CAST_FLAGS_ABBREV,
   FUNCTION_INST_RET_VOID_ABBREV,
   FUNCTION_INST_RET_VAL_ABBREV,
   FUNCTION_INST_UNREACHABLE_ABBREV,
@@ -1549,6 +1550,9 @@ static uint64_t getOptimizationFlags(const Value *V) {
       Flags |= bitc::AllowContract;
     if (FPMO->hasApproxFunc())
       Flags |= bitc::ApproxFunc;
+  } else if (const auto *NNI = dyn_cast<NonNegInstruction>(V)) {
+    if (NNI->hasNonNeg())
+      Flags |= 1 << bitc::NNI_NON_NEG;
   }
 
   return Flags;
@@ -2825,6 +2829,12 @@ void ModuleBitcodeWriter::writeInstruction(const Instruction &I,
         AbbrevToUse = FUNCTION_INST_CAST_ABBREV;
       Vals.push_back(VE.getTypeID(I.getType()));
       Vals.push_back(getEncodedCastOpcode(I.getOpcode()));
+      uint64_t Flags = getOptimizationFlags(&I);
+      if (Flags != 0) {
+        if (AbbrevToUse == FUNCTION_INST_CAST_ABBREV)
+          AbbrevToUse = FUNCTION_INST_CAST_FLAGS_ABBREV;
+        Vals.push_back(Flags);
+      }
     } else {
       assert(isa<BinaryOperator>(I) && "Unknown instruction!");
       Code = bitc::FUNC_CODE_INST_BINOP;
@@ -3646,6 +3656,18 @@ void ModuleBitcodeWriter::writeBlockInfo() {
         FUNCTION_INST_CAST_ABBREV)
       llvm_unreachable("Unexpected abbrev ordering!");
   }
+  { // INST_CAST_FLAGS abbrev for FUNCTION_BLOCK.
+    auto Abbv = std::make_shared<BitCodeAbbrev>();
+    Abbv->Add(BitCodeAbbrevOp(bitc::FUNC_CODE_INST_CAST));
+    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // OpVal
+    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,    // dest ty
+                              VE.computeBitsRequiredForTypeIndicies()));
+    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 4)); // opc
+    Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 8)); // flags
+    if (Stream.EmitBlockInfoAbbrev(bitc::FUNCTION_BLOCK_ID, Abbv) !=
+        FUNCTION_INST_CAST_FLAGS_ABBREV)
+      llvm_unreachable("Unexpected abbrev ordering!");
+  }
 
   { // INST_RET abbrev for FUNCTION_BLOCK.
     auto Abbv = std::make_shared<BitCodeAbbrev>();
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index e190d82127908db..abcee3f599da633 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -1347,6 +1347,9 @@ static void WriteOptimizationInfo(raw_ostream &Out, const User *U) {
   } else if (const GEPOperator *GEP = dyn_cast<GEPOperator>(U)) {
     if (GEP->isInBounds())
       Out << " inbounds";
+  } else if (const auto *NNI = dyn_cast<NonNegInstruction>(U)) {
+    if (NNI->hasNonNeg())
+      Out << " nneg";
   }
 }
 
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index b497951a598cc50..e7c8d1090d95ac1 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -172,6 +172,12 @@ void Instruction::setIsExact(bool b) {
   cast<PossiblyExactOperator>(this)->setIsExact(b);
 }
 
+void Instruction::setNonNeg(bool b) {
+  assert(isa<NonNegInstruction>(this) && "Must be zext");
+  SubclassOptionalData = (SubclassOptionalData & ~NonNegInstruction::NonNeg) |
+                         (b * NonNegInstruction::NonNeg);
+}
+
 bool Instruction::hasNoUnsignedWrap() const {
   return cast<OverflowingBinaryOperator>(this)->hasNoUnsignedWrap();
 }
@@ -180,6 +186,11 @@ bool Instruction::hasNoSignedWrap() const {
   return cast<OverflowingBinaryOperator>(this)->hasNoSignedWrap();
 }
 
+bool Instruction::hasNonNeg() const {
+  assert(isa<NonNegInstruction>(this) && "Must be zext");
+  return (SubclassOptionalData & NonNegInstruction::NonNeg) != 0;
+}
+
 bool Instruction::hasPoisonGeneratingFlags() const {
   return cast<Operator>(this)->hasPoisonGeneratingFlags();
 }
@@ -204,7 +215,12 @@ void Instruction::dropPoisonGeneratingFlags() {
   case Instruction::GetElementPtr:
     cast<GetElementPtrInst>(this)->setIsInBounds(false);
     break;
+
+  case Instruction::ZExt:
+    setNonNeg(false);
+    break;
   }
+
   if (isa<FPMathOperator>(this)) {
     setHasNoNaNs(false);
     setHasNoInfs(false);
@@ -379,6 +395,10 @@ void Instruction::copyIRFlags(const Value *V, bool IncludeWrapFlags) {
   if (auto *SrcGEP = dyn_cast<GetElementPtrInst>(V))
     if (auto *DestGEP = dyn_cast<GetElementPtrInst>(this))
       DestGEP->setIsInBounds(SrcGEP->isInBounds() || DestGEP->isInBounds());
+
+  if (auto *NNI = dyn_cast<NonNegInstruction>(V))
+    if (isa<NonNegInstruction>(this))
+      setNonNeg(NNI->hasNonNeg());
 }
 
 void Instruction::andIRFlags(const Value *V) {
@@ -404,6 +424,10 @@ void Instruction::andIRFlags(const Value *V) {
   if (auto *SrcGEP = dyn_cast<GetElementPtrInst>(V))
     if (auto *DestGEP = dyn_cast<GetElementPtrInst>(this))
       DestGEP->setIsInBounds(SrcGEP->isInBounds() && DestGEP->isInBounds());
+
+  if (auto *NNI = dyn_cast<NonNegInstruction>(V))
+    if (isa<NonNegInstruction>(this))
+      setNonNeg(hasNonNeg() && NNI->hasNonNeg());
 }
 
 const char *Instruction::getOpcodeName(unsigned OpCode) {
diff --git a/llvm/lib/IR/Operator.cpp b/llvm/lib/IR/Operator.cpp
index d2a1f2eb49dafed..a8be4233991d362 100644
--- a/llvm/lib/IR/Operator.cpp
+++ b/llvm/lib/IR/Operator.cpp
@@ -37,6 +37,10 @@ bool Operator::hasPoisonGeneratingFlags() const {
     // Note: inrange exists on constexpr only
     return GEP->isInBounds() || GEP->getInRangeIndex() != std::nullopt;
   }
+  case Instruction::ZExt:
+    if (auto *NNI = dyn_cast<NonNegInstruction>(this))
+      return NNI->hasNonNeg();
+    return false;
   default:
     if (const auto *FP = dyn_cast<FPMathOperator>(this))
       return FP->hasNoNaNs() || FP->hasNoInfs();
diff --git a/llvm/test/Assembler/flags.ll b/llvm/test/Assembler/flags.ll
index 3b54b06b81d4e2b..8331edf52a1699d 100644
--- a/llvm/test/Assembler/flags.ll
+++ b/llvm/test/Assembler/flags.ll
@@ -260,3 +260,9 @@ define i64 @mul_unsigned_ce() {
 	ret i64 mul nuw (i64 ptrtoint (ptr @addr to i64), i64 91)
 }
 
+define i64 @test_zext(i32 %a) {
+; CHECK: %res = zext nneg i32 %a to i64
+  %res = zext nneg i32 %a to i64
+  ret i64 %res
+}
+
diff --git a/llvm/test/Bitcode/flags.ll b/llvm/test/Bitcode/flags.ll
index 6febaa6b40df863..a6e368b7e76327f 100644
--- a/llvm/test/Bitcode/flags.ll
+++ b/llvm/test/Bitcode/flags.ll
@@ -16,6 +16,8 @@ second:                                           ; preds = %first
   %s = add nsw i32 %a, 0                          ; <i32> [#uses=0]
   %us = add nuw nsw i32 %a, 0                     ; <i32> [#uses=0]
   %z = add i32 %a, 0                              ; <i32> [#uses=0]
+  %hh = zext nneg i32 %a to i64
+  %ll = zext i32 %s to i64
   unreachable
 
 first:                                            ; preds = %entry
@@ -24,5 +26,7 @@ first:                                            ; preds = %entry
   %ss = add nsw i32 %a, 0                         ; <i32> [#uses=0]
   %uuss = add nuw nsw i32 %a, 0                   ; <i32> [#uses=0]
   %zz = add i32 %a, 0                             ; <i32> [#uses=0]
+  %kk = zext nneg i32 %a to i64
+  %rr = zext i32 %ss to i64
   br label %second
 }
diff --git a/llvm/test/Transforms/InstCombine/freeze.ll b/llvm/test/Transforms/InstCombine/freeze.ll
index f8e9a757e2bc938..3fde49d08481278 100644
--- a/llvm/test/Transforms/InstCombine/freeze.ll
+++ b/llvm/test/Transforms/InstCombine/freeze.ll
@@ -1116,6 +1116,17 @@ define i32 @freeze_ctpop(i32 %x) {
   ret i32 %fr
 }
 
+define i32 @freeze_zext_nneg(i8 %x) {
+; CHECK-LABEL: @freeze_zext_nneg(
+; CHECK-NEXT:    [[X_FR:%.*]] = freeze i8 [[X:%.*]]
+; CHECK-NEXT:    [[ZEXT:%.*]] = zext i8 [[X_FR]] to i32
+; CHECK-NEXT:    ret i32 [[ZEXT]]
+;
+  %zext = zext nneg i8 %x to i32
+  %fr = freeze i32 %zext
+  ret i32 %fr
+}
+
 !0 = !{}
 !1 = !{i64 4}
 !2 = !{i32 0, i32 100}
diff --git a/llvm/test/Transforms/SimplifyCFG/HoistCode.ll b/llvm/test/Transforms/SimplifyCFG/HoistCode.ll
index 4088ecfc818982f..08cf6cd5be80cf7 100644
--- a/llvm/test/Transforms/SimplifyCFG/HoistCode.ll
+++ b/llvm/test/Transforms/SimplifyCFG/HoistCode.ll
@@ -94,3 +94,33 @@ end:
   %cond = phi fast float [ 0.0, %bb0 ], [ %x, %bb1 ], [ %x, %bb2 ]
   ret float %cond
 }
+
+define i32 @hoist_zext_flags_preserve(i1 %C, i8 %x) {
+; CHECK-LABEL: @hoist_zext_flags_preserve(
+; CHECK-NEXT:  common.ret:
+; CHECK-NEXT:    [[Z1:%.*]] = zext nneg i8 [[X:%.*]] to i32
+; CHECK-NEXT:    ret i32 [[Z1]]
+;
+  br i1 %C, label %T, label %F
+T:
+  %z1 = zext nneg i8 %x to i32
+  ret i32 %z1
+F:
+  %z2 = zext nneg i8 %x to i32
+  ret i32 %z2
+}
+
+define i32 @hoist_zext_flags_drop(i1 %C, i8 %x) {
+; CHECK-LABEL: @hoist_zext_flags_drop(
+; CHECK-NEXT:  common.ret:
+; CHECK-NEXT:    [[Z1:%.*]] = zext i8 [[X:%.*]] to i32
+; CHECK-NEXT:    ret i32 [[Z1]]
+;
+  br i1 %C, label %T, label %F
+T:
+  %z1 = zext nneg i8 %x to i32
+  ret i32 %z1
+F:
+  %z2 = zext i8 %x to i32
+  ret i32 %z2
+}
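
The flag bookkeeping in the diff above (`setNonNeg`, `hasNonNeg`, `copyIRFlags`, `andIRFlags`, `dropPoisonGeneratingFlags`) can be sketched in isolation. This is a toy stand-in under stated assumptions, not the LLVM classes: `ToyZExt`, `copyFlags`, `andFlags`, `hoistCommon` and `checkHoistExamples` are hypothetical names, and a single byte plays the role of `SubclassOptionalData`.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a zext instruction's optional flag bits; not the LLVM class.
struct ToyZExt {
  static constexpr uint8_t NonNeg = 1u << 0; // same bit layout as the patch
  uint8_t OptionalData = 0;

  // Set or clear the flag without touching any other bit.
  void setNonNeg(bool B = true) {
    OptionalData =
        static_cast<uint8_t>((OptionalData & ~NonNeg) | (B ? NonNeg : 0));
  }
  bool hasNonNeg() const { return (OptionalData & NonNeg) != 0; }

  // copyIRFlags analogue: take the flag from the source.
  void copyFlags(const ToyZExt &Src) { setNonNeg(Src.hasNonNeg()); }

  // andIRFlags analogue: keep the flag only if both instructions have it.
  void andFlags(const ToyZExt &Other) {
    setNonNeg(hasNonNeg() && Other.hasNonNeg());
  }

  // dropPoisonGeneratingFlags analogue: nneg can produce poison, so clear it.
  void dropPoisonGeneratingFlags() { setNonNeg(false); }
};

// Merge two otherwise identical zexts the way a hoisting transform would.
inline ToyZExt hoistCommon(ToyZExt A, const ToyZExt &B) {
  A.andFlags(B);
  return A;
}

// Mirrors the two SimplifyCFG tests: the flag survives only if both have it.
inline void checkHoistExamples() {
  ToyZExt WithFlag, WithoutFlag;
  WithFlag.setNonNeg();
  assert(hoistCommon(WithFlag, WithFlag).hasNonNeg());     // flags preserved
  assert(!hoistCommon(WithFlag, WithoutFlag).hasNonNeg()); // flags dropped
}
```

Intersecting rather than copying is the conservative choice here: the hoisted instruction executes on both paths, so it may only keep a promise that both originals made.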

nikic · 2023-10-02T13:09:17Z

Wonder whether the flag should also be supported on sext. I don't think it would be terribly useful (as we canonicalize sext -> zext, not the other way around), but it might make sense in the interest of consistency.

(The counter-argument is that it doubles the number of places where the flag needs to be preserved, for a case where it's not worthwhile.)

nikic · 2023-10-02T13:28:31Z

Just saw https://discourse.llvm.org/t/aggressive-conversion-of-sext-to-zext-blocks-indvarsimplify/61561/9?u=nikic, which proposed to call this zext nsw instead (and the equivalent sext would be sext nuw). I can definitely see the appeal in reusing existing nowrap flags, though it seems somewhat awkward in this instance, especially with one of the flags always being implied.

leo-ard · 2023-10-02T22:34:43Z

Thanks for doing this PR. I am the author of #67594 and this is very useful. I just had some ideas from your thoughts/comments and wanted to tag along.

Wonder whether the flag should also be supported on sext.

I don't think this would be useful, even for consistency, as zext nneg and sext nneg are exactly the same instruction (pad zeros at the front of the value, and we know that the argument is non-negative). I think a better question is: should we add the nneg attribute to zext or sext? Or even: is there an underlying instruction, such as ext, that could represent what we want? We could have ext sign ..., ext zero ... and ext nneg .... All three scenarios (sign, zero, and zero where we know the argument is non-negative) seem to be mutually exclusive. But in the end, zext nneg is probably the easiest to implement (compared to renaming sext -> ext sign and zext -> ext zero) and makes the most sense, as we are zero-extending.

nikic · 2023-10-03T20:54:53Z

I think a better question is: should we add the nneg attribute to zext or sext? Or even: is there an underlying instruction, such as ext, that could represent what we want? We could have ext sign ..., ext zero ... and ext nneg .... All three scenarios (sign, zero, and zero where we know the argument is non-negative) seem to be mutually exclusive. But in the end, zext nneg is probably the easiest to implement (compared to renaming sext -> ext sign and zext -> ext zero) and makes the most sense, as we are zero-extending.

Yeah, I think this basically comes down to how hard it is to implement. Adding a new flag to zext is based on the well-understood existing mechanism of poison-generating flags and degrades gracefully: all existing zext folds continue working. If a fold doesn't consider the nneg flag (as will be the case for all folds initially), the worst that can happen is that the flag is dropped or a fold that could make use of it doesn't, but we'll never end up doing something worse than we do now. This allows gradual introduction without breaking anything.

The main disadvantage to zext nneg I see is that zext continues to be "privileged" as our choice of canonicalization. Optimizations expecting a zext can just look for a zext. Optimizations expecting an sext may have to look for a "sext or zext nneg".

For the "common ext instruction" approach, the way I would frame this is probably that there are ext zero, ext sign and ext zero sign, where the latter is both a zero and a sign extend, returning poison if those two options conflict. The "m_ZExt" would match any ext with zero flag (either ext zero or ext zero sign) and m_SExt would do the same with the sign flag. Two exts would be considered "identical if defined" if they have at least one common flag. Flag intersection would then be that common flag. This means that ext zero sign could be combined with both ext zero and ext sign by dropping the appropriate flag. The "drop poison-generating flags" operation would still have to make an arbitrary choice, such that freeze(ext zero sign) would presumably become ext zero (freeze).
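
The flag model in the paragraph above can be sketched as a small bitmask. This is only an illustration of the hypothetical "common ext" design discussed here (it was not implemented by this PR); `ExtFlags`, `matchesZExt`, `matchesSExt`, `intersectExtFlags` and `dropPoisonGenerating` are made-up names.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical flags for a unified "ext" instruction; not part of LLVM.
enum ExtFlags : uint8_t {
  ExtZero = 1 << 0, // high bits are zero
  ExtSign = 1 << 1  // high bits copy the sign bit
};

// An m_ZExt-style matcher accepts any ext that carries the zero flag.
inline bool matchesZExt(uint8_t Flags) { return (Flags & ExtZero) != 0; }

// An m_SExt-style matcher accepts any ext that carries the sign flag.
inline bool matchesSExt(uint8_t Flags) { return (Flags & ExtSign) != 0; }

// Two exts are "identical if defined" when they share at least one flag.
// The merged instruction keeps only the common flags; std::nullopt means
// the two exts cannot be combined.
inline std::optional<uint8_t> intersectExtFlags(uint8_t A, uint8_t B) {
  assert(A != 0 && B != 0 && "this model has no flagless (anyext) form");
  uint8_t Common = A & B;
  if (Common == 0)
    return std::nullopt;
  return Common;
}

// Dropping poison-generating flags on "ext zero sign" has to pick one flag;
// this sketch arbitrarily keeps zero, matching freeze(ext zero sign) becoming
// ext zero (freeze).
inline uint8_t dropPoisonGenerating(uint8_t Flags) {
  return Flags == (ExtZero | ExtSign) ? static_cast<uint8_t>(ExtZero) : Flags;
}
```

Note how `dropPoisonGenerating` cannot simply clear bits to zero, which is the sense in which these "flags" do not fit the usual poison-generating-flag paradigm.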

The nice thing about this approach is that zext is no longer privileged, the model forces handling zext and sext on the same footing. On the other hand, this would be a major IR change, and it doesn't quite fit into the usual paradigm of poison-generating flags (because these "flags" aren't the kind that can be simply dropped).

(A further extension of this would be to also have plain ext which is extension without forcing the high bits to be zero or sign. In other words, this represents the anyext operation. But I don't think we actually want something like this at the IR level, because this would have to be based on undef bits, which is a concept we'd rather get rid of.)

nikic · 2023-10-06T15:13:12Z

I've submitted an RFC for this change: https://discourse.llvm.org/t/rfc-add-zext-nneg-flag/73914

asb · 2023-10-09T19:09:19Z

llvm/docs/LangRef.rst

@@ -11156,6 +11156,9 @@ until it reaches the size of the destination type, ``ty2``.

 When zero extending from i1, the result will always be either 0 or 1.

+If the ``nneg`` flag is set, and the ``zext`` argument is negative, the result


The description of nneg is very minimal, vs the more detailed description in the RFC and PR. Maybe add to "Overview" something like "The nneg (non-negative) flag, if present, specifies that the operand is non-negative. This property may be used by optimization passes to later convert the zext to a semantically equivalent sext."

Done! (I dropped the "semantically equivalent" bit because it's a refinement, not an equivalence.)

nikic · 2023-10-16T07:51:06Z

ping

goldsteinn · 2023-10-16T17:17:30Z

The main disadvantage to zext nneg I see is that zext continues to be "privileged" as our choice of canonicalization. Optimizations expecting a zext can just look for a zext. Optimizations expecting an sext may have to look for a "sext or zext nneg".

This seems easy enough to fix with a new pattern m_SExtLike. Few transforms actually check that opcode is sext and those could almost certainly be refactored relatively easily.
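
As a rough illustration of what such a matcher would accept, here is a toy predicate over a made-up cast representation. `Opcode`, `ToyCast` and `isSExtLike` are hypothetical and only model the idea; this is not a PatternMatch implementation.

```cpp
#include <cassert>

// Toy cast representation; not LLVM's CastInst.
enum class Opcode { ZExt, SExt, Trunc };

struct ToyCast {
  Opcode Op;
  bool NonNeg = false; // only meaningful on ZExt in this model
};

// "sext-like": either a real sext, or a zext that promises a non-negative
// operand (for which zext and sext produce the same result).
inline bool isSExtLike(const ToyCast &C) {
  assert((C.Op == Opcode::ZExt || !C.NonNeg) && "nneg is only valid on zext");
  return C.Op == Opcode::SExt || (C.Op == Opcode::ZExt && C.NonNeg);
}
```

A transform that currently checks for the sext opcode would call a predicate like this instead, so it keeps firing after sext has been canonicalized to zext nneg.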

goldsteinn · 2023-10-16T17:19:51Z

llvm/include/llvm/IR/InstrTypes.h

@@ -692,6 +692,20 @@ class CastInst : public UnaryInstruction {
  }
 };

+/// Instruction that can have a nneg flag (only zext).
+class NonNegInstruction : public CastInst {


Personally, I'm not a fan of NonNegInstruction as the name. To me at least, that seems to imply a lot more coverage than just zext.
Maybe something a bit more specific, like NonNegCastInstruction?

Maybe it would be best to just drop it entirely? Given that we have just the one instruction supporting the flag, it might be better to stick with isa<ZExtInst> over isa<NonNegInstruction>. There is some risk that this may require future refactoring if we add the flag on additional instructions -- but that might be better than doing an unnecessary over-generalization now.

(We might want to add nneg on GEPs, but in that case the semantics of the flag would be sufficiently different that they could not share any common handling anyway.)

Is this meant to mirror how we use OverflowingBinaryOperator now?
But I'd say if there is even a chance we might expand this, keeping the class makes sense, but I would leave its name as NonNegCastInstruction for now. The refactor to rename NonNegCastInstruction -> NonNegInstruction will probably be cheaper than if we skip it entirely.
I guess when I made the comment I wasn't thinking the flag may be applied to other instructions, which makes me less opposed to the current name.

Is this meant to mirror how we use OverflowingBinaryOperator now?

Yes. I guess the name is confusing in that it sounds like "an instruction with the nneg flag", while in reality this is "an instruction that may or may not have the nneg flag". A more accurate name, taking inspiration from PossiblyExactOperator, would be PossiblyNonNegInstruction. Or I guess PossiblyNonNegCastInst taking your suggestion into account.

Agreed, I would go with PossiblyNonNegInstruction. I think with that name (since it's so explicitly a query) the CastInst isn't so important.

goldsteinn · 2023-10-16T17:24:19Z

llvm/lib/Bitcode/Reader/BitcodeReader.cpp

        if (!CastInst::castIsValid(CastOp, Op, ResTy))
          return error("Invalid cast");
        I = CastInst::Create(CastOp, Op, ResTy);
      }
+      if (OpNum < Record.size() && isa<NonNegInstruction>(I) &&
+          (Record[OpNum] & (1 << bitc::NNI_NON_NEG)))


nit: I think bitc::NNI_NON_NEG could just be defined as the mask (i.e. what is currently 1 << bitc::NNI_NON_NEG) rather than the log (the bit position). All the uses I see are as the mask, and that seems to fit the pattern of other enums better as well.

If there is a reason to keep as log then this should be ResTypeID(1) << ... and below should be uint64_t(1) << ...

This is just following the same convention used for the other flags (OBO and PEO):

llvm-project/llvm/lib/Bitcode/Reader/BitcodeReader.cpp

Lines 4849 to 4863 in e93bddb

      if (OpNum < Record.size()) {
        if (Opc == Instruction::Add ||
            Opc == Instruction::Sub ||
            Opc == Instruction::Mul ||
            Opc == Instruction::Shl) {
          if (Record[OpNum] & (1 << bitc::OBO_NO_SIGNED_WRAP))
            cast<BinaryOperator>(I)->setHasNoSignedWrap(true);
          if (Record[OpNum] & (1 << bitc::OBO_NO_UNSIGNED_WRAP))
            cast<BinaryOperator>(I)->setHasNoUnsignedWrap(true);
        } else if (Opc == Instruction::SDiv ||
                   Opc == Instruction::UDiv ||
                   Opc == Instruction::LShr ||
                   Opc == Instruction::AShr) {
          if (Record[OpNum] & (1 << bitc::PEO_EXACT))
            cast<BinaryOperator>(I)->setIsExact(true);

Okay to keep as log then (although I'd say preferably all would be changed to be the mask), but then can you cast 1 to the proper shift width? Otherwise it just feels like a bug waiting to happen.

Is the "bug waiting to happen" that we are going to add 30 more flags to this instruction? That seems like a pretty contrived concern, especially when instructions can only have up to 7 flags.

Fair enough
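
For readers following the "log vs. mask" exchange above, here is a small standalone sketch of the convention: the enum stores the bit position, the writer and reader both shift `1` by it, and the flags operand is an optional trailing field. The record layout `[operand, dest type, opcode, flags?]` is a simplification assumed for illustration, and `TOY_NNI_NON_NEG`, `encodeCast` and `decodeNonNeg` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit *position* of the flag (the "log"), mirroring bitc::NNI_NON_NEG = 0.
enum ToyNonNegFlags { TOY_NNI_NON_NEG = 0 };

// Writer side: append the flags operand only when some flag is set, so
// records for flagless casts keep their old three-operand shape.
inline std::vector<uint64_t> encodeCast(uint64_t OpVal, uint64_t DestTy,
                                        uint64_t Opc, bool NonNeg) {
  std::vector<uint64_t> Record = {OpVal, DestTy, Opc};
  uint64_t Flags = NonNeg ? (uint64_t(1) << TOY_NNI_NON_NEG) : 0;
  if (Flags != 0)
    Record.push_back(Flags);
  return Record;
}

// Reader side: the trailing operand is optional, so a record without it
// decodes as "no flags".
inline bool decodeNonNeg(const std::vector<uint64_t> &Record) {
  assert(Record.size() >= 3 && "operand, type and opcode are mandatory");
  const std::size_t OpNum = 3; // index of the optional flags operand
  return OpNum < Record.size() &&
         (Record[OpNum] & (uint64_t(1) << TOY_NNI_NON_NEG)) != 0;
}
```

Shifting `uint64_t(1)` rather than a plain `1` is the widening goldsteinn asked about; with a single flag at position 0 either spelling gives the same result.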

Add an nneg flag to the zext instruction, which specifies that the argument is non-negative. Otherwise, the result is a poison value. The primary use-case for the flag is to preserve information when sext gets replaced with zext due to range-based canonicalization. The nneg flag allows us to convert the zext back into an sext later. This is useful for some optimizations (e.g. a signed icmp can fold with sext but not zext), as well as some targets (e.g. RISCV prefers sext over zext). Co-authored-by: Nikita Popov <npopov@redhat.com>

nikic · 2023-10-26T18:23:12Z

ping

goldsteinn · 2023-10-27T15:49:03Z

LGTM, but it would be good if another reviewer could also check it; this is not an area I'm fully qualified to review.

topperc · 2023-10-27T17:23:01Z

llvm/lib/IR/Operator.cpp

@@ -37,6 +37,10 @@ bool Operator::hasPoisonGeneratingFlags() const {
    // Note: inrange exists on constexpr only
    return GEP->isInBounds() || GEP->getInRangeIndex() != std::nullopt;
  }
+  case Instruction::ZExt:
+    if (auto *NNI = dyn_cast<PossiblyNonNegInst>(this))


Why do we need a dyn_cast instead of a cast like the other cases?

This is in case it's a constant expression. I haven't completely finished the zext expr removal yet.

topperc

LGTM

zext nneg was recently added to the IR in #67982. Teaching SCEVExpander to emit nneg when possible is valuable since SCEV may have proved non-trivial facts about loop bounds which would otherwise be lost when materializing the value.

This fixes a miscompile introduced in the recent llvm#67982, and likely exposed in changes since to infer and leverage the same. No active bug reports as of yet. This was noticed in llvm#70858 (comment).

zext nneg was recently added to the IR in llvm#67982. This patch teaches demanded bits and known bits about the semantics of the instruction, and adds a couple of test cases to illustrate basic functionality.

zext nneg was recently added to the IR in #67982. This patch teaches SimplifyIndVars to prefer zext nneg over *both* sext and plain zext, when a local SCEV query indicates the source is non-negative. The choice to prefer zext nneg over sext looks slightly aggressive here, but probably isn't so much in practice. For cases where we'd "remember" the range fact, instcombine would convert the sext into a zext nneg anyway. The only cases where this produces a different result overall are when SCEV knows a non-local fact, and it doesn't get materialized into the IR. Those are exactly the cases where using zext nneg is most useful. We do run the risk of e.g. a missing combine, since we haven't updated most of them yet, but that seems like a manageable risk. Note that there are much deeper algorithmic changes we could make to this code to exploit zext nneg, but this seemed like a reasonable and low-risk starting point.

@rotateright

This PR fixes #55013: the max intrinsic is not generated for this simple loop case: https://godbolt.org/z/hxz1xhMPh. This is caused by an icmp not being folded into a select, so the max intrinsic is never formed. For background: since LLVM 14, the SCCP pass got smarter by folding sext into zext for non-negative ranges: https://reviews.llvm.org/D81756. After this change, InstCombine was sometimes unable to fold the icmp correctly, as its two arguments were a mismatched zext/sext pair. To fix this, @rotateright implemented https://reviews.llvm.org/D124419, which tries to resolve the mismatch by using ValueTracking to determine whether the argument of a zext is non-negative (in which case it behaves like a sext). However, ValueTracking is not smart enough to infer that the value is non-negative in some cases. Recently, @nikic implemented #67982, which keeps the information that a zext is non-negative. This PR simply uses this information to do the folding accordingly. TL;DR: this PR uses the recent nneg flag on zext to fold the icmp accordingly in InstCombine. This PR also contains test cases for sext/zext folding with InstCombine, as well as x86 regression tests for the max/min case.
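
The reason the nneg information makes the mismatched compare foldable can be checked exhaustively on i8. The sketch below is only an argument for soundness under a toy model, not the InstCombine implementation; `wideSignedLess`, `narrowSignedLess` and `foldHoldsWhenNonNeg` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// icmp slt (sext i8 A to i32), (zext i8 B to i32), evaluated at the wide type.
inline bool wideSignedLess(int8_t A, int8_t B) {
  int32_t SExtA = A;                       // sign extension
  int32_t ZExtB = static_cast<uint8_t>(B); // zero extension of B's bit pattern
  return SExtA < ZExtB;
}

// The narrow replacement, icmp slt i8 A, B. It is only a valid rewrite when
// the zext carried nneg, i.e. when B is known to be non-negative.
inline bool narrowSignedLess(int8_t A, int8_t B) {
  assert(B >= 0 && "zext nneg of a negative operand would be poison");
  return A < B;
}

// Exhaustively compare the two forms for every A and every non-negative B.
inline bool foldHoldsWhenNonNeg() {
  for (int A = -128; A <= 127; ++A)
    for (int B = 0; B <= 127; ++B)
      if (wideSignedLess(static_cast<int8_t>(A), static_cast<int8_t>(B)) !=
          narrowSignedLess(static_cast<int8_t>(A), static_cast<int8_t>(B)))
        return false;
  return true;
}
```

Without the non-negativity guarantee the rewrite is wrong: for A = 0 and B = -1 the wide compare sees 0 < 255, while a narrow signed compare would see 0 < -1.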

…tions (#73592) This flag was added in #67982, but was not yet accessible via the C API. This commit adds a getter/setter for this flag, and a test for it.

nikic requested a review from topperc October 2, 2023 13:00

llvmbot added llvm:ir llvm:transforms labels Oct 2, 2023

nikic requested review from fhahn, preames and efriedma-quic October 2, 2023 13:03

asb reviewed Oct 9, 2023


dtcxzyw mentioned this pull request Oct 12, 2023

sext(square nsw) produces unnecessary zext on riscv64/mips64/loongarch64 #62587

Open

goldsteinn reviewed Oct 16, 2023


llvm/lib/AsmParser/LLParser.cpp (resolved)

goldsteinn reviewed Oct 16, 2023


llvm/lib/Bitcode/Reader/BitcodeReader.cpp (outdated, resolved)

goldsteinn reviewed Oct 16, 2023


karouzakisp and others added 3 commits October 18, 2023 09:35

Add additional description to overview

55d512b

Rename NonNegInstruction -> PossiblyNonNegInst

32a20d7

nikic force-pushed the zext-nneg branch from a545bac to 32a20d7 Compare October 18, 2023 07:47

topperc reviewed Oct 27, 2023


topperc approved these changes Oct 27, 2023


nikic merged commit ed3f06b into llvm:main Oct 30, 2023
4 checks passed

preames mentioned this pull request Oct 30, 2023

[InstCombine] Infer nneg on zext when forming from non-negative sext #70706

Merged

preames mentioned this pull request Oct 31, 2023

[IndVars] Generate zext nneg when locally obvious #70823

Closed

leo-ard mentioned this pull request Oct 31, 2023

[InstCombine] Use zext's nneg flag for icmp folding #70845

Merged

preames mentioned this pull request Oct 31, 2023

Add known and demanded bits support for zext nneg #70858

Merged

preames mentioned this pull request Nov 2, 2023

[instcombine] Drop zext nneg flag when simplify operand #71088

Merged

Benjins mentioned this pull request Nov 28, 2023

[C API] Add support for setting/getting new nneg flag on zext instructions #73592

Merged


nikic commented Oct 2, 2023 (edited)

llvmbot commented Oct 2, 2023 (edited)

nikic commented Oct 2, 2023 (edited)

nikic commented Oct 2, 2023

leo-ard commented Oct 2, 2023 (edited)

nikic commented Oct 3, 2023

nikic commented Oct 6, 2023

asb Oct 9, 2023

nikic Oct 10, 2023

nikic commented Oct 16, 2023

goldsteinn commented Oct 16, 2023

goldsteinn Oct 16, 2023

nikic Oct 17, 2023

goldsteinn Oct 17, 2023

nikic Oct 17, 2023

goldsteinn Oct 17, 2023

goldsteinn Oct 16, 2023

goldsteinn Oct 16, 2023

nikic Oct 18, 2023

goldsteinn Oct 18, 2023

nikic Oct 18, 2023

goldsteinn Oct 19, 2023

nikic commented Oct 26, 2023

goldsteinn commented Oct 27, 2023

topperc Oct 27, 2023 (edited)

nikic Oct 27, 2023

topperc left a comment

		@@ -11156,6 +11156,9 @@ until it reaches the size of the destination type, ``ty2``.

		When zero extending from i1, the result will always be either 0 or 1.

		If the ``nneg`` flag is set, and the ``zext`` argument is negative, the result

	if (OpNum < Record.size()) {
	  if (Opc == Instruction::Add ||
	      Opc == Instruction::Sub ||
	      Opc == Instruction::Mul ||
	      Opc == Instruction::Shl) {
	    if (Record[OpNum] & (1 << bitc::OBO_NO_SIGNED_WRAP))
	      cast<BinaryOperator>(I)->setHasNoSignedWrap(true);
	    if (Record[OpNum] & (1 << bitc::OBO_NO_UNSIGNED_WRAP))
	      cast<BinaryOperator>(I)->setHasNoUnsignedWrap(true);
	  } else if (Opc == Instruction::SDiv ||
	             Opc == Instruction::UDiv ||
	             Opc == Instruction::LShr ||
	             Opc == Instruction::AShr) {
	    if (Record[OpNum] & (1 << bitc::PEO_EXACT))
	      cast<BinaryOperator>(I)->setIsExact(true);
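The excerpt above decodes an optional flags word whose bit meanings depend on the opcode. The following is a minimal, self-contained sketch of that pattern in plain C++, extended with a non-negative bit for a zext-like opcode. It does not use the LLVM bitcode reader: `Opcode`, `DecodedFlags`, `encodeFlags`, `decodeFlags` and the bit positions are all hypothetical names and values chosen for illustration, not the ones in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opcodes and bit positions, for illustration only.
enum class Opcode { Add, ZExt, SExt };
enum WrapBit : unsigned { kNoUnsignedWrap = 0, kNoSignedWrap = 1 };
enum NonNegBit : unsigned { kNonNeg = 0 };

struct DecodedFlags {
  bool nuw = false;
  bool nsw = false;
  bool nneg = false;
};

// Packs the flags that are meaningful for `opc` into one record word.
std::uint64_t encodeFlags(Opcode opc, const DecodedFlags &f) {
  std::uint64_t word = 0;
  switch (opc) {
  case Opcode::Add:
    if (f.nuw) word |= std::uint64_t{1} << kNoUnsignedWrap;
    if (f.nsw) word |= std::uint64_t{1} << kNoSignedWrap;
    break;
  case Opcode::ZExt:
    if (f.nneg) word |= std::uint64_t{1} << kNonNeg;
    break;
  case Opcode::SExt:
    break; // no optional flags in this model
  }
  return word;
}

// Interprets the same word according to the opcode, mirroring the
// "same bit, different meaning per opcode" structure of the excerpt.
DecodedFlags decodeFlags(Opcode opc, std::uint64_t word) {
  DecodedFlags f;
  switch (opc) {
  case Opcode::Add:
    f.nuw = (word & (std::uint64_t{1} << kNoUnsignedWrap)) != 0;
    f.nsw = (word & (std::uint64_t{1} << kNoSignedWrap)) != 0;
    break;
  case Opcode::ZExt:
    f.nneg = (word & (std::uint64_t{1} << kNonNeg)) != 0;
    break;
  case Opcode::SExt:
    break; // bits are ignored for opcodes without flags
  }
  return f;
}

// Round-trip check for the zext-like case.
void selfCheck() {
  DecodedFlags in;
  in.nneg = true;
  assert(decodeFlags(Opcode::ZExt, encodeFlags(Opcode::ZExt, in)).nneg);
}
```

In this model, bit 0 means "no unsigned wrap" for `Add` and "non-negative" for `ZExt`, so the reader has to dispatch on the opcode before it interprets the word.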
