[RISCV] Fix a correctness issue in optimizeCondBranch. Prevent optimizing compare with x0. NFC #145440


Merged — 4 commits merged into llvm:main on Jun 24, 2025

Conversation

@topperc (Collaborator) commented Jun 23, 2025

We were incorrectly changing -1 to 0 for unsigned compares in case 1.
The comment incorrectly said UINT64_MAX is bigger than INT64_MAX, but
we were doing a signed compare and UINT64_MAX is smaller than INT64_MAX
in signed space.

Prevent changing 0 constants since we can use x0. The test cases
for these are contrived to use addi rd, x0, 0. We're more likely
to have a COPY from x0 which we already don't optimize for other
reasons.

Check if registers are virtual before calling hasOneUse. The use count
for physical registers is meaningless.
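The case-1 bug described above can be illustrated outside of LLVM with an exhaustive check. This is a hypothetical Python sketch, not the patch's C++ code: it models the rewrite of "branch if C0 <u X" into the reversed "branch if X >=u C0 + 1" over 4-bit unsigned values, and shows it breaks exactly when C0 is the all-ones (-1) constant, where C0 + 1 wraps to 0.

```python
# Hypothetical sketch, not LLVM code: case 1 turns "branch if C0 <u X"
# into the reversed "branch if X >=u C0 + 1". Model it over 4-bit
# unsigned values so every pair can be checked exhaustively.
MASK = 0xF  # 4-bit unsigned arithmetic

def original(c0, x):
    return c0 < x                        # BLTU C0, X

def rewritten(c0, x):
    return x >= ((c0 + 1) & MASK)        # reversed BLTU X, C0+1

for c0 in range(16):
    for x in range(16):
        if c0 == MASK:                   # C0 = -1 (all ones)
            # original is always false; the rewrite wraps to "x >= 0",
            # which is always true, so they disagree everywhere
            assert original(c0, x) != rewritten(c0, x)
        else:
            assert original(c0, x) == rewritten(c0, x)
print("unsigned case-1 rewrite is sound only for C0 != -1")
```

The same check passes for every C0 except the all-ones value, which is why the patch adds the `C0 != -1` guard for the unsigned condition codes.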

topperc added 2 commits June 23, 2025 16:11
…zing compare with x0. NFC

@llvmbot (Member) commented Jun 23, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)



Full diff: https://github.com/llvm/llvm-project/pull/145440.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+15-8)
  • (modified) llvm/test/CodeGen/RISCV/branch-opt.mir (+308)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index d9ef911b9a32e..074c74883e172 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -1449,11 +1449,14 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
     return Register();
   };
 
-  if (isFromLoadImm(MRI, LHS, C0) && MRI.hasOneUse(LHS.getReg())) {
+  if (isFromLoadImm(MRI, LHS, C0) && LHS.getReg().isVirtual() &&
+      MRI.hasOneUse(LHS.getReg())) {
+    assert(isInt<12>(C0) && "Unexpected immediate");
     // Might be case 1.
-    // Signed integer overflow is UB. (UINT64_MAX is bigger so we don't need
-    // to worry about unsigned overflow here)
-    if (C0 < INT64_MAX)
+    // Don't change 0 to -1 since we can use x0.
+    // For unsigned cases changing -1U to 0 would be incorrect.
+    if (C0 &&
+        ((CC == RISCVCC::COND_GE || CC == RISCVCC::COND_LT) || C0 != -1)) {
       if (Register RegZ = searchConst(C0 + 1)) {
         reverseBranchCondition(Cond);
         Cond[1] = MachineOperand::CreateReg(RHS.getReg(), /*isDef=*/false);
@@ -1464,11 +1467,14 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
         modifyBranch();
         return true;
       }
-  } else if (isFromLoadImm(MRI, RHS, C0) && MRI.hasOneUse(RHS.getReg())) {
+    }
+  } else if (isFromLoadImm(MRI, RHS, C0) && RHS.getReg().isVirtual() &&
+             MRI.hasOneUse(RHS.getReg())) {
+    assert(isInt<12>(C0) && "Unexpected immediate");
     // Might be case 2.
-    // For unsigned cases, we don't want C1 to wrap back to UINT64_MAX
-    // when C0 is zero.
-    if ((CC == RISCVCC::COND_GE || CC == RISCVCC::COND_LT) || C0)
+    // For signed cases we don't want to change 0 since we can use x0.
+    // For unsigned cases changing 0 to -1U would be incorrect.
+    if (C0) {
       if (Register RegZ = searchConst(C0 - 1)) {
         reverseBranchCondition(Cond);
         Cond[1] = MachineOperand::CreateReg(RegZ, /*isDef=*/false);
@@ -1479,6 +1485,7 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
         modifyBranch();
         return true;
       }
+    }
   }
 
   return false;
diff --git a/llvm/test/CodeGen/RISCV/branch-opt.mir b/llvm/test/CodeGen/RISCV/branch-opt.mir
index ba3a20f2fbfcd..76ade18ce5161 100644
--- a/llvm/test/CodeGen/RISCV/branch-opt.mir
+++ b/llvm/test/CodeGen/RISCV/branch-opt.mir
@@ -18,6 +18,74 @@
     ret void
   }
 
+  define void @ule_negone(ptr %a, i32 signext %b, ptr %c, ptr %d) {
+    store i32 0, ptr %a
+    %p = icmp ule i32 %b, -1
+    br i1 %p, label %block1, label %block2
+
+  block1:                                           ; preds = %0
+    store i32 %b, ptr %c
+    br label %end_block
+
+  block2:                                           ; preds = %0
+    store i32 87, ptr %d
+    br label %end_block
+
+  end_block:                                        ; preds = %block2, %block1
+    ret void
+  }
+
+  define void @ult_zero(ptr %a, i32 signext %b, ptr %c, ptr %d) {
+    store i32 -1, ptr %a
+    %p = icmp ult i32 %b, 0
+    br i1 %p, label %block1, label %block2
+
+  block1:                                           ; preds = %0
+    store i32 %b, ptr %c
+    br label %end_block
+
+  block2:                                           ; preds = %0
+    store i32 87, ptr %d
+    br label %end_block
+
+  end_block:                                        ; preds = %block2, %block1
+    ret void
+  }
+
+  define void @sle_zero(ptr %a, i32 signext %b, ptr %c, ptr %d) {
+    store i32 1, ptr %a
+    %p = icmp sle i32 %b, 0
+    br i1 %p, label %block1, label %block2
+
+  block1:                                           ; preds = %0
+    store i32 %b, ptr %c
+    br label %end_block
+
+  block2:                                           ; preds = %0
+    store i32 87, ptr %d
+    br label %end_block
+
+  end_block:                                        ; preds = %block2, %block1
+    ret void
+  }
+
+  define void @slt_zero(ptr %a, i32 signext %b, ptr %c, ptr %d) {
+    store i32 -1, ptr %a
+    %p = icmp slt i32 %b, 0
+    br i1 %p, label %block1, label %block2
+
+  block1:                                           ; preds = %0
+    store i32 %b, ptr %c
+    br label %end_block
+
+  block2:                                           ; preds = %0
+    store i32 87, ptr %d
+    br label %end_block
+
+  end_block:                                        ; preds = %block2, %block1
+    ret void
+  }
+
   declare void @bar(...)
 
 ...
@@ -66,3 +134,243 @@ body:             |
     PseudoRET
 
 ...
+---
+name:            ule_negone
+tracksRegLiveness: true
+body:             |
+  ; CHECK-LABEL: name: ule_negone
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT:   liveins: $x10, $x11, $x12, $x13
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr = COPY $x13
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr = COPY $x12
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr = COPY $x11
+  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:gpr = COPY $x10
+  ; CHECK-NEXT:   [[ADDI:%[0-9]+]]:gpr = ADDI $x0, 0
+  ; CHECK-NEXT:   SW killed [[ADDI]], [[COPY3]], 0 :: (store (s32))
+  ; CHECK-NEXT:   [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, -1
+  ; CHECK-NEXT:   BLTU killed [[ADDI1]], [[COPY2]], %bb.2
+  ; CHECK-NEXT:   PseudoBR %bb.1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY4:%[0-9]+]]:gpr = COPY [[COPY2]]
+  ; CHECK-NEXT:   SW [[COPY4]], [[COPY1]], 0 :: (store (s32))
+  ; CHECK-NEXT:   PseudoBR %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[ADDI2:%[0-9]+]]:gpr = ADDI $x0, 87
+  ; CHECK-NEXT:   SW killed [[ADDI2]], [[COPY]], 0 :: (store (s32))
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3:
+  ; CHECK-NEXT:   PseudoRET
+  bb.0:
+    successors: %bb.1, %bb.2
+    liveins: $x10, $x11, $x12, $x13
+
+    %3:gpr = COPY $x13
+    %2:gpr = COPY $x12
+    %1:gpr = COPY $x11
+    %0:gpr = COPY $x10
+    %5:gpr = ADDI $x0, 0
+    SW killed %5, %0, 0 :: (store (s32))
+    %6:gpr = ADDI $x0, -1
+    BLTU killed %6, %1, %bb.2
+    PseudoBR %bb.1
+
+  bb.1:
+    %4:gpr = COPY %1
+    SW %4, %2, 0 :: (store (s32))
+    PseudoBR %bb.3
+
+  bb.2:
+    %7:gpr = ADDI $x0, 87
+    SW killed %7, %3, 0 :: (store (s32))
+
+  bb.3:
+    PseudoRET
+...
+---
+name:            ult_zero
+tracksRegLiveness: true
+body:             |
+  ; CHECK-LABEL: name: ult_zero
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT:   liveins: $x10, $x11, $x12, $x13
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr = COPY $x13
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr = COPY $x12
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr = COPY $x11
+  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:gpr = COPY $x10
+  ; CHECK-NEXT:   [[ADDI:%[0-9]+]]:gpr = ADDI $x0, -1
+  ; CHECK-NEXT:   SW killed [[ADDI]], [[COPY3]], 0 :: (store (s32))
+  ; CHECK-NEXT:   [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
+  ; CHECK-NEXT:   BLTU [[COPY2]], killed [[ADDI1]], %bb.2
+  ; CHECK-NEXT:   PseudoBR %bb.1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY4:%[0-9]+]]:gpr = COPY [[COPY2]]
+  ; CHECK-NEXT:   SW [[COPY4]], [[COPY1]], 0 :: (store (s32))
+  ; CHECK-NEXT:   PseudoBR %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[ADDI2:%[0-9]+]]:gpr = ADDI $x0, 87
+  ; CHECK-NEXT:   SW killed [[ADDI2]], [[COPY]], 0 :: (store (s32))
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3:
+  ; CHECK-NEXT:   PseudoRET
+  bb.0:
+    successors: %bb.1, %bb.2
+    liveins: $x10, $x11, $x12, $x13
+
+    %3:gpr = COPY $x13
+    %2:gpr = COPY $x12
+    %1:gpr = COPY $x11
+    %0:gpr = COPY $x10
+    %5:gpr = ADDI $x0, -1
+    SW killed %5, %0, 0 :: (store (s32))
+    %6:gpr = ADDI $x0, 0
+    BLTU %1, killed %6, %bb.2
+    PseudoBR %bb.1
+
+  bb.1:
+    %4:gpr = COPY %1
+    SW %4, %2, 0 :: (store (s32))
+    PseudoBR %bb.3
+
+  bb.2:
+    %7:gpr = ADDI $x0, 87
+    SW killed %7, %3, 0 :: (store (s32))
+
+  bb.3:
+    PseudoRET
+...
+---
+name:            sle_zero
+tracksRegLiveness: true
+body:             |
+  ; CHECK-LABEL: name: sle_zero
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT:   liveins: $x10, $x11, $x12, $x13
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr = COPY $x13
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr = COPY $x12
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr = COPY $x11
+  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:gpr = COPY $x10
+  ; CHECK-NEXT:   [[ADDI:%[0-9]+]]:gpr = ADDI $x0, 1
+  ; CHECK-NEXT:   SW killed [[ADDI]], [[COPY3]], 0 :: (store (s32))
+  ; CHECK-NEXT:   [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
+  ; CHECK-NEXT:   BLT killed [[ADDI1]], [[COPY2]], %bb.2
+  ; CHECK-NEXT:   PseudoBR %bb.1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY4:%[0-9]+]]:gpr = COPY [[COPY2]]
+  ; CHECK-NEXT:   SW [[COPY4]], [[COPY1]], 0 :: (store (s32))
+  ; CHECK-NEXT:   PseudoBR %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[ADDI2:%[0-9]+]]:gpr = ADDI $x0, 87
+  ; CHECK-NEXT:   SW killed [[ADDI2]], [[COPY]], 0 :: (store (s32))
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3:
+  ; CHECK-NEXT:   PseudoRET
+  bb.0:
+    successors: %bb.1, %bb.2
+    liveins: $x10, $x11, $x12, $x13
+
+    %3:gpr = COPY $x13
+    %2:gpr = COPY $x12
+    %1:gpr = COPY $x11
+    %0:gpr = COPY $x10
+    %5:gpr = ADDI $x0, 1
+    SW killed %5, %0, 0 :: (store (s32))
+    %6:gpr = ADDI $x0, 0
+    BLT killed %6, %1, %bb.2
+    PseudoBR %bb.1
+
+  bb.1:
+    %4:gpr = COPY %1
+    SW %4, %2, 0 :: (store (s32))
+    PseudoBR %bb.3
+
+  bb.2:
+    %7:gpr = ADDI $x0, 87
+    SW killed %7, %3, 0 :: (store (s32))
+
+  bb.3:
+    PseudoRET
+...
+---
+name:            slt_zero
+tracksRegLiveness: true
+body:             |
+  ; CHECK-LABEL: name: slt_zero
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT:   liveins: $x10, $x11, $x12, $x13
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr = COPY $x13
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr = COPY $x12
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr = COPY $x11
+  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:gpr = COPY $x10
+  ; CHECK-NEXT:   [[ADDI:%[0-9]+]]:gpr = ADDI $x0, -1
+  ; CHECK-NEXT:   SW killed [[ADDI]], [[COPY3]], 0 :: (store (s32))
+  ; CHECK-NEXT:   [[ADDI1:%[0-9]+]]:gpr = ADDI $x0, 0
+  ; CHECK-NEXT:   BLT [[COPY2]], killed [[ADDI1]], %bb.2
+  ; CHECK-NEXT:   PseudoBR %bb.1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY4:%[0-9]+]]:gpr = COPY [[COPY2]]
+  ; CHECK-NEXT:   SW [[COPY4]], [[COPY1]], 0 :: (store (s32))
+  ; CHECK-NEXT:   PseudoBR %bb.3
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[ADDI2:%[0-9]+]]:gpr = ADDI $x0, 87
+  ; CHECK-NEXT:   SW killed [[ADDI2]], [[COPY]], 0 :: (store (s32))
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3:
+  ; CHECK-NEXT:   PseudoRET
+  bb.0:
+    successors: %bb.1, %bb.2
+    liveins: $x10, $x11, $x12, $x13
+
+    %3:gpr = COPY $x13
+    %2:gpr = COPY $x12
+    %1:gpr = COPY $x11
+    %0:gpr = COPY $x10
+    %5:gpr = ADDI $x0, -1
+    SW killed %5, %0, 0 :: (store (s32))
+    %6:gpr = ADDI $x0, 0
+    BLT %1, killed %6, %bb.2
+    PseudoBR %bb.1
+
+  bb.1:
+    %4:gpr = COPY %1
+    SW %4, %2, 0 :: (store (s32))
+    PseudoBR %bb.3
+
+  bb.2:
+    %7:gpr = ADDI $x0, 87
+    SW killed %7, %3, 0 :: (store (s32))
+
+  bb.3:
+    PseudoRET
+...

// Don't change 0 to -1 since we can use x0.
// For unsigned cases changing -1U to 0 would be incorrect.
if (C0 &&
((CC == RISCVCC::COND_GE || CC == RISCVCC::COND_LT) || C0 != -1)) {
Collaborator:

(A || B) || C is just A || B || C, isn't it?

// Signed integer overflow is UB. (UINT64_MAX is bigger so we don't need
// to worry about unsigned overflow here)
if (C0 < INT64_MAX)
// Don't change 0 to -1 since we can use x0.
Collaborator:

This comment seems off? We'd convert it to +1, wouldn't we?

if (C0 < INT64_MAX)
// Don't change 0 to -1 since we can use x0.
// For unsigned cases changing -1U to 0 would be incorrect.
if (C0 &&
Collaborator:

Style wise, folding the C0 is non-zero check into the prior if-clause might make the code easier to read in both cases.

if ((CC == RISCVCC::COND_GE || CC == RISCVCC::COND_LT) || C0)
// For signed cases we don't want to change 0 since we can use x0.
// For unsigned cases changing 0 to -1U would be incorrect.
if (C0) {
if (Register RegZ = searchConst(C0 - 1)) {
Collaborator:

Ok, staring at this, I'm questioning my sanity. Isn't this just completely unsound?

Given X <=u 22, this produces 21 >u X, doesn't it?

Isn't that just completely wrong for X = 22?

@topperc (Collaborator, Author) commented Jun 24, 2025:

It can only be x < 22 or x >= 22 here. LT/GE are the only valid condition codes. Which will become 21 >= x or 21 < x.
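The equivalence topperc describes can be sanity-checked numerically. A minimal Python sketch (not LLVM code), assuming plain integer arithmetic with no wraparound, which the patch's `assert(isInt<12>(C0))` guarantees here:

```python
# Sketch of the argument above: with only LT/GE available, case 2 turns
# "branch if X < C0" into "branch if C0 - 1 >= X" (reversed, operands
# swapped). Over ordinary integers these are the same predicate.
for c0 in range(-2048, 2048):            # simm12 range, as the assert enforces
    for x in (-2048, -100, -1, 0, 1, 21, 22, 23, 2047):
        assert (x < c0) == (c0 - 1 >= x)
print("x < 22 <=> 21 >= x: case-2 rewrite is sound for LT/GE")
```

In particular, for C0 = 22 the boundary value X = 22 makes both sides false, so the rewrite is exact; it is EQ/NE (handled earlier) and ULT/UGE at the wrap points where this reasoning would fail.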

Collaborator:

Where is that precondition established?

@topperc (Collaborator, Author) commented Jun 24, 2025:

The RISC-V ISA. There are 6 conditional branch instructions: BEQ, BNE, BLT, BGE, BLTU, BGEU.

EQ/NE are excluded earlier. This code handles the remaining 4.

Collaborator:

Ok, revised example.

BLT X, INT_MIN ==> BGE INT_MAX, X

The former is statically false. The latter is statically true.

Collaborator (Author):

Protected by the assert(isInt<12>(C0)).

Collaborator:

Ah! That was the bit I was missing. We can't have INT_MIN here until we handle LUI/ADDI sequences.
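Philip's revised example is exactly the wraparound case the assert excludes. A hypothetical Python sketch of 64-bit two's-complement arithmetic (not LLVM code) makes the failure concrete:

```python
# Hypothetical sketch, not LLVM code: model 64-bit two's-complement
# wraparound to show why C0 = INT64_MIN must be excluded from case 2.
INT64_MIN, INT64_MAX = -(1 << 63), (1 << 63) - 1

def to_i64(v):
    # reduce a Python integer to signed 64-bit two's complement
    v &= (1 << 64) - 1
    return v - (1 << 64) if v >= (1 << 63) else v

c0 = INT64_MIN
assert to_i64(c0 - 1) == INT64_MAX      # C0 - 1 wraps all the way around
for x in (INT64_MIN, -1, 0, 1, INT64_MAX):
    assert not (x < c0)                 # BLT X, INT_MIN: never taken
    assert to_i64(c0 - 1) >= x          # BGE INT64_MAX, X: always taken
print("rewrite flips a never-taken branch into an always-taken one")
```

Since `isFromLoadImm` only matches a single ADDI here, C0 always fits in 12 bits and INT64_MIN cannot reach this code until LUI/ADDI sequences are handled.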

// Signed integer overflow is UB. (UINT64_MAX is bigger so we don't need
// to worry about unsigned overflow here)
if (C0 < INT64_MAX)
// Don't change 0 to -1 since we can use x0.
Collaborator (Author):

Suggested change:
- // Don't change 0 to -1 since we can use x0.
+ // Don't change 0 to 1 since we can use x0.

@preames (Collaborator) left a comment:

LGTM

@mshockwave (Member) left a comment:

LGTM. Good catch

if (isFromLoadImm(MRI, LHS, C0) && C0 != 0 && LHS.getReg().isVirtual() &&
MRI.hasOneUse(LHS.getReg()) &&
(CC == RISCVCC::COND_GE || CC == RISCVCC::COND_LT || C0 != -1)) {
assert(isInt<12>(C0) && "Unexpected immediate");
Member:

I was having the same question as Philip about where we prevent signed overflow from happening, so perhaps we can leave a comment here?

@topperc topperc merged commit 48a21e6 into llvm:main Jun 24, 2025
4 of 6 checks passed
@topperc topperc deleted the pr/optcondbranch branch June 24, 2025 22:45
anthonyhatran pushed a commit to anthonyhatran/llvm-project that referenced this pull request Jun 26, 2025
…zing compare with x0. NFC (llvm#145440)

rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
…zing compare with x0. NFC (llvm#145440)
