Merged
8 changes: 5 additions & 3 deletions llvm/lib/CodeGen/RegAllocEvictionAdvisor.cpp
@@ -224,7 +224,7 @@ bool DefaultEvictionAdvisor::canEvictHintInterference(
const LiveInterval &VirtReg, MCRegister PhysReg,
const SmallVirtRegSet &FixedRegisters) const {
EvictionCost MaxCost;
-MaxCost.setBrokenHints(1);
+MaxCost.setBrokenHints(MRI->getRegClass(VirtReg.reg())->getCopyCost());
return canEvictInterferenceBasedOnCost(VirtReg, PhysReg, true, MaxCost,
FixedRegisters);
}
@@ -300,12 +300,14 @@ bool DefaultEvictionAdvisor::canEvictInterferenceBasedOnCost(
return false;
// We permit breaking cascades for urgent evictions. It should be the
// last resort, though, so make it really expensive.
-Cost.BrokenHints += 10;
+Cost.BrokenHints += 10 * MRI->getRegClass(Intf->reg())->getCopyCost();
Collaborator

@arsenm, does this change take into account that getCopyCost may return -1 to indicate "extremely high cost"?
IIUC that would actually result in subtracting 10 here instead of adding a huge penalty.

(Context is that I'm debugging some downstream problems, as we've been seeing a number of "ran out of registers" failures after merging this PR. Haven't gotten that far in the debugging yet. But the question above popped up.)
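
To make the concern concrete, here is a minimal standalone sketch of the arithmetic. It does not use LLVM headers; `addUrgentPenalty`, `hintBudget` and `demoNegativeCopyCost` are hypothetical names, and the types (an unsigned hint counter, a signed `int` copy cost) are assumptions about the real code.

```cpp
#include <cassert>
#include <limits>

// Stand-in for `Cost.BrokenHints += 10 * getCopyCost()`.
// Assumption: the counter is unsigned and the copy cost is a signed int.
inline unsigned addUrgentPenalty(unsigned BrokenHints, int CopyCost) {
  // 10 * CopyCost is computed as int, then converted to unsigned for the
  // addition, so a negative product wraps around instead of adding a penalty.
  BrokenHints += 10 * CopyCost;
  return BrokenHints;
}

// Stand-in for `MaxCost.setBrokenHints(getCopyCost())` with an unsigned
// parameter: a copy cost of -1 converts to the largest unsigned value.
inline unsigned hintBudget(int CopyCost) {
  return static_cast<unsigned>(CopyCost);
}

// Small self-check covering the cases discussed in the comment.
inline void demoNegativeCopyCost() {
  const unsigned Max = std::numeric_limits<unsigned>::max();
  assert(addUrgentPenalty(25, 1) == 35);      // intended behaviour: +10
  assert(addUrgentPenalty(25, -1) == 15);     // CopyCost = -1: effectively -10
  assert(addUrgentPenalty(0, -1) == Max - 9); // or a wrap to a huge value
  assert(hintBudget(-1) == Max);              // hint budget becomes unbounded
}
```

Under these assumptions the "last resort" penalty either shrinks the accumulated cost or wraps to a very large number, depending on what was already in the counter, so the outcome is neither "a bit more expensive" nor reliably "impossible".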

Collaborator

I tried a hack where I changed all three getCopyCost() calls added here to basically do this (pseudocode):
getCopyCost() < 0 ? 100 : getCopyCost()
and then at least the "ran out of registers" fault I was looking at disappeared (and I got the same result as before this patch).

Our target has some quad-registers that we define with CopyCost=-1. So I figure that is what caused the problem here.
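
The hack described above could be written as a small helper along these lines. This is only a sketch: `clampCopyCost` and `addUrgentPenaltyClamped` are hypothetical names, and the fallback of 100 is simply the value from the pseudocode, not something LLVM defines.

```cpp
#include <cassert>

// Hypothetical helper: map a negative copy cost to a large but finite
// fallback so that later arithmetic on it stays well-defined.
constexpr unsigned clampCopyCost(int CopyCost, unsigned Fallback = 100) {
  return CopyCost < 0 ? Fallback : static_cast<unsigned>(CopyCost);
}

// Stand-in for the urgent-eviction penalty, using the clamped cost.
inline unsigned addUrgentPenaltyClamped(unsigned BrokenHints, int CopyCost) {
  return BrokenHints + 10 * clampCopyCost(CopyCost);
}

// Small self-check of the behaviour with and without a negative cost.
inline void demoClampedCopyCost() {
  assert(clampCopyCost(-1) == 100);                  // negative: use fallback
  assert(clampCopyCost(2) == 2);                     // non-negative: unchanged
  assert(addUrgentPenaltyClamped(25, -1) == 1025);   // 25 + 10 * 100
  assert(addUrgentPenaltyClamped(25, 1) == 35);      // 25 + 10 * 1
}
```

With the clamp in place a class declared with CopyCost=-1 is treated as expensive rather than wrapping, which fits the report that the "ran out of registers" fault went away.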

Contributor Author

I believe -1 was supposed to mean impossible cost, not very high

Contributor Author

But this should probably be an unsigned value

Collaborator

Or we should explicitly use saturating arithmetic here. Either way, probably worth a revert and reapply with a fix.
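
A rough sketch of what the saturating variant might look like, written standalone rather than against LLVM. The helpers `satAdd`, `satMul` and `saturatingPenalty` are hypothetical, and treating a negative copy cost as "maximally expensive" is an assumption about the intended meaning. LLVM's `llvm/Support/MathExtras.h` provides helpers of this kind (`SaturatingAdd`, `SaturatingMultiply`), which a real fix would more likely use.

```cpp
#include <cassert>
#include <limits>

// Largest value of the (assumed unsigned) hint counter.
constexpr unsigned kMaxHints = std::numeric_limits<unsigned>::max();

// Add two unsigned values, clamping at kMaxHints instead of wrapping.
inline unsigned satAdd(unsigned A, unsigned B) {
  return A > kMaxHints - B ? kMaxHints : A + B;
}

// Multiply two unsigned values, clamping at kMaxHints instead of wrapping.
inline unsigned satMul(unsigned A, unsigned B) {
  if (A == 0 || B == 0)
    return 0;
  return A > kMaxHints / B ? kMaxHints : A * B;
}

// Urgent-eviction penalty with saturation. Assumption: a negative copy
// cost means "as expensive as possible".
inline unsigned saturatingPenalty(unsigned BrokenHints, int CopyCost) {
  unsigned Cost = CopyCost < 0 ? kMaxHints : static_cast<unsigned>(CopyCost);
  return satAdd(BrokenHints, satMul(10, Cost));
}

// Small self-check of the saturating behaviour.
inline void demoSaturatingPenalty() {
  assert(satAdd(kMaxHints, 1) == kMaxHints);          // no wraparound
  assert(satMul(10, kMaxHints) == kMaxHints);         // no wraparound
  assert(saturatingPenalty(25, 2) == 45);             // 25 + 10 * 2
  assert(saturatingPenalty(25, -1) == kMaxHints);     // pinned at the maximum
}
```

The design choice here is that a negative cost pins the counter at its maximum, so the eviction can never look cheaper than one with an ordinary copy cost.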

Collaborator

Ok, so what do we do here? CopyCost=-1 is used in several places.

@preames suggested a revert and reapply with fix. Will you do that @arsenm?

Contributor Author

I'm leaning towards no, and just rejecting -1 as a CopyCost. I haven't found anywhere that actually treats it as an impossible value.

Contributor Author

All of the in-tree cases using -1 copy costs appear to be cargo-cult application to non-allocatable physical registers. The only place making use of the negative cost check is in InstrEmitter, which is probably better served by a non-allocatable check. I actually see a number of test improvements when I swap this out for the direct allocatability check
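
As a toy illustration of the difference between the two checks mentioned above (this does not reproduce InstrEmitter; `RegClassModel` and both predicates are hypothetical stand-ins, with a plain `Allocatable` flag in place of the real register class query):

```cpp
#include <cassert>

// Hypothetical model of the two register class properties under discussion.
struct RegClassModel {
  int CopyCost;     // may be -1 in some target descriptions
  bool Allocatable; // whether the allocator may assign from this class
};

// The existing style of check: a negative copy cost marks the class.
inline bool rejectedByCostCheck(const RegClassModel &RC) {
  return RC.CopyCost < 0;
}

// The proposed style of check: ask about allocatability directly.
inline bool rejectedByAllocatableCheck(const RegClassModel &RC) {
  return !RC.Allocatable;
}

// Small self-check over three illustrative classes.
inline void demoCostVersusAllocatable() {
  const RegClassModel Flags{-1, false}; // non-allocatable, tagged with -1
  const RegClassModel Quad{-1, true};   // allocatable, but tagged with -1
  const RegClassModel Plain{1, true};   // ordinary allocatable class

  // The two checks agree where -1 was only ever a proxy for "non-allocatable".
  assert(rejectedByCostCheck(Flags) && rejectedByAllocatableCheck(Flags));
  assert(!rejectedByCostCheck(Plain) && !rejectedByAllocatableCheck(Plain));

  // They differ for an allocatable class that happens to carry -1.
  assert(rejectedByCostCheck(Quad) && !rejectedByAllocatableCheck(Quad));
}
```

The `Quad` case corresponds to the downstream situation described earlier in the thread, where an allocatable class is declared with CopyCost=-1.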

}
// Would this break a satisfied hint?
bool BreaksHint = VRM->hasPreferredPhys(Intf->reg());
// Update eviction cost.
-Cost.BrokenHints += BreaksHint;
+if (BreaksHint)
+  Cost.BrokenHints += MRI->getRegClass(Intf->reg())->getCopyCost();
+
Cost.MaxWeight = std::max(Cost.MaxWeight, Intf->weight());
// Abort if this would be too expensive.
if (Cost >= MaxCost)
19 changes: 8 additions & 11 deletions llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.dim.gfx90a.ll
@@ -18,22 +18,19 @@ define amdgpu_ps <4 x float> @load_1d_lwe(<8 x i32> inreg %rsrc, ptr addrspace(1
; GCN-LABEL: load_1d_lwe:
; GCN: ; %bb.0: ; %main_body
; GCN-NEXT: v_mov_b32_e32 v8, 0
+; GCN-NEXT: v_mov_b32_e32 v6, v0
; GCN-NEXT: v_mov_b32_e32 v9, v8
; GCN-NEXT: v_mov_b32_e32 v10, v8
; GCN-NEXT: v_mov_b32_e32 v11, v8
; GCN-NEXT: v_mov_b32_e32 v12, v8
-; GCN-NEXT: v_mov_b32_e32 v2, v8
-; GCN-NEXT: v_mov_b32_e32 v3, v9
-; GCN-NEXT: v_mov_b32_e32 v4, v10
-; GCN-NEXT: v_mov_b32_e32 v5, v11
-; GCN-NEXT: v_mov_b32_e32 v6, v12
-; GCN-NEXT: image_load v[2:6], v0, s[0:7] dmask:0xf unorm lwe
+; GCN-NEXT: v_mov_b32_e32 v0, v8
+; GCN-NEXT: v_mov_b32_e32 v1, v9
+; GCN-NEXT: v_mov_b32_e32 v2, v10
+; GCN-NEXT: v_mov_b32_e32 v3, v11
+; GCN-NEXT: v_mov_b32_e32 v4, v12
+; GCN-NEXT: image_load v[0:4], v6, s[0:7] dmask:0xf unorm lwe
; GCN-NEXT: s_waitcnt vmcnt(0)
-; GCN-NEXT: v_mov_b32_e32 v0, v2
-; GCN-NEXT: v_mov_b32_e32 v1, v3
-; GCN-NEXT: v_mov_b32_e32 v2, v4
-; GCN-NEXT: v_mov_b32_e32 v3, v5
-; GCN-NEXT: global_store_dword v8, v6, s[8:9]
+; GCN-NEXT: global_store_dword v8, v4, s[8:9]
; GCN-NEXT: s_waitcnt vmcnt(0)
; GCN-NEXT: ; return to shader part epilog
main_body:
3,112 changes: 1,556 additions & 1,556 deletions llvm/test/CodeGen/RISCV/rvv/vloxseg-rv32.ll

Large diffs are not rendered by default.

3,784 changes: 1,892 additions & 1,892 deletions llvm/test/CodeGen/RISCV/rvv/vloxseg-rv64.ll

Large diffs are not rendered by default.

3,112 changes: 1,556 additions & 1,556 deletions llvm/test/CodeGen/RISCV/rvv/vluxseg-rv32.ll

Large diffs are not rendered by default.

3,812 changes: 1,906 additions & 1,906 deletions llvm/test/CodeGen/RISCV/rvv/vluxseg-rv64.ll

Large diffs are not rendered by default.

7 changes: 3 additions & 4 deletions llvm/test/CodeGen/RISCV/zilsd.ll
@@ -7,10 +7,9 @@
define i64 @load(ptr %a) nounwind {
; CHECK-LABEL: load:
; CHECK: # %bb.0:
-; CHECK-NEXT: ld a2, 80(a0)
-; CHECK-NEXT: ld zero, 0(a0)
-; CHECK-NEXT: mv a0, a2
-; CHECK-NEXT: mv a1, a3
+; CHECK-NEXT: mv a2, a0
+; CHECK-NEXT: ld a0, 80(a0)
+; CHECK-NEXT: ld zero, 0(a2)
; CHECK-NEXT: ret
%1 = getelementptr i64, ptr %a, i32 10
%2 = load i64, ptr %1