[CodeGen] Renumber slot indexes before register allocation #66334

Closed
wants to merge 1 commit into from

jayfoad (Contributor) commented Sep 14, 2023

RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate
the length of a live range for its heuristics. Renumbering all slot
indexes with the default instruction distance ensures that this estimate
will be as accurate as possible, and will not depend on the history of
how instructions have been added to and removed from SlotIndexes's maps.

This also means that enabling -early-live-intervals, which runs the
SlotIndexes analysis earlier, will not cause large amounts of churn due
to different register allocator decisions.
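
The history dependence described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not LLVM's SlotIndexes implementation: the names `numberFresh`, `insertBefore`, `approxInstrDistance` and `renumber`, the constant values, and the midpoint insertion policy are all hypothetical and chosen for illustration. The idea is that each instruction gets an integer index, new instructions are squeezed into the gap between their neighbours, and a distance estimate derived from index differences drifts away from the real instruction count until everything is renumbered with the default spacing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: constants and names are illustrative assumptions,
// not LLVM's actual SlotIndexes code.
constexpr unsigned kSlotsPerInstr = 4;                      // sub-slots per instruction
constexpr unsigned kDefaultInstrDist = 4 * kSlotsPerInstr;  // default gap between neighbours

// One index per instruction, kept in program order.
using IndexList = std::vector<unsigned>;

// Number a fresh sequence of `count` instructions with the default gap.
IndexList numberFresh(std::size_t count) {
  IndexList indexes;
  for (std::size_t i = 0; i < count; ++i)
    indexes.push_back(static_cast<unsigned>(i) * kDefaultInstrDist);
  return indexes;
}

// Insert a new instruction before position `pos`, giving it the midpoint
// of its neighbours' indexes (a hypothetical local-insertion policy).
void insertBefore(IndexList &indexes, std::size_t pos) {
  assert(pos > 0 && pos < indexes.size() && "need a neighbour on both sides");
  unsigned mid = (indexes[pos - 1] + indexes[pos]) / 2;
  assert(mid > indexes[pos - 1] && "gap exhausted; a real map would renumber here");
  indexes.insert(indexes.begin() + static_cast<std::ptrdiff_t>(pos), mid);
}

// Estimate how many instructions lie between two positions using only the
// index difference, assuming every instruction sits one default gap apart.
unsigned approxInstrDistance(const IndexList &indexes, std::size_t from,
                             std::size_t to) {
  assert(from <= to && to < indexes.size());
  return (indexes[to] - indexes[from]) / kDefaultInstrDist;
}

// Reassign every index using the default gap, discarding insertion history.
void renumber(IndexList &indexes) {
  for (std::size_t i = 0; i < indexes.size(); ++i)
    indexes[i] = static_cast<unsigned>(i) * kDefaultInstrDist;
}
```

In this model, starting from four instructions and inserting two more before position 1 leaves six instructions whose first-to-last estimate is 3, because the new instructions share one original gap. After `renumber` the same estimate is 5, matching the real number of steps between the first and last instruction.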

jayfoad (Contributor, Author) commented Sep 14, 2023

https://llvm-compile-time-tracker.com/ shows a geomean 0.00% change.

TODO: I still need to update ~15 tests with manual checks.

llvmbot (Collaborator) commented Sep 15, 2023

@llvm/pr-subscribers-llvm-transforms

Changes

RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate the length of a live range for its heuristics. Renumbering all slot indexes with the default instruction distance ensures that this estimate will be as accurate as possible, and will not depend on the history of how instructions have been added to and removed from SlotIndexes's maps.

This also means that enabling -early-live-intervals, which runs the
SlotIndexes analysis earlier, will not cause large amounts of churn due
to different register allocator decisions.

--

Patch is 31.92 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66334.diff

330 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/SlotIndexes.h (+3)
  • (modified) llvm/lib/CodeGen/RegAllocGreedy.cpp (+3)
  • (modified) llvm/lib/CodeGen/SlotIndexes.cpp (+5)
  • (modified) llvm/test/CodeGen/AArch64/active_lane_mask.ll (+45-45)
  • (modified) llvm/test/CodeGen/AArch64/arm64-addr-type-promotion.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/arm64-cse.ll (+5-5)
  • (modified) llvm/test/CodeGen/AArch64/arm64-shrink-wrapping.ll (+22-22)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions-predicated-scalable.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-uniform-cases.ll (+11-11)
  • (modified) llvm/test/CodeGen/AArch64/extbinopload.ll (+73-73)
  • (modified) llvm/test/CodeGen/AArch64/faddp-half.ll (+5-5)
  • (modified) llvm/test/CodeGen/AArch64/fcvt_combine.ll (+42-42)
  • (modified) llvm/test/CodeGen/AArch64/fdiv.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/fpow.ll (+93-93)
  • (modified) llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll (+252-252)
  • (modified) llvm/test/CodeGen/AArch64/fptoui-sat-vector.ll (+88-90)
  • (modified) llvm/test/CodeGen/AArch64/frem.ll (+93-93)
  • (modified) llvm/test/CodeGen/AArch64/llvm.exp10.ll (+5-6)
  • (modified) llvm/test/CodeGen/AArch64/neon-dotreduce.ll (+515-515)
  • (modified) llvm/test/CodeGen/AArch64/neon-extadd.ll (+26-26)
  • (modified) llvm/test/CodeGen/AArch64/pow.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/ragreedy-local-interval-cost.ll (+109-116)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll (+18-18)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-length-shuffles.ll (+26-26)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-to-int.ll (+50-50)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-div.ll (+40-40)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll (+74-72)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-zip-uzp-trn.ll (+95-95)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll (+11-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll (+204-204)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll (+1439-1434)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll (+119-119)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll (+1066-1062)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll (+312-312)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll (+297-296)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll (+268-268)
  • (modified) llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll (+79-79)
  • (modified) llvm/test/CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll (+100-100)
  • (modified) llvm/test/CodeGen/AMDGPU/bswap.ll (+44-44)
  • (modified) llvm/test/CodeGen/AMDGPU/bug-sdag-emitcopyfromreg.ll (+7-7)
  • (modified) llvm/test/CodeGen/AMDGPU/bypass-div.ll (+103-103)
  • (modified) llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll (+19-19)
  • (modified) llvm/test/CodeGen/AMDGPU/ctpop16.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/cttz_zero_undef.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/extract-subvector-16bit.ll (+41-41)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-scratch-init.ll (+4)
  • (modified) llvm/test/CodeGen/AMDGPU/fp_to_sint.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/fsqrt.f32.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/gfx-callable-return-types.ll (+243-242)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i32_system.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64_system.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/half.ll (+90-90)
  • (modified) llvm/test/CodeGen/AMDGPU/idot8s.ll (+64-64)
  • (modified) llvm/test/CodeGen/AMDGPU/indirect-call.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/insert-delay-alu-bug.ll (+9-9)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll (+95-95)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll (+170-170)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.exp2.ll (+13-13)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.log2.ll (+13-13)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i1.ll (+1323-1318)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i16.ll (+333-333)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i32.ll (+56-54)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i8.ll (+554-550)
  • (modified) llvm/test/CodeGen/AMDGPU/load-global-i16.ll (+1059-1053)
  • (modified) llvm/test/CodeGen/AMDGPU/mul.ll (+41-41)
  • (modified) llvm/test/CodeGen/AMDGPU/pr51516.mir (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/rsq.f32.ll (+60-60)
  • (modified) llvm/test/CodeGen/AMDGPU/scc-clobbered-sgpr-to-vmem-spill.ll (+125-125)
  • (modified) llvm/test/CodeGen/AMDGPU/sdiv.ll (+168-168)
  • (modified) llvm/test/CodeGen/AMDGPU/sdiv64.ll (+19-19)
  • (modified) llvm/test/CodeGen/AMDGPU/select.f16.ll (+15-15)
  • (modified) llvm/test/CodeGen/AMDGPU/si-spill-sgpr-stack.ll (+4-1)
  • (modified) llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll (+27-30)
  • (modified) llvm/test/CodeGen/AMDGPU/spill-vgpr.ll (+4-4)
  • (modified) llvm/test/CodeGen/AMDGPU/srem64.ll (+51-51)
  • (modified) llvm/test/CodeGen/AMDGPU/swdev373493.ll (+9-9)
  • (modified) llvm/test/CodeGen/AMDGPU/swdev380865.ll (+2-4)
  • (modified) llvm/test/CodeGen/AMDGPU/udiv.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/urem64.ll (+15-15)
  • (modified) llvm/test/CodeGen/AMDGPU/vni8-across-blocks.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/wave32.ll (+29-29)
  • (modified) llvm/test/CodeGen/Hexagon/atomicrmw-uinc-udec-wrap.ll (+6-6)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/fp-to-int.ll (+62-60)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/int-to-fp.ll (+417-423)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/isel-truncate.ll (+4-4)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/vmpy-parts.ll (+17-17)
  • (modified) llvm/test/CodeGen/Hexagon/ifcvt-diamond-bug-2016-08-26.ll (-1)
  • (modified) llvm/test/CodeGen/Hexagon/ntstbit.ll (+3-3)
  • (modified) llvm/test/CodeGen/Hexagon/signext-inreg.ll (+23-23)
  • (modified) llvm/test/CodeGen/Hexagon/swp-conv3x3-nested.ll (+2-1)
  • (modified) llvm/test/CodeGen/Hexagon/swp-stages4.ll (+1)
  • (modified) llvm/test/CodeGen/PowerPC/all-atomics.ll (+29-29)
  • (modified) llvm/test/CodeGen/PowerPC/atomics.ll (+21-21)
  • (modified) llvm/test/CodeGen/PowerPC/inc-of-add.ll (+71-69)
  • (modified) llvm/test/CodeGen/PowerPC/ldst-16-byte.mir (+112-107)
  • (modified) llvm/test/CodeGen/PowerPC/loop-instr-form-prepare.ll (+12-12)
  • (modified) llvm/test/CodeGen/PowerPC/more-dq-form-prepare.ll (+119-119)
  • (modified) llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll (+159-159)
  • (modified) llvm/test/CodeGen/PowerPC/sat-add.ll (+15-15)
  • (modified) llvm/test/CodeGen/PowerPC/srem-vector-lkk.ll (+23-23)
  • (modified) llvm/test/CodeGen/PowerPC/sub-of-not.ll (+71-69)
  • (modified) llvm/test/CodeGen/PowerPC/tocSaveInPrologue.ll (+9-9)
  • (modified) llvm/test/CodeGen/PowerPC/umulo-128-legalisation-lowering.ll (+22-22)
  • (modified) llvm/test/CodeGen/PowerPC/urem-vector-lkk.ll (+10-10)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp32_to_i16_elts.ll (+186-186)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp32_to_i8_elts.ll (+102-102)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp64_to_i16_elts.ll (+118-118)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp64_to_i8_elts.ll (+122-122)
  • (modified) llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll (+48-48)
  • (modified) llvm/test/CodeGen/RISCV/branch-relaxation.ll (+211-207)
  • (modified) llvm/test/CodeGen/RISCV/callee-saved-gprs.ll (+96-96)
  • (modified) llvm/test/CodeGen/RISCV/early-clobber-tied-def-subreg-liveness.ll (+7-7)
  • (modified) llvm/test/CodeGen/RISCV/fpclamptosat.ll (+32-32)
  • (modified) llvm/test/CodeGen/RISCV/mul.ll (+28-28)
  • (modified) llvm/test/CodeGen/RISCV/overflow-intrinsics.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/push-pop-popret.ll (+144-144)
  • (modified) llvm/test/CodeGen/RISCV/rv32zbb-zbkb.ll (+17-17)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bitreverse-sdnode.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bswap-sdnode.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll (+27-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ctpop-vp.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ceil-vp.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ctlz-vp.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ctpop-vp.ll (+31-82)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-cttz-vp.ll (+90-90)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-floor-vp.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-explodevector.ll (+43-43)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll (+87-100)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-store-int.ll (+24-11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-nearbyint-vp.ll (+29-13)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-rint-vp.ll (+37-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-round-vp.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundeven-vp.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundtozero-vp.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-int-vp.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vscale-range.ll (+38-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll (+22-12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/floor-vp.ll (+27-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fpclamptosat_vec.ll (+44-44)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fshr-fshl-vp.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/nearbyint-vp.ll (+51-16)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rint-vp.ll (+32-21)
  • (modified) llvm/test/CodeGen/RISCV/rvv/round-vp.ll (+27-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundeven-vp.ll (+27-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundtozero-vp.ll (+27-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/setcc-fp-vp.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/shuffle-reverse.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll (+126-126)
  • (modified) llvm/test/CodeGen/RISCV/rvv/strided-vpstore.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-load.ll (+16-32)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll (+43-73)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll (+11-10)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmadd-sdnode.ll (+19-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmuladd-vp.ll (+11-10)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll (+9-9)
  • (modified) llvm/test/CodeGen/RISCV/shifts.ll (+72-72)
  • (modified) llvm/test/CodeGen/RISCV/srem-vector-lkk.ll (+50-50)
  • (modified) llvm/test/CodeGen/RISCV/stack-store-check.ll (+41-41)
  • (modified) llvm/test/CodeGen/RISCV/umulo-128-legalisation-lowering.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/urem-vector-lkk.ll (+50-50)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-by-byte-multiple-legalization.ll (+224-224)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-legalization.ll (+590-590)
  • (modified) llvm/test/CodeGen/SPARC/smulo-128-legalisation-lowering.ll (+35-35)
  • (modified) llvm/test/CodeGen/SPARC/umulo-128-legalisation-lowering.ll (+66-66)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-01.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-02.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-03.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-04.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-06.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-07.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-08.ll (+1-1)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/spillingmove.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/AMX/amx-greedy-ra-spill-shape.ll (+87-83)
  • (modified) llvm/test/CodeGen/X86/abs.ll (+61-62)
  • (modified) llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast_from_memory.ll (+102-102)
  • (modified) llvm/test/CodeGen/X86/avg.ll (+268-269)
  • (modified) llvm/test/CodeGen/X86/avx512-calling-conv.ll (+324-334)
  • (modified) llvm/test/CodeGen/X86/avx512-regcall-NoMask.ll (+79-78)
  • (modified) llvm/test/CodeGen/X86/avx512bw-intrinsics-upgrade.ll (+32-28)
  • (modified) llvm/test/CodeGen/X86/bitreverse.ll (+30-31)
  • (modified) llvm/test/CodeGen/X86/combine-rotates.ll (+8-8)
  • (modified) llvm/test/CodeGen/X86/dagcombine-cse.ll (+8-8)
  • (modified) llvm/test/CodeGen/X86/div-rem-pair-recomposition-signed.ll (+280-279)
  • (modified) llvm/test/CodeGen/X86/div-rem-pair-recomposition-unsigned.ll (+239-237)
  • (modified) llvm/test/CodeGen/X86/fma.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/fold-tied-op.ll (+40-41)
  • (modified) llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll (+15-13)
  • (modified) llvm/test/CodeGen/X86/fshl.ll (+29-29)
  • (modified) llvm/test/CodeGen/X86/fshr.ll (+32-32)
  • (modified) llvm/test/CodeGen/X86/funnel-shift.ll (+6-6)
  • (modified) llvm/test/CodeGen/X86/haddsub-2.ll (+14-14)
  • (modified) llvm/test/CodeGen/X86/hoist-and-by-const-from-shl-in-eqcmp-zero.ll (+7-7)
  • (modified) llvm/test/CodeGen/X86/horizontal-sum.ll (+13-14)
  • (modified) llvm/test/CodeGen/X86/i128-mul.ll (+35-35)
  • (modified) llvm/test/CodeGen/X86/legalize-shl-vec.ll (+40-38)
  • (modified) llvm/test/CodeGen/X86/machine-cp.ll (+5-5)
  • (modified) llvm/test/CodeGen/X86/masked_store_trunc_ssat.ll (+14-14)
  • (modified) llvm/test/CodeGen/X86/matrix-multiply.ll (+908-896)
  • (modified) llvm/test/CodeGen/X86/midpoint-int-vec-128.ll (+49-49)
  • (modified) llvm/test/CodeGen/X86/mmx-arith.ll (+15-15)
  • (modified) llvm/test/CodeGen/X86/mul-i1024.ll (+3483-3464)
  • (modified) llvm/test/CodeGen/X86/mul-i256.ll (+169-165)
  • (modified) llvm/test/CodeGen/X86/mul-i512.ll (+864-862)
  • (modified) llvm/test/CodeGen/X86/mul128.ll (+37-38)
  • (modified) llvm/test/CodeGen/X86/muloti.ll (+15-15)
  • (modified) llvm/test/CodeGen/X86/musttail-varargs.ll (+24-24)
  • (modified) llvm/test/CodeGen/X86/oddshuffles.ll (+55-55)
  • (modified) llvm/test/CodeGen/X86/oddsubvector.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/optimize-max-0.ll (+89-92)
  • (modified) llvm/test/CodeGen/X86/overflow.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/pr43820.ll (+80-83)
  • (modified) llvm/test/CodeGen/X86/pr46527.ll (+5-5)
  • (modified) llvm/test/CodeGen/X86/pr46877.ll (+122-121)
  • (modified) llvm/test/CodeGen/X86/pr57340.ll (+29-29)
  • (modified) llvm/test/CodeGen/X86/pr59258.ll (+6-6)
  • (modified) llvm/test/CodeGen/X86/psubus.ll (+30-30)
  • (modified) llvm/test/CodeGen/X86/sdiv_fix.ll (+48-48)
  • (modified) llvm/test/CodeGen/X86/sdiv_fix_sat.ll (+87-87)
  • (modified) llvm/test/CodeGen/X86/select.ll (+31-31)
  • (modified) llvm/test/CodeGen/X86/sext-vsetcc.ll (+33-33)
  • (modified) llvm/test/CodeGen/X86/shift-and.ll (+8-10)
  • (modified) llvm/test/CodeGen/X86/shift-i128.ll (+53-54)
  • (modified) llvm/test/CodeGen/X86/shift-i256.ll (+200-15)
  • (modified) llvm/test/CodeGen/X86/shrink_vmul.ll (+37-37)
  • (modified) llvm/test/CodeGen/X86/smax.ll (+21-21)
  • (modified) llvm/test/CodeGen/X86/smin.ll (+21-21)
  • (modified) llvm/test/CodeGen/X86/smul-with-overflow.ll (+442-451)
  • (modified) llvm/test/CodeGen/X86/smul_fix.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/smul_fix_sat.ll (+82-81)
  • (modified) llvm/test/CodeGen/X86/smulo-128-legalisation-lowering.ll (+717-739)
  • (modified) llvm/test/CodeGen/X86/srem-vector-lkk.ll (+30-30)
  • (modified) llvm/test/CodeGen/X86/sse-regcall.ll (+101-100)
  • (modified) llvm/test/CodeGen/X86/sse-regcall4.ll (+95-94)
  • (modified) llvm/test/CodeGen/X86/sshl_sat_vec.ll (+66-67)
  • (modified) llvm/test/CodeGen/X86/statepoint-live-in.ll (+11-11)
  • (modified) llvm/test/CodeGen/X86/statepoint-regs.ll (+11-11)
  • (modified) llvm/test/CodeGen/X86/sttni.ll (+12-12)
  • (modified) llvm/test/CodeGen/X86/subvectorwise-store-of-vector-splat.ll (+146-146)
  • (modified) llvm/test/CodeGen/X86/swifterror.ll (+12-12)
  • (modified) llvm/test/CodeGen/X86/umax.ll (+46-46)
  • (modified) llvm/test/CodeGen/X86/umin.ll (+21-21)
  • (modified) llvm/test/CodeGen/X86/umul-with-overflow.ll (+275-284)
  • (modified) llvm/test/CodeGen/X86/umul_fix.ll (+12-12)
  • (modified) llvm/test/CodeGen/X86/umul_fix_sat.ll (+20-20)
  • (modified) llvm/test/CodeGen/X86/umulo-128-legalisation-lowering.ll (+57-55)
  • (modified) llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll (+372-364)
  • (modified) llvm/test/CodeGen/X86/ushl_sat_vec.ll (+53-50)
  • (modified) llvm/test/CodeGen/X86/var-permute-128.ll (+48-48)
  • (modified) llvm/test/CodeGen/X86/var-permute-512.ll (+40-40)
  • (modified) llvm/test/CodeGen/X86/vec-strict-fptoint-512.ll (+32-32)
  • (modified) llvm/test/CodeGen/X86/vec_smulo.ll (+462-460)
  • (modified) llvm/test/CodeGen/X86/vec_uaddo.ll (+28-28)
  • (modified) llvm/test/CodeGen/X86/vec_umulo.ll (+253-253)
  • (modified) llvm/test/CodeGen/X86/vec_usubo.ll (+28-28)
  • (modified) llvm/test/CodeGen/X86/vector-bo-select.ll (+68-68)
  • (modified) llvm/test/CodeGen/X86/vector-compare-results.ll (+15-15)
  • (modified) llvm/test/CodeGen/X86/vector-half-conversions.ll (+20-20)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-2.ll (+25-26)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-3.ll (+1055-1058)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-4.ll (+2114-2113)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-5.ll (+4269-4265)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-6.ll (+6594-6589)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-7.ll (+10528-10522)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-8.ll (+5099-5118)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-2.ll (+203-203)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-3.ll (+1517-1508)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-4.ll (+1432-1455)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-5.ll (+2177-2181)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-6.ll (+5319-5296)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-7.ll (+6096-6104)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-8.ll (+6864-10487)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-2.ll (+536-536)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-3.ll (+980-1000)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-4.ll (+1383-1374)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-5.ll (+3055-3082)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-6.ll (+3263-3240)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-7.ll (+8001-7964)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-8.ll (+4915-4893)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-3.ll (+721-717)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-4.ll (+546-542)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-5.ll (+2028-2035)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-6.ll (+2753-2726)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-7.ll (+8102-7984)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-8.ll (+3809-3791)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-3.ll (+680-679)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-4.ll (+75-75)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride...

  • (modified) llvm/test/CodeGen/RISCV/rvv/fshr-fshl-vp.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/nearbyint-vp.ll (+51-16)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rint-vp.ll (+32-21)
  • (modified) llvm/test/CodeGen/RISCV/rvv/round-vp.ll (+27-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundeven-vp.ll (+27-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundtozero-vp.ll (+27-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/setcc-fp-vp.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/shuffle-reverse.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll (+126-126)
  • (modified) llvm/test/CodeGen/RISCV/rvv/strided-vpstore.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-load.ll (+16-32)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll (+43-73)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll (+11-10)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmadd-sdnode.ll (+19-19)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmuladd-vp.ll (+11-10)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll (+9-9)
  • (modified) llvm/test/CodeGen/RISCV/shifts.ll (+72-72)
  • (modified) llvm/test/CodeGen/RISCV/srem-vector-lkk.ll (+50-50)
  • (modified) llvm/test/CodeGen/RISCV/stack-store-check.ll (+41-41)
  • (modified) llvm/test/CodeGen/RISCV/umulo-128-legalisation-lowering.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/urem-vector-lkk.ll (+50-50)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-by-byte-multiple-legalization.ll (+224-224)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-legalization.ll (+590-590)
  • (modified) llvm/test/CodeGen/SPARC/smulo-128-legalisation-lowering.ll (+35-35)
  • (modified) llvm/test/CodeGen/SPARC/umulo-128-legalisation-lowering.ll (+66-66)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-01.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-02.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-03.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-04.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-06.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-07.ll (+1-1)
  • (modified) llvm/test/CodeGen/SystemZ/int-conv-08.ll (+1-1)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/spillingmove.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/AMX/amx-greedy-ra-spill-shape.ll (+87-83)
  • (modified) llvm/test/CodeGen/X86/abs.ll (+61-62)
  • (modified) llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast_from_memory.ll (+102-102)
  • (modified) llvm/test/CodeGen/X86/avg.ll (+268-269)
  • (modified) llvm/test/CodeGen/X86/avx512-calling-conv.ll (+324-334)
  • (modified) llvm/test/CodeGen/X86/avx512-regcall-NoMask.ll (+79-78)
  • (modified) llvm/test/CodeGen/X86/avx512bw-intrinsics-upgrade.ll (+32-28)
  • (modified) llvm/test/CodeGen/X86/bitreverse.ll (+30-31)
  • (modified) llvm/test/CodeGen/X86/combine-rotates.ll (+8-8)
  • (modified) llvm/test/CodeGen/X86/dagcombine-cse.ll (+8-8)
  • (modified) llvm/test/CodeGen/X86/div-rem-pair-recomposition-signed.ll (+280-279)
  • (modified) llvm/test/CodeGen/X86/div-rem-pair-recomposition-unsigned.ll (+239-237)
  • (modified) llvm/test/CodeGen/X86/fma.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/fold-tied-op.ll (+40-41)
  • (modified) llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll (+15-13)
  • (modified) llvm/test/CodeGen/X86/fshl.ll (+29-29)
  • (modified) llvm/test/CodeGen/X86/fshr.ll (+32-32)
  • (modified) llvm/test/CodeGen/X86/funnel-shift.ll (+6-6)
  • (modified) llvm/test/CodeGen/X86/haddsub-2.ll (+14-14)
  • (modified) llvm/test/CodeGen/X86/hoist-and-by-const-from-shl-in-eqcmp-zero.ll (+7-7)
  • (modified) llvm/test/CodeGen/X86/horizontal-sum.ll (+13-14)
  • (modified) llvm/test/CodeGen/X86/i128-mul.ll (+35-35)
  • (modified) llvm/test/CodeGen/X86/legalize-shl-vec.ll (+40-38)
  • (modified) llvm/test/CodeGen/X86/machine-cp.ll (+5-5)
  • (modified) llvm/test/CodeGen/X86/masked_store_trunc_ssat.ll (+14-14)
  • (modified) llvm/test/CodeGen/X86/matrix-multiply.ll (+908-896)
  • (modified) llvm/test/CodeGen/X86/midpoint-int-vec-128.ll (+49-49)
  • (modified) llvm/test/CodeGen/X86/mmx-arith.ll (+15-15)
  • (modified) llvm/test/CodeGen/X86/mul-i1024.ll (+3483-3464)
  • (modified) llvm/test/CodeGen/X86/mul-i256.ll (+169-165)
  • (modified) llvm/test/CodeGen/X86/mul-i512.ll (+864-862)
  • (modified) llvm/test/CodeGen/X86/mul128.ll (+37-38)
  • (modified) llvm/test/CodeGen/X86/muloti.ll (+15-15)
  • (modified) llvm/test/CodeGen/X86/musttail-varargs.ll (+24-24)
  • (modified) llvm/test/CodeGen/X86/oddshuffles.ll (+55-55)
  • (modified) llvm/test/CodeGen/X86/oddsubvector.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/optimize-max-0.ll (+89-92)
  • (modified) llvm/test/CodeGen/X86/overflow.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/pr43820.ll (+80-83)
  • (modified) llvm/test/CodeGen/X86/pr46527.ll (+5-5)
  • (modified) llvm/test/CodeGen/X86/pr46877.ll (+122-121)
  • (modified) llvm/test/CodeGen/X86/pr57340.ll (+29-29)
  • (modified) llvm/test/CodeGen/X86/pr59258.ll (+6-6)
  • (modified) llvm/test/CodeGen/X86/psubus.ll (+30-30)
  • (modified) llvm/test/CodeGen/X86/sdiv_fix.ll (+48-48)
  • (modified) llvm/test/CodeGen/X86/sdiv_fix_sat.ll (+87-87)
  • (modified) llvm/test/CodeGen/X86/select.ll (+31-31)
  • (modified) llvm/test/CodeGen/X86/sext-vsetcc.ll (+33-33)
  • (modified) llvm/test/CodeGen/X86/shift-and.ll (+8-10)
  • (modified) llvm/test/CodeGen/X86/shift-i128.ll (+53-54)
  • (modified) llvm/test/CodeGen/X86/shift-i256.ll (+200-15)
  • (modified) llvm/test/CodeGen/X86/shrink_vmul.ll (+37-37)
  • (modified) llvm/test/CodeGen/X86/smax.ll (+21-21)
  • (modified) llvm/test/CodeGen/X86/smin.ll (+21-21)
  • (modified) llvm/test/CodeGen/X86/smul-with-overflow.ll (+442-451)
  • (modified) llvm/test/CodeGen/X86/smul_fix.ll (+10-10)
  • (modified) llvm/test/CodeGen/X86/smul_fix_sat.ll (+82-81)
  • (modified) llvm/test/CodeGen/X86/smulo-128-legalisation-lowering.ll (+717-739)
  • (modified) llvm/test/CodeGen/X86/srem-vector-lkk.ll (+30-30)
  • (modified) llvm/test/CodeGen/X86/sse-regcall.ll (+101-100)
  • (modified) llvm/test/CodeGen/X86/sse-regcall4.ll (+95-94)
  • (modified) llvm/test/CodeGen/X86/sshl_sat_vec.ll (+66-67)
  • (modified) llvm/test/CodeGen/X86/statepoint-live-in.ll (+11-11)
  • (modified) llvm/test/CodeGen/X86/statepoint-regs.ll (+11-11)
  • (modified) llvm/test/CodeGen/X86/sttni.ll (+12-12)
  • (modified) llvm/test/CodeGen/X86/subvectorwise-store-of-vector-splat.ll (+146-146)
  • (modified) llvm/test/CodeGen/X86/swifterror.ll (+12-12)
  • (modified) llvm/test/CodeGen/X86/umax.ll (+46-46)
  • (modified) llvm/test/CodeGen/X86/umin.ll (+21-21)
  • (modified) llvm/test/CodeGen/X86/umul-with-overflow.ll (+275-284)
  • (modified) llvm/test/CodeGen/X86/umul_fix.ll (+12-12)
  • (modified) llvm/test/CodeGen/X86/umul_fix_sat.ll (+20-20)
  • (modified) llvm/test/CodeGen/X86/umulo-128-legalisation-lowering.ll (+57-55)
  • (modified) llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll (+372-364)
  • (modified) llvm/test/CodeGen/X86/ushl_sat_vec.ll (+53-50)
  • (modified) llvm/test/CodeGen/X86/var-permute-128.ll (+48-48)
  • (modified) llvm/test/CodeGen/X86/var-permute-512.ll (+40-40)
  • (modified) llvm/test/CodeGen/X86/vec-strict-fptoint-512.ll (+32-32)
  • (modified) llvm/test/CodeGen/X86/vec_smulo.ll (+462-460)
  • (modified) llvm/test/CodeGen/X86/vec_uaddo.ll (+28-28)
  • (modified) llvm/test/CodeGen/X86/vec_umulo.ll (+253-253)
  • (modified) llvm/test/CodeGen/X86/vec_usubo.ll (+28-28)
  • (modified) llvm/test/CodeGen/X86/vector-bo-select.ll (+68-68)
  • (modified) llvm/test/CodeGen/X86/vector-compare-results.ll (+15-15)
  • (modified) llvm/test/CodeGen/X86/vector-half-conversions.ll (+20-20)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-2.ll (+25-26)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-3.ll (+1055-1058)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-4.ll (+2114-2113)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-5.ll (+4269-4265)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-6.ll (+6594-6589)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-7.ll (+10528-10522)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-8.ll (+5099-5118)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-2.ll (+203-203)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-3.ll (+1517-1508)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-4.ll (+1432-1455)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-5.ll (+2177-2181)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-6.ll (+5319-5296)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-7.ll (+6096-6104)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-8.ll (+6864-10487)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-2.ll (+536-536)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-3.ll (+980-1000)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-4.ll (+1383-1374)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-5.ll (+3055-3082)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-6.ll (+3263-3240)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-7.ll (+8001-7964)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-8.ll (+4915-4893)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-3.ll (+721-717)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-4.ll (+546-542)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-5.ll (+2028-2035)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-6.ll (+2753-2726)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-7.ll (+8102-7984)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-8.ll (+3809-3791)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-3.ll (+680-679)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-4.ll (+75-75)
  • (modified) llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride...

Error: Command failed due to missing milestone.

@jayfoad
Contributor Author

jayfoad commented Sep 15, 2023

> TODO: I still need to update ~15 tests with manual checks.

All tests updated now.

@jayfoad jayfoad requested review from a team September 15, 2023 14:13
@arsenm
Contributor

arsenm commented Sep 15, 2023

I've thought about doing this, but it only makes the problem harder to observe. It still happens when renumbering occurs during allocation.

@jayfoad
Contributor Author

jayfoad commented Sep 15, 2023

> I've thought about doing this but it only makes the problem harder to observe.

I don't think that's a bad thing. I think this makes things better, even if they're still not perfect.

@perlfu
Contributor

perlfu commented Sep 18, 2023

This seems reasonable to me.

For AMDGPU graphics, this seems to yield a very minor reduction in mean VGPR usage and a slightly greater reduction in mean SGPR usage. There are a handful of edge cases where VGPR usage increased significantly (20 VGPRs). I guess this is unsurprising given that it tweaks a heuristic input.

@jayfoad
Contributor Author

jayfoad commented Sep 18, 2023

Rebased on #66627 which allowed me to update test/CodeGen/X86/AMX/amx-greedy-ra-spill-shape.ll automatically with no need for manual tweaks to the checks.

@qcolombet
Collaborator

I'm on the fence with that change.

My concern is that it adds compile time and may trigger other renumbering down the line.
The compile time impact is probably in the noise (@jayfoad have you double-checked?), and initially at least getApproxInstrDistance makes more sense.

Could you rename renumberAllIndexes to densifyIndices or something that conveys that we're packing the indices in a contiguous way? Here I'd like to avoid a name that collides with renumberIndexes.

@jayfoad
Contributor Author

jayfoad commented Sep 18, 2023

> My concern is that it adds compile time and may trigger other renumbering down the line. The compile time impact is probably in the noise (@jayfoad have you double-checked?)

Yes, see earlier comment :)

> https://llvm-compile-time-tracker.com/ shows a geomean 0.00% change.

This was based on the "renumber-all-slotindexes" branch here: https://llvm-compile-time-tracker.com/?config=NewPM-O3&stat=instructions%3Au&remote=jayfoad

@jayfoad
Contributor Author

jayfoad commented Sep 18, 2023

> Could you rename renumberAllIndexes to densifyIndices or something that conveys that we're packing the indices in a contiguous way? Here I'd like to avoid a name that collides with renumberIndexes.

How about packIndexes or compactIndexes? (Using "compact" as a verb not a noun, in case that's not obvious.)

@qcolombet
Collaborator

> > Could you rename renumberAllIndexes to densifyIndices or something that conveys that we're packing the indices in a contiguous way? Here I'd like to avoid a name that collides with renumberIndexes.
>
> How about packIndexes or compactIndexes? (Using "compact" as a verb not a noun, in case that's not obvious.)

Both sound good to me. Slight preference for packIndexes.

@qcolombet
Collaborator

> > My concern is that it adds compile time and may trigger other renumbering down the line. The compile time impact is probably in the noise (@jayfoad have you double-checked?)
>
> Yes, see earlier comment :)
>
> > https://llvm-compile-time-tracker.com/ shows a geomean 0.00% change.
>
> This was based on the "renumber-all-slotindexes" branch here: https://llvm-compile-time-tracker.com/?config=NewPM-O3&stat=instructions%3Au&remote=jayfoad

Sorry, seen it then forgot about it :P.

@jayfoad
Contributor Author

jayfoad commented Sep 18, 2023

Renamed renumberAllIndexes -> packIndexes.
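
To illustrate what "packing" means in this thread, here is a minimal sketch, not the code from this patch. It assumes a hypothetical default gap `kInstrDist` and models the index list as a plain vector; `approxInstrDistance` is a simplified stand-in for SlotIndexes::getApproxInstrDistance, whose real definition may scale the difference differently.

```cpp
#include <vector>

// Hypothetical default gap between neighbouring instructions (an assumption
// for this sketch, not a value taken from the patch).
constexpr unsigned kInstrDist = 16;

// Simplified distance estimate: raw index difference divided by the default
// gap. It is only exact when every neighbouring pair is kInstrDist apart.
unsigned approxInstrDistance(unsigned from, unsigned to) {
  return (to - from) / kInstrDist;
}

// Reassign every index in list order so that neighbours are exactly
// kInstrDist apart, regardless of what gaps earlier edits left behind.
void packIndexes(std::vector<unsigned> &indexes) {
  unsigned next = 0;
  for (unsigned &idx : indexes) {
    idx = next;
    next += kInstrDist;
  }
}
```

With raw indexes {0, 16, 20, 24, 96}, the estimate between the first and last entry is 6 even though they are only 4 positions apart. After packing, the indexes become {0, 16, 32, 48, 64} and the estimate is 4.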

Contributor

@perlfu perlfu left a comment

LGTM - but probably wait a little in case @qcolombet has further input

jayfoad added a commit that referenced this pull request Sep 19, 2023
RegAllocGreedy uses SlotIndexes::getApproxInstrDistance to approximate
the length of a live range for its heuristics. Renumbering all slot
indexes with the default instruction distance ensures that this estimate
will be as accurate as possible, and will not depend on the history of
how instructions have been added to and removed from SlotIndexes's maps.

This also means that enabling -early-live-intervals, which runs the
SlotIndexes analysis earlier, will not cause large amounts of churn due
to different register allocator decisions.
@jayfoad jayfoad closed this Sep 19, 2023
@jayfoad
Contributor Author

jayfoad commented Sep 19, 2023

Merged manually in e0919b1

@jayfoad jayfoad deleted the renumber-all-slotindexes branch September 19, 2023 10:27
mtrofin added a commit that referenced this pull request Sep 19, 2023
blackgeorge-boom added a commit to blackgeorge-boom/llvm-project that referenced this pull request Sep 29, 2023
Applies:
llvm#66334
llvm#67038

Packing the slot indexes before register allocation is useful for us
because it evens the gaps between slots after all the optimization
passes that happen before `greedy` and may have removed a different number
of instructions between AArch64 and X86. This leads to different slot gaps
and, hence, slightly different regalloc in some cases.
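
A rough sketch of that effect, under stated assumptions: the two vectors in the usage note stand for the same surviving instruction sequence on two targets, where earlier passes deleted different numbers of instructions. The gap constant `kInstrDist` and the helpers `packed` and `rawSpan` are hypothetical names for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical default gap between neighbouring instructions.
constexpr unsigned kInstrDist = 16;

// Return a packed copy of the list: entry i gets index i * kInstrDist,
// so the result depends only on how many entries there are.
std::vector<unsigned> packed(const std::vector<unsigned> &indexes) {
  std::vector<unsigned> out;
  out.reserve(indexes.size());
  for (std::size_t i = 0; i < indexes.size(); ++i)
    out.push_back(static_cast<unsigned>(i) * kInstrDist);
  return out;
}

// Raw index difference between the first and last entry, the kind of
// quantity a live-range length heuristic would be fed.
unsigned rawSpan(const std::vector<unsigned> &indexes) {
  return indexes.empty() ? 0 : indexes.back() - indexes.front();
}
```

For example, {0, 16, 48, 64} (one instruction removed) and {0, 16, 32, 96} (three removed) have raw spans of 64 and 96, but both pack to {0, 16, 32, 48}, so a heuristic reading the packed indexes sees the same span of 48 in both cases.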

We backport the above patches for our LLVM, with the main difference
being the absence of some convenient data structure iterators, which we
had to convert to be compatible with our ADT infrastructure.

We add the `-pack-indexes` flag to activate this.

Addresses: systems-nuts/unifico#291
jayfoad added a commit that referenced this pull request Oct 9, 2023
…7038)

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.
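
The problem described in that commit message can be sketched as follows. This is an illustration of the idea only, not the follow-up patch itself: the `Entry` struct, its `live` flag, and both functions are hypothetical, and the real list entries are tracked differently.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical default gap between neighbouring instructions.
constexpr unsigned kInstrDist = 16;

// One slot in the index list. `live` is false when the instruction was
// erased but its list entry was left behind.
struct Entry {
  bool live;
  unsigned index;
};

// Pack every entry, stale ones included. Erased instructions still consume
// a gap, so the numbering keeps a trace of the editing history.
void packAll(std::vector<Entry> &list) {
  unsigned next = 0;
  for (Entry &e : list) {
    e.index = next;
    next += kInstrDist;
  }
}

// Drop stale entries first, then pack, so that only instructions that
// still exist influence the numbering.
void packLive(std::vector<Entry> &list) {
  list.erase(std::remove_if(list.begin(), list.end(),
                            [](const Entry &e) { return !e.live; }),
             list.end());
  packAll(list);
}
```

Given two live entries separated by two stale ones, `packAll` leaves the live entries 48 apart, while `packLive` leaves them 16 apart, which matches their actual adjacency.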

5 participants