Skip to content

Conversation

@wangpc-pp
Copy link
Contributor

@wangpc-pp wangpc-pp commented Jan 9, 2026

  • [RISCV] Remove RISCVVMV0Elimination pass
    This removes the use of RISCVVMV0Elimination pass.

  • [RISCV] Add an artificial between copies of v0 and vmv0 users
    This fixes the only test with ran out of registers error.
    This avoids the overlap of v0 and fixes some correctness issues.

  • [RISCV] Change the limit of register pressure VMV0 to 2
    Try to fix regressions.

  • [RISCV] Change the limit of register pressure VMV0 to 1
    Try to fix regressions.

@wangpc-pp wangpc-pp requested review from lukel97, mshockwave, preames and topperc and removed request for lukel97 January 9, 2026 10:16
@wangpc-pp wangpc-pp force-pushed the main-riscv-move-v0-elimination branch from 4417d3e to 6cdec0b Compare January 9, 2026 10:18
@wangpc-pp
Copy link
Contributor Author

I am doing some attempts to fix some scheduling issues caused by v0 register and remove redundant vmv.v v0s , and I found the root cause is we still can't make vmv0 a safe register class to pass RA and scheduler. So I am trying to remove RISCVVMV0Elimination pass and finally let RA allocate it to v0.

We can pass the unit tests after this PR and we see some reductions of spill/reload/moves as well. I will try to make it pass the llvm-testsuite next week.

@lukel97
Copy link
Contributor

lukel97 commented Jan 10, 2026

I'm not sure if anythings changed in the RegisterCoalescer in the meantime, but do we still run into the issue where it will coalesce registers so instructions will have multiple operands in the vmv0 class? e.g. %x:vrnov0 = PseudoVADD_VV_M1_MASK %0:vrnov0, %1:vr, %2:vmv0, %3:vmv0, ...

@wangpc-pp wangpc-pp force-pushed the main-riscv-move-v0-elimination branch from 6cdec0b to 10cc824 Compare January 13, 2026 03:11
@wangpc-pp
Copy link
Contributor Author

wangpc-pp commented Jan 13, 2026

I'm not sure if anythings changed in the RegisterCoalescer in the meantime, but do we still run into the issue where it will coalesce registers so instructions will have multiple operands in the vmv0 class? e.g. %x:vrnov0 = PseudoVADD_VV_M1_MASK %0:vrnov0, %1:vr, %2:vmv0, %3:vmv0, ...

I didn't change RegisterCoalescer, instead, I changed the operand register class used in VPseudoBinaryM and added a new scheduling mutation to avoid such cases. Now we can pass the llvm-testsuite.

@wangpc-pp wangpc-pp force-pushed the main-riscv-move-v0-elimination branch from 10cc824 to 5f8019b Compare January 21, 2026 11:30
@wangpc-pp
Copy link
Contributor Author

Status updates:

  1. We can pass llvm-testsuite after this PR so the correctness is not a noticeable concern now.
  2. For the in-tree unit tests, we saw some improvements of removing vmvs but somehow there are several cases where we generate more spills/reloads. Overall the spilling issue is more significant than the gain.
  3. We measured the dynamic instruction count of SPEC CPU 2017 and we didn't see big differences.

; FOLDING-NEXT: vmv.v.i v8, 0
; FOLDING-NEXT: vmerge.vim v9, v8, -1, v0
; FOLDING-NEXT: vlm.v v0, (a1)
; FOLDING-NEXT: vmv1r.v v0, v10
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regression.

; ZVBB-NEXT: vmv1r.v v0, v13
; ZVBB-NEXT: vmerge.vim v10, v22, 1, v0
; ZVBB-NEXT: vmerge.vim v16, v6, 1, v0
; ZVBB-NEXT: vmv2r.v v28, v26
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regression.

; RV64-NEXT: csrr a3, vlenb
; RV64-NEXT: slli a3, a3, 3
; RV64-NEXT: sub sp, sp, a3
; RV64-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants