lukel97 (Contributor) commented Sep 2, 2025

In order to fold a vmerge into a pseudo, the pseudo's passthru needs to be the same as the vmerge's false operand.

If they don't match, we can try to commute the pseudo's operands where possible. For example, here we can commute v9 and v8 so that the vmerge folds into a masked vfmacc.vv. Before:

    vsetvli zero, a0, e32, m1, ta, ma
    vfmadd.vv v9, v10, v8
    vsetvli zero, zero, e32, m1, tu, ma
    vmerge.vvm v8, v8, v9, v0

After:

    vsetvli zero, a0, e32, m1, tu, mu
    vfmacc.vv v8, v9, v10, v0.t
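
To see why the two sequences above agree, here is a minimal Python sketch that models the element-wise behaviour of the three instructions on plain lists. The function names and the list-based register model are assumptions made for illustration, not LLVM or ISA simulator code. Tail-agnostic elements are modelled as left undisturbed, which is one of the behaviours the policy permits, and the fused multiply-add is approximated by a separate multiply and add.

```python
def vfmadd_vv(vd, vs1, vs2, vl):
    """Unmasked vfmadd.vv vd, vs1, vs2: vd[i] = vd[i] * vs1[i] + vs2[i] for i < vl."""
    return [vd[i] * vs1[i] + vs2[i] if i < vl else vd[i] for i in range(len(vd))]


def vmerge_vvm(vd, vs2, vs1, v0, vl):
    """vmerge.vvm vd, vs2, vs1, v0 (tail undisturbed): pick vs1 where v0 is set, else vs2."""
    return [(vs1[i] if v0[i] else vs2[i]) if i < vl else vd[i] for i in range(len(vd))]


def vfmacc_vv_masked(vd, vs1, vs2, v0, vl):
    """vfmacc.vv vd, vs1, vs2, v0.t with tu, mu: vd[i] = vs1[i] * vs2[i] + vd[i] where active."""
    return [vs1[i] * vs2[i] + vd[i] if i < vl and v0[i] else vd[i] for i in range(len(vd))]


def before_fold(v8, v9, v10, v0, vl):
    """vfmadd.vv v9, v10, v8 followed by vmerge.vvm v8, v8, v9, v0."""
    v9 = vfmadd_vv(v9, v10, v8, vl)
    return vmerge_vvm(v8, v8, v9, v0, vl)


def after_fold(v8, v9, v10, v0, vl):
    """The single masked vfmacc.vv v8, v9, v10, v0.t."""
    return vfmacc_vv_masked(v8, v9, v10, v0, vl)


# Both sequences leave the same value in v8, including inactive and tail elements.
v8, v9, v10 = [1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [0.5, 0.5, 0.5, 0.5]
assert before_fold(v8, v9, v10, [1, 0, 1, 1], 3) == after_fold(v8, v9, v10, [1, 0, 1, 1], 3)
```

Masked-off and tail elements keep the old v8 in both versions, which is why the vmerge's false operand has to end up as the passthru of the folded instruction.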

Previously this wasn't possible because we did the peephole in SelectionDAG, but now that it's been migrated to MachineInstr in #144076 we can reuse the commuting infrastructure in TargetInstrInfo.
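
As a rough illustration of the "commute, then fold" idea, the sketch below uses a toy SSA-style model. The `Pseudo` and `VMerge` classes and the functions `commute_for_passthru` and `fold_vmerge` are hypothetical names invented for this example; they are not the RISCVVectorPeephole or TargetInstrInfo APIs, and the checks a real pass would need (VL, policy, dominance and so on) are omitted.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Pseudo:
    """Toy multiply-add pseudo (hypothetical, not an LLVM class).

    vfmadd.vv computes passthru * a + b; vfmacc.vv computes a * b + passthru.
    """
    opcode: str
    dest: str
    passthru: str  # tied to the destination register
    a: str
    b: str
    mask: Optional[str] = None


@dataclass(frozen=True)
class VMerge:
    """Toy vmerge: dest = mask ? true : false."""
    dest: str
    false: str
    true: str
    mask: str


def commute_for_passthru(p: Pseudo, wanted: str) -> Optional[Pseudo]:
    """Return an equivalent pseudo whose passthru is `wanted`, or None if impossible."""
    if p.opcode == "vfmadd.vv":
        if p.b == wanted:  # swap passthru and addend: madd becomes macc
            return replace(p, opcode="vfmacc.vv", passthru=p.b, a=p.passthru, b=p.a)
        if p.a == wanted:  # swap the two multiplicands: opcode unchanged
            return replace(p, passthru=p.a, a=p.passthru)
    if p.opcode == "vfmacc.vv" and wanted in (p.a, p.b):
        other = p.b if p.a == wanted else p.a
        # The wanted multiplicand becomes the passthru; the old passthru becomes the addend.
        return replace(p, opcode="vfmadd.vv", passthru=wanted, a=other, b=p.passthru)
    return None


def fold_vmerge(p: Pseudo, m: VMerge) -> Optional[Pseudo]:
    """Fold `m` into `p` as a masked pseudo, commuting `p` first if that helps."""
    if p.mask is not None or m.true != p.dest:
        return None
    if p.passthru != m.false:
        commuted = commute_for_passthru(p, m.false)
        if commuted is None:
            return None
        p = commuted
    # The passthru now equals the vmerge's false operand, so masking is safe.
    return replace(p, dest=m.dest, mask=m.mask)
```

In this model, folding `Pseudo("vfmadd.vv", "%r", "%x", "%y", "%f")` with `VMerge("%out", "%f", "%r", "%m")` yields a masked `vfmacc.vv` whose passthru is `%f`, mirroring the example in the description.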

This fixes the extra vmv.v.v in the "mul" example here: #123069 (comment)

It should also allow us to remove the isel patterns described in #141885 later.

topperc (Collaborator) left a comment

LGTM

lukel97 enabled auto-merge (squash) September 3, 2025 00:53
lukel97 merged commit 410764c into llvm:main Sep 3, 2025
9 checks passed
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Sep 3, 2025
Now that RISCVVectorPeephole can commute operands to fold vmerge into a pseudo to make it masked in llvm#156499, we can remove the remaining VPatMultiplyAccVL_VV_VX/VPatFPMulAccVL_VV_VF_RM patterns.

It also looks like we can remove the vmerge_vl patterns for _TIED pseudos too.

Tested on SPEC CPU 2017 and llvm-test-suite to confirm there's no codegen change.

Fixes llvm#141885
lukel97 added a commit that referenced this pull request Sep 3, 2025
Now that RISCVVectorPeephole can commute operands to fold vmerge into a
pseudo to make it masked in #156499, we can remove the remaining
VPatMultiplyAccVL_VV_VX/VPatFPMulAccVL_VV_VF_RM patterns.

It also looks like we can remove the vmerge_vl patterns for _TIED
pseudos too. I suspect they're handled by convertAllOnesVMergeToVMv and
foldVMV_V_V.

Tested on SPEC CPU 2017 and llvm-test-suite to confirm there's no
codegen change.

Fixes #141885