[AMDGPU][WaveTransform] Migrate VcndmaskVcmpExecMask fold into SIFoldOperand pass #369

vg0204 · 2025-10-24T07:16:24Z

In the lateAMDGPUWaveTransform pipeline, SIOptimizeExecMaskingPreRA should be moved to just before SGPR allocation, after per-lane VGPR allocation has been handled. So any optimization that deals with the EXEC mask around VGPRs has to be handled well before that point, right after instruction selection.

Thus, we migrate optimizeVcndVcmpPair from SIOptimizeExecMaskingPreRA into the SIFoldOperands pass, which is invoked during the MachineSSAOptimization pipeline.
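For readers unfamiliar with the fold being moved, here is a small sketch of why it is sound. It assumes the pattern documented in the upstream source comment for optimizeVcndVcmpPair (`%sel = V_CNDMASK_B32_e64 0, 1, %cc`; `%cmp = V_CMP_NE_U32 1, %sel`; `$vcc = S_AND_B64 $exec, %cmp` becoming `$vcc = S_ANDN2_B64 $exec, %cc`). The `LaneMask` type and the function names are hypothetical; this is a lane-by-lane model of the arithmetic, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: one bit per lane of a 64-lane wave.
using LaneMask = std::uint64_t;

// Unfolded sequence, evaluated lane by lane.
LaneMask unfoldedVcc(LaneMask exec, LaneMask cc) {
  LaneMask cmp = 0;
  for (unsigned lane = 0; lane < 64; ++lane) {
    const LaneMask bit = LaneMask{1} << lane;
    if (!(exec & bit))
      continue; // inactive lanes execute nothing, so their cmp bit stays 0
    const std::uint32_t sel = (cc & bit) ? 1u : 0u; // V_CNDMASK_B32 0, 1, %cc
    if (sel != 1u)                                   // V_CMP_NE_U32 1, %sel
      cmp |= bit;
  }
  return exec & cmp; // S_AND_B64 $exec, %cmp
}

// Folded form: S_ANDN2_B64 $exec, %cc.
LaneMask foldedVcc(LaneMask exec, LaneMask cc) { return exec & ~cc; }

// Asserts that both forms agree on a number of pseudo-random mask pairs.
void checkFormsAgree(unsigned trials) {
  LaneMask state = 0x9E3779B97F4A7C15ull; // xorshift64 seed, any nonzero value
  auto next = [&state] {
    state ^= state << 13;
    state ^= state >> 7;
    state ^= state << 17;
    return state;
  };
  for (unsigned i = 0; i < trials; ++i) {
    const LaneMask exec = next();
    const LaneMask cc = next();
    assert(unfoldedVcc(exec, cc) == foldedVcc(exec, cc));
  }
}
```

The model only covers the value computed into `$vcc`. The legality checks the real fold needs (other uses of `%sel` or `%cmp`, redefinitions of EXEC in between) are out of scope here.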

z1-cciauto · 2025-10-24T07:17:12Z

PSDB Build Link: http://mlse-bdc-20dd129:8065/#/builders/10/builds/7

llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp

llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp

cdevadas · 2025-10-24T11:58:06Z

llvm/test/CodeGen/AMDGPU/optimize-exec-mask-pre-ra-def-after-use.mir

-  ; GCN-NEXT:   [[V_CNDMASK_B32_e64_:%[0-9]+]]:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, [[V_CMP_NEQ_F16_e64_]], implicit $exec
  ; GCN-NEXT:   [[S_CSELECT_B32_:%[0-9]+]]:sreg_32_xm0_xexec = S_CSELECT_B32 -1, 0, implicit undef $scc
-  ; GCN-NEXT:   [[S_AND_B32_1:%[0-9]+]]:sreg_32 = S_AND_B32 $exec_lo, [[S_CSELECT_B32_]], implicit-def dead $scc
+  ; GCN-NEXT:   [[COPY:%[0-9]+]]:sreg_32 = COPY $exec_lo, implicit-def $exec_lo


Is this new code sequence correct?

Yes, it is correct. The optimization below, which SIOptimizeExecMaskingPreRA performed in addition to the fold, no longer happens now that the fold runs in the SIFoldOperands pass:

// If the only user of a logical operation is move to exec, fold it now
// to prevent forming of saveexec. I.e.:
//
//    %0:sreg_64 = COPY $exec
//    %1:sreg_64 = S_AND_B64 %0:sreg_64, %2:sreg_64
// =>
//    %1 = S_AND_B64 $exec, %2:sreg_64

Take a look at the input test MIR to see this clearly.
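To make the rewrite in the comment above concrete, here is a toy sketch of it. The `Inst` struct and `foldExecCopies` are hypothetical names for illustration, not LLVM APIs. The model only performs the textual rewrite (a single-use `COPY $exec` is folded into its user) and skips the checks the real pass needs, such as restricting the user to logical operations and making sure EXEC is not modified in between.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified instruction: opcode, defined register, used operands.
struct Inst {
  std::string opcode;
  std::string def;
  std::vector<std::string> uses;
};

// For every `%x = COPY $exec` whose result has exactly one later use,
// rewrite that use to `$exec` and erase the copy. Returns true on any change.
bool foldExecCopies(std::vector<Inst> &block) {
  bool changed = false;
  for (std::size_t i = 0; i < block.size();) {
    const Inst &copy = block[i];
    const bool isExecCopy = copy.opcode == "COPY" && copy.uses.size() == 1 &&
                            copy.uses[0] == "$exec";
    if (!isExecCopy) {
      ++i;
      continue;
    }
    // Count uses of the copied register in the instructions that follow.
    std::string *soleUse = nullptr;
    unsigned numUses = 0;
    for (std::size_t j = i + 1; j < block.size(); ++j)
      for (std::string &op : block[j].uses)
        if (op == copy.def) {
          ++numUses;
          soleUse = &op;
        }
    if (numUses != 1) {
      ++i;
      continue;
    }
    *soleUse = "$exec";             // %1 = S_AND_B64 $exec, %2
    block.erase(block.begin() + i); // drop the now-dead COPY; do not advance i
    changed = true;
  }
  return changed;
}
```

Running it on the two-instruction sequence from the comment leaves a single `S_AND_B64` whose first operand is `$exec`.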

z1-cciauto · 2025-10-27T08:08:38Z

PSDB Build Link: http://mlse-bdc-20dd129:8065/#/builders/10/builds/10

…w pipeline. (#412) This patch runs SIOptimizeExecMaskingPreRA after the AMDGPUWaveTransform pass, just before SGPR allocation, to reduce register pressure in the new pipeline. In the legacy pipeline it still acts as a pre-RA pass that optimizes EXEC-mask-related instructions. It is a follow-up that depends on #369.

vg0204 requested review from cdevadas and jmmartinez October 24, 2025 07:16

vg0204 self-assigned this Oct 24, 2025

jmmartinez reviewed Oct 24, 2025

llvm/lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp (resolved)

cdevadas reviewed Oct 24, 2025

Update llvm/lib/Target/AMDGPU/SIFoldOperands.cpp

0c45cde

cdevadas approved these changes Oct 27, 2025

vg0204 merged commit 0bba171 into amd-feature/wave-transform Oct 27, 2025
5 checks passed

vg0204 deleted the amd/dev/vikashgu/refactor-Vcndmask-execMask-fold-migrate-siFoldOperands branch October 27, 2025 11:06

vg0204 mentioned this pull request Oct 28, 2025

[AMDGPU][WaveTransform] Enable SIOptimizeExecMaskingPreRA pass for new pipeline. #412

Merged
