[PowerPC] Replace vspltisw+vadduwm instructions with xxleqv+vsubuwm for adding the vector {1, 1, 1, 1} #160882

Himadhith · 2025-09-26T13:34:23Z

This patch exploits the fact that a vector of -1s is cheaper to generate than a vector of 1s, and uses it to optimize the current implementation of A + vector {1, 1, 1, 1}.

In this optimized version we replace vspltisw (4 cycles) with xxleqv (2 cycles) using the following identity:
A - (-1) = A + 1.
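The identity can be checked lane by lane with a small scalar model. This is only a sketch of the arithmetic, not of the generated code: the names `V4`, `add_splat_one`, `sub_all_ones` and `sequences_agree` are hypothetical, and each 32-bit lane is modelled as an unsigned integer with modulo-2^32 wraparound, which is the behaviour assumed here for vadduwm and vsubuwm.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of a v4i32 value: four 32-bit lanes.
using V4 = std::array<std::uint32_t, 4>;

// Models the original sequence: splat the constant 1, then add modulo 2^32.
inline V4 add_splat_one(V4 a) {
    for (auto &lane : a) lane += 1u;
    return a;
}

// Models the new sequence: build an all-ones vector (the bit pattern of -1
// in every lane), then subtract it modulo 2^32.
inline V4 sub_all_ones(V4 a) {
    const std::uint32_t all_ones = 0xFFFFFFFFu;
    for (auto &lane : a) lane -= all_ones;
    return a;
}

// Splats x into all four lanes and compares the two models.
inline bool sequences_agree(std::uint32_t x) {
    V4 a;
    a.fill(x);
    return add_splat_one(a) == sub_all_ones(a);
}
```

Because both operations wrap, the two models also agree at the boundary values, for example when a lane holds 0xFFFFFFFF and the result wraps to 0.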

llvmbot · 2025-09-26T13:35:03Z

@llvm/pr-subscribers-backend-powerpc

Author: None (Himadhith)

Changes

This patch exploits the fact that a vector of -1s is cheaper to generate than a vector of 1s, and uses it to optimize the current implementation of A + vector {1, 1, 1, 1}.

In this optimized version we replace vspltisw (4 cycles) with xxleqv (2 cycles) using the following identity:
A - (-1) = A + 1.

Full diff: https://github.com/llvm/llvm-project/pull/160882.diff

1 Files Affected:

(modified) llvm/lib/Target/PowerPC/PPCInstrVSX.td (+4)

diff --git a/llvm/lib/Target/PowerPC/PPCInstrVSX.td b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
index 4e5165bfcda55..dc850d2470cfd 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrVSX.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
@@ -3627,6 +3627,10 @@ def : Pat<(v4i32 (build_vector immSExt5NonZero:$A, immSExt5NonZero:$A,
                                immSExt5NonZero:$A, immSExt5NonZero:$A)),
           (v4i32 (VSPLTISW imm:$A))>;
 
+// Optimise for vector of 1s addition operation
+def : Pat<(add v4i32:$A, (build_vector (i32 1), (i32 1), (i32 1), (i32 1))),
+          (VSUBUWM $A, (v4i32 (COPY_TO_REGCLASS (XXLEQVOnes), VSRC)))>;
+
 // Splat loads.
 def : Pat<(v8i16 (PPCldsplat ForceXForm:$A)),
           (v8i16 (VSPLTHs 3, (MTVSRWZ (LHZX ForceXForm:$A))))>;

lei137 · 2025-09-26T17:56:55Z

I'm guessing this is not ready to be reviewed, as it needs https://github.com/llvm/llvm-project/pull/160476/files to land first in order to show the difference.

Himadhith · 2025-09-26T18:19:42Z

> I'm guessing this is not ready to be reviewed, as it needs https://github.com/llvm/llvm-project/pull/160476/files to land first in order to show the difference.

Yes, as soon as the NFC patch is merged I will rebase, and the file should then reflect the changes. Should I keep this as a draft until then?

tonykuttai · 2025-10-06T05:27:24Z

llvm/lib/Target/PowerPC/PPCInstrVSX.td

          (v4i32 (VSPLTISW imm:$A))>;

+// Optimize for vector of 1s addition operation
+def : Pat<(add v4i32:$A, (build_vector (i32 1), (i32 1), (i32 1), (i32 1))),


Does this work only for v4i32 vector types? Why not v2i64, v8i16 and v16i8 types?

tonykuttai · 2025-10-06T05:27:58Z

llvm/test/CodeGen/PowerPC/vector-all-ones.ll

-; This pattern is expected to be optimized in a future patch by using `xxleqv` to generate vector of -1s
-; followed by subtraction operation.
+; Optimized version of vector addition with {1,1,1,1} by replacing `vspltisw + vadduwm` with 'xxleqv + vsubuwm'
 define dso_local noundef <4 x i32> @test1(<4 x i32> %a) {


Same comment as above: can the v2i64, v8i16 and v16i8 types be supported as well?
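As background for the question about other element types, the arithmetic identity itself does not depend on the lane width. The sketch below checks A - (-1) = A + 1 for a single lane of any unsigned width; `identity_holds` is a hypothetical helper, and the example says nothing about which instructions or TableGen patterns would be needed for v2i64, v8i16 or v16i8, or whether such patterns would be profitable.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical single-lane check: adding 1 and subtracting the all-ones
// value give the same result modulo 2^N, where N is the lane width.
template <typename Lane>
bool identity_holds(Lane a) {
    static_assert(std::is_unsigned<Lane>::value,
                  "lanes are modelled as unsigned integers");
    // All bits set, i.e. the bit pattern of -1 at this width.
    const Lane all_ones = static_cast<Lane>(~Lane{0});
    // The casts truncate back to the lane width after integer promotion.
    const Lane via_add = static_cast<Lane>(a + Lane{1});
    const Lane via_sub = static_cast<Lane>(a - all_ones);
    return via_add == via_sub;
}
```

A call such as `identity_holds<std::uint16_t>(0xFFFF)` exercises the wraparound case for 16-bit lanes.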

Himadhith requested review from AditiRM, RolandF77, amy-kwan, kamaub, lei137 and tonykuttai September 26, 2025 13:34

llvmbot added the backend:PowerPC label Sep 26, 2025

Himadhith force-pushed the himadhith/xxleqv_vec branch from 6018f73 to 40edcce Compare September 26, 2025 13:35

Himadhith mentioned this pull request Sep 26, 2025

[NFC] Lockdown instructions of vspltisw for addition of vector of 1s #160476

Merged

Himadhith force-pushed the himadhith/xxleqv_vec branch 4 times, most recently from 8079d5c to da6de91 Compare September 26, 2025 16:32

Himadhith force-pushed the himadhith/xxleqv_vec branch from da6de91 to 5de66e2 Compare September 27, 2025 04:01

himadhith and others added 3 commits September 27, 2025 04:02

[PowerPC] Replace vspltisw instruction with xxleqv as generation of v…

5de66e2

…ector of -1s is cheaper than vector of 1s

Merge branch 'main' into himadhith/xxleqv_vec

e8a7227

Updating testfile

9d18c9f

tonykuttai requested changes Oct 6, 2025
