Conversation

XChy
Member

@XChy XChy commented Sep 6, 2025

Fixes #157131.
This patch allows bitop(bitcast, constant) -> bitcast(bitop) for scalar integer types.
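For illustration, a minimal before/after sketch of the fold on a scalar integer bitcast (function names are made up; the IR mirrors the xor_bitcast_i16_to_v16i1_constant test added in this patch):

; Before: the xor operates on the <16 x i1> result of the bitcast.
define <16 x i1> @not_v16i1(i16 %a) {
  %bc = bitcast i16 %a to <16 x i1>
  %not = xor <16 x i1> %bc, splat (i1 true)
  ret <16 x i1> %not
}

; After the fold: the xor is performed on the scalar i16 source, and the
; bitcast is moved past it. This is valid because a bitcast only
; reinterprets the bit pattern, so the constant can be losslessly
; inverse-cast (splat (i1 true) becomes i16 -1).
define <16 x i1> @not_v16i1_folded(i16 %a) {
  %not = xor i16 %a, -1
  %bc = bitcast i16 %not to <16 x i1>
  ret <16 x i1> %bc
}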

@llvmbot
Member

llvmbot commented Sep 6, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Hongyu Chen (XChy)

Changes

Fixes #157131.
This patch allows bitop(bitcast, constant) -> bitcast(bitop) for scalar integer types.


Full diff: https://github.com/llvm/llvm-project/pull/157246.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VectorCombine.cpp (+13-14)
  • (modified) llvm/test/Transforms/VectorCombine/X86/bitop-of-castops.ll (+60)
diff --git a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
index 6e46547b15b2b..cea818350103a 100644
--- a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+++ b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -1013,19 +1013,20 @@ bool VectorCombine::foldBitOpOfCastConstant(Instruction &I) {
 
   Value *LHSSrc = LHSCast->getOperand(0);
 
-  // Only handle vector types with integer elements
-  auto *SrcVecTy = dyn_cast<FixedVectorType>(LHSSrc->getType());
-  auto *DstVecTy = dyn_cast<FixedVectorType>(I.getType());
-  if (!SrcVecTy || !DstVecTy)
+  // Only handle vector types with integer elements if the cast is not a bitcast.
+  auto *SrcTy = LHSSrc->getType();
+  auto *DstTy = I.getType();
+  if (CastOpcode != Instruction::BitCast &&
+      (!isa<FixedVectorType>(SrcTy) || !isa<FixedVectorType>(DstTy)))
     return false;
 
-  if (!SrcVecTy->getScalarType()->isIntegerTy() ||
-      !DstVecTy->getScalarType()->isIntegerTy())
+  if (!SrcTy->getScalarType()->isIntegerTy() ||
+      !DstTy->getScalarType()->isIntegerTy())
     return false;
 
   // Find the constant InvC, such that castop(InvC) equals to C.
   PreservedCastFlags RHSFlags;
-  Constant *InvC = getLosslessInvCast(C, SrcVecTy, CastOpcode, *DL, RHSFlags);
+  Constant *InvC = getLosslessInvCast(C, SrcTy, CastOpcode, *DL, RHSFlags);
   if (!InvC)
     return false;
 
@@ -1034,20 +1035,18 @@ bool VectorCombine::foldBitOpOfCastConstant(Instruction &I) {
   // NewCost = bitlogic + cast
 
   // Calculate specific costs for each cast with instruction context
-  InstructionCost LHSCastCost =
-      TTI.getCastInstrCost(CastOpcode, DstVecTy, SrcVecTy,
-                           TTI::CastContextHint::None, CostKind, LHSCast);
+  InstructionCost LHSCastCost = TTI.getCastInstrCost(
+      CastOpcode, DstTy, SrcTy, TTI::CastContextHint::None, CostKind, LHSCast);
 
   InstructionCost OldCost =
-      TTI.getArithmeticInstrCost(I.getOpcode(), DstVecTy, CostKind) +
-      LHSCastCost;
+      TTI.getArithmeticInstrCost(I.getOpcode(), DstTy, CostKind) + LHSCastCost;
 
   // For new cost, we can't provide an instruction (it doesn't exist yet)
   InstructionCost GenericCastCost = TTI.getCastInstrCost(
-      CastOpcode, DstVecTy, SrcVecTy, TTI::CastContextHint::None, CostKind);
+      CastOpcode, DstTy, SrcTy, TTI::CastContextHint::None, CostKind);
 
   InstructionCost NewCost =
-      TTI.getArithmeticInstrCost(I.getOpcode(), SrcVecTy, CostKind) +
+      TTI.getArithmeticInstrCost(I.getOpcode(), SrcTy, CostKind) +
       GenericCastCost;
 
   // Account for multi-use casts using specific costs
diff --git a/llvm/test/Transforms/VectorCombine/X86/bitop-of-castops.ll b/llvm/test/Transforms/VectorCombine/X86/bitop-of-castops.ll
index ca707ca08f169..c6253a7b858ad 100644
--- a/llvm/test/Transforms/VectorCombine/X86/bitop-of-castops.ll
+++ b/llvm/test/Transforms/VectorCombine/X86/bitop-of-castops.ll
@@ -420,3 +420,63 @@ define <4 x i32> @or_zext_nneg_multiconstant(<4 x i8> %a) {
   %or = or <4 x i32> %z1, <i32 240, i32 1, i32 242, i32 3>
   ret <4 x i32> %or
 }
+
+; Negative test: bitcast from scalar float to vector int (optimization should not apply)
+define <2 x i16> @and_bitcast_f32_to_v2i16_constant(float %a) {
+; CHECK-LABEL: @and_bitcast_f32_to_v2i16_constant(
+; CHECK-NEXT:    [[BC2:%.*]] = bitcast float [[B:%.*]] to <2 x i16>
+; CHECK-NEXT:    [[AND:%.*]] = and <2 x i16> <i16 0, i16 1>, [[BC2]]
+; CHECK-NEXT:    ret <2 x i16> [[AND]]
+;
+  %bc = bitcast float %a to <2 x i16>
+  %and = and <2 x i16> <i16 0, i16 1>, %bc
+  ret <2 x i16> %and
+}
+
+; Negative test: bitcast from vector float to scalar int (optimization should not apply)
+define i64 @and_bitcast_v2f32_to_i64_constant(<2 x float> %a) {
+; CHECK-LABEL: @and_bitcast_v2f32_to_i64_constant(
+; CHECK-NEXT:    [[BC2:%.*]] = bitcast <2 x float> [[B:%.*]] to i64
+; CHECK-NEXT:    [[AND:%.*]] = and i64 123, [[BC2]]
+; CHECK-NEXT:    ret i64 [[AND]]
+;
+  %bc = bitcast <2 x float> %a to i64
+  %and = and i64 123, %bc
+  ret i64 %and
+}
+
+; Test no-op bitcast
+define i16 @xor_bitcast_i16_to_i16_constant(i16 %a) {
+; CHECK-LABEL: @xor_bitcast_i16_to_i16_constant(
+; CHECK-NEXT:    [[BC2:%.*]] = bitcast i16 [[B:%.*]] to i16
+; CHECK-NEXT:    [[OR:%.*]] = xor i16 123, [[BC2]]
+; CHECK-NEXT:    ret i16 [[OR]]
+;
+  %bc = bitcast i16 %a to i16
+  %or = xor i16 123, %bc
+  ret i16 %or
+}
+
+; Test bitwise operations with scalar integer to integer vector bitcast
+define <16 x i1> @xor_bitcast_i16_to_v16i1_constant(i16 %a) {
+; CHECK-LABEL: @xor_bitcast_i16_to_v16i1_constant(
+; CHECK-NEXT:    [[B:%.*]] = xor i16 [[A:%.*]], -1
+; CHECK-NEXT:    [[BC2:%.*]] = bitcast i16 [[B]] to <16 x i1>
+; CHECK-NEXT:    ret <16 x i1> [[BC2]]
+;
+  %bc = bitcast i16 %a to <16 x i1>
+  %or = xor <16 x i1> %bc, splat (i1 true)
+  ret <16 x i1> %or
+}
+
+; Test bitwise operations with integer vector to scalar integer bitcast
+define i16 @or_bitcast_v16i1_to_i16_constant(<16 x i1> %a) {
+; CHECK-LABEL: @or_bitcast_v16i1_to_i16_constant(
+; CHECK-NEXT:    [[BC2:%.*]] = bitcast <16 x i1> [[B:%.*]] to i16
+; CHECK-NEXT:    [[OR:%.*]] = or i16 [[BC2]], 3
+; CHECK-NEXT:    ret i16 [[OR]]
+;
+  %bc = bitcast <16 x i1> %a to i16
+  %or = or i16 %bc, 3
+  ret i16 %or
+}

@llvmbot
Member

llvmbot commented Sep 6, 2025

@llvm/pr-subscribers-vectorizers


github-actions bot commented Sep 7, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Collaborator

@RKSimon RKSimon left a comment

LGTM

@XChy XChy merged commit db2fc84 into llvm:main Sep 8, 2025
9 checks passed

Successfully merging this pull request may close these issues.

[X86] Poor code generation for boolean vectors on AVX2