Conversation

@azwolski (Contributor) commented Sep 23, 2025

The VCMPPHZrrik, VCMPPHZ128rrik, and VCMPPHZ256rrik opcodes were missing from commuteInstructionImpl and findCommutedOpIndices, so these compare instructions were mishandled during optimization: their operands were commuted, but the predicate immediate was not swapped to match.

Fixes: #159723
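
For background: commuting the two sources of a vector FP compare is only sound if the order-sensitive predicate in the immediate is swapped at the same time (LT becomes GT, LE becomes GE; symmetric predicates such as EQ keep their encoding). Below is a minimal standalone sketch of that mapping. It uses the documented AVX VCMP predicate encodings (0x01 LT_OS, 0x02 LE_OS, 0x0D GE_OS, 0x0E GT_OS) but is not the actual X86::getSwappedVCMPImm implementation:

#include <cassert>
#include <cstdint>

// Sketch of an immediate-swap table for a few AVX VCMP predicates.
// Only the order-sensitive predicates change; symmetric ones (EQ, NEQ,
// ORD, UNORD, ...) keep their encoding.
static uint8_t swappedVCMPImm(uint8_t Imm) {
  switch (Imm) {
  case 0x01: return 0x0E; // LT_OS -> GT_OS
  case 0x02: return 0x0D; // LE_OS -> GE_OS
  case 0x0D: return 0x02; // GE_OS -> LE_OS
  case 0x0E: return 0x01; // GT_OS -> LT_OS
  default:   return Imm;  // symmetric predicates are unchanged
  }
}

int main() {
  // The commutation identity the rewrite relies on:
  // cmp(pred, a, b) == cmp(swapped(pred), b, a).
  float a = 1.0f, b = 2.0f;
  assert((a < b) == (b > a));   // 0x01 <-> 0x0E
  assert((a <= b) == (b >= a)); // 0x02 <-> 0x0D
  assert(swappedVCMPImm(0x01) == 0x0E);
  return 0;
}

Before this patch the masked FP16 compares took the commute path without that immediate rewrite, which is exactly the vcmpltph/vcmpgtph mismatch corrected in the test updates below.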

@azwolski marked this pull request as ready for review September 23, 2025 14:16

@phoebewang (Contributor) left a comment

LGTM.

@llvmbot (Member) commented Sep 23, 2025

@llvm/pr-subscribers-backend-x86

Author: None (azwolski)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/160304.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+7-1)
  • (modified) llvm/test/CodeGen/X86/pr159723.ll (+3-3)
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index 58d526269ff3c..497441d8b6f93 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -2573,10 +2573,13 @@ MachineInstr *X86InstrInfo::commuteInstructionImpl(MachineInstr &MI, bool NewMI,
   case X86::VCMPPSZ256rri:
   case X86::VCMPPDZrrik:
   case X86::VCMPPSZrrik:
+  case X86::VCMPPHZrrik:
   case X86::VCMPPDZ128rrik:
   case X86::VCMPPSZ128rrik:
+  case X86::VCMPPHZ128rrik:
   case X86::VCMPPDZ256rrik:
   case X86::VCMPPSZ256rrik:
+  case X86::VCMPPHZ256rrik:
     WorkingMI = CloneIfNew(MI);
     WorkingMI->getOperand(MI.getNumExplicitOperands() - 1)
         .setImm(X86::getSwappedVCMPImm(
@@ -2830,10 +2833,13 @@ bool X86InstrInfo::findCommutedOpIndices(const MachineInstr &MI,
   case X86::VCMPPSZ256rri:
   case X86::VCMPPDZrrik:
   case X86::VCMPPSZrrik:
+  case X86::VCMPPHZrrik:
   case X86::VCMPPDZ128rrik:
   case X86::VCMPPSZ128rrik:
+  case X86::VCMPPHZ128rrik:
   case X86::VCMPPDZ256rrik:
-  case X86::VCMPPSZ256rrik: {
+  case X86::VCMPPSZ256rrik:
+  case X86::VCMPPHZ256rrik: {
     unsigned OpOffset = X86II::isKMasked(Desc.TSFlags) ? 1 : 0;
 
     // Float comparison can be safely commuted for
diff --git a/llvm/test/CodeGen/X86/pr159723.ll b/llvm/test/CodeGen/X86/pr159723.ll
index cab4abb043639..c66b101fff990 100644
--- a/llvm/test/CodeGen/X86/pr159723.ll
+++ b/llvm/test/CodeGen/X86/pr159723.ll
@@ -17,7 +17,7 @@ define <8 x i1> @test_cmp_v8half_ogt(<8 x half> %rhs, <8 x i1> %mask) nounwind {
 ; CHECK-NEXT:    kmovw %k1, {{[-0-9]+}}(%r{{[sb]}}p) # 2-byte Spill
 ; CHECK-NEXT:    callq test_call_8@PLT
 ; CHECK-NEXT:    kmovw {{[-0-9]+}}(%r{{[sb]}}p), %k1 # 2-byte Reload
-; CHECK-NEXT:    vcmpltph {{[-0-9]+}}(%r{{[sb]}}p), %xmm0, %k0 {%k1} # 16-byte Folded Reload
+; CHECK-NEXT:    vcmpgtph {{[-0-9]+}}(%r{{[sb]}}p), %xmm0, %k0 {%k1} # 16-byte Folded Reload
 ; CHECK-NEXT:    vpmovm2w %k0, %xmm0
 ; CHECK-NEXT:    addq $40, %rsp
 ; CHECK-NEXT:    retq
@@ -79,7 +79,7 @@ define <16 x i1> @test_cmp_v16half_olt_commute(<16 x half> %rhs, <16 x i1> %mask
 ; CHECK-NEXT:    kmovw %k1, {{[-0-9]+}}(%r{{[sb]}}p) # 2-byte Spill
 ; CHECK-NEXT:    callq test_call_16@PLT
 ; CHECK-NEXT:    kmovw {{[-0-9]+}}(%r{{[sb]}}p), %k1 # 2-byte Reload
-; CHECK-NEXT:    vcmpltph {{[-0-9]+}}(%r{{[sb]}}p), %ymm0, %k0 {%k1} # 32-byte Folded Reload
+; CHECK-NEXT:    vcmpgtph {{[-0-9]+}}(%r{{[sb]}}p), %ymm0, %k0 {%k1} # 32-byte Folded Reload
 ; CHECK-NEXT:    vpmovm2b %k0, %xmm0
 ; CHECK-NEXT:    addq $56, %rsp
 ; CHECK-NEXT:    vzeroupper
@@ -100,7 +100,7 @@ define <32 x i1> @test_cmp_v32half_oge(<32 x half> %rhs, <32 x i1> %mask) nounwi
 ; CHECK-NEXT:    kmovd %k1, {{[-0-9]+}}(%r{{[sb]}}p) # 4-byte Spill
 ; CHECK-NEXT:    callq test_call_32@PLT
 ; CHECK-NEXT:    kmovd {{[-0-9]+}}(%r{{[sb]}}p), %k1 # 4-byte Reload
-; CHECK-NEXT:    vcmpleph {{[-0-9]+}}(%r{{[sb]}}p), %zmm0, %k0 {%k1} # 64-byte Folded Reload
+; CHECK-NEXT:    vcmpgeph {{[-0-9]+}}(%r{{[sb]}}p), %zmm0, %k0 {%k1} # 64-byte Folded Reload
 ; CHECK-NEXT:    vpmovm2b %k0, %ymm0
 ; CHECK-NEXT:    addq $88, %rsp
 ; CHECK-NEXT:    retq
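
An aside on the OpOffset logic in the first hunk: the masked (rrik) forms carry a mask operand ahead of the two sources, so the commutable source indices shift by one relative to the unmasked forms. A rough model of that indexing follows, with the (dst, mask, src1, src2, imm) operand layout assumed for illustration rather than quoted from the .td definitions:

#include <cassert>
#include <string>
#include <utility>
#include <vector>

int main() {
  // Modeled operand lists: masked compares have an extra mask operand
  // before the sources, which is what OpOffset accounts for.
  std::vector<std::string> unmasked = {"dst", "src1", "src2", "imm"};
  std::vector<std::string> masked   = {"dst", "mask", "src1", "src2", "imm"};

  auto srcIndices = [](bool IsKMasked) {
    unsigned OpOffset = IsKMasked ? 1 : 0;
    return std::make_pair(1 + OpOffset, 2 + OpOffset);
  };

  auto [I1, I2] = srcIndices(/*IsKMasked=*/true);
  assert(masked[I1] == "src1" && masked[I2] == "src2");
  auto [J1, J2] = srcIndices(/*IsKMasked=*/false);
  assert(unmasked[J1] == "src1" && unmasked[J2] == "src2");
  return 0;
}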

@e-kud (Contributor) left a comment

LGTM. Thanks!

@RKSimon (Collaborator) left a comment

LGTM - cheers

@e-kud merged commit 6d4bb20 into llvm:main Sep 23, 2025
12 checks passed
Development

Successfully merging this pull request may close these issues:

  • [X86] Invalid operand order for fp16 vector comparison