[ConstantTime] Native ct.select support for X86 and i386 #166704
base: users/wizardengineer/ct-select-clang
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite. This stack of pull requests is managed by Graphite.
✅ With the latest revision this PR passed the C/C++ code formatter. |
Add native X86 implementation with CMOV instructions and comprehensive tests:
- X86 ISelLowering with CMOV for x86_64 and i386
- Fallback bitwise operations for i386 targets without CMOV
- Post-RA expansion for pseudo-instructions
- Comprehensive test coverage:
  - Edge cases (zero conditions, large integers)
  - i386-specific tests (FP, MMX, non-CMOV fallback)
  - Vector operations
  - Optimization patterns

The basic test demonstrating fallback is in the core infrastructure PR.
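For i386 targets without CMOV, the commit describes a fallback built from pure bitwise operations, so the selection introduces no branch and no condition-dependent timing. A minimal C++ sketch of that mask-based pattern (illustrative only; the actual lowering builds the equivalent SelectionDAG nodes):

#include <cstdint>

// Branchless select: returns t when cond is true, f otherwise.
// -cond is all-ones (0xFFFFFFFF) for cond == 1 and all-zeros for
// cond == 0, so the result is computed without a conditional jump.
static inline uint32_t ct_select_u32(bool cond, uint32_t t, uint32_t f) {
  uint32_t mask = -static_cast<uint32_t>(cond);
  return (t & mask) | (f & ~mask);
}

On CMOV-capable targets the same selection instead maps onto a single cmovcc instruction, which is the native constant-time select the PR title refers to.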
Add special handling for x86_fp80 types in CTSELECT lowering by splitting each value into three 32-bit chunks, performing constant-time selection on each chunk, and reassembling the result. This fixes crashes when compiling tests with f80 types. Also updated ctselect.ll to match the current generic fallback implementation.
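An x86_fp80 value holds 80 significant bits inside a 96-bit (i386) or 128-bit (x86_64) storage slot, so no single 32- or 64-bit CMOV can cover it. A rough C++ illustration of the chunk-and-select approach described above (hypothetical helper; the real code performs this on SelectionDAG values):

#include <cstdint>
#include <cstring>

// Select an 80-bit long double in constant time by splitting its ten
// significant bytes across three 32-bit chunks, selecting each chunk
// with a mask, and reassembling the result.
static long double ct_select_f80(bool cond, long double t, long double f) {
  uint32_t ct[3] = {0, 0, 0}, cf[3] = {0, 0, 0}, out[3];
  std::memcpy(ct, &t, 10); // only the low 80 bits are meaningful
  std::memcpy(cf, &f, 10);
  const uint32_t mask = -static_cast<uint32_t>(cond);
  for (int i = 0; i < 3; ++i)
    out[i] = (ct[i] & mask) | (cf[i] & ~mask);
  long double result = 0.0L;
  std::memcpy(&result, out, 10);
  return result;
}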
@llvm/pr-subscribers-backend-x86

Author: Julius Alexandre (wizardengineer)

Changes:

[LLVM][X86] Add native ct.select support for X86 and i386. Adds the native X86 implementation with CMOV instructions and comprehensive tests, as detailed in the commit message above. The basic test demonstrating fallback is in the core infrastructure PR.

[LLVM][X86] Add f80 support for ct.select. Adds special handling for x86_fp80 types in CTSELECT lowering by splitting them into three 32-bit chunks, performing constant-time selection on each chunk, and reassembling the result. Also updated ctselect.ll to match the current generic fallback implementation.

Patch is 881.35 KiB, truncated to 20.00 KiB below; full version: https://github.com/llvm/llvm-project/pull/166704.diff

17 Files Affected:
diff --git a/llvm/lib/Target/X86/X86.td b/llvm/lib/Target/X86/X86.td
index 9e291a6ae431f..21826d8289bb9 100644
--- a/llvm/lib/Target/X86/X86.td
+++ b/llvm/lib/Target/X86/X86.td
@@ -825,9 +825,10 @@ include "X86SchedSapphireRapids.td"
def ProcessorFeatures {
// x86-64 micro-architecture levels: x86-64 and x86-64-v[234]
- list<SubtargetFeature> X86_64V1Features = [
- FeatureX87, FeatureCX8, FeatureCMOV, FeatureMMX, FeatureSSE2,
- FeatureFXSR, FeatureNOPL, FeatureX86_64,
+ list<SubtargetFeature> X86_64V1Features = [FeatureX87, FeatureCX8,
+ FeatureCMOV, FeatureMMX,
+ FeatureSSE2, FeatureFXSR,
+ FeatureNOPL, FeatureX86_64,
];
list<SubtargetFeature> X86_64V1Tuning = [
TuningMacroFusion,
@@ -1161,6 +1162,7 @@ def ProcessorFeatures {
FeatureAVXNECONVERT,
FeatureAVXVNNIINT8,
FeatureAVXVNNIINT16,
+ FeatureUSERMSR,
FeatureSHA512,
FeatureSM3,
FeatureEGPR,
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 6edf0185df813..7c5de8a834d79 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -12,6 +12,7 @@
//===----------------------------------------------------------------------===//
#include "X86ISelLowering.h"
+#include "MCTargetDesc/X86MCTargetDesc.h"
#include "MCTargetDesc/X86ShuffleDecode.h"
#include "X86.h"
#include "X86FrameLowering.h"
@@ -28,6 +29,8 @@
#include "llvm/Analysis/BlockFrequencyInfo.h"
#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/Analysis/VectorUtils.h"
+#include "llvm/CodeGen/ISDOpcodes.h"
+#include "llvm/CodeGen/IntrinsicLowering.h"
#include "llvm/CodeGen/LivePhysRegs.h"
#include "llvm/CodeGen/MachineFrameInfo.h"
#include "llvm/CodeGen/MachineFunction.h"
@@ -48,6 +51,7 @@
#include "llvm/IR/GlobalAlias.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/PatternMatch.h"
@@ -189,10 +193,10 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
// We don't accept any truncstore of integer registers.
setTruncStoreAction(MVT::i64, MVT::i32, Expand);
setTruncStoreAction(MVT::i64, MVT::i16, Expand);
- setTruncStoreAction(MVT::i64, MVT::i8 , Expand);
+ setTruncStoreAction(MVT::i64, MVT::i8, Expand);
setTruncStoreAction(MVT::i32, MVT::i16, Expand);
- setTruncStoreAction(MVT::i32, MVT::i8 , Expand);
- setTruncStoreAction(MVT::i16, MVT::i8, Expand);
+ setTruncStoreAction(MVT::i32, MVT::i8, Expand);
+ setTruncStoreAction(MVT::i16, MVT::i8, Expand);
setTruncStoreAction(MVT::f64, MVT::f32, Expand);
@@ -204,106 +208,106 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
// Integer absolute.
if (Subtarget.canUseCMOV()) {
- setOperationAction(ISD::ABS , MVT::i16 , Custom);
- setOperationAction(ISD::ABS , MVT::i32 , Custom);
+ setOperationAction(ISD::ABS, MVT::i16, Custom);
+ setOperationAction(ISD::ABS, MVT::i32, Custom);
if (Subtarget.is64Bit())
- setOperationAction(ISD::ABS , MVT::i64 , Custom);
+ setOperationAction(ISD::ABS, MVT::i64, Custom);
}
// Absolute difference.
for (auto Op : {ISD::ABDS, ISD::ABDU}) {
- setOperationAction(Op , MVT::i8 , Custom);
- setOperationAction(Op , MVT::i16 , Custom);
- setOperationAction(Op , MVT::i32 , Custom);
+ setOperationAction(Op, MVT::i8, Custom);
+ setOperationAction(Op, MVT::i16, Custom);
+ setOperationAction(Op, MVT::i32, Custom);
if (Subtarget.is64Bit())
- setOperationAction(Op , MVT::i64 , Custom);
+ setOperationAction(Op, MVT::i64, Custom);
}
// Signed saturation subtraction.
- setOperationAction(ISD::SSUBSAT , MVT::i8 , Custom);
- setOperationAction(ISD::SSUBSAT , MVT::i16 , Custom);
- setOperationAction(ISD::SSUBSAT , MVT::i32 , Custom);
+ setOperationAction(ISD::SSUBSAT, MVT::i8, Custom);
+ setOperationAction(ISD::SSUBSAT, MVT::i16, Custom);
+ setOperationAction(ISD::SSUBSAT, MVT::i32, Custom);
if (Subtarget.is64Bit())
- setOperationAction(ISD::SSUBSAT , MVT::i64 , Custom);
+ setOperationAction(ISD::SSUBSAT, MVT::i64, Custom);
// Funnel shifts.
for (auto ShiftOp : {ISD::FSHL, ISD::FSHR}) {
// For slow shld targets we only lower for code size.
LegalizeAction ShiftDoubleAction = Subtarget.isSHLDSlow() ? Custom : Legal;
- setOperationAction(ShiftOp , MVT::i8 , Custom);
- setOperationAction(ShiftOp , MVT::i16 , Custom);
- setOperationAction(ShiftOp , MVT::i32 , ShiftDoubleAction);
+ setOperationAction(ShiftOp, MVT::i8, Custom);
+ setOperationAction(ShiftOp, MVT::i16, Custom);
+ setOperationAction(ShiftOp, MVT::i32, ShiftDoubleAction);
if (Subtarget.is64Bit())
- setOperationAction(ShiftOp , MVT::i64 , ShiftDoubleAction);
+ setOperationAction(ShiftOp, MVT::i64, ShiftDoubleAction);
}
if (!Subtarget.useSoftFloat()) {
// Promote all UINT_TO_FP to larger SINT_TO_FP's, as X86 doesn't have this
// operation.
- setOperationAction(ISD::UINT_TO_FP, MVT::i8, Promote);
+ setOperationAction(ISD::UINT_TO_FP, MVT::i8, Promote);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i8, Promote);
- setOperationAction(ISD::UINT_TO_FP, MVT::i16, Promote);
+ setOperationAction(ISD::UINT_TO_FP, MVT::i16, Promote);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i16, Promote);
// We have an algorithm for SSE2, and we turn this into a 64-bit
// FILD or VCVTUSI2SS/SD for other targets.
- setOperationAction(ISD::UINT_TO_FP, MVT::i32, Custom);
+ setOperationAction(ISD::UINT_TO_FP, MVT::i32, Custom);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i32, Custom);
// We have an algorithm for SSE2->double, and we turn this into a
// 64-bit FILD followed by conditional FADD for other targets.
- setOperationAction(ISD::UINT_TO_FP, MVT::i64, Custom);
+ setOperationAction(ISD::UINT_TO_FP, MVT::i64, Custom);
setOperationAction(ISD::STRICT_UINT_TO_FP, MVT::i64, Custom);
// Promote i8 SINT_TO_FP to larger SINT_TO_FP's, as X86 doesn't have
// this operation.
- setOperationAction(ISD::SINT_TO_FP, MVT::i8, Promote);
+ setOperationAction(ISD::SINT_TO_FP, MVT::i8, Promote);
setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i8, Promote);
// SSE has no i16 to fp conversion, only i32. We promote in the handler
// to allow f80 to use i16 and f64 to use i16 with sse1 only
- setOperationAction(ISD::SINT_TO_FP, MVT::i16, Custom);
+ setOperationAction(ISD::SINT_TO_FP, MVT::i16, Custom);
setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i16, Custom);
// f32 and f64 cases are Legal with SSE1/SSE2, f80 case is not
- setOperationAction(ISD::SINT_TO_FP, MVT::i32, Custom);
+ setOperationAction(ISD::SINT_TO_FP, MVT::i32, Custom);
setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i32, Custom);
// In 32-bit mode these are custom lowered. In 64-bit mode F32 and F64
// are Legal, f80 is custom lowered.
- setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);
+ setOperationAction(ISD::SINT_TO_FP, MVT::i64, Custom);
setOperationAction(ISD::STRICT_SINT_TO_FP, MVT::i64, Custom);
// Promote i8 FP_TO_SINT to larger FP_TO_SINTS's, as X86 doesn't have
// this operation.
- setOperationAction(ISD::FP_TO_SINT, MVT::i8, Promote);
+ setOperationAction(ISD::FP_TO_SINT, MVT::i8, Promote);
// FIXME: This doesn't generate invalid exception when it should. PR44019.
- setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i8, Promote);
- setOperationAction(ISD::FP_TO_SINT, MVT::i16, Custom);
+ setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i8, Promote);
+ setOperationAction(ISD::FP_TO_SINT, MVT::i16, Custom);
setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i16, Custom);
- setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
+ setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i32, Custom);
// In 32-bit mode these are custom lowered. In 64-bit mode F32 and F64
// are Legal, f80 is custom lowered.
- setOperationAction(ISD::FP_TO_SINT, MVT::i64, Custom);
+ setOperationAction(ISD::FP_TO_SINT, MVT::i64, Custom);
setOperationAction(ISD::STRICT_FP_TO_SINT, MVT::i64, Custom);
// Handle FP_TO_UINT by promoting the destination to a larger signed
// conversion.
- setOperationAction(ISD::FP_TO_UINT, MVT::i8, Promote);
+ setOperationAction(ISD::FP_TO_UINT, MVT::i8, Promote);
// FIXME: This doesn't generate invalid exception when it should. PR44019.
- setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i8, Promote);
- setOperationAction(ISD::FP_TO_UINT, MVT::i16, Promote);
+ setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i8, Promote);
+ setOperationAction(ISD::FP_TO_UINT, MVT::i16, Promote);
// FIXME: This doesn't generate invalid exception when it should. PR44019.
setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i16, Promote);
- setOperationAction(ISD::FP_TO_UINT, MVT::i32, Custom);
+ setOperationAction(ISD::FP_TO_UINT, MVT::i32, Custom);
setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i32, Custom);
- setOperationAction(ISD::FP_TO_UINT, MVT::i64, Custom);
+ setOperationAction(ISD::FP_TO_UINT, MVT::i64, Custom);
setOperationAction(ISD::STRICT_FP_TO_UINT, MVT::i64, Custom);
- setOperationAction(ISD::LRINT, MVT::f32, Custom);
- setOperationAction(ISD::LRINT, MVT::f64, Custom);
- setOperationAction(ISD::LLRINT, MVT::f32, Custom);
- setOperationAction(ISD::LLRINT, MVT::f64, Custom);
+ setOperationAction(ISD::LRINT, MVT::f32, Custom);
+ setOperationAction(ISD::LRINT, MVT::f64, Custom);
+ setOperationAction(ISD::LLRINT, MVT::f32, Custom);
+ setOperationAction(ISD::LLRINT, MVT::f64, Custom);
if (!Subtarget.is64Bit()) {
- setOperationAction(ISD::LRINT, MVT::i64, Custom);
+ setOperationAction(ISD::LRINT, MVT::i64, Custom);
setOperationAction(ISD::LLRINT, MVT::i64, Custom);
}
}
@@ -311,7 +315,7 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
if (Subtarget.hasSSE2()) {
// Custom lowering for saturating float to int conversions.
// We handle promotion to larger result types manually.
- for (MVT VT : { MVT::i8, MVT::i16, MVT::i32 }) {
+ for (MVT VT : {MVT::i8, MVT::i16, MVT::i32}) {
setOperationAction(ISD::FP_TO_UINT_SAT, VT, Custom);
setOperationAction(ISD::FP_TO_SINT_SAT, VT, Custom);
}
@@ -344,17 +348,17 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
// TODO: when we have SSE, these could be more efficient, by using movd/movq.
if (!Subtarget.hasSSE2()) {
- setOperationAction(ISD::BITCAST , MVT::f32 , Expand);
- setOperationAction(ISD::BITCAST , MVT::i32 , Expand);
+ setOperationAction(ISD::BITCAST, MVT::f32, Expand);
+ setOperationAction(ISD::BITCAST, MVT::i32, Expand);
setOperationAction(ISD::FCANONICALIZE, MVT::f32, Custom);
setOperationAction(ISD::FCANONICALIZE, MVT::f64, Custom);
if (Subtarget.is64Bit()) {
- setOperationAction(ISD::BITCAST , MVT::f64 , Expand);
+ setOperationAction(ISD::BITCAST, MVT::f64, Expand);
// Without SSE, i64->f64 goes through memory.
- setOperationAction(ISD::BITCAST , MVT::i64 , Expand);
+ setOperationAction(ISD::BITCAST, MVT::i64, Expand);
}
} else if (!Subtarget.is64Bit())
- setOperationAction(ISD::BITCAST , MVT::i64 , Custom);
+ setOperationAction(ISD::BITCAST, MVT::i64, Custom);
// Scalar integer divide and remainder are lowered to use operations that
// produce two results, to match the available instructions. This exposes
@@ -366,7 +370,7 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
// (low) operations are left as Legal, as there are single-result
// instructions for this in x86. Using the two-result multiply instructions
// when both high and low results are needed must be arranged by dagcombine.
- for (auto VT : { MVT::i8, MVT::i16, MVT::i32, MVT::i64 }) {
+ for (auto VT : {MVT::i8, MVT::i16, MVT::i32, MVT::i64}) {
setOperationAction(ISD::MULHS, VT, Expand);
setOperationAction(ISD::MULHU, VT, Expand);
setOperationAction(ISD::SDIV, VT, Expand);
@@ -375,47 +379,47 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
setOperationAction(ISD::UREM, VT, Expand);
}
- setOperationAction(ISD::BR_JT , MVT::Other, Expand);
- setOperationAction(ISD::BRCOND , MVT::Other, Custom);
- for (auto VT : { MVT::f32, MVT::f64, MVT::f80, MVT::f128,
- MVT::i8, MVT::i16, MVT::i32, MVT::i64 }) {
- setOperationAction(ISD::BR_CC, VT, Expand);
+ setOperationAction(ISD::BR_JT, MVT::Other, Expand);
+ setOperationAction(ISD::BRCOND, MVT::Other, Custom);
+ for (auto VT : {MVT::f32, MVT::f64, MVT::f80, MVT::f128, MVT::i8, MVT::i16,
+ MVT::i32, MVT::i64}) {
+ setOperationAction(ISD::BR_CC, VT, Expand);
setOperationAction(ISD::SELECT_CC, VT, Expand);
}
if (Subtarget.is64Bit())
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i32, Legal);
- setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16 , Legal);
- setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8 , Legal);
- setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1 , Expand);
+ setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Legal);
+ setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Legal);
+ setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i1, Expand);
- setOperationAction(ISD::FREM , MVT::f32 , Expand);
- setOperationAction(ISD::FREM , MVT::f64 , Expand);
- setOperationAction(ISD::FREM , MVT::f80 , Expand);
- setOperationAction(ISD::FREM , MVT::f128 , Expand);
+ setOperationAction(ISD::FREM, MVT::f32, Expand);
+ setOperationAction(ISD::FREM, MVT::f64, Expand);
+ setOperationAction(ISD::FREM, MVT::f80, Expand);
+ setOperationAction(ISD::FREM, MVT::f128, Expand);
if (!Subtarget.useSoftFloat() && Subtarget.hasX87()) {
- setOperationAction(ISD::GET_ROUNDING , MVT::i32 , Custom);
- setOperationAction(ISD::SET_ROUNDING , MVT::Other, Custom);
- setOperationAction(ISD::GET_FPENV_MEM , MVT::Other, Custom);
- setOperationAction(ISD::SET_FPENV_MEM , MVT::Other, Custom);
- setOperationAction(ISD::RESET_FPENV , MVT::Other, Custom);
+ setOperationAction(ISD::GET_ROUNDING, MVT::i32, Custom);
+ setOperationAction(ISD::SET_ROUNDING, MVT::Other, Custom);
+ setOperationAction(ISD::GET_FPENV_MEM, MVT::Other, Custom);
+ setOperationAction(ISD::SET_FPENV_MEM, MVT::Other, Custom);
+ setOperationAction(ISD::RESET_FPENV, MVT::Other, Custom);
}
// Promote the i8 variants and force them on up to i32 which has a shorter
// encoding.
- setOperationPromotedToType(ISD::CTTZ , MVT::i8 , MVT::i32);
- setOperationPromotedToType(ISD::CTTZ_ZERO_UNDEF, MVT::i8 , MVT::i32);
+ setOperationPromotedToType(ISD::CTTZ, MVT::i8, MVT::i32);
+ setOperationPromotedToType(ISD::CTTZ_ZERO_UNDEF, MVT::i8, MVT::i32);
// Promoted i16. tzcntw has a false dependency on Intel CPUs. For BSF, we emit
// a REP prefix to encode it as TZCNT for modern CPUs so it makes sense to
// promote that too.
- setOperationPromotedToType(ISD::CTTZ , MVT::i16 , MVT::i32);
- setOperationPromotedToType(ISD::CTTZ_ZERO_UNDEF, MVT::i16 , MVT::i32);
+ setOperationPromotedToType(ISD::CTTZ, MVT::i16, MVT::i32);
+ setOperationPromotedToType(ISD::CTTZ_ZERO_UNDEF, MVT::i16, MVT::i32);
if (!Subtarget.hasBMI()) {
- setOperationAction(ISD::CTTZ , MVT::i32 , Custom);
- setOperationAction(ISD::CTTZ_ZERO_UNDEF, MVT::i32 , Legal);
+ setOperationAction(ISD::CTTZ, MVT::i32, Custom);
+ setOperationAction(ISD::CTTZ_ZERO_UNDEF, MVT::i32, Legal);
if (Subtarget.is64Bit()) {
- setOperationAction(ISD::CTTZ , MVT::i64 , Custom);
+ setOperationAction(ISD::CTTZ, MVT::i64, Custom);
setOperationAction(ISD::CTTZ_ZERO_UNDEF, MVT::i64, Legal);
}
}
@@ -423,13 +427,13 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
if (Subtarget.hasLZCNT()) {
// When promoting the i8 variants, force them to i32 for a shorter
// encoding.
- setOperationPromotedToType(ISD::CTLZ , MVT::i8 , MVT::i32);
- setOperationPromotedToType(ISD::CTLZ_ZERO_UNDEF, MVT::i8 , MVT::i32);
+ setOperationPromotedToType(ISD::CTLZ, MVT::i8, MVT::i32);
+ setOperationPromotedToType(ISD::CTLZ_ZERO_UNDEF, MVT::i8, MVT::i32);
} else {
for (auto VT : {MVT::i8, MVT::i16, MVT::i32, MVT::i64}) {
if (VT == MVT::i64 && !Subtarget.is64Bit())
continue;
- setOperationAction(ISD::CTLZ , VT, Custom);
+ setOperationAction(ISD::CTLZ, VT, Custom);
setOperationAction(ISD::CTLZ_ZERO_UNDEF, VT, Custom);
}
}
@@ -474,36 +478,39 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
// on the dest that popcntl hasn't had since Cannon Lake.
setOperationPromotedToType(ISD::CTPOP, MVT::i16, MVT::i32);
} else {
- setOperationAction(ISD::CTPOP , MVT::i8 , Custom);
- setOperationAction(ISD::CTPOP , MVT::i16 , Custom);
- setOperationAction(ISD::CTPOP , MVT::i32 , Custom);
- setOperationAction(ISD::CTPOP , MVT::i64 , Custom);
+ setOperationAction(ISD::CTPOP, MVT::i8, Custom);
+ setOperationAction(ISD::CTPOP, MVT::i16, Custom);
+ setOperationAction(ISD::CTPOP, MVT::i32, Custom);
+ setOperationAction(ISD::CTPOP, MVT::i64, Custom);
}
- setOperationAction(ISD::READCYCLECOUNTER , MVT::i64 , Custom);
+ setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Custom);
if (!Subtarget.hasMOVBE())
- setOperationAction(ISD::BSWAP , MVT::i16 , Expand);
+ setOperationAction(ISD::BSWAP, MVT::i16, Expand);
// X86 wants to expand cmov itself.
- for (auto VT : { MVT::f32, MVT::f64, MVT::f80, MVT::f128 }) {
+ for (auto VT : {MVT::f32, MVT::f64, MVT::f80, MVT::f128}) {
setOperationAction(ISD::SELECT, VT, Custom);
+ setOperationAction(ISD::CTSELECT, VT, Custom);
setOperationAction(ISD::SETCC, VT, Custom);
setOperationAction(ISD::STRICT_FSETCC, VT, Custom);
setOperationAction(ISD::STRICT_FSETCCS, VT, Custom);
}
- for (auto VT : { MVT::i8, MVT::i16, MVT::i32, MVT::i64 }) {
+ for (auto VT : {MVT::i8, MVT::i16, MVT::i32, MVT::i64}) {
if (VT == MVT::i64 && !Subtarget.is64Bit())
continue;
setOperationAction(ISD::SELECT, VT, Custom);
- setOperationAction(ISD::SETCC, VT, Custom);
+ setOperationAction(ISD::CTSELECT, VT, Custom);
+ setOperationAction(ISD::SETCC, VT, Custom);
}
// Custom action for SELECT MMX and expand action for SELECT_CC MMX
setOperationAction(ISD::SELECT, MVT::x86mmx, Custom);
+ setOperationAction(ISD::CTSELECT, MVT::x86mmx, Custom);
setOperationAction(ISD::SELECT_CC, MVT::x86mmx, Expand);
- setOperationAction(ISD::EH_RETURN , MVT::Other, Custom);
+ setOperationAction(ISD::EH_RETURN, MVT::Other, Custom);
// NOTE: EH_SJLJ_SETJMP/_LONGJMP are not recommended, since
// LLVM/Clang supports zero-cost DWARF and SEH exception handling.
setOperationAction(ISD::EH_SJLJ_SETJMP, MVT::i32, Custom);
@@ -511,19 +518,19 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
setOperationAction(ISD::EH_SJLJ_SETUP_DISPATCH, MVT::Other, Custom);
// Darwin ABI issu...
[truncated]
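The visible part of the diff registers ISD::CTSELECT with Custom lowering for scalar integer types, floating-point types, and x86mmx, mirroring the existing ISD::SELECT handling, so LowerOperation must dispatch it to a dedicated routine. A minimal sketch of what such a hook could look like (the routine name, operand order, and CMOV placeholder are assumptions; the actual implementation lives in the truncated portion of the patch):

// Hypothetical shape of a custom CTSELECT lowering routine.
SDValue X86TargetLowering::LowerCTSELECT(SDValue Op, SelectionDAG &DAG) const {
  SDLoc DL(Op);
  EVT VT = Op.getValueType();
  SDValue Cond = Op.getOperand(0);   // assumed i1, as with ISD::SELECT
  SDValue TrueV = Op.getOperand(1);
  SDValue FalseV = Op.getOperand(2);

  // Targets without CMOV: expand to the branchless mask form
  //   mask = sext(cond); result = (t & mask) | (f & ~mask)
  if (!Subtarget.canUseCMOV() && VT.isInteger()) {
    SDValue Mask = DAG.getNode(ISD::SIGN_EXTEND, DL, VT, Cond);
    SDValue T = DAG.getNode(ISD::AND, DL, VT, TrueV, Mask);
    SDValue F =
        DAG.getNode(ISD::AND, DL, VT, FalseV, DAG.getNOT(DL, Mask, VT));
    return DAG.getNode(ISD::OR, DL, VT, T, F);
  }

  // CMOV-capable targets: emit a pseudo-instruction that the post-RA
  // expansion described in the commit message turns into a real CMOV,
  // keeping the selection branch-free. (Details are in the truncated
  // part of the diff.)
  return SDValue();
}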