Scalar/Packed conversions for floating point to integer #97529

Merged
merged 82 commits into from
Apr 5, 2024
Changes from 79 commits
447a45e
merging with main
khushal1996 Oct 24, 2023
42332de
Basic working version of double -> ulong saturation
khushal1996 Oct 25, 2023
77876cd
Moving the code in a do-while with proper checks to make sure we are …
khushal1996 Oct 27, 2023
e04bd5d
adjusting comments
khushal1996 Nov 3, 2023
fabed24
Merging with main
khushal1996 Nov 29, 2023
274018e
removing conflicts from gentree.h flags
khushal1996 Dec 6, 2023
2fc6d75
float to uint conversion verified. removing commented code
khushal1996 Dec 6, 2023
7a38ba9
merging with main. Making changes to simdashwintrinsic.cpp and
khushal1996 Dec 7, 2023
4464448
progress on double to long morphing
khushal1996 Dec 13, 2023
6854e53
another attempt at double to long conversion
khushal1996 Dec 14, 2023
915bb3d
Merge with main
khushal1996 Dec 20, 2023
e63313f
adding handling for scalar conversion cases for SSE2. Remaining float…
khushal1996 Jan 3, 2024
cfb66ed
partial changes for float to int conversion using double to int for a…
khushal1996 Jan 5, 2024
2417cd2
adding float to int working scalar conversion case. Working on vectro…
khushal1996 Jan 8, 2024
4bb2f01
partial work on float to int packed conversion
khushal1996 Jan 11, 2024
0ad99c9
partial version of float to int conversion
khushal1996 Jan 17, 2024
8f9e225
working version of float to int scalar/packed for avx512
khushal1996 Jan 24, 2024
a92c010
complete conversions code for floating point to integral conversions …
khushal1996 Jan 24, 2024
1ce320a
Merging with main.
khushal1996 Jan 25, 2024
37460f7
fixing debug checks hitting asserts for TYP_ULONG and TYP_UINT at IR …
khushal1996 Jan 30, 2024
665e79b
adding JIT_Dbl2Int for target_x86 and other architectures.
khushal1996 Jan 31, 2024
921b833
Supporting x86 for saturating conversions as well
khushal1996 Feb 5, 2024
e3eaa88
fixing errors in packed conversion
khushal1996 Feb 6, 2024
bdd0fc3
accommodate unsigned in IR
khushal1996 Feb 8, 2024
9510547
adding evex support for cvttss2si
khushal1996 Feb 13, 2024
46e2c88
Merge with main
khushal1996 Feb 13, 2024
3204bbc
Catch divide by zero exception
khushal1996 Feb 14, 2024
813c72e
Handle overflow cases
khushal1996 Feb 15, 2024
7795754
Fix tests to check saturating behavior
khushal1996 Feb 15, 2024
8392b31
Correct mapping of instructions
khushal1996 Feb 15, 2024
ea6acb6
Convert float -> ulong / long as float -> double -> ulong / long
khushal1996 Feb 22, 2024
c89c4b9
Merging with main
khushal1996 Oct 24, 2023
1c23e73
Merging with main
khushal1996 Nov 3, 2023
0aaac78
removing conflicts from gentree.h flags
khushal1996 Dec 6, 2023
c564b37
merging with main. Making changes to simdashwintrinsic.cpp and
khushal1996 Dec 7, 2023
172c967
adding a new helper function for float to uint scalar conversion for …
khushal1996 Dec 20, 2023
31b899a
Merging with main
khushal1996 Jan 3, 2024
7730d46
partial changes for float to int conversion using double to int for a…
khushal1996 Jan 5, 2024
facb8b4
partial version of float to int conversion
khushal1996 Jan 17, 2024
970db62
working version of float to int scalar/packed for avx512
khushal1996 Jan 24, 2024
cfc52bf
Merging with main.
khushal1996 Jan 25, 2024
9c4edd5
Changing the way helper functions are handled in morph
khushal1996 Jan 30, 2024
e9ac9a0
adding JIT_Dbl2Int for target_x86 and other architectures.
khushal1996 Jan 31, 2024
6c7be45
Supporting x86 for saturating conversions as well
khushal1996 Feb 5, 2024
8ebe57d
fixing errors in packed conversion
khushal1996 Feb 6, 2024
5f8bbbc
Correct mapping of instructions
khushal1996 Feb 15, 2024
6553069
delete extra files
khushal1996 Feb 22, 2024
597f6f3
Merging main
khushal1996 Feb 24, 2024
4de1da4
Merge with main and adding new helpers in nativeaot
khushal1996 Feb 24, 2024
9670355
changing type of cast node as signed when making cast nodes
khushal1996 Feb 26, 2024
f2c6487
Avoiding removing extra element from the stack
khushal1996 Feb 27, 2024
6197b20
Fix formatting, Change comp->IsaSupportedDebugOnly to IsBaselineVecto…
khushal1996 Feb 27, 2024
feb4be0
Reverting some changes to maintain uniformity in code
khushal1996 Feb 27, 2024
aa9e127
Handling cases where AVX512 is not supported in simdashwintrinsic.cpp
khushal1996 Feb 28, 2024
34341cd
fixing exit conditions for ConvertVectorT_ToDouble
khushal1996 Feb 28, 2024
5ff9d1a
Check for AVX512 support for TARGET_XARCH
khushal1996 Feb 28, 2024
d93dc5b
Avoid avx512 path for x86
khushal1996 Feb 29, 2024
2a1b6f8
Enable AVX512F codepath for conversions in x86 arch. Move x86 to usin…
khushal1996 Mar 12, 2024
48e0acf
Add SSE41 path for scalar conversions and 128 bit float to int packed…
khushal1996 Mar 13, 2024
8506ece
Adding SSE41 path for floating point to UINT scalar conversions
khushal1996 Mar 14, 2024
408c716
Add AVX path for ConvertToInt32
khushal1996 Mar 14, 2024
b1f4f67
Adding comments and cleaning the code
khushal1996 Mar 18, 2024
ab7dfb7
Fix errors in double to ulong
khushal1996 Mar 19, 2024
f3e4bf5
Addressing review comments
khushal1996 Mar 21, 2024
b620c2f
Fix tests
khushal1996 Mar 22, 2024
487c9e2
Reverse val < 0 check in dbltoUint and dbltoUlng helpers
khushal1996 Mar 22, 2024
f145e1a
Add overflow conversions for x86/x64, remove FastDbl2Lng and inline it
khushal1996 Mar 22, 2024
4cb90fb
Apply suggestions from code review
khushal1996 Mar 23, 2024
98c23de
Correct Dbl2UlngOvf
khushal1996 Mar 23, 2024
782c8d4
Apply suggestions from code review
jkotas Mar 23, 2024
ab7b4de
Apply suggestions from code review
jkotas Mar 23, 2024
b4b8411
Update src/coreclr/vm/jithelpers.cpp
jkotas Mar 23, 2024
e474ed1
Disable failing mono tests
khushal1996 Mar 29, 2024
9e6ddd0
Merge branch 'main' into kcm-scalar-convert-rebased
khushal1996 Mar 29, 2024
d27c4f7
Merge branch 'main' into kcm-scalar-convert-rebased
khushal1996 Apr 1, 2024
70f2170
Working version of saturating logic moved to lowering for x86/x64
khushal1996 Apr 2, 2024
0f1dc05
Making changes for pre SSE41
khushal1996 Apr 2, 2024
36e0655
Apply suggestions from code review
khushal1996 Apr 2, 2024
c4f28c7
Merge branch 'main' into kcm-scalar-convert-rebased
khushal1996 Apr 3, 2024
9aac0f4
Removing dead code
khushal1996 Apr 4, 2024
080ec88
Fix formatting
khushal1996 Apr 4, 2024
523f1cc
Address review comments, add proper docstrings
khushal1996 Apr 5, 2024
6 changes: 3 additions & 3 deletions src/coreclr/inc/jithelpers.h
@@ -55,11 +55,11 @@
 JITHELPER(CORINFO_HELP_ULMOD, JIT_ULMod, CORINFO_HELP_SIG_16_STACK)
 JITHELPER(CORINFO_HELP_LNG2DBL, JIT_Lng2Dbl, CORINFO_HELP_SIG_8_STACK)
 JITHELPER(CORINFO_HELP_ULNG2DBL, JIT_ULng2Dbl, CORINFO_HELP_SIG_8_STACK)
-DYNAMICJITHELPER(CORINFO_HELP_DBL2INT, JIT_Dbl2Lng, CORINFO_HELP_SIG_8_STACK)
+JITHELPER(CORINFO_HELP_DBL2INT, JIT_Dbl2Int, CORINFO_HELP_SIG_8_STACK)
 JITHELPER(CORINFO_HELP_DBL2INT_OVF, JIT_Dbl2IntOvf, CORINFO_HELP_SIG_8_STACK)
-DYNAMICJITHELPER(CORINFO_HELP_DBL2LNG, JIT_Dbl2Lng, CORINFO_HELP_SIG_8_STACK)
+JITHELPER(CORINFO_HELP_DBL2LNG, JIT_Dbl2Lng, CORINFO_HELP_SIG_8_STACK)
 JITHELPER(CORINFO_HELP_DBL2LNG_OVF, JIT_Dbl2LngOvf, CORINFO_HELP_SIG_8_STACK)
-DYNAMICJITHELPER(CORINFO_HELP_DBL2UINT, JIT_Dbl2Lng, CORINFO_HELP_SIG_8_STACK)
+JITHELPER(CORINFO_HELP_DBL2UINT, JIT_Dbl2UInt, CORINFO_HELP_SIG_8_STACK)
 JITHELPER(CORINFO_HELP_DBL2UINT_OVF, JIT_Dbl2UIntOvf, CORINFO_HELP_SIG_8_STACK)
 JITHELPER(CORINFO_HELP_DBL2ULNG, JIT_Dbl2ULng, CORINFO_HELP_SIG_8_STACK)
 JITHELPER(CORINFO_HELP_DBL2ULNG_OVF, JIT_Dbl2ULngOvf, CORINFO_HELP_SIG_8_STACK)
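Several commits in this PR refer to saturating double -> ulong conversion. As a rough illustration of what "saturating" can mean for such a helper, here is a minimal sketch. The name `SaturatingDoubleToUInt64` is hypothetical and this is not the body of `JIT_Dbl2ULng`; it assumes NaN and negative inputs map to 0 and inputs at or above 2^64 map to the maximum `uint64_t`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper for illustration only; NOT the runtime's JIT_Dbl2ULng.
// Assumed policy: NaN and negative values saturate to 0, values >= 2^64
// saturate to UINT64_MAX, everything else truncates toward zero.
uint64_t SaturatingDoubleToUInt64(double value)
{
    // 2^64 is exactly representable as a double and is the first value
    // that no longer fits in uint64_t.
    const double twoPow64 = 18446744073709551616.0;

    // NaN fails every ordered comparison, so !(value > 0.0) is true for
    // NaN as well as for zero and negative inputs.
    if (!(value > 0.0))
    {
        return 0;
    }

    if (value >= twoPow64)
    {
        return std::numeric_limits<uint64_t>::max();
    }

    // In range, so the cast is well defined and truncates toward zero.
    return static_cast<uint64_t>(value);
}
```

Writing the lower check as `!(value > 0.0)` rather than `value < 0.0` is one way to fold the NaN case into the same branch.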
1 change: 1 addition & 0 deletions src/coreclr/jit/assertionprop.cpp
@@ -83,6 +83,7 @@ bool IntegralRange::Contains(int64_t value) const
 {
 case TYP_UBYTE:
 case TYP_USHORT:
+case TYP_UINT:
 return SymbolicIntegerValue::Zero;
 case TYP_BYTE:
 return SymbolicIntegerValue::ByteMin;
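The one-line change above groups `TYP_UINT` with the other unsigned types whose lower bound is zero. A small standalone sketch of that mapping follows. The `IntType` enum and `LowerBoundFor` function are hypothetical stand-ins, not the JIT's `var_types` or `SymbolicIntegerValue`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical type tags for illustration; the JIT's own enum is much larger.
enum class IntType { Byte, UByte, Short, UShort, Int, UInt };

// Returns the smallest value each tagged type can hold.
// All unsigned types share the same lower bound of zero.
int64_t LowerBoundFor(IntType type)
{
    switch (type)
    {
        case IntType::UByte:
        case IntType::UShort:
        case IntType::UInt:
            return 0;
        case IntType::Byte:
            return std::numeric_limits<int8_t>::min();
        case IntType::Short:
            return std::numeric_limits<int16_t>::min();
        case IntType::Int:
            return std::numeric_limits<int32_t>::min();
    }
    return 0; // not reached for the tags listed above
}
```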
11 changes: 7 additions & 4 deletions src/coreclr/jit/codegenxarch.cpp
@@ -7604,21 +7604,24 @@ void CodeGen::genFloatToIntCast(GenTree* treeNode)
 noway_assert((dstSize == EA_ATTR(genTypeSize(TYP_INT))) || (dstSize == EA_ATTR(genTypeSize(TYP_LONG))));

 // We shouldn't be seeing uint64 here as it should have been converted
-// into a helper call by either front-end or lowering phase.
-assert(!varTypeIsUnsigned(dstType) || (dstSize != EA_ATTR(genTypeSize(TYP_LONG))));
+// into a helper call by either front-end or lowering phase, unless we have AVX512F
+// accelerated conversions.
+assert(!varTypeIsUnsigned(dstType) || (dstSize != EA_ATTR(genTypeSize(TYP_LONG))) ||
+compiler->compIsaSupportedDebugOnly(InstructionSet_AVX512F));

 // If the dstType is TYP_UINT, we have 32-bits to encode the
 // float number. Any of 33rd or above bits can be the sign bit.
 // To achieve it we pretend as if we are converting it to a long.
-if (varTypeIsUnsigned(dstType) && (dstSize == EA_ATTR(genTypeSize(TYP_INT))))
+if (varTypeIsUnsigned(dstType) && (dstSize == EA_ATTR(genTypeSize(TYP_INT))) &&
+!compiler->compOpportunisticallyDependsOn(InstructionSet_AVX512F))
 {
 dstType = TYP_LONG;
 }

 // Note that we need to specify dstType here so that it will determine
 // the size of destination integer register and also the rex.w prefix.
 genConsumeOperands(treeNode->AsOp());
-instruction ins = ins_FloatConv(TYP_INT, srcType, emitTypeSize(srcType));
+instruction ins = ins_FloatConv(dstType, srcType, emitTypeSize(srcType));
 GetEmitter()->emitInsBinary(ins, emitTypeSize(dstType), treeNode, op1);
 genProduceReg(treeNode);
 }
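The comment in `genFloatToIntCast` about pretending the destination is a long can be illustrated in plain C++. This is only a sketch of the idea, not the emitted code: `DoubleToUInt32ViaInt64` is a hypothetical name, and it assumes the input is already known to lie in [0, 2^32). Per the diff, this widening is the path taken only when AVX512F is not available.

```cpp
#include <cstdint>

// Illustrative only; mirrors the widening idea from the comment above.
// Precondition (assumed): 0 <= value < 2^32.
uint32_t DoubleToUInt32ViaInt64(double value)
{
    // A signed 32-bit truncating convert cannot represent [2^31, 2^32),
    // so convert into a 64-bit signed integer first.
    int64_t wide = static_cast<int64_t>(value);

    // The result fits in the low 32 bits, so the narrowing keeps it intact.
    return static_cast<uint32_t>(wide);
}
```

For example, `3000000000.5` exceeds `INT32_MAX` but converts cleanly through the 64-bit intermediate.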
8 changes: 8 additions & 0 deletions src/coreclr/jit/compiler.h
@@ -3200,6 +3200,14 @@ class Compiler
 CorInfoType simdBaseJitType,
 unsigned simdSize);

+#if defined(TARGET_XARCH)
+GenTree* gtNewSimdCvtNode(var_types type,
+GenTree* op1,
+CorInfoType simdTargetBaseJitType,
+CorInfoType simdSourceBaseJitType,
+unsigned simdSize);
+#endif //TARGET_XARCH
+
 GenTree* gtNewSimdCreateBroadcastNode(
 var_types type, GenTree* op1, CorInfoType simdBaseJitType, unsigned simdSize);

6 changes: 4 additions & 2 deletions src/coreclr/jit/emit.h
@@ -4000,7 +4000,8 @@ emitAttr emitter::emitGetBaseMemOpSize(instrDesc* id) const
 case INS_comiss:
 case INS_cvtss2sd:
 case INS_cvtss2si:
-case INS_cvttss2si:
+case INS_cvttss2si32:
+case INS_cvttss2si64:
 case INS_divss:
 case INS_extractps:
 case INS_insertps:
@@ -4043,7 +4044,8 @@ emitAttr emitter::emitGetBaseMemOpSize(instrDesc* id) const
 case INS_comisd:
 case INS_cvtsd2si:
 case INS_cvtsd2ss:
-case INS_cvttsd2si:
+case INS_cvttsd2si32:
+case INS_cvttsd2si64:
 case INS_divsd:
 case INS_maxsd:
 case INS_minsd:
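The split of `INS_cvttss2si` and `INS_cvttsd2si` into `32` and `64` variants pairs with the `ins_FloatConv(dstType, ...)` change in `codegenxarch.cpp`, where the destination type now feeds instruction selection. A rough standalone sketch of selecting by source and destination width is shown below. The `Ins` enum, `SelectTruncatingConvert`, and `NeedsRexW` are hypothetical names for illustration and do not reproduce the JIT's `ins_FloatConv` or `TakesRexWPrefix`.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the four truncating scalar converts.
enum class Ins { cvttss2si32, cvttss2si64, cvttsd2si32, cvttsd2si64 };

// Picks a truncating convert from the operand sizes in bytes.
// Assumes srcSize is 4 (float) or 8 (double) and dstSize is 4 or 8.
inline Ins SelectTruncatingConvert(std::size_t srcSize, std::size_t dstSize)
{
    const bool srcIsDouble = (srcSize == 8);
    const bool dstIs64     = (dstSize == 8);

    if (srcIsDouble)
    {
        return dstIs64 ? Ins::cvttsd2si64 : Ins::cvttsd2si32;
    }
    return dstIs64 ? Ins::cvttss2si64 : Ins::cvttss2si32;
}

// In this sketch only the forms with a 64-bit integer destination
// require the REX.W prefix.
inline bool NeedsRexW(Ins ins)
{
    return (ins == Ins::cvttss2si64) || (ins == Ins::cvttsd2si64);
}
```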
41 changes: 23 additions & 18 deletions src/coreclr/jit/emitxarch.cpp
@@ -1522,9 +1522,11 @@ bool emitter::TakesRexWPrefix(const instrDesc* id) const
 switch (ins)
 {
 case INS_cvtss2si:
-case INS_cvttss2si:
+case INS_cvttss2si32:
+case INS_cvttss2si64:
 case INS_cvtsd2si:
-case INS_cvttsd2si:
+case INS_cvttsd2si32:
+case INS_cvttsd2si64:
 case INS_movd:
 case INS_movnti:
 case INS_andn:
@@ -1544,7 +1546,6 @@ bool emitter::TakesRexWPrefix(const instrDesc* id) const
 #endif // TARGET_AMD64
 case INS_vcvtsd2usi:
 case INS_vcvtss2usi:
-case INS_vcvttsd2usi:
 {
 if (attr == EA_8BYTE)
 {
@@ -2723,8 +2724,10 @@ bool emitter::emitInsCanOnlyWriteSSE2OrAVXReg(instrDesc* id)
 case INS_blsmsk:
 case INS_blsr:
 case INS_bzhi:
-case INS_cvttsd2si:
-case INS_cvttss2si:
+case INS_cvttsd2si32:
+case INS_cvttsd2si64:
+case INS_cvttss2si32:
+case INS_cvttss2si64:
 case INS_cvtsd2si:
 case INS_cvtss2si:
 case INS_extractps:
@@ -2748,7 +2751,8 @@ bool emitter::emitInsCanOnlyWriteSSE2OrAVXReg(instrDesc* id)
 #endif
 case INS_vcvtsd2usi:
 case INS_vcvtss2usi:
-case INS_vcvttsd2usi:
+case INS_vcvttsd2usi32:
+case INS_vcvttsd2usi64:
 case INS_vcvttss2usi32:
 case INS_vcvttss2usi64:
 {
@@ -11605,22 +11609,20 @@ void emitter::emitDispIns(
 break;
 }

-case INS_cvttsd2si:
+case INS_cvttsd2si32:
+case INS_cvttsd2si64:
 case INS_cvtss2si:
 case INS_cvtsd2si:
-case INS_cvttss2si:
+case INS_cvttss2si32:
+case INS_cvttss2si64:
 case INS_vcvtsd2usi:
 case INS_vcvtss2usi:
-case INS_vcvttsd2usi:
-{
-printf(" %s, %s", emitRegName(id->idReg1(), attr), emitRegName(id->idReg2(), EA_16BYTE));
-break;
-}
-
+case INS_vcvttsd2usi32:
+case INS_vcvttsd2usi64:
 case INS_vcvttss2usi32:
 case INS_vcvttss2usi64:
 {
-printf(" %s, %s", emitRegName(id->idReg1(), attr), emitRegName(id->idReg2(), EA_4BYTE));
+printf(" %s, %s", emitRegName(id->idReg1(), attr), emitRegName(id->idReg2(), EA_16BYTE));
 break;
 }

@@ -19050,7 +19052,8 @@ emitter::insExecutionCharacteristics emitter::getInsExecutionCharacteristics(ins
 break;
 }

-case INS_cvttsd2si:
+case INS_cvttsd2si32:
+case INS_cvttsd2si64:
 case INS_cvtsd2si:
 case INS_cvtsi2sd32:
 case INS_cvtsi2ss32:
@@ -19059,7 +19062,8 @@ emitter::insExecutionCharacteristics emitter::getInsExecutionCharacteristics(ins
 case INS_vcvtsd2usi:
 case INS_vcvtusi2ss32:
 case INS_vcvtusi2ss64:
-case INS_vcvttsd2usi:
+case INS_vcvttsd2usi32:
+case INS_vcvttsd2usi64:
 case INS_vcvttss2usi32:
 result.insThroughput = PERFSCORE_THROUGHPUT_1C;
 result.insLatency += PERFSCORE_LATENCY_7C;
@@ -19071,7 +19075,8 @@ emitter::insExecutionCharacteristics emitter::getInsExecutionCharacteristics(ins
 result.insLatency += PERFSCORE_LATENCY_5C;
 break;

-case INS_cvttss2si:
+case INS_cvttss2si32:
+case INS_cvttss2si64:
 case INS_cvtss2si:
 case INS_vcvtss2usi:
 result.insThroughput = PERFSCORE_THROUGHPUT_1C;