[WebAssembly SIMD] Support bitwise operations on Intel #7365
EWS run on previous version of this PR (hash e35e9d0)
r=me
void vandpd_rrr(XMMRegisterID a, XMMRegisterID b, XMMRegisterID dest)
{
    m_formatter.vexNdsLigWigCommutativeTwoByteOp(PRE_OPERAND_SIZE, OP2_ANDPD_VpdWpd, (RegisterID)dest, (RegisterID)a, (RegisterID)b);
`PRE_OPERAND_SIZE` should be `PRE_SSE_66`. Please apply this change to the following ones as well.
I also fixed the existing ones in https://github.com/WebKit/WebKit/pull/7370/files
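For readers following the prefix comment: in a VEX-encoded instruction the legacy 0x66 / 0xF3 / 0xF2 bytes are not emitted as separate prefixes; they are folded into the two-bit `pp` field of the VEX prefix. The sketch below shows that mapping for the 2-byte VEX form, register-to-register, xmm0..xmm7 only. `SimdPrefix`, `vexPP`, and `encodeVex128` are hypothetical names made up for illustration and are not WebKit's `X86Assembler` API. Whether `PRE_OPERAND_SIZE` and `PRE_SSE_66` share the same byte value in WebKit is not stated in this thread; if they do, the rename is about intent rather than emitted bytes.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical enum for illustration; not WebKit's prefix constants.
enum class SimdPrefix : uint8_t { None = 0x00, P66 = 0x66, PF3 = 0xF3, PF2 = 0xF2 };

// Map a legacy mandatory prefix to the 2-bit VEX.pp field.
inline uint8_t vexPP(SimdPrefix p)
{
    switch (p) {
    case SimdPrefix::None: return 0x0;
    case SimdPrefix::P66:  return 0x1;
    case SimdPrefix::PF3:  return 0x2;
    case SimdPrefix::PF2:  return 0x3;
    }
    return 0x0;
}

// Encode a 128-bit, 0F-map, register-only instruction using the 2-byte VEX form.
// Assumes reg, vvvv and rm are all in xmm0..xmm7, so no extension bits are needed.
//   byte 1 layout: ~R (1 bit) | ~vvvv (4 bits) | L (1 bit, 0 = 128-bit) | pp (2 bits)
inline std::array<uint8_t, 4> encodeVex128(SimdPrefix p, uint8_t opcode, uint8_t reg, uint8_t vvvv, uint8_t rm)
{
    uint8_t byte1 = static_cast<uint8_t>(0x80 | ((~vvvv & 0xF) << 3) | vexPP(p));
    uint8_t modrm = static_cast<uint8_t>(0xC0 | ((reg & 7) << 3) | (rm & 7));
    return {{ 0xC5, byte1, opcode, modrm }};
}
```

As a usage example, `encodeVex128(SimdPrefix::P66, 0x54, 0, 1, 2)` corresponds to `vandpd xmm0, xmm1, xmm2` in Intel syntax, with the 0x66 prefix carried in `pp` rather than as a leading byte.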
OP3_PCMPGTQ_VdqWdq = 0x37
OP3_PCMPGTQ_VdqWdq = 0x37,
OP3_PMAXSB_VdqWdq = 0x3C,
OP3_PMAXSD_VdqWdq = 0x3D,
OP3_PMAXUW_VdqWdq = 0x3E,
OP3_PMAXUD_VdqWdq = 0x3F,
OP3_PMINSB_VdqWdq = 0x38,
OP3_PMINSD_VdqWdq = 0x39,
OP3_PMINUW_VdqWdq = 0x3A,
OP3_PMINUD_VdqWdq = 0x3B,
OP3_PTEST_VdqWdq = 0x17
Let's sort based on the numbers.
OP2_PMINUB_VdqWdq = 0xDA,
OP2_PSLLW_VdqWdq = 0xF1,
OP2_PSLLD_VdqWdq = 0xF2,
OP2_PSLLQ_VdqWdq = 0xF3,
OP2_PSRLW_VdqWdq = 0xD1,
OP2_PSRLD_VdqWdq = 0xD2,
OP2_PSRLQ_VdqWdq = 0xD3,
OP2_PSRAW_VdqWdq = 0xE1,
OP2_PSRAD_VdqWdq = 0xE2,
OP2_PSRAQ_VdqWdq = 0xE3,
OP2_PMOVMSKB_EqWdq = 0xD7,
OP2_MOVMSKPS_EqWps = 0x50,
OP2_MOVMSKPD_EqWpd = 0x50
Ditto, let's sort based on numbers.
    m_assembler.vpcmpeqb_rr(vec, scratch, scratch);
    break;
case SIMDLane::i16x8:
    m_assembler.vpcmpeqw_rr(vec, scratch, scratch);
    break;
case SIMDLane::i32x4:
    m_assembler.vpcmpeqd_rr(vec, scratch, scratch);
    break;
case SIMDLane::i64x2:
    m_assembler.vpcmpeqq_rr(vec, scratch, scratch);
They are `rrr`.
{
    RELEASE_ASSERT(supportsAVXForSIMD());

    m_assembler.vpxor_rr(scratch, scratch, scratch); // Zero scratch register.
Ditto.
m_assembler.vpxor_rr(tmp, tmp, tmp);
m_assembler.vpacksswb_rr(vec, tmp, tmp);
Ditto.
    m_formatter.vexNdsLigWigTwoByteOp(PRE_OPERAND_SIZE, OP2_PSRAW_VdqWdq, (RegisterID)dest, (RegisterID)input, (RegisterID)shift);
}

void vpsrad_rrr(XMMRegisterID input, XMMRegisterID shift, XMMRegisterID dest)
I think this operand ordering needs to be changed to align with AT&T style. Currently, it is not aligned.
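To make the ordering point concrete: Intel syntax writes `vpsraw dest, input, count`, while AT&T syntax reverses the operands to `vpsraw %count, %input, %dest`. A sketch of an AT&T-ordered wrapper follows, assuming the instruction form `VEX.128.66.0F.WIG E1 /r` with ModRM.reg = dest, VEX.vvvv = input, and ModRM.rm = count. The names `Xmm`, `encodeVpsraw`, and `vpsraw_att` are hypothetical and only cover xmm0..xmm7; this is not the signature the PR ended up with.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an XMM register id (xmm0..xmm7 only).
using Xmm = uint8_t;

// Intel operand order: dest, input, count.
// ModRM.reg = dest, VEX.vvvv = input (value being shifted), ModRM.rm = count.
inline std::array<uint8_t, 4> encodeVpsraw(Xmm dest, Xmm input, Xmm count)
{
    // 2-byte VEX: ~R = 1, ~vvvv, L = 0 (128-bit), pp = 01 (0x66 prefix).
    uint8_t byte1 = static_cast<uint8_t>(0x80 | ((~input & 0xF) << 3) | 0x1);
    uint8_t modrm = static_cast<uint8_t>(0xC0 | ((dest & 7) << 3) | (count & 7));
    return {{ 0xC5, byte1, 0xE1, modrm }};
}

// AT&T operand order: sources reversed relative to Intel, destination last.
//   vpsraw %count, %input, %dest
inline std::array<uint8_t, 4> vpsraw_att(Xmm count, Xmm input, Xmm dest)
{
    return encodeVpsraw(dest, input, count);
}
```

With this shape, `vpsraw_att(2, 1, 0)` reads the same as the AT&T line `vpsraw %xmm2, %xmm1, %xmm0`, which is the property the review comment asks for.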
e35e9d0 to ab02add
https://bugs.webkit.org/show_bug.cgi?id=248639
rdar://103089683

Reviewed by Yusuke Suzuki.

Adds support for SIMD bitwise operations (AND, OR, XOR, ANDN, NOT, shifts, any/all_true, bitmask/select) to the Intel macro assembler. Currently omits shifts and popcnt for i8x16 vectors, since they aren't supported natively on Intel and are pretty involved to emulate.

* Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:
(JSC::MacroAssemblerX86_64::compareIntegerVectorWithZero):
(JSC::MacroAssemblerX86_64::vectorAnd):
(JSC::MacroAssemblerX86_64::vectorAndnot):
(JSC::MacroAssemblerX86_64::vectorOr):
(JSC::MacroAssemblerX86_64::vectorXor):
(JSC::MacroAssemblerX86_64::vectorUshl):
(JSC::MacroAssemblerX86_64::vectorUshr):
(JSC::MacroAssemblerX86_64::vectorSshr):
(JSC::MacroAssemblerX86_64::vectorAnyTrue):
(JSC::MacroAssemblerX86_64::vectorAllTrue):
(JSC::MacroAssemblerX86_64::vectorNot): Deleted.
* Source/JavaScriptCore/assembler/X86Assembler.h:
(JSC::X86Assembler::vandps_rrr):
(JSC::X86Assembler::vandpd_rrr):
(JSC::X86Assembler::vorps_rrr):
(JSC::X86Assembler::vorpd_rrr):
(JSC::X86Assembler::vxorps_rrr):
(JSC::X86Assembler::vxorpd_rrr):
(JSC::X86Assembler::vandnps_rrr):
(JSC::X86Assembler::vandnpd_rrr):
(JSC::X86Assembler::vpmovmskb_rr):
(JSC::X86Assembler::vmovmskps_rr):
(JSC::X86Assembler::vmovmskpd_rr):
(JSC::X86Assembler::vptest_rr):
(JSC::X86Assembler::vpsllw_rrr):
(JSC::X86Assembler::vpslld_rrr):
(JSC::X86Assembler::vpsllq_rrr):
(JSC::X86Assembler::vpsrlw_rrr):
(JSC::X86Assembler::vpsrld_rrr):
(JSC::X86Assembler::vpsrlq_rrr):
(JSC::X86Assembler::vpsraw_rrr):
(JSC::X86Assembler::vpsrad_rrr):
(JSC::X86Assembler::vpsraq_rrr):
(JSC::X86Assembler::andps_rr):
(JSC::X86Assembler::orps_rr):
* Source/JavaScriptCore/b3/air/AirLowerMacros.cpp:
(JSC::B3::Air::lowerMacros):
* Source/JavaScriptCore/b3/air/AirOpcode.opcodes:
* Source/JavaScriptCore/wasm/WasmAirIRGenerator.cpp:
(JSC::Wasm::AirIRGenerator::addSIMDI_V):
(JSC::Wasm::AirIRGenerator::addSIMDV_V):
(JSC::Wasm::AirIRGenerator::addConstant):
(JSC::Wasm::AirIRGenerator::addSIMDShift):

Canonical link: https://commits.webkit.org/257640@main
ab02add to 4dea5d5
Committed 257640@main (4dea5d5): https://commits.webkit.org/257640@main
Reviewed commits have been landed. Closing PR #7365 and removing active labels.