[WebAssembly SIMD] Support bitwise operations on Intel #7365

@ddegazio ddegazio commented Dec 9, 2022

4dea5d5

[WebAssembly SIMD] Support bitwise operations on Intel
https://bugs.webkit.org/show_bug.cgi?id=248639
rdar://103089683

Reviewed by Yusuke Suzuki.

Adds support for SIMD bitwise operations (AND, OR, XOR, ANDN, NOT, shifts, any/all_true,
bitmask/select) to the Intel macro assembler. Currently omits shifts and popcnt for i8x16
vectors, since they aren't supported natively on Intel and are pretty involved to emulate.

* Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:
(JSC::MacroAssemblerX86_64::compareIntegerVectorWithZero):
(JSC::MacroAssemblerX86_64::vectorAnd):
(JSC::MacroAssemblerX86_64::vectorAndnot):
(JSC::MacroAssemblerX86_64::vectorOr):
(JSC::MacroAssemblerX86_64::vectorXor):
(JSC::MacroAssemblerX86_64::vectorUshl):
(JSC::MacroAssemblerX86_64::vectorUshr):
(JSC::MacroAssemblerX86_64::vectorSshr):
(JSC::MacroAssemblerX86_64::vectorAnyTrue):
(JSC::MacroAssemblerX86_64::vectorAllTrue):
(JSC::MacroAssemblerX86_64::vectorNot): Deleted.
* Source/JavaScriptCore/assembler/X86Assembler.h:
(JSC::X86Assembler::vandps_rrr):
(JSC::X86Assembler::vandpd_rrr):
(JSC::X86Assembler::vorps_rrr):
(JSC::X86Assembler::vorpd_rrr):
(JSC::X86Assembler::vxorps_rrr):
(JSC::X86Assembler::vxorpd_rrr):
(JSC::X86Assembler::vandnps_rrr):
(JSC::X86Assembler::vandnpd_rrr):
(JSC::X86Assembler::vpmovmskb_rr):
(JSC::X86Assembler::vmovmskps_rr):
(JSC::X86Assembler::vmovmskpd_rr):
(JSC::X86Assembler::vptest_rr):
(JSC::X86Assembler::vpsllw_rrr):
(JSC::X86Assembler::vpslld_rrr):
(JSC::X86Assembler::vpsllq_rrr):
(JSC::X86Assembler::vpsrlw_rrr):
(JSC::X86Assembler::vpsrld_rrr):
(JSC::X86Assembler::vpsrlq_rrr):
(JSC::X86Assembler::vpsraw_rrr):
(JSC::X86Assembler::vpsrad_rrr):
(JSC::X86Assembler::vpsraq_rrr):
(JSC::X86Assembler::andps_rr):
(JSC::X86Assembler::orps_rr):
* Source/JavaScriptCore/b3/air/AirLowerMacros.cpp:
(JSC::B3::Air::lowerMacros):
* Source/JavaScriptCore/b3/air/AirOpcode.opcodes:
* Source/JavaScriptCore/wasm/WasmAirIRGenerator.cpp:
(JSC::Wasm::AirIRGenerator::addSIMDI_V):
(JSC::Wasm::AirIRGenerator::addSIMDV_V):
(JSC::Wasm::AirIRGenerator::addConstant):
(JSC::Wasm::AirIRGenerator::addSIMDShift):

Canonical link: https://commits.webkit.org/257640@main

ab02add

Misc iOS, tvOS & watchOS macOS Linux Windows
✅ 🧪 style ✅ 🛠 ios ✅ 🛠 mac ✅ 🛠 wpe   🛠 🧪 win
✅ 🛠 ios-sim ✅ 🛠 mac-AS-debug ✅ 🛠 gtk ✅ 🛠 wincairo
✅ 🧪 webkitperl   🧪 ios-wk2   🧪 api-mac   🧪 gtk-wk2
  🧪 api-ios   🧪 mac-wk1   🧪 api-gtk
✅ 🛠 🧪 jsc   🛠 tv   🧪 mac-wk2 ✅ 🛠 jsc-armv7
⏳ 🛠 🧪 jsc-arm64 ✅ 🛠 tv-sim   🧪 mac-AS-debug-wk2 ✅ 🧪 jsc-armv7-tests
  🛠 watch ✅ 🧪 mac-wk2-stress ✅ 🛠 jsc-mips
✅ 🛠 🧪 merge ✅ 🛠 watch-sim ✅ 🧪 jsc-mips-tests

@ddegazio ddegazio requested a review from a team as a code owner December 9, 2022 01:06
@ddegazio ddegazio self-assigned this Dec 9, 2022
@ddegazio ddegazio added the WebAssembly For bugs in JavaScript WebAssembly label Dec 9, 2022

@Constellation Constellation left a comment

r=me


void vandpd_rrr(XMMRegisterID a, XMMRegisterID b, XMMRegisterID dest)
{
m_formatter.vexNdsLigWigCommutativeTwoByteOp(PRE_OPERAND_SIZE, OP2_ANDPD_VpdWpd, (RegisterID)dest, (RegisterID)a, (RegisterID)b);

PRE_OPERAND_SIZE should be PRE_SSE_66. Please apply this change to the functions that follow as well.
I also fixed the existing ones in https://github.com/WebKit/WebKit/pull/7370/files
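
For context on why the prefix constant matters here: in a VEX-encoded instruction the legacy 0x66 byte is not emitted on its own; it is folded into the two-bit `pp` field of the VEX prefix. The sketch below hand-encodes `vandpd xmm0, xmm1, xmm2` (Intel operand order) with a two-byte VEX prefix. `SimdPrefix` and `encodeVex2` are hypothetical names for illustration and are not WebKit's formatter API; the sketch assumes 128-bit operands and registers xmm0 through xmm7 only.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical illustration, not WebKit's formatter.
// Values are the VEX "pp" field encodings for the implied legacy prefix.
enum class SimdPrefix : uint8_t { None = 0, P66 = 1, PF3 = 2, PF2 = 3 };

// Encodes "op dest, src1, src2" (Intel order) using a two-byte VEX prefix.
// Assumes 128-bit vectors (L = 0) and register numbers 0..7.
inline std::array<uint8_t, 4> encodeVex2(SimdPrefix pp, uint8_t opcode,
                                         unsigned dest, unsigned src1, unsigned src2)
{
    // Second prefix byte: ~R (bit 7), ~vvvv (bits 6..3), L (bit 2), pp (bits 1..0).
    uint8_t byte1 = 0x80 | ((~src1 & 0xF) << 3) | static_cast<uint8_t>(pp);
    // ModRM with mod = 11 (register-direct), reg = dest, rm = src2.
    uint8_t modrm = 0xC0 | ((dest & 7) << 3) | (src2 & 7);
    return { { 0xC5, byte1, opcode, modrm } };
}
```

Under these assumptions, `encodeVex2(SimdPrefix::P66, 0x54, 0, 1, 2)` yields the bytes C5 F1 54 C2, where 0x54 is the ANDPD opcode quoted in the snippet above as `OP2_ANDPD_VpdWpd`.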

Comment on lines 422 to 439
OP3_PCMPGTQ_VdqWdq = 0x37
OP3_PCMPGTQ_VdqWdq = 0x37,
OP3_PMAXSB_VdqWdq = 0x3C,
OP3_PMAXSD_VdqWdq = 0x3D,
OP3_PMAXUW_VdqWdq = 0x3E,
OP3_PMAXUD_VdqWdq = 0x3F,
OP3_PMINSB_VdqWdq = 0x38,
OP3_PMINSD_VdqWdq = 0x39,
OP3_PMINUW_VdqWdq = 0x3A,
OP3_PMINUD_VdqWdq = 0x3B,
OP3_PTEST_VdqWdq = 0x17

Let's sort based on the numbers.
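
One possible reading of this request, as a sketch: the same entries from the snippet above, reordered by opcode value. The enum name `ThreeByteOpcodeSketch` and the helper `isSortedByValue` are hypothetical and only illustrate the ordering; the entry names and values are copied from the snippet.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Hypothetical enum name; entries and values are taken from the reviewed snippet.
enum ThreeByteOpcodeSketch : uint8_t {
    OP3_PTEST_VdqWdq = 0x17,
    OP3_PCMPGTQ_VdqWdq = 0x37,
    OP3_PMINSB_VdqWdq = 0x38,
    OP3_PMINSD_VdqWdq = 0x39,
    OP3_PMINUW_VdqWdq = 0x3A,
    OP3_PMINUD_VdqWdq = 0x3B,
    OP3_PMAXSB_VdqWdq = 0x3C,
    OP3_PMAXSD_VdqWdq = 0x3D,
    OP3_PMAXUW_VdqWdq = 0x3E,
    OP3_PMAXUD_VdqWdq = 0x3F,
};

// Checks that the entries, listed in declaration order, ascend by value.
inline bool isSortedByValue()
{
    const uint8_t values[] = {
        OP3_PTEST_VdqWdq, OP3_PCMPGTQ_VdqWdq, OP3_PMINSB_VdqWdq, OP3_PMINSD_VdqWdq,
        OP3_PMINUW_VdqWdq, OP3_PMINUD_VdqWdq, OP3_PMAXSB_VdqWdq, OP3_PMAXSD_VdqWdq,
        OP3_PMAXUW_VdqWdq, OP3_PMAXUD_VdqWdq,
    };
    return std::is_sorted(std::begin(values), std::end(values));
}
```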

Comment on lines 384 to 396
OP2_PMINUB_VdqWdq = 0xDA,
OP2_PSLLW_VdqWdq = 0xF1,
OP2_PSLLD_VdqWdq = 0xF2,
OP2_PSLLQ_VdqWdq = 0xF3,
OP2_PSRLW_VdqWdq = 0xD1,
OP2_PSRLD_VdqWdq = 0xD2,
OP2_PSRLQ_VdqWdq = 0xD3,
OP2_PSRAW_VdqWdq = 0xE1,
OP2_PSRAD_VdqWdq = 0xE2,
OP2_PSRAQ_VdqWdq = 0xE3,
OP2_PMOVMSKB_EqWdq = 0xD7,
OP2_MOVMSKPS_EqWps = 0x50,
OP2_MOVMSKPD_EqWpd = 0x50

Ditto, let's sort based on numbers.

Comment on lines 3304 to 3313
m_assembler.vpcmpeqb_rr(vec, scratch, scratch);
break;
case SIMDLane::i16x8:
m_assembler.vpcmpeqw_rr(vec, scratch, scratch);
break;
case SIMDLane::i32x4:
m_assembler.vpcmpeqd_rr(vec, scratch, scratch);
break;
case SIMDLane::i64x2:
m_assembler.vpcmpeqq_rr(vec, scratch, scratch);

They are rrr.

{
RELEASE_ASSERT(supportsAVXForSIMD());

m_assembler.vpxor_rr(scratch, scratch, scratch); // Zero scratch register.

Ditto.
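
As a reading aid, here is a scalar model of the idea the snippets above appear to implement: zero a scratch vector, compare each lane of the input against it for equality, and derive all_true from the resulting mask. This is an assumption about intent based on the function names in the commit message (`compareIntegerVectorWithZero`, `vectorAnyTrue`, `vectorAllTrue`), not the actual macro assembler code. `Vec4`, `compareWithZero`, `allTrue` and `anyTrue` are hypothetical names, and only the i32x4 shape is modeled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>; // Hypothetical model of an i32x4 vector.

// Models "vpxor scratch, scratch, scratch" followed by a per-lane equality
// compare against the input: a lane becomes all-ones where the input lane is zero.
inline Vec4 compareWithZero(const Vec4& vec)
{
    Vec4 mask {};
    for (std::size_t i = 0; i < vec.size(); ++i)
        mask[i] = vec[i] == 0 ? 0xFFFFFFFFu : 0u;
    return mask;
}

// all_true: no lane is zero, i.e. the compare mask is entirely zero
// (the condition a ptest of the mask against itself would report).
inline bool allTrue(const Vec4& vec)
{
    for (uint32_t lane : compareWithZero(vec)) {
        if (lane)
            return false;
    }
    return true;
}

// any_true: at least one bit of the vector is set.
inline bool anyTrue(const Vec4& vec)
{
    for (uint32_t lane : vec) {
        if (lane)
            return true;
    }
    return false;
}
```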

Comment on lines 3332 to 3333
m_assembler.vpxor_rr(tmp, tmp, tmp);
m_assembler.vpacksswb_rr(vec, tmp, tmp);

Ditto.

m_formatter.vexNdsLigWigTwoByteOp(PRE_OPERAND_SIZE, OP2_PSRAW_VdqWdq, (RegisterID)dest, (RegisterID)input, (RegisterID)shift);
}

void vpsrad_rrr(XMMRegisterID input, XMMRegisterID shift, XMMRegisterID dest)

I think this operand ordering needs to be changed to match AT&T style. Currently, it does not.
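
To make the ordering difference concrete, the sketch below prints the same three-operand instruction in Intel and AT&T syntax. AT&T syntax lists the operands in the reverse of Intel order, with the destination last. The helpers `xmm`, `formatIntel` and `formatATT` are hypothetical and exist only for this illustration; how the `_rrr` parameters should map onto that order is left to the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers for illustration only.
inline std::string xmm(unsigned reg) { return "xmm" + std::to_string(reg); }

// Intel syntax: destination first, then the sources.
inline std::string formatIntel(const std::string& mnemonic, unsigned dest, unsigned src1, unsigned src2)
{
    return mnemonic + " " + xmm(dest) + ", " + xmm(src1) + ", " + xmm(src2);
}

// AT&T syntax: operands reversed relative to Intel, destination last, '%' on registers.
inline std::string formatATT(const std::string& mnemonic, unsigned dest, unsigned src1, unsigned src2)
{
    return mnemonic + " %" + xmm(src2) + ", %" + xmm(src1) + ", %" + xmm(dest);
}
```

With dest = xmm0, input = xmm1 and shift = xmm2, the Intel form is `vpsrad xmm0, xmm1, xmm2` and the AT&T form is `vpsrad %xmm2, %xmm1, %xmm0`.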

@ddegazio ddegazio force-pushed the eng/WebAssembly-SIMD-Support-bitwise-operations-on-Intel branch from e35e9d0 to ab02add Compare December 9, 2022 18:25
@ddegazio ddegazio added the merge-queue Applied to send a pull request to merge-queue label Dec 9, 2022
@webkit-commit-queue webkit-commit-queue force-pushed the eng/WebAssembly-SIMD-Support-bitwise-operations-on-Intel branch from ab02add to 4dea5d5 Compare December 9, 2022 19:49
@webkit-commit-queue webkit-commit-queue merged commit 4dea5d5 into WebKit:main Dec 9, 2022
@webkit-commit-queue

Committed 257640@main (4dea5d5): https://commits.webkit.org/257640@main

Reviewed commits have been landed. Closing PR #7365 and removing active labels.

@webkit-commit-queue webkit-commit-queue removed the merge-queue Applied to send a pull request to merge-queue label Dec 9, 2022