[WebAssembly SIMD] Support bitwise operations on Intel #7365

ddegazio · 2022-12-09T01:06:29Z

`4dea5d5`

[WebAssembly SIMD] Support bitwise operations on Intel
https://bugs.webkit.org/show_bug.cgi?id=248639
rdar://103089683

Reviewed by Yusuke Suzuki.

Adds support for SIMD bitwise operations (AND, OR, XOR, ANDN, NOT, shifts, any/all_true,
bitmask/select) to the Intel macro assembler. Currently omits shifts and popcnt for i8x16
vectors, since they aren't supported natively on Intel and are pretty involved to emulate.

* Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:
(JSC::MacroAssemblerX86_64::compareIntegerVectorWithZero):
(JSC::MacroAssemblerX86_64::vectorAnd):
(JSC::MacroAssemblerX86_64::vectorAndnot):
(JSC::MacroAssemblerX86_64::vectorOr):
(JSC::MacroAssemblerX86_64::vectorXor):
(JSC::MacroAssemblerX86_64::vectorUshl):
(JSC::MacroAssemblerX86_64::vectorUshr):
(JSC::MacroAssemblerX86_64::vectorSshr):
(JSC::MacroAssemblerX86_64::vectorAnyTrue):
(JSC::MacroAssemblerX86_64::vectorAllTrue):
(JSC::MacroAssemblerX86_64::vectorNot): Deleted.
* Source/JavaScriptCore/assembler/X86Assembler.h:
(JSC::X86Assembler::vandps_rrr):
(JSC::X86Assembler::vandpd_rrr):
(JSC::X86Assembler::vorps_rrr):
(JSC::X86Assembler::vorpd_rrr):
(JSC::X86Assembler::vxorps_rrr):
(JSC::X86Assembler::vxorpd_rrr):
(JSC::X86Assembler::vandnps_rrr):
(JSC::X86Assembler::vandnpd_rrr):
(JSC::X86Assembler::vpmovmskb_rr):
(JSC::X86Assembler::vmovmskps_rr):
(JSC::X86Assembler::vmovmskpd_rr):
(JSC::X86Assembler::vptest_rr):
(JSC::X86Assembler::vpsllw_rrr):
(JSC::X86Assembler::vpslld_rrr):
(JSC::X86Assembler::vpsllq_rrr):
(JSC::X86Assembler::vpsrlw_rrr):
(JSC::X86Assembler::vpsrld_rrr):
(JSC::X86Assembler::vpsrlq_rrr):
(JSC::X86Assembler::vpsraw_rrr):
(JSC::X86Assembler::vpsrad_rrr):
(JSC::X86Assembler::vpsraq_rrr):
(JSC::X86Assembler::andps_rr):
(JSC::X86Assembler::orps_rr):
* Source/JavaScriptCore/b3/air/AirLowerMacros.cpp:
(JSC::B3::Air::lowerMacros):
* Source/JavaScriptCore/b3/air/AirOpcode.opcodes:
* Source/JavaScriptCore/wasm/WasmAirIRGenerator.cpp:
(JSC::Wasm::AirIRGenerator::addSIMDI_V):
(JSC::Wasm::AirIRGenerator::addSIMDV_V):
(JSC::Wasm::AirIRGenerator::addConstant):
(JSC::Wasm::AirIRGenerator::addSIMDShift):

Canonical link: https://commits.webkit.org/257640@main
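
For readers unfamiliar with the operations named in the commit message, the following is a minimal scalar sketch of `andnot` and `bitselect`, not code from the patch. The `V128` type and all function names are hypothetical. It models a 128-bit vector as two 64-bit halves and shows one detail worth keeping in mind when lowering to Intel: Wasm `v128.andnot(a, b)` computes `a & ~b`, whereas the x86 ANDN family (PANDN, ANDNPS, ANDNPD) inverts its first source operand, so the operands have to be swapped.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model of a 128-bit vector as two 64-bit halves.
using V128 = std::array<uint64_t, 2>;

// Wasm v128.andnot(a, b): keeps the bits of a that are clear in b.
V128 wasmAndnot(const V128& a, const V128& b)
{
    return { a[0] & ~b[0], a[1] & ~b[1] };
}

// x86 ANDN-style operation: inverts the FIRST source, then ANDs with the second.
V128 x86Andn(const V128& first, const V128& second)
{
    return { ~first[0] & second[0], ~first[1] & second[1] };
}

// Wasm v128.bitselect(v1, v2, mask): takes bits from v1 where mask is 1, else from v2.
V128 wasmBitselect(const V128& v1, const V128& v2, const V128& mask)
{
    return { (v1[0] & mask[0]) | (v2[0] & ~mask[0]),
             (v1[1] & mask[1]) | (v2[1] & ~mask[1]) };
}
```

Under this model, `wasmAndnot(a, b)` equals `x86Andn(b, a)` for all inputs.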

ab02add

Misc	iOS, tvOS & watchOS	macOS	Linux	Windows
✅ 🧪 style	✅ 🛠 ios	✅ 🛠 mac	✅ 🛠 wpe	~~🛠 🧪 win~~
	✅ 🛠 ios-sim	✅ 🛠 mac-AS-debug	✅ 🛠 gtk	✅ 🛠 wincairo
✅ 🧪 webkitperl	~~🧪 ios-wk2~~	~~🧪 api-mac~~	~~🧪 gtk-wk2~~
	~~🧪 api-ios~~	~~🧪 mac-wk1~~	~~🧪 api-gtk~~
✅ 🛠 🧪 jsc	~~🛠 tv~~	~~🧪 mac-wk2~~	✅ 🛠 jsc-armv7
⏳ 🛠 🧪 jsc-arm64	✅ 🛠 tv-sim	~~🧪 mac-AS-debug-wk2~~	✅ 🧪 jsc-armv7-tests
	~~🛠 watch~~	✅ 🧪 mac-wk2-stress	✅ 🛠 jsc-mips
✅ 🛠 🧪 merge	✅ 🛠 watch-sim		✅ 🧪 jsc-mips-tests

webkit-early-warning-system · 2022-12-09T01:08:03Z

EWS run on previous version of this PR (hash e35e9d0)

Misc	iOS, tvOS & watchOS	macOS	Linux	Windows
✅ 🧪 style	✅ 🛠 ios	✅ 🛠 mac	✅ 🛠 wpe	✅ 🛠 🧪 win
	✅ 🛠 ios-sim	✅ 🛠 mac-AS-debug	✅ 🛠 gtk	✅ 🛠 wincairo
✅ 🧪 webkitperl	✅ 🧪 ios-wk2	✅ 🧪 api-mac	✅ 🧪 gtk-wk2
	✅ 🧪 api-ios	✅ 🧪 mac-wk1	~~🧪 api-gtk~~
✅ 🛠 🧪 jsc	✅ 🛠 tv	✅ 🧪 mac-wk2	✅ 🛠 jsc-armv7
	✅ 🛠 tv-sim	✅ 🧪 mac-AS-debug-wk2	✅ 🧪 jsc-armv7-tests
	✅ 🛠 watch	✅ 🧪 mac-wk2-stress	✅ 🛠 jsc-mips
	✅ 🛠 watch-sim		✅ 🧪 jsc-mips-tests

Constellation

r=me

Constellation · 2022-12-09T01:56:41Z

Source/JavaScriptCore/assembler/X86Assembler.h

+
+    void vandpd_rrr(XMMRegisterID a, XMMRegisterID b, XMMRegisterID dest)
+    {
+        m_formatter.vexNdsLigWigCommutativeTwoByteOp(PRE_OPERAND_SIZE, OP2_ANDPD_VpdWpd, (RegisterID)dest, (RegisterID)a, (RegisterID)b);


PRE_OPERAND_SIZE should be PRE_SSE_66. Please apply this change to the following ones as well.
I also fixed the existing ones in https://github.com/WebKit/WebKit/pull/7370/files

Constellation · 2022-12-09T01:59:37Z

Source/JavaScriptCore/assembler/X86Assembler.h

-        OP3_PCMPGTQ_VdqWdq      = 0x37
+        OP3_PCMPGTQ_VdqWdq      = 0x37,
+        OP3_PMAXSB_VdqWdq       = 0x3C,
+        OP3_PMAXSD_VdqWdq       = 0x3D,
+        OP3_PMAXUW_VdqWdq       = 0x3E,
+        OP3_PMAXUD_VdqWdq       = 0x3F,
+        OP3_PMINSB_VdqWdq       = 0x38,
+        OP3_PMINSD_VdqWdq       = 0x39,
+        OP3_PMINUW_VdqWdq       = 0x3A,
+        OP3_PMINUD_VdqWdq       = 0x3B,
+        OP3_PTEST_VdqWdq        = 0x17


Let's sort based on the numbers.
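
As an illustration of the requested ordering, here is a sketch with the entries from the hunk above arranged by opcode value. The enum name `ThreeByteOpcodeSketch` is hypothetical, and the real enum in X86Assembler.h may contain other entries that are not visible in this hunk. The values are copied from the diff unchanged.

```cpp
#include <cstdint>

// Hypothetical stand-in for the three-byte opcode enum; only the entries
// shown in the reviewed hunk are listed, sorted by numeric value.
enum ThreeByteOpcodeSketch : uint8_t {
    OP3_PTEST_VdqWdq        = 0x17,
    OP3_PCMPGTQ_VdqWdq      = 0x37,
    OP3_PMINSB_VdqWdq       = 0x38,
    OP3_PMINSD_VdqWdq       = 0x39,
    OP3_PMINUW_VdqWdq       = 0x3A,
    OP3_PMINUD_VdqWdq       = 0x3B,
    OP3_PMAXSB_VdqWdq       = 0x3C,
    OP3_PMAXSD_VdqWdq       = 0x3D,
    OP3_PMAXUW_VdqWdq       = 0x3E,
    OP3_PMAXUD_VdqWdq       = 0x3F
};
```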

Constellation · 2022-12-09T01:59:51Z

Source/JavaScriptCore/assembler/X86Assembler.h

+        OP2_PMINUB_VdqWdq               = 0xDA,
+        OP2_PSLLW_VdqWdq                = 0xF1,
+        OP2_PSLLD_VdqWdq                = 0xF2,
+        OP2_PSLLQ_VdqWdq                = 0xF3,
+        OP2_PSRLW_VdqWdq                = 0xD1,
+        OP2_PSRLD_VdqWdq                = 0xD2,
+        OP2_PSRLQ_VdqWdq                = 0xD3,
+        OP2_PSRAW_VdqWdq                = 0xE1,
+        OP2_PSRAD_VdqWdq                = 0xE2,
+        OP2_PSRAQ_VdqWdq                = 0xE3,
+        OP2_PMOVMSKB_EqWdq              = 0xD7,
+        OP2_MOVMSKPS_EqWps              = 0x50,
+        OP2_MOVMSKPD_EqWpd              = 0x50


Ditto, let's sort based on numbers.

Constellation · 2022-12-09T02:51:51Z

Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h

+            m_assembler.vpcmpeqb_rr(vec, scratch, scratch);
+            break;
+        case SIMDLane::i16x8:
+            m_assembler.vpcmpeqw_rr(vec, scratch, scratch);
+            break;
+        case SIMDLane::i32x4:
+            m_assembler.vpcmpeqd_rr(vec, scratch, scratch);
+            break;
+        case SIMDLane::i64x2:
+            m_assembler.vpcmpeqq_rr(vec, scratch, scratch);


These take three register operands, so they should be the `_rrr` variants.
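
For context, the hunk above compares each lane of `vec` against a zeroed scratch register. A portable scalar sketch of that idea for the i32x4 case follows. The type and function names are hypothetical, and the exact sequence the patch uses to turn the comparison mask into an `all_true` or `any_true` result is not visible in this excerpt, so the reductions below are only an assumption about the intent.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of an i32x4 vector.
using I32x4 = std::array<uint32_t, 4>;

// Models a lane-wise equality compare against a zeroed register:
// a lane becomes all-ones if the input lane is zero, and zero otherwise.
I32x4 compareWithZero(const I32x4& vec)
{
    I32x4 mask {};
    for (std::size_t i = 0; i < vec.size(); ++i)
        mask[i] = vec[i] == 0 ? 0xFFFFFFFFu : 0u;
    return mask;
}

// all_true holds when no lane compared equal to zero, i.e. the mask is entirely zero.
bool allTrue(const I32x4& vec)
{
    for (uint32_t lane : compareWithZero(vec)) {
        if (lane)
            return false;
    }
    return true;
}

// any_true holds when at least one bit of the whole vector is set.
bool anyTrue(const I32x4& vec)
{
    for (uint32_t lane : vec) {
        if (lane)
            return true;
    }
    return false;
}
```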

Constellation · 2022-12-09T02:51:55Z

Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h

+    {
+        RELEASE_ASSERT(supportsAVXForSIMD());
+
+        m_assembler.vpxor_rr(scratch, scratch, scratch); // Zero scratch register.


Constellation · 2022-12-09T02:52:05Z

Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h

+            m_assembler.vpxor_rr(tmp, tmp, tmp);
+            m_assembler.vpacksswb_rr(vec, tmp, tmp);


Constellation · 2022-12-09T05:38:45Z

Source/JavaScriptCore/assembler/X86Assembler.h

+        m_formatter.vexNdsLigWigTwoByteOp(PRE_OPERAND_SIZE, OP2_PSRAW_VdqWdq, (RegisterID)dest, (RegisterID)input, (RegisterID)shift);
+    }
+
+    void vpsrad_rrr(XMMRegisterID input, XMMRegisterID shift, XMMRegisterID dest)


I think this operand ordering needs to be changed to match AT&T style. Currently, it does not.

webkit-early-warning-system · 2022-12-09T18:25:57Z

EWS run on current version of this PR (hash ab02add)

Misc	iOS, tvOS & watchOS	macOS	Linux	Windows
✅ 🧪 style	✅ 🛠 ios	✅ 🛠 mac	✅ 🛠 wpe	~~🛠 🧪 win~~
	✅ 🛠 ios-sim	✅ 🛠 mac-AS-debug	✅ 🛠 gtk	✅ 🛠 wincairo
✅ 🧪 webkitperl	~~🧪 ios-wk2~~	~~🧪 api-mac~~	~~🧪 gtk-wk2~~
	~~🧪 api-ios~~	~~🧪 mac-wk1~~	~~🧪 api-gtk~~
✅ 🛠 🧪 jsc	~~🛠 tv~~	~~🧪 mac-wk2~~	✅ 🛠 jsc-armv7
⏳ 🛠 🧪 jsc-arm64	✅ 🛠 tv-sim	~~🧪 mac-AS-debug-wk2~~	✅ 🧪 jsc-armv7-tests
	~~🛠 watch~~	✅ 🧪 mac-wk2-stress	✅ 🛠 jsc-mips
✅ 🛠 🧪 merge	✅ 🛠 watch-sim		✅ 🧪 jsc-mips-tests


webkit-commit-queue · 2022-12-09T19:50:08Z

Committed 257640@main (4dea5d5): https://commits.webkit.org/257640@main

Reviewed commits have been landed. Closing PR #7365 and removing active labels.

ddegazio requested a review from a team as a code owner December 9, 2022 01:06

ddegazio self-assigned this Dec 9, 2022

ddegazio added the WebAssembly label (for bugs in JavaScript WebAssembly) Dec 9, 2022

Constellation approved these changes Dec 9, 2022

Constellation reviewed Dec 9, 2022

ddegazio force-pushed the eng/WebAssembly-SIMD-Support-bitwise-operations-on-Intel branch from e35e9d0 to ab02add Compare December 9, 2022 18:25

ddegazio added the merge-queue Applied to send a pull request to merge-queue label Dec 9, 2022

webkit-commit-queue force-pushed the eng/WebAssembly-SIMD-Support-bitwise-operations-on-Intel branch from ab02add to 4dea5d5 Compare December 9, 2022 19:49

webkit-commit-queue merged commit 4dea5d5 into WebKit:main Dec 9, 2022

webkit-commit-queue removed the merge-queue Applied to send a pull request to merge-queue label Dec 9, 2022
