[JSC] Add JIT optimizations for ResizableArrayBuffers
https://bugs.webkit.org/show_bug.cgi?id=248206
rdar://problem/102597308

Reviewed by Ross Kirsling.

This patch adds JIT optimizations for resizable ArrayBuffers. The generated code is not yet tightly optimized (in terms of code size in particular),
but it already offers a large improvement, so this is a good first implementation.

1. We add JIT optimizations for TypedArray intrinsic getters. They are implemented in IntrinsicEmitter.
2. We add JIT AccessCase optimizations for resizable TypedArrays. The IC can detect resizable TypedArrays and generate IndexedResizableTypedArray* ICs.
   We do not extend the existing TypedArray ICs to handle resizable TypedArrays, since we would like to keep the existing ICs tightly optimized.
   The ICs that handle resizable TypedArrays gracefully are generated only once we have observed resizable TypedArrays.
3. We annotate ArrayProfile based on profiling and DFG OSR exits so that the DFG / FTL know whether resizable TypedArrays were observed. Based on that, we optimize
   DFG / FTL nodes handling TypedArrays. When we have not observed resizable TypedArrays, a resizable TypedArray causes an OSR exit, which keeps the nodes tightly
   optimized and avoids reporting pessimistic clobbering information.
4. We implement DFG / FTL nodes handling resizable TypedArrays, using the JIT code generation from (1) and (2). Ideally, the FTL would generate B3 nodes
   for this instead of using a patchpoint, but B3 currently lacks AtomicLoad nodes, so for now the FTL optimization is implemented with a patchpoint.
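
The out-of-bounds and length-tracking behavior that these getters and nodes have to handle can be sketched in plain C++. This is a simplified model, not JSC's implementation: `ModelBuffer`, `ModelUint8View`, and their members are hypothetical names, the element size is fixed at one byte, and the acquire load only stands in for the atomic loads this patch adds to the macro assemblers.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical, simplified stand-ins; these are not JSC's real classes.
struct ModelBuffer {
    std::atomic<std::size_t> byteLength{0}; // current size, may change on resize
    std::size_t maxByteLength{0};           // upper bound fixed at creation
};

struct ModelUint8View {
    const ModelBuffer* buffer;
    std::size_t byteOffset;                 // offset requested at construction
    std::optional<std::size_t> fixedLength; // nullopt means a length-tracking view

    // True when the view no longer fits inside the current buffer.
    bool isOutOfBounds(std::size_t bufferLength) const {
        if (byteOffset > bufferLength)
            return true;
        return fixedLength && byteOffset + *fixedLength > bufferLength;
    }

    // Models `array.length` (and `array.byteLength`, since elements are one byte).
    std::size_t length() const {
        std::size_t bufferLength = buffer->byteLength.load(std::memory_order_acquire);
        if (isOutOfBounds(bufferLength))
            return 0;
        return fixedLength ? *fixedLength : bufferLength - byteOffset;
    }

    // Models `array.byteOffset`, which reads as 0 while the view is out of bounds.
    std::size_t observedByteOffset() const {
        std::size_t bufferLength = buffer->byteLength.load(std::memory_order_acquire);
        return isOutOfBounds(bufferLength) ? 0 : byteOffset;
    }
};
```

A non-resizable TypedArray needs none of these checks, which is one way to read why the patch keeps the existing fast paths separate and only takes the slower path once resizable TypedArrays have been observed.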

This patch makes the emscripten-cube2hash-resizable benchmark about 2x faster.

                                            ToT                     Patched

emscripten-cube2hash-resizable       19.1501+-0.0248     ^      9.1659+-0.0471        ^ definitely 2.0893x faster

* JSTests/microbenchmarks/emscripten-cube2hash-resizable.js: Added.
(key.in.Module.Module.hasOwnProperty):
(ENVIRONMENT_IS_NODE.Module.string_appeared_here):
(else.Module.string_appeared_here):
(else.else.Module.string_appeared_here):
(else):
(else.else):
(globalEval):
(Module.string_appeared_here.string_appeared_here.Module.string_appeared_here.Module.string_appeared_here):
(Module.string_appeared_here.Module.string_appeared_here):
(key.in.moduleOverrides.moduleOverrides.hasOwnProperty):
(Runtime.stackSave):
(Runtime.stackRestore):
(Runtime.forceAlign):
(Runtime.isNumberType):
(Runtime.isPointerType):
(Runtime.isStructType):
* JSTests/stress/resizable-bytelength.js: Added.
(shouldBe):
(test):
* JSTests/stress/resizable-byteoffset.js: Added.
(shouldBe):
(test):
* JSTests/stress/resizable-length.js: Added.
(shouldBe):
(test):
* Source/JavaScriptCore/assembler/MacroAssemblerARM64.h:
(JSC::MacroAssemblerARM64::lshift32):
(JSC::MacroAssemblerARM64::lshift64):
(JSC::MacroAssemblerARM64::loadAcq32):
(JSC::MacroAssemblerARM64::loadAcq64):
(JSC::MacroAssemblerARM64::atomicLoad32):
(JSC::MacroAssemblerARM64::atomicLoad64):
* Source/JavaScriptCore/assembler/MacroAssemblerRISCV64.h:
(JSC::MacroAssemblerRISCV64::lshift32):
(JSC::MacroAssemblerRISCV64::lshift64):
(JSC::MacroAssemblerRISCV64::atomicLoad32):
(JSC::MacroAssemblerRISCV64::atomicLoad64):
* Source/JavaScriptCore/assembler/MacroAssemblerX86Common.h:
(JSC::MacroAssemblerX86Common::lshift32):
(JSC::MacroAssemblerX86Common::atomicLoad32):
* Source/JavaScriptCore/assembler/MacroAssemblerX86_64.h:
(JSC::MacroAssemblerX86_64::lshift64):
(JSC::MacroAssemblerX86_64::atomicLoad64):
* Source/JavaScriptCore/bytecode/AccessCase.cpp:
(JSC::AccessCase::create):
(JSC::AccessCase::guardedByStructureCheckSkippingConstantIdentifierCheck const):
(JSC::AccessCase::requiresIdentifierNameMatch const):
(JSC::AccessCase::requiresInt32PropertyCheck const):
(JSC::AccessCase::needsScratchFPR const):
(JSC::AccessCase::forEachDependentCell const):
(JSC::AccessCase::doesCalls const):
(JSC::AccessCase::canReplace const):
(JSC::AccessCase::generateWithGuard):
(JSC::AccessCase::generateImpl):
(JSC::AccessCase::toTypedArrayType):
(JSC::AccessCase::forResizableTypedArray):
(JSC::AccessCase::runWithDowncast):
(JSC::AccessCase::canBeShared):
* Source/JavaScriptCore/bytecode/AccessCase.h:
* Source/JavaScriptCore/bytecode/IntrinsicGetterAccessCase.h:
* Source/JavaScriptCore/bytecode/PolymorphicAccess.cpp:
(WTF::printInternal):
* Source/JavaScriptCore/bytecode/Repatch.cpp:
(JSC::tryCacheArrayGetByVal):
(JSC::tryCacheArrayPutByVal):
* Source/JavaScriptCore/dfg/DFGArrayMode.cpp:
(JSC::DFG::ArrayMode::refine const):
* Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:
(JSC::DFG::ByteCodeParser::handleIntrinsicCall):
(JSC::DFG::ByteCodeParser::handleIntrinsicGetter):
* Source/JavaScriptCore/dfg/DFGClobberize.h:
(JSC::DFG::clobberize):
* Source/JavaScriptCore/dfg/DFGNode.h:
* Source/JavaScriptCore/dfg/DFGOSRExit.cpp:
(JSC::DFG::OSRExit::compileExit):
* Source/JavaScriptCore/dfg/DFGOperations.cpp:
(JSC::DFG::JSC_DEFINE_JIT_OPERATION):
* Source/JavaScriptCore/dfg/DFGOperations.h:
* Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:
(JSC::DFG::SpeculativeJIT::jumpForTypedArrayOutOfBounds):
(JSC::DFG::SpeculativeJIT::emitTypedArrayBoundsCheck):
(JSC::DFG::SpeculativeJIT::jumpForTypedArrayIsDetachedIfOutOfBounds):
(JSC::DFG::SpeculativeJIT::compileGetByValOnIntTypedArray):
(JSC::DFG::SpeculativeJIT::compilePutByValForIntTypedArray):
* Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:
* Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:
(JSC::DFG::SpeculativeJIT::compileGetTypedArrayLengthAsInt52):
(JSC::DFG::SpeculativeJIT::compileGetTypedArrayByteOffsetAsInt52):
(JSC::DFG::SpeculativeJIT::compile):
* Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:
(JSC::FTL::DFG::LowerDFGToB3::emitGetTypedArrayByteOffsetExceptSettingResult):
(JSC::FTL::DFG::LowerDFGToB3::typedArrayLength):
(JSC::FTL::DFG::LowerDFGToB3::compileGetArrayLength):
(JSC::FTL::DFG::LowerDFGToB3::compileGetTypedArrayLengthAsInt52):
(JSC::FTL::DFG::LowerDFGToB3::compileCompareStrictEq):
* Source/JavaScriptCore/jit/AssemblyHelpers.cpp:
(JSC::AssemblyHelpers::branchIfResizableOrGrowableSharedTypedArrayIsOutOfBounds):
(JSC::AssemblyHelpers::loadTypedArrayByteLengthImpl):
(JSC::AssemblyHelpers::loadTypedArrayByteLength):
(JSC::AssemblyHelpers::loadTypedArrayLength):
* Source/JavaScriptCore/jit/AssemblyHelpers.h:
* Source/JavaScriptCore/jit/IntrinsicEmitter.cpp:
(JSC::IntrinsicGetterAccessCase::canEmitIntrinsicGetter):
(JSC::IntrinsicGetterAccessCase::doesCalls const):
(JSC::IntrinsicGetterAccessCase::emitIntrinsicGetter):
* Source/JavaScriptCore/jit/JITOperations.h:
* Source/JavaScriptCore/runtime/ArrayBuffer.h:
* Source/JavaScriptCore/runtime/JSDataView.h:
(JSC::JSDataView::offsetOfBuffer):
* Source/JavaScriptCore/runtime/TypedArrayType.cpp:
* Source/JavaScriptCore/runtime/TypedArrayType.h:

Canonical link: https://commits.webkit.org/257001@main
Constellation committed Nov 25, 2022
1 parent ada505e commit 1a5636acd02ea65e4795ca8d19f1111ae088e413
Showing 29 changed files with 9,915 additions and 357 deletions.

Large diffs are not rendered by default.

@@ -0,0 +1,18 @@
//@ requireOptions("--useResizableArrayBuffer=1")

function shouldBe(actual, expected) {
    if (actual !== expected)
        throw new Error('bad value: ' + actual);
}

function test() {
    var buffer = new ArrayBuffer(0, { maxByteLength: 1024 });
    var array = new Uint8Array(buffer);
    for (var i = 0; i < 1024; ++i) {
        buffer.resize(i);
        shouldBe(array.byteLength, i);
    }
}

for (var i = 0; i < 1e4; ++i)
    test();
@@ -0,0 +1,21 @@
//@ requireOptions("--useResizableArrayBuffer=1")

function shouldBe(actual, expected) {
    if (actual !== expected)
        throw new Error('bad value: ' + actual);
}

function test() {
    var buffer = new ArrayBuffer(128, { maxByteLength: 1024 });
    var array = new Uint8Array(buffer, 64);
    for (var i = 0; i < 1024; ++i) {
        buffer.resize(i);
        if (i < 64)
            shouldBe(array.byteOffset, 0);
        else
            shouldBe(array.byteOffset, 64);
    }
}

for (var i = 0; i < 1e4; ++i)
    test();
@@ -0,0 +1,18 @@
//@ requireOptions("--useResizableArrayBuffer=1")

function shouldBe(actual, expected) {
    if (actual !== expected)
        throw new Error('bad value: ' + actual);
}

function test() {
    var buffer = new ArrayBuffer(0, { maxByteLength: 1024 });
    var array = new Uint8Array(buffer);
    for (var i = 0; i < 1024; ++i) {
        buffer.resize(i);
        shouldBe(array.length, i);
    }
}

for (var i = 0; i < 1e4; ++i)
    test();
@@ -852,6 +852,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
lshift32(dest, imm, dest);
}

void lshift32(Address src, RegisterID shiftAmount, RegisterID dest)
{
load32(src, getCachedDataTempRegisterIDAndInvalidate());
m_assembler.lsl<32>(dest, dataTempRegister, shiftAmount);
}

void lshift64(RegisterID src, RegisterID shiftAmount, RegisterID dest)
{
m_assembler.lsl<64>(dest, src, shiftAmount);
@@ -872,6 +878,12 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
lshift64(dest, imm, dest);
}

void lshift64(Address src, RegisterID shiftAmount, RegisterID dest)
{
load64(src, getCachedDataTempRegisterIDAndInvalidate());
m_assembler.lsl<64>(dest, dataTempRegister, shiftAmount);
}

void mul32(RegisterID left, RegisterID right, RegisterID dest)
{
m_assembler.mul<32>(dest, left, right);
@@ -4453,7 +4465,17 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {
{
m_assembler.ldar<64>(dest, extractSimpleAddress(address));
}


void loadAcq32(BaseIndex address, RegisterID dest)
{
m_assembler.ldar<32>(dest, extractSimpleAddress(address));
}

void loadAcq64(BaseIndex address, RegisterID dest)
{
m_assembler.ldar<64>(dest, extractSimpleAddress(address));
}

void storeRel32(RegisterID dest, Address address)
{
m_assembler.stlr<32>(dest, extractSimpleAddress(address));
@@ -5992,7 +6014,27 @@ class MacroAssemblerARM64 : public AbstractMacroAssembler<Assembler> {

RELEASE_ASSERT_NOT_REACHED();
}


void atomicLoad32(Address address, RegisterID dest)
{
loadAcq32(address, dest);
}

void atomicLoad64(Address address, RegisterID dest)
{
loadAcq64(address, dest);
}

void atomicLoad32(BaseIndex address, RegisterID dest)
{
loadAcq32(address, dest);
}

void atomicLoad64(BaseIndex address, RegisterID dest)
{
loadAcq64(address, dest);
}

RegisterID extractSimpleAddress(Address address)
{
if (!address.offset)
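
For readers more familiar with portable C++ than with the macro assembler, the `atomicLoad32` / `atomicLoad64` helpers in the hunk above, which forward to `loadAcq32` / `loadAcq64` (the `ldar` instruction), behave roughly like an acquire load on a `std::atomic`. The sketch below is an analogy under that assumption; `SharedLength`, `publish`, and `readLength` are hypothetical names that do not appear in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical model of a length field that another thread may update.
struct SharedLength {
    std::uint64_t capacityHint{0};            // ordinary data written before publishing
    std::atomic<std::uint64_t> byteLength{0}; // published with release, read with acquire
};

// Writer side: fill in ordinary data first, then publish the new length.
inline void publish(SharedLength& shared, std::uint64_t newLength) {
    shared.capacityHint = newLength * 2;
    shared.byteLength.store(newLength, std::memory_order_release);
}

// Reader side: the acquire load is the portable counterpart of a load-acquire
// instruction. Reads that follow it cannot be reordered before it.
inline std::uint64_t readLength(const SharedLength& shared) {
    return shared.byteLength.load(std::memory_order_acquire);
}
```

Compilers typically lower such an acquire load to a load-acquire instruction on AArch64 and to an ordinary load on x86, whose memory model already gives loads acquire ordering.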
@@ -653,6 +653,13 @@ class MacroAssemblerRISCV64 : public AbstractMacroAssembler<Assembler> {
m_assembler.maskRegister<32>(dest);
}

void lshift32(Address src, RegisterID shiftAmount, RegisterID dest)
{
auto temp = temps<Data>();
load32(src, temp.data());
lshift32(temp.data(), shiftAmount, dest);
}

void lshift64(RegisterID shiftAmount, RegisterID dest)
{
lshift64(dest, shiftAmount, dest);
@@ -673,6 +680,13 @@ class MacroAssemblerRISCV64 : public AbstractMacroAssembler<Assembler> {
m_assembler.slliInsn(dest, src, uint32_t(imm.m_value & ((1 << 6) - 1)));
}

void lshift64(Address src, RegisterID shiftAmount, RegisterID dest)
{
auto temp = temps<Data>();
load64(src, temp.data());
lshift64(temp.data(), shiftAmount, dest);
}

void rshift32(RegisterID shiftAmount, RegisterID dest)
{
rshift32(dest, shiftAmount, dest);
@@ -3530,6 +3544,38 @@ class MacroAssemblerRISCV64 : public AbstractMacroAssembler<Assembler> {
return failure;
}

void atomicLoad32(Address address, RegisterID dest)
{
auto resolution = resolveAddress(address, lazyTemp<Memory>());
memoryFence();
m_assembler.lwInsn(dest, resolution.base, Imm::I(resolution.offset));
loadFence();
}

void atomicLoad32(BaseIndex address, RegisterID dest)
{
auto resolution = resolveAddress(address, lazyTemp<Memory>());
memoryFence();
m_assembler.lwInsn(dest, resolution.base, Imm::I(resolution.offset));
loadFence();
}

void atomicLoad64(Address address, RegisterID dest)
{
auto resolution = resolveAddress(address, lazyTemp<Memory>());
memoryFence();
m_assembler.ldInsn(dest, resolution.base, Imm::I(resolution.offset));
loadFence();
}

void atomicLoad64(BaseIndex address, RegisterID dest)
{
auto resolution = resolveAddress(address, lazyTemp<Memory>());
memoryFence();
m_assembler.ldInsn(dest, resolution.base, Imm::I(resolution.offset));
loadFence();
}

void moveConditionally32(RelationalCondition cond, RegisterID lhs, RegisterID rhs, RegisterID src, RegisterID dest)
{
auto temp = temps<Data, Memory>();
@@ -466,6 +466,18 @@ class MacroAssemblerX86Common : public AbstractMacroAssembler<Assembler> {
move32IfNeeded(src, dest);
lshift32(imm, dest);
}

void lshift32(Address src, RegisterID shiftAmount, RegisterID dest)
{
if (shiftAmount == dest) {
move(shiftAmount, scratchRegister());
load32(src, dest);
lshift32(scratchRegister(), dest);
} else {
load32(src, dest);
lshift32(shiftAmount, dest);
}
}

void mul32(RegisterID src, RegisterID dest)
{
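
The `shiftAmount == dest` branch in `lshift32(Address, RegisterID, RegisterID)` above exists because loading the source into `dest` would otherwise destroy the shift amount before it is used. The following is a toy model of that hazard, not the assembler's code: `Regs`, `kScratch`, `shiftInPlace`, and `lshiftFromMemory` are hypothetical names, and registers are modeled as array slots.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register file; slot 3 plays the role of the scratch register.
struct Regs {
    std::uint32_t r[4];
};
constexpr int kScratch = 3;

// Two-operand shift that overwrites its destination, like `dest <<= amount`.
// The amount is masked to 5 bits, as 32-bit x86 shifts do.
inline void shiftInPlace(Regs& regs, int amountReg, int destReg) {
    regs.r[destReg] <<= (regs.r[amountReg] & 31);
}

// Load `src` from memory and shift it left by the value held in `amountReg`.
inline void lshiftFromMemory(Regs& regs, const std::uint32_t& src, int amountReg, int destReg) {
    if (amountReg == destReg) {
        // The load below clobbers destReg, so save the shift amount first.
        regs.r[kScratch] = regs.r[amountReg];
        regs.r[destReg] = src;
        shiftInPlace(regs, kScratch, destReg);
    } else {
        regs.r[destReg] = src;
        shiftInPlace(regs, amountReg, destReg);
    }
}
```

Without the first branch, the aliased case would shift the loaded value by itself instead of by the requested amount.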
@@ -3892,6 +3904,16 @@ class MacroAssemblerX86Common : public AbstractMacroAssembler<Assembler> {
{
m_assembler.xchgl_rm(reg, address.offset, address.base, address.index, address.scale);
}

void atomicLoad32(Address address, RegisterID dest)
{
load32(address, dest);
}

void atomicLoad32(BaseIndex address, RegisterID dest)
{
load32(address, dest);
}

// We take this to mean that it prevents motion of normal stores. So, it's a no-op on x86.
void storeFence()
@@ -573,7 +573,18 @@ class MacroAssemblerX86_64 : public MacroAssemblerX86Common {
}
}


void lshift64(Address src, RegisterID shiftAmount, RegisterID dest)
{
if (shiftAmount == dest) {
move(shiftAmount, scratchRegister());
load64(src, dest);
lshift64(scratchRegister(), dest);
} else {
load64(src, dest);
lshift64(shiftAmount, dest);
}
}

void rshift64(TrustedImm32 imm, RegisterID dest)
{
m_assembler.sarq_i8r(imm.m_value, dest);
@@ -618,6 +629,18 @@ class MacroAssemblerX86_64 : public MacroAssemblerX86Common {
}
}

void urshift64(RegisterID src, RegisterID shiftAmount, RegisterID dest)
{
if (shiftAmount == dest) {
move(shiftAmount, scratchRegister());
move(src, dest);
urshift64(scratchRegister(), dest);
} else {
move(src, dest);
urshift64(shiftAmount, dest);
}
}

void rotateRight64(TrustedImm32 imm, RegisterID dest)
{
m_assembler.rorq_i8r(imm.m_value, dest);
@@ -1947,6 +1970,16 @@ class MacroAssemblerX86_64 : public MacroAssemblerX86Common {
m_assembler.lock();
m_assembler.xchgq_rm(reg, address.offset, address.base, address.index, address.scale);
}

void atomicLoad64(Address address, RegisterID dest)
{
load64(address, dest);
}

void atomicLoad64(BaseIndex address, RegisterID dest)
{
load64(address, dest);
}

#if ENABLE(FAST_TLS_JIT)
void loadFromTLS64(uint32_t offset, RegisterID dst)
