WTF should make it super easy to do ARM concurrency tricks
https://bugs.webkit.org/show_bug.cgi?id=169300

Reviewed by Mark Lam.

Source/JavaScriptCore:

This changes a bunch of GC hot paths to use new concurrency APIs that lead to optimal
code on both x86 (fully leverage TSO, transactions become CAS loops) and ARM (use
dependency chains for fencing, transactions become LL/SC loops). While inspecting the
machine code, I found other opportunities for improvement, like inlining the "am I
marked" part of the marking functions.

* heap/Heap.cpp:
(JSC::Heap::setGCDidJIT):
* heap/HeapInlines.h:
(JSC::Heap::testAndSetMarked):
* heap/LargeAllocation.h:
(JSC::LargeAllocation::isMarked):
(JSC::LargeAllocation::isMarkedConcurrently):
(JSC::LargeAllocation::aboutToMark):
(JSC::LargeAllocation::testAndSetMarked):
* heap/MarkedBlock.h:
(JSC::MarkedBlock::areMarksStaleWithDependency):
(JSC::MarkedBlock::aboutToMark):
(JSC::MarkedBlock::isMarkedConcurrently):
(JSC::MarkedBlock::isMarked):
(JSC::MarkedBlock::testAndSetMarked):
* heap/SlotVisitor.cpp:
(JSC::SlotVisitor::appendSlow):
(JSC::SlotVisitor::appendHiddenSlow):
(JSC::SlotVisitor::appendHiddenSlowImpl):
(JSC::SlotVisitor::setMarkedAndAppendToMarkStack):
(JSC::SlotVisitor::appendUnbarriered): Deleted.
(JSC::SlotVisitor::appendHidden): Deleted.
* heap/SlotVisitor.h:
* heap/SlotVisitorInlines.h:
(JSC::SlotVisitor::appendUnbarriered):
(JSC::SlotVisitor::appendHidden):
(JSC::SlotVisitor::append):
(JSC::SlotVisitor::appendValues):
(JSC::SlotVisitor::appendValuesHidden):
* runtime/CustomGetterSetter.cpp:
* runtime/JSObject.cpp:
(JSC::JSObject::visitButterflyImpl):
* runtime/JSObject.h:

Source/WTF:

This adds Atomic<>::loadLink and Atomic<>::storeCond, available only when HAVE(LL_SC).

It abstracts loadLink/storeCond behind prepare/attempt. You can write prepare/attempt
loops whenever your loop fits into the least common denominator of LL/SC and CAS.

This modifies Atomic<>::transaction to use prepare/attempt. So, if you write your loop
using Atomic<>::transaction, then you get LL/SC for free.

Depending on the kind of transaction you are doing, you may not want to perform an LL
until you have had a chance to just load the current value. Atomic<>::transaction()
assumes that you do not need any ordering guarantees in that case. If you think that
the transaction has a good chance of aborting this way, you want
Atomic<>::transaction() to do a plain load first. But if you don't think that such an
abort is likely, then you want to go straight to the LL. The API supports this choice
via TransactionAbortLikelihood.

Additionally, this redoes the depend/consume API to be dead simple. Dependency is
unsigned. You get a dependency on a loaded value by just saying
dependency(loadedValue). You consume the dependency by using it as a bonus index to
some pointer dereference. This is made easy with the consume<T*>(ptr, dependency)
helper. In those cases where you want to pass around both a computed value and a
dependency, there's DependencyWith<T>. But you won't need it in most cases. The loaded
value or any value computed from the loaded value is a fine input to dependency()!

This change updates a bunch of hot paths to use the new APIs. Using transaction() gives
us optimal LL/SC loops for object marking and lock acquisition.

This change also updates a bunch of hot paths to use dependency()/consume().

This is a significant Octane/splay speed-up on ARM.

* wtf/Atomics.h:
(WTF::hasFence):
(WTF::Atomic::prepare):
(WTF::Atomic::attempt):
(WTF::Atomic::transaction):
(WTF::Atomic::transactionRelaxed):
(WTF::nullDependency):
(WTF::dependency):
(WTF::DependencyWith::DependencyWith):
(WTF::dependencyWith):
(WTF::consume):
(WTF::Atomic::tryTransactionRelaxed): Deleted.
(WTF::Atomic::tryTransaction): Deleted.
(WTF::zeroWithConsumeDependency): Deleted.
(WTF::consumeLoad): Deleted.
* wtf/Bitmap.h:
(WTF::WordType>::get):
(WTF::WordType>::concurrentTestAndSet):
(WTF::WordType>::concurrentTestAndClear):
* wtf/LockAlgorithm.h:
(WTF::LockAlgorithm::lockFast):
(WTF::LockAlgorithm::unlockFast):
(WTF::LockAlgorithm::unlockSlow):
* wtf/Platform.h:

Tools:

This vastly simplifies the consume API. The new API is thoroughly tested by being used
in the GC's guts. I think that unit tests are a pain to maintain, so we shouldn't have
them unless we are legitimately worried about coverage. We're not in this case.

* TestWebKitAPI/CMakeLists.txt:
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WTF/Consume.cpp: Removed.



Canonical link: https://commits.webkit.org/186402@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@213645 268f45cc-cd09-0410-ab3c-d52691b4dbfc
pizlonator committed Mar 9, 2017
1 parent acb93cf commit d3e0604377a68927eb842d22da120a60d269fdf5
Showing 20 changed files with 487 additions and 339 deletions.
@@ -1853,9 +1853,10 @@ void Heap::handleNeedFinalize()
void Heap::setGCDidJIT()
{
m_worldState.transaction(
[&] (unsigned& state) {
[&] (unsigned& state) -> bool {
RELEASE_ASSERT(state & stoppedBit);
state |= gcDidJITBit;
return true;
});
}

@@ -93,8 +93,8 @@ ALWAYS_INLINE bool Heap::testAndSetMarked(HeapVersion markingVersion, const void
if (cell->isLargeAllocation())
return cell->largeAllocation().testAndSetMarked();
MarkedBlock& block = cell->markedBlock();
block.aboutToMark(markingVersion);
return block.testAndSetMarked(cell);
Dependency dependency = block.aboutToMark(markingVersion);
return block.testAndSetMarked(cell, dependency);
}

ALWAYS_INLINE size_t Heap::cellSize(const void* rawCell)
@@ -74,8 +74,9 @@ class LargeAllocation : public BasicRawSentinelNode<LargeAllocation> {

bool isNewlyAllocated() const { return m_isNewlyAllocated; }
ALWAYS_INLINE bool isMarked() { return m_isMarked.load(std::memory_order_relaxed); }
ALWAYS_INLINE bool isMarked(HeapCell*) { return m_isMarked.load(std::memory_order_relaxed); }
ALWAYS_INLINE bool isMarkedConcurrently(HeapVersion, HeapCell*) { return m_isMarked.load(std::memory_order_relaxed); }
ALWAYS_INLINE bool isMarked(HeapCell*) { return isMarked(); }
ALWAYS_INLINE bool isMarked(HeapCell*, Dependency) { return isMarked(); }
ALWAYS_INLINE bool isMarkedConcurrently(HeapVersion, HeapCell*) { return isMarked(); }
bool isLive() { return isMarked() || isNewlyAllocated(); }

bool hasValidCell() const { return m_hasValidCell; }
@@ -109,7 +110,7 @@ class LargeAllocation : public BasicRawSentinelNode<LargeAllocation> {

const AllocatorAttributes& attributes() const { return m_attributes; }

void aboutToMark(HeapVersion) { }
Dependency aboutToMark(HeapVersion) { return nullDependency(); }

ALWAYS_INLINE bool testAndSetMarked()
{
@@ -120,7 +121,7 @@ class LargeAllocation : public BasicRawSentinelNode<LargeAllocation> {
return true;
return m_isMarked.compareExchangeStrong(false, true);
}
ALWAYS_INLINE bool testAndSetMarked(HeapCell*) { return testAndSetMarked(); }
ALWAYS_INLINE bool testAndSetMarked(HeapCell*, Dependency, TransactionAbortLikelihood = TransactionAbortLikelihood::Likely) { return testAndSetMarked(); }
void clearMarked() { m_isMarked.store(false); }

void noteMarked() { }
@@ -258,7 +258,8 @@ class MarkedBlock {
bool isMarked(const void*);
bool isMarked(HeapVersion markingVersion, const void*);
bool isMarkedConcurrently(HeapVersion markingVersion, const void*);
bool testAndSetMarked(const void*);
bool isMarked(const void*, Dependency);
bool testAndSetMarked(const void*, Dependency, TransactionAbortLikelihood = TransactionAbortLikelihood::Likely);

bool isAtom(const void*);
void clearMarked(const void*);
@@ -278,15 +279,15 @@ class MarkedBlock {

JS_EXPORT_PRIVATE bool areMarksStale();
bool areMarksStale(HeapVersion markingVersion);
struct MarksWithDependency {
bool areStale;
ConsumeDependency dependency;
};
MarksWithDependency areMarksStaleWithDependency(HeapVersion markingVersion);
DependencyWith<bool> areMarksStaleWithDependency(HeapVersion markingVersion);

void aboutToMark(HeapVersion markingVersion);
Dependency aboutToMark(HeapVersion markingVersion);

void assertMarksNotStale();
#if ASSERT_DISABLED
void assertMarksNotStale() { }
#else
JS_EXPORT_PRIVATE void assertMarksNotStale();
#endif

bool needsDestruction() const { return m_needsDestruction; }

@@ -306,7 +307,7 @@ class MarkedBlock {
MarkedBlock(VM&, Handle&);
Atom* atoms();

void aboutToMarkSlow(HeapVersion markingVersion);
JS_EXPORT_PRIVATE void aboutToMarkSlow(HeapVersion markingVersion);
void clearHasAnyMarked();

void noteMarkedSlow();
@@ -491,27 +492,19 @@ inline bool MarkedBlock::areMarksStale(HeapVersion markingVersion)
return markingVersion != m_markingVersion;
}

ALWAYS_INLINE MarkedBlock::MarksWithDependency MarkedBlock::areMarksStaleWithDependency(HeapVersion markingVersion)
ALWAYS_INLINE DependencyWith<bool> MarkedBlock::areMarksStaleWithDependency(HeapVersion markingVersion)
{
auto consumed = consumeLoad(&m_markingVersion);
MarksWithDependency ret;
ret.areStale = consumed.value != markingVersion;
ret.dependency = consumed.dependency;
return ret;
HeapVersion version = m_markingVersion;
return dependencyWith(dependency(version), version != markingVersion);
}

inline void MarkedBlock::aboutToMark(HeapVersion markingVersion)
inline Dependency MarkedBlock::aboutToMark(HeapVersion markingVersion)
{
if (UNLIKELY(areMarksStale(markingVersion)))
auto result = areMarksStaleWithDependency(markingVersion);
if (UNLIKELY(result.value))
aboutToMarkSlow(markingVersion);
WTF::loadLoadFence();
}

#if ASSERT_DISABLED
inline void MarkedBlock::assertMarksNotStale()
{
return result.dependency;
}
#endif // ASSERT_DISABLED

inline void MarkedBlock::Handle::assertMarksNotStale()
{
@@ -530,16 +523,22 @@ inline bool MarkedBlock::isMarked(HeapVersion markingVersion, const void* p)

inline bool MarkedBlock::isMarkedConcurrently(HeapVersion markingVersion, const void* p)
{
auto marksWithDependency = areMarksStaleWithDependency(markingVersion);
if (marksWithDependency.areStale)
auto result = areMarksStaleWithDependency(markingVersion);
if (result.value)
return false;
return m_marks.get(atomNumber(p) + marksWithDependency.dependency);
return m_marks.get(atomNumber(p), result.dependency);
}

inline bool MarkedBlock::isMarked(const void* p, Dependency dependency)
{
assertMarksNotStale();
return m_marks.get(atomNumber(p), dependency);
}

inline bool MarkedBlock::testAndSetMarked(const void* p)
inline bool MarkedBlock::testAndSetMarked(const void* p, Dependency dependency, TransactionAbortLikelihood abortLikelihood)
{
assertMarksNotStale();
return m_marks.concurrentTestAndSet(atomNumber(p));
return m_marks.concurrentTestAndSet(atomNumber(p), dependency, abortLikelihood);
}

inline bool MarkedBlock::Handle::isNewlyAllocated(const void* p)
@@ -222,49 +222,37 @@ void SlotVisitor::appendJSCellOrAuxiliary(HeapCell* heapCell)
} }
}

void SlotVisitor::appendUnbarriered(JSValue value)
void SlotVisitor::appendSlow(JSCell* cell, Dependency dependency)
{
if (!value || !value.isCell())
return;

if (UNLIKELY(m_heapSnapshotBuilder))
m_heapSnapshotBuilder->appendEdge(m_currentCell, value.asCell());

setMarkedAndAppendToMarkStack(value.asCell());
m_heapSnapshotBuilder->appendEdge(m_currentCell, cell);
appendHiddenSlowImpl(cell, dependency);
}

void SlotVisitor::appendHidden(JSValue value)
void SlotVisitor::appendHiddenSlow(JSCell* cell, Dependency dependency)
{
if (!value || !value.isCell())
return;

setMarkedAndAppendToMarkStack(value.asCell());
appendHiddenSlowImpl(cell, dependency);
}

void SlotVisitor::setMarkedAndAppendToMarkStack(JSCell* cell)
ALWAYS_INLINE void SlotVisitor::appendHiddenSlowImpl(JSCell* cell, Dependency dependency)
{
SuperSamplerScope superSamplerScope(false);

ASSERT(!m_isCheckingForDefaultMarkViolation);
if (!cell)
return;

#if ENABLE(GC_VALIDATION)
validate(cell);
#endif

if (cell->isLargeAllocation())
setMarkedAndAppendToMarkStack(cell->largeAllocation(), cell);
setMarkedAndAppendToMarkStack(cell->largeAllocation(), cell, dependency);
else
setMarkedAndAppendToMarkStack(cell->markedBlock(), cell);
setMarkedAndAppendToMarkStack(cell->markedBlock(), cell, dependency);
}

template<typename ContainerType>
ALWAYS_INLINE void SlotVisitor::setMarkedAndAppendToMarkStack(ContainerType& container, JSCell* cell)
ALWAYS_INLINE void SlotVisitor::setMarkedAndAppendToMarkStack(ContainerType& container, JSCell* cell, Dependency dependency)
{
container.aboutToMark(m_markingVersion);

if (container.testAndSetMarked(cell))
if (container.testAndSetMarked(cell, dependency, TransactionAbortLikelihood::Unlikely))
return;

ASSERT(cell->structure());
@@ -175,11 +175,14 @@ class SlotVisitor {

void appendJSCellOrAuxiliary(HeapCell*);
void appendHidden(JSValue);
void appendHidden(JSCell*);

JS_EXPORT_PRIVATE void setMarkedAndAppendToMarkStack(JSCell*);
JS_EXPORT_PRIVATE void appendSlow(JSCell*, Dependency);
JS_EXPORT_PRIVATE void appendHiddenSlow(JSCell*, Dependency);
void appendHiddenSlowImpl(JSCell*, Dependency);

template<typename ContainerType>
void setMarkedAndAppendToMarkStack(ContainerType&, JSCell*);
void setMarkedAndAppendToMarkStack(ContainerType&, JSCell*, Dependency);

void appendToMarkStack(JSCell*);
