libpas: memory-accounting fixes + embedder thread-suspension hook for Linux TLC decommit#182
Conversation
pas_fast_large_free_heap.c: fast_write_cursor and fast_merge had inverted max-heap invariant checks (shrink checked parent, grow checked children; should be the reverse). When the root shrank or a leaf grew via coalescing, the node stayed in place with parent.y < child.y, hiding free blocks from find_first and forcing redundant aligned_allocator pulls. Affects every pas_large_heap and pas_large_utility_free_heap (TLC allocations). pas_bitfit_directory.c: take_last_empty used pas_versioned_field_read instead of read_to_watch, so a concurrent view_did_become_empty_at_index between the read and the final try_write was a no-op (no version bump), letting try_write zero last_empty_plus_one with empty_bits[K] still set: a permanently stranded page. Segregated directory already does this right. pas_probabilistic_guard_malloc_allocator.c: deallocate dereferenced *entry after pas_ptr_hash_map_remove may have rehashed/freed the table. pas_thread_local_cache.c: TLC realloc copied should_stop_bitvector but not its should_stop_some gate, dropping pending stop requests for one cycle. pas_scavenger.c: try_install_foreign_work_callback returned false without unlocking foreign_work.lock when the descriptor table was full. pas_large_free_heap_helpers.c: large_utility_aligned_allocator's failure path called give_back without the talks_to_large_sharing_pool guard that wraps the matching take_later. pas_deferred_decommit_log.c: decommit_all coalesced adjacent ranges without comparing mmap_capability, decommitting the second range with the first's capability. Tests: two LargeFreeHeapTests cases (root-shrink, leaf-grow) and one BitfitTests race-hook case; all fail before, pass after. New race-test-hook kind compiles to a no-op when PAS_ENABLE_TESTING is off. Build: bump libpas.xcodeproj to gnu++23 so test_pas links against the current libc++ (<atomic> vs <stdatomic.h> interop hard-errors before C++23).
…decommit
On non-Darwin platforms the scavenger could only request that a thread stop
its allocators (honored on the next allocation slow path), so allocators on
threads that go idle without exiting were never stopped and their TLC pages
stayed committed. Darwin uses Mach thread_suspend to force-stop them; this
adds an embedder-provided suspender so the same path works on Linux without
libpas owning a second signal-based mechanism that would deadlock against
the embedder's GC suspension.
pas_thread_suspender: install-once table of {current_thread, begin_suspend,
end_suspend}. Contract: callbacks must not allocate or take libpas locks;
begin_suspend serializes against the embedder's other suspenders.
pas_thread_local_cache: store embedder_thread_handle at TLC creation and
copy it on realloc. can_force_stop_allocators() is compile-time true on
Darwin, runtime (suspender_instance && handle) elsewhere. Non-Darwin
suspend()/resume() route through the installed callbacks; Darwin path is
unchanged. decommit_allocator_range() is now PAS_OS(DARWIN) || PAS_OS(LINUX).
Without an installed suspender behavior is identical to before (request-only
on non-Darwin). The JSC/WTF adapter is a follow-up; this commit is the
libpas hook.
Tests: ThreadSuspenderTests verifies install + handle storage on all
platforms, and balanced begin/end calls on force-stop on non-Darwin.
…resume ThreadSuspendLocker already wraps pas_thread_suspend_lock under BENABLE(LIBPAS), so the libpas scavenger and JSC's GC are already serialized by the same lock. The adapter therefore needs an adopt-lock constructor (the scavenger holds the lock when begin_suspend is called) — Thread::suspend takes the locker by const ref purely as a witness, so no behavior change. current_thread() takes a ref on the WTF::Thread to keep it alive across the TLC's lifetime; release_handle() (new optional callback, fired from TLC destroy after dropping libpas locks) drops it. Linux only: Darwin keeps direct Mach in libpas; Windows decommit isn't enabled. Uses g_wtfConfig.sigThreadSuspendResume — whatever signal Bun already configures (SIGPWR), no second handler. Also: pas_store_store_fence before publishing pas_thread_suspender_instance.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdds an embedder-facing thread suspender API for libpas, integrates it into WTF Thread suspend/resume paths, augments ThreadSuspendLocker with adopt-lock semantics, updates bmalloc build/test inputs, and includes multiple libpas bug fixes and new tests exercising the suspender and allocator behaviors. Changes
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Source/bmalloc/libpas/src/libpas/pas_thread_suspender.h`:
- Around line 53-60: The struct pas_thread_suspender uses raw function pointer
types which reduces readability; define typedefs like
pas_current_thread_callback (for pas_embedder_thread_handle (*)(void)),
pas_begin_suspend_callback (for bool (*)(pas_embedder_thread_handle)),
pas_end_suspend_callback (for void (*)(pas_embedder_thread_handle)), and
pas_release_handle_callback (for void (*)(pas_embedder_thread_handle)); then
update the struct fields current_thread, begin_suspend, end_suspend, and
release_handle to use these typedef names instead of raw pointer signatures so
the code is clearer and the types are reusable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6ad9ad7d-4ce8-41fb-8f7d-05ea1a478917
📒 Files selected for processing (19)
Source/WTF/wtf/Threading.cppSource/WTF/wtf/Threading.hSource/bmalloc/CMakeLists.txtSource/bmalloc/libpas/libpas.xcodeproj/project.pbxprojSource/bmalloc/libpas/src/libpas/pas_bitfit_directory.cSource/bmalloc/libpas/src/libpas/pas_deferred_decommit_log.cSource/bmalloc/libpas/src/libpas/pas_fast_large_free_heap.cSource/bmalloc/libpas/src/libpas/pas_large_free_heap_helpers.cSource/bmalloc/libpas/src/libpas/pas_probabilistic_guard_malloc_allocator.cSource/bmalloc/libpas/src/libpas/pas_race_test_hooks.hSource/bmalloc/libpas/src/libpas/pas_scavenger.cSource/bmalloc/libpas/src/libpas/pas_thread_local_cache.cSource/bmalloc/libpas/src/libpas/pas_thread_local_cache.hSource/bmalloc/libpas/src/libpas/pas_thread_suspender.cSource/bmalloc/libpas/src/libpas/pas_thread_suspender.hSource/bmalloc/libpas/src/test/BitfitTests.cppSource/bmalloc/libpas/src/test/LargeFreeHeapTests.cppSource/bmalloc/libpas/src/test/TestHarness.cppSource/bmalloc/libpas/src/test/ThreadSuspenderTests.cpp
| struct pas_thread_suspender { | ||
| pas_embedder_thread_handle (*current_thread)(void); | ||
| bool (*begin_suspend)(pas_embedder_thread_handle); | ||
| void (*end_suspend)(pas_embedder_thread_handle); | ||
| /* Optional; may be NULL. Called once when the TLC that owns the handle is destroyed, | ||
| from the owning thread, with no libpas locks held. */ | ||
| void (*release_handle)(pas_embedder_thread_handle); | ||
| }; |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Function pointer typedefs would improve readability.
Consider using typedefs for the function pointer types to improve readability and enable reuse:
typedef pas_embedder_thread_handle (*pas_current_thread_callback)(void);
typedef bool (*pas_begin_suspend_callback)(pas_embedder_thread_handle);
// etc.This is a minor style suggestion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Source/bmalloc/libpas/src/libpas/pas_thread_suspender.h` around lines 53 -
60, The struct pas_thread_suspender uses raw function pointer types which
reduces readability; define typedefs like pas_current_thread_callback (for
pas_embedder_thread_handle (*)(void)), pas_begin_suspend_callback (for bool
(*)(pas_embedder_thread_handle)), pas_end_suspend_callback (for void
(*)(pas_embedder_thread_handle)), and pas_release_handle_callback (for void
(*)(pas_embedder_thread_handle)); then update the struct fields current_thread,
begin_suspend, end_suspend, and release_handle to use these typedef names
instead of raw pointer signatures so the code is clearer and the types are
reusable.
Preview Builds
|
…rtion The libpas scavenger calls Thread::suspend (via pas_thread_suspender) while holding the heap lock. The scavenger is a raw pthread with no WTF::Thread, so currentSingleton() lazy-allocates one -> fastMalloc -> heap lock -> deadlock. currentMayBeNull() reads TLS without allocating; if it returns null the caller can't be suspending itself anyway.
What this fixes
Bun's RSS on Linux servers sits noticeably higher than the equivalent Node.js process and doesn't fully recover after load drops. Two independent causes, both in libpas:
thread_suspend; Linux had no equivalent path.This PR fixes (1) directly and fixes (2) by letting libpas borrow WTF's existing thread-suspension mechanism instead of owning a second one.
Commit 1 — eight libpas accounting fixes
The two that affect production memory:
pas_fast_large_free_heap.c— cartesian-tree max-heap invariant inverted. The tree is a treap keyed on (address, size) used to find a free block ≥ N bytes.fast_write_cursorandfast_mergehad the re-balance check backwards: when a node shrinks they checked the parent (should be children), when it grows they checked children (should be parent). When the root shrank below a child, or a leaf grew past its parent, the tree silently broke,find_firstpruned the subtree, and the allocator pulled fresh pages fromaligned_allocatorwhile a satisfying block sat hidden. This is the path every large allocation and every TLC allocation takes.pas_bitfit_directory.c—take_last_emptyrace strands pages. It readlast_empty_plus_onewithpas_versioned_field_readinstead ofread_to_watch. A concurrentview_did_become_emptybetween the read and the finaltry_writewas a no-op (no version bump), sotry_writezeroed the field withempty_bits[K]still set — a permanently stranded page. The segregated directory already does this correctly.The other six are correctness/hygiene with little production impact (PGM UAF, a missed flag copy on TLC realloc, a missing mutex unlock, two test-config-only accounting symmetries, and bumping the standalone xcodeproj to gnu++23 so
test_pasactually compiles on the current macOS SDK).All have new regression tests in
LargeFreeHeapTests.cpp/BitfitTests.cppthat fail onmainand pass after.Commit 2 —
pas_thread_suspendercallback APIAdds an install-once embedder callback table:
pas_thread_local_cachestores the handle. The scavenger's force-stop path becomes:thread_suspend(kernel-counted, already nests with GC). Disassembly of the production dylib confirms the embedder branch is dead-code-eliminated.begin_suspend/end_suspend.decommit_allocator_rangeis enabled onPAS_OS(DARWIN) || PAS_OS(LINUX).A testing-only
pas_thread_suspender_override_nativeflag lets Darwin route through the embedder so the Linux code path can be exercised locally —ThreadSuspenderTests.cppdoes this with a Mach-backed mock.Commit 3 — WTF adapter
The adapter wraps
WTF::Thread::suspend/resume. Three things make this safe:Same lock, no new mechanism.
ThreadSuspendLockeralready wrapspas_thread_suspend_lockunderBENABLE(LIBPAS), so GC suspension and the libpas scavenger were already serialized by the same lock. The adapter uses a newThreadSuspendLocker(AdoptLock)constructor since the scavenger holds the lock whenbegin_suspendis called.Thread::suspendonly takes the locker as a const-ref witness, so this is purely a token.Same signal, no new handler.
Thread::suspendon Linux usesg_wtfConfig.sigThreadSuspendResume— whatever the embedder configures (Bun uses SIGPWR). libpas installs no handlers and reserves no signals.Handle lifetime.
current_thread()takes a ref on theWTF::Thread;release_handle()(fired from TLC destroy after libpas drops its locks) drops it. Threads created outside WTF return a null handle and stay request-only — graceful degradation.The "what if GC suspends a thread the scavenger also wants" case: on Darwin the kernel counts suspensions; on Linux the shared lock serializes them — the scavenger blocks on
pas_thread_suspend_lockuntil GC'sThreadSuspendLockerdestructs, then proceeds. No nesting, no deadlock.Testing
Standalone libpas (
./build.sh -s macosx -a arm64 -t test_pas -v testing):main— 2 new tests)main— 1 new test)main, identicalAdversarial: built clean
mainin a worktree with only the new tests applied — they crash with the expected assertions; pass after the fixes.testForceStopUsesEmbedder20/20 under repeat. Production dylib:override_nativesymbol absent,_stop_allocatordisassembly shows direct_thread_suspendwith no_pas_thread_suspender_instanceload.