bmalloc: add 1ms backoff and retry cap to SYSCALL/PAS_SYSCALL EAGAIN loops #169
Draft
coleleavitt wants to merge 2 commits into oven-sh:main
…loops

The SYSCALL and PAS_SYSCALL macros retry syscalls on EAGAIN in a zero-delay tight loop. When madvise(MADV_DONTDUMP) returns EAGAIN due to kernel mmap_write_lock contention (VMA split/merge allocation failure under memory pressure), this causes 100% CPU usage across all GC threads, effectively freezing the application.

Add usleep(1000) backoff (1ms) and cap retries at 100 (100ms total). madvise failures here are advisory, not fatal, so breaking after max retries is safe. This matches the existing Windows precedent in libpas/pas_page_malloc.c virtual_alloc_with_retry(), which uses Sleep(50ms) with 10 max retries.

Upstream Apple WebKit has the same zero-delay loop and has not yet addressed this. tcmalloc uses bounded retries (3 attempts) for expensive madvise operations. sched_yield() was considered but is explicitly not recommended for this use case (Red Hat RHEL-RT guide).

Related: oven-sh/bun#17723, oven-sh/bun#27371, oven-sh/bun#27196, google/tcmalloc#247, golang/go#61718
…k contention

MADV_DONTDUMP is the sole cause of the mmap_write_lock contention that triggers the EAGAIN spin loop fixed in the previous commit. Unlike MADV_DONTNEED, which only acquires the kernel's mmap_read_lock (no contention), MADV_DONTDUMP requires mmap_write_lock, a single process-wide exclusive lock. With concurrent GC threads all calling vmDeallocatePhysicalPages(), MADV_DONTDUMP creates a serialization point in the kernel. Under memory pressure, VMA split/merge allocation fails and the kernel returns EAGAIN, which (before the previous fix) caused 100% CPU spin.

MADV_DONTDUMP only affects core dump size; it has zero impact on memory reclamation or allocation correctness. MADV_DODUMP (its symmetric counterpart in vmAllocatePhysicalPages/commit_impl) is also removed.

This is the root cause elimination (vs the previous commit, which is the defensive mitigation). Together they fully resolve the issue.

Removed 4 madvise calls:
- VMAllocate.h vmDeallocatePhysicalPages: MADV_DONTDUMP
- VMAllocate.h vmAllocatePhysicalPages: MADV_DODUMP
- pas_page_malloc.c decommit_impl: MADV_DONTDUMP
- pas_page_malloc.c commit_impl: MADV_DODUMP
This was referenced Feb 27, 2026
Summary
The SYSCALL and PAS_SYSCALL macros in bmalloc retry syscalls on EAGAIN in a zero-delay tight loop. When madvise(MADV_DONTDUMP) returns EAGAIN due to kernel mmap_write_lock contention, this causes 100% CPU usage across all GC threads, effectively freezing the application.

This PR adds usleep(1000) backoff (1ms) and caps retries at 100 (100ms total).

Root Cause Analysis
The Smoking Gun
BSyscall.h (before):
pas_utils.h (before):
Zero-delay infinite retry. No backoff, no sleep, no yield, no retry cap.
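The code for the two "before" listings did not survive in this text. As a rough illustration only, the pattern described above can be sketched as follows. SPIN_SYSCALL, spin_fake_syscall, and spin_attempts are hypothetical names invented for this sketch, and the macro body is an assumption based on the description, not a copy of BSyscall.h or pas_utils.h.

```c
#include <errno.h>

/* Hypothetical stand-in for the described pattern: re-issue the call for as
 * long as it fails with EAGAIN, with no sleep, no yield, and no retry cap. */
#define SPIN_SYSCALL(x) do { \
    while ((x) == -1 && errno == EAGAIN) { } \
} while (0)

/* Hypothetical fake syscall that fails with EAGAIN a fixed number of times. */
static int spin_fake_calls = 0;
static int spin_fake_failures_left = 0;

static int spin_fake_syscall(void)
{
    spin_fake_calls++;
    if (spin_fake_failures_left > 0) {
        spin_fake_failures_left--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}

/* Runs the spin loop against the fake and returns the number of attempts. */
static int spin_attempts(int failures)
{
    spin_fake_calls = 0;
    spin_fake_failures_left = failures;
    SPIN_SYSCALL(spin_fake_syscall());
    return spin_fake_calls;
}
```

With a fake that eventually succeeds the loop terminates, but nothing bounds it: if the call keeps returning EAGAIN, every attempt is issued back to back and the thread never leaves the loop.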
The Fix
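The listing for the fix is also missing from this text. A minimal sketch of a bounded retry with backoff, assuming the numbers given in the summary (1ms sleep, 100 retries), might look like the following. BOUNDED_SYSCALL, EXAMPLE_MAX_RETRIES, EXAMPLE_RETRY_DELAY_US, and the fake syscall are hypothetical names for illustration; the PR's actual macro may differ in naming and in how it counts attempts.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical names; the values come from the PR description. */
#define EXAMPLE_MAX_RETRIES 100
#define EXAMPLE_RETRY_DELAY_US 1000 /* 1ms */

/* Retry on EAGAIN, sleeping between attempts and giving up after the cap.
 * The failed call is treated as non-fatal: the loop simply stops. */
#define BOUNDED_SYSCALL(x) do { \
    int example_retries_ = 0; \
    while ((x) == -1 && errno == EAGAIN) { \
        if (++example_retries_ > EXAMPLE_MAX_RETRIES) \
            break; \
        usleep(EXAMPLE_RETRY_DELAY_US); \
    } \
} while (0)

/* Hypothetical fake syscall that fails with EAGAIN a fixed number of times. */
static int bounded_fake_calls = 0;
static int bounded_fake_failures_left = 0;

static int bounded_fake_syscall(void)
{
    bounded_fake_calls++;
    if (bounded_fake_failures_left > 0) {
        bounded_fake_failures_left--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}

/* Runs the bounded loop against the fake and returns the number of attempts. */
static int bounded_attempts(int failures)
{
    bounded_fake_calls = 0;
    bounded_fake_failures_left = failures;
    BOUNDED_SYSCALL(bounded_fake_syscall());
    return bounded_fake_calls;
}
```

In this sketch a call that never stops returning EAGAIN is attempted 101 times (the initial attempt plus 100 retries) with 100 sleeps of 1ms in between, which is where the roughly 100ms worst case comes from.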
Why This Approach
- virtual_alloc_with_retry() in libpas/pas_page_malloc.c already uses Sleep(50ms) with 10 max retries

Why NOT sched_yield()
Per the Red Hat RHEL-RT Tuning Guide, sched_yield can reschedule immediately (busy loop) or after a long delay, so its behavior is unpredictable.
usleep(1000) provides deterministic 1ms backoff.

Blast Radius
17 callsites affected (all madvise/mprotect/mincore; all benefit from this fix):
- bmalloc/VMAllocate.h (madvise calls in vmDeallocatePhysicalPages and vmAllocatePhysicalPages)
- libpas/pas_page_malloc.c (madvise/mprotect in commit_impl and decommit_impl)
- libpas/pas_committed_pages_vector.c (mincore in pas_committed_pages_vector_construct)

Upstream Status
Apple's upstream WebKit has the identical zero-delay SYSCALL macro and has not addressed this. This fix is novel.
Related Issues
bun -e "console.log('ok')")

Complementary Fix (Not in This PR)
MADV_DONTDUMP (the specific call that takes mmap_write_lock) could also be removed or made optional. It only affects core dump size, not allocation correctness. However, that's a behavioral change best evaluated separately.