Support madvise, msync, and mremap on high-VA mmap regions #83
Merged
Conversation
4 issues found across 7 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="tests/test-rosetta-madvise.sh">
<violation number="1" location="tests/test-rosetta-madvise.sh:66">
P2: Script exits 0 even when tests fail — missing `if [ "$fail" -gt 0 ]; then exit 1; fi` after `report_summary`. This masks test failures in standalone runs and weakens the matrix runner's belt-and-suspenders `|| rc=1` check.</violation>
</file>
<file context>
@@ -0,0 +1,66 @@
+ printf '%s\n' "$madv_out" >&2
+fi
+
+report_summary "$total"
</file context>
jserv
requested changes
Jul 2, 2026
jserv
left a comment
Contributor
Rebase onto the latest main branch, resolve the conflicts, and refine per the review.
6c44a18 to
b348ca7
Compare
jserv
requested changes
Jul 2, 2026
jserv
left a comment
Contributor
Follow the rules described in https://cbea.ms/git-commit/ carefully.
sys_madvise, sys_msync, and sys_mremap were all primary-window-only:
they computed off = addr - ipa_base and reached into host_base + off,
which only resolves for identity regions (gpa_base == start). High-VA
mmap regions -- Rosetta's slab/JIT and guest mmap(NULL) placements --
back their pages at gpa_base (a named mapping or overflow segment), so
those paths rejected the range with -ENOMEM/-EFAULT or wrote to the
wrong host address.

V8's page allocator decommits guard/code pages with
mprotect(PROT_NONE)+madvise(MADV_DONTNEED) and CHECK_EQ(0, ret)s the
result, so the spurious ENOMEM aborted x86_64 Node.js the moment its
JIT initialized; apt's high-VA MAP_SHARED package-list cache hit the
same wall in sys_msync (issue sysprog21#108).

Route every admission and data-movement path through the region
tracker and host_ptr_for_gpa(gpa_base + ...) so primary and high-VA
regions act on their real backing. Identity regions have
gpa_base == start, so this collapses to host_base + off and primary
behaviour is byte-for-byte unchanged:

- madvise MADV_DONTNEED accepts high-VA ranges the tracker records as
  mapped and zeroes / restores them on their gpa_base backing.
- msync's admission drops the guest_size bound (the coverage loop
  validates both windows) and sync_shared_aliases_range /
  refresh_shared_region_range resolve the guest bytes through
  gpa_base.
- sys_mmap_high_va accepts file-backed MAP_SHARED as a snapshot-style
  shared region (pread on map, msync writes dirty bytes back), so the
  high-VA shared mappings msync operates on can be created
  (sysprog21#108).
- mremap admits high-VA sources and resolves the source read/zero
  through gpa_base in the shrink, MREMAP_FIXED, and MREMAP_MAYMOVE
  paths; the destination stays in the primary window (find_free_gap /
  mremap_extend_range), so an mremap(MAYMOVE) of a high-VA region
  relocates it there with its contents intact -- no high-VA
  destination backing or new Stage-2 machinery is needed.

Regression tests (vendored x86_64 static ELFs run through elfuse +
Rosetta): x86_64-rosetta-madvise (MADV_DONTNEED on writable /
PROT_NONE guard / multi-page high-VA ranges -- the V8 decommit
pattern), x86_64-rosetta-msync (high-VA MAP_SHARED write-back,
single/multi-page/MS_ASYNC), and x86_64-rosetta-mremap (MREMAP_MAYMOVE
grow with contents preserved and in-place shrink). Wired as
make test-rosetta-{madvise,msync,mremap} (and test-rosetta-all) plus
the matching suites in tests/test-matrix.sh, with the rebuild recipe
in tests/fixtures/rosetta/README.md.

The aarch64 unit tests test-madvise / test-msync / test-mremap /
test-mremap-infra (primary-path coverage) pass; the new Rosetta suites
pass. An end-to-end MAP_SHARED store scenario (write + msync +
cross-mapping refresh + mremap grow + madvise) passes on this branch
and fails (ENODEV/ENOMEM) on the pre-fix baseline.
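
For readers unfamiliar with the V8 decommit pattern named above, the sequence can be sketched from the guest's side roughly as follows. This is a minimal illustration built on plain Linux calls, not the source of the vendored fixture; the function name `decommit_roundtrip` is made up for the example, and the zero-fill check assumes Linux semantics for `MADV_DONTNEED` on private anonymous memory.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the decommit sequence succeeds and the
 * pages read back as zero afterwards, -1 otherwise. */
static int decommit_roundtrip(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE) * 2;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int rc = -1;

    if (p == MAP_FAILED)
        return -1;
    memset(p, 0xAB, len); /* dirty both pages */

    /* Decommit: revoke access, then drop the backing pages. V8 CHECKs
     * that the madvise call returns 0. */
    if (mprotect(p, len, PROT_NONE) != 0)
        goto out;
    if (madvise(p, len, MADV_DONTNEED) != 0)
        goto out;

    /* Recommit and confirm the old contents are gone. */
    if (mprotect(p, len, PROT_READ | PROT_WRITE) != 0)
        goto out;
    rc = 0;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            rc = -1;
            break;
        }
    }
out:
    munmap(p, len);
    return rc;
}
```

A caller would treat a non-zero return as the failure mode described in the commit message: either `madvise` rejected the range or the pages kept stale data.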
b348ca7 to
115c659
Compare
Contributor
Thank @Max042004 for contributing!
Problem
Three memory syscalls -- `sys_madvise`, `sys_msync`, and `sys_mremap` -- plus the file-backed `MAP_SHARED` path in `sys_mmap_high_va` were all primary-window-only. They computed `off = addr - ipa_base` and reached into `host_base + off`, which only resolves for identity regions (`gpa_base == start`). High-VA mmap regions -- Rosetta's own slab/JIT and guest `mmap(NULL)` placements -- back their pages at `gpa_base` (a named mapping or overflow segment), so these paths rejected the range with `-ENOMEM`/`-EFAULT` or wrote to the wrong host address.

Two concrete failures motivate this:

- x86_64 Node.js / V8. V8's page allocator decommits guard/code pages with `mprotect(PROT_NONE) + madvise(MADV_DONTNEED)` and `CHECK_EQ(0, ret)`s the madvise return, so the spurious `-ENOMEM` aborted Node the moment its JIT initialized: `--verbose` pins it to `madvise(advice=4 MADV_DONTNEED) -> -12 (ENOMEM)` on a page in the high-VA window (`mmap(NULL)` under Rosetta lands at e.g. `0x7fffff7fd000`).
- apt (issue #108, "`msync()` returns `ENOMEM` (errno 12) for high-VA guest mappings created by Rosetta"). apt memory-maps its package-list cache with `MAP_SHARED` and `msync`s it; under an x86_64 guest that mapping is high-VA, so `sys_msync` returned `-ENOMEM` and apt failed with `E: Unable to synchronize mmap - msync (12: Cannot allocate memory)`. The mapping could not even be created, because `sys_mmap_high_va` refused file-backed `MAP_SHARED` with `-ENODEV`.

Fix
Route every admission and data-movement path through the region tracker and `host_ptr_for_gpa(gpa_base + ...)` so primary and high-VA regions act on their real backing. Identity regions have `gpa_base == start`, so this collapses to `host_base + off` and primary behaviour is byte-for-byte unchanged.

- madvise: `MADV_DONTNEED` accepts high-VA ranges the tracker records as mapped and zeroes / restores them on their `gpa_base` backing (drops the previous zero-fill-only scope-out for high-VA file-backed pages).
- msync: admission drops the `guest_size` bound (the existing coverage loop validates both windows), and `sync_shared_aliases_range` / `refresh_shared_region_range` resolve the guest bytes through `gpa_base`.
- mmap: `sys_mmap_high_va` accepts file-backed `MAP_SHARED` as a snapshot-style shared region (pread on map, `msync` writes dirty bytes back), so the high-VA shared mappings `msync` operates on can be created.
- mremap: admits high-VA sources and resolves the source read/zero through `gpa_base` in the shrink, `MREMAP_FIXED`, and `MREMAP_MAYMOVE` paths. The destination stays in the primary window (`find_free_gap` / `mremap_extend_range`), so an `mremap(MAYMOVE)` of a high-VA region relocates it there with its contents intact -- no high-VA destination backing or new Stage-2 machinery is needed.
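
The translation rule behind the fix can be illustrated with a small sketch. Everything here is an assumption for illustration: `struct region`, `struct backing`, and `resolve_guest_addr` are hypothetical names, and the stand-in `host_ptr_for_gpa` models a single flat backing, whereas the real helper presumably also covers named mappings and overflow segments. The point is only that an identity region (`gpa_base == start`) resolves to the same `host_base + off` as before, while a high-VA region lands on its own backing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region record; the real tracker's layout is not shown here. */
struct region {
    uint64_t start;    /* guest VA where the region begins */
    uint64_t len;      /* region length in bytes */
    uint64_t gpa_base; /* guest-physical address backing `start` */
};

/* Hypothetical flat backing: host_base is the host view of GPA ipa_base. */
struct backing {
    uint8_t *host_base;
    uint64_t ipa_base;
    uint64_t size;
};

/* Simplified stand-in for host_ptr_for_gpa(): NULL outside the backing. */
static uint8_t *host_ptr_for_gpa(const struct backing *b, uint64_t gpa)
{
    assert(b != NULL);
    if (gpa < b->ipa_base || gpa - b->ipa_base >= b->size)
        return NULL;
    return b->host_base + (gpa - b->ipa_base);
}

/* Resolve a guest address through the region's gpa_base rather than
 * assuming addr - ipa_base is a valid offset into host_base. */
static uint8_t *resolve_guest_addr(const struct backing *b,
                                   const struct region *r, uint64_t addr)
{
    assert(r != NULL);
    if (addr < r->start || addr - r->start >= r->len)
        return NULL; /* address is not inside this region */
    return host_ptr_for_gpa(b, r->gpa_base + (addr - r->start));
}
```

With an identity region the inner expression reduces to `addr` itself, so the result equals `host_base + (addr - ipa_base)`, which is the pre-change computation.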
Tests
Vendored x86_64 static ELFs run through elfuse + Rosetta (rebuild recipe in `tests/fixtures/rosetta/README.md`):

- `x86_64-rosetta-madvise` -- `MADV_DONTNEED` on writable / `PROT_NONE` guard / multi-page high-VA ranges (the V8 decommit pattern).
- `x86_64-rosetta-msync` -- high-VA `MAP_SHARED` write-back (single/multi-page/`MS_ASYNC`), verified against the backing file.
- `x86_64-rosetta-mremap` -- `MREMAP_MAYMOVE` grow (contents preserved) and in-place shrink.

Wired as `make test-rosetta-{madvise,msync,mremap}` (and `test-rosetta-all`) plus the matching suites in `tests/test-matrix.sh`.

The aarch64 unit tests `test-madvise` / `test-msync` / `test-mremap` / `test-mremap-infra` cover the unchanged primary-window path. An end-to-end `MAP_SHARED` store scenario (write + `msync` + cross-mapping refresh + `mremap` grow + `madvise`) passes with this change and fails (`-ENODEV`/`-ENOMEM`) on the pre-fix baseline.
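
As a rough picture of the guest-visible contract the mremap suite checks (grow with `MREMAP_MAYMOVE` keeps the contents, shrink happens in place), here is a minimal sketch using the Linux `mremap(2)` call directly. It is not the fixture source; the function name `mremap_grow_then_shrink` and the fill byte are invented for the example, and it requires a Linux guest because `mremap` is Linux-specific.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if a MAYMOVE grow preserves the original
 * page and a later shrink stays at the same address, -1 otherwise. */
static int mremap_grow_then_shrink(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    memset(p, 0x5A, page);

    /* Grow to four pages; the kernel may relocate the mapping. */
    unsigned char *q = mremap(p, page, 4 * page, MREMAP_MAYMOVE);
    if (q == MAP_FAILED) {
        munmap(p, page);
        return -1;
    }

    int rc = 0;
    for (size_t i = 0; i < page; i++)
        if (q[i] != 0x5A)
            rc = -1; /* original contents must survive the move */
    for (size_t i = page; i < 4 * page; i++)
        if (q[i] != 0)
            rc = -1; /* newly added anonymous pages read as zero */

    /* Shrink without MREMAP_MAYMOVE: the address must not change. */
    unsigned char *s = mremap(q, 4 * page, 2 * page, 0);
    if (s == MAP_FAILED) {
        munmap(q, 4 * page);
        return -1;
    }
    if (s != q || s[0] != 0x5A)
        rc = -1;

    munmap(s, 2 * page);
    return rc;
}
```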
Notes
- [...] suites were verified locally on Apple silicon; they self-skip (exit 77) where the translator or `timeout(1)` is unavailable.
- [...] aarch64 hosts); rebuild them out of tree per the README when the sources change.
- [...] `MAP_SHARED` served through the FUSE sysroot layer, the backing is materialized read-only, so `msync` write-back to the original sysroot file is best-effort; real host-fd backings (e.g. `/dev/shm`) write back exactly. This matches how the primary window already treats such mappings.
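
To make the exact write-back case concrete, the following sketch maps an ordinary temporary file with `MAP_SHARED`, dirties it, calls `msync(MS_SYNC)`, and reads the bytes back through the file descriptor. It is an illustration of the guest-side contract only, under the assumption of a writable `/tmp`; the function name `msync_writeback` and the file name template are hypothetical and do not come from the test suite.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if bytes written through a MAP_SHARED
 * mapping are visible in the backing file after msync, -1 otherwise. */
static int msync_writeback(void)
{
    char path[] = "/tmp/msync-sketch-XXXXXX";
    char buf[5];
    unsigned char *p = MAP_FAILED;
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int rc = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the open descriptor keeps the file alive */

    if (ftruncate(fd, (off_t)len) != 0)
        goto out;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto out;

    memcpy(p, "dirty", 5);             /* dirty the shared page */
    if (msync(p, len, MS_SYNC) != 0)   /* the call apt saw fail with ENOMEM */
        goto out;

    /* Verify against the backing file rather than the mapping. */
    if (pread(fd, buf, sizeof buf, 0) != (ssize_t)sizeof buf)
        goto out;
    rc = memcmp(buf, "dirty", sizeof buf) == 0 ? 0 : -1;
out:
    if (p != MAP_FAILED)
        munmap(p, len);
    close(fd);
    return rc;
}
```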
Fixes #108.