feat: detect and diagnose hvisor memory region overlap with root zone #311

Open
Inquisitor-201 wants to merge 2 commits into dev from fix/memory-overlap-detection

@Inquisitor-201
Contributor

Summary

The root zone's ROOT_ZONE_MEMORY_REGIONS often overlaps with hvisor's own physical memory [skernel, __hv_end). This causes Linux to allocate and write to hvisor's page tables and heap, manifesting as seemingly random "unhandled MMIO fault" panics. This problem has appeared at least 10 times and is tracked in #310.

Changes

1. Compile-time overlap check (tools/check_hv_mem_overlap.py, integrated into make all)

  • Reads skernel and __hv_end from the built ELF
  • Parses ROOT_ZONE_MEMORY_REGIONS from board.rs (comment-aware)
  • Fails the build with a clear diagnostic if any MEM_TYPE_RAM region overlaps hvisor's range
  • Skips gracefully when symbols or board config aren't available
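
The parsing and overlap steps above can be sketched as follows. This is a minimal illustration, not the contents of tools/check_hv_mem_overlap.py: the board.rs excerpt, its field names, and the helper names `strip_comments`, `parse_regions`, and `overlapping_ram` are assumptions made for the example, and reading `skernel` and `__hv_end` from the ELF is left out.

```python
import re

# Hypothetical board.rs excerpt; struct and field names are assumed for illustration.
BOARD_RS = """
pub const ROOT_ZONE_MEMORY_REGIONS: [HvConfigMemoryRegion; 2] = [
    HvConfigMemoryRegion {
        mem_type: MEM_TYPE_RAM,
        physical_start: 0x5000_0000,
        virtual_start: 0x5000_0000,
        size: 0x8000_0000,
    }, // main RAM
    // HvConfigMemoryRegion {
    //     mem_type: MEM_TYPE_RAM,
    //     physical_start: 0x0,
    //     virtual_start: 0x0,
    //     size: 0x1000,
    // },
    HvConfigMemoryRegion {
        mem_type: MEM_TYPE_IO,
        physical_start: 0x900_0000,
        virtual_start: 0x900_0000,
        size: 0x1000,
    }, /* uart */
];
"""

NUM = r"(0x[0-9a-fA-F_]+|\d[\d_]*)"
REGION_RE = re.compile(
    r"mem_type:\s*(\w+)\s*,\s*physical_start:\s*" + NUM + r"\s*,.*?size:\s*" + NUM,
    re.S,
)


def strip_comments(src):
    """Remove /* ... */ and // ... comments.

    Simplification: nested block comments and comment markers inside
    string literals are not handled.
    """
    src = re.sub(r"/\*.*?\*/", "", src, flags=re.S)
    return re.sub(r"//[^\n]*", "", src)


def parse_regions(src):
    """Return a list of (mem_type, start, end) with end exclusive."""
    regions = []
    for mem_type, start, size in REGION_RE.findall(strip_comments(src)):
        start = int(start.replace("_", ""), 0)
        size = int(size.replace("_", ""), 0)
        regions.append((mem_type, start, start + size))
    return regions


def overlapping_ram(regions, hv_start, hv_end):
    """RAM regions that intersect the half-open range [hv_start, hv_end)."""
    return [
        (start, end)
        for mem_type, start, end in regions
        if mem_type == "MEM_TYPE_RAM" and start < hv_end and hv_start < end
    ]
```

Stripping comments before matching is what keeps the commented-out region from being reported. A build step could exit non-zero whenever `overlapping_ram` returns a non-empty list.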

2. Runtime diagnostic (check_fault_in_hvisor_mem() in src/arch/aarch64/trap.rs)

  • When handle_dabt encounters a stage-2 fault address within [skernel, __hv_end), prints:
    • The fault address and hvisor's range
    • The likely cause (guest DTB/config includes hvisor's memory)
    • Instructions to fix (exclude the range from memory_regions and DTB)
  • Falls through to the original error handling if the address is outside hvisor's range
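
The decision the runtime diagnostic makes can be modeled in a few lines. The real check lives in Rust in src/arch/aarch64/trap.rs; the Python below only models the logic, and the function name `diagnose_fault` and the message wording are assumptions for illustration.

```python
def diagnose_fault(fault_addr, hv_start, hv_end):
    """Model of the range check: return a diagnostic string when the fault
    address lies inside [hv_start, hv_end), otherwise None so the caller
    can fall through to the original error handling."""
    if not (hv_start <= fault_addr < hv_end):
        return None
    return (
        f"stage-2 fault at {fault_addr:#x} is inside hvisor memory "
        f"[{hv_start:#x}, {hv_end:#x}); the guest DTB/config likely includes "
        "this range; exclude it from memory_regions and the DTB"
    )
```

The upper bound is exclusive, so a fault exactly at `__hv_end` is treated as outside hvisor's range.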

Testing

  • Verified on rk3568: overlap check correctly blocks the build (hvisor at 0x60080000, RAM up to 0xf0000000)
  • Verified on qemu-gicv3: passes cleanly when no overlap exists
  • Intentionally misconfigured qemu-gicv3 RAM to overlap -> build correctly fails
  • Self-test script validates symbol parsing and range boundary logic
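
A table-driven check of the range boundary logic might look like the sketch below. It is not the self-test shipped in this PR: the hvisor start address matches the rk3568 case above, but the hvisor end address, the RAM start addresses, and the names `overlaps`, `CASES`, and `run_self_test` are assumptions.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


# hvisor start as on rk3568; the end address is a made-up value for the example.
HV = (0x6008_0000, 0x6040_0000)

CASES = [
    ("RAM ends exactly at hvisor start", (0x5000_0000, 0x6008_0000), False),
    ("RAM starts exactly at hvisor end", (0x6040_0000, 0x7000_0000), False),
    ("RAM reaches one byte into hvisor", (0x5000_0000, 0x6008_0001), True),
    ("RAM fully contains hvisor", (0x1000_0000, 0xF000_0000), True),
    ("RAM lies inside hvisor", (0x6010_0000, 0x6020_0000), True),
]


def run_self_test():
    """Return the names of failing cases; an empty list means all passed."""
    return [
        name
        for name, region, expected in CASES
        if overlaps(*region, *HV) != expected
    ]
```

The first two cases are the ones an off-by-one error would break: adjacent regions share a boundary address but no bytes, so they must not be reported as overlapping.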

Closes: #310
Related: #309

Add two mitigations for the recurring "memory stomping" problem where
hvisor's physical memory falls within root zone MEM_TYPE_RAM regions:

1. Compile-time overlap check (tools/check_hv_mem_overlap.py):
   Post-link script that reads skernel/__hv_end from the ELF, parses
   ROOT_ZONE_MEMORY_REGIONS from board.rs, and fails the build if any
   MEM_TYPE_RAM region overlaps hvisor's range.

2. Runtime diagnostic in aarch64 trap handler:
   When handle_dabt gets a stage-2 fault whose address falls within
   hvisor's physical memory range [skernel, __hv_end), it prints the
   actual cause before panicking instead of the generic error.

Closes: #310
Related: #309

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Inquisitor-201 force-pushed the fix/memory-overlap-detection branch from 7ffb20e to 4a4ec24 on May 19, 2026 14:27
Inquisitor-201 requested review from agicy and dallasxy on May 19, 2026 14:28
…8 and riscv64 qemu-plic

Split overlapping MEM_TYPE_RAM regions in board.rs to prevent memory
stomping where Linux in the root zone would write to pages owned by
hvisor (page tables, heap), causing unrecoverable panics.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
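
Splitting a RAM region around hvisor's range amounts to keeping the parts below `skernel` and at or above `__hv_end`. The sketch below shows the arithmetic only; the helper name `split_around` and the example addresses other than 0x60080000 and 0xf0000000 are assumptions, and a real board.rs would also need each resulting region to satisfy the platform's alignment requirements.

```python
def split_around(region_start, region_end, hv_start, hv_end):
    """Return the parts of [region_start, region_end) that lie outside
    [hv_start, hv_end), as a list of (start, end) half-open ranges."""
    parts = []
    low_end = min(region_end, hv_start)
    if region_start < low_end:
        parts.append((region_start, low_end))
    high_start = max(region_start, hv_end)
    if high_start < region_end:
        parts.append((high_start, region_end))
    return parts


# Example: a RAM region ending at 0xf0000000 with hvisor at 0x60080000.
# The RAM start and the hvisor end are hypothetical values.
pieces = split_around(0x1000_0000, 0xF000_0000, 0x6008_0000, 0x6040_0000)
for start, end in pieces:
    print(f"RAM {start:#x}..{end:#x} (size {end - start:#x})")
```

A region that does not touch hvisor's range comes back unchanged as a single piece, and a region entirely inside it yields no pieces.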
Inquisitor-201 force-pushed the fix/memory-overlap-detection branch from c8c8e72 to 5457167 on May 19, 2026 14:42
Labels

aarch64, feature (New feature or request)

Development

Successfully merging this pull request may close these issues.

hvisor physical memory overlaps with root zone RAM regions (memory stomping)

2 participants