Description
Bug Report
I am currently running Debian on the sysoul-x3300 platform (based on rk3588). During memory stress testing with `memtester`, I observed a critical stability issue.
When the tested memory size exceeds 4 GiB (total memory is 16 GiB, 15 GiB available; testing 1 GiB or 2 GiB works fine), an MMIO fault in zone0 is frequently triggered. This happens even though the accessed memory address is correctly configured as belonging to zone0 in `board.rs`.
Logs
TODO: My logs are coming soon.
Configuration (board.rs)
TODO: My configuration is coming soon.

Root Cause Analysis
Upon investigation, the root cause is that the memory area reserved for the hypervisor in the device tree is too small, so the root-linux kernel allocates and overwrites hypervisor memory that lies outside the reservation.
According to src/consts.rs, the memory layout of hvisor consists of:
- Static binary code (.text, .data, etc.)
- Per-CPU local storage (stack, etc.)
- Frame Allocator memory pool
Source Code Reference (src/consts.rs):
```rust
pub use crate::memory::PAGE_SIZE;
use crate::{memory::addr::VirtAddr, platform::BOARD_NCPUS};
/// Size of the hypervisor heap.
pub const HV_HEAP_SIZE: usize = 1024 * 1024; // 1 MiB
pub const HV_MEM_POOL_SIZE: usize = 64 * 1024 * 1024; // 64 MiB
/// Size of the per-CPU data (stack and other CPU-local data).
pub const PER_CPU_SIZE: usize = 512 * 1024; // 512 KiB
/// ... (omitted)
pub fn mem_pool_start() -> VirtAddr {
core_end() + MAX_CPU_NUM * PER_CPU_SIZE
}
pub fn hv_end() -> VirtAddr {
mem_pool_start() + HV_MEM_POOL_SIZE
}
```

Memory Layout Calculation (sysoul-x3300, 8 CPUs):
- Start address: 0x0050_0000
- `core_end` (binary end): 0x006e_6000
- `mem_pool_start`: 0x00ae_6000
  - Calculation: `core_end` + (512 KiB * 8 CPUs) = 0x006e_6000 + 4 MiB
- `hv_end`: 0x04ae_6000
  - Calculation: `mem_pool_start` + 64 MiB (Frame Allocator)
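For a quick sanity check, the arithmetic above can be reproduced with a small standalone program. This is not hvisor code: the two constants mirror src/consts.rs, while the start address and the binary end are the values measured for this build, as listed above.

```rust
// Standalone sketch reproducing the layout arithmetic above (not hvisor code).
const PER_CPU_SIZE: usize = 512 * 1024;           // 512 KiB, mirrors src/consts.rs
const HV_MEM_POOL_SIZE: usize = 64 * 1024 * 1024; // 64 MiB,  mirrors src/consts.rs

fn main() {
    let hv_start: usize = 0x0050_0000; // hypervisor entry point
    let core_end: usize = 0x006e_6000; // binary end measured for this build
    let cpus: usize = 8;               // sysoul-x3300 has 8 cores

    let mem_pool_start = core_end + cpus * PER_CPU_SIZE;
    let hv_end = mem_pool_start + HV_MEM_POOL_SIZE;

    assert_eq!(mem_pool_start, 0x00ae_6000);
    assert_eq!(hv_end, 0x04ae_6000);
    println!(
        "required reservation: {:.1} MiB starting at {:#010x}",
        (hv_end - hv_start) as f64 / (1024.0 * 1024.0), // ≈ 69.9 MiB
        hv_start
    );
}
```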
The Discrepancy:
The actual required memory range extends up to 0x04ae_6000 (approx. 70 MiB total). However, most existing device tree configurations only reserve 4 MiB for hvisor.
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'fontFamily': 'arial', 'fontSize': '14px'}}}%%
flowchart LR
classDef memBlock fill:#e3f2fd,stroke:#1565c0,stroke-width:1px;
classDef boundaryNode fill:none,stroke:none,color:#555,font-size:12px;
classDef dangerBlock fill:#ffcdd2,stroke:#b71c1c,stroke-width:2px;
subgraph Reserved ["✅ Reserved Memory (Safe: 4 MiB)<br/>Range: 0x0050_0000 ~ 0x0090_0000"]
direction LR
StartAddr["0x0050_0000"]:::boundaryNode
Bin["Static Bin<br/>(~1.9 MiB)<br/>End: 0x006E_6000"]:::memBlock
C0["CPU 0<br/>512 KiB"]:::memBlock
C1["CPU 1<br/>512 KiB"]:::memBlock
C2["CPU 2<br/>512 KiB"]:::memBlock
C3["CPU 3<br/>512 KiB<br/>End: 0x008E_6000"]:::memBlock
StartAddr --- Bin --- C0 --- C1 --- C2 --- C3
end
subgraph Unreserved ["❌ Unreserved Region (Unsafe / MMIO Fault Risk)<br/>Range: 0x0090_0000 ~ 0x04AE_6000"]
direction LR
C4["CPU 4<br/>(Cross Boundary)<br/>Start: 0x008E_6000"]:::dangerBlock
C5["CPU 5<br/>512 KiB"]:::dangerBlock
C6["CPU 6<br/>512 KiB"]:::dangerBlock
C7["CPU 7<br/>512 KiB"]:::dangerBlock
PoolStartAddr["0x00AE_6000"]:::boundaryNode
FrameAlloc["Frame Allocator Pool<br/>Size: 64 MiB<br/>(Target of Corruption)"]:::dangerBlock
EndAddr["0x04AE_6000"]:::boundaryNode
C4 --- C5 --- C6 --- C7 --- PoolStartAddr --- FrameAlloc --- EndAddr
end
C3 --- C4
style Reserved fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,stroke-dasharray: 5 5
style Unreserved fill:#ffebee,stroke:#c62828,stroke-width:2px,stroke-dasharray: 5 5
```
Failure Mechanism
- The reserved 4 MiB covers the static binary and potentially the per-CPU data for the first few cores, but completely fails to cover the 64 MiB Frame Allocator.
- `hvisor` uses this Frame Allocator to manage memory regions via a BTree structure.
- When running `memtester` with large memory blocks, the root-linux kernel allocates pages that physically overlap with `hvisor`'s unreserved Frame Allocator region.
- Linux overwrites the Frame Allocator data, corrupting the BTree metadata used for zone memory region tracking.
- Consequently, `hvisor` loses track of valid memory regions, resulting in false MMIO faults when those addresses are accessed (a toy illustration of this follows the list).
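To make the last two bullets concrete, here is a deliberately simplified toy illustration. It does not use hvisor's actual data structures; it only shows how losing entries from a BTree-based region map turns a formerly valid RAM access into an address that matches nothing and is therefore reported as an MMIO fault.

```rust
use std::collections::BTreeMap;

// Toy model of zone region tracking: region start -> (size, is_ram).
// Purely illustrative; hvisor's real structures differ.
fn classify(regions: &BTreeMap<usize, (usize, bool)>, addr: usize) -> &'static str {
    // Find the last region starting at or below `addr` and check containment.
    match regions.range(..=addr).next_back() {
        Some((&start, &(size, true))) if addr < start + size => "guest RAM (handled normally)",
        Some((&start, &(size, false))) if addr < start + size => "known MMIO region",
        _ => "no matching region -> reported as MMIO fault",
    }
}

fn main() {
    let mut regions = BTreeMap::new();
    regions.insert(0x1_0000_0000usize, (0x4_0000_0000usize, true)); // 16 GiB of zone0 RAM

    let addr = 0x2_4000_0000usize;
    println!("before corruption: {}", classify(&regions, addr));

    // Simulate root-linux overwriting the allocator pool: the metadata is lost.
    regions.clear();
    println!("after corruption:  {}", classify(&regions, addr));
}
```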
Why it seemed to work before:
- Luck: The specific physical pages used by the Frame Allocator were not allocated/overwritten by Linux during lighter loads.
- Partial Coverage: The 4 MiB reservation covers the binary and initial CPU stacks. Since root-linux often utilizes fewer cores (e.g., 2 cores) during boot or idle, the per-CPU data for the active cores remained safe within the reserved area.
Action Items
To resolve this issue and prevent future occurrences, the following actions are required:
- Build System Update: Implement a mechanism in the build system to calculate and output the exact required reserved memory range (entry point to `hv_end`) during compilation (a rough sketch follows this list).
- Configuration Fix: Update all existing board configurations and device trees (DTS) to reserve sufficient memory (covering the full 64 MiB pool + per-CPU areas).
- CI/CD Enhancement: Integrate `memtester` into the CI system test workflow. The root-linux should perform memory stress tests immediately after boot to ensure memory integrity before proceeding with other tests. This explains the high failure rate in past CI runs.
- Documentation: Update the `hvisor-book` to explicitly document the static and runtime memory layout. Add a guide on how to correctly calculate and configure `reserved-memory` in the device tree.
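As a rough sketch of the first two action items (everything here is hypothetical: the helper does not exist in the hvisor tree, and the invocation is made up), a small post-build tool could take the entry point, the measured binary end, and the CPU count, apply the constants from src/consts.rs, and print the reservation that each board's device tree must cover:

```rust
// Hypothetical post-build helper (not an existing hvisor tool).
// Example invocation: reserve-range 0x00500000 0x006e6000 8
use std::env;

const PER_CPU_SIZE: usize = 512 * 1024;           // mirrors src/consts.rs
const HV_MEM_POOL_SIZE: usize = 64 * 1024 * 1024; // mirrors src/consts.rs

fn parse_hex(s: &str) -> usize {
    usize::from_str_radix(s.trim_start_matches("0x"), 16).expect("hex address")
}

fn main() {
    let args: Vec<String> = env::args().collect();
    let entry = parse_hex(&args[1]);    // hypervisor entry point
    let core_end = parse_hex(&args[2]); // binary end, e.g. read from `nm` on the linked ELF
    let ncpus: usize = args[3].parse().expect("cpu count");

    let hv_end = core_end + ncpus * PER_CPU_SIZE + HV_MEM_POOL_SIZE;
    let size = hv_end - entry;

    println!(
        "hv_end = {:#010x}, reservation size = {:#x} ({} MiB)",
        hv_end, size, size / (1024 * 1024)
    );
    // Hint for the device tree (2-cell address / size):
    println!("reg = <0x0 {:#x} 0x0 {:#x}>;", entry, size);
}
```

With the values from this report, such a helper would print a reservation of roughly 70 MiB starting at 0x0050_0000, which is the range the `reserved-memory` node needs to cover.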