feat: add configurable scratch region base GPA#1271

Open
danbugs wants to merge 5 commits into hyperlight-dev:main from nanvix:danbugs/configurable-scratch-gpa

Conversation


@danbugs (Contributor) commented Mar 3, 2026

Summary

Adds a configurable scratch region base GPA to SandboxConfiguration, makes the snapshot memory region writable for non-init-paging guests, and uses compile-time #[cfg(feature = "init-paging")] gates for scratch GVA computation.

Motivation

Nanvix runs as a 32-bit (i386) guest without Hyperlight's init-paging feature. The default scratch region GPA sits at the top of the 36-bit physical address range — unreachable by a non-paging 32-bit kernel. This PR lets the caller place the scratch region at an arbitrary GPA (e.g., just below 4 GB) and ensures the snapshot region is host-writable so non-paging guests can function correctly.
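The addressability constraint can be sketched numerically. The constant and function names below are illustrative, not Hyperlight identifiers; only the 36-bit/4 GB figures come from the PR description:

```rust
// Illustrative arithmetic only: a non-paging 32-bit guest can address at
// most 4 GiB of physical memory, so a scratch region placed at the top of
// a 36-bit GPA space is out of reach.
const GPA_LIMIT_36_BIT: u64 = 1 << 36; // 64 GiB: top of the 36-bit range
const I386_ADDR_LIMIT: u64 = 1 << 32;  // 4 GiB: non-paging i386 limit

/// A GPA is usable by a non-paging i386 guest only if it fits below 4 GiB.
fn reachable_by_i386(gpa: u64) -> bool {
    gpa < I386_ADDR_LIMIT
}
```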

Changes

Commit 1: feat: add configurable scratch region base GPA

  • config.rs: New scratch_base_gpa: u64 field (0 = use default) with getter/setter. Uses u64 with a zero sentinel rather than Option<u64> to maintain #[repr(C)] FFI safety, matching the existing heap_size_override pattern.
  • layout.rs: Uses custom GPA when specified
  • hyperlight_vm.rs: Stores and passes through scratch_base_gpa
  • gdb/mod.rs: Adapts to layout changes
  • snapshot.rs: Takes scratch_base_gpa: u64 directly instead of computing it from size
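The zero-sentinel pattern from the first bullet can be sketched as follows. The struct is reduced to the one field under discussion, and the method bodies are illustrative rather than the upstream implementation (only the field and accessor names appear in the PR):

```rust
// Sketch of the zero-sentinel pattern: the field stays a plain u64 so the
// struct remains #[repr(C)]-safe for FFI, while the getter can still
// present an Option at the API surface.
#[derive(Default)]
#[repr(C)]
pub struct SandboxConfiguration {
    /// Guest-physical base of the scratch region; 0 means "use the default".
    scratch_base_gpa: u64,
}

impl SandboxConfiguration {
    /// Override where the scratch region is placed in guest physical
    /// address space; passing 0 restores the default placement.
    pub fn set_scratch_base_gpa(&mut self, gpa: u64) {
        self.scratch_base_gpa = gpa;
    }

    /// Returns `None` while the default placement is in effect.
    pub fn get_scratch_base_gpa(&self) -> Option<u64> {
        (self.scratch_base_gpa != 0).then_some(self.scratch_base_gpa)
    }
}
```

This mirrors the `heap_size_override` pattern the bullet cites: sentinel in the `#[repr(C)]` data, `Option` semantics in the accessors.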

Commit 2: fix: make snapshot region writable for non-init-paging guests

  • shared_mem.rs: Makes snapshot region writable at host level so non-paging guests can write to it

Commit 3: fix: use init-paging feature gate for scratch GVA computation

  • layout.rs + hyperlight_vm.rs: Uses #[cfg(feature = "init-paging")] for scratch GVA computation. With init-paging, the GVA is the high-canonical page-table-mapped address; without it, GVA = GPA (identity mapped). This is a compile-time decision, not a runtime one.
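A minimal sketch of that compile-time split is below; the function name and the high-canonical base constant are illustrative, and only the `init-paging` feature name comes from the PR:

```rust
// Which body exists is decided when the crate is compiled with or without
// the `init-paging` cargo feature -- a compile-time decision, not a
// runtime branch.
#[cfg(feature = "init-paging")]
fn scratch_gva(scratch_base_gpa: u64) -> u64 {
    // With init-paging, page tables map a high-canonical GVA onto the
    // (possibly customized) scratch GPA.
    const HIGH_CANONICAL_BASE: u64 = 0xFFFF_8000_0000_0000; // illustrative
    HIGH_CANONICAL_BASE | scratch_base_gpa
}

#[cfg(not(feature = "init-paging"))]
fn scratch_gva(scratch_base_gpa: u64) -> u64 {
    // Without init-paging the guest is identity-mapped, so GVA == GPA.
    scratch_base_gpa
}
```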

Commit 4: fix: use scratch base GPA instead of scratch size in page table walker

  • mgr.rs: Updates read_guest_memory_by_gva (behind trace_guest/mem_profile features) to pass the correct scratch_base_gpa value instead of scratch_size to access_gpa and SharedMemoryPageTableBuffer::new, matching their updated signatures from commit 1.
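One way to make this class of mix-up (a region size passed where a base address is expected, which type-checks when both are plain integers) unrepresentable is a newtype wrapper. This is a general pattern sketch, not something the PR does; all names below are hypothetical:

```rust
// Illustrative pattern: distinct wrapper types turn a size-vs-base mix-up
// into a compile error instead of a silent logic bug.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ScratchBaseGpa(u64);

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct ScratchSize(usize);

/// Only a base address can be combined with an offset here; passing a
/// ScratchSize would not compile.
fn resolve_gpa(base: ScratchBaseGpa, offset: u64) -> u64 {
    base.0 + offset
}
```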

Test plan

  • `cargo clippy` passes with all feature combinations (`kvm,init-paging`, `kvm,executable_heap`, `kvm,mshv3,mem_profile`, `kvm,mshv3,trace_guest`, etc.)
  • `cargo test -p hyperlight-host --no-default-features -F "kvm,init-paging" --lib` passes
  • Non-init-paging build: `cargo build -p hyperlight-host --no-default-features -F "kvm,executable_heap" --lib` succeeds
  • Existing CI should pass

danbugs added 3 commits March 3, 2026 23:04
Add `set_scratch_base_gpa()` / `get_scratch_base_gpa()` to
`SandboxConfiguration` so that callers can override where the scratch
region is placed in guest physical address space.

By default the scratch region sits at the top of the maximum GPA range
(36-bit on KVM). 32-bit guests cannot address those GPAs, so this
option lets them place the scratch region within their addressable range
(e.g. just below 4 GB).

When unset (left at the zero sentinel), the existing default behaviour is preserved.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
The GVA=GPA decision should be based on the init-paging feature flag,
not on whether scratch_base_gpa is customized. A user could set a
custom GPA while still using init-paging (page tables map default
high-canonical GVA to the custom GPA).

Signed-off-by: danbugs <danilochiarlone@gmail.com>
@danbugs force-pushed the danbugs/configurable-scratch-gpa branch from 9275564 to b961eb1 on March 3, 2026 23:04
@danbugs danbugs added the kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. label Mar 3, 2026
The access_gpa and SharedMemoryPageTableBuffer::new functions now take
a scratch_base_gpa (u64) instead of scratch_size (usize). Update the
read_guest_memory_by_gva caller to pass the correct value from
layout.get_scratch_base_gpa().

Signed-off-by: danbugs <danilochiarlone@gmail.com>
@danbugs force-pushed the danbugs/configurable-scratch-gpa branch from ef6485a to 4b579d3 on March 3, 2026 23:19
@danbugs mentioned this pull request Mar 4, 2026
…mpatibility

MSHV and WHP reject vCPU state with zeroed segment registers (ES, SS,
FS, GS, LDT) and uninitialized XSAVE areas. Properly initialize all
segment registers in standard_real_mode_defaults() and add a reset_xsave()
call after set_sregs() to ensure FPU state (FCW, MXCSR) is valid.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
@dblnz (Contributor) left a comment:


LGTM! However, I am not that well acquainted with this part of the codebase; maybe @syntactically can take a look.

..Default::default()
},
// CR0.ET (bit 4) is hardwired to 1 on modern x86 CPUs.
// MSHV rejects the vCPU state if ET is not set.
Contributor:

How hasn't MSHV failed so far? Was this scenario not tested before?

Member:

I don't believe there are many tests for the 16/32-bit guests.
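For reference, the bit the quoted comment is about can be written out like this. The bit values are the architectural CR0 definitions from the Intel SDM; the helper function is illustrative, not Hyperlight code:

```rust
// CR0 bits relevant to the discussion above (architectural values).
const CR0_PE: u64 = 1 << 0; // protection enable; clear in real mode
const CR0_ET: u64 = 1 << 4; // extension type; hardwired to 1 on modern x86

/// Real-mode CR0: PE stays clear, but ET must still be set or MSHV
/// rejects the vCPU state, per the review comment.
fn real_mode_cr0() -> u64 {
    CR0_ET
}
```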

// MSHV (unlike KVM) does not provide sane defaults for the XSAVE
// area on a freshly created vCPU, and will reject the VP state
// on the first run if it is left uninitialized.
vm.reset_xsave().map_err(VmError::Register)?;
Contributor:

I'm curious how this worked before

Member:

It looks like the issue here is the difference between bringing up the vCPU in real vs. long mode. Does the XSAVE reset call need to be there in both, or should it just be in real mode? I know we had tests of the long-mode MSHV stuff passing before.

Contributor:

I think real vs. long is the issue here. When I did some investigation into MXCSR on KVM a while back, it did work fine in long mode.

Contributor:

I agree that if we don't need this for some targets, it would be nice not to waste cycles here.
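The gating the reviewers are converging on could look roughly like this. The mode enum and helper are illustrative; only `reset_xsave` itself appears in the diff, and whether long mode can safely skip it is exactly the open question above:

```rust
// Hedged sketch: only pay for the XSAVE reset on the real-mode bring-up
// path, since long-mode MSHV bring-up passed tests before this change.
enum CpuMode {
    Real,
    Long,
}

/// Decide whether the extra reset_xsave() vCPU call is needed for a given
/// bring-up mode (assumption: long mode keeps working without it).
fn needs_xsave_reset(mode: &CpuMode) -> bool {
    matches!(mode, CpuMode::Real)
}
```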

@syntactically (Member):

I'm not clear on the need for a configurable region base for scratch. You can just set MAX_GPA/MAX_GVA based on target architecture + feature flags? (See src/hyperlight_common/src/arch/i686/layout.rs.) Maybe the target_arch-based selection of which layout module to use is broken in the host for the i686 case, where it's the guest arch that counts; maybe that needs to change to something like `#[cfg_attr(any(target_arch = "x86", all(target_arch = "x86_64", not(feature = "init-paging"))), path = "arch/i686/layout.rs")]`. (Or we could add cargo cfg aliases for the guest arch, or something?) I would really prefer to avoid re-adding more configuration knobs.

I'm not clear on how the idea of making snapshot pages writable is going to work with the snapshot-first lifecycle (#1268). This seems like it will break a lot of assumptions and make a lot of code more complex/rife with opportunities for vulnerabilities. If you really are unable to run on amd64 natively, would teaching hyperlight about i686 paging structures be a slightly smaller break of the conceptual model?

@jsturtevant (Contributor):

> I'm not clear on how the idea of making snapshot pages writable is going to work with the snapshot-first lifecycle (#1268).

This seems like something we could flesh out in that discussion? With the CoW changes we've broken some downstream consumers, so it makes sense to give an escape hatch (maybe behind a feature flag) while we sort through what we want to do. As far as I understand, this isn't a breaking change and doesn't introduce issues now, or does it? To be clear, I think we do need to reconcile this requirement, and enabling this wouldn't be a long-term solution.


@ludfjig (Contributor) left a comment:


Does Nanvix not use snapshotting? If we can do this without adding a config option, I think that would be good too.

