[1.14] Advertise wait ack bit in free page hinting #14

Merged
kalyazin merged 4 commits into e2b-dev:firecracker-v1.14-direct-mem from kalyazin:feat/fph-wait-ack-1.14
May 12, 2026

kalyazin commented May 8, 2026

Changes

Advertise VIRTIO_BALLOON_F_HINT_WAIT_ON_ACK unconditionally when free page hinting is enabled.
Guest kernel support for this feature is added in e2b-dev/fc-kernels#19.

Reason

This addresses a race condition that occurs under memory pressure in the guest. See the kernel patch for details.

kalyazin and others added 3 commits May 8, 2026 16:31
Whenever free-page hinting is enabled, also advertise the new
VIRTIO_BALLOON_F_HINT_WAIT_ON_ACK feature bit (6). When negotiated,
the guest driver waits for the device to signal-used each hint buffer
before pushing the just-hinted page onto vb->free_page_list, closing
a stale-hint data-loss race where the shrinker could recycle a page
back to the buddy allocator before discard_range completed on the host.

Guests without kernel support for bit 6 simply do not negotiate it
(the driver self-clears the bit if VIRTIO_BALLOON_F_FREE_PAGE_HINT is
not also negotiated), so this is forward-compatible with stock guests.
No host-side protocol change is required: process_free_page_hinting_queue
already calls signal_used_queue once per drain, which serves as the
ACK the guest waits on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nikita Kalyazin <nikita.kalyazin@e2b.dev>
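
For readers who want the negotiation logic from the commit above in concrete form, here is a minimal sketch. It is a toy model, not Firecracker or kernel code: `device_features` and `negotiate` are hypothetical names, other feature bits the device offers are ignored, and only the bit numbers 3 and 6 come from this PR.

```python
# Bit positions quoted in this PR. Bit 6 comes from the out-of-tree
# kernel patch referenced above.
VIRTIO_BALLOON_F_FREE_PAGE_HINT = 3
VIRTIO_BALLOON_F_HINT_WAIT_ON_ACK = 6

FPH = 1 << VIRTIO_BALLOON_F_FREE_PAGE_HINT
ACK = 1 << VIRTIO_BALLOON_F_HINT_WAIT_ON_ACK


def device_features(free_page_hinting: bool) -> int:
    """Hypothetical model of the balloon bits the device offers."""
    features = 0
    if free_page_hinting:
        # The wait-on-ACK bit is offered whenever free page hinting is on;
        # there is no separate config knob.
        features |= FPH | ACK
    return features


def negotiate(offered: int, driver_supported: int) -> int:
    """Hypothetical model of negotiation as seen from the guest driver."""
    negotiated = offered & driver_supported
    # Self-clear: wait-on-ACK is dropped unless free page hinting
    # was negotiated as well.
    if not negotiated & FPH:
        negotiated &= ~ACK
    return negotiated


# A stock guest knows only bit 3; a patched guest knows bits 3 and 6.
stock_guest = negotiate(device_features(True), FPH)
patched_guest = negotiate(device_features(True), FPH | ACK)
```

In this model a stock guest ends up with only bit 3 set, which is why advertising bit 6 unconditionally is expected to be harmless for unpatched kernels.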
Adds a guest-side check that the negotiated balloon features in
/sys/bus/virtio/devices/virtioN/features include bit 3 (FREE_PAGE_HINT)
and bit 6 (HINT_WAIT_ON_ACK) when free_page_hinting is enabled.

The test is gated on a new dedicated marker, requires_patched_kernel,
which is registered in tests/pytest.ini and added to the default -m
exclusion filter so the test is auto-skipped by every CI run (regular
and nightly). To run it, replace the 6.1 artifact vmlinux with a build
that carries Jack Thomson's wait-on-ACK patch and invoke:

    tools/devtool -y test -- -m requires_patched_kernel \
        tests/integration_tests/functional/test_balloon_wait_on_ack.py

If the kernel is not patched, the bit-6 assertion fails with a clear
"did you replace the kernel?" message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nikita Kalyazin <nikita.kalyazin@e2b.dev>
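
The guest-side check described in the commit above could look roughly like the sketch below. This is not the test added in this PR. It assumes the sysfs `features` file holds one `0`/`1` character per feature bit with bit 0 first, and `has_feature` and `check_balloon_features` are hypothetical helper names.

```python
VIRTIO_BALLOON_F_FREE_PAGE_HINT = 3
VIRTIO_BALLOON_F_HINT_WAIT_ON_ACK = 6


def has_feature(features: str, bit: int) -> bool:
    """Return True if `bit` is set in a sysfs virtio features string.

    Assumption: character i of the string is '1' when feature bit i
    was negotiated, and '0' otherwise.
    """
    features = features.strip()
    return bit < len(features) and features[bit] == "1"


def check_balloon_features(features: str) -> None:
    """Assert that both hinting bits were negotiated by the guest."""
    assert has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT), (
        "FREE_PAGE_HINT (bit 3) was not negotiated"
    )
    assert has_feature(features, VIRTIO_BALLOON_F_HINT_WAIT_ON_ACK), (
        "HINT_WAIT_ON_ACK (bit 6) was not negotiated: "
        "did you replace the kernel?"
    )


# Illustrative 64-character strings, not captured from a real guest.
patched = "".join("1" if i in {3, 6, 32} else "0" for i in range(64))
stock = "".join("1" if i in {3, 32} else "0" for i in range(64))

check_balloon_features(patched)  # passes silently
```

In a real test the string would be read from /sys/bus/virtio/devices/virtioN/features inside the guest; calling `check_balloon_features(stock)` raises an `AssertionError` carrying the bit-6 message.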
Add a subsection under free_page_hinting describing the behaviour of
VIRTIO_BALLOON_F_HINT_WAIT_ON_ACK: always advertised alongside FPH,
self-cleared by guests without the supporting kernel patch, no
separate config knob, and a note on the per-buffer round-trip cost on
supported guests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Nikita Kalyazin <nikita.kalyazin@e2b.dev>
@cla-bot cla-bot Bot added the cla-signed label May 8, 2026
cursor Bot commented May 12, 2026

PR Summary

Medium Risk
Changes virtio balloon feature negotiation for all VMs using free_page_hinting, which can alter guest behavior and performance and may expose compatibility issues with older drivers.

Overview
Firecracker now always advertises VIRTIO_BALLOON_F_HINT_WAIT_ON_ACK when free_page_hinting is enabled, which may increase hinting latency on supporting guests and is only validated by guest-side behavior.

The only end-to-end coverage for negotiation is a new integration test that is excluded from CI and depends on a manually swapped, out-of-tree patched kernel artifact, so regressions/compatibility issues are easy to miss. Documentation is updated to describe the new bit and its guest-kernel dependency.

Reviewed by Cursor Bugbot for commit b4fbe13. Bugbot is set up for automated code reviews on this repo.

@kalyazin kalyazin merged commit f0a35a1 into e2b-dev:firecracker-v1.14-direct-mem May 12, 2026
4 checks passed
@kalyazin kalyazin deleted the feat/fph-wait-ack-1.14 branch May 12, 2026 09:02
kalyazin (Author) commented

> PR Summary
>
> Medium Risk: Changes virtio balloon feature negotiation for all VMs using free_page_hinting, which can alter guest behavior and performance and may expose compatibility issues with older drivers.
>
> Overview: Firecracker now always advertises VIRTIO_BALLOON_F_HINT_WAIT_ON_ACK when free_page_hinting is enabled, which may increase hinting latency on supporting guests and is only validated by guest-side behavior.
>
> The only end-to-end coverage for negotiation is a new integration test that is excluded from CI and depends on a manually swapped, out-of-tree patched kernel artifact, so regressions/compatibility issues are easy to miss. Documentation is updated to describe the new bit and its guest-kernel dependency.
>
> Reviewed by Cursor Bugbot for commit b4fbe13. Bugbot is set up for automated code reviews on this repo.

This is a known and accepted gap. We will be working on improving test coverage.
