feat(block): add Tracker with NotPresent/Dirty/Zero states#2545
feat(block): add Tracker with NotPresent/Dirty/Zero states#2545ValentaTomas merged 5 commits intomainfrom
Conversation
PR SummaryMedium Risk Overview Updates the UFFD test harness to treat wire Reviewed by Cursor Bugbot for commit dcb7de4. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7b5a3f8ab6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 7b5a3f8. Configure here.
There was a problem hiding this comment.
LGTM — small, self-contained utility with thorough unit tests; no consumers yet.
Extended reasoning...
Overview
This PR adds a new block.StateTracker[S comparable] generic utility (70 lines) backed by two roaring bitmaps, plus a thorough unit test file (97 lines). It exposes SetRange, Get, and Export. The PR is purely additive — no existing files are modified, no callers exist yet, and the consumer (#2541) lands separately.
Security risks
None. This is in-process state tracking on integer page indices with mutex-guarded reads/writes. No I/O, no untrusted input, no privileged operations. Export correctly returns clones to prevent caller mutation from leaking into the tracker (verified in the test).
Level of scrutiny
Low. The change is a pure helper struct with a tight API surface, lives under pkg/sandbox/block, and has no consumers in this PR. Mutex usage is correct (write lock for SetRange, read lock for Get/Export). The uint64 vs uint32 asymmetry between SetRange and Get is documented in a comment and matches roaring's API contract (end can equal 1<<32).
Other factors
Tests cover all transition pairs (default↔a, default↔b, a↔b, idempotent), partial overlap, empty/inverted ranges, and Export aliasing. Author confirms go test -race passes. The bug hunting system found no issues. The switch in SetRange silently no-ops on unknown states, which is a minor footgun if a caller passed a fourth value, but since states are fixed at construction time this is acceptable.
95659d3 to
bf4fc62
Compare
bf4fc62 to
02f8da8
Compare
|
Does it really need to be that "generic"? It seems in all cases we need one default case (I assume "not present") and two other states (dirty and removed ?) Are the NBD states that different? |
|
Ditched the generic. Now a non-generic |
❌ 5 Tests Failed:
View the top 2 failed test(s) by shortest run time
View the full list of 91 ❄️ flaky test(s)
To view more test analytics, go to the Test Analytics Dashboard |
d5b8ccc to
d8ed9f4
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d8ed9f4c8d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
a762cc9 to
b65129f
Compare
|
Re: Bugbot summary on UFFD page-state offset reconstruction — non-issue. |
CI Feedback 🧐A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
|
There was a problem hiding this comment.
LGTM — small, well-tested introduction of block.Tracker; the only production change in userfaultfd.go is a mechanical swap from a map keyed by addr to a bitmap keyed by page index, with equivalent semantics.
Extended reasoning...
Overview
This PR introduces a new block.Tracker type (with NotPresent/Dirty/Zero states backed by two roaring bitmaps) plus comprehensive tests, and migrates the test-only UFFD pageTracker map to it. The only production behavior change in userfaultfd.go is replacing pageTracker.setState(addr, addr+pageSize, faulted) with pageTracker.SetRange(idx, idx+1, block.Dirty) where idx = header.BlockIdx(offset, pageSize). The remaining changes are in test files (rpc_services_test.go, helpers_test.go) and a wire-comment update.
Security risks
None identified. This is sandbox infrastructure for tracking page state metadata — it doesn't grant access, alter permissions, or touch crypto/auth. The tracker is internally consistent (uses RWMutex, Export returns clones), and SetRange defensively no-ops on out-of-range bounds to prevent panics from the underlying 32-bit roaring bitmap.
Level of scrutiny
Low-to-moderate. While the file lives under sandbox/uffd (memory state for guest VMs), the actual prod-side delta is a single equivalent-semantics swap. The new Tracker API has 92.68% patch coverage with regression tests covering transitions, partial overlap, idempotency, empty/inverted ranges, out-of-range bounds (the boundary end == 1<<32 case is explicitly tested), and Export-clone independence. header.BlockIdx is a simple integer division and UFFD page faults are page-aligned, so the offset↔index round-trip is exact (matching prior snapshot semantics, as the author noted).
Other factors
All prior automated review concerns have been addressed: the SetRange unsupported-state issue was fixed by ditching the generic in favor of a fixed enum, the upper-bound panic concern was addressed in b65129f with a guard and a boundary regression test, and reviewer bchalios's design feedback ("does it really need to be that generic?") was incorporated by simplifying to a non-generic type. Codecov reports no failed tests.
b65129f to
bf920d8
Compare
A small two-state-plus-default tracker backed by roaring bitmaps. Used by upcoming UFFD work to track page states (Missing/Faulted/Removed) and by NBD to track zero pages, replacing ad-hoc map-based trackers with O(1) range ops and cheap snapshot exports.
…state Replace the map-based pageTracker with block.StateTracker[pageState], a roaring-bitmap-backed tracker with O(1) range ops. pageState gains a third value, removed, which is wired at the type level but not yet written anywhere -- #2520 adds the REMOVE-event handler that produces it. Page indices are computed at the call site via header.BlockIdx. pageStateEntries is updated to iterate the exported bitmaps so the cross-process test harness keeps working. Inline the 3-line pageState enum into userfaultfd.go and drop the dedicated page_tracker.go now that pageTracker is gone. Convert block.StateTracker's NewStateTracker / SetRange API from panics to errors. Distinct-state validation and unsupported-state checks now return fmt.Errorf descriptors; the userfaultfd-side init propagates the constructor error through NewUserfaultfdFromFd, and the SetRange call in the worker path logs and continues since these errors only fire on programming bugs.
Replace the generic block.StateTracker[S] with a non-generic block.Tracker over a fixed block.State enum (NotPresent default, Dirty, Zero). The same three states cover both UFFD page management (NotPresent=missing, Dirty=faulted, Zero=removed via DONTNEED/balloon) and upcoming NBD overlay tracking (NotPresent=fall-through, Dirty=overlay-owned, Zero=explicit zero/discard).
37be905 to
dcb7de4
Compare
|
CI status update: Failure signature: |
… COPY (#2520) Handles `UFFD_EVENT_REMOVE` so balloon-deflate / `madvise(MADV_DONTNEED)` transitions pages to a `removed` state instead of leaving stale `faulted` mappings. Drains the REMOVE batch under `settleRequests.Lock`; workers hold `settleRequests.RLock` across the state read → `UFFDIO_COPY` → `SetRange` sequence so a concurrent REMOVE can't slip between the read and the install. Soft-fails `UFFDIO_COPY` EAGAIN / partial copies onto a `deferredFaults` queue and wakes the poll loop via a self-pipe. Depends on #2545. --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Adds
block.Tracker, a non-generic state tracker over a fixedblock.Stateenum with three universal values that cover both UFFD memory handling and NBD overlay tracking:NotPresent(default) — fall through to the previous layerDirty— materialized in this layer (UFFD faulted page, NBD overlay-owned block)Zero— known-zero without consulting the previous layer (UFFD page removed viaDONTNEED/balloon, NBD explicit zero/discard)Backed by two roaring bitmaps (
dirty,zero);NotPresentindices are implicit. ExposesSetRange,Get, andExport(returns clones).First consumer is #2520 (UFFD per-page state); also used by #2546 for NBD zero-page tracking.