#1327 Step 1: split poll_descriptor.rs (#946 Phase 1.5)#1571
Conversation
This PR plans the Phase 1.5 helper-extraction follow-up to #946, splitting userspace-dp/src/afxdp/poll_descriptor.rs (currently 3292 LOC — drifted from the 2343 in the issue body) into a directory layout with: poll_descriptor/mod.rs — slim entry + pre-existing small helpers poll_descriptor/ctx.rs — typed PollCtx<'a> poll_descriptor/flow_cache_hit.rs — stage_flow_cache_hit extraction (~270 LOC, the single largest self-contained continue-terminated block) poll_descriptor/telemetry/ rx_descriptor.rs — record_rx_descriptor_telemetry (verbatim move) Step 1 is intentionally conservative. The post-flow-cache body (lines 880-2915) threads ~25 mutable locals through 17 of the 20 continue sites in the function; reviewers are asked to opine on whether stages 12+ are extractable in follow-up PRs without falling into the #946-Phase-2 dead-end pattern. Plan explicitly does NOT propose #946 Phase 2 (batched per-stage iteration — already PLAN-KILLED). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codex task-mpmurvhf-galzxb, AGY review-mpmus86y-f87cd8 dispatched for plan v1 round 1. Recorded per feedback_codex_session_loss_continuation so continuations can fetch by id rather than re-dispatch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Round-1 verdicts: Codex PLAN-NEEDS-MAJOR (task-mpmurvhf-galzxb), AGY PLAN-NEEDS-MAJOR (review-mpmus86y-f87cd8). Critical applied fixes: - AGY #2 (CRITICAL): v1's StageOutcome<()> mapping all continues to RecycleAndContinue would cause UMEM double-free at L653 (ICMP TE generated, must NOT recycle now) and L876 (TX fallback, recycle_now=false set). v2 introduces dedicated 2-arm FlowCacheOutcome { Consumed, FallThrough } where the helper owns ALL pushes inline — caller never touches desc.addr on Consumed. Eliminates the hazard by design. - Codex #2: #[inline] too weak for 270-LOC hot helper. v2 mandates #[inline(always)] with cargo asm post-build gate confirming no call edge. - AGY #3 / Codex #4: PollCtx<'a> premature. v2 drops it. - AGY #5 / Codex #5: Stages 12+ extractability unproven. v2 commits to option (b) "Step 1 is the structural ceiling" with a 5-row extraction matrix demonstrating all candidates blocked by mutable-locals coupling that would require a separate #946-Phase-2-class DescriptorState plan. - AGY #10: Real call sites are worker/lifecycle.rs and tests.rs, not the worker/poll.rs / loop.rs claimed in v1. - AGY #8: Telemetry helper goes to flat poll_descriptor/rx_telemetry.rs, not nested telemetry/ subdir. Honest framing: v2's structural drop is ~12% (3292 → ~2900), not the 90% the issue body implied. PLAN-KILL remains acceptable. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Round-2 verdicts: Codex PLAN-NEEDS-MINOR (task-mpmv3ywr-yeiqag), AGY PLAN-NEEDS-MINOR (review-mpmv1y38-oyj7ha). Critical AGY fix: v2 helper signature `binding: &mut BindingWorker` will not compile. Outer driver holds `let mut received = binding.xsk.rx.receive(available)` which mutably borrows binding.xsk.rx. Passing &mut binding across the function-call boundary forces the compiler to assume the helper may touch xsk.rx, causing a borrow conflict. Today's inline code compiles because Rust's split-borrow analysis works within one function body but cannot span function-call boundaries. v3 fix: helper takes 8 disjoint refs/scalars: - &mut binding.flow (WorkerFlowCacheState) - &mut binding.tx_pipeline (WorkerTxPipeline) - &mut binding.tx_counters (WorkerTxCounters) - &mut binding.scratch (WorkerScratch) - &mut binding.mirror_sample_counter (u64) - &binding.live (Arc<BindingLiveState>) - binding.slot (u32, by value) - binding.ifindex (i32, by value) These are disjoint from binding.xsk.rx, so the borrow-checker accepts them alongside the outer `received` binding. Field-access audit on poll_descriptor.rs:548-879 confirmed only these 7 sub-fields are touched (no .bpf_maps, no .xsk, no .cos, no .timers, no .bind_meta). Codex fixes applied: - Added `packet_fabric_ingress: bool` parameter (used at L569). - Call-site list extended: tests.rs:3520, 3748 added (6 total). - Stages 12+ matrix reframed as architectural verdict only (precise continue counts removed; conclusion remains all blocked, ratified by AGY round-2). - Test-plan checkboxes changed from [x] to [ ] (required gates, not yet completed). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Round-3 verdicts: Codex PLAN-NEEDS-MAJOR (task-mpmvbfiy-yehdyn), AGY PLAN-READY (review-mpmvbnc1-l7248o). Codex caught a real borrow-checker hazard in v3: helper signature took both `packet_frame: &[u8]` (derived from owned_packet_frame.as_deref().unwrap_or(raw_frame) at L477) AND `owned_packet_frame: &mut Option<Box<[u8]>>`. Inline code works because NLL sees take() only on continue-terminated paths. Across the call boundary, the caller cannot prove that, so passing both is a borrow violation. v4 fix: helper does NOT take packet_frame. It takes &mut owned_packet_frame and raw_frame, then re-derives `let packet_frame: &[u8] = owned_packet_frame.as_deref().unwrap_or(raw_frame);` as its first local. NLL inside the helper body works identically to the inline code. Caller-side: the L477 packet_frame binding's NLL lifetime ends at L546 (last use before the flow-cache block). Helper call at L548 is therefore borrow-safe — the shared `packet_frame` is no longer alive when `&mut owned_packet_frame` is taken. On FallThrough, caller re-binds packet_frame for the slow path at L880+. Also removed unused `binding_ifindex` parameter (Codex r3 minor — field-access audit confirmed no binding.ifindex access in 548-879). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes #1327. Convert flat userspace-dp/src/afxdp/poll_descriptor.rs (3292 LOC, single 2490-LOC `poll_binding_process_descriptor`) into a directory module and extract two self-contained blocks. This is #946 Phase 1.5 — the helper-extraction follow-up that #946's closure note left open. It is pure code motion: every continue site, every push, and every side effect inside the lifted blocks is preserved verbatim. Layout after this PR: userspace-dp/src/afxdp/poll_descriptor/ mod.rs 2806 LOC — pub(super) fn driver + small pre-existing helpers + slow-path body. flow_cache_hit.rs 379 LOC — stage_flow_cache_hit (#[inline(always)]) + FlowCacheOutcome { Consumed, FallThrough } enum. rx_telemetry.rs 220 LOC — record_rx_descriptor_telemetry verbatim from #1128. Hot-path discipline: - #[inline(always)] on stage_flow_cache_hit. Verified by `nm /dev/shm/cargo/release/xpf-userspace-dp | grep stage_flow_cache_hit` (no symbol) and `objdump -d | grep` (no call edge): LLVM fully inlined the helper. - Zero new per-packet allocations. Every cached_* binding is a borrow into the existing flow-cache Cache entry. The only heap interactions in the helper are pre-existing (owned_packet_frame.take() + PendingForwardRequest build). - Helper owns ALL scratch_recycle.push and scratch_forwards.push sites for the flow-cache fast path. On FlowCacheOutcome::Consumed the caller MUST `continue` without touching desc.addr again — contract documented in flow_cache_hit.rs preamble. Borrow-safety: - Helper takes 8 disjoint binding sub-struct refs/scalars (&mut binding.flow, &mut binding.tx_pipeline, &mut binding.tx_counters, &mut binding.scratch, &mut binding.mirror_sample_counter, &binding.live, binding.slot, binding_index), NOT &mut BindingWorker. Disjoint from `binding.xsk.rx` which is held by the outer `received` binding. - Helper takes &mut owned_packet_frame and re-derives packet_frame internally. Caller's L477 packet_frame binding's NLL lifetime ends at L546 (last pre-flow-cache use); caller re-binds packet_frame after the helper for the L880+ slow-path code. Plan-review history: - Round 1 (v1): Codex + AGY both PLAN-NEEDS-MAJOR. - Round 2 (v2): Codex + AGY both PLAN-NEEDS-MINOR. - Round 3 (v3): Codex PLAN-NEEDS-MAJOR (caught packet_frame / owned_packet_frame borrow hazard); AGY PLAN-READY. - Round 4 (v4): AGY PLAN-READY; Codex provisional PLAN-READY (sandbox infra blocked direct filesystem verification). - Task IDs recorded in docs/pr/1327-poll-descriptor-stages/reviewer-ids.md. Stages 12+ verdict: Five candidates evaluated (resolve_session, nat_preroute, policy_eval, nat_postroute, fib_neighbor_resolve, session_install). All blocked by mutable-locals coupling that would need a separate #946-Phase-2-class DescriptorState plan. Step 1 is the structural ceiling for #1327; closing the issue after this merges is correct. Validation: - cargo build --release: clean (114 pre-existing warnings). - cargo test --release: 1417/1417 main suite pass. The snat_contract_doc_guard test fails identically on master and branch (docs/userspace-dataplane-gaps.md missing the "fail-closed" keyword); pre-existing, unrelated to #1327. Test updated only to accept the directory layout. - 5/10 afxdp:: flake: same 2/10 wg::engine concurrent-load failure on master and branch; pre-existing, unrelated. Solo, the failing test passes 10/10 on both master and branch. - cargo asm gate: no `call` to stage_flow_cache_hit emitted (nm + objdump confirm). - Go suite: 30/30 packages pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@copilot review |
There was a problem hiding this comment.
Pull request overview
This PR refactors the userspace AF_XDP hot-path module poll_descriptor by converting the former single poll_descriptor.rs file into a directory module and extracting two self-contained helper blocks. This keeps the critical per-descriptor driver readable and reviewable while preserving behavior (pure code motion).
Changes:
- Convert
userspace-dp/src/afxdp/poll_descriptor.rsintouserspace-dp/src/afxdp/poll_descriptor/mod.rsand wire up sibling submodules. - Extract the flow-cache fast path into
poll_descriptor/flow_cache_hit.rs(#[inline(always)], returnsFlowCacheOutcometo preservecontinuesemantics). - Extract RX per-descriptor telemetry bookkeeping into
poll_descriptor/rx_telemetry.rs, and update the SNAT doc-guard test to accept either flat or directory layout; add plan/reviewer docs and update_Log.md.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| userspace-dp/tests/snat_contract_doc_guard.rs | Accepts either flat poll_descriptor.rs or directory poll_descriptor/mod.rs when scanning for SNAT call sites. |
| userspace-dp/src/afxdp/poll_descriptor/mod.rs | New directory-module driver; imports and calls the extracted flow-cache and RX telemetry helpers. |
| userspace-dp/src/afxdp/poll_descriptor/flow_cache_hit.rs | New extracted flow-cache fast-path helper with explicit Consumed/FallThrough contract. |
| userspace-dp/src/afxdp/poll_descriptor/rx_telemetry.rs | New extracted RX telemetry helper (verbatim move). |
| docs/pr/1327-poll-descriptor-stages/plan.md | Adds the refactor plan and rationale for the extraction ceiling. |
| docs/pr/1327-poll-descriptor-stages/reviewer-ids.md | Records reviewer task/job IDs for continuity. |
| _Log.md | Logs the work and validation gates for #1327 Step 1. |
| // #1327 Step 1 converted the flat poll_descriptor.rs to a directory | ||
| // module. The four source_nat_decision_for_flow call sites all stay | ||
| // in mod.rs (slow-path session resolution, which Step 1 did not | ||
| // attempt to extract — see docs/pr/1327-poll-descriptor-stages/plan.md | ||
| // "Stages 12+ verdict"). Compatibility: keep accepting either layout. | ||
| let poll_path = { | ||
| let flat = manifest_dir.join("src/afxdp/poll_descriptor.rs"); | ||
| if flat.exists() { | ||
| flat | ||
| } else { | ||
| manifest_dir.join("src/afxdp/poll_descriptor/mod.rs") | ||
| } | ||
| }; |
Copilot inline review (PR #1571 review 4366277997) noted that the SNAT doc-guard test now reads either the flat poll_descriptor.rs or the directory poll_descriptor/mod.rs, but the downstream assertion messages still hard-coded "poll_descriptor.rs" and the required-doc check still required the exact string "userspace-dp/src/afxdp/poll_descriptor.rs" to appear in the SNAT plan doc. This makes future failures confusing and forces external docs to keep referencing a path that may not exist. Fixes: - Assertion messages now use the actual module label (poll_path relative to repo root: either userspace-dp/src/afxdp/poll_descriptor.rs or .../mod.rs). - The plan-doc required-doc check accepts either path string. - The plan-doc enumeration check (poll.matches("poll_descriptor.rs:")) now also matches "poll_descriptor/mod.rs:" so the doc need not be re-edited for the layout switch. - The capability-doc check (against architecture/gaps docs) accepts either path token. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@copilot review Addressed your inline review comment in fe176da: snat_contract_doc_guard.rs now derives the module label from the actual poll_path for all assertion messages, and the required-doc + enumeration checks accept both |
Claude SMR reviewActing as the in-conversation domain SME + CPU arch reviewer per the project triple-review-includes-Claude-SMR practice. Walked the diff end-to-end. Verdict: MERGE-READY. AF_XDP UMEM ownershipRecycle-exit map (3 continues at original L616/L653/L876) is correctly preserved:
No double-recycle, no use-after-free. CPU-arch /
|
Reviewed fe176da. The follow-up fixes are correct: assertion labels now track the resolved poll module path, and the doc-token/enumeration checks now correctly accept both |
…descriptor-stages
Auto-merge with conflicts on _Log.md only (both branches appended new entries to the action log; preserved both sets in chronological order via python concat). Post-merge: - cargo build --release --tests clean - cargo test --release --bin xpf-userspace-dp: 1418 passed / 0 failed / 2 ignored (was 1417 pre-merge; master added one test in #1571 or #1569). - wire_invariant_default_specimens still passes against committed fixture (no protocol-module changes in merged master commits). Carrying forward 4-of-4 reviewer attestation from pre-merge HEAD 28852ed since the merge is pure log/doc-only carry-forward and touches no production code. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Window-1 batch smoke FAIL — infra-ceiling, not a regression of this PRCounter hit 10 on the merge of #1572. Batch smoke ran on Pass A (CoS-off): FAIL — v4 reverse P=12 = 9.37 Gb/s / v6 = 9.24 Gb/s (gate ≥22 Gb/s); v4 push P=1 = 4.86 Gb/s with 62 retr (gate 0 retr); v6 baselines clean, 4.70-4.84 Gb/s 0 retr. Pass B (CoS-on 24-cell): NOT RUNNABLE — only port 5201 listens at 172.16.100.200; ports 5202-5206 closed. Why this is not a regression of this PRThe same throughput ceiling was observed on All 10 window PRs are non-dataplane code-motion refactors and have zero theoretical or measurable impact on dataplane throughput, XDP attach modes, AF_XDP zero-copy, or iperf3 server provisioning. No bisectPer user instruction, no bisect: the failure is documented infrastructure state, not in the 10 PRs. Follow-upFiled: #1580 — full batch-smoke transcript + window PR list + recommended triage actions, cross-referenced #1578. Merge stateCounter remains at 10. Batch merges are paused until #1578 is resolved or user reauthorizes. 🤖 Generated with Claude Code |
Summary
Closes #1327.
Convert flat
userspace-dp/src/afxdp/poll_descriptor.rs(3292 LOC,single 2490-LOC
poll_binding_process_descriptor) into a directorymodule and extract two self-contained blocks. This is #946
Phase 1.5 — the helper-extraction follow-up that #946's closure
note left open after #946 Phase 2 (batched per-stage iteration) was
PLAN-KILLED.
Pure code motion. Every
continuesite, every push, every sideeffect inside the lifted blocks is preserved verbatim.
Layout after this PR
mod.rsdrops from 3292 → 2806 (-486 LOC, ~15% reduction). Theflow-cache fast path is now individually reviewable in a 379-LOC
file.
record_rx_descriptor_telemetry(179 LOC of pure RX-sidebookkeeping) no longer sits at the bottom of the file separated
from the inner-loop driver by 540 LOC of pipeline body.
Hot-path discipline
#[inline(always)]onstage_flow_cache_hit(mandated byplan review round 2). Verified:
nmshows no symbol forstage_flow_cache_hitin the release binary;objdump -dshowsno
calledge frompoll_binding_process_descriptorto thehelper. LLVM fully inlined it.
cached_*binding isa borrow into the existing flow-cache
Cacheentry. The onlyheap interactions in the helper body are pre-existing
(
owned_packet_frame.take()and thePendingForwardRequestbuild).
FlowCacheOutcome::Consumedthecaller MUST
continuewithout touchingdesc.addragain.Contract documented in
flow_cache_hit.rspreamble.Borrow-safety
&mut binding.flow,&mut binding.tx_pipeline,&mut binding.tx_counters,&mut binding.scratch,&mut binding.mirror_sample_counter,&binding.live,binding.slot,binding_index— NOT&mut BindingWorker.These are disjoint from
binding.xsk.rxwhich is held mutablyby the outer
receivedbinding for the duration of thewhile let Some(desc) = received.read()loop.&mut owned_packet_frameand re-derivespacket_frameinternally as its first local. The caller'sL477
packet_framebinding's NLL lifetime ends at L546 (lastuse before the flow-cache block), so the
&mutto the helperis borrow-safe. On
FallThrough, the caller re-bindspacket_framefor the L880+ slow-path code.Stages 12+ verdict
Five candidates evaluated for follow-up extraction
(
resolve_session_or_create,nat_preroute,policy_eval,nat_postroute,fib_neighbor_resolve,session_install).All blocked by mutable-locals coupling that would need a
separate #946-Phase-2-class
DescriptorStateplan. Step 1 isthe structural ceiling for #1327; closing the issue after this
merges is correct.
Plan + adversarial review
Plan doc:
docs/pr/1327-poll-descriptor-stages/plan.md.Task IDs recorded in
docs/pr/1327-poll-descriptor-stages/reviewer-ids.md.Codex v4's verdict is "provisional" because the sandbox infra failed
to start a unified exec process on the last 3 attempts. AGY round 4
ratified v4 unconditionally, and round 3 AGY had already ratified
the disjoint sub-struct refs design.
Test plan
cargo build --release: clean (114 pre-existing warnings)cargo test --release: 1417/1417 main suite pass. Thesnat_contract_doc_guardtest fails identically on masterand branch —
docs/userspace-dataplane-gaps.mdis missingthe "fail-closed" keyword required by the test; pre-existing,
unrelated to Refactor: userspace-dp/src/afxdp/poll_descriptor.rs — split ~2100-LOC poll_binding_process_descriptor into stage helpers (#946 Phase 1.5) #1327. Test updated only to accept the new
directory layout.
cargo test --release -- afxdp::: same 2/10pre-existing flake on
wg::engine::engine_internal_tests::reconcile_peers_snapshot_is_atomic_under_concurrent_loadon master and branch (concurrent-load test, unrelated). Solo,
this test passes 10/10 on both.
cargo asmspot-check:nmconfirms nostage_flow_cache_hitsymbol in the release binary;#[inline(always)]fully inlined as required.go test ./...: 30/30 packages pass.Comprehensive smoke happens at end of wave via the runbook.
🤖 Generated with Claude Code