Skip to content

v0.1: fast-path BPF program + toolchain + verifier-pass test (PR #3)#3

Merged
lunarthegrey merged 9 commits intomainfrom
pr3-bpf-program
Apr 20, 2026
Merged

v0.1: fast-path BPF program + toolchain + verifier-pass test (PR #3)#3
lunarthegrey merged 9 commits intomainfrom
pr3-bpf-program

Conversation

@lunarthegrey
Copy link
Copy Markdown
Contributor

@lunarthegrey lunarthegrey commented Apr 20, 2026

Summary

Stands up the BPF half of the v0.1 fast-path: the §4.4 XDP program, §4.5 maps, §4.6 counter enum, the nightly+bpfel-unknown-none+bpf-linker toolchain pipeline, and a verifier-pass integration test that round-trips the program through the kernel's BPF verifier in CI. Userspace still stubs run; the aya loader, attach/detach, pin registry, and netns integration test land in PR #4.

What landed

BPF crate — crates/modules/fast-path/bpf/

New non-workspace crate on nightly Rust targeting bpfel-unknown-none. Separate from the main workspace so the rest of the tree stays on stable.

  • rust-toolchain.toml pins nightly-2026-04-14; .cargo/config.toml sets linker = bpf-linker and build-std = ["core"] (the target has no prebuilt rust-std).
  • aya-ebpf =0.1.1, network-types =0.1.0 pinned per SPEC §7.1.
  • Root Cargo.toml workspace.excludes the BPF subdir; build.rs invokes a nested cargo against it.

§4.4 program

Parse Ethernet → IPv4/IPv6 (with IHL + fragment + TTL + complex-header checks) → LPM src-or-dst match (§4.2) → bpf_fib_lookup → TTL decrement + RFC 1624 incremental csum fixup → MAC rewrite → defensive devmap pre-check (§4.4 step 9d) → bpf_redirect_map. Every verdict path bumps its corresponding counter from §4.6.

  • Dry-run via cfg map bypasses rewrites and short-circuits to XDP_PASS after matched-counter bumps.
  • 802.1Q/802.1ad tagged frames XDP_PASS — VLAN choreography (§4.7) is PR v0.1: VLAN choreography, packet fixtures, kernel 6.0+ probe fix (PR #5) #5.
  • IPv4 checksum fixup decrements the csum by 0x0100 in one's-complement arithmetic (the word at bytes 8-9 is TTL:proto; TTL-1 is equivalent to the checksum's M-word shifting +0x0100).

§4.5 maps + §4.6 counters

All §4.5 maps except vlan_resolve:

  • allow_v4, allow_v6: LpmTrie with 1024-entry max, matches src-or-dst per §4.2.
  • cfg: 1-entry Array<FpCfg>. FpCfg has a reserved version: u32 byte so future additions don't break userspace reads of older layouts.
  • stats: PerCpuArray<u64> with 19 entries matching the StatIdx enum discriminants. Indices are append-only once shipped — operator dashboards key on them.
  • log: 256 KiB RingBuf (power-of-two, multiple of 4 KiB and 16 KiB page sizes).
  • redirect_devmap: DevMapHash with 64 entries — hash-keyed per §4.5 note so size doesn't scale with ifindex density.

vlan_resolve is deferred to PR #5 (VLAN).

Verifier-pass test

crates/modules/fast-path/tests/verifier.rs loads the BPF ELF through aya::Ebpf, extracts the fast_path XDP program, and calls Xdp::load() — which round-trips the program through the kernel verifier. Pass = verifier accepts. Requires CAP_BPF + CAP_NET_ADMIN; a dedicated CI step runs it under sudo -E cargo test ... --ignored.

aya is Linux-only (SYS_bpf, netlink bindings etc.), so the dev-dep + test module are cfg(target_os = "linux")-gated. macOS cargo test still runs cleanly.

Toolchain hygiene

  • Swapped dtolnay/rust-toolchain → direct rustup (first-party).
  • Swapped Swatinem/rust-cacheactions/cache@v5 (first-party).
  • Swapped softprops/action-gh-releasegh release create (first-party).
  • Bumped every action to latest majors (checkout@v6, cache@v5, upload-artifact@v7, download-artifact@v8) to clear Node.js 20 deprecation warnings.
  • Bumped every Rust dep to actual latest-stable on crates.io: rust stable 1.85 → 1.95, thiserror 1.0 → 2.0, aya-build 0.1.2 → 0.1.3, network-types 0.0.6 → 0.1.0, bpf-linker 0.9.13 → 0.10.3.
  • Added .github/dependabot.yml covering workspace cargo, BPF cargo, and github-actions ecosystems — SPEC §7.1 requires "Dependabot or equivalent" for version drift.

CI-only BPF builds

build.rs attempts a nested cargo build --release in bpf/. On success it stages the ELF and emits cargo:rustc-cfg=packetframe_bpf_built; on any failure (no rustup, no nightly, no bpf-linker, real compile error) it writes an empty stub ELF and prints a cargo warning. PACKETFRAME_SKIP_BPF_BUILD=1 is an explicit bypass used by the four cross-build CI jobs (BPF bytecode is architecture-independent, the check job already builds it once).

Result: macOS dev laptops cargo test cleanly with a stub ELF; CI builds real BPF and verifier-tests it.

Deferred to PR #4

  • Packet-level bpf_prog_test_run fixtures for parse/fragment/TTL/allowlist-miss/complex-header verdicts. aya 0.13.1 doesn't wrap BPF_PROG_TEST_RUN for XDP; implementing via raw syscalls is feasible but non-trivial. PR v0.1: fast-path runs end-to-end on configured interfaces (PR #4) #4 introduces the netns integration test (which exercises FIB-return cases needing real routes) and is the natural home for these fixtures.
  • aya userspace loader: Module::load/attach/detach, pin registry, packetframe run wiring.
  • §2.3 per-interface trial-attach probe in the feasibility subcommand (graduates from Deferred once we have a real program to trial-attach with).

Reviewer notes

  • bpf_fib_lookup struct access uses the bindgen anonymous unions (.__bindgen_anon_1.tot_len, .__bindgen_anon_3.ipv4_src, etc.) because that's what the aya-ebpf bindings expose. Not pretty; stable vs layout drift matters more than prettiness for hot-path code.
  • IPv4 csum fixup is the simplified "TTL down by 1 → csum up by 0x0100 in one's complement" form with explicit carry fold. Matches what the reference implementations (ip_decrease_ttl in the kernel) do.
  • I couldn't verifier-check locally (no rustup per the approved plan's CI-only BPF build choice). First CI run hit clippy::unnecessary_cast on an i64 cast that was a no-op on 64-bit Linux; handled with a targeted #[allow] per the pattern established in PR v0.0.1 feasibility slice: workspace, config parser, §2.1 capability probes #1.

Test plan

  • CI green (five jobs: check, four cross-builds, plus the Dependabot config check).
  • Verifier-pass test green under sudo.
  • cargo test --workspace on macOS still clean (stub BPF path).
  • No Node.js 20 deprecation warnings in the action logs.
  • Feasibility probe reports all §2.1 caps as PASS on a Linux host with BPF enabled, post-merge.
  • Bump the BPF crate's aya-ebpf version via Dependabot and confirm a PR is opened (sanity-check the Dependabot config).

🤖 Generated with Claude Code

lunarthegrey and others added 8 commits April 20, 2026 03:39
First increment of PR #3. Stands up the toolchain plumbing end-to-end
with a trivial XDP program (always returns XDP_PASS); real §4.4 logic,
maps, and fixtures land in follow-up commits on this branch.

- `crates/modules/fast-path/bpf/` is a new non-workspace crate targeting
  `bpfel-unknown-none` on nightly. Its `rust-toolchain.toml` pins
  `nightly-2026-03-15`; `.cargo/config.toml` sets `linker = bpf-linker`
  and `build-std = ["core"]`. Dependencies pinned (`aya-ebpf =0.1.1`,
  `network-types =0.0.6`).
- Root `Cargo.toml` excludes the BPF subdir from the workspace (different
  toolchain, different target) and pins the userspace aya stack
  (`aya =0.13.1`, `aya-log =0.2.1`, `aya-build =0.1.2`) for PR #4.
- `crates/modules/fast-path/build.rs` runs a nested `cargo build` in
  `bpf/`. On success it stages the ELF into `OUT_DIR` and emits
  `cargo:rustc-cfg=packetframe_bpf_built`. On any failure (no rustup, no
  nightly, no bpf-linker, missing target, real compile error) it writes
  an empty stub ELF and prints a cargo warning. `PACKETFRAME_SKIP_BPF_BUILD=1`
  is an explicit bypass — used by the cross-build CI jobs.
- Userspace `lib.rs` exports `FAST_PATH_BPF: &[u8]` (the embedded ELF)
  and `FAST_PATH_BPF_AVAILABLE: bool` (const-evaluable, `true` iff the
  ELF is non-empty). A new test asserts ELF magic when available and an
  empty stub otherwise — so both CI and macOS-dev paths have a meaningful
  check.
- CI `check` job installs nightly + `bpfel-unknown-none` + pinned
  `bpf-linker` so `cargo test` actually builds BPF. Cross-build jobs set
  `PACKETFRAME_SKIP_BPF_BUILD=1` and rely on the stub — BPF bytecode is
  architecture-independent, so building it four times inside `cross`
  containers would be wasted work. `Cross.toml` passes the env var
  through the docker boundary.

Known gap: `.github/workflows/release.yml` isn't updated yet and will
produce binaries with a stub BPF ELF if a tag is pushed. Fixed in a
later PR before v0.1 is tagged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The tier-3 target has no prebuilt rust-std component; rustup errored
with 'component rust-std for target bpfel-unknown-none is unavailable
for download'. We build `core` from source via `[unstable] build-std`
in the BPF crate's .cargo/config.toml, which only needs rust-src.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Swap dtolnay/rust-toolchain + Swatinem/rust-cache for direct rustup
  commands and actions/cache@v4 (first-party only). rustup is
  pre-installed on GitHub runners.
- Bump to actual latest-stable on crates.io:
  - thiserror 1.0 → 2.0 (ConfigError compatible — `#[source]` on
    `std::io::Error` which impls Error; no code changes needed)
  - aya-build 0.1.2 → 0.1.3
  - network-types 0.0.6 → 0.1.0 (0.1.0 has stable module layout
    `eth::EthHdr`, `ip::Ipv4Hdr`/`Ipv6Hdr`, `tcp::TcpHdr`, `udp::UdpHdr`,
    `vlan`, `icmp` that the BPF program will consume)
  - bpf-linker 0.9.13 → 0.10.3
- Add .github/dependabot.yml covering workspace cargo, BPF cargo,
  and github-actions ecosystems. SPEC.md §7.1 requires drifts to
  surface as reviewable PRs; this is the "or equivalent" hook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Rust stable 1.85.0 → 1.95.0 (actual latest) in rust-toolchain.toml
  and CI env. bpf-linker 0.10.3 requires rustc 1.86+, so this fixes
  the CI install failure.
- Rust nightly nightly-2026-03-15 → nightly-2026-04-14 (co-dated with
  the 1.95.0 release).
- Workspace `rust-version` stays at 1.85 (MSRV), deliberately behind
  the rust-toolchain pin so contributors with slightly older local
  toolchains still build. CI always runs the pinned latest stable.

Nightly for the BPF crate is not a choice — aya-ebpf targets
`bpfel-unknown-none` (tier-3, no prebuilt `rust-std`), so `core` must
be built from source via `-Z build-std`, which is nightly-only.
SPEC.md §7.1 calls this out; SPEC.md §1 mandates aya-ebpf. Userspace
stays stable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Define the six §4.5 maps the non-VLAN fast-path path needs
(`allow_v4`, `allow_v6`, `cfg`, `stats`, `log`, `redirect_devmap`; the
seventh, `vlan_resolve`, lands with VLAN in PR #5). Typed `FpCfg`
struct with a reserved version byte so future fields don't break
layout; typed `StatIdx` enum with explicit append-only discriminants
matching §4.6's wire format. `bump_stat` helper for per-CPU counter
increments on the hot path.

Program still always returns XDP_PASS — but now it touches the
`stats` map to bump `rx_total`, which proves map binding works through
the verifier in CI. Real §4.4 parse / allowlist / FIB / redirect
logic lands in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GitHub is deprecating Node.js 20 actions; v4s of checkout/cache/
upload-artifact/download-artifact run on Node 20. Bumps:

- actions/checkout v4 → v6 (latest v6.0.2)
- actions/cache v4 → v5 (latest v5.0.5)
- actions/upload-artifact v4 → v7 (latest v7.0.1)
- actions/download-artifact v4 → v8 (latest v8.0.1)

Also refactors release.yml to match ci.yml's first-party-only style:
direct rustup (no dtolnay/rust-toolchain), actions/cache (no
Swatinem/rust-cache), gh release create (no softprops/action-gh-release).
Same BPF toolchain wiring as the check job — release builds now
produce real BPF ELFs rather than stubs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First cut of the real XDP program. Parses Ethernet + IPv4/IPv6, rejects
fragments / low-TTL / complex headers into the kernel slow path,
consults the allow-v4/allow-v6 LPM tries with src-or-dst match (§4.2),
populates a `bpf_fib_lookup`, dispatches on the return code per §4.4
step 9, rewrites MACs + decrements TTL (IPv4 incremental csum fixup
per RFC 1624; IPv6 just hop_limit -= 1), defensively pre-checks the
devmap, and redirects via `bpf_redirect_map`.

802.1Q/802.1ad tagged frames XDP_PASS — VLAN choreography (§4.7)
lands in PR #5.

Dry-run (cfg.dry_run=1) bumps matched counters then XDP_PASS without
rewriting. Every verdict path bumps a corresponding StatIdx counter
per §4.6.

I can't verifier-check locally (no rustup); expect 1-2 CI cycles to
shake out bindgen-union access patterns on bpf_fib_lookup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Loads FAST_PATH_BPF through aya::Ebpf, grabs the `fast_path` program,
and calls Xdp::load() — which round-trips through the kernel verifier.
If the verifier accepts our §4.4 program, this test passes; if not, CI
fails with the verifier's rejection reason.

aya is Linux-only (uses netlink/SYS_bpf/etc.), so the dev-dep is
cfg-gated to target_os=linux and the test file is feature-guarded the
same way. macOS `cargo test` still runs cleanly.

The test is #[ignore] by default (loading BPF needs CAP_BPF +
CAP_NET_ADMIN); a dedicated CI step runs it under `sudo -E cargo
test ... -- --ignored`.

Packet-level bpf_prog_test_run fixtures for parse/fragment/TTL/
allowlist-miss/complex-header verdicts were originally planned for
this PR, but aya 0.13.1 doesn't expose a test_run wrapper for XDP —
implementing one via raw syscalls is non-trivial. Those fixtures
land in PR #4 alongside the netns integration test, which also
exercises the FIB-return cases that need real routes configured.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lunarthegrey lunarthegrey marked this pull request as ready for review April 20, 2026 09:17
@lunarthegrey lunarthegrey changed the title [WIP] v0.1 BPF program + fixtures (PR #3) v0.1: fast-path BPF program + toolchain + verifier-pass test (PR #3) Apr 20, 2026
@lunarthegrey
Copy link
Copy Markdown
Contributor Author

@claude review

Spec §4.4 step 7 lists sport/dport among the fields we must set on
bpf_fib_lookup. My initial implementation left them at zero. This
matters for routes with ECMP + L4-hash policy — without ports the
FIB's flow hash would collapse to a single next-hop, silently
diverging from the kernel's slow path for matched traffic.

Adds an `l4_ports` helper that returns the raw network-byte-order u16
pair from the TCP/UDP header. Bounds-checked via the packet's
data_end; returns (0, 0) for ICMP/ICMPv6 (no ports) or truncated L4
data. Both handle_ipv4 and handle_ipv6 populate fib.sport/fib.dport
before calling bpf_fib_lookup.

Uses `core::ptr::read_unaligned` rather than a direct cast because
the L4 header is not guaranteed 2-byte aligned in the verifier's
view of data; ptr read_unaligned avoids a UB-adjacent pattern even
though LLVM BPF would normally generate correct code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lunarthegrey lunarthegrey merged commit e14f275 into main Apr 20, 2026
5 checks passed
lunarthegrey added a commit that referenced this pull request Apr 20, 2026
…tach/status

Brings the fast-path module from "BPF ELF compiles + passes verifier"
(PR #3) to "runs end-to-end in dry-run on configured interfaces".

Core changes:

- `crates/modules/fast-path/src/linux_impl.rs` (new, Linux-only):
  opens aya::Ebpf, populates CFG (dry_run + reserved version byte),
  populates ALLOW_V4/ALLOW_V6 LPM tries from `allow-prefix`
  directives, XDP-attaches to each `attach <iface> <mode>` with
  trial-attach fallback per SPEC §2.3 (Auto → native first, fall back
  to generic; explicit Native/Generic use requested mode, no fallback),
  populates REDIRECT_DEVMAP with every configured ifindex before
  packet flow so the §4.4 defensive devmap pre-check works. Exports
  a `trial_attach_native` probe and `snapshot_stats` /
  `snapshot_links` accessors for the CLI.

- `crates/modules/fast-path/src/registry.rs` (new): JSON pin registry
  persistence at `<state-dir>/attachments.json`, atomic write-then-
  rename. SPEC §8 wants deterministic teardown; actual bpffs pinning
  lands in PR #6 but we persist the shape now so `detach` has
  something authoritative to read.

- `crates/modules/fast-path/src/lib.rs`: `FastPathModule` gains real
  trait implementations on Linux; non-Linux returns
  `ModuleError::other` from load/attach (NotImplemented on detach is
  valid since nothing was loaded). macOS dev loops still compile
  cleanly.

- `crates/cli/src/loader.rs` (new): drives the module lifecycle from
  `packetframe run`. Parses config, runs the SPEC §2.1 capability
  probe, refuses to attach if any required capability fails, then
  loads / attaches each configured module and persists the registry.
  SIGTERM/SIGINT → exit without explicit detach per SPEC §7.3 / §8.5
  (drops Ebpf; until PR #6 adds pinning, this implicitly detaches).
  SIGHUP logs a warning; reconfigure flow is PR #6.

- `crates/cli/src/feasibility.rs`: per-interface XDP trial-attach
  probe (§2.3) graduates from Deferred to real per-iface verdicts
  (`xdp.attach.<iface>` capability entries), populated from the
  config's `attach` directives.

- `packetframe detach` reads the pin registry and reports; in-kernel
  detach without an active loader needs pinning (PR #6). `packetframe
  status` prints the registry contents; live counter readback waits
  on pinning too.

Pins bumped: signal-hook 0.4 added for the signal loop. aya is now a
Linux-only regular dep on fast-path (was dev-only).

Follow-up commits on this branch will add the veth netns integration
test and the packet-level bpf_prog_test_run fixtures deferred from
PR #3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lunarthegrey added a commit that referenced this pull request Apr 20, 2026
Three intertwined issues led to PR #3's merged verifier test and PR
#4's attach test reporting `ok` while never actually exercising the
BPF ELF:

1. `crates/modules/fast-path/bpf/Cargo.toml` has no `[workspace]`
   table, so when `build.rs` invokes a nested `cargo build` in that
   directory, cargo walks up and finds the root workspace that
   excludes this path — then errors out because the package "believes
   it's in a workspace when it's not" (the user surfaced this running
   on a Linux VM; exit code 101 matches CI's failure mode). The root
   `workspace.exclude` alone is not sufficient when cargo is invoked
   from inside the excluded directory.

2. The old `build.rs` used `Command::status()`, which inherited
   stdout/stderr back to the outer cargo — which captures build-script
   output and only parses `cargo:` directives. Result: any nested
   cargo error was completely swallowed, leaving users staring at an
   opaque "BPF build failed (exit 101); using empty stub ELF" line
   with no way to diagnose.

3. When BPF build fails, `build.rs` writes an empty stub ELF and
   continues. The verifier + attach integration tests check
   `FAST_PATH_BPF_AVAILABLE` and early-return with `eprintln!` +
   `return`, which cargo counts as PASS. CI's sudo step reported 1/1
   passing; nothing ever actually round-tripped through the kernel
   verifier or aya's attach path.

Fixes:

- Add an empty `[workspace]` table to `bpf/Cargo.toml` so the nested
  cargo treats the subcrate as its own workspace root.
- `build.rs` now uses `Command::output()` and re-emits stdout + stderr
  as `cargo:warning` lines so the real error surfaces in the outer
  cargo's warning stream — no more opaque exit-code messages.
- New `PACKETFRAME_BPF_REQUIRED` env var makes `build.rs` panic on
  the stub fallback path instead of silently writing an empty ELF.
  Set in `ci.yml`'s `check` job so CI can never again be green with
  the BPF build broken. Local dev (without the env var) still stubs
  gracefully on macOS laptops.

After these land, CI will either actually exercise the BPF program or
fail loudly. If the verifier rejects something, or aya can't attach,
we find out instead of shipping the bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lunarthegrey added a commit that referenced this pull request Apr 20, 2026
Covers the no-FIB §9 Phase 1 cases that were deferred from PR #3
because aya 0.13.1 doesn't wrap BPF_PROG_TEST_RUN for XDP. The raw
syscall harness lives in `tests/common/mod.rs` — it builds synthetic
Ethernet + IPv4/IPv6 + TCP packets, runs them through the loaded
BPF program via `bpf(BPF_PROG_TEST_RUN)`, and lets the test assert
on both the verdict and per-counter deltas.

Fixtures in `tests/fixtures.rs` currently cover:

- ARP (non-IP) → XDP_PASS + pass_not_ip
- IPv4 options (IHL>5) → pass_complex_header
- IPv4 MF fragment → pass_fragment
- IPv4 non-zero fragment offset → pass_fragment
- IPv4 TTL=1 and TTL=0 → pass_low_ttl
- IPv4 allowlist miss → XDP_PASS without bumping matched_v4
- IPv6 fragment extension header → pass_complex_header
- IPv6 Hop-by-Hop extension → pass_complex_header
- IPv6 hop_limit=1 → pass_low_ttl
- IPv4 src-only match dry-run → matched_v4 + matched_src_only + fwd_dry_run
- IPv4 dst-only match dry-run → matched_dst_only
- IPv4 both-match dry-run → matched_both
- IPv6 src-only match dry-run → matched_v6 + matched_src_only
- rx_total bumps for every packet regardless of verdict

FIB-return verdicts (SUCCESS/NO_NEIGH/DROP/FRAG_NEEDED/not-in-devmap)
still need a netns with real routes + a populated redirect_devmap;
those are a separate slice.

All fixtures are `#[ignore]`-gated (need CAP_BPF). CI's existing
sudo step (`cargo test --tests -- --ignored`) picks them up
automatically — no workflow changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant