fix(build): wire ANYSCAN_USE_AF_XDP=1 through install-external-deps + package-worker-bundle + deploy (#71)

Merged: skullcrushercmd merged 1 commit into main from fix/afxdp-build-wireup on Apr 28, 2026

@skullcrushercmd

Summary

Closes the AnyScan-side AF_XDP build wire-up gap that anygpt-42 caught: the bundle/deploy pipeline never forwarded the AF_XDP build flag to make, so the scanner binary that shipped in worker bundles had no AF_XDP code linked. The runtime --io-engine=af_xdp knob from PR D (b5da5fc) was therefore silently moot — the adapter requested af_xdp, the scanner had no af_xdp code, and the AF_PACKET fallback path took over on every host.

Cross-references: plans/2026-04-27-portscan-afxdp-plan-v1.md §3.5 (ENA / AWS specifics) and §3.6 (build-system integration), which already specified the env knob shape:

install-external-deps.sh (line 67, the make invocation) gains USE_AF_XDP="${ANYSCAN_USE_AF_XDP:-0}" env-driven plumbing; default off so existing AMIs continue to build the same binary.

Design

Single env knob: ANYSCAN_USE_AF_XDP (default 0).

  • 0 → every make invocation is byte-identical to the legacy command. Existing AMIs continue to build the same AF_PACKET-only binary.
  • 1 → every make in the chain (install-external-deps.sh, package-worker-bundle.sh, deploy.sh) passes USE_AF_XDP=1, which the engine Makefile turns into -DUSE_AF_XDP + -lxdp -lbpf -lelf -lz + the send-afxdp.c/recv-afxdp.c sources.
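
The knob-to-argv mapping described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function name `engine_make_args` and the `-C <repo>` form are assumptions (the PR's own helper is vulnscanner_make_args(), whose body is not shown here).

```shell
#!/usr/bin/env bash
# Hypothetical sketch: map ANYSCAN_USE_AF_XDP to make arguments.
# engine_make_args is an illustrative name, not the PR's helper.
engine_make_args() {
  local repo="$1"
  local -a args=(-C "$repo")
  # Only the exact value 1 opts in; unset or 0 keeps the legacy argv.
  if [ "${ANYSCAN_USE_AF_XDP:-0}" = "1" ]; then
    args+=(USE_AF_XDP=1)
  fi
  # One argument per line so callers can read them back into an array.
  printf '%s\n' "${args[@]}"
}

# Usage: collect the arguments and show the command that would run.
mapfile -t make_args < <(engine_make_args "/path/to/engine")
echo "would run: make ${make_args[*]}"
```

Keeping the default branch free of extra tokens is what lets the knob=0 case stay identical to the legacy command.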

A shared linkage probe (ldd | grep libxdp.so, with a readelf -d NEEDED fallback) detects cached AF_PACKET-only binaries; the build path force-rebuilds them via make clean + make USE_AF_XDP=1 rather than silently shipping the legacy binary.
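
A linkage probe of this shape might look like the sketch below. The function name `has_libxdp_linkage` is hypothetical; the PR's binary_has_afxdp_linkage() may differ in detail. Only standard `ldd`, `readelf -d`, and `grep` behavior is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a libxdp linkage probe; the PR's
# binary_has_afxdp_linkage() may be implemented differently.
has_libxdp_linkage() {
  local bin="$1"
  [ -f "$bin" ] || return 1
  # First choice: ldd lists the resolved shared-library dependencies.
  if command -v ldd >/dev/null 2>&1 \
     && ldd "$bin" 2>/dev/null | grep -q 'libxdp\.so'; then
    return 0
  fi
  # Fallback: read the NEEDED entries straight from the dynamic section.
  if command -v readelf >/dev/null 2>&1 \
     && readelf -d "$bin" 2>/dev/null | grep 'NEEDED' | grep -q 'libxdp\.so'; then
    return 0
  fi
  return 1
}

# Usage: /bin/sh is just a convenient binary that exists on most hosts.
if has_libxdp_linkage /bin/sh; then
  echo "linked against libxdp"
else
  echo "no libxdp linkage"
fi
```

The readelf fallback is useful where ldd is missing or cannot resolve the binary, since NEEDED entries are recorded in the file itself.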

Files touched

| File | Change |
| --- | --- |
| install-external-deps.sh | env knob, binary_has_afxdp_linkage() helper, vulnscanner_make_args(), widened cache check; final post-build assertion bails out if the produced binary still lacks libxdp linkage (build deps missing). |
| package-worker-bundle.sh | same env knob, rebuild_scanner_with_afxdp() helper that fires make USE_AF_XDP=1 in the engine repo when the cached binary is missing or AF_PACKET-only; README.txt now records use_af_xdp + scanner_included per bundle. |
| deploy.sh | same env knob in install_vulnscanner_binary(); cleanup step removes a stale AF_PACKET-only binary so the existing build branch fires with USE_AF_XDP=1, then asserts libxdp linkage on the produced binary. |
| runtime.worker.env.template | documents ANYSCAN_USE_AF_XDP as a build-time knob (operator shell / EnvironmentFile, not consumed by agentd) and how it relates to ANYSCAN_AF_XDP_AVAILABLE + ANYSCAN_SCANNER_IO_ENGINE that the runtime already reads. |
| tools/test-install-external-deps-afxdp.sh | new bash unit test, 10 assertions across 4 cases. |

Before → after

Before (the anygpt-42 gap):

  • install-external-deps.sh:120 unconditionally ran make (no flags) and install-external-deps.sh:117 short-circuited as soon as any scanner binary existed at the cache path.
  • package-worker-bundle.sh:519-540 only located a pre-built binary; no rebuild logic.
  • deploy.sh:91-94 only ran make (no flags); no cache-staleness handling.
  • None of the three scripts read ANYSCAN_USE_AF_XDP.

After:

  • Every make invocation in these three scripts is the byte-identical legacy command when ANYSCAN_USE_AF_XDP=0, and adds a USE_AF_XDP=1 token plus a clean-rebuild force path when ANYSCAN_USE_AF_XDP=1.
  • A cached AF_PACKET-only binary is detected and force-rebuilt instead of silently shipping the legacy binary.
  • runtime.worker.env.template documents the build-time knob alongside the existing runtime knobs, so an operator reading /etc/agentd/runtime.env understands what produced their scanner.
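
The cache handling in the "After" list can be summarized as a small decision function. This is a sketch under stated assumptions: `plan_scanner_build` and its state labels are invented for illustration, and it prints the planned commands rather than running them.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the cache decision; names are illustrative.
# $1: 1 if AF_XDP is requested, else 0
# $2: "missing", "legacy" (AF_PACKET-only), or "afxdp" (libxdp-linked)
plan_scanner_build() {
  local want_afxdp="$1" cached="$2"
  if [ "$cached" = "missing" ]; then
    # No binary yet: build, with the flag only when requested.
    if [ "$want_afxdp" = "1" ]; then
      echo "make USE_AF_XDP=1"
    else
      echo "make"
    fi
  elif [ "$want_afxdp" = "1" ] && [ "$cached" = "legacy" ]; then
    # Stale AF_PACKET-only binary: force a clean rebuild.
    echo "make clean"
    echo "make USE_AF_XDP=1"
  else
    # Cached binary already satisfies the request.
    echo "skip"
  fi
}

# Usage: print the plan for each case covered by the unit test.
for state in missing legacy afxdp; do
  echo "ANYSCAN_USE_AF_XDP=1, cached=$state:"
  plan_scanner_build 1 "$state" | sed 's/^/  /'
done
```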

Test plan

  • bash -n install-external-deps.sh package-worker-bundle.sh deploy.sh install-worker-bundle.sh tools/test-install-external-deps-afxdp.sh → 5/5 ok
  • tools/test-install-external-deps-afxdp.sh → 10/10 assertions pass:
    • ANYSCAN_USE_AF_XDP=0 (default) → make argv has no USE_AF_XDP=1 token
    • ANYSCAN_USE_AF_XDP=1 + missing scanner → make USE_AF_XDP=1
    • ANYSCAN_USE_AF_XDP=1 + cached AF_PACKET-only binary → make clean followed by make USE_AF_XDP=1
    • ANYSCAN_USE_AF_XDP=1 + cached AF_XDP-linked binary → no rebuild
  • cargo build --release --manifest-path Cargo.toml → exit 0 (1m 02s)
  • cargo test --release --no-fail-fast --manifest-path Cargo.toml → all suites pass (anyscan-api 31/31, anyscan-worker 33/33 + 1 ignored, anyscan-path-bench 2/2, no regressions)

Out of scope (per anygpt-43 task split)

  • Scanner fork edits (this PR is AnyScan-side only).
  • Prod runtime.env edits.
  • anyscan_rate_controller.py adapter Python.
  • Live bench on a c6in.metal host (anygpt-4 territory).
  • install-worker-bundle.sh — operator-side, reads the bundle and does not rebuild. It already probes libxdp.so loadability via probe_afxdp_runtime_available() and writes ANYSCAN_AF_XDP_AVAILABLE, so flipping ANYSCAN_USE_AF_XDP=1 upstream + this probe downstream gives the runtime a coherent state.

References: anygpt-42 (gap discovery), anygpt-43 (this fix).

… package-worker-bundle + deploy

anygpt-42 caught the wire-up gap: the AnyScan-side build pipeline never
forwarded the AF_XDP build flag to make, so the scanner binary that
shipped in the worker bundle had no AF_XDP code path linked. The
runtime --io-engine=af_xdp knob from PR D (b5da5fc) was therefore
silently moot — the adapter requested af_xdp, the scanner had no af_xdp
code, and the AF_PACKET fallback path took over. This change closes
the gap by plumbing a single env knob through every build/bundle/deploy
script that calls make on the engine repo.

Design (plans/2026-04-27-portscan-afxdp-plan-v1.md §3.5, §3.6):
- ANYSCAN_USE_AF_XDP defaults to 0 so existing AMIs keep building the
  legacy AF_PACKET-only binary byte-identically.
- ANYSCAN_USE_AF_XDP=1 makes every make invocation in this chain pass
  USE_AF_XDP=1 to the engine Makefile, which adds -DUSE_AF_XDP plus
  the libxdp/libbpf/libelf link line and the send-afxdp.c/recv-afxdp.c
  sources (per the engine fork Makefile pattern at §3.6).
- A shared linkage probe (ldd | grep libxdp.so, falling back to
  readelf -d NEEDED) identifies cached AF_PACKET-only binaries; the
  build path force-rebuilds them via make clean + make USE_AF_XDP=1
  rather than silently shipping the legacy binary.

Files touched:
- install-external-deps.sh: env knob, binary_has_afxdp_linkage()
  helper, vulnscanner_make_args(), widened cache check at the make
  invocation. Final post-build assertion bails out if the produced
  binary still lacks libxdp linkage (build deps probably missing).
- package-worker-bundle.sh: same env knob, rebuild_scanner_with_afxdp()
  helper that fires `make USE_AF_XDP=1` in the engine repo when the
  cached binary is missing or AF_PACKET-only, README.txt now records
  the use_af_xdp + scanner_included build state per bundle.
- deploy.sh: same env knob in install_vulnscanner_binary(); when
  ANYSCAN_USE_AF_XDP=1 finds a stale AF_PACKET-only binary it removes
  it (and runs make clean) so the existing build branch fires with
  USE_AF_XDP=1, then asserts libxdp linkage on the produced binary.
- runtime.worker.env.template: documents ANYSCAN_USE_AF_XDP as a
  build-time knob (operator's shell / EnvironmentFile, NOT consumed
  by agentd) and how it relates to ANYSCAN_AF_XDP_AVAILABLE +
  ANYSCAN_SCANNER_IO_ENGINE that the runtime already reads.
- tools/test-install-external-deps-afxdp.sh: new bash unit test that
  stubs make/git/ldd/readelf on PATH and asserts:
    1. ANYSCAN_USE_AF_XDP=0 (default) -> make argv has NO USE_AF_XDP=1
    2. ANYSCAN_USE_AF_XDP=1 + missing scanner -> make USE_AF_XDP=1
    3. ANYSCAN_USE_AF_XDP=1 + cached AF_PACKET-only -> make clean
       followed by make USE_AF_XDP=1 (force rebuild)
    4. ANYSCAN_USE_AF_XDP=1 + cached AF_XDP-linked -> no rebuild
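
A PATH-stub test of the kind described above might be structured like this. It is an illustrative sketch, not the contents of tools/test-install-external-deps-afxdp.sh: `run_build` stands in for the script under test, and MAKE_STUB_LOG is a made-up variable.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a PATH-stub test; not the PR's actual test file.
stub_dir="$(mktemp -d)"
log="$stub_dir/make.log"

# Fake make: append its argv to a log instead of building anything.
cat > "$stub_dir/make" <<'EOF'
#!/bin/sh
echo "$*" >> "$MAKE_STUB_LOG"
EOF
chmod +x "$stub_dir/make"

# Stand-in for the script under test (assumed shape).
run_build() {
  if [ "${ANYSCAN_USE_AF_XDP:-0}" = "1" ]; then
    make USE_AF_XDP=1
  else
    make
  fi
}

# Case: opt-in must forward the flag to make.
: > "$log"
( export PATH="$stub_dir:$PATH" MAKE_STUB_LOG="$log" ANYSCAN_USE_AF_XDP=1; run_build )
if grep -q 'USE_AF_XDP=1' "$log"; then echo "PASS: opt-in forwards USE_AF_XDP=1"; else echo "FAIL: opt-in"; fi

# Case: default must leave the make argv untouched.
: > "$log"
( export PATH="$stub_dir:$PATH" MAKE_STUB_LOG="$log" ANYSCAN_USE_AF_XDP=0; run_build )
if grep -q 'USE_AF_XDP=1' "$log"; then echo "FAIL: default"; else echo "PASS: default omits USE_AF_XDP=1"; fi
```

Because the stub shadows the real make only inside each subshell, the test never compiles anything and leaves the caller's PATH alone.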

Before: install-external-deps.sh:120 unconditionally ran `make` and
short-circuited at line 117 whenever any scanner binary existed at
the cache path. package-worker-bundle.sh:519-540 only located a
pre-built binary; deploy.sh:91-94 only ran `make` (no flags). None
of the three scripts read ANYSCAN_USE_AF_XDP.

After: every make invocation in these three scripts is the byte-
identical legacy command when ANYSCAN_USE_AF_XDP=0, and adds a
USE_AF_XDP=1 token plus a clean-rebuild force path when
ANYSCAN_USE_AF_XDP=1. The new unit test pins this behavior.

Out of scope (per anygpt-43 task split): scanner fork edits, prod
runtime.env edits, anyscan_rate_controller.py changes, live bench
on a c6in.metal host (anygpt-4 territory), install-worker-bundle.sh
(operator-side, reads the bundle and does not rebuild — already
probes libxdp.so loadability via probe_afxdp_runtime_available()).

Verified:
- bash -n install-external-deps.sh package-worker-bundle.sh deploy.sh
  install-worker-bundle.sh tools/test-install-external-deps-afxdp.sh
  -> 5/5 ok
- tools/test-install-external-deps-afxdp.sh -> 10/10 assertions pass
- cargo build --release --manifest-path Cargo.toml -> exit 0
- cargo test --release --no-fail-fast --manifest-path Cargo.toml ->
  exit 0 (workspace tests + 31 anyscan-api tests + 33 anyscan-worker
  tests + 2 anyscan-path-bench tests, no failures, no regressions)
@skullcrushercmd skullcrushercmd merged commit 989c44e into main Apr 28, 2026
@skullcrushercmd skullcrushercmd deleted the fix/afxdp-build-wireup branch April 28, 2026 12:53

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf7fe96623

Comment thread deploy.sh
Comment on lines +120 to +124
&& [ -x "$VULNSCANNER_SOURCE_BIN" ] \
&& ! binary_has_afxdp_linkage "$VULNSCANNER_SOURCE_BIN"; then
printf '[*] Removing pre-AF_XDP scanner at %s so the build path with USE_AF_XDP=1 fires.\n' \
"$VULNSCANNER_SOURCE_BIN"
rm -f "$VULNSCANNER_SOURCE_BIN"

P1 Badge Stop deleting overridden scanner binaries

When ANYSCAN_USE_AF_XDP=1, this path unconditionally runs rm -f "$VULNSCANNER_SOURCE_BIN" before rebuilding. Because VULNSCANNER_SOURCE_BIN is environment-overridable, operators can point it outside VULNSCANNER_SOURCE_DIR; in that case a deploy run as root can delete an arbitrary prebuilt scanner (or any file at that path) just because it was built without libxdp linkage. This destructive behavior is new in this commit and should be limited to in-tree build artifacts or replaced with a non-destructive failure path.
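
One way to limit the deletion to in-tree build artifacts is a containment check before the rm. The sketch below is hypothetical: `path_is_inside_dir` and `remove_stale_scanner` are illustrative names and are not taken from deploy.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: only delete a stale binary that lives inside the
# source tree. Function names are illustrative, not from deploy.sh.
path_is_inside_dir() {
  local path="$1" dir="$2" parent real_dir
  # Resolve symlinks and ".." in both directories before comparing.
  parent="$(cd "$(dirname "$path")" 2>/dev/null && pwd -P)" || return 1
  real_dir="$(cd "$dir" 2>/dev/null && pwd -P)" || return 1
  case "$parent/" in
    "$real_dir"/*) return 0 ;;
    *) return 1 ;;
  esac
}

remove_stale_scanner() {
  local bin="$1" source_dir="$2"
  if path_is_inside_dir "$bin" "$source_dir"; then
    rm -f -- "$bin"
  else
    # Non-destructive failure path for overridden locations.
    printf '[!] Refusing to remove %s: outside %s\n' "$bin" "$source_dir" >&2
    return 1
  fi
}

# Usage: an in-tree binary is removed, an out-of-tree one is left alone.
demo="$(mktemp -d)"
mkdir -p "$demo/src/bin" "$demo/prebuilt"
touch "$demo/src/bin/scanner" "$demo/prebuilt/scanner"
remove_stale_scanner "$demo/src/bin/scanner" "$demo/src" && echo "removed in-tree binary"
remove_stale_scanner "$demo/prebuilt/scanner" "$demo/src" || echo "kept out-of-tree binary"
```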

skullcrushercmd pushed a commit that referenced this pull request Apr 28, 2026
…s + package-worker-bundle + deploy + adapter

anygpt-46. Mirrors PR #71 (ANYSCAN_USE_AF_XDP wire-up) for the second
kernel-bypass io_engine the scanner-c engine fork already advertises:
the engine Makefile already has the USE_PFRING_ZC=1 branch (-lpfring
-lpcap, src/{send,recv}-pfring.c added), engine.c already registers
io_engine_pfring_zc in pick_io_engine, and parsing.c::io_engine_from_string
already accepts the "pfring_zc" string. The wire-up gap was that no
build/bundle/deploy script ever forwarded USE_PFRING_ZC=1 to make and
no runtime knob would ever resolve to --io-engine=pfring_zc, so the
infrastructure ahead of the engine was the same shape AF_XDP was in
before #71.

Design — same shape as #71 so the two engines compose:

- ANYSCAN_USE_PFRING_ZC defaults to 0 so existing AMIs keep building
  byte-identically. Verified empirically: with both flags 0 the
  resulting `make` argv is `-C <repo>` with no extra tokens.
- ANYSCAN_USE_PFRING_ZC=1 makes every make invocation in the
  install/bundle/deploy chain emit USE_PFRING_ZC=1 alongside any
  USE_AF_XDP=1 already present, so a single rebuild can produce a
  binary linked against both libxdp and libpfring.
- A shared linkage probe (binary_has_pfring_zc_linkage) checks for
  libpfring.so via ldd then readelf -d NEEDED, mirroring the libxdp
  probe; the build path force-rebuilds (make clean + make USE_PFRING_ZC=1)
  when it finds a cached binary that lacks libpfring linkage rather
  than silently shipping the legacy binary.
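
The way the two knobs compose into a single make argv can be sketched as below. `compose_engine_make_args` is a hypothetical name (the commit's helpers are vulnscanner_make_args() and bundle_engine_make_args(), whose bodies are not shown), and the knob-to-flag pairing is written out as an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: one helper emits every requested engine flag.
compose_engine_make_args() {
  local repo="$1" pair knob flag
  local -a args=(-C "$repo")
  # Each pair is <env knob>:<make variable>; list order fixes argv order.
  for pair in ANYSCAN_USE_AF_XDP:USE_AF_XDP ANYSCAN_USE_PFRING_ZC:USE_PFRING_ZC; do
    knob="${pair%%:*}"
    flag="${pair##*:}"
    # ${!knob} is bash indirect expansion: the value of the named variable.
    if [ "${!knob:-0}" = "1" ]; then
      args+=("$flag=1")
    fi
  done
  echo "${args[*]}"
}

# Usage: both knobs on yields a single invocation carrying both flags.
(
  ANYSCAN_USE_AF_XDP=1
  ANYSCAN_USE_PFRING_ZC=1
  echo "make $(compose_engine_make_args "<repo>")"
)
# prints: make -C <repo> USE_AF_XDP=1 USE_PFRING_ZC=1
```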

Files touched:

- install-external-deps.sh: ANYSCAN_USE_PFRING_ZC env knob,
  binary_has_pfring_zc_linkage helper, install_pfring_zc_build_deps()
  best-effort apt-get block (skips on non-Debian/non-root/sudo-prompt
  hosts; points at the ntop apt-stable repo when libpfring-dev is
  not in stock repos), vulnscanner_make_args() emits both tokens, and
  the cache + post-build assertions branch on the new linkage probe.
- package-worker-bundle.sh: same env knob, binary_has_pfring_zc_linkage
  inline mirror, bundle_engine_make_args() centralizing both flags,
  rebuild_scanner_with_pfring_zc() helper, bundle staging block fires
  the rebuild when ANYSCAN_USE_PFRING_ZC=1 + cached binary lacks
  libpfring linkage. README.txt now records use_pfring_zc per bundle.
- deploy.sh: env knob, binary_has_pfring_zc_linkage helper,
  install_vulnscanner_binary() drops a libpfring-less cached binary
  before the build branch fires and post-installs an assertion that
  fails fast if the produced binary still lacks libpfring linkage.
- runtime.worker.env.template: documents ANYSCAN_USE_PFRING_ZC as a
  build-time knob (operator's shell / EnvironmentFile, NOT consumed
  by agentd) plus a separate ANYSCAN_PFRING_ZC_AVAILABLE runtime
  flag the install-time probe writes. Includes a prominent license
  obligation note: PF_RING ZC requires a commercial ntop license at
  runtime; without it libpfring throttles ZC traffic to ~100k pps,
  *below* the rate AF_PACKET already sustains, so flipping the knob
  on a license-less host regresses throughput.
- install-worker-bundle.sh: probe_pfring_zc_runtime_available() checks
  /proc/net/pf_ring (kmod loaded) + libpfring.so via ldconfig;
  apply_pfring_zc_availability() upserts ANYSCAN_PFRING_ZC_AVAILABLE
  unconditionally so a partial upgrade can't leave a stale "true" in
  place after the kmod is unloaded. Mirrors apply_afxdp_availability.
- vulnscanner-zmap-adapter.py: SUPPORTED_IO_ENGINES gains "pfring_zc",
  resolve_io_engine() generalized to look up the per-engine
  availability key from a small dict so future engines slot in
  cleanly. AF_PACKET behavior unchanged.
- tools/test-install-external-deps-pfring-zc.sh: new bash unit test
  (10/10 pass) mirroring tools/test-install-external-deps-afxdp.sh —
  asserts argv composition for default/missing/legacy/cached cases.
- test_vulnscanner_adapter_io_engine.py: 11 new test cases covering
  pfring_zc resolution + build_command + adapter end-to-end. Existing
  AF_XDP cases still green.
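
The unconditional upsert described for apply_pfring_zc_availability() could be sketched like this. `upsert_env_key` is a hypothetical helper written for illustration, and the `true`/`false` value format is assumed from the stale "true" remark above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an unconditional upsert into an env file.
upsert_env_key() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  # Drop any previous assignment of the key, then append the fresh one.
  if [ -f "$file" ]; then
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write through the original path so its permissions are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage: a stale "true" is replaced rather than duplicated.
env_file="$(mktemp)"
printf 'ANYSCAN_PFRING_ZC_AVAILABLE=true\nOTHER=1\n' > "$env_file"
upsert_env_key "$env_file" ANYSCAN_PFRING_ZC_AVAILABLE false
cat "$env_file"
```

After the call the file holds OTHER=1 followed by ANYSCAN_PFRING_ZC_AVAILABLE=false, so a later reader never sees two conflicting values for the key.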

KNOWN LIMITATION (engine-side, out of scope for this PR per task
charter): src/engine.c::pfring_zc_init_per_thread is currently a stub
that errors at startup with "PF_RING ZC cluster has not been
initialized" — the per-thread cluster/pool/queue setup
(config->zc_cluster, config->zc_pool, ctx->zc_queue) is owned by a
follow-on engine patch. That patch is the engine-side equivalent of
the AF_XDP afxdp_tx_init_per_thread that landed in the engine fork
before AnyScan PR #71. Until it lands, --io-engine=pfring_zc on a
binary built with USE_PFRING_ZC=1 will surface that error message and
the operator must fall back to af_packet (or af_xdp). The wire-up
plumbing in this PR is correct independent of when the engine init
patch lands; landing this first means the engine PR is a one-line
swap when ready, with no AnyScan-side changes needed.

Verified:
- bash -n install-external-deps.sh package-worker-bundle.sh deploy.sh
  install-worker-bundle.sh tools/test-install-external-deps-pfring-zc.sh
  tools/test-install-external-deps-afxdp.sh -> 6/6 ok
- python3 -m py_compile vulnscanner-zmap-adapter.py
  test_vulnscanner_adapter_io_engine.py -> ok
- tools/test-install-external-deps-pfring-zc.sh -> 10/10 PASS
- tools/test-install-external-deps-afxdp.sh -> 10/10 PASS (no regression)
- python3 -m unittest test_vulnscanner_adapter_io_engine -v -> 25/25 ok
  (was 14 before; +11 pfring_zc cases, all 14 AF_XDP cases still green)
- cargo build --manifest-path Cargo.toml --release -> exit 0
- empirical default check: with both flags 0, the make argv is
  byte-identical to legacy (`make -C <repo>` only).
- empirical compose check: with both flags 1, single make argv is
  `make -C <repo> USE_AF_XDP=1 USE_PFRING_ZC=1`.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
…s + package-worker-bundle + deploy + adapter (#75)

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
Phase 1 design document for adding a DPDK io_engine to the bundled C
scanner (AnyVM-Tech/anyscan-engine-c). Mirrors PR #65's AF_XDP plan
structure across §1-§10.

Why now: PR #65's AF_XDP work landed but the c6in.metal bench revealed
ENA on kernel <=6.12.74 forces drv+copy (not drv+zerocopy), capping the
8-NIC ceiling at ~22 M pps — short of the 30-50 M pps projection. DPDK
via vfio-pci bypasses the ENA kernel driver entirely, projecting
50-100 M pps realistic on c6in.metal.

This supersedes PR #63's deferral recommendation (which was conditioned
on AF_XDP clearing the throughput target — it did not).

Plan scope:
- engine repo: ~1,100 LOC (send-dpdk.c, recv-dpdk.c, dpdk-eal.c,
  dpdk-defs.h, vtable slot in engine.c, USE_DPDK Makefile block)
- AnyScan-side wire-up: ~765 LOC (mirrors PR #71's ANYSCAN_USE_AF_XDP
  pattern across install-external-deps.sh / package-worker-bundle.sh /
  deploy.sh / runtime.worker.env.template / adapter.py + new
  tools/setup-dpdk.sh for hugepages and vfio-pci bind/unbind)
- NIC-binding decision: dedicated-DPDK-NIC pattern. eth0 stays on
  kernel for agentd heartbeat; ENIs eth1..eth7 (c6in.metal) go to
  vfio-pci. Single-NIC instances are DPDK-ineligible by design.
- Effort: 12-15 days implementation + canary, ~3-4 weeks total.
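
The dedicated-DPDK-NIC rule in the scope list above can be illustrated with a small selection function. This is a sketch only: `select_dpdk_nics` is a hypothetical name, and the control interface is hard-coded to eth0 as the plan describes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the dedicated-DPDK-NIC rule; names are illustrative.
# Prints the interfaces eligible for vfio-pci, one per line.
select_dpdk_nics() {
  local control="eth0" nic
  local -a eligible=()
  for nic in "$@"; do
    # The control-plane interface always stays on the kernel driver.
    if [ "$nic" != "$control" ]; then
      eligible+=("$nic")
    fi
  done
  # Nothing besides the control interface: the host is not DPDK-eligible.
  if [ "${#eligible[@]}" -eq 0 ]; then
    return 1
  fi
  printf '%s\n' "${eligible[@]}"
}

# Usage: a multi-NIC host yields candidates, a single-NIC host does not.
select_dpdk_nics eth0 eth1 eth2
select_dpdk_nics eth0 || echo "single-NIC host: not DPDK-eligible"
```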

Phase 2 implementation is gated on user/orchestrator approval after
this plan PR merges. No engine C code, no runtime config, no submodule
bumps in this PR.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
…undle + deploy + adapter (#81)

Phase 2 wire-up for the DPDK io_engine landing in
AnyVM-Tech/anyscan-engine-c PR #4. Mirrors PR #71's AF_XDP wire-up shape
across the install / bundle / deploy / adapter / install-time-probe
chain so the engine repo's USE_DPDK=1 build flag actually reaches every
producer of a worker bundle, and so the runtime --io-engine=dpdk knob
plumbed through ANYSCAN_SCANNER_IO_ENGINE has DPDK code to dispatch to.

Why DPDK now: AWS ENA on kernel ≤6.12.74 forces AF_XDP into drv+copy
mode, capping c6in.metal at ~22M pps aggregate (memory:
anyscan_afxdp_ena_constraint, also PR #65 issuecomment-4338158487 —
6.19.11 STILL does not have ena_xdp_zc). DPDK bypasses the kernel ENA
driver entirely via vfio-pci and removes the syscall-kick +
lower-half-channels-only ZC constraint.

What lands here:
  - install-external-deps.sh: ANYSCAN_USE_DPDK env knob;
    binary_has_dpdk_linkage probe (librte_eal.so via ldd → readelf -d);
    install_dpdk_build_deps (libdpdk-dev + dpdk apt-get, fail-open);
    cache short-circuit invalidation when cached binary lacks DPDK
    linkage; vulnscanner_make_args extension; post-build assertion.
  - package-worker-bundle.sh: same env knob, linkage probe,
    rebuild_scanner_with_dpdk helper, bundle_engine_make_args, README.txt
    use_dpdk field. Composes with USE_AF_XDP=1 USE_PFRING_ZC=1 — the
    earliest matching rebuild block produces a binary linked against
    every requested engine in a single make invocation.
  - deploy.sh: same env knob, linkage probe, make_args extension,
    pre-DPDK cached-binary drop, post-build assertion.
  - install-worker-bundle.sh: binary_has_dpdk_linkage,
    probe_dpdk_runtime_available (5 gates: scanner USE_DPDK-built,
    librte_eal.so loadable, vfio_pci kernel module, hugepages reserved
    in /sys/kernel/mm/hugepages/*, /dev/vfio/vfio present),
    apply_dpdk_availability writing ANYSCAN_DPDK_AVAILABLE.
  - vulnscanner-zmap-adapter.py: SUPPORTED_IO_ENGINES gains "dpdk";
    _IO_ENGINE_AVAILABILITY_KEYS maps "dpdk" → ANYSCAN_DPDK_AVAILABLE
    so the same fall-back-with-warning path the AF_XDP / PF_RING ZC
    plumbing already exercises picks up dpdk for free.
  - runtime.worker.env.template: full DPDK section documenting
    ANYSCAN_USE_DPDK (build-time), ANYSCAN_DPDK_AVAILABLE (install
    probe), ANYSCAN_DPDK_PCI_BDFS (BDF / iface CSV), and
    ANYSCAN_DPDK_HUGEPAGES_GB (default 4).
  - tools/setup-dpdk.sh (NEW, ~370 LOC): bind / unbind / status
    subcommands. Reserves hugepages (1 GiB pages preferred, falls back
    to 2 MiB), modprobe vfio-pci, dpdk-devbind.py --bind=vfio-pci.
    Idempotent (re-runs are no-ops). Reversible (`unbind` returns the
    NICs to ena and frees hugepages). Refuses to bind eth0 (agentd
    control-plane interface) and refuses to bind the only NIC. THP
    gets switched to "never" on bind (DPDK + THP fragments the static
    hugepage pool).
  - tools/test-install-external-deps-dpdk.sh (NEW, ~270 LOC): mirrors
    test-install-external-deps-afxdp.sh. Four cases × multiple
    assertions: default unset → no USE_DPDK=1 in make argv; opt-in +
    missing scanner → USE_DPDK=1; opt-in + cached non-DPDK binary →
    make clean + USE_DPDK=1; opt-in + cached DPDK-linked binary → no
    rebuild. Stubs make/git/ldd/readelf so it runs hermetically.
  - test_vulnscanner_adapter_io_engine.py: 7 new DPDK assertions
    covering the dpdk-with-runtime-available,
    dpdk-without-runtime-fall-back-with-warning,
    missing-availability-var, uppercase
    normalization, and cross-engine availability isolation cases.
    Updated test_invalid_value_falls_back_to_af_packet_with_warning
    to use "fake_engine" instead of "dpdk" — dpdk is now valid.
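
Three of the five install-time gates listed above are plain filesystem checks and can be sketched as follows. The function name `dpdk_host_gates_ok`, the root-prefix argument, and the `true`/`false` output format are assumptions made for illustration; the scanner-linkage and librte_eal.so gates are left out.

```shell
#!/usr/bin/env bash
# Hypothetical sketch covering three of the five gates. The root prefix
# argument exists only so the checks can run against a fake tree.
dpdk_host_gates_ok() {
  local root="${1:-}" f n reserved=0
  # Gate: vfio_pci module is loaded (loaded modules appear in /sys/module).
  [ -d "$root/sys/module/vfio_pci" ] || { echo "vfio_pci not loaded" >&2; return 1; }
  # Gate: the VFIO container device exists.
  [ -e "$root/dev/vfio/vfio" ] || { echo "/dev/vfio/vfio missing" >&2; return 1; }
  # Gate: at least one hugepage size has pages reserved.
  for f in "$root"/sys/kernel/mm/hugepages/*/nr_hugepages; do
    [ -r "$f" ] || continue
    n="$(cat "$f")"
    if [ "${n:-0}" -gt 0 ] 2>/dev/null; then
      reserved=1
    fi
  done
  [ "$reserved" -eq 1 ] || { echo "no hugepages reserved" >&2; return 1; }
  return 0
}

# Usage: probe the real host and print the value a caller might record.
if dpdk_host_gates_ok 2>/dev/null; then
  echo "ANYSCAN_DPDK_AVAILABLE=true"
else
  echo "ANYSCAN_DPDK_AVAILABLE=false"
fi
```

Failing closed on the first missing gate keeps the availability flag conservative, which matches the fall-back-with-warning behavior the adapter relies on.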

Verification (on Debian bookworm with libdpdk-dev 24.11 installed):
  - tools/test-install-external-deps-afxdp.sh: 11/11 (regression OK).
  - tools/test-install-external-deps-pfring-zc.sh: 10/10 (regression OK).
  - tools/test-install-external-deps-dpdk.sh: 10/10.
  - python3 -m unittest discover: 116/116 (32 in
    test_vulnscanner_adapter_io_engine, of which 7 are DPDK-specific).
  - All bash scripts parse cleanly via `bash -n`.
  - tools/setup-dpdk.sh status runs cleanly (no NICs bound, expected).

Engine PR for io_engine_dpdk: AnyVM-Tech/anyscan-engine-c#4

Out of scope (separate workers per the plan):
  - Phase 2 systemd unit edit adding CAP_SYS_RAWIO/CAP_IPC_LOCK/
    CAP_NET_ADMIN to anyscan-worker.service. Documented in the env
    template. Until that lands operators must add caps manually before
    flipping the runtime knob.
  - Live c6in.metal bench (plan §5.3).
  - AMI rebuild.
  - mlx5 / non-AWS hardware support.

Refs: plans/2026-04-28-portscan-dpdk-impl-v1.md (§3.10 wire-up, §3.11
NIC-binding decision, §4.3 kernel feature checks, §5.7 unit test shape).
      anygpt-50

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>