feat: whitelist socket families and block legacy AIO via seccomp #61

Merged
koki-develop merged 1 commit into main from feat/seccomp-socket-whitelist-and-legacy-aio on Apr 30, 2026

@koki-develop
Member

Summary

  • Replace the AF_ALG-specific seccomp rule with a socket family whitelist allowing only AF_UNIX (1), AF_INET (2), AF_INET6 (10), AF_NETLINK (16). Every other family — AF_PACKET, AF_VSOCK, AF_BLUETOOTH, AF_TIPC, AF_KEY, AF_NFC, AF_KCM, AF_RDS, AF_CAN, AF_MPLS, and ~30 more — is blocked with EPERM. socketpair() is restricted to AF_UNIX (the only family the kernel itself accepts).
  • Block the legacy libaio family (io_setup, io_destroy, io_submit, io_getevents, io_cancel, io_pgetevents) with ENOSYS, mirroring the existing io_uring block. io_pgetevents is bound by name on x86_64 (kafel db has it as syscall 333) and by raw syscall number 292 on arm64 (kafel's aarch64 db is missing this entry; the asm-generic number is 292).
  • The whitelist uses kafel's built-in family argument (size=4), so the comparison covers only the low 32 bits. This preserves the size=4 protection that motivated the original AF_ALG rule and avoids over-blocking legitimate calls with non-zero high bits.
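
The whitelist decision described above can be modeled in a few lines. This is only an illustrative sketch, not the kafel policy or the generated BPF: the function names `socket_allowed` and `socketpair_allowed` are hypothetical, and the family numbers are the ones listed in this summary. It shows why comparing only the low 32 bits of the first argument matters: a call whose family register carries garbage in the high bits is still judged by its actual family value.

```python
# Hypothetical model of the socket-family whitelist; not the kafel policy itself.
# Family numbers are the Linux values quoted in the PR summary.
ALLOWED_FAMILIES = {
    1: "AF_UNIX",
    2: "AF_INET",
    10: "AF_INET6",
    16: "AF_NETLINK",
}

# The family argument is treated as a 4-byte value, so only the low 32 bits count.
LOW32_MASK = 0xFFFFFFFF


def socket_allowed(arg0: int) -> bool:
    """Model of the socket() rule: allow only whitelisted families."""
    family = arg0 & LOW32_MASK
    return family in ALLOWED_FAMILIES


def socketpair_allowed(arg0: int) -> bool:
    """Model of the socketpair() rule: allow only AF_UNIX (1)."""
    family = arg0 & LOW32_MASK
    return family == 1
```

Under this model, `socket_allowed(38)` (AF_ALG) is false, while `socket_allowed((0xDEAD << 32) | 2)` is true, because the non-zero high bits are ignored and the low 32 bits equal AF_INET.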

Why now

This generalizes the AF_ALG (Copy Fail / CVE-2026-31431) block. AF_ALG is one of many niche socket families with a history of kernel attack surface; a whitelist covers the long tail in one rule. The change was verified against canonical Linux kernel headers, the kafel source (parser.y, codegen.c, syscall db), and a BPF dump (dump_policy_bpf for both architectures).

Verification

  • BPF generated by kafel was dumped with `tools/dump_policy_bpf` for both amd64 and arm64; every relevant instruction was traced manually and matches the policy intent (`arg 0 low` only — confirming size=4).
  • Kernel `socketpair()` returning EOPNOTSUPP for non-AF_UNIX families verified empirically inside the container (outside nsjail).
  • glibc `getaddrinfo` AI_ADDRCONFIG path confirmed to use `socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE)` via `check_pf.c:313` (sourceware glibc).
  • `ldd` confirmed no runtime binary in the image links against libaio; libaio is not installed.
  • `io_pgetevents_time64` (syscall 416) checked: arm64 kernel returns ENOSYS natively — no bypass path.
  • Full E2E suite (445 tests including 12 new socket/AIO tests) passes locally on arm64.
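
The `socketpair()` check in the list above could be reproduced with a small probe along these lines. This is a sketch under assumptions, not the procedure actually used for this PR: `probe_socketpair` is a hypothetical helper, and it is meant to run outside the sandbox so that the kernel's own answer is observed rather than the seccomp filter's.

```python
import errno
import socket


def probe_socketpair(family: int) -> str:
    """Return "ok" if socketpair() succeeds, else the symbolic errno name."""
    try:
        a, b = socket.socketpair(family, socket.SOCK_STREAM)
    except OSError as exc:
        # Fall back to the numeric value if the errno has no known name.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    a.close()
    b.close()
    return "ok"


if __name__ == "__main__":
    for name in ("AF_UNIX", "AF_INET", "AF_INET6"):
        print(name, probe_socketpair(getattr(socket, name)))
```

On Linux, EOPNOTSUPP and ENOTSUP share the same numeric value, so the printed name for a rejected family may be either of the two.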

Test plan

  • CI passes on amd64 (test 17 `io_pgetevents_blocked_by_seccomp_(go,_x86_64)` is skipped on an arm64 host, so a CI run is needed to exercise the x86_64 path).
  • CI passes on arm64.
  • Manual review of seccomp.kafel comments for accuracy.
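
The architecture split behind the skipped test can be sketched as a lookup from machine name to the `io_pgetevents` syscall number. The numbers are the ones stated in this PR (333 on x86_64, 292 on arm64); the helper names and the alias table are hypothetical and only illustrate how a test harness might decide which case applies on the current host.

```python
import platform
from typing import Optional

# Syscall numbers as stated in this PR: 333 on x86_64, 292 on arm64 (asm-generic).
IO_PGETEVENTS_NR = {
    "x86_64": 333,
    "aarch64": 292,
}

# Common alternative spellings reported by different tools (assumed alias set).
_ARCH_ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def normalize_arch(machine: str) -> str:
    """Map a machine string such as 'arm64' or 'AMD64' to a canonical name."""
    m = machine.lower()
    return _ARCH_ALIASES.get(m, m)


def io_pgetevents_nr(machine: str) -> Optional[int]:
    """Return the io_pgetevents syscall number, or None for unlisted architectures."""
    return IO_PGETEVENTS_NR.get(normalize_arch(machine))


def should_run_x86_64_case(machine: str) -> bool:
    """The x86_64-specific test only makes sense on an x86_64 host."""
    return normalize_arch(machine) == "x86_64"


if __name__ == "__main__":
    host = platform.machine()
    print(host, io_pgetevents_nr(host), should_run_x86_64_case(host))
```

On an arm64 host `should_run_x86_64_case` returns false, which corresponds to the skip noted in the test plan.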

🤖 Generated with Claude Code

Generalize the previous AF_ALG-specific block to a whitelist that
allows only AF_UNIX, AF_INET, AF_INET6, and AF_NETLINK. This removes a
long tail of niche kernel subsystems (AF_PACKET, AF_VSOCK,
AF_BLUETOOTH, AF_TIPC, AF_KEY, AF_NFC, etc.) that have significant
historical CVE counts and zero legitimate sandbox use. socketpair() is
restricted to AF_UNIX, the only family the kernel itself accepts.

Also block the legacy libaio family (io_setup, io_destroy, io_submit,
io_getevents, io_cancel, io_pgetevents) with ENOSYS, mirroring the
existing io_uring block. No runtime in the image links against libaio.
io_pgetevents is referenced by name on x86_64 (kafel's amd64 syscall
db has it as 333) and by raw syscall number 292 on arm64 (kafel's
aarch64 db is missing this entry).

Tests cover representative blocked families (AF_PACKET, AF_VSOCK,
AF_TIPC), the AF_NETLINK regression, the socketpair restriction, each
legacy AIO syscall, both architectures of io_pgetevents, and a size=4
regression that catches accidental migration to the inline declaration
form (which would over-block legitimate calls with non-zero high bits
in the family arg).
@koki-develop merged commit 57b219a into main on Apr 30, 2026
7 checks passed
@koki-develop deleted the feat/seccomp-socket-whitelist-and-legacy-aio branch on April 30, 2026 13:27