seccomp: fall back to NEW_LISTENER on kernels without WAIT_KILLABLE_RECV #63

Merged
congwang-mk merged 1 commit into multikernel:main from dzerik:fix-seccomp-wait-killable-fallback
May 26, 2026
dzerik (Contributor) commented May 26, 2026

Fixes #62.

The Python implementation (commit 50d5eb9, "Enable SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV for reliable notifications") tried the install with WAIT_KILLABLE_RECV first and fell back to bare NEW_LISTENER on older kernels that rejected the bit with EINVAL. The Rust rewrite passed both flags unconditionally, so seccomp(SET_MODE_FILTER) returns EINVAL on every kernel < 5.19 — blocking sandlock on RHEL 9 (5.14) and Ubuntu 22.04 (5.15) at runtime, before Landlock is even attempted.

Change

Restore the pre-rewrite behaviour:

  • Try NEW_LISTENER | WAIT_KILLABLE_RECV first
  • On EINVAL (and only EINVAL), retry with NEW_LISTENER alone
  • Propagate any other error unchanged

The retry control flow is extracted into install_with_einval_fallback so the success-fast-path, EINVAL-retry, non-EINVAL-passthrough, and both-EINVAL terminal cases can be unit-tested without invoking the real seccomp(2).
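
The retry policy described above can be sketched as a small generic helper. This is a minimal illustration, not the code merged in this PR: the signature of install_with_einval_fallback, the closure parameter standing in for the real seccomp(2) call, and the simulated_old_kernel function are all assumptions made for the example.

```rust
use std::io;

// Linux errno value for EINVAL ("Invalid argument (os error 22)").
const EINVAL: i32 = 22;

/// Try `install(preferred)`; retry with `fallback` only if the first attempt
/// failed with EINVAL. Any other outcome is returned unchanged.
/// Hypothetical signature: `install` stands in for the real seccomp(2) call.
fn install_with_einval_fallback<F>(preferred: u64, fallback: u64, mut install: F) -> io::Result<i32>
where
    F: FnMut(u64) -> io::Result<i32>,
{
    match install(preferred) {
        // Only EINVAL triggers the retry; its result is terminal.
        Err(e) if e.raw_os_error() == Some(EINVAL) => install(fallback),
        // Success or any non-EINVAL error passes straight through.
        other => other,
    }
}

/// Illustration only: mimics a pre-5.19 kernel by rejecting any flag word
/// that contains bit 5 (WAIT_KILLABLE_RECV) and otherwise returning a fake fd.
fn simulated_old_kernel(flags: u64) -> io::Result<i32> {
    if flags & (1 << 5) != 0 {
        Err(io::Error::from_raw_os_error(EINVAL))
    } else {
        Ok(3)
    }
}
```

Calling install_with_einval_fallback((1 << 3) | (1 << 5), 1 << 3, simulated_old_kernel) makes two attempts and returns the fake fd from the second one. Taking the install step as a closure is what lets the control flow be exercised without a real syscall.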

WAIT_KILLABLE_RECV is a robustness flag (it stops signals from aborting in-flight notifications); dropping it on older kernels does not relax the security boundary, which is what the original Python fallback already relied on.
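
For reference, the two flag sets differ by exactly one bit. The sketch below uses the bit values from the kernel UAPI header (include/uapi/linux/seccomp.h); the names PREFERRED_FLAGS, FALLBACK_FLAGS, and dropped_bits are hypothetical and may not match the identifiers used in sandlock.

```rust
// Bit values as defined in include/uapi/linux/seccomp.h.
const SECCOMP_FILTER_FLAG_NEW_LISTENER: u64 = 1 << 3;
const SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV: u64 = 1 << 5;

// Hypothetical names for the two flag sets tried in order.
const PREFERRED_FLAGS: u64 =
    SECCOMP_FILTER_FLAG_NEW_LISTENER | SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV;
const FALLBACK_FLAGS: u64 = SECCOMP_FILTER_FLAG_NEW_LISTENER;

/// Bits present in `preferred` but absent from `fallback`.
fn dropped_bits(preferred: u64, fallback: u64) -> u64 {
    preferred & !fallback
}
```

With these definitions, dropped_bits(PREFERRED_FLAGS, FALLBACK_FLAGS) equals SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV, so the fallback still requests a listener fd and gives up only the killable-wait behaviour.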

seccomp(SET_MODE_FILTER) also returns EINVAL for a malformed BPF program (bad instruction count, bad opcode, kernel verifier rejection). In that case both attempts use the same fprog, so the second call fails with EINVAL too and the error is propagated — the fallback cannot silently paper over a malformed filter. The fallback_einval_on_both_returns_second_einval test pins this behaviour.

Tests

  • seccomp::bpf::tests::fallback_succeeds_first_try_returns_fd_no_retry — happy path, no retry
  • seccomp::bpf::tests::fallback_einval_retries_with_fallback_flags — preferred-set returns EINVAL, fallback-set succeeds
  • seccomp::bpf::tests::fallback_non_einval_error_propagates_without_retry — EPERM is not retried
  • seccomp::bpf::tests::fallback_einval_on_both_returns_second_einval — both attempts fail, terminal error surfaces
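
The four cases listed above can be exercised with a scripted stand-in for the install call, roughly as follows. This is a sketch under assumptions, not the PR's test code: the helper is re-declared here with a hypothetical signature so the example is self-contained, and run_scripted, PREFERRED, and FALLBACK are invented for illustration.

```rust
use std::io;

// Linux errno values.
const EPERM: i32 = 1;
const EINVAL: i32 = 22;

// Illustrative flag words (NEW_LISTENER | WAIT_KILLABLE_RECV, NEW_LISTENER).
const PREFERRED: u64 = (1 << 3) | (1 << 5);
const FALLBACK: u64 = 1 << 3;

// Stand-in for the helper under test (hypothetical signature).
fn install_with_einval_fallback<F>(preferred: u64, fallback: u64, mut install: F) -> io::Result<i32>
where
    F: FnMut(u64) -> io::Result<i32>,
{
    match install(preferred) {
        Err(e) if e.raw_os_error() == Some(EINVAL) => install(fallback),
        other => other,
    }
}

/// Feed the helper a scripted sequence of outcomes (Ok(fd) or Err(errno)).
/// Returns the final result, with errors reduced to their raw errno, and the
/// flag words the helper attempted, in order. Panics if the helper makes
/// more attempts than the script provides.
fn run_scripted(script: &[Result<i32, i32>]) -> (Result<i32, Option<i32>>, Vec<u64>) {
    let mut calls = Vec::new();
    let mut outcomes = script.iter();
    let result = install_with_einval_fallback(PREFERRED, FALLBACK, |flags| {
        calls.push(flags);
        match outcomes.next().expect("unexpected extra install attempt") {
            Ok(fd) => Ok(*fd),
            Err(errno) => Err(io::Error::from_raw_os_error(*errno)),
        }
    });
    (result.map_err(|e| e.raw_os_error()), calls)
}
```

A script of [Err(EINVAL), Err(EINVAL)] models the malformed-filter case: the helper makes exactly two attempts and surfaces the second EINVAL, while [Err(EPERM)] must produce a single attempt.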

282 lib tests pass.

Verified on a real < 5.19 kernel

On stock Rocky Linux 9.6 (kernel 5.14.0-570.17.1.el9_6, Landlock ABI v5):

Before this change:

$ sandlock run --fs-read /usr --fs-read /lib64 --fs-read /etc -- /usr/bin/true
sandlock child: seccomp install: Invalid argument (os error 22)
Error: child process error: read notif fd from child: pipe closed
exit=1

After this change (same binary, same kernel):

$ sandlock run --fs-read /usr --fs-read /lib64 --fs-read /etc -- /usr/bin/true
exit=0

(The Rocky 9.6 run also requires #17 / Protection opt-out to skip the two v6 IPC scopes; the seccomp install passes either way once this fix lands.)

The Python implementation (commit 50d5eb9) tried the install with
SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV first and fell back to bare
NEW_LISTENER on older kernels that rejected the bit with EINVAL. The
Rust rewrite passed both flags unconditionally, so seccomp(SET_MODE_FILTER)
returns EINVAL on every kernel < 5.19 — blocking sandlock on RHEL 9
(5.14) and Ubuntu 22.04 (5.15) at runtime.

Restore the pre-rewrite behaviour: try the preferred flag set, retry
with the fallback set only on EINVAL, propagate any other error
unchanged. WAIT_KILLABLE_RECV is a robustness flag (stops signals
aborting in-flight notifications); dropping it on older kernels does
not relax the security boundary, which is what the original Python
fallback already relied on.

Extract the retry control flow into install_with_einval_fallback so
the EINVAL-only retry, success-fast-path, non-EINVAL passthrough, and
both-EINVAL terminal cases can be unit-tested without invoking the
real seccomp(2) syscall.

Verified on Rocky Linux 9.6 (5.14.0-570.17.1.el9_6): sandlock run with
no v6 protections succeeds with exit=0 after this change; without it,
the same invocation fails at seccomp install before Landlock is
attempted.

Fixes multikernel#62.

congwang-mk merged commit a76b2d7 into multikernel:main on May 26, 2026
8 checks passed
Development

Successfully merging this pull request may close these issues.

seccomp install fails on kernels < 5.19 — Rust port lost Python fallback for SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV