bypass4netns-style socket switching (opt-in host network mode) #16

@jserv

Problem

All network I/O goes through the SLIRP/passt userspace stack. For bulk
transfers, data-path overhead dominates.

Proposed Changes

Add an opt-in --net=host-bypass fast path using SECCOMP_IOCTL_NOTIF_ADDFD
(the same mechanism already used for pipe injection in forward_pipe()):

  1. Intercept socket() + connect() via seccomp for eligible sockets.
  2. Create the socket in the host network namespace, inject via ADDFD.
  3. Subsequent read/write/send/recv go directly to the host kernel
    without supervisor involvement on the data path.

Scope of first version:

  • AF_INET/AF_INET6, SOCK_STREAM, outbound connect() only.
  • Explicitly exclude AF_UNIX, AF_PACKET, raw sockets.
  • Policy controls to restrict which destination addresses/ports are eligible
    for bypass.
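One way the scope and policy checks above might look, as a sketch under stated assumptions: `struct bypass_rule`, `socket_eligible`, and `dest_allowed` are hypothetical names, the rule format (IPv4 network/mask plus a port range) is invented for illustration, and IPv6 destination matching is left out for brevity. An empty rule list denies everything, which keeps the bypass opt-in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical policy rule: IPv4 network and port range, host byte order. */
struct bypass_rule {
    uint32_t net;
    uint32_t mask;
    uint16_t port_lo;
    uint16_t port_hi;
};

/* First-version scope: AF_INET/AF_INET6 stream sockets only. */
static bool socket_eligible(int domain, int type)
{
    /* socket(2) allows these flags to be OR-ed into the type argument. */
    int base = type & ~(SOCK_NONBLOCK | SOCK_CLOEXEC);

    return (domain == AF_INET || domain == AF_INET6) && base == SOCK_STREAM;
}

/* Default deny: a destination is bypassed only if some rule matches it. */
static bool dest_allowed(const struct bypass_rule *rules, size_t n,
                         const struct sockaddr_in *dst)
{
    uint32_t addr = ntohl(dst->sin_addr.s_addr);
    uint16_t port = ntohs(dst->sin_port);

    for (size_t i = 0; i < n; i++) {
        if ((addr & rules[i].mask) == rules[i].net &&
            port >= rules[i].port_lo && port <= rules[i].port_hi)
            return true;
    }
    return false;
}
```

A socket that fails either check would simply stay on the existing SLIRP/passt path, so the bypass never changes behavior for connections it does not claim.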

Additional syscall interception beyond socket() is required:
connect, bind, listen, accept, setsockopt, getsockopt, and
fcntl on bypassed FDs all need consideration in the dispatch layer
(seccomp-dispatch.c).
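To route those calls, the dispatch layer needs to know which guest FDs have been switched. The sketch below is illustrative only: the bitmap, `BYPASS_MAX_FD`, and the function names are hypothetical and do not reflect the actual layout of seccomp-dispatch.c. It tracks a single guest FD table and assumes an architecture such as x86_64 or aarch64 where these calls have their own syscall numbers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/syscall.h>

#define BYPASS_MAX_FD 1024 /* hypothetical limit for this sketch */

/* One bit per guest FD; set means the FD is backed by a host socket. */
static uint64_t bypass_bits[BYPASS_MAX_FD / 64];

static bool fd_in_range(int fd)
{
    return fd >= 0 && fd < BYPASS_MAX_FD;
}

static void bypass_mark(int fd)
{
    if (fd_in_range(fd))
        bypass_bits[fd / 64] |= UINT64_C(1) << (fd % 64);
}

static void bypass_clear(int fd)
{
    if (fd_in_range(fd))
        bypass_bits[fd / 64] &= ~(UINT64_C(1) << (fd % 64));
}

static bool bypass_is(int fd)
{
    return fd_in_range(fd) && ((bypass_bits[fd / 64] >> (fd % 64)) & 1);
}

/* True if syscall nr on fd must be handled with host-socket semantics. */
static bool needs_bypass_handling(long nr, int fd)
{
    if (!bypass_is(fd))
        return false;
    switch (nr) {
    case SYS_connect:
    case SYS_bind:
    case SYS_listen:
#ifdef SYS_accept
    case SYS_accept:
#endif
    case SYS_accept4:
    case SYS_setsockopt:
    case SYS_getsockopt:
    case SYS_fcntl:
        return true;
    default:
        return false; /* data-path calls are not routed through here */
    }
}
```

A real table would also have to follow `close()`, `dup()` and friends, and `fork()`, which this sketch ignores.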

Considerations

  • epoll/poll multiplexing: LKL epoll cannot monitor host socket FDs and
    vice versa. Bridging both worlds may require intercepting epoll_ctl and
    epoll_wait to implement a unified event loop. This is the hardest
    architectural challenge and may limit applicability to simple
    connect-send-recv workloads initially.
  • Network isolation tradeoff: the guest operates directly in the host
    network namespace for bypassed sockets. bind() binds to host interfaces,
    getsockname() returns host IPs. This must be explicitly opt-in with
    clear documentation of the security implications.
  • Relationship to passt backend and SLIRP Phase 2 (#12, "SLIRP networking:
    server sockets, sendmsg interception, TCP test coverage"): this is a
    complementary approach. passt improves the stack-mediated path; socket
    switching bypasses the stack entirely for eligible connections.
  • Address-family scope: start narrow (AF_INET/AF_INET6,
    SOCK_STREAM) and expand based on demand.
