File descriptor leak crashes leanpoint under unresponsive upstreams #13

@ch4r10t33r

Summary

leanpoint crashes (or becomes unresponsive) when one or more upstreams stop responding at the TCP layer but do not actively reject connections. Each poll spawns a worker thread that eventually gets abandoned via thread.detach() when it misses the wall-clock deadline, but the socket remains open for the kernel's TCP retransmit window (~15 minutes on Linux). Over time the main process exhausts RLIMIT_NOFILE and can no longer open new sockets — inbound or outbound.

Observed incident (devnet-4, 2026-04-23)

Public endpoints returned UpstreamConnectFailure for every upstream while the container appeared running. Snapshot inside the container:

  • /proc/$PID/fd/ held 1021 socket fds out of RLIMIT_NOFILE=1024 (Docker default).
  • ss -tn showed ~960 in ESTABLISHED toward the devnet clients and 63 in CLOSE-WAIT.
  • HTTP GET / on the public listener hung (accept loop could not get a new fd).
  • Container logs showed sustained WARN | Upstream (...) timed out after 5000ms — detaching thread over the hours preceding the outage.

The CLOSE-WAIT count is the smoking gun: the remote side had sent FIN, but our worker never called close() because it was still blocked inside recv() on a socket the peer had half-closed.

Root cause

src/upstreams.zig::pollUpstreamThread creates a fresh std.http.Client per poll and calls lean_api.fetchSlots synchronously. On Zig 0.14.1, std.http.Client does not expose connect_timeout or read_timeout, so the existing @hasField guards in poller.zig / server.zig are no-ops on this version. To enforce a deadline, pollUpstreams waits up to request_timeout_ms, then logs a warning and moves on; any still-running worker is detached.

Detaching does not cancel the thread. It keeps blocking in the underlying syscall with the socket open until:

  1. the peer sends RST/FIN (can be near-instant), or
  2. the kernel gives up on TCP retransmits (default ~15 min, but can be longer when SYN-ACKs arrive and then the remote stalls mid-stream).

For the 16-upstream devnet we poll every ~4s; even a small fraction of stuck workers leaks fds faster than they drain. At default Docker nofile=1024, one day of mild upstream flakiness is enough to exhaust the limit.
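As a rough back-of-envelope check of that claim, the sketch below estimates how long a given net leak rate takes to consume the fd budget. The function name and the net_leak_fraction parameter are hypothetical; the fraction stands for the share of polls whose socket stays open after accounting for sockets that drain, and the 0.3% used in the comment is an illustrative assumption, not a measured value.

```python
def hours_to_exhaust(nofile: int, upstreams: int, poll_interval_s: float,
                     net_leak_fraction: float) -> float:
    """Hours until `nofile` descriptors are consumed by leaked sockets.

    Assumes a constant net leak: `net_leak_fraction` of all polls leave
    one socket open that is not reclaimed within the period considered.
    """
    polls_per_second = upstreams / poll_interval_s
    leaked_fds_per_second = polls_per_second * net_leak_fraction
    return nofile / leaked_fds_per_second / 3600.0


# Illustrative numbers: 16 upstreams polled every 4 s give 4 polls per second.
# With an assumed net leak of 0.3% of polls, a 1024-fd budget lasts
# a little under 24 hours.
estimate = hours_to_exhaust(nofile=1024, upstreams=16,
                            poll_interval_s=4.0, net_leak_fraction=0.003)
```

The estimate ignores the descriptors the process already holds for its listener, logs, and healthy connections, so the real margin is somewhat smaller.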

Fix (PR #1)

Apply SO_RCVTIMEO and SO_SNDTIMEO to the socket returned by client.open() via std.posix.setsockopt on req.connection.?.stream.handle. This bounds every blocking recv/send to request_timeout_ms regardless of peer behavior, so detached workers reliably self-terminate (and their defer client.deinit() closes the socket) within the configured deadline.
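The effect of the two socket options can be seen in isolation with a minimal sketch. It is written in Python purely to illustrate the POSIX behaviour and is not the Zig change from the PR; set_io_timeouts and recv_from_silent_peer are hypothetical helper names, and the struct timeval layout of two native longs is an assumption that holds on 64-bit Linux.

```python
import socket
import struct
import time


def set_io_timeouts(sock: socket.socket, timeout_ms: int) -> None:
    """Bound every blocking recv/send on `sock` with SO_RCVTIMEO / SO_SNDTIMEO."""
    sec, msec = divmod(timeout_ms, 1000)
    # struct timeval packed as two native longs (assumption: 64-bit Linux layout).
    tv = struct.pack("ll", sec, msec * 1000)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO, tv)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDTIMEO, tv)


def recv_from_silent_peer(timeout_ms: int) -> float:
    """Connect to a local listener that never replies and return the
    number of seconds recv() blocked before the kernel timeout fired."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)  # the connection completes in the backlog; nothing answers
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        client.connect(listener.getsockname())
        set_io_timeouts(client, timeout_ms)
        start = time.monotonic()
        try:
            client.recv(1)
        except BlockingIOError:
            pass  # EAGAIN / EWOULDBLOCK: the receive timeout expired
        return time.monotonic() - start
    finally:
        # Corresponds to the worker's deferred cleanup: the fd is released.
        client.close()
        listener.close()
```

Without the set_io_timeouts call, the recv() in this sketch would block indefinitely, which is the state the detached workers were left in.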

This addresses the observed pathology — accepted connections that go silent or are half-closed. Connect-phase black-holes (SYN with no SYN-ACK) are out of scope here because we would need to switch to non-blocking connect + poll; they didn't show up in the incident.
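For reference, the non-blocking connect + poll pattern mentioned above looks roughly like the following. This is a generic POSIX sketch in Python, not code from leanpoint; connect_with_deadline is a hypothetical helper, and it takes an IP address rather than a hostname so that no blocking DNS lookup is involved.

```python
import errno
import os
import select
import socket


def connect_with_deadline(ip: str, port: int, timeout_s: float) -> socket.socket:
    """TCP connect that gives up after `timeout_s` seconds instead of
    waiting for the kernel's SYN retry schedule to run out."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        rc = sock.connect_ex((ip, port))
        if rc == errno.EINPROGRESS:
            # Handshake still in flight: wait until the socket is writable
            # (or has failed), but no longer than the deadline.
            poller = select.poll()
            poller.register(sock.fileno(), select.POLLOUT)
            if not poller.poll(int(timeout_s * 1000)):
                raise TimeoutError(f"connect to {ip}:{port} exceeded {timeout_s}s")
            # Writable does not imply success; SO_ERROR holds the real outcome.
            rc = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if rc != 0:
            raise OSError(rc, os.strerror(rc))
        sock.setblocking(True)  # hand back an ordinary blocking socket
        return sock
    except BaseException:
        sock.close()  # never leak the fd on a failed or timed-out connect
        raise
```

The important detail for this issue is the cleanup path: a timed-out connect closes its descriptor immediately rather than leaving it to the kernel.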

Mitigation applied on the running container

Restarted with --ulimit nofile=65536:65536. This only buys time; without the socket-level timeout the leak continues and simply takes longer to reach the limit.

Follow-ups (not in this fix)

  • Non-blocking connect with explicit connect-timeout, so connect-phase hangs also clean up promptly.
  • Health counter + /healthz that fails when open-fd usage crosses a threshold, so orchestrators restart before exhaustion.
  • Track the currently-in-flight detached worker count and log when it grows unboundedly.
  • Revisit whether we need per-poll threads at all once socket-level timeouts exist — a sequential poll with hard timeouts would remove the detach/ref-count machinery entirely.
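The fd-threshold health check from the list above could be sketched as follows. This is a Linux-only illustration in Python, not a proposed leanpoint implementation; fd_usage, is_healthy, and the 0.8 threshold are hypothetical choices.

```python
import os
import resource


def fd_usage() -> tuple:
    """Return (open_fds, soft_limit) for the current process (Linux only)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Listing the directory briefly opens one extra fd, so the count can be
    # one higher than the steady-state value.
    open_fds = len(os.listdir("/proc/self/fd"))
    return open_fds, soft


def is_healthy(open_fds: int, soft_limit: int, threshold: float = 0.8) -> bool:
    """Healthy while fd usage stays below `threshold` of the soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # no finite limit to approach
    return open_fds < soft_limit * threshold
```

A /healthz handler built on this would return a failing status once is_healthy(*fd_usage()) turns false. With the incident's numbers (1021 of 1024), the check would have failed long before the accept loop stalled.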

Reproduction

  1. Run leanpoint against an upstream that accepts TCP but never returns a response body (e.g. nc -l 5055 in accept-and-block mode).
  2. Watch ls /proc/$PID/fd | wc -l climb by one per poll (every poll_interval_ms) until nofile is hit.
  3. Public listener stops accepting.

With the fix, the fd count stays flat: each worker returns within request_timeout_ms and releases its socket.
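If nc is not available or closes connections too eagerly, a small stand-in for the silent upstream from step 1 can be sketched in Python. start_silent_upstream is a hypothetical helper; it accepts every connection and then never reads or writes, so each client sees an established socket that stays silent.

```python
import socket
import threading


def start_silent_upstream(host: str = "127.0.0.1", port: int = 0):
    """Accept TCP connections and never answer them.

    Returns (listener, held); `held` collects the accepted sockets so they
    stay open for the lifetime of the process.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(128)
    held = []

    def accept_forever() -> None:
        while True:
            try:
                conn, _addr = listener.accept()
            except OSError:
                return  # listener was closed
            held.append(conn)  # keep the connection open, send nothing

    threading.Thread(target=accept_forever, daemon=True).start()
    return listener, held


# Usage (assumption: 5055 matches the port leanpoint is configured to poll):
#   listener, held = start_silent_upstream(port=5055)
#   threading.Event().wait()   # keep the process alive
```

Pointing leanpoint at this listener should reproduce the fd growth described in step 2 on an unpatched build.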
