fix(fast_io): prevent io_uring SEND deadlock under TCP backpressure (#1872) by oferchen · Pull Request #3551 · oferchen/rsync

oferchen · 2026-05-02T03:59:46Z

Summary

Fixes #1872. An IORING_OP_SEND on a back-pressured TCP socket would sit in the kernel until the send buffer drained, blocking submit_and_wait and starving the concurrent RECV completion path, which produced an apparent deadlock during daemon-mode bidirectional transfers.
Adds a PollAdd(POLLOUT) readiness gate (with a linked timeout) in front of every submit_send_batch flush. SEND SQEs only enter the kernel once the socket reports writable, so submit_and_wait is bounded by readiness rather than peer drain. This mirrors upstream rsync's select()-based bidirectional I/O loop in io.c:perform_io.
Transient EAGAIN/EWOULDBLOCK SEND CQEs and ETIME/ECANCELED poll CQEs re-arm the readiness wait without surfacing fatal errors.
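
The retry rule in the last item can be sketched as a small classifier over CQE results. This is an illustration only, not the code from this PR: the `Op` and `Next` enums and the `classify` function are hypothetical names, and the errno constants are hard-coded on the assumption of Linux on a common architecture such as x86_64 or aarch64.

```rust
// Assumption: Linux errno values on common architectures (x86_64, aarch64).
const EAGAIN: i32 = 11; // same value as EWOULDBLOCK on Linux
const ETIME: i32 = 62;
const ECANCELED: i32 = 125;

/// Hypothetical tag for which SQE a completion belongs to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    PollOut,
    Send,
}

/// Hypothetical decision the writer takes after reading one CQE.
#[derive(Debug, PartialEq, Eq)]
enum Next {
    /// Socket reported writable: submit the SEND batch.
    SubmitSend,
    /// SEND accepted this many bytes.
    Sent(usize),
    /// Transient condition: queue another readiness wait.
    Rearm,
    /// Anything else is surfaced as a fatal errno.
    Fatal(i32),
}

/// io_uring reports failures as a negated errno in the CQE result field.
fn classify(op: Op, res: i32) -> Next {
    match (op, res) {
        (Op::PollOut, r) if r >= 0 => Next::SubmitSend,
        (Op::Send, r) if r >= 0 => Next::Sent(r as usize),
        (Op::PollOut, r) if r == -ETIME || r == -ECANCELED => Next::Rearm,
        (Op::Send, r) if r == -EAGAIN => Next::Rearm,
        (_, r) => Next::Fatal(-r),
    }
}
```

For example, `classify(Op::Send, -EAGAIN)` yields `Next::Rearm`, while a SEND failing with any other errno is reported as `Next::Fatal`.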

Why poll-add gating instead of an interleaved peek loop

Each IoUringSocketWriter/IoUringSocketReader already owns a private ring (so the SEND ring never carries RECV SQEs). The cleanest containment for the deadlock is therefore inside the writer's own submit_send_batch: defer the SEND submission until the kernel signals room. This adds zero coupling between the writer and reader rings and keeps the existing API surface intact, whereas an interleaved peek/timeout loop would require sharing CQE handling between two rings that today have no shared state.
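
The idea of deferring SEND until the socket reports room can be modelled without a real ring. The sketch below is a toy simulation under stated assumptions: `MockSocket`, `wait_writable`, `send`, and `flush_gated` are hypothetical stand-ins (for the PollAdd(POLLOUT) gate with its timeout, a SEND SQE, and the gated flush respectively) and do not reflect the actual submit_send_batch implementation.

```rust
/// Hypothetical model of a socket with a bounded kernel send buffer.
struct MockSocket {
    capacity: usize,       // size of the simulated send buffer
    queued: usize,         // bytes currently waiting for the peer
    drain_per_wait: usize, // bytes the peer consumes during each wait
}

impl MockSocket {
    /// Stand-in for the readiness gate: returns true when the socket is
    /// writable, false when the (simulated) timeout fired first.
    fn wait_writable(&mut self) -> bool {
        self.queued = self.queued.saturating_sub(self.drain_per_wait);
        self.queued < self.capacity
    }

    /// Stand-in for a SEND: returns None when the buffer is full,
    /// modelling a transient EAGAIN.
    fn send(&mut self, buf: &[u8]) -> Option<usize> {
        let room = self.capacity - self.queued;
        if room == 0 {
            return None;
        }
        let n = room.min(buf.len());
        self.queued += n;
        Some(n)
    }
}

/// Sends `buf` only after each readiness wait succeeds, so the caller is
/// never parked behind a full buffer. Gives up after `max_waits` waits.
fn flush_gated(
    sock: &mut MockSocket,
    mut buf: &[u8],
    max_waits: usize,
) -> Result<usize, &'static str> {
    let mut sent = 0;
    let mut waits = 0;
    while !buf.is_empty() {
        if waits == max_waits {
            return Err("peer never drained");
        }
        waits += 1;
        if !sock.wait_writable() {
            continue; // timeout: re-arm the readiness wait
        }
        match sock.send(buf) {
            Some(n) => {
                sent += n;
                buf = &buf[n..];
            }
            None => continue, // transient EAGAIN: re-arm
        }
    }
    Ok(sent)
}
```

Because each wait is bounded, control returns to the caller between attempts even when the peer is slow, which is the property the readiness gate is meant to provide.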

Test plan

CI: cargo nextest run -p fast_io --all-features -E 'test(io_uring) or test(socket) or test(send)' (Linux must exercise test_socket_send_no_deadlock_under_backpressure_1872; macOS/Windows skip via #[cfg(target_os = "linux")]).
CI: full nextest matrix (Linux musl, Windows, macOS) on stable.
CI: fmt + clippy + interop workflows.

The new regression test:

Opens a loopback TCP pair and shrinks SO_SNDBUF/SO_RCVBUF to 4 KiB.
Pre-fills the writer's kernel send buffer with a sentinel byte until a non-blocking write returns EAGAIN, then restores blocking mode.
Spawns separate writer and drain threads, each backed by its own io_uring ring (IoUringPolicy::Enabled).
Asserts the writer reports the full 64 KiB payload within a 20 s wall-clock deadline (without the fix this would loop indefinitely).
Skips gracefully when io_uring is unavailable.
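
The prefill step can be approximated with the standard library alone. This is a minimal sketch, not the test from this PR: `loopback_pair` and `prefill_until_would_block` are hypothetical helper names, and because `std` has no API for SO_SNDBUF/SO_RCVBUF the buffers keep their default sizes here (shrinking them would need a crate such as `socket2` or `libc`).

```rust
use std::io::{self, ErrorKind, Write};
use std::net::{TcpListener, TcpStream};

/// Sentinel byte used for the saturating prefill (0xAB, as in the PR).
const PREFILL_MARKER: u8 = 0xAB;

/// Hypothetical helper: connect a TCP pair over loopback.
fn loopback_pair() -> io::Result<(TcpStream, TcpStream)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let writer = TcpStream::connect(listener.local_addr()?)?;
    let (reader, _) = listener.accept()?;
    Ok((writer, reader))
}

/// Hypothetical helper: write sentinel bytes in non-blocking mode until the
/// kernel refuses more data, then restore blocking mode.
/// Returns the number of bytes that were queued.
fn prefill_until_would_block(stream: &mut TcpStream) -> io::Result<usize> {
    stream.set_nonblocking(true)?;
    let chunk = [PREFILL_MARKER; 4096];
    let mut total = 0usize;
    let result: io::Result<usize> = loop {
        match stream.write(&chunk) {
            Ok(0) => break Err(io::Error::new(ErrorKind::WriteZero, "peer closed")),
            Ok(n) => total += n,
            // WouldBlock is how std surfaces EAGAIN/EWOULDBLOCK.
            Err(e) if e.kind() == ErrorKind::WouldBlock => break Ok(total),
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => break Err(e),
        }
    };
    stream.set_nonblocking(false)?;
    result
}
```

The reader end must stay open and undrained while the prefill runs; otherwise the writer either never saturates or sees a connection error instead of `WouldBlock`.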

…1872)

Without a readiness gate, an IORING_OP_SEND on a back-pressured TCP socket can sit in the kernel until the send buffer drains. While that SQE is pending the writer ring's submit_and_wait() does not return, starving any concurrent RECV completion side and producing the apparent deadlock reported in #1872.

Mirror upstream rsync's bidirectional select() strategy (io.c:perform_io) inside submit_send_batch by gating each batch with a PollAdd(POLLOUT) SQE plus a linked Timeout. SEND SQEs are only submitted after the socket reports writable, so submit_and_wait is bounded by readiness rather than peer drain. Transient EAGAIN/ETIME results re-arm the readiness wait instead of failing.

Adds a Linux-only regression test that prefills a small TCP socket buffer until EAGAIN, then drives a concurrent SEND and RECV across their own io_uring rings under a 20s wall-clock guard. The test skips gracefully when io_uring is not available.

The previous payload formula `(i % 250) + 1` produced byte 0xAB at i=170, which collides with PREFILL_MARKER (0xAB). The drain side relies on this byte never appearing in the payload to recognize the boundary between the saturating prefill and the io_uring writer output. Replace the mapping with a range that walks 1..=255 while skipping the marker byte, so the debug_assert on line 1040 holds for any index.
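
One way to build such a mapping is sketched below. It is an assumption about the shape of the fix, not the code from the commit; `payload_byte` is a hypothetical name. It cycles through the 254 values in 1..=255 that differ from the marker.

```rust
/// Sentinel byte reserved for the prefill; must never appear in the payload.
const PREFILL_MARKER: u8 = 0xAB;

/// Hypothetical mapping: walks 1..=255 and skips PREFILL_MARKER.
fn payload_byte(i: usize) -> u8 {
    // 254 usable values remain once 0 and the marker are excluded.
    let v = (i % 254) as u8 + 1; // 1..=254
    if v >= PREFILL_MARKER {
        v + 1 // shift 171..=254 up to 172..=255, stepping over the marker
    } else {
        v
    }
}
```

With the old formula, index 170 maps to `(170 % 250) + 1 = 171`, which is 0xAB; with this mapping index 170 maps to 172 instead.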

github-actions bot added the bug label (Something isn't working) May 2, 2026

oferchen added 4 commits May 2, 2026 08:20

fix(io_uring): collapse EAGAIN/EWOULDBLOCK match arm and reformat (#1872)

f17b7b9

EWOULDBLOCK is defined equal to EAGAIN on Linux, so the second match arm is an unreachable pattern and fails the build under -D warnings. Collapse to a single equality check and rerun rustfmt on the writer thread spawn block.
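
The collapsed check might look like the sketch below. It is illustrative only: `is_transient` is a hypothetical name, and the constants are defined locally (assuming the Linux value 11) rather than taken from a crate such as `libc`.

```rust
use std::io;

// Assumption: Linux, where EWOULDBLOCK is an alias for EAGAIN (value 11).
const EAGAIN: i32 = 11;
const EWOULDBLOCK: i32 = EAGAIN;

// Compile-time check documenting the aliasing the fix relies on.
const _: () = assert!(EAGAIN == EWOULDBLOCK);

/// Hypothetical helper: true when the error is a transient "try again".
/// A pattern such as `Some(EAGAIN) | Some(EWOULDBLOCK)` would have its second
/// alternative flagged as unreachable, so a single comparison is used.
fn is_transient(err: &io::Error) -> bool {
    err.raw_os_error() == Some(EAGAIN)
}
```

On platforms where the two constants differ, both values would need to be checked; the single comparison is only equivalent where they alias.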

fix(io_uring): collapse second EAGAIN/EWOULDBLOCK match in tests.rs (#1872)

1954ac1

Same unreachable-pattern fix as the prior commit for batching.rs, applied to the prefill loop in tests.rs. EWOULDBLOCK == EAGAIN on Linux, so the second arm is an unreachable pattern and fails the build under -D warnings.

oferchen force-pushed the fix/io-uring-send-backpressure-1872 branch from 4e3eed2 to 58e4329 Compare May 2, 2026 05:20

oferchen merged commit 3088f26 into master May 2, 2026
37 checks passed

oferchen deleted the fix/io-uring-send-backpressure-1872 branch May 2, 2026 13:01

oferchen added a commit that referenced this pull request May 5, 2026

fix(fast_io): prevent io_uring SEND deadlock under TCP backpressure (#1872) (#3551)

a3d6422

oferchen merged 4 commits into master from fix/io-uring-send-backpressure-1872
