
fix(tunnel-node): batch drain correctness and lock contention #695

Merged
therealaleph merged 2 commits into therealaleph:main from dazzling-no-more:fix/tunnel-node-batch-drain
May 4, 2026

Conversation

@dazzling-no-more
Contributor

Summary

Four fixes to the tunnel-node batch handler — two correctness bugs, two latency improvements.

The bugs

1. Cleanup race drops tail bytes on close (silent data loss)

When a session's read buffer exceeds TCP_DRAIN_MAX_BYTES (16 MiB) and upstream signals EOF, drain_now correctly returns eof = false and leaves the tail in the buffer for the next poll. But the cleanup loop read the raw inner.eof atomic directly, saw true, and removed the session — aborting the reader_task and dropping the tail. Hits high-bandwidth (1 Gbps+) VPS that fill the buffer between polls (issue #460-style).
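
A minimal sketch of the fixed cleanup contract, using illustrative names (`DrainResult`, `Session`) rather than the actual tunnel-node types: a session becomes eligible for removal only when the drain itself reported EOF, not when the raw atomic flipped.

```rust
use std::collections::HashMap;
use tokio::task::JoinHandle;

// Hypothetical, simplified shapes for illustration only.
struct Session {
    reader_task: JoinHandle<()>,
}

struct DrainResult {
    sid: u64,
    eof: bool, // true only once the buffer is fully drained AND upstream closed
}

// Old bug: cleanup read `inner.eof` (the raw atomic), which turns true as soon
// as upstream closes, even with a >16 MiB tail still buffered. Fix: remove a
// session only when the drain that returned eof = true has shipped its bytes.
fn cleanup_after_drain(results: &[DrainResult], sessions: &mut HashMap<u64, Session>) {
    for r in results.iter().filter(|r| r.eof) {
        if let Some(s) = sessions.remove(&r.sid) {
            s.reader_task.abort(); // safe now: no tail left to lose
        }
    }
}
```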

2. Sessions-map lock held across upstream awaits

Phase-1 data ops held the global sessions map across last_active.lock, writer.lock, write_all, and flush — head-of-line-blocking every other batch and connect/close op for the duration of an upstream write. Phase-2 drain held the map across the per-session read_buf.lock().await. The udp_data branch already did the right thing (clone Arc, drop lock, then await); TCP did not.
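
A sketch of the udp_data-style shape the TCP data path now mirrors, assuming a stripped-down `Inner` that carries only a writer half: the global map lock is held just long enough to clone the per-session Arc, and every upstream await happens after it is dropped.

```rust
use std::{collections::HashMap, sync::Arc};
use tokio::{io::AsyncWriteExt, net::tcp::OwnedWriteHalf, sync::Mutex};

// Hypothetical, simplified session state for illustration.
struct Inner {
    writer: Mutex<OwnedWriteHalf>,
}

async fn handle_tcp_data(
    sessions: &Mutex<HashMap<u64, Arc<Inner>>>,
    sid: u64,
    payload: &[u8],
) -> std::io::Result<()> {
    // Short critical section: clone the Arc, then release the global map.
    let inner = {
        let map = sessions.lock().await;
        match map.get(&sid) {
            Some(inner) => Arc::clone(inner),
            None => return Ok(()), // session already gone
        }
    }; // map lock dropped here, before any upstream await

    // The slow upstream write no longer blocks other batch/connect/close ops.
    let mut writer = inner.writer.lock().await;
    writer.write_all(payload).await?;
    writer.flush().await
}
```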

3. Mixed TCP+UDP batch paid the slower side's deadline

tokio::join!(wait_tcp, wait_udp) is conjunctive — a TCP-ready burst still paid the UDP LONGPOLL_DEADLINE (15 s) before responding. Comment said "either side", code did "both sides".
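
An illustrative reduction of the semantic difference; wait_tcp / wait_udp stand in for the real drainable-watch futures, each already bounded by its own deadline.

```rust
use std::future::Future;

// join! is conjunctive: it resolves only when BOTH sides finish, so a
// TCP-ready burst still waited out the 15 s UDP LONGPOLL_DEADLINE.
// select! resolves as soon as EITHER side does.
async fn phase2_wait(wait_tcp: impl Future<Output = ()>, wait_udp: impl Future<Output = ()>) {
    // Old: tokio::join!(wait_tcp, wait_udp);
    tokio::select! {
        _ = wait_tcp => {} // respond as soon as either side is drainable
        _ = wait_udp => {}
    }
}
```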

4. Watcher tasks leaked under select! cancellation

wait_for_any_drainable only aborts its per-session watcher tasks in a trailing loop, past every cancellation point. With the phase-2 wait flipped to select! (fix 3), the loser arm's future drops and detaches its watchers (dropping a JoinHandle doesn't abort). Each orphan holds an Arc<…Inner> and can steal a notify_one() permit from a future batch's watcher.
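
A sketch of the RAII wrapper (the PR names it AbortOnDrop; the shape here is illustrative): dropping a tokio JoinHandle only detaches the task, so the wrapper aborts it instead, on every exit path including select! cancellation.

```rust
use tokio::task::JoinHandle;

/// Aborts the wrapped task when dropped. A plain JoinHandle drop just
/// detaches, which is what let the select! loser arm leak its watchers
/// (each still holding an Arc<Inner> and able to steal a notify permit).
struct AbortOnDrop<T>(JoinHandle<T>);

impl<T> Drop for AbortOnDrop<T> {
    fn drop(&mut self) {
        self.0.abort();
    }
}

// Usage sketch: collect the per-session watcher handles into a
// Vec<AbortOnDrop<()>>; when the wait future is dropped mid-select!,
// the Vec drops and every watcher is aborted rather than orphaned.
```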

Changes

All in tunnel-node/src/main.rs:

  • Cleanup now tracks eof'd sids from the drain function's return value, not the atomic. A session is only removed once the drain that returned eof = true has shipped to the client.
  • tcp_drains / udp_drains carry the session's Arc<…Inner> alongside the sid. Phase-2 drains directly through the Arc; the global sessions map is only re-locked once at the end for removal, and only when there's something to remove.
  • Phase-1 data branch mirrors udp_data: clone Arc under map lock, drop, then write/flush.
  • tokio::join! → tokio::select! for the phase-2 wait, with explicit handling of empty-side cases so the empty-slice short-circuit can't fire the select arm before the populated side gets to wait (one possible shape is sketched after this list).
  • New AbortOnDrop newtype wraps watcher JoinHandles. Cleanup happens on every exit path including cancellation; the trailing abort loop is gone.
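
One way to express the empty-side handling (an assumption; the PR's exact guard isn't shown here) is to disable a select! branch when its side has nothing to watch, so an empty slice's immediate return can't win the race:

```rust
use std::future::Future;

async fn guarded_phase2_wait(
    tcp_count: usize,
    udp_count: usize,
    wait_tcp: impl Future<Output = ()>,
    wait_udp: impl Future<Output = ()>,
) {
    tokio::select! {
        _ = wait_tcp, if tcp_count > 0 => {}
        _ = wait_udp, if udp_count > 0 => {}
        // select! panics if every branch is disabled, so cover the
        // nothing-to-wait-for case; the real handler would just skip the wait.
        else => {}
    }
}
```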

Test plan

  • cargo test --manifest-path tunnel-node/Cargo.toml — 35/35 pass (33 pre-existing + 2 new).
  • cargo check --tests — clean, no new warnings.
  • New: batch_keeps_over_cap_session_until_tail_is_drained — primes a >16 MiB buffer + atomic eof, asserts the session survives the first poll with the 4096-byte tail intact, then asserts it's reaped on the second poll once drain_now actually returns eof. Locks down fix 1.
  • New: batch_tcp_ready_does_not_pay_udp_longpoll_deadline — TCP-ready / UDP-idle pure-poll batch must return in <1 s (vs. the 15 s LONGPOLL_DEADLINE). Locks down fix 3.

@github-actions (bot) added the type: fix label (fix: PR — auto-applied by release-drafter) May 3, 2026
@therealaleph
Owner

@dazzling-no-more — all four fixes are real bugs and the new tests lock them down precisely. Reviewed locally:

  • Fix 1 (eof tail drop): correct — the cleanup loop reading the atomic was the wrong contract. Tracking eof'd sids from drain_now's return value is the right authority. Test batch_keeps_over_cap_session_until_tail_is_drained reproduces it.
  • Fix 2 (lock contention): correct — phase-1 holding the sessions map across write_all was a real head-of-line block. The mirror-the-udp_data shape is what we want. The Arc<Inner> carry-through in tcp_drains / udp_drains is clean.
  • Fix 3 (TCP+UDP join → select): subtle but correct — tokio::join! waits for both, select! waits for first. Comment said "either side", code did "both". Test batch_tcp_ready_does_not_pay_udp_longpoll_deadline is a great regression guard.
  • Fix 4 (AbortOnDrop for watchers): correct — JoinHandle drop without abort just detaches in tokio; with the select! flip from fix 3 the loser arm's watcher tasks would leak Arc<Inner> permits. RAII abort is the right pattern.

35/35 tests pass locally. Merging — will ship in v1.9.9.

Thanks for the rigor on these; especially appreciate the per-fix root-cause + test pairing.


[reply via Anthropic Claude | reviewed by @therealaleph]

@therealaleph therealaleph merged commit 38d9d9f into therealaleph:main May 4, 2026
1 check passed
therealaleph added a commit that referenced this pull request May 4, 2026
…rectness

Android (#700 from @ilok67):
- Reordered MhrvVpnService.teardown() to call Native.stopProxy() FIRST. The previous order (tun2proxy.stop → tun.close → join → stopProxy) crashed with SIGSEGV ~2 s after Disconnect: tun2proxy's worker thread was blocked in native code on a SOCKS5 socket read; after the 2 s + 4 s timeouts expired with the worker still alive, Native.stopProxy freed the runtime including that socket, and the worker hit a use-after-free on its next read. The old comment claimed "runtime shutdown will knock the rest of the world over" — wrong, Native.stopProxy can't forcibly terminate a separate native thread, it just frees memory the other thread is still using. The new order closes the socket first, the worker's blocking read returns with EOF, the worker exits cleanly through its error path, and the join is then near-instant.

tunnel-node (PR #695 from @dazzling-no-more, merged):
- Cleanup now tracks eof'd sids from drain_now's return value, not the raw atomic — was silently dropping the tail on >16 MiB buffers when EOF arrived between polls.
- Phase-1 `data` op no longer holds the sessions map across upstream write/flush — was head-of-line-blocking every other batch op.
- Mixed TCP+UDP batch wait switched from tokio::join! to tokio::select! — was paying the UDP LONGPOLL_DEADLINE (15 s) on TCP-ready bursts.
- Watcher tasks now wrapped in AbortOnDrop newtype — was leaking Arc<Inner> permits when select!'s loser arm dropped its future.
- 2 new regression tests, 35/35 pass.

Example configs:
- config.exit-node.example.json: added aistudio.google.com + ai.google.dev to default hosts (#701 — AI Studio sanctions Iran IPs).
- config.fronting-groups.example.json: PR #696 from @Shjpr9 added Reddit/Fastly/Pinterest/CNN/BuzzFeed family domains on the Fastly 151.101.x.x edge.

Tests: 179 lib + 35 tunnel-node green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dazzling-no-more dazzling-no-more deleted the fix/tunnel-node-batch-drain branch May 4, 2026 05:11