Relay Upstream Reconnect #42

Merged
maxholman merged 3 commits into block65:main from maxholman:feat/relay-reconnect
Feb 28, 2026

@maxholman

Relay Upstream Reconnect

Scope

crates/daemon/src/mode/relay.rs — add a reconnect loop around the upstream
connection so the relay recovers when its upstream peer drops.

Out of scope

  • Keeping downstream peers alive across upstream reconnects (Phase 13b/13c)
  • Role transitions or Indeterminate state (Phase 13b)
  • Any changes to entry, exit, or transport crates

Why

The relay connects to its upstream peer once, extracts the channel pair, then
runs the listener loop forever against those channels. If the upstream drops,
the channels go dead but the listener keeps running — downstream peers remain
connected and silently receive nothing. The relay never attempts to reconnect.

This is a hard blocker for Phase 13c (auto-negotiation), which assumes ordering
independence and relay resilience.

Notes

  • connect_with_retry already exists in crates/daemon/src/transport.rs — use it
  • On upstream disconnect: tear down listener, reconnect upstream, restart listener.
    Downstream peers will see a transport drop and reconnect via their own retry
    loops. This is acceptable — full session continuity across upstream reconnects
    is Phase 13b scope.
  • The outer reconnect loop must also retry forever (same backoff strategy as
    connect_with_retry), not return an error.
  • Log clearly on upstream loss and on reconnect so operators can observe the
    behaviour.
  • Tests: at minimum a unit/integration test that simulates upstream drop and
    verifies the relay re-establishes the connection.
  • just check must pass before review.
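
The notes above can be sketched as an outer loop. This is a minimal, synchronous illustration using only the standard library, not the daemon's actual code: `run_relay`, `ListenerExit`, and `next_backoff` are hypothetical names, the doubling backoff is an assumption (the real strategy lives in connect_with_retry), and the `Shutdown` variant exists only so the example terminates.

```rust
use std::time::Duration;

/// Why the listener stopped (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum ListenerExit {
    UpstreamLost,
    Shutdown,
}

/// Doubling backoff capped at `max` (assumed; the real strategy may differ).
fn next_backoff(current: Duration, max: Duration) -> Duration {
    (current * 2).min(max)
}

/// Connect (retrying until it succeeds), run the listener, and start over
/// whenever the upstream is lost. Returns the number of upstream connections
/// established before a clean shutdown.
fn run_relay<C, L, U>(mut connect: C, mut listen: L, initial: Duration, max: Duration) -> u32
where
    C: FnMut() -> Result<U, String>,
    L: FnMut(U) -> ListenerExit,
{
    let mut connections = 0;
    loop {
        // Inner retry loop: never gives up, only backs off.
        let mut delay = initial;
        let upstream = loop {
            match connect() {
                Ok(u) => break u,
                Err(e) => {
                    eprintln!("upstream connect failed: {e}; retrying in {delay:?}");
                    std::thread::sleep(delay);
                    delay = next_backoff(delay, max);
                }
            }
        };
        connections += 1;
        eprintln!("upstream connected (connection #{connections})");

        // The listener owns the upstream until it reports why it stopped.
        match listen(upstream) {
            ListenerExit::UpstreamLost => {
                eprintln!("upstream lost; tearing down listener and reconnecting");
            }
            ListenerExit::Shutdown => return connections,
        }
    }
}

fn main() {
    // Simulated upstream: the first connect attempt fails, later ones succeed.
    let mut attempts = 0;
    let connect = || {
        attempts += 1;
        if attempts == 1 {
            Err("connection refused".to_string())
        } else {
            Ok(attempts)
        }
    };
    // Simulated listener: the first session ends with an upstream drop,
    // the second with a clean shutdown.
    let mut sessions = 0;
    let listen = |_upstream: u32| {
        sessions += 1;
        if sessions == 1 {
            ListenerExit::UpstreamLost
        } else {
            ListenerExit::Shutdown
        }
    };
    let n = run_relay(connect, listen, Duration::from_millis(1), Duration::from_millis(8));
    println!("upstream connections made: {n}");
}
```

Passing the connect and listen steps as closures keeps the loop testable without a network: a test can simulate an upstream drop by returning `UpstreamLost` and then check that a second connection was made.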

maxholman and others added 3 commits February 28, 2026 17:32
The relay now uses connect_loop instead of connect_with_retry, so when
the source peer connection drops the relay tears down the listener,
reconnects, and restarts. Connected peers reconnect via their own retry
loops. Uses ConnectionTasks::wait_for_disconnect() in a tokio::select!
to detect source peer death — matching the exit node pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Introduces three quality-of-life improvements:

1. SocketAddrExt trait: Replaces free function normalize_socket_addr()
   with addr.normalize() method on SocketAddr.

2. From implementations: Replaces proto conversion helper functions
   (node_role_to_proto, peer_to_proto, route_to_proto, peer_event_to_proto)
   with standard From<T> impls for ergonomic .into() usage.

3. AsyncProtoRead/AsyncProtoWrite traits: Moves length-delimited protobuf
   reading/writing into extension traits on AsyncRead/AsyncWrite streams.
   Replaces verbose calls with stream.read_proto::<T>(mtu) and
   stream.write_proto(&msg).
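
A rough sketch of the first two patterns, using only the standard library. The behaviour of `normalize()` here (unwrapping IPv4-mapped IPv6 addresses) is an assumption about what the original helper did, and both `NodeRole` enums are hypothetical stand-ins for the crate's domain type and its generated protobuf type.

```rust
use std::net::{IpAddr, SocketAddr};

/// Extension trait that adds `normalize()` to `SocketAddr`.
trait SocketAddrExt {
    fn normalize(self) -> SocketAddr;
}

impl SocketAddrExt for SocketAddr {
    // Assumed behaviour: turn `[::ffff:a.b.c.d]:port` into `a.b.c.d:port`
    // and leave every other address unchanged.
    fn normalize(self) -> SocketAddr {
        match self.ip() {
            IpAddr::V6(v6) => match v6.to_ipv4_mapped() {
                Some(v4) => SocketAddr::new(IpAddr::V4(v4), self.port()),
                None => self,
            },
            IpAddr::V4(_) => self,
        }
    }
}

/// Hypothetical domain type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NodeRole {
    Entry,
    Relay,
    Exit,
}

/// Hypothetical stand-in for the generated protobuf module.
mod proto {
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum NodeRole {
        Entry,
        Relay,
        Exit,
    }
}

// A standard `From` impl replaces a free function such as `node_role_to_proto`.
impl From<NodeRole> for proto::NodeRole {
    fn from(role: NodeRole) -> Self {
        match role {
            NodeRole::Entry => proto::NodeRole::Entry,
            NodeRole::Relay => proto::NodeRole::Relay,
            NodeRole::Exit => proto::NodeRole::Exit,
        }
    }
}

fn main() {
    let mapped: SocketAddr = "[::ffff:192.0.2.1]:4433".parse().unwrap();
    println!("{}", mapped.normalize());

    for role in [NodeRole::Entry, NodeRole::Relay, NodeRole::Exit] {
        let wire: proto::NodeRole = role.into();
        println!("{role:?} -> {wire:?}");
    }
}
```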

These changes eliminate custom function names in favor of idiomatic Rust
patterns per C-METHOD guidelines, improving discoverability and reducing
API surface area.
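
The extension-trait approach to framing can be illustrated with a synchronous analog. This sketch swaps in std::io::Read/Write for the async traits and raw byte payloads for protobuf messages, and it assumes a varint length prefix. `ProtoRead`, `ProtoWrite`, `read_frame`, and `write_frame` are hypothetical names, not the crate's API.

```rust
use std::io::{self, Cursor, Read, Write};

/// Adds length-delimited writing to every `Write` implementor.
trait ProtoWrite: Write {
    fn write_frame(&mut self, payload: &[u8]) -> io::Result<()> {
        // Varint length prefix: 7 bits per byte, high bit marks continuation.
        let mut len = payload.len() as u64;
        loop {
            let byte = (len & 0x7f) as u8;
            len >>= 7;
            if len == 0 {
                self.write_all(&[byte])?;
                break;
            }
            self.write_all(&[byte | 0x80])?;
        }
        self.write_all(payload)
    }
}
impl<W: Write + ?Sized> ProtoWrite for W {}

/// Adds length-delimited reading to every `Read` implementor.
trait ProtoRead: Read {
    fn read_frame(&mut self, max_len: usize) -> io::Result<Vec<u8>> {
        let mut len: u64 = 0;
        let mut shift = 0;
        loop {
            let mut byte = [0u8; 1];
            self.read_exact(&mut byte)?;
            len |= u64::from(byte[0] & 0x7f) << shift;
            if byte[0] & 0x80 == 0 {
                break;
            }
            shift += 7;
            if shift >= 64 {
                return Err(io::Error::new(io::ErrorKind::InvalidData, "length varint too long"));
            }
        }
        // Reject oversized frames before allocating, similar in spirit to an MTU limit.
        if len > max_len as u64 {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds limit"));
        }
        let mut buf = vec![0u8; len as usize];
        self.read_exact(&mut buf)?;
        Ok(buf)
    }
}
impl<R: Read + ?Sized> ProtoRead for R {}

fn main() -> io::Result<()> {
    let mut wire = Vec::new();
    wire.write_frame(b"hello")?;
    wire.write_frame(b"relay")?;

    let mut stream = Cursor::new(wire);
    let first = stream.read_frame(1500)?;
    let second = stream.read_frame(1500)?;
    println!("{}", String::from_utf8_lossy(&first));
    println!("{}", String::from_utf8_lossy(&second));
    Ok(())
}
```

The blanket impls are what make the methods appear on any stream once the trait is in scope, which is the discoverability benefit the commit message describes.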

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ethods

Moves the massive control loop and connection handler into idiomatic
methods on their respective structs:

- ControlChannels::run: The control loop is now a method on the channel
  struct instead of a standalone function. Includes a helper handle_message
  method to break down the monolithic control matching. Removes the
  clippy::too_many_lines suppression.

- ConnectionParams::run: The entry node's connection handler is now an
  idiomatic method. Resolves complex borrow-checker issues by destructuring
  self and using standalone helper functions for data tasks and the main
  loop. Follows the C-METHOD guidelines and improves testability.

All clippy::too_many_arguments and clippy::too_many_lines suppressions
related to these functions have been removed. The architecture is now
more modular and follows Rust API Guidelines.
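
The "destructure self, then call standalone helpers" pattern might look roughly like the sketch below. The fields of `ConnectionParams`, the `Stats` type, and `record_packet` are all hypothetical; only the shape of the technique is taken from the commit message.

```rust
/// Hypothetical stand-in for the entry node's connection state.
struct ConnectionParams {
    peer: String,
    inbound: Vec<Vec<u8>>,
    stats: Stats,
}

#[derive(Debug, Default, PartialEq)]
struct Stats {
    packets: usize,
    bytes: usize,
}

impl ConnectionParams {
    /// Consumes `self` and destructures it, so each field becomes an
    /// independent binding that can be borrowed or moved on its own.
    fn run(self) -> Stats {
        let Self { peer, inbound, mut stats } = self;
        for packet in &inbound {
            // `peer` is borrowed immutably and `stats` mutably at the same
            // time; neither borrow goes through `self`.
            record_packet(&peer, packet, &mut stats);
        }
        stats
    }
}

/// Standalone helper: takes only the pieces it needs, not `&mut self`.
fn record_packet(peer: &str, packet: &[u8], stats: &mut Stats) {
    stats.packets += 1;
    stats.bytes += packet.len();
    println!("{peer}: {} bytes", packet.len());
}

fn main() {
    let params = ConnectionParams {
        peer: "peer-a".to_string(),
        inbound: vec![vec![0; 3], vec![0; 5]],
        stats: Stats::default(),
    };
    let stats = params.run();
    println!("{stats:?}");
}
```

Because the helper receives explicit borrows, the compiler can see that the borrowed fields are disjoint, and the helper can be unit-tested without constructing the whole struct.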

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@maxholman maxholman merged commit 9c32aad into block65:main Feb 28, 2026
4 checks passed
maxholman added a commit that referenced this pull request May 6, 2026
Closes 8 open dependabot alerts via transitive lockfile bumps:

- rustls-webpki 0.103.9 -> 0.103.13 — CRL/URI/wildcard name-constraint
  handling and panic-on-malformed-CRL DoS (alerts #27 #42 #43 #47)
- rand 0.8.5 -> 0.8.6 and 0.9.2 -> 0.9.4 — soundness fix for callers
  using a custom logger with rand::rng() (#45 #46)
- h3 1.15.8 -> 1.15.11 (website) — path traversal via double-decoded
  %252e%252e in serveStatic and SSE event injection via unsanitized
  carriage return (#24 #25)

No direct dependency edits; all bumps are transitive.