fix(ssh): add timeouts to SSH/WebSocket connections and per-channel state #219

Open
drew wants to merge 2 commits into main from ssh-flakyness/dn
@drew drew commented Mar 11, 2026

Summary

  • Add timeout handling across the SSH connection stack to prevent indefinite hangs during connection establishment
  • Add SSH_PROXY_ACCEPT_TIMEOUT (5s) and SSH_PROXY_CONNECT_TIMEOUT (10s) for SSH proxy operations
  • Add UPSTREAM_HANDSHAKE_TIMEOUT (10s) for WebSocket upstream connections
  • Refactor SshHandler to maintain per-channel state (HashMap<ChannelId, ChannelState>) instead of global state, properly isolating pty_master, input_sender, and pty_request per channel
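
The per-channel refactor described in the last bullet might look roughly like the sketch below. It is an illustration only, not the code in this PR: `ChannelId` is a hypothetical newtype standing in for the SSH library's identifier, the field types are simplified placeholders, and `open_channel`, `data`, and `channel_count` are assumed names. Only the three field names and the `HashMap<ChannelId, ChannelState>` shape come from the summary.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for the SSH library's channel identifier.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ChannelId(pub u32);

/// State owned by exactly one channel. Field types are placeholders.
#[derive(Default)]
pub struct ChannelState {
    /// Placeholder for the PTY master handle (a raw fd number here).
    pub pty_master: Option<i32>,
    /// Sending half used to forward client input to this channel's process.
    pub input_sender: Option<Sender<Vec<u8>>>,
    /// Placeholder for the recorded PTY request (terminal name only).
    pub pty_request: Option<String>,
}

#[derive(Default)]
pub struct SshHandler {
    channels: HashMap<ChannelId, ChannelState>,
}

impl SshHandler {
    /// Registers a channel and returns the receiver its process would read from.
    pub fn open_channel(&mut self, id: ChannelId) -> Receiver<Vec<u8>> {
        let (tx, rx) = channel();
        let state = ChannelState {
            input_sender: Some(tx),
            ..ChannelState::default()
        };
        self.channels.insert(id, state);
        rx
    }

    /// Forwards data only to the channel it arrived on.
    /// Returns false if the channel is unknown or its input is closed.
    pub fn data(&self, id: ChannelId, bytes: &[u8]) -> bool {
        self.channels
            .get(&id)
            .and_then(|state| state.input_sender.as_ref())
            .map(|tx| tx.send(bytes.to_vec()).is_ok())
            .unwrap_or(false)
    }

    /// EOF closes the input of the matching channel and leaves the others open.
    pub fn channel_eof(&mut self, id: ChannelId) {
        if let Some(state) = self.channels.get_mut(&id) {
            // Dropping the sender signals end of input to the receiver.
            state.input_sender = None;
        }
    }

    /// Removes all state for one channel; returns whether anything was removed.
    pub fn cleanup_channel(&mut self, id: ChannelId) -> bool {
        self.channels.remove(&id).is_some()
    }

    /// Number of channels that currently have state.
    pub fn channel_count(&self) -> usize {
        self.channels.len()
    }
}
```

Keying everything by `ChannelId` means an EOF or close on one channel cannot tear down the PTY or input pipe of another channel on the same connection, which is the isolation the new tests check.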

Changes

  • navigator-server/src/grpc.rs: Add timeouts to start_single_use_ssh_proxy for accept and connect operations
  • navigator-server/src/ssh_tunnel.rs: Extract establish_upstream function, add TunnelSetupError for cleaner error handling with proper HTTP status codes (504 for timeouts, 502 for other errors)
  • navigator-sandbox/src/ssh.rs: Refactor to per-channel state, add logging for reader drain timeouts
  • navigator-cli/src/ssh.rs and edge_tunnel.rs: Add upstream handshake timeouts
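
As a rough illustration of the `establish_upstream` and `TunnelSetupError` changes listed above, the following std-only sketch shows a handshake wait bounded by a timeout and the 504/502 mapping. The real code is presumably async and talks to a socket; here an `mpsc::Receiver<String>` stands in for the upstream connection. The variant names, the `status_code` method, the function signature, and the literal `"OK"` reply are assumptions made for the example. Only the type name, the function name, the 10s value, and the status codes come from this PR's description.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Value taken from the PR summary (10s).
pub const UPSTREAM_HANDSHAKE_TIMEOUT: Duration = Duration::from_secs(10);

/// Hypothetical shape; the PR description only names the type.
#[derive(Debug, PartialEq)]
pub enum TunnelSetupError {
    /// A step did not finish in time; the payload names the step.
    Timeout(&'static str),
    /// Any other upstream failure.
    Upstream(String),
}

impl TunnelSetupError {
    /// Maps the error to the HTTP status described in the PR:
    /// 504 for timeouts, 502 for everything else.
    pub fn status_code(&self) -> u16 {
        match self {
            TunnelSetupError::Timeout(_) => 504,
            TunnelSetupError::Upstream(_) => 502,
        }
    }
}

/// Waits for the upstream handshake reply, giving up after `timeout`.
/// The receiver is a stand-in for the upstream connection.
pub fn establish_upstream(
    responses: &Receiver<String>,
    timeout: Duration,
) -> Result<(), TunnelSetupError> {
    let reply = responses.recv_timeout(timeout).map_err(|err| match err {
        RecvTimeoutError::Timeout => TunnelSetupError::Timeout("upstream handshake"),
        RecvTimeoutError::Disconnected => {
            TunnelSetupError::Upstream("upstream closed before replying".to_string())
        }
    })?;

    // Assumed wire format: a reply of exactly "OK" means the tunnel is ready.
    if reply.trim() == "OK" {
        Ok(())
    } else {
        Err(TunnelSetupError::Upstream(format!(
            "unexpected handshake response: {reply}"
        )))
    }
}
```

A caller would pass `UPSTREAM_HANDSHAKE_TIMEOUT` as the deadline and use `status_code()` to build the HTTP response, so a slow upstream surfaces as a 504 instead of a hung client.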

Testing

All SSH-related tests pass, including new tests for:

  • channel_data_routes_only_to_matching_channel
  • channel_eof_only_closes_matching_channel_input
  • cleanup_channel_removes_only_matching_state
  • establish_upstream_times_out_waiting_for_handshake_response
  • establish_upstream_rejects_non_ok_handshake_response

…tate

Add timeout handling across the SSH connection stack to prevent
indefinite hangs during connection establishment:

- Add SSH_PROXY_ACCEPT_TIMEOUT (5s) and SSH_PROXY_CONNECT_TIMEOUT (10s)
- Add UPSTREAM_HANDSHAKE_TIMEOUT (10s) for WebSocket upstream connections
- Extract establish_upstream for cleaner connection setup
- Refactor SshHandler to maintain per-channel state instead of global
  state, properly isolating pty_master, input_sender, and pty_request
  per ChannelId

This prevents clients from hanging indefinitely when connection
establishment fails or times out.

@drew drew self-assigned this Mar 11, 2026