Capability Handshake & Proper Ping #38
Merged
maxholman merged 6 commits into block65:main on Feb 28, 2026
Phase 13 design documents covering capability handshake (13a), indeterminate mode (13b), auto-negotiation (13c), hints (13d), route announcement (13e), security posture (13f), and mode transitions (13g).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ExitNodeHello → Handshake message with nested Capabilities struct
- Add Ping/Pong messages for latency measurement
- Add RoleHint and HintLevel for future role negotiation
- Move NodeRole to data.proto (canonical location), remove DataNodeRole
- Rename ROLE_UNKNOWN → ROLE_INDETERMINATE (phase 13b terminology)
- control.proto references wallhack.data.NodeRole via import
- management.proto: add tun_capable/listening/connecting to PeerInfo and StatusResponse, reserve removed capability field
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- hmac.rs: generic HMAC-SHA256 compute/verify using ring
- psk.rs: PSK proof via TLS channel binding (RFC 9266 tls-exporter), serialization uses protobuf encode_to_vec for determinism
- Rename bridge.rs → protocol.rs, add ControlChannels struct, bidirectional handshake support in control loop, ping/pong latency tracking, mandatory protocol tests
- types.rs: NodeRole import moved from control to data module, ROLE_UNKNOWN → ROLE_INDETERMINATE
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- QUIC and WS clients send Handshake with PSK proof on connect
- QUIC and WS servers read peer Handshake, send own back
- AcceptResult carries peer_handshake, latency_rx, channel_binding
- ConnectResult carries peer handshake via oneshot
- PeerInfo and NodeStatus use Capabilities struct instead of 3 bools
- update_capabilities() takes &Capabilities
- Handler and IPC layer updated for Capabilities grouping
- Add zeroize dep for PSK memory safety
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- All daemon modes (entry, exit, relay) populate local_handshake with Capabilities struct in ServerOptions
- Entry mode validates PSK proof and updates peer capabilities
- handle_connection refactored: ConnectionParams struct, validate_handshake(), spawn_data_tasks(), run_connection_loop()
- Zeroizing<String> for PSK across daemon config pipeline
- CLI and API handlers updated for Capabilities field access
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move `serialize_handshake_fields` free function into `Handshake::serialize_for_proof()` method (better API locality).
- Bump binary size thresholds to ~1% above current measured sizes after handshake/PSK/proto additions (+59KB).
- Fix transport-modes copy (WebSocket RTT description).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
maxholman added a commit that referenced this pull request on May 6, 2026
Sweep of website/ deps to latest within ranges, plus a vite downgrade from 8 -> 7 to match astro's transitive vite (7.3.2) and avoid a rolldown regression with @tailwindcss/vite 4.2.4. Closes alerts #28 #29 #30 #31 #33 #34 #35 #36 #37 #38 #39 #40 #44 #48 covering vite, picomatch, postcss, yaml, astro, smol-toml.
- vite ^8.0.1 -> ^7.3.2 (drops the now-redundant vite 8 lineage; astro pulls 7.3.2 transitively, which is the patched version)
- astro 6.0.6 -> 6.2.2 (#44)
- @tailwindcss/vite 4.2.2 -> 4.2.4
- smol-toml: lockfile bump to 1.6.1 (#28)
- postcss: lockfile bump to 8.5.14 (#48)
- picomatch: lockfile bumps to 2.3.2 + 4.0.4 (#29 #30 #39 #40)
- yaml is now omitted entirely (it was an optional vite peer)
Verified: pnpm build succeeds; no @tailwindcss/vite peer-dep warnings.
Capability Handshake & Proper Ping
Scope
Wire protocol foundation: replace one-way ExitNodeHello with bidirectional Capability exchange, wire up proper ping/pong with latency tracking, remove dead capability code, rename bridge.rs.
Touches: crates/wire/proto/ (data, control, management protos), crates/core/src/transport/protocol.rs (renamed from bridge.rs), crates/core/src/control/peers.rs, crates/core/src/node_api.rs, crates/daemon/src/mode/{entry,exit,relay}.rs, crates/core/src/client/{quic,ws}/mod.rs, crates/core/src/server/{quic,ws}/mod.rs.
Out of scope
- --prefer, --exclude-role, --fixed-role) (Phase 13d)
- RoleTransition messages (Phase 13g)
- --psk suppressing auto-negotiate) (Phase 13f)
- routes and hint fields are defined in the proto but not acted on
Why
ExitNodeHello is one-directional (connector → acceptor only), leaks the PSK in plaintext inside the TLS tunnel, and carries no capability information. Every
subsequent auto-negotiation phase depends on both sides having a full picture of
each other's capabilities before tunnel traffic flows. This phase establishes
that foundation and also fixes the PSK authentication to use HMAC channel
binding instead of plaintext.
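A minimal sketch of how a channel-bound PSK proof could be assembled and checked, assuming the proof is a MAC keyed by the PSK over the TLS exporter output followed by the handshake bytes with psk_proof left out. The trimmed-down Handshake struct, the byte layout, and the injected `mac` parameter are hypothetical; the PR itself uses ring's HMAC-SHA256 and protobuf encode_to_vec, and the exact way it combines key and message is not stated here.

```rust
/// Hypothetical, trimmed-down stand-in for the Handshake message.
#[derive(Clone, Debug, Default, PartialEq)]
struct Handshake {
    name: String,
    tun_capable: bool,
    psk_proof: Vec<u8>,
}

impl Handshake {
    /// Serialize every field except `psk_proof`, so the proof never covers itself.
    /// Illustrative byte layout only (the PR encodes with protobuf instead).
    fn serialize_for_proof(&self) -> Vec<u8> {
        let mut out = Vec::new();
        out.extend_from_slice(&(self.name.len() as u32).to_be_bytes());
        out.extend_from_slice(self.name.as_bytes());
        out.push(u8::from(self.tun_capable));
        out
    }
}

/// Assumed construction: MAC(key = PSK, message = channel_binding || handshake bytes).
/// `mac` is injected so the sketch stays std-only; HMAC-SHA256 would go here.
/// The exporter output has a fixed length, so plain concatenation is unambiguous.
fn compute_proof<M>(mac: M, psk: &[u8], channel_binding: &[u8], handshake: &Handshake) -> Vec<u8>
where
    M: Fn(&[u8], &[u8]) -> Vec<u8>,
{
    let mut message = channel_binding.to_vec();
    message.extend_from_slice(&handshake.serialize_for_proof());
    mac(psk, &message)
}

/// Recompute the proof locally and compare it with the one the peer sent.
fn verify_proof<M>(mac: M, psk: &[u8], channel_binding: &[u8], handshake: &Handshake) -> bool
where
    M: Fn(&[u8], &[u8]) -> Vec<u8>,
{
    let expected = compute_proof(mac, psk, channel_binding, handshake);
    constant_time_eq(&expected, &handshake.psk_proof)
}

/// Compare without an early exit on the first mismatching byte.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}
```

Because the channel binding differs per TLS session, a proof captured on one connection should not verify on another, and the PSK itself never crosses the wire.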
Notes
- docs/tasks/13a-capability-handshake.md: items 1-6 are the implementation checklist. Tests section is mandatory.
- export_keying_material() channel binding. Requires threading TLS session handle from quinn/rustls into handshake path. After validation, zeroize the plaintext PSK from memory.
- Capability immediately after transport connects and wait to receive. No ordering; bidi control stream supports this already.
- Capability directly in accept() before spawning run_control_loop, bypassing run_control_stream_acceptor. The new bidirectional exchange needs to unify this: both sides should go through the same exchange path.
- run_control_loop. update_latency() and PeerInfo.latency_ms exist in the registry. Missing link: pong receipt → latency calculation → registry.update_latency().
- bridge.rs → protocol.rs: done.
Progress (impl agent)
Branch: feat/capability-handshake (from main at c712ddc)
Build status: just check passes. fmt, clippy (slim + default), tests (172 pass), musl cross-build, and VM smoke/resilience tests are all green. No warnings.
Completed (prior sessions)
- ExitNodeHello → Capability in all protos.
- NodeCapability, set_relay_capability(), capability_to_proto() all removed. Registry uses tun_capable/listening/connecting fields directly.
- Capability via control stream immediately after connecting.
- Capability as first control message with 10s timeout.
- AcceptResult carries peer_capability.
- entry.rs validates PSK (temporary raw bytes), exit.rs reads peer_capability().
- local_capability: Option<Capability> to ServerOptions.
- accept() now writes server's Capability to control stream after reading client's Capability (before spawning control loop).
- local_capability in their ServerOptions construction.
- run_control_stream_initiator now accepts capability_tx: Option<oneshot::Sender<Capability>> and passes it to run_control_loop.
- loop; the receiver is stored in ConnectResult.
- ConnectResult gained peer_capability_rx: Option<oneshot::Receiver<Capability>> with take_peer_capability_rx().
- exit.rs: renamed _node_name → node_name (now used for local_capability).
- run_control_loop signature changed: pong_tx: Option<&mpsc::Sender<Pong>> → latency_tx: Option<&mpsc::Sender<f64>>.
- now_ms - pong.timestamp_ms and sends the f64 via latency_tx (instead of forwarding raw Pong).
- accept() creates latency_tx/latency_rx mpsc channel; passes latency_tx to control loop, latency_rx via AcceptResult.
- entry.rs handle_connection takes latency_rx parameter and uses it in the select loop: periodic latency updates registry, one-shot REPL ping injects Ping via control_tx and stashes oneshot in pending_ping.
Completed (previous session)
- just check passes: DONE
Completed (this session, review fixes)
Capability → Handshake rename: DONE
- message Capability → message Handshake in data.proto.
- capability = 1 → handshake = 1 in control.proto ControlMessage.
- peer_capability → peer_handshake, local_capability → local_handshake, with_capability → with_handshake, update_capability → update_handshake, serialize_capability_fields → serialize_handshake_fields, etc.
RoleHint proto shape fixed: DONE
- enum RoleHint → message RoleHint { HintLevel level = 1; DataNodeRole target = 2; }
- enum HintLevel (PREFER/EXCLUDE/FIXED) and enum DataNodeRole in data.proto.
- psk.rs serialization and test to handle RoleHint message fields.
Channel proliferation resolved: DONE
- ControlChannels struct in protocol.rs grouping outgoing_rx, handshake_tx, latency_tx, control_response_tx.
- run_control_loop, run_control_stream_initiator, run_control_stream_acceptor signatures simplified from 7 args to 4.
handle_connection refactored: DONE
- ConnectionParams<T> struct groups 8 arguments.
- validate_handshake() extracted: PSK proof + identity validation.
- spawn_data_tasks() extracted: incoming/outgoing data task spawning.
- run_connection_loop() extracted: manager + ping/latency select loop.
- #[allow(clippy::too_many_arguments, clippy::too_many_lines)] removed.
PSK zeroize: DONE
- zeroize = "1" dep to wallhack-core, wallhackd, wallhack-cli.
- Option<String> → Option<Zeroizing<String>> across entire PSK pipeline: GlobalConfig, SecurityParams, ServerConfig, ClientConfig, QuicClient, QuicServer, WsServer, ConnectionParams.
Mandatory tests: DONE
- test_handshake_exchange: concurrent bidirectional handshake via MockBiStream.
- test_malformed_handshake: non-Handshake first message → handshake_tx unfulfilled.
- test_ping_latency: Ping auto-reply verified; Pong with past timestamp → latency computed and forwarded via latency_tx.
- test_periodic_ping: timer fires at configured interval (start_paused).
just check passes: all quality gates green (fmt, clippy slim+default, tests, musl cross-build, VM smoke/resilience, website build).
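The behaviour those tests pin down can be sketched with std channels: the first control message fulfils the handshake sender only if it is a Handshake, a Ping is answered with a Pong carrying the same timestamp, and a Pong is turned into a latency value. This is an illustrative, synchronous stand-in rather than the PR's async loop. The message shapes, the `outgoing_tx` field, the injected `now_ms` clock, and the use of an `Option<Sender<_>>` in place of a oneshot are all assumptions.

```rust
use std::sync::mpsc::{Receiver, Sender};

#[derive(Debug, Clone, PartialEq)]
struct Handshake {
    name: String,
}

#[derive(Debug, Clone, PartialEq)]
enum ControlMessage {
    Handshake(Handshake),
    Ping { timestamp_ms: u64 },
    Pong { timestamp_ms: u64 },
}

/// Hypothetical grouping of channel ends, loosely modelled on ControlChannels.
struct ControlChannels {
    /// Replies that should be written back to the peer.
    outgoing_tx: Sender<ControlMessage>,
    /// Fulfilled at most once; dropped unfulfilled if the first message is not a Handshake.
    handshake_tx: Option<Sender<Handshake>>,
    /// Receives one latency sample (in milliseconds) per Pong.
    latency_tx: Sender<f64>,
}

fn run_control_loop(
    incoming: Receiver<ControlMessage>,
    mut channels: ControlChannels,
    now_ms: impl Fn() -> u64,
) {
    let mut first = true;
    for message in incoming {
        match message {
            ControlMessage::Handshake(handshake) => {
                // Only a Handshake in first position counts; later ones are ignored here.
                if first {
                    if let Some(tx) = channels.handshake_tx.take() {
                        let _ = tx.send(handshake);
                    }
                }
            }
            ControlMessage::Ping { timestamp_ms } => {
                // Auto-reply, echoing the sender's timestamp.
                let _ = channels.outgoing_tx.send(ControlMessage::Pong { timestamp_ms });
            }
            ControlMessage::Pong { timestamp_ms } => {
                // Latency is "now" minus the echoed timestamp, forwarded as f64.
                let latency_ms = now_ms().saturating_sub(timestamp_ms) as f64;
                let _ = channels.latency_tx.send(latency_ms);
            }
        }
        // After the first message, whatever it was, the handshake slot is closed.
        first = false;
        channels.handshake_tx = None;
    }
}
```

Dropping the handshake sender after the first message is what lets the waiting side observe a malformed exchange as a closed channel instead of waiting for the timeout.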
Completed (cleanup session)
Renamed cap_ variables to handshake: DONE
- cap_tx/cap_rx → handshake_tx/handshake_rx in QUIC and WS clients.
- cap_result → handshake_result in QUIC and WS servers.
- local_cap/let mut cap → local/let mut handshake in server accept paths.
- cap variables building Handshake structs → handshake in both clients.
Capability vs handshake naming audit: DONE
- update_handshake() → update_capabilities() (peers.rs + entry.rs call site).
- cap parameter in serialize_handshake_fields, compute_proof, verify_proof → handshake.
- cap/cap1/cap2 → handshake/handshake1/handshake2.
- different_capabilities_produce_different_proofs → different_handshakes_produce_different_proofs.
DataNodeRole dedup: DONE
- NodeRole from control.proto to data.proto. Removed DataNodeRole.
- ROLE_UNKNOWN → ROLE_INDETERMINATE (Phase 13b terminology).
- RoleHint.target now uses NodeRole directly (no duplicate enum).
- control.proto references wallhack.data.NodeRole via existing import.
- types.rs and handler.rs now import from wallhack_wire::data::NodeRole.
Capabilities struct grouping: DONE
- message Capabilities { tun_capable, listening, connecting } to data.proto.
- Handshake now nests Capabilities capabilities = 1 (field numbers renumbered).
- outgoing connections" (not "Started with --listen/--connect").
- PeerInfo (peers.rs) and node_api::PeerInfo/NodeStatus use Capabilities instead of 3 separate bools.
- update_capabilities() takes &Capabilities instead of 3 bools.
- serialize_handshake_fields() replaced with encode_to_vec(): uses protobuf's deterministic encoding instead of manual byte packing. psk_proof zeroed before encoding. Removed ~40 lines of manual serialization.
- management.proto) kept flat: it's the CLI-facing IPC boundary, flat fields are appropriate there.
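The grouping above might look roughly like this on the Rust side. The struct definitions are a hand-written approximation, not the prost-generated types (which typically wrap nested messages in Option), and every field other than the three capability flags is a placeholder.

```rust
/// The three flags that used to travel as separate bools.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct Capabilities {
    tun_capable: bool,
    listening: bool,
    connecting: bool,
}

/// Hypothetical, reduced Handshake: capabilities nested as one field.
#[derive(Debug, Clone, Default)]
struct Handshake {
    capabilities: Capabilities,
    name: String,
}

/// Hypothetical, reduced registry entry for a peer.
#[derive(Debug, Default)]
struct PeerInfo {
    capabilities: Capabilities,
    latency_ms: Option<f64>,
}

impl PeerInfo {
    /// Takes the whole group by reference instead of three positional bools,
    /// so call sites cannot swap two flags by accident.
    fn update_capabilities(&mut self, capabilities: &Capabilities) {
        self.capabilities = *capabilities;
    }
}
```

A call site then passes the nested struct straight through, for example `peer.update_capabilities(&handshake.capabilities)`.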
Remaining work
None — all review items addressed. Ready for re-review.
Review Notes
Addressed
- run_quic_relay_capability: renamed to run_quic_exit_both (and siblings). Uses ConnectivitySpec::Both terminology.
- ConnectivitySpec::Both.
- binding in hmac tests: renamed to context throughout hmac.rs (generic module uses generic names). Added known-output test that verifies actual HMAC-SHA256 bytes.
Outstanding (resolved)
- Capability → Handshake rename: the message carries identity, capabilities, authentication, and topology metadata. "Capability" described one of eight fields. Renamed to Handshake in proto, spec docs (13a-13g), and parent design doc. Impl agent must rename in Rust code to match.
- struct).
- channel_binding terminology: standard RFC 9266 term. No change.
- ConnectivitySpec::Both = relay, but role derivation is Phase 13c scope. Leave as-is for now.
Review (Gemini) — All Fixed
- test_handshake_exchange, test_malformed_handshake, test_ping_latency, test_periodic_ping.
- Zeroizing<String> across entire config pipeline.
- RoleHint proto shape: ✅ message RoleHint { HintLevel level; NodeRole target; }.
- ControlChannels struct, 7 args → 4.
- handle_connection refactored into sub-functions.
- Capability → Handshake rename: ✅ Proto + all Rust code.
Questions for review agent (all resolved)
- local_handshake: Option<Handshake> naming is correct. Stores the full Handshake message (name, version, psk_proof, routes, hint, plus capability flags). "Handshake" accurately describes the message type.
- cap named variables: all renamed to handshake (full name, no abbreviations).
- update_handshake() → update_capabilities(). serialize_handshake_fields stays (serializes the full Handshake, not just capabilities).
- DataNodeRole duplication: removed. NodeRole moved to data.proto, ROLE_UNKNOWN → ROLE_INDETERMINATE, control.proto uses via import.