v0.41.4

@somethingelseentirely tagged this 17 May 10:36
Two issues surfaced by the first successful sandbox sync via
0.41.3:

(1) trailing-dot leak through Endpoint::addr().

0.41.3 stripped dots in the outbound RelayMap iroh uses to
connect. But Endpoint::addr() — which we serialise into the
ticket printed by `pile net sync` startup — can return an
EndpointAddr whose TransportAddr::Relay still has the dotted
form. (Probably because the relay server reports its
canonical URL back to the client and iroh stores that for
self-address reporting.) The dotted URL then propagates to
the ticket consumer, whose iroh dials us via the dotted URL
and trips WAFs on their egress.

Adds `pub fn dot_stripped_endpoint_addr(addr) -> addr` —
applied to ep.addr() before ticket encoding, and applied to
parsed EndpointAddrs in trible's parse_peers and `pile net
pull <REMOTE>`. Outbound tickets are now dot-free; inbound
tickets get normalised even when minted by an unpatched peer.
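
A minimal sketch of the normalisation idea, written against plain
strings rather than iroh's types. `strip_host_trailing_dot` is a
hypothetical name used only for illustration; the real
`dot_stripped_endpoint_addr` operates on an EndpointAddr, and its
body is not shown in this message.

```rust
/// Returns `url` with one trailing dot removed from its host component.
/// Illustration only: plain string handling, not iroh's RelayUrl type.
fn strip_host_trailing_dot(url: &str) -> String {
    // Locate the end of the scheme; leave input without "://" untouched.
    let Some(scheme_end) = url.find("://") else {
        return url.to_string();
    };
    let authority_start = scheme_end + 3;
    let rest = &url[authority_start..];

    // The authority ends at the first '/', '?' or '#', or at end of input.
    let authority_len = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let (authority, tail) = rest.split_at(authority_len);

    // Split off an optional ":port" suffix (digits only) from the host.
    let (host, port) = match authority.rfind(':') {
        Some(i) if authority[i + 1..].chars().all(|c| c.is_ascii_digit()) => {
            authority.split_at(i)
        }
        _ => (authority, ""),
    };

    // Drop the trailing dot of a fully-qualified host name, if present.
    let host = host.strip_suffix('.').unwrap_or(host);
    format!("{}{}{}{}", &url[..authority_start], host, port, tail)
}
```

Applying the same function on both the outbound and the inbound
side is safe because it is idempotent: a URL that is already
dot-free passes through unchanged.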

(2) connection-per-RPC stall in fetch_reachable.

The previous BFS opened a fresh connect_authed for every
op_children parent and every op_get_blob child. Each auth
handshake is ~600ms (TLS + QUIC + OP_AUTH + verify_chain),
so a remote pile of 30+ blobs hit the pull_branch 30s
deadline before completing. The other Claude instance
observed exactly this: 39 connect → auth_ok → LocallyClosed
cycles in a 25s window, "pile sync never happens."
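
The figures above can be checked with a little arithmetic. The
helpers below are hypothetical and only restate the numbers quoted
in this message (about 600ms per handshake, a 30s deadline, 39
cycles in 25s); they ignore transfer time and all other work.

```rust
/// Upper bound on sequential handshakes that fit inside a deadline,
/// assuming handshakes are the only cost.
fn max_sequential_handshakes(deadline_ms: u64, handshake_ms: u64) -> u64 {
    deadline_ms / handshake_ms
}

/// Mean duration of one connect/auth/close cycle over an observed
/// window, rounded down to whole milliseconds.
fn mean_cycle_ms(window_ms: u64, cycles: u64) -> u64 {
    window_ms / cycles
}
```

With these inputs, `max_sequential_handshakes(30_000, 600)` gives
50 and `mean_cycle_ms(25_000, 39)` gives 641, which is consistent
with the ~600ms handshake estimate.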

fetch_reachable now opens one authed connection at the top
and reuses it for every op_children and op_get_blob along
the BFS. iroh QUIC multiplexes streams cheaply, and our
SnapshotHandler::accept already accepts multiple sequential
bi-streams per connection (auth state is per-connection,
set on the first OP_AUTH stream, reused on every subsequent
stream).

Net effect: a 30-blob remote pull goes from ~30 connections
to 1. The "connect → auth_ok → LocallyClosed → reconnect"
cycle the other instance observed disappears entirely.
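
The difference between the two shapes can be sketched with a mock
that counts handshakes. Everything here is an assumption made for
illustration: `MockRemote`, `MockConn`, `chain_remote` and both
function names are hypothetical, and only the method names
`connect_authed`, `op_children` and `op_get_blob` echo the ones in
this message. The real code runs over iroh QUIC streams.

```rust
use std::cell::Cell;
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical remote pile: blob id -> child ids, plus a handshake counter.
struct MockRemote {
    children: HashMap<u32, Vec<u32>>,
    handshakes: Cell<usize>,
}

/// Hypothetical authed connection borrowed from the mock remote.
struct MockConn<'a> {
    remote: &'a MockRemote,
}

impl MockRemote {
    /// Every call stands in for one full connect + auth handshake.
    fn connect_authed(&self) -> MockConn<'_> {
        self.handshakes.set(self.handshakes.get() + 1);
        MockConn { remote: self }
    }
}

impl MockConn<'_> {
    fn op_children(&self, parent: u32) -> Vec<u32> {
        self.remote.children.get(&parent).cloned().unwrap_or_default()
    }
    fn op_get_blob(&self, id: u32) -> Vec<u8> {
        id.to_be_bytes().to_vec() // placeholder payload
    }
}

/// Old shape: a fresh connection for every RPC.
fn fetch_reachable_per_rpc(remote: &MockRemote, root: u32) -> HashSet<u32> {
    let mut seen = HashSet::from([root]);
    let mut queue = VecDeque::from([root]);
    while let Some(parent) = queue.pop_front() {
        for child in remote.connect_authed().op_children(parent) {
            if seen.insert(child) {
                let _blob = remote.connect_authed().op_get_blob(child);
                queue.push_back(child);
            }
        }
    }
    seen
}

/// New shape: one connection opened up front and reused along the BFS.
fn fetch_reachable_reused(remote: &MockRemote, root: u32) -> HashSet<u32> {
    let conn = remote.connect_authed();
    let mut seen = HashSet::from([root]);
    let mut queue = VecDeque::from([root]);
    while let Some(parent) = queue.pop_front() {
        for child in conn.op_children(parent) {
            if seen.insert(child) {
                let _blob = conn.op_get_blob(child);
                queue.push_back(child);
            }
        }
    }
    seen
}

/// Builds a toy chain 0 -> 1 -> ... -> n-1 for experimenting with the two walks.
fn chain_remote(n: u32) -> MockRemote {
    let children = (0..n.saturating_sub(1)).map(|i| (i, vec![i + 1])).collect();
    MockRemote { children, handshakes: Cell::new(0) }
}
```

On any graph, the reused walk performs exactly one handshake while
the per-RPC walk performs one per call. The exact per-RPC count in
this mock depends on the toy graph's shape and is not a
measurement of the real system.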

The DHT-fallback path in the per-blob fetch_blob helper is
no longer on this hot path; it remains available for the
single-blob NetCommand::Fetch RPC. DHT reachability hasn't
been load-bearing for any current use case, and per-blob
connects to different peers would defeat the reuse here.

All 8 workspace crates bumped 0.41.3 → 0.41.4. Source change
in triblespace-net + trible only. 17 lib + 2 + 3 integration
+ 1 doctest in triblespace-net all pass.

Worth filing upstream at iroh: normalise trailing dots in
RelayUrl::parse, which would let us drop both workarounds.
The complete fix belongs in iroh.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>