feat(router): N-hop BFS in calculateLocalRoutes — deterministic local routing#2755
Merged
Merged
Conversation
… routing
Operator's observation: route-finder service randomization is the
dominant source of variance in mux-bw measurements (26x throughput
swing across intermediates for an identical request shape). Local
route calc is deterministic by construction — same input, same
output. But until now calculateLocalRoutes only supported 1-hop
(direct) and 2-hop routes. Anything deeper fell through to the
route-finder service (and its randomness).
This PR extends calculateLocalRoutes to find N-hop paths via BFS
over the transport-discovery graph, bounded by Config.MaxHops.
Combined with --local-route (SetForceLocalRoutes(true)) on the
caller side, this gives the operator a deterministic selector
for multi-hop runs:
cli visor ping mux-bw <peer> --local-route --min-hops 3
→ same 3-hop path every call, picked deterministically by
remote-PK string sort at each BFS expansion step.
Combined with DialOptions.ExcludeIntermediatePKs (Gamma's skycoin#2746),
the operator can run controlled N=2/4/8 disjoint-mux experiments
where each route is reproducibly chosen — the per-intermediate
variance becomes selectable rather than random.
IMPLEMENTATION
- Drop the previous "if MinHops > 2: error out" guard.
- Drop the previous 2-hop-only intermediate search loop.
- BFS state: { pk, path []Hop }, level = path length.
- Seed queue with localTps (1-hop paths to direct neighbors),
excluding any in ExcludeIntermediatePKs.
- At each level, check if any queued node is dst and the path
is >= MinHops. If yes, return that node's path (this is the
shortest acceptable path — BFS guarantees it).
- Expand each node by walking transportsByEdge[node.pk],
skipping already-visited nodes (no cycles).
- Deterministic ordering: sort seed and each level's children
by remote-PK string before queueing.
- Stop when a node hits MaxHops or the queue empties.
reverseHops + hopPath helpers added for forward-route flipping
and log-line rendering respectively. Existing 1-hop and self-
ping fast paths above the BFS are unchanged.
EFFECT
- mux-bw --local-route --min-hops 5 now actually finds a 5-hop
path locally without touching the route-finder service.
- Same call repeated returns the same path → reproducible
cross-call measurements.
- --min-hops 3, 4 work too (was previously erroring out per
skycoin#2753 since the 2-hop fallback couldn't satisfy them).
- Backwards compat for callers that don't set MinHops: minHops
defaults to 2 (same as previous behavior); BFS finds the
same 2-hop path the loop would have.
NOTES
- MaxHops is taken from r.conf.MaxHops (router-wide config,
defaults to 50 per router.go's maxHops constant — bounded
enough to prevent runaway BFS over a large global graph).
Future enhancement: per-call MaxHops in DialOptions if the
operator wants tighter control.
- The previous "2-hop only" comment in the function docstring
is updated to reflect N-hop BFS.
Build / gofmt / golangci-lint clean. Existing router tests pass
(37s wall — full pkg/router test suite).
5 tasks
0pcom
added a commit
that referenced
this pull request
May 21, 2026
…2758) #2755 introduced an N-hop BFS in calculateLocalRoutes. Empirical re-run against Beta and Gamma exposed a bug in the visited-set semantics: Gamma h=2 (--min-hops 2): 2-hop path returned ✓ Gamma h=3 (--min-hops 3): "BFS found no path" ✗ Gamma h=4 (--min-hops 4): same ✗ Gamma h=5 (--min-hops 5): same ✗ Beta h=2 (--min-hops 2): 4-hop path returned (despite shorter existing — visited-set picked an arbitrary suboptimal one due to sort+shadow interaction) ROOT CAUSE The previous BFS used a GLOBAL `visited[pk]` set. When intermediate P gets marked visited at depth K (because it sits on the path of length K to dst), the BFS refuses to expand through P at depth K+1 — even when no length-K path to dst exists and a length-K+1 path through P would be the only option. This is fine semantics for "shortest path from src to dst" (regular BFS) but WRONG for "any path from src to dst with length >= MinHops". FIX Replace global visited[pk] with two checks: 1. Per-path PK-membership for loop prevention (`pkInPath`): the same intermediate can appear in multiple paths but never twice in the same path. Cycles still impossible. 2. (pk, depth) expansion cache (`expandedAtDepth`): when multiple inbound paths converge on the same (node, depth) pair, only expand once. Children would be identical anyway. This caps the total work at O(|visors| × MaxHops) — same as the old global visited check, but parameterized by depth so longer paths through known intermediates aren't shadowed. EFFECT POST-MERGE cli visor ping mux-bw <peer> --local-route --min-hops 3 cli visor ping mux-bw <peer> --local-route --min-hops 4 cli visor ping mux-bw <peer> --local-route --min-hops 5 now find paths of the requested length when they exist in the local transport graph, instead of returning "no path" because shorter paths exist and shadowed the search. Determinism preserved (PK- string sort at each expansion step is unchanged). Build / gofmt / golangci-lint clean. Existing router tests pass (37s — full pkg/router test suite, unchanged from #2755 baseline).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Operator's observation: the route-finder service is the dominant source of variance in our mux-bw measurements (26× throughput swing across intermediates for an identical request shape). Local route calc is deterministic by construction — same input, same output. But until now `calculateLocalRoutes` only supported 1-hop (direct) and 2-hop routes; anything deeper fell through to the route-finder service (and its randomness).
This PR extends `calculateLocalRoutes` to find N-hop paths via BFS over the transport-discovery graph, bounded by `Config.MaxHops`. Combined with `--local-route` on the caller, the operator gets a deterministic selector:
```
cli visor ping mux-bw --local-route --min-hops 3
→ same 3-hop path every call, picked deterministically by
remote-PK string sort at each BFS expansion step.
```
Combined with `DialOptions.ExcludeIntermediatePKs` (#2746), the operator can run controlled N=2/4/8 disjoint-mux experiments where each route is reproducibly chosen — the per-intermediate variance becomes selectable rather than random.
Implementation
`reverseHops` + `hopPath` helpers added for forward-route flipping and log-line rendering.
Backwards compatibility
Test plan