fix: keep enriching provider records past stale ones #140
Merged
Conversation
Provider records returned by the DHT often arrive without addresses, so the previous 10-record cap was exhausted by stale records before any usable provider arrived. Keep the same record-arrival behavior but stop counting address-less records against the soft cap, drain longer, and let FindPeer enrichment run on the full request context.

- daemon.go: split the per-provider worker into checkProvider; stream with count=0 from both DHT and IPNI; soft cap counts only providers that resolved at least one usable multiaddr; hard cap bounds total attempts. Mirror Kubo's `ipfs routing findprovs --num-providers` default of 20.
- web/script.js: when every returned record lacks a multiaddr, show a hint that records are likely stale and point at Backend Config.
- CHANGELOG.md: document the UI hint and the soft/hard cap behavior.
A fresh libp2p host was spawned per provider, so each probe started with an empty peerstore and ignored every address the daemon had already learned about the peer. Kubo's bitswap dials from the daemon-wide peerstore.

- checkProvider seeds each bitswap dial with addresses the daemon's main host already holds for the target peer, on top of record-supplied addrs and any FindPeer fallback. The probe host itself stays per-provider: vole.CheckBitswapCID installs a bitswap stream handler via host.SetStreamHandler, which replaces any prior handler, so a shared host would deliver concurrent probes' responses to the wrong receiver (see the sketch below).
- Bitswap dial timeout raised from 15s to 30s so NAT hole-punches and relay setup can complete.
- web/script.js: the stale-records hint now requires at least one returned record, so it does not fire when the routing layer returned nothing at all.
- CHANGELOG entries rewritten to lead with user-visible effects; the constant comment now references Kubo bitswap's per-round size rather than the one-shot findprovs CLI default.

Smoke test on a cold local daemon for /ipns/ipfs.tech:
before: 5 attempted, 0 with addrs, 0 Bitswap.Found
after: 40 attempted, 12 with addrs, 2 Bitswap.Found
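A minimal demonstration of the handler-replacement behavior that rules out a shared probe host. The host construction and handler bodies here are illustrative stand-ins for two concurrent probes' receivers, not the vole code itself; the protocol ID and `SetStreamHandler` semantics are real go-libp2p behavior.

```go
package main

import (
	"fmt"

	"github.com/libp2p/go-libp2p"
	"github.com/libp2p/go-libp2p/core/network"
)

const bitswapProto = "/ipfs/bitswap/1.2.0"

func main() {
	h, err := libp2p.New()
	if err != nil {
		panic(err)
	}
	defer h.Close()

	// Probe A installs its receiver.
	h.SetStreamHandler(bitswapProto, func(s network.Stream) {
		fmt.Println("probe A receiver")
		s.Close()
	})
	// Probe B on the same host silently replaces it: from here on, every
	// inbound bitswap stream lands in B's receiver, including responses
	// meant for probe A.
	h.SetStreamHandler(bitswapProto, func(s network.Stream) {
		fmt.Println("probe B receiver")
		s.Close()
	})
}
```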
Force-pushed from 175245c to 7f8dab3.
guillaumemichel approved these changes on May 13, 2026.
Co-authored-by: Guillaume Michel <guillaumemichel@users.noreply.github.com>
This PR brings ipfs-check closer to how a real-world Kubo provider lookup behaves during retrieval. cc @aschmahmann, who noticed https://check.ipfs.network/ being flaky with `/ipns/ipfs.tech` (which has high drive-by provider churn).

Problem
Provider records returned by the DHT often arrive without addresses. For CIDs with high provider churn like `/ipns/ipfs.tech`, the previous 10-record cap was exhausted by stale records before any usable provider arrived. Even when the routing layer did return one, each per-provider probe ran on a fresh libp2p host with an empty peerstore, so addresses the daemon had already learned for that peer were ignored.

Solution
Keep the same record-arrival behavior, but stop counting address-less records against the soft cap, drain longer, let `FindPeer` enrichment run on the full request context, and seed each bitswap dial with addresses the daemon already knows for the peer.

Details
Commit 90bd607 (soft/hard caps + UI hint)
- `daemon.go`: split the per-provider worker into `checkProvider`; stream with `count=0` from both DHT and IPNI; a soft cap of 20 counts only providers that resolved at least one usable multiaddr; a hard cap of 40 bounds total attempts (a sketch of the drain loop follows this list). 20 is sized for how many providers Kubo's bitswap accumulates across a real retrieval (10 per provider-search round, several rounds per session).
- `web/script.js`: when every returned record lacks a multiaddr, show a hint that records are likely stale and point at Backend Config.
- `CHANGELOG.md`: documents the UI hint and the soft/hard cap behavior.
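The gist of the cap logic, as a hedged sketch: the channel plumbing, the names (`drainProviders`, `softCap`, `hardCap`), and the boolean return of `checkProvider` are illustrative assumptions, not the exact daemon.go code.

```go
package main

import (
	"context"
	"sync"

	"github.com/libp2p/go-libp2p/core/peer"
)

const (
	softCap = 20 // providers that resolved at least one usable multiaddr
	hardCap = 40 // total probe attempts, stale records included
)

// drainProviders reads records until softCap useful providers have been
// probed or hardCap total attempts are spent, so address-less (stale)
// records no longer exhaust the budget as the old flat 10-record cap did.
func drainProviders(ctx context.Context, provs <-chan peer.AddrInfo,
	checkProvider func(context.Context, peer.AddrInfo) bool) {
	var (
		wg        sync.WaitGroup
		mu        sync.Mutex
		useful    int
		attempted int
	)
	for prov := range provs {
		mu.Lock()
		done := useful >= softCap || attempted >= hardCap
		if !done {
			attempted++
		}
		mu.Unlock()
		if done {
			break
		}
		wg.Add(1)
		go func(p peer.AddrInfo) {
			defer wg.Done()
			// checkProvider reports whether the probe resolved at least
			// one usable multiaddr; only those count against softCap.
			if checkProvider(ctx, p) {
				mu.Lock()
				useful++
				mu.Unlock()
			}
		}(prov)
	}
	wg.Wait()
}

func main() {} // placeholder so the sketch compiles standalone
```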
Commit 7f8dab3 (peerstore-seeded dials, longer timeout)

- `checkProvider` seeds each bitswap dial with addresses the daemon's main host already holds for the target peer, on top of record-supplied addrs and any `FindPeer` fallback (sketched below). This mirrors how Kubo's bitswap dials.
- The probe host stays per-provider: `vole.CheckBitswapCID` installs a bitswap stream handler via `host.SetStreamHandler`, which replaces any prior handler, so a shared host would route every probe's responses to whichever receiver was registered last, and the others' messages would fail the sender-vs-target check inside `bsReceiver`.
- The bitswap dial timeout is raised from 15s to 30s so NAT hole-punches and relay setup can complete.
- `web/script.js`: the "records likely stale" hint now requires at least one returned record, so it does not fire when the routing layer returned nothing at all.
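A hedged sketch of the seeding step: `seedAddrs` and the TTL choice are assumptions for illustration, and the real `checkProvider` plumbing differs, but the peerstore calls are the standard go-libp2p API.

```go
package main

import (
	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
	"github.com/libp2p/go-libp2p/core/peerstore"
)

// seedAddrs gives the per-probe host everything the daemon already knows
// about the target peer before the bitswap dial: the addresses from the
// provider record itself plus whatever the main host's peerstore has
// accumulated (identify, previous connections, FindPeer results).
func seedAddrs(mainHost, probeHost host.Host, ai peer.AddrInfo) {
	// Record-supplied addresses.
	probeHost.Peerstore().AddAddrs(ai.ID, ai.Addrs, peerstore.TempAddrTTL)
	// Addresses the daemon-wide host has already learned for this peer.
	known := mainHost.Peerstore().Addrs(ai.ID)
	probeHost.Peerstore().AddAddrs(ai.ID, known, peerstore.TempAddrTTL)
}

func main() {} // placeholder so the sketch compiles standalone
```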
Difference

Smoke test on a cold local ipfs-check backend daemon for `/ipns/ipfs.tech` (accelerated DHT off):

| run | providers attempted | with addrs | `Bitswap.Found=true` |
| --- | --- | --- | --- |
| before | 5 | 0 | 0 |
| after | 40 | 12 | 2 |

Production with the accelerated DHT warmed up should do better.
Should we revisit other places?
Namely, does Kubo fail on `/ipns/ipfs.tech`? I think no. I ran two fresh Kubo nodes and got identical results for `findprovs` and `ipfs ls /ipns/ipfs.tech`: one with `Routing.Type=auto` (autoconf on) and one with `Routing.Type=dht` (no delegated routing). Both Kubo runs effectively retrieve over the DHT for this CID.
`cid.contact` returns 404. The DHT-only run had no delegated routers at all and still succeeded. The retrieval gap is therefore specific to ipfs-check, not the routing layer: it came from the cold per-probe host and the early cap on records, both of which this PR fixes.