Trigger gossip sync immediately on peer connection #1271
Conversation
Send TickNetworkMaintenance to self when a new peer connects, eliminating the initial 0-60s wait before gossip data sync begins. This is critical for WASM nodes that need peer address data from gossip shortly after connecting to a bootnode. Fixes: #1269
Pull request overview
This PR reduces gossip bootstrap latency by triggering TickNetworkMaintenance immediately after a peer connects, rather than waiting for the next periodic maintenance interval (default 60s). This helps nodes (notably WASM) start gossip syncing soon after connecting to a bootnode and obtain peer address data earlier.
Changes:
- On `PeerConnected`, enqueue `GossipActorMessage::TickNetworkMaintenance` immediately to kick off syncing with the new peer.
- Add in-code rationale explaining the motivation (avoid initial 0–60s delay) and impact (WASM bootstrapping).
```rust
// Immediately trigger network maintenance to start syncing with
// the new peer, rather than waiting for the next periodic
// TickNetworkMaintenance (up to 60s in production).
// This eliminates the initial gossip sync delay after connecting
// to a bootnode, which is critical for WASM nodes that need
// peer addresses from gossip data quickly.
myself.send_message(GossipActorMessage::TickNetworkMaintenance)?;
```
Triggering TickNetworkMaintenance immediately on PeerConnected introduces a race where the gossip actor may start an active syncer before ReceivedControl has been processed. The active syncer sends SendGossipMessage immediately in its pre_start, and send_message_to_peer ultimately calls state.get_control().expect(...), which will panic if state.control is still None. Consider gating the immediate tick on state.control.is_some(), or deferring the tick until after ReceivedControl is set (e.g., send the tick from the ReceivedControl handler if there are connected peers).
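The race can be sketched with a minimal stand-in for the actor state (hypothetical, simplified types; the real gossip actor state is more involved): an unguarded tick can reach `get_control().expect(...)` while `control` is still `None`, whereas gating on `control.is_some()` makes the early tick a no-op.

```rust
// Simplified stand-in for the gossip actor state described in the review.
#[derive(Default)]
struct GossipState {
    // Stands in for the service control handle set by ReceivedControl.
    control: Option<String>,
}

impl GossipState {
    // Mirrors the panicking accessor: fine once ReceivedControl has run,
    // fatal if the on-connect tick races ahead of it.
    fn get_control(&self) -> &String {
        self.control.as_ref().expect("control should be set")
    }

    // Guarded variant suggested above: only start maintenance
    // when the control handle is actually present.
    fn try_tick(&self) -> bool {
        self.control.is_some()
    }
}

fn main() {
    let mut state = GossipState::default();
    // PeerConnected arrives before ReceivedControl: skip the tick instead
    // of panicking inside the active syncer.
    assert!(!state.try_tick());
    state.control = Some("control".to_string());
    // After ReceivedControl the tick is safe.
    assert!(state.try_tick());
    println!("guarded tick ok: {}", state.get_control());
}
```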
@copilot apply changes based on this feedback
Fixed in 48c2886. The TickNetworkMaintenance in PeerConnected is now gated on state.control.is_some(). If a peer connects before control arrives, the tick is sent from the ReceivedControl handler once state.peer_states is non-empty — covering both orderings without risk of a panic.
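The two-sided trigger can be sketched as follows (hypothetical, minimal types; the real handlers live in the gossip actor): whichever of `ReceivedControl` / `PeerConnected` arrives second fires the first maintenance tick, so neither ordering stalls or panics.

```rust
// Minimal model of the two handlers and the condition under which each
// should send TickNetworkMaintenance.
#[derive(Default)]
struct State {
    control: Option<()>, // set by ReceivedControl
    peers: usize,        // connected peer count
}

enum Event {
    ReceivedControl,
    PeerConnected,
}

// Returns true when the handler should send an immediate tick.
fn handle(state: &mut State, event: Event) -> bool {
    match event {
        Event::ReceivedControl => {
            state.control = Some(());
            // Peers connected before control arrived: tick now.
            state.peers > 0
        }
        Event::PeerConnected => {
            state.peers += 1;
            // Gate the immediate tick on control being present.
            state.control.is_some()
        }
    }
}

fn main() {
    // Ordering A: control first, then peer -> tick fires on PeerConnected.
    let mut a = State::default();
    assert!(!handle(&mut a, Event::ReceivedControl));
    assert!(handle(&mut a, Event::PeerConnected));

    // Ordering B: peer first, then control -> tick fires on ReceivedControl.
    let mut b = State::default();
    assert!(!handle(&mut b, Event::PeerConnected));
    assert!(handle(&mut b, Event::ReceivedControl));

    println!("both orderings produce exactly one initial tick");
}
```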
```rust
// to a bootnode, which is critical for WASM nodes that need
// peer addresses from gossip data quickly.
myself.send_message(GossipActorMessage::TickNetworkMaintenance)?;
```
This behavior change is user-visible (removes the initial wait before gossip sync) and seems important for bootnode discovery; it would be good to add a regression test that fails without this change (e.g., set the configured gossip maintenance interval to a very large value and assert that after a peer connects, gossip-derived data/requests happen within a short deadline). This will help prevent future changes from reintroducing the startup sync delay.
@copilot apply changes based on this feedback
Added test_gossip_sync_starts_immediately_on_peer_connect in commit 91ab173. The test sets a 1-hour gossip maintenance interval, injects a node announcement into node A, then connects node B and verifies node B syncs the announcement within 2 seconds — far sooner than the interval would allow. Without the PeerConnected immediate tick, this test would fail.
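The shape of that regression test can be sketched like this (a hypothetical, self-contained model; the real test spins up two fiber nodes and checks announcement propagation): with the periodic interval set to one hour, the first sync must still be driven by the on-connect tick, well inside the 2-second deadline.

```rust
use std::time::{Duration, Instant};

// Hypothetical node model: only the fields relevant to the test idea.
struct Node {
    maintenance_interval: Duration, // periodic TickNetworkMaintenance cadence
    last_tick: Option<Instant>,     // when the first tick actually fired
}

impl Node {
    // The behavior under test: an immediate tick on peer connection.
    fn on_peer_connected(&mut self) {
        self.last_tick = Some(Instant::now());
    }
}

fn main() {
    let start = Instant::now();
    let mut node = Node {
        maintenance_interval: Duration::from_secs(3600), // 1-hour interval
        last_tick: None,
    };

    node.on_peer_connected();

    let elapsed = node.last_tick.expect("tick must fire on connect") - start;
    // Sync kicked off within the short deadline, not the huge interval;
    // without the immediate tick, last_tick would still be None here.
    assert!(elapsed < Duration::from_secs(2));
    assert!(elapsed < node.maintenance_interval);
    println!("first tick fired after {:?}", elapsed);
}
```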
* Fix gossip tick race: gate on control.is_some(), defer from ReceivedControl
  Co-authored-by: quake <8990+quake@users.noreply.github.com>
* Add regression test for immediate gossip sync on peer connect
  Co-authored-by: quake <8990+quake@users.noreply.github.com>
This PR improves gossip sync speed, which is a nice improvement. But I don't think it fully fixes #1269. In the original issue, `connectPeer({"pubkey": "0291a6..."})` was called after sleeping 1 min and *didn't return an error*, meaning the gossip data was already synced. The actual problem is that the synced `NodeAnnouncement` for that node only contains a TCP address (`/ip4/.../tcp/8119/...`), not a wss address. Since WASM nodes can only connect via wss, the connection silently fails. This might need a separate fix: either WASM nodes should filter for wss-compatible addresses when resolving via pubkey, or the target node needs to announce its wss address in `NodeAnnouncement`.

It's resolved in another PR: #1270
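The wss-filtering idea from the comment above can be sketched as below (a hypothetical helper over multiaddr strings; the actual fix in #1270 uses tentacle's `TransportType`): given a peer's announced multiaddrs, keep only the ones a WASM node can dial before selecting one.

```rust
// Hypothetical transport filter: a WASM node can only dial wss endpoints,
// so drop everything else before the random address selection.
fn is_wss(addr: &str) -> bool {
    addr.split('/').any(|part| part == "wss")
}

fn wss_candidates<'a>(addrs: &'a [&'a str]) -> Vec<&'a str> {
    addrs.iter().copied().filter(|a| is_wss(a)).collect()
}

fn main() {
    // Illustrative announced addresses (placeholder peer ids).
    let announced = [
        "/ip4/203.0.113.1/tcp/8119/p2p/QmPeer",          // TCP only: not dialable from WASM
        "/dns4/node.example.com/tcp/443/wss/p2p/QmPeer", // wss: dialable from a browser
    ];
    let candidates = wss_candidates(&announced);
    assert_eq!(candidates.len(), 1);
    assert!(candidates[0].contains("/wss/"));
    println!("wss candidates: {:?}", candidates);
}
```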
There is another question: does the node broadcast the wss address? When I queried `graph_nodes`, I did not see the wss address.

Resolved in cca995d.
* fix: push limit to DB layer in list_payments to avoid unbounded memory usage (#1261)
* feat: update macOS build configuration for portability (#1237)
* feat: add official Docker image support (#1244)
* Bump fiber-rpc-gen to 0.1.22 (#1264)
* network: onion & socks5 support for fiber (#1228)
* Fix fnn-migrate error messages and update README (#1249)
* chore(deps): bump tokio from 1.50.0 to 1.51.1
* docs: refresh public node and network node documentation (#1266)
* fix: abort funding on insufficient UDT cells (#1195) (#1253)
* feat: add gossip metrics benchmarks and CI integration (#1177)
* chore(deps): bump rand from 0.8.5 to 0.9.3 in /tests/deploy/udt-init
* Trigger gossip sync immediately on peer connection (#1271)
* feat: add addr_type parameter to connect_peer RPC for transport type filtering (#1270)
* feat(cch): default final TLC expiry deltas to 60 hours (#1258)
* Local RPC method not found should not return unauthorized (#1235)
* chore: bump version to v0.8.1 (#1274)