split blocksync reactor into 2 modes. by pompon0 · Pull Request #3511 · sei-protocol/sei-chain

pompon0 · 2026-05-27T16:32:46Z

Autobahn nodes will need to be able to send pre-giga blocks to pre-giga nodes during the transition period (so that pre-giga nodes can blocksync to the upgrade height). However, all the remaining parts of the blocksync reactor should be disabled. This PR extracts the tendermint-only blocksync logic into syncController, which is an optional part of the blocksync reactor. Additionally, I have:

- refactored the blocksync reactor to use structured concurrency
- fixed a bug in poolRoutine which was silently terminating blocksync on block validation failure (now it retries fetching the block)
- fixed a busy loop in blocksync pool.run.
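The validation-retry fix can be illustrated with a minimal sketch. It is not the code from this PR: `block`, `validate`, `syncHeight`, and the `maxAttempts` bound are hypothetical names introduced only to show the "evict and retry instead of returning" control flow.

```go
package main

import (
	"errors"
	"fmt"
)

// block is a hypothetical stand-in for a block fetched from a peer.
type block struct {
	height int64
	peer   string
	valid  bool
}

var errInvalidBlock = errors.New("block failed validation")

// validate is a placeholder for real block validation.
func validate(b block) error {
	if !b.valid {
		return fmt.Errorf("height %d from %s: %w", b.height, b.peer, errInvalidBlock)
	}
	return nil
}

// syncHeight keeps fetching the block at the given height until one validates.
// On a validation failure it records the offending peer as evicted and retries
// rather than returning, so a single bad block does not end the sync.
// maxAttempts is an assumption of this sketch, added so the loop terminates.
func syncHeight(height int64, fetch func(height int64, attempt int) block, maxAttempts int) (block, []string, error) {
	var evicted []string
	for attempt := 0; attempt < maxAttempts; attempt++ {
		b := fetch(height, attempt)
		if err := validate(b); err != nil {
			evicted = append(evicted, b.peer)
			continue // retry with another fetch instead of terminating
		}
		return b, evicted, nil
	}
	return block{}, evicted, fmt.Errorf("height %d: no valid block after %d attempts", height, maxAttempts)
}

func main() {
	// The first peer serves an invalid block, the second a valid one.
	fetch := func(h int64, attempt int) block {
		return block{height: h, peer: fmt.Sprintf("peer%d", attempt), valid: attempt >= 1}
	}
	b, evicted, err := syncHeight(10, fetch, 3)
	fmt.Println(b.peer, evicted, err)
}
```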

cursor · 2026-05-27T16:33:36Z

PR Summary

High Risk
Large refactor of blocksync startup, P2P message routing, pool concurrency, and consensus handoff—core node sync path with behavior changes (monotone catch-up, validation retry).

Overview
Blocksync is split into an always-on query path and an optional active sync path. Reactor keeps the single blocksync P2P channel and serves BlockRequest / StatusRequest from local store even when catch-up is off; optional SyncerConfig wires a syncController that owns the pool, outbound requests, block apply, consensus handoff, and lag metrics. NewReactor no longer takes block executor / consensus reactor directly—those move into SyncerConfig (utils.Option).
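A rough sketch of the split described above, assuming a generic optional type. The `Option` type here is a hypothetical stand-in and not the repo's `utils.Option` (whose API may differ); the `SyncerConfig` field and the `Reactor` methods are likewise illustrative only.

```go
package main

import "fmt"

// Option is a hypothetical optional value; the repo's utils.Option may differ.
type Option[T any] struct {
	value T
	ok    bool
}

func Some[T any](v T) Option[T] { return Option[T]{value: v, ok: true} }
func None[T any]() Option[T]    { return Option[T]{} }

// Get returns the wrapped value and whether it is present.
func (o Option[T]) Get() (T, bool) { return o.value, o.ok }

// SyncerConfig is a placeholder for the dependencies of the active sync path.
type SyncerConfig struct {
	Name string // hypothetical field
}

// Reactor always serves queries from the local store; active sync is optional.
type Reactor struct {
	store  map[int64]string
	syncer Option[SyncerConfig]
}

func NewReactor(store map[int64]string, syncer Option[SyncerConfig]) *Reactor {
	return &Reactor{store: store, syncer: syncer}
}

// HandleBlockRequest answers from the local store whether or not catch-up is enabled.
func (r *Reactor) HandleBlockRequest(height int64) (string, bool) {
	b, ok := r.store[height]
	return b, ok
}

// ActiveSync reports whether the optional sync path was wired in.
func (r *Reactor) ActiveSync() bool {
	_, ok := r.syncer.Get()
	return ok
}

func main() {
	store := map[int64]string{1: "block-1"}
	queryOnly := NewReactor(store, None[SyncerConfig]())
	full := NewReactor(store, Some(SyncerConfig{Name: "syncer"}))
	b, ok := queryOnly.HandleBlockRequest(1)
	fmt.Println(b, ok, queryOnly.ActiveSync(), full.ActiveSync())
}
```

The design point is that the query path has no dependency on the syncer, so a node can keep serving blocks while its own catch-up logic is disabled.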

Pool and concurrency are reworked. BlockPool drops BaseService, owns internal request/error channels, and runs via pool.run with scope + per-height bpRequester tasks (utils.Watch / Option). Caught-up uses a monotone max peer height so retracted peer heights do not falsely mark the node synced. poolRoutine no longer exits on validation failure—it evicts bad peers and continues (new tests).
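To make the monotone max peer height idea concrete, here is a small sketch. `peerTracker` and its methods are hypothetical, and the exact caught-up condition in the pool may differ (for example by an off-by-one); the sketch only shows why a retracted height does not lower the target.

```go
package main

import (
	"fmt"
	"sync"
)

// peerTracker is a hypothetical helper that remembers the highest height
// any peer has ever reported, even if that peer later reports a lower one.
type peerTracker struct {
	mu      sync.Mutex
	heights map[string]int64
	maxSeen int64 // only ever increases
}

func newPeerTracker() *peerTracker {
	return &peerTracker{heights: map[string]int64{}}
}

// Update records the latest height for a peer and raises maxSeen if needed.
func (t *peerTracker) Update(peer string, height int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.heights[peer] = height
	if height > t.maxSeen {
		t.maxSeen = height
	}
}

// CaughtUp compares the local height against the monotone maximum,
// not against the peers' current (possibly retracted) heights.
func (t *peerTracker) CaughtUp(local int64) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.maxSeen > 0 && local >= t.maxSeen
}

func main() {
	t := newPeerTracker()
	t.Update("peerA", 100)
	t.Update("peerA", 50) // retraction: the target stays at 100
	fmt.Println(t.CaughtUp(60), t.CaughtUp(100))
}
```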

Wiring and RPC: Node construction passes SyncerConfig; SwitchToBlockSync no longer takes context. RPC status reads state-sync metrics from StateSyncReactor instead of a Metricer interface (mock removed). Mempool tests use utils.TestRng / GenBytes for deterministic concurrency tests.

Reviewed by Cursor Bugbot for commit 4f62265.

github-actions · 2026-05-27T16:34:39Z

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`✅ passed`	`✅ passed`	`✅ passed`	May 29, 2026, 10:12 AM

codecov · 2026-05-27T16:36:02Z

Codecov Report

❌ Patch coverage is 72.72727% with 96 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.22%. Comparing base (3936ac9) to head (4f62265).
⚠️ Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
sei-tendermint/internal/blocksync/reactor.go	64.41%	71 Missing and 8 partials ⚠️
sei-tendermint/internal/blocksync/pool.go	92.72%	6 Missing and 2 partials ⚠️
sei-tendermint/internal/rpc/core/status.go	0.00%	8 Missing ⚠️
sei-tendermint/node/node.go	90.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3511      +/-   ##
==========================================
- Coverage   59.04%   58.22%   -0.82%     
==========================================
  Files        2199     2129      -70     
  Lines      182096   173895    -8201     
==========================================
- Hits       107513   101252    -6261     
+ Misses      64933    63659    -1274     
+ Partials     9650     8984     -666

Flag	Coverage Δ
sei-chain-pr	`63.22% <72.72%> (?)`
sei-db	`70.41% <ø> (ø)`
sei-db-state-db	`?`

Files with missing lines	Coverage Δ
sei-tendermint/internal/rpc/core/env.go	`76.15% <ø> (ø)`
sei-tendermint/internal/statesync/reactor.go	`71.72% <100.00%> (+1.26%)`	⬆️
sei-tendermint/libs/utils/option.go	`90.90% <100.00%> (ø)`
sei-tendermint/node/node.go	`65.37% <90.00%> (+0.17%)`	⬆️
sei-tendermint/internal/blocksync/pool.go	`89.53% <92.72%> (+5.00%)`	⬆️
sei-tendermint/internal/rpc/core/status.go	`73.62% <0.00%> (ø)`
sei-tendermint/internal/blocksync/reactor.go	`65.10% <64.41%> (+1.01%)`	⬆️

... and 72 files with indirect coverage changes

cursor · 2026-05-28T10:50:29Z

-		return err
+		switch update.Status {
+		case p2p.PeerStatusUp:
+			s.channel.Send(wrap(&pb.StatusRequest{}), update.NodeID)


Peer-up handler sends StatusRequest instead of StatusResponse

Medium Severity

On PeerStatusUp, the old code sent a StatusResponse (advertising local height to the new peer), enabling the remote node to immediately learn this node's height. The new code sends a StatusRequest instead, which asks the peer for their height. While this node still learns about the peer, the peer no longer receives an immediate height advertisement. Peers that rely on receiving a StatusResponse on connection (e.g., pre-giga nodes during transition) won't learn this node's height until the next periodic StatusRequest broadcast (every 10 seconds).

Reviewed by Cursor Bugbot for commit 3206044.

This will be a minor temporary regression until the upgrade completes.

wen-coding · 2026-05-28T17:54:41Z

-					"height", first.Height,
-					"err", err)
-				return
+				return consensusHandoff{}, fmt.Errorf("first.MakePartSet(%d): %w", first.Height, err)


We used to log the error and stop blocksync. Now we return an error, which may propagate back to the caller. Will this cause a panic?
Is that the intended behavior?

Silently stopping blocksync on error here would just make the node halt, so a panic is better. As far as I can tell, MakePartSet can fail here only due to a serialization error. Since the blocks were already deserialized to get to this point, serialization is expected to always succeed.
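As an illustration of the trade-off discussed in this thread, the sketch below wraps a failure with `%w` and returns it, so the caller sees it instead of the routine stopping quietly. `makePartSet`, `applyBlock`, and `errSerialize` are hypothetical placeholders, not functions from the repo.

```go
package main

import (
	"errors"
	"fmt"
)

// errSerialize is a hypothetical sentinel standing in for a serialization failure.
var errSerialize = errors.New("serialization failed")

// makePartSet is a placeholder for first.MakePartSet; it fails only on request.
func makePartSet(height int64, fail bool) error {
	if fail {
		return errSerialize
	}
	return nil
}

// applyBlock wraps the failure with the height, in the style of the
// fmt.Errorf call shown in the diff, and hands it to the caller.
func applyBlock(height int64, fail bool) error {
	if err := makePartSet(height, fail); err != nil {
		return fmt.Errorf("first.MakePartSet(%d): %w", height, err)
	}
	return nil
}

func main() {
	err := applyBlock(42, true)
	// The caller can fail loudly and still identify the underlying cause.
	fmt.Println(err, errors.Is(err, errSerialize))
}
```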

wen-coding · 2026-05-28T20:30:42Z

+		errorsCh:     make(chan peerError, maxPeerErrBuffer), // NOTE: capacity should exceed peer count.
 		lastSyncRate: 0,
 		router:       router,
+		reportErr:    reportErr,


If we can restructure AddBlock (see below), then maybe the test can read the error channel instead and we don't need this reportErr argument?

wen-coding · 2026-05-28T20:36:47Z

 // height of the extended commit and the height of the block do not match, we
 // do not add the block and return an error.
 // TODO: ensure that blocks come in order for each peer.
 func (pool *BlockPool) AddBlock(peerID types.NodeID, block *types.Block, blockSize int) error {


How about we do:

func (pool *BlockPool) AddBlock(...) error {
	pendingErr, pendingPeerID, returnErr := pool.addBlockLocked(...)
	if pendingErr != nil {
		pool.sendError(pendingErr, pendingPeerID) // statically outside the lock
	}
	return returnErr
}

func (pool *BlockPool) addBlockLocked(...) (pendingErr error, pendingPeerID types.NodeID, returnErr error) {
	pool.mtx.Lock()
	defer pool.mtx.Unlock()
	// ... logic ...
}

Then maybe we don't need the test which requires reportErr injection?

The point of the injection was to test that sendError is not performed under the lock, which requires being able to pause the goroutine while it is inside sendError. This is a preexisting test. Alternatively, I can remove the test, I suppose. Waiting for goroutines to block on a channel is fragile logic to have.
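The restructuring proposed in this thread can be sketched in runnable form as follows. This is a simplified model, not the pool from the PR: the `expected`/`blocks` maps, the string peer IDs, and `newPool` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// peerError pairs an error with the peer that caused it.
type peerError struct {
	err    error
	peerID string
}

// BlockPool is a simplified, hypothetical model of the real pool.
type BlockPool struct {
	mtx      sync.Mutex
	expected map[int64]string // height -> peer the block was requested from
	blocks   map[int64]string
	errorsCh chan peerError
}

func newPool(errBuffer int) *BlockPool {
	return &BlockPool{
		expected: map[int64]string{},
		blocks:   map[int64]string{},
		errorsCh: make(chan peerError, errBuffer),
	}
}

// AddBlock does the bookkeeping under the lock, then reports any peer error
// after the lock is released, so a blocked channel send cannot stall the pool.
func (p *BlockPool) AddBlock(peerID string, height int64, data string) error {
	pending, err := p.addBlockLocked(peerID, height, data)
	if pending != nil {
		p.errorsCh <- *pending // outside the lock by construction
	}
	return err
}

// addBlockLocked holds the mutex for its whole body and never touches errorsCh.
func (p *BlockPool) addBlockLocked(peerID string, height int64, data string) (*peerError, error) {
	p.mtx.Lock()
	defer p.mtx.Unlock()
	want, ok := p.expected[height]
	if !ok || want != peerID {
		err := fmt.Errorf("unexpected block at height %d from peer %s", height, peerID)
		return &peerError{err: err, peerID: peerID}, err
	}
	p.blocks[height] = data
	return nil, nil
}

func main() {
	p := newPool(1)
	p.expected[7] = "good"
	fmt.Println(p.AddBlock("bad", 7, "x"))
	fmt.Println(p.AddBlock("good", 7, "x"))
}
```

Because the send is placed after `addBlockLocked` returns, the "not under lock" property follows from the code structure rather than from a test that pauses a goroutine.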

cursor

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit 5891958.

pompon0 added 4 commits May 27, 2026 17:10

WIP

090509c

WIP

aac57aa

WIP

f23fadf

WIP

9e0f121

pompon0 added the non-app-hash-breaking label May 27, 2026

pompon0 requested review from sei-will and wen-coding May 27, 2026 16:34

cursor Bot reviewed May 27, 2026

Comment thread sei-tendermint/internal/blocksync/reactor.go

Comment thread sei-tendermint/internal/blocksync/reactor.go Outdated

pompon0 added 2 commits May 27, 2026 19:20

terminating pool early

27b8ee0

tests

d130dfd

cursor Bot reviewed May 27, 2026

Comment thread sei-tendermint/internal/blocksync/pool.go Outdated

pompon0 added 5 commits May 28, 2026 09:54

attempt to remove more

f93ee9f

blocksync pool refactor

1dc6d72

regression fix

49271a0

another regression

93f2ecd

mempool test flake fix

244347b

cursor Bot reviewed May 28, 2026

Comment thread sei-tendermint/internal/blocksync/reactor.go

IgnoreCancel

3206044

cursor Bot reviewed May 28, 2026

busy loop fix

4c1559e

wen-coding reviewed May 28, 2026

pompon0 added 3 commits May 29, 2026 10:10

removed metricers, reverted previousMaxPeerHeight

b3d6d19

test fix

dbe40d6

adjusted doc

5891958

cursor Bot reviewed May 29, 2026

Comment thread sei-tendermint/internal/blocksync/pool.go

removed deadcode

4f62265

pompon0 requested a review from wen-coding May 29, 2026 10:11

sei-will approved these changes May 29, 2026

wen-coding approved these changes May 29, 2026

pompon0 added this pull request to the merge queue May 29, 2026

github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 29, 2026

pompon0 added this pull request to the merge queue May 29, 2026

Merged via the queue into main with commit c4e1a2a May 29, 2026
54 checks passed

pompon0 deleted the gprusak-blocksync branch May 29, 2026 17:23

wen-coding mentioned this pull request May 31, 2026

test(evm): cap child_process.exec in lib.js to surface stalled commands #3526

Open

2 tasks
