
composite store iteration #3544

Merged
cody-littley merged 26 commits into main from cjl/composite-iteration on Jun 4, 2026
Conversation

@cody-littley

Contributor

Describe your changes and provide context

It's now possible to iterate over data in the composite store, even if that data is in memIAVL.

Testing performed to validate your change

unit tests

@cody-littley cody-littley requested a review from blindchaser June 3, 2026 18:36
@cody-littley cody-littley self-assigned this Jun 3, 2026
@cody-littley cody-littley marked this pull request as ready for review June 3, 2026 18:36
@cursor

cursor Bot commented Jun 3, 2026


PR Summary

Medium Risk
Changes state-commit iteration semantics during EVM migration (merged backends vs old router-only old-DB iteration) and touches iterator lifecycle in flatkv/composite paths where double-close could corrupt Pebble.

Overview
Composite store iteration now builds per-store scans by merging iterators from memiavl and flatkv directly (iterate), instead of routing through the migration ModuleRouter. Results are merged with NewMergingIterator (flatkv wins on duplicates), then clipped to the caller’s [start, end) via NewDomainIterator. Nil start/end mean unbounded; unknown stores return an empty iterator rather than an error; the reserved migration store still cannot be iterated.
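A minimal sketch of the merge-then-clip behavior described above, assuming each backend yields sorted, unique keys. It uses plain slices rather than the real `dbm.Iterator` types, and `kv`, `mergeSorted`, and `clip` are hypothetical names for illustration, not the functions in this PR.

```go
package main

import (
	"bytes"
	"fmt"
)

// kv is a hypothetical key/value pair; the real code works with iterators.
type kv struct {
	key, value []byte
}

// mergeSorted merges sorted children into one sorted stream. When the same
// key appears in more than one child, the rightmost child wins.
func mergeSorted(children ...[]kv) []kv {
	pos := make([]int, len(children))
	var out []kv
	for {
		// Pick the smallest head key; "<=" lets a later child win ties.
		best := -1
		for i, c := range children {
			if pos[i] >= len(c) {
				continue
			}
			if best == -1 || bytes.Compare(c[pos[i]].key, children[best][pos[best]].key) <= 0 {
				best = i
			}
		}
		if best == -1 {
			return out
		}
		winner := children[best][pos[best]]
		// Advance every child sitting on the winning key so duplicates are dropped.
		for i, c := range children {
			if pos[i] < len(c) && bytes.Equal(c[pos[i]].key, winner.key) {
				pos[i]++
			}
		}
		out = append(out, winner)
	}
}

// clip keeps only keys in [start, end); a nil bound means unbounded.
func clip(in []kv, start, end []byte) []kv {
	var out []kv
	for _, e := range in {
		if start != nil && bytes.Compare(e.key, start) < 0 {
			continue
		}
		if end != nil && bytes.Compare(e.key, end) >= 0 {
			continue
		}
		out = append(out, e)
	}
	return out
}

func main() {
	memiavl := []kv{{[]byte("a"), []byte("old-a")}, {[]byte("b"), []byte("old-b")}, {[]byte("d"), []byte("old-d")}}
	flatkv := []kv{{[]byte("b"), []byte("new-b")}, {[]byte("c"), []byte("new-c")}}
	// flatkv is passed last, so it wins for the duplicate key "b".
	for _, e := range clip(mergeSorted(memiavl, flatkv), []byte("b"), nil) {
		fmt.Printf("%s=%s\n", e.key, e.value)
	}
}
```

Passing flatkv as the last child is what makes it the winner on duplicates in this sketch; the nil `end` bound leaves the upper side of the range open.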

Migration routers no longer implement Iterator: it is removed from Router, Route, ModuleRouter, PassthroughRouter, MigrationManager, and test-only dual-write. RouterCommitKVStore takes an owner-supplied iterator callback (composite wires cs.iterate). memiavl drops nil-bound rejection for iterators.

flatkv merge paths stop double-closing child iterators after failed NewMergingIterator (Pebble close is not idempotent).
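The ownership rule behind this fix can be sketched as follows, assuming (as the summary implies) that the merging constructor closes its children itself when it fails. `strictIter` and `newMerging` are hypothetical stand-ins, not the Pebble or flatkv types.

```go
package main

import (
	"errors"
	"fmt"
)

// strictIter is a hypothetical iterator whose Close is not idempotent.
type strictIter struct {
	name   string
	closed bool
}

// Close reports an error on the second call instead of silently succeeding.
func (s *strictIter) Close() error {
	if s.closed {
		return fmt.Errorf("%s: closed twice", s.name)
	}
	s.closed = true
	return nil
}

// newMerging is a hypothetical constructor that takes ownership of its
// children: on failure it closes them before returning the error.
func newMerging(children []*strictIter, fail bool) ([]*strictIter, error) {
	if fail {
		for _, c := range children {
			_ = c.Close()
		}
		return nil, errors.New("merge setup failed")
	}
	return children, nil
}

func main() {
	child := &strictIter{name: "storage"}
	if _, err := newMerging([]*strictIter{child}, true); err != nil {
		// Correct: return the error and leave the child alone,
		// because the constructor already closed it.
		fmt.Println("constructor failed:", err)
	}
	// Closing again here is the bug this sketch is meant to illustrate.
	fmt.Println("second close:", child.Close())
}
```

The point of the sketch is only that exactly one party closes each child; which party that is in the real code should be checked against the constructor's documentation.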

Tests add EVM oracle iteration during in-flight MigrateEVM, nil-bound and unknown-store behavior, and trim router iterator coverage.

Reviewed by Cursor Bugbot for commit 7b2314c. Bugbot is set up for automated code reviews on this repo.

@github-actions

github-actions Bot commented Jun 3, 2026


The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Jun 4, 2026, 3:27 PM |

@codecov

codecov Bot commented Jun 3, 2026


Codecov Report

❌ Patch coverage is 70.00000% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.92%. Comparing base (1a1b8cd) to head (7b2314c).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| sei-db/state_db/sc/composite/store.go | 46.42% | 11 Missing and 4 partials ⚠️ |

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #3544      +/-   ##
==========================================
- Coverage   59.12%   58.92%   -0.21%     
==========================================
  Files        2218     2191      -27     
  Lines      183132   181258    -1874     
==========================================
- Hits       108285   106802    -1483     
+ Misses      65082    64801     -281     
+ Partials     9765     9655     -110     

| Flag | Coverage Δ |
| --- | --- |
| sei-db | 70.41% <ø> (ø) |
| sei-db-state-db | ? |
| sei-db-state-db-pr | 75.91% <70.00%> (?) |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| sei-db/state_db/sc/flatkv/store_iteration.go | 66.50% <ø> (+0.63%) ⬆️ |
| sei-db/state_db/sc/memiavl/store.go | 92.59% <ø> (-0.22%) ⬇️ |
| sei-db/state_db/sc/migration/dual_write_router.go | 100.00% <100.00%> (ø) |
| sei-db/state_db/sc/migration/migration_manager.go | 95.32% <100.00%> (+0.68%) ⬆️ |
| sei-db/state_db/sc/migration/migration_types.go | 80.00% <ø> (ø) |
| sei-db/state_db/sc/migration/module_router.go | 100.00% <100.00%> (ø) |
| sei-db/state_db/sc/migration/passthrough_router.go | 100.00% <100.00%> (ø) |
| sei-db/state_db/sc/migration/router_builder.go | 55.70% <ø> (-2.13%) ⬇️ |
| sei-db/state_db/sc/migration/router_kvstore.go | 95.55% <100.00%> (+0.31%) ⬆️ |
| sei-db/state_db/sc/migration/thread_safe_router.go | 100.00% <ø> (ø) |

... and 1 more

... and 32 files with indirect coverage changes


@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.


❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 25d220c.

}
// The merged iterator reports the union of child domains; present the
// caller's logical [start, end) instead, per the dbm.Iterator contract.
return iterators.NewDomainIterator(merged, start, end)


Pre-load iteration hides unloaded memiavl

Medium Severity

composite.iterate calls memIAVL.Iterator directly and no longer applies the former IsLoaded guard from buildMemIAVLIteratorBuilder. When memiavl is not open yet, the backend returns a nil iterator that is skipped, so RouterCommitKVStore iteration can yield a valid but empty merged stream instead of failing loudly like reads and proofs still do during the state-sync pre-load window.

Contributor Author


Finding is not incorrect, but falls into the "will not fix" category. This is legacy weirdness from the composite store startup sequence. I hope to fix the startup sequence in future PRs, and touching that now is out of scope of the current change.

return nil, fmt.Errorf("iteration from the %q store is not permitted", migration.MigrationStore)
}

// flatkv is appended after memiavl so it is the rightmost (winning) child.

Contributor


Could we document that memiavl-before-flatkv is required for migration safety, not just value precedence? If a migration commit interleaves between constructing the two iterators, this order makes the worst case a duplicate key (which is deduplicated later); the reverse order could miss the key.
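A toy model of this argument, assuming each iterator sees a snapshot taken at construction time and that a migration commit moves a key from memiavl to flatkv atomically. All names here (`store`, `snapshot`, `migrate`, `observe`) are hypothetical, and the model says nothing about whether such interleaving is actually possible in the real code.

```go
package main

import "fmt"

// store is a toy backend modeled as a set of keys.
type store map[string]bool

// snapshot copies the store, standing in for iterator construction.
func snapshot(s store) store {
	out := make(store, len(s))
	for k := range s {
		out[k] = true
	}
	return out
}

// migrate moves one key from the old backend to the new one.
func migrate(from, to store, key string) {
	delete(from, key)
	to[key] = true
}

// observe builds the two snapshots with a migration commit in between and
// returns how many times key "k" is visible across both snapshots.
func observe(memiavlFirst bool) int {
	mem, flat := store{"k": true}, store{}
	first, second := mem, flat
	if !memiavlFirst {
		first, second = flat, mem
	}
	a := snapshot(first)
	migrate(mem, flat, "k") // interleaved commit
	b := snapshot(second)

	n := 0
	if a["k"] {
		n++
	}
	if b["k"] {
		n++
	}
	return n
}

func main() {
	fmt.Println("memiavl first:", observe(true))  // key seen twice (duplicate)
	fmt.Println("flatkv first: ", observe(false)) // key seen zero times (missed)
}
```

In this model a duplicate is recoverable because the merge step drops it, while a missed key is not.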

Contributor Author


Let's discuss this. IMO, interleaving migration creation with ApplyChangeSets() is unsafe no matter how we order these iterators. I'd like to understand if this is something that's currently possible. If so, we possibly need to tighten thread safety constraints.

Contributor Author


Discussed offline; will revisit this as an immediate follow-up item.

db "github.com/tendermint/tm-db"
)

// Route binds a set of module/store names to the database accessors

Contributor


Nit: this comment still mentions iteration, but iteration is no longer part of Route / ModuleRouter.

Contributor Author


Removed mentions of iteration.

@blindchaser

Contributor

sei-db/state_db/sc/migration/migration_test_framework_test.go — lines 57, 108, 141, 198

Nit: these Iterator methods are now dead code since Router no longer requires iteration. Consider deleting them to avoid implying the interface still includes it.

@cody-littley

Contributor Author

@blindchaser

> sei-db/state_db/sc/migration/migration_test_framework_test.go — lines 57, 108, 141, 198
>
> Nit: these Iterator methods are now dead code since Router no longer requires iteration. Consider deleting them to avoid implying the interface still includes it.

Removed dead methods.

@cody-littley cody-littley added this pull request to the merge queue Jun 4, 2026
@cody-littley cody-littley removed this pull request from the merge queue due to a manual request Jun 4, 2026
@cody-littley cody-littley enabled auto-merge June 4, 2026 15:26
@cody-littley cody-littley added this pull request to the merge queue Jun 4, 2026
Merged via the queue into main with commit 2fe02f0 Jun 4, 2026
55 checks passed
@cody-littley cody-littley deleted the cjl/composite-iteration branch June 4, 2026 16:01
blindchaser added a commit that referenced this pull request Jun 6, 2026
…lidation

Make x/evm/keeper.PruneZeroStorageSlots an early-return no-op when
DefaultConsensusPolicy().SkipAppHashValidation() is true (i.e. under
-tags mock_block_validation). Production binaries are unaffected; only
shadow binaries are altered.

Why: v3 (yiren/v6.5.0-flatkv-shadow-on-main, mock_block_validation build,
PR #3544 merging-iterator already present) crashed at block 212,079,313
with NextValidatorsHash mismatch the moment it transitioned out of
block-sync into consensus mode -- 2h40m after EVM migration completed.

Initial RCA blamed the (then-suspected) sketchy MM.Iterator() no-op
shadow, but a branch audit shows that commit is NOT in the build (lives
only on the abandoned yiren/v6.5.0-flatkv-from-shadow). The v3 build was
already running PR #3544's proper merged iterator, so PruneZeroStorageSlots
and RemoveFirstNTxHashes were observing real EVM state. Divergence comes
from somewhere else on the EVM EndBlock hot path
(x/evm/keeper/abci.go:84-160) and cascades via bank -> staking power ->
ValidatorUpdates over enough blocks that NextValidatorsHash diverges by
the time v3 enters consensus.

PruneZeroStorageSlots is the most-suspect single line on that path: it
iterates the entire EVM storage keyspace lazily, persists a checkpoint
in the EVM kvstore (which is itself migrating), and issues store.Delete
ops back through the migration router. Disabling it isolates whether
the cascade runs through this function:

  - If v4 NextValidatorsHash matches at first consensus block
    -> PruneZeroStorageSlots was the source.
  - If v4 still mismatches
    -> divergence is elsewhere (RemoveFirstNTxHashes / deferred coinbase
       transfers / surplus credit), iterate further.

Build-tag-implicit gating uses the same precedent as
05d5dfb (LastResultsHash pair-skip): zero touch to production binaries
since DefaultConsensusPolicy().SkipAppHashValidation() returns false in
the default build, and the compiler folds the branch away.

storage_cleanup_test.go gets //go:build !mock_block_validation so the
behavioral test only runs on production builds; under the shadow tag the
function is intentionally a no-op and the test would (correctly) fail.

NEVER deploy this binary as a validator or a public RPC.

Refs: STO-558, follow-up to v3 NextValidatorsHash recurrence.
Co-authored-by: Cursor <cursoragent@cursor.com>

2 participants