composite store iteration#3544
Conversation
PR SummaryMedium Risk Overview Migration routers no longer implement flatkv merge paths stop double-closing child iterators after failed Tests add EVM oracle iteration during in-flight MigrateEVM, nil-bound and unknown-store behavior, and trim router iterator coverage. Reviewed by Cursor Bugbot for commit 7b2314c. Bugbot is set up for automated code reviews on this repo. Configure here. |
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3544 +/- ##
==========================================
- Coverage 59.12% 58.92% -0.21%
==========================================
Files 2218 2191 -27
Lines 183132 181258 -1874
==========================================
- Hits 108285 106802 -1483
+ Misses 65082 64801 -281
+ Partials 9765 9655 -110
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 25d220c. Configure here.
| } | ||
| // The merged iterator reports the union of child domains; present the | ||
| // caller's logical [start, end) instead, per the dbm.Iterator contract. | ||
| return iterators.NewDomainIterator(merged, start, end) |
There was a problem hiding this comment.
Pre-load iteration hides unloaded memiavl
Medium Severity
composite.iterate calls memIAVL.Iterator directly and no longer applies the former IsLoaded guard from buildMemIAVLIteratorBuilder. When memiavl is not open yet, the backend returns a nil iterator that is skipped, so RouterCommitKVStore iteration can yield a valid but empty merged stream instead of failing loudly like reads and proofs still do during the state-sync pre-load window.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 25d220c. Configure here.
There was a problem hiding this comment.
Finding is not incorrect, but falls into the "will not fix" category. This is legacy weirdness from the composite store startup sequence. I hope to fix the startup sequence in future PRs, and touching that now is out of scope of the current change.
| return nil, fmt.Errorf("iteration from the %q store is not permitted", migration.MigrationStore) | ||
| } | ||
|
|
||
| // flatkv is appended after memiavl so it is the rightmost (winning) child. |
There was a problem hiding this comment.
could we document that memiavl-before-flatkv is required for migration safety, not just value precedence? If a migration commit interleaves between constructing the two iterators, this order makes the worst case a duplicate key(which will be dedup later) ; the reverse order could miss the key.
There was a problem hiding this comment.
Let's discuss this. IMO, interleaving migration creation with ApplyChangeSets() is unsafe no matter how we order these iterators. I'd like to understand if this is something that's currently possible. If so, we possibly need to tighten thread safety constraints.
There was a problem hiding this comment.
Discussed offline, will revisit this as an immediate follow up item.
| db "github.com/tendermint/tm-db" | ||
| ) | ||
|
|
||
| // Route binds a set of module/store names to the database accessors |
There was a problem hiding this comment.
Nit: this comment still mentions iteration, but iteration is no longer part of Route / ModuleRouter
There was a problem hiding this comment.
removed mentions of iteration
|
sei-db/state_db/sc/migration/migration_test_framework_test.go — lines 57, 108, 141, 198 Nit: these Iterator methods are now dead code since Router no longer requires iteration. Consider deleting them to avoid implying the interface still includes it. |
Removed dead methods. |
…lidation Make x/evm/keeper.PruneZeroStorageSlots an early-return no-op when DefaultConsensusPolicy().SkipAppHashValidation() is true (i.e. under -tags mock_block_validation). Production binaries are unaffected; only shadow binaries are altered. Why: v3 (yiren/v6.5.0-flatkv-shadow-on-main, mock_block_validation build, PR #3544 merging-iterator already present) crashed at block 212,079,313 with NextValidatorsHash mismatch the moment it transitioned out of block-sync into consensus mode -- 2h40m after EVM migration completed. Initial RCA blamed the (then-suspected) sketchy MM.Iterator() no-op shadow, but a branch audit shows that commit is NOT in the build (lives only on the abandoned yiren/v6.5.0-flatkv-from-shadow). The v3 build was already running PR #3544's proper merged iterator, so PruneZeroStorageSlots and RemoveFirstNTxHashes were observing real EVM state. Divergence comes from somewhere else on the EVM EndBlock hot path (x/evm/keeper/abci.go:84-160) and cascades via bank -> staking power -> ValidatorUpdates over enough blocks that NextValidatorsHash diverges by the time v3 enters consensus. PruneZeroStorageSlots is the most-suspect single line on that path: it iterates the entire EVM storage keyspace lazily, persists a checkpoint in the EVM kvstore (which is itself migrating), and issues store.Delete ops back through the migration router. Disabling it isolates whether the cascade runs through this function: - If v4 NextValidatorsHash matches at first consensus block -> PruneZeroStorageSlots was the source. - If v4 still mismatches -> divergence is elsewhere (RemoveFirstNTxHashes / deferred coinbase transfers / surplus credit), iterate further. Build-tag-implicit gating uses the same precedent as 05d5dfb (LastResultsHash pair-skip): zero touch to production binaries since DefaultConsensusPolicy().SkipAppHashValidation() returns false in the default build, and the compiler folds the branch away. storage_cleanup_test.go gets //go:build !mock_block_validation so the behavioral test only runs on production builds; under the shadow tag the function is intentionally a no-op and the test would (correctly) fail. NEVER deploy this binary as a validator or a public RPC. Refs: STO-558, follow-up to v3 NextValidatorsHash recurrence. Co-authored-by: Cursor <cursoragent@cursor.com>


Describe your changes and provide context
It's now possible to iterate data in the composite store, even if that data is in memIAVL
Testing performed to validate your change
unit tests