FlatKV refactor for state sync import + export #3250

Merged
yzang2019 merged 11 commits into main from yzang/flatkv-statesync on Apr 16, 2026

@yzang2019 yzang2019 commented Apr 15, 2026

Describe your changes and provide context

Refactor the FlatKV export/import pipeline to use raw physical keys and vtype-serialized values end-to-end, and teach the SS (State Store) composite importer to handle FlatKV snapshot data.

SC (State Commitment) Layer

FlatKV Exporter — rewritten to raw export

  • Replaced the old parsing/conversion exporter with a raw iterator approach: KVExporter now uses RawGlobalIterator() to emit physical key/value pairs directly from the DBs without any transformation
  • Removed all per-type conversion functions (accountToNodes, codeToNodes, storageToNodes, legacyToNodes, pendingNodes)

FlatKV Importer — channel-based concurrent pipeline

  • Rewrote KVImporter as a concurrent pipeline with per-DB worker goroutines (dbWorker) writing batches and computing LtHash in parallel
  • AddNode sends to a shared ingestCh; a dispatch goroutine routes keys to the correct DB worker via routePhysicalKey
  • Fail-fast error propagation using atomic.Pointer[error] and a done channel — AddNode and dispatch exit immediately on first error
  • New FinalizeImport(version) method persists LtHash and version metadata after the entire import completes
  • Fsync is always disabled during import for performance
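
The fail-fast behavior described above can be illustrated with a minimal sketch. It is not the PR's code: `failFast`, `send`, and `errWorker` are hypothetical names, and only the use of `atomic.Pointer[error]` plus a `done` channel comes from the description.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// errWorker is a placeholder error used only for this illustration.
var errWorker = errors.New("worker failed")

// failFast records the first error and closes done exactly once.
type failFast struct {
	firstErr atomic.Pointer[error]
	done     chan struct{}
}

func newFailFast() *failFast {
	return &failFast{done: make(chan struct{})}
}

// fail stores err only if no error was recorded yet, then wakes all waiters.
func (f *failFast) fail(err error) {
	if f.firstErr.CompareAndSwap(nil, &err) {
		close(f.done) // reached at most once because the CAS succeeds at most once
	}
}

// err returns the first recorded error, or nil if none was recorded.
func (f *failFast) err() error {
	if p := f.firstErr.Load(); p != nil {
		return *p
	}
	return nil
}

// send enqueues v unless done fires first, in which case it returns the error.
func (f *failFast) send(ch chan<- int, v int) error {
	select {
	case ch <- v:
		return nil
	case <-f.done:
		return f.err()
	}
}

func main() {
	f := newFailFast()
	f.fail(errWorker)
	f.fail(errors.New("ignored: a first error is already recorded"))
	// The channel has no reader, so only the done case can proceed.
	fmt.Println(f.send(make(chan int), 1))
}
```

Because the compare-and-swap can succeed only once, the same call that wins the race also owns closing `done`, so no separate guard against a double close is needed in this sketch.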

FlatKV RawGlobalIterator — simplified

  • Reimplemented as a sequentialIterator that iterates DBs in fixed order (EVM sub-DBs then legacyDB) using dataDBs()
  • Iterator errors are now checked and propagated (previously silently swallowed)
  • Removed EmptyIterator (dead code)
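
As a rough illustration of the sequential iteration and error propagation described above, here is a sketch over in-memory slices. The types `source`, `seqIter`, and `kv` are hypothetical stand-ins; the real `sequentialIterator` wraps DB iterators and may differ in shape.

```go
package main

import (
	"errors"
	"fmt"
)

// errCorrupt is a placeholder error for this illustration.
var errCorrupt = errors.New("corrupt data")

type kv struct{ key, value string }

// source is a hypothetical stand-in for the contents of one DB.
type source struct {
	pairs []kv
	err   error // reported once pairs are exhausted, if non-nil
}

// seqIter drains each source fully before moving to the next one.
type seqIter struct {
	sources []source
	si, pi  int
	err     error
}

// Next advances to the next pair. It returns false at the end or on error.
func (it *seqIter) Next() bool {
	for it.err == nil && it.si < len(it.sources) {
		s := it.sources[it.si]
		if it.pi < len(s.pairs) {
			it.pi++
			return true
		}
		if s.err != nil {
			it.err = s.err // propagate instead of silently moving on
			return false
		}
		it.si, it.pi = it.si+1, 0
	}
	return false
}

// Pair returns the current pair; it is valid only after Next returned true.
func (it *seqIter) Pair() kv { return it.sources[it.si].pairs[it.pi-1] }

// Err returns the first source error encountered, if any.
func (it *seqIter) Err() error { return it.err }

// keys drains the iterator and returns the keys seen plus the final error.
func keys(it *seqIter) ([]string, error) {
	var out []string
	for it.Next() {
		out = append(out, it.Pair().key)
	}
	return out, it.Err()
}

func main() {
	it := &seqIter{sources: []source{
		{pairs: []kv{{"a", "1"}, {"b", "2"}}},
		{pairs: []kv{{"c", "3"}}},
	}}
	fmt.Println(keys(it))
}
```

Callers are expected to check `Err()` after the loop ends, since a `false` from `Next()` alone does not distinguish a clean end from a failure.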

FlatKV Store refactoring

  • Added dataDBs() and namedDataDBs() helpers in store.go to centralize the list of data DBs, replacing inline DB lists throughout the codebase
  • Added routePhysicalKey(key) to route a physical key to the correct DB
  • Removed all memiavl references from FlatKV code and comments
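
The idea of centralizing the DB list and routing keys against it might look roughly like the sketch below. The DB names and the two-byte prefixes are invented for illustration; the actual physical key layout and the real `routePhysicalKey` signature are not shown in this PR description.

```go
package main

import (
	"bytes"
	"fmt"
)

// namedDB pairs a DB name with the key prefix it owns.
// Both the names and the prefixes are hypothetical.
type namedDB struct {
	name   string
	prefix []byte
}

// namedDataDBs is the single place that lists the data DBs, in fixed order.
func namedDataDBs() []namedDB {
	return []namedDB{
		{"account", []byte("a/")},
		{"code", []byte("c/")},
		{"storage", []byte("s/")},
		{"legacy", []byte("l/")},
	}
}

// routeKey returns the name of the DB that owns key, or an error if none does.
func routeKey(key []byte) (string, error) {
	for _, db := range namedDataDBs() {
		if bytes.HasPrefix(key, db.prefix) {
			return db.name, nil
		}
	}
	return "", fmt.Errorf("no DB owns key %q", key)
}

func main() {
	fmt.Println(routeKey([]byte("s/slot-1")))
	fmt.Println(routeKey([]byte("x/unknown")))
}
```

Keeping the list in one helper means iteration order and routing cannot drift apart when a DB is added.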

Composite SC Store

  • Fixed flatkvCommiter typo → flatkvCommitter (used ~10 times)
  • Moved FlatKVExportModuleName to common/keys/keys.go as FlatKVStoreKey
  • Fixed composite importer routing bug: FlatKV nodes now return early if flatkvImporter is nil instead of falling through to Cosmos importer
  • Fixed error message: "failed to create evm importer" → "failed to create flatkv importer"
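
The routing fix above can be sketched as follows, assuming a simplified importer. `recorder`, `composite`, and the module name `"flatkv"` are placeholders; the real value of `keys.FlatKVStoreKey` is not stated here.

```go
package main

import "fmt"

// flatKVModule is a placeholder module name, not the real FlatKVStoreKey value.
const flatKVModule = "flatkv"

// recorder is a toy importer that remembers the keys it was given.
type recorder struct{ keys []string }

func (r *recorder) add(key string) { r.keys = append(r.keys, key) }

// composite routes nodes to one of two importers; flatkv may be nil.
type composite struct {
	flatkv *recorder
	cosmos *recorder
}

// addNode sends FlatKV nodes only to the FlatKV importer.
func (c *composite) addNode(module, key string) {
	if module == flatKVModule {
		if c.flatkv == nil {
			return // drop the node rather than falling through to cosmos
		}
		c.flatkv.add(key)
		return
	}
	c.cosmos.add(key)
}

func main() {
	c := &composite{cosmos: &recorder{}}
	c.addNode(flatKVModule, "physical-key")
	c.addNode("bank", "balance-key")
	fmt.Println(c.cosmos.keys)
}
```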

SS (State Store) Layer

SS Composite Import — rewritten and simplified

  • Rewrote Import from ~135 lines with two separate code paths into a unified ~75-line function:
    • Single routing loop for all write modes (DualWrite, SplitWrite, CosmosOnlyWrite, no-EVM)
    • evmCh is only created when EVM store is available and active
    • Lightweight done channel + sync.Once pattern replaces complex drainImportErr/sendNode retry closures
  • Added convertFlatKVNodes() function that transforms FlatKV physical-key snapshot nodes into SS-compatible format:
    • Strips module prefix (ktype.StripModulePrefix) and parses key kind (keys.ParseEVMKey)
    • Account keys: deserializes vtype.AccountData, emits separate nonce (8-byte BE) and codeHash (32-byte) nodes; skips zero-nonce and zero-codeHash (EOA accounts)
    • Storage/Code keys: deserializes the corresponding vtype and emits raw value with StoreKey = "evm"
    • Legacy keys: deserializes vtype.LegacyData and preserves the original module name as StoreKey so keys route back to the correct Cosmos SS module (e.g. "bank", "staking")
  • Updated import routing: when EVM SS is enabled, EVM keys go exclusively to EVM store, non-EVM keys go to Cosmos store
  • Removed unused normalizeSnapshotNode function and splitWrite variable from Import
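
For the account branch of `convertFlatKVNodes()`, a minimal sketch of the described output format (8-byte big-endian nonce, 32-byte codeHash, zero values skipped) is shown below. The `node` and `account` types and the `nonce/` and `codehash/` key layout are assumptions made for the example; they do not reflect the real `vtype.AccountData` or SS key encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// node is a simplified stand-in for a snapshot node.
type node struct {
	StoreKey string
	Key      string
	Value    []byte
}

// account is a hypothetical decoded form of an account value.
type account struct {
	Nonce    uint64
	CodeHash [32]byte
}

// accountToNodes emits a nonce node and a codeHash node, skipping zero values.
func accountToNodes(addr string, a account) []node {
	var nodes []node
	if a.Nonce != 0 {
		buf := make([]byte, 8)
		binary.BigEndian.PutUint64(buf, a.Nonce)
		nodes = append(nodes, node{StoreKey: "evm", Key: "nonce/" + addr, Value: buf})
	}
	if a.CodeHash != ([32]byte{}) {
		nodes = append(nodes, node{StoreKey: "evm", Key: "codehash/" + addr, Value: a.CodeHash[:]})
	}
	return nodes
}

func main() {
	a := account{Nonce: 7}
	a.CodeHash[0] = 0xAB
	for _, n := range accountToNodes("0xabc", a) {
		fmt.Printf("%s %s %x\n", n.StoreKey, n.Key, n.Value)
	}
}
```

This mirrors the behavior as written in the description; an account with both fields zero produces no nodes at all in this sketch.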

Config & Misc

  • Fixed config validation error messages: ReaderConstantThreadCount / ReaderPoolQueueSize / MiscConstantThreadCount now say "must not be negative" (previously said "must be greater than 0" while allowing 0)
  • Fixed duplicate "for" typo in api.go comment
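
A small sketch of validation whose message matches the condition it checks, assuming the three fields are plain `int` values (the real config types are not shown in this PR description):

```go
package main

import "fmt"

// config holds the three settings named in the description; int types are assumed.
type config struct {
	ReaderConstantThreadCount int
	ReaderPoolQueueSize       int
	MiscConstantThreadCount   int
}

// validate allows zero and rejects only negative values.
func (c config) validate() error {
	checks := []struct {
		name  string
		value int
	}{
		{"ReaderConstantThreadCount", c.ReaderConstantThreadCount},
		{"ReaderPoolQueueSize", c.ReaderPoolQueueSize},
		{"MiscConstantThreadCount", c.MiscConstantThreadCount},
	}
	for _, ch := range checks {
		if ch.value < 0 {
			return fmt.Errorf("%s must not be negative, got %d", ch.name, ch.value)
		}
	}
	return nil
}

func main() {
	fmt.Println(config{}.validate())
	fmt.Println(config{ReaderPoolQueueSize: -1}.validate())
}
```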

Testing performed to validate your change

  • ss/composite/store_test.go: Updated TestImport_OnlyEvmModule to verify EVM-only routing (no cosmos duplication); rewrote TestImport_OnlyEvmFlatkvModule and TestImport_BothEvmAndEvmFlatkv with proper FlatKV physical keys and vtype-serialized values; rewrote TestImport_CosmosOnlyWrite_ConvertsFlatkvToCosmos to verify conversion in no-EVM path
  • sc/flatkv/import_export_test.go: Updated round-trip tests, exporter tests for raw key/value output, importer tests for physical key routing, corrupt data propagation tests
  • sc/composite/store_test.go: Added RawGlobalIterator() to mock, updated all flatkvCommitter references, replaced FlatKVExportModuleName with keys.FlatKVStoreKey
  • sc/flatkv/store_write_test.go, store_read_test.go, snapshot_test.go, perdb_lthash_test.go, lthash_correctness_test.go: Fixed compile errors, updated to use centralized DB helpers and physical key formats

github-actions bot commented Apr 15, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

Build      Format     Lint       Breaking   Updated (UTC)
✅ passed  ✅ passed  ✅ passed  ✅ passed  Apr 16, 2026, 8:38 PM

codecov bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 82.28477% with 107 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.32%. Comparing base (69fa014) to head (3952eea).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
sei-db/state_db/sc/flatkv/store_iterator.go 63.63% 19 Missing and 9 partials ⚠️
sei-db/state_db/ss/composite/store.go 76.53% 16 Missing and 7 partials ⚠️
sei-db/state_db/sc/flatkv/importer.go 84.11% 12 Missing and 5 partials ⚠️
sei-db/state_db/sc/composite/store.go 65.62% 6 Missing and 5 partials ⚠️
sei-db/state_db/sc/flatkv/store_write.go 75.75% 5 Missing and 3 partials ⚠️
sei-db/state_db/sc/flatkv/exporter.go 80.00% 2 Missing and 2 partials ⚠️
sei-db/state_db/sc/flatkv/store_meta.go 63.63% 0 Missing and 4 partials ⚠️
sei-db/state_db/sc/flatkv/store_read.go 75.00% 4 Missing ⚠️
sei-db/state_db/sc/composite/exporter.go 78.57% 2 Missing and 1 partial ⚠️
sei-db/state_db/sc/flatkv/ktype/ktype.go 70.00% 2 Missing and 1 partial ⚠️
... and 1 more

@@            Coverage Diff             @@
##             main    #3250      +/-   ##
==========================================
+ Coverage   59.29%   59.32%   +0.02%     
==========================================
  Files        2070     2076       +6     
  Lines      169782   170033     +251     
==========================================
+ Hits       100670   100865     +195     
- Misses      60327    60357      +30     
- Partials     8785     8811      +26     
Flag Coverage Δ
sei-chain-pr 69.73% <82.28%> (?)
sei-db 70.41% <ø> (ø)

Files with missing lines Coverage Δ
sei-db/common/keys/evm.go 80.00% <100.00%> (ø)
sei-db/config/sc_config.go 100.00% <100.00%> (ø)
sei-db/state_db/sc/composite/importer.go 92.59% <100.00%> (ø)
sei-db/state_db/sc/flatkv/config/config.go 74.19% <100.00%> (ø)
...db/state_db/sc/flatkv/config/flatkv_test_config.go 100.00% <100.00%> (ø)
sei-db/state_db/sc/flatkv/ktype/meta.go 100.00% <100.00%> (ø)
sei-db/state_db/sc/flatkv/snapshot.go 66.56% <100.00%> (ø)
sei-db/state_db/sc/flatkv/store_apply.go 88.50% <100.00%> (ø)
sei-db/state_db/sc/flatkv/store_lifecycle.go 59.25% <100.00%> (ø)
sei-db/state_db/sc/flatkv/test_helper.go 100.00% <100.00%> (ø)
... and 15 more

... and 40 files with indirect coverage changes

// buffer reaches importBatchSize. If done fires, the worker abandons
// remaining work and exits immediately.
func (w *dbWorker) run(done <-chan struct{}) error {
for {
Contributor

Consider adding

defer func() {
        if w.batch != nil {
            _ = w.batch.Close()
        }
}()

to avoid a resource leak in some edge cases.
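
To show why a deferred close helps on early-exit paths, here is a self-contained sketch. `batch`, `worker`, and `errWrite` are toy stand-ins, not the real `dbWorker` or its batch type.

```go
package main

import (
	"errors"
	"fmt"
)

// errWrite is a placeholder error for this illustration.
var errWrite = errors.New("write failed")

// batch is a toy resource that counts how many times it was closed.
type batch struct{ closed *int }

func (b *batch) Close() error {
	*b.closed++
	return nil
}

type worker struct {
	batch          *batch
	opened, closed int
}

// run processes items and returns early on the first failing item.
// The deferred cleanup closes any batch still open on every exit path.
func (w *worker) run(items []int, failAt int) error {
	defer func() {
		if w.batch != nil {
			_ = w.batch.Close()
			w.batch = nil
		}
	}()
	for _, it := range items {
		if w.batch == nil {
			w.batch = &batch{closed: &w.closed}
			w.opened++
		}
		if it == failAt {
			return errWrite // early exit: the batch is still open here
		}
	}
	return nil
}

func main() {
	w := &worker{}
	err := w.run([]int{1, 2, 3}, 2)
	fmt.Println(err, w.opened, w.closed)
}
```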

Comment thread sei-db/state_db/ss/composite/store.go Outdated
return nil, fmt.Errorf("convertFlatKVNodes account: %w", err)
}
var nodes []types.SnapshotNode
if nonce := acct.GetNonce(); nonce != 0 {
Contributor

When nonce == 0 and codeHash is zero (e.g., an EOA that has received Sei but never sent a transaction), the SS store should still keep the account, right?

@Kbhat1 @jewei1997

Contributor Author

@yzang2019 yzang2019 Apr 15, 2026

What should we store in SS in this case? Just the account with nonce=0? Agreed, it's a bit tricky here.

Contributor

I'm thinking we should use IsDelete() in vtype/account_data.go to distinguish an existing account with nonce == 0 from a non-existent or deleted one, and let that decide whether we put it in SS.

Contributor Author

Makes sense, will make a change.

return nil
}

newHash, _ := lthash.ComputeLtHash(w.ltHash, w.ltPairs)
Contributor

In theory, we could offload lattice hash calculation to a work pool and get parallelism between DB operations and hash calculations. Cryptosim performance makes me think we could probably get a 2-3x speedup from this, assuming receiving data from the network isn't the bottleneck.

Writing this so we remember it in the future. Not a blocker for this PR (let's not prematurely optimize until we have benchmarks and actionable data).

Contributor Author

That's true, good point. Let me either add a TODO or just optimize it in this PR.
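
A rough sketch of the worker-pool idea raised in this thread follows. It is not the LtHash implementation: `itemHash` uses FNV-64 purely as a stand-in, and the only property borrowed from the discussion is that per-batch results combine by addition, so batches can be hashed on separate goroutines and summed in any order. All names here are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// itemHash is a stand-in for a per-pair hash; a real lattice hash is far wider.
func itemHash(pair string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(pair))
	return h.Sum64()
}

// sequentialSum is the single-goroutine baseline.
func sequentialSum(batches [][]string) uint64 {
	var total uint64
	for _, b := range batches {
		for _, pair := range b {
			total += itemHash(pair)
		}
	}
	return total
}

// parallelSum hashes batches on a pool of goroutines and adds the partial sums.
// Addition is commutative, so the order in which batches finish does not matter.
func parallelSum(batches [][]string, workers int) uint64 {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan []string)
	results := make(chan uint64)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for b := range jobs {
				var partial uint64
				for _, pair := range b {
					partial += itemHash(pair)
				}
				results <- partial
			}
		}()
	}
	go func() {
		for _, b := range batches {
			jobs <- b
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()
	var total uint64
	for partial := range results {
		total += partial
	}
	return total
}

func main() {
	batches := [][]string{{"a=1", "b=2"}, {"c=3"}, {"d=4", "e=5"}}
	fmt.Println(parallelSum(batches, 4) == sequentialSum(batches))
}
```

Whether this pays off in practice depends on benchmarks, as the thread notes; the sketch only shows that the combination step does not force serialization.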

Comment on lines +15 to +18

// sequentialIterator iterates through a slice of DBs one at a time.
// It fully drains the current DB before moving to the next.
type sequentialIterator struct {
Contributor

Nit, I think putting these before structs that implement an interface makes the code easier to grok.

Suggested change
- // sequentialIterator iterates through a slice of DBs one at a time.
- // It fully drains the current DB before moving to the next.
- type sequentialIterator struct {
+ var _ Iterator = (*sequentialIterator)(nil)
+ // sequentialIterator iterates through a slice of DBs one at a time.
+ // It fully drains the current DB before moving to the next.
+ type sequentialIterator struct {

Contributor Author

Makes sense

Contributor Author

Fixed
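
For readers unfamiliar with the idiom suggested in this thread, the line `var _ Iterator = (*T)(nil)` is a compile-time check that `*T` satisfies `Iterator`. A minimal sketch with a hypothetical two-method interface and a toy `sliceIter` type (the real `Iterator` interface has a different method set):

```go
package main

import "fmt"

// Iterator is a simplified, hypothetical interface for this example.
type Iterator interface {
	Next() bool
	Key() string
}

// Compile-time assertion: the build fails if *sliceIter stops implementing Iterator.
var _ Iterator = (*sliceIter)(nil)

// sliceIter walks a slice of keys.
type sliceIter struct {
	keys []string
	pos  int
}

func (s *sliceIter) Next() bool {
	if s.pos >= len(s.keys) {
		return false
	}
	s.pos++
	return true
}

func (s *sliceIter) Key() string { return s.keys[s.pos-1] }

func main() {
	var it Iterator = &sliceIter{keys: []string{"k1", "k2"}}
	for it.Next() {
		fmt.Println(it.Key())
	}
}
```

The assertion allocates nothing at run time; it only makes a missing or mistyped method show up at the type definition instead of at a distant call site.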

Comment thread sei-db/state_db/ss/composite/store.go Outdated
Comment on lines +332 to +333
default:
return nil, nil
Contributor

Should this return an error? If we get an unrecognized key type here, that's a bug, right?

Contributor

Yeah, +1. I think we can return a new error here @yzang2019

Contributor Author

Agreed, returning an error now
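
The change agreed on in this thread, turning a silent `default: return nil, nil` into an error, can be sketched as below. `keyKind` and its constants are hypothetical; the real key kinds come from `keys.ParseEVMKey` and are not listed here.

```go
package main

import "fmt"

// keyKind is a hypothetical enumeration of parsed key kinds.
type keyKind int

const (
	kindAccount keyKind = iota
	kindStorage
	kindCode
	kindLegacy
)

// storeKeyFor picks the destination store key and rejects unknown kinds.
func storeKeyFor(kind keyKind, module string) (string, error) {
	switch kind {
	case kindAccount, kindStorage, kindCode:
		return "evm", nil
	case kindLegacy:
		return module, nil // legacy keys route back to their original module
	default:
		// An unknown kind indicates a bug or corrupt data, so surface it.
		return "", fmt.Errorf("unrecognized key kind %d", kind)
	}
}

func main() {
	fmt.Println(storeKeyFor(kindLegacy, "bank"))
	fmt.Println(storeKeyFor(keyKind(99), "bank"))
}
```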

* main:
  Fix buffer offset in ProposerPriorityHash (CON-200) (#3255)
  Handle error case in light client divergence detector (#3254)
  perf(evmrpc): eliminate redundant block fetches in simulate backend (#3208)
  fix(evmrpc): omit notifications from legacy JSON-RPC batch responses per spec (#3246)
  fix: deduplicate block fetch in getTransactionReceipt (#3244)
@yzang2019 yzang2019 enabled auto-merge April 16, 2026 20:28
@yzang2019 yzang2019 added this pull request to the merge queue Apr 16, 2026
Merged via the queue into main with commit b57aeb8 Apr 16, 2026
39 checks passed
@yzang2019 yzang2019 deleted the yzang/flatkv-statesync branch April 16, 2026 21:05
blindchaser pushed a commit that referenced this pull request Apr 17, 2026
Kbhat1 added a commit that referenced this pull request Apr 18, 2026
Two changes that together eliminate the need for SS-path iteration on
EVM data:

1. composite.Import now sends evm snapshot nodes to BOTH cosmos and evm
   under DualWrite (previously mutually-exclusive after #3250). This
   restores the state-sync-time safety net that made mainnet DualWrite
   state-synced nodes surface pointer lookups via cosmos even while the
   iterator routing bug was present.

2. Pointer registry collapsed to single-key: one record per pointee,
   with the uint16 version packed into the value instead of the key
   suffix. GetPointerInfo/GetAnyPointerInfo become single Gets — the
   only hot-path EVM iterator is gone. Migration handlers updated to
   read the version-prefixed value and drop the now-unneeded dedup map.
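
   As an illustration of packing a `uint16` version into the value, here is a minimal sketch. The 2-byte big-endian prefix and the helper names `packPointer` and `unpackPointer` are assumptions; the commit message only says the value is version-prefixed.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// errShortValue is a placeholder error for this illustration.
var errShortValue = errors.New("value shorter than the 2-byte version prefix")

// packPointer prepends the version to the pointer bytes (big-endian is assumed).
func packPointer(version uint16, pointer []byte) []byte {
	out := make([]byte, 2+len(pointer))
	binary.BigEndian.PutUint16(out, version)
	copy(out[2:], pointer)
	return out
}

// unpackPointer splits a stored value back into version and pointer bytes.
func unpackPointer(value []byte) (uint16, []byte, error) {
	if len(value) < 2 {
		return 0, nil, errShortValue
	}
	return binary.BigEndian.Uint16(value), value[2:], nil
}

func main() {
	stored := packPointer(3, []byte("pointer-address"))
	version, pointer, err := unpackPointer(stored)
	fmt.Println(version, string(pointer), err)
}
```

   With one record per pointee, a lookup is a single Get followed by this kind of split, with no iteration over key suffixes.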

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>