feat(flatkv): implements Account Row GC for FlatKV accountDB#3084
Conversation
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f847bce5f9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| for _, paw := range s.accountWrites { | ||
| key := AccountKey(paw.addr) | ||
| encoded := EncodeAccountValue(paw.value) | ||
| if err := batch.Set(key, encoded); err != nil { | ||
| return fmt.Errorf("accountDB set: %w", err) | ||
| } | ||
| } | ||
|
|
||
| // Update local meta atomically with data (same batch) | ||
| newLocalMeta := &LocalMeta{ | ||
| CommittedVersion: version, | ||
| } | ||
| if err := batch.Set(DBLocalMetaKey, MarshalLocalMeta(newLocalMeta)); err != nil { | ||
| return fmt.Errorf("accountDB local meta set: %w", err) | ||
| } | ||
| pending = append(pending, pendingCommit{accountDBDir, batch}) | ||
| } | ||
|
|
||
| // Commit to codeDB | ||
| if len(s.codeWrites) > 0 || version > s.localMeta[codeDBDir].CommittedVersion { | ||
| s.phaseTimer.SetPhase("commit_code_db_prepare") | ||
| batch := s.codeDB.NewBatch() | ||
| defer func() { _ = batch.Close() }() | ||
|
|
||
| for _, pw := range s.codeWrites { | ||
| if pw.isDelete { | ||
| if err := batch.Delete(pw.key); err != nil { | ||
| return fmt.Errorf("codeDB delete: %w", err) | ||
| if paw.isDelete { | ||
| if err := batch.Delete(key); err != nil { |
There was a problem hiding this comment.
Add a migration for pre-existing all-zero account rows
Prior FlatKV versions persisted account deletes as 40-byte zero rows, and this branch only removes rows that are present in s.accountWrites during a new commit. Because LoadVersion/WAL catchup never sweep already-committed accountDB entries, an in-place upgrade leaves those historical tombstones intact forever: Get(nonce) still returns found=true for old deletions, and the upgraded node keeps a different LtHash/root hash than a node rebuilt from history under the new deletion rules. This needs an upgrade-time cleanup or compatibility read path, not just the new batch.Delete on freshly touched addresses.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
👎
There is no pre-existing state in production environments in FlatKV, so no extra migration needed.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3084 +/- ##
==========================================
- Coverage 58.59% 58.55% -0.05%
==========================================
Files 2096 2090 -6
Lines 173408 171845 -1563
==========================================
- Hits 101610 100621 -989
+ Misses 62753 62287 -466
+ Partials 9045 8937 -108
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
cody-littley
left a comment
There was a problem hiding this comment.
Overall, this looks to achieve the goals described in the description, so I'm a 👍 on merging.
In the slightly longer term, I think it might be worth thinking about whether this code can be generalized. For example, do we also want to do deletions in the other DBs if we see all-zero values? If we generalized, it would probably look something like this:
- the pebbleDB wrapper checks each value to see if it is all zeros, and instead does a deletion if it sees this
- the cache layer treats values that are all zeros as deleted values
- when passing a changeset to the part of the code that computes lattice hashes, it treats key deletions equivalently to setting the value to all zeros (i.e. both are an equivalent deletion)
| for _, paw := range s.accountWrites { | ||
| key := AccountKey(paw.addr) | ||
| encoded := EncodeAccountValue(paw.value) | ||
| if err := batch.Set(key, encoded); err != nil { | ||
| return fmt.Errorf("accountDB set: %w", err) | ||
| } | ||
| } | ||
|
|
||
| // Update local meta atomically with data (same batch) | ||
| newLocalMeta := &LocalMeta{ | ||
| CommittedVersion: version, | ||
| } | ||
| if err := batch.Set(DBLocalMetaKey, MarshalLocalMeta(newLocalMeta)); err != nil { | ||
| return fmt.Errorf("accountDB local meta set: %w", err) | ||
| } | ||
| pending = append(pending, pendingCommit{accountDBDir, batch}) | ||
| } | ||
|
|
||
| // Commit to codeDB | ||
| if len(s.codeWrites) > 0 || version > s.localMeta[codeDBDir].CommittedVersion { | ||
| s.phaseTimer.SetPhase("commit_code_db_prepare") | ||
| batch := s.codeDB.NewBatch() | ||
| defer func() { _ = batch.Close() }() | ||
|
|
||
| for _, pw := range s.codeWrites { | ||
| if pw.isDelete { | ||
| if err := batch.Delete(pw.key); err != nil { | ||
| return fmt.Errorf("codeDB delete: %w", err) | ||
| if paw.isDelete { | ||
| if err := batch.Delete(key); err != nil { |
There was a problem hiding this comment.
👎
There is no pre-existing state in production environments in FlatKV, so no extra migration needed.
## Summary Implements Account Row GC for FlatKV `accountDB`. Field deletes for nonce and codehash now collapse to a physical row delete when the full `AccountValue` becomes zero, so pending reads, committed state, LtHash updates, WAL replay, and rollback all treat an all-zero account as absent rather than as a persisted zero-value row. - `keys.go`: Replaces field-specific account helpers with `AccountValue.IsEmpty()`, which enforces the row-deletion invariant across balance, nonce, and codehash. Simplifies `EncodeAccountValue()` to write directly into a fixed-size output buffer without changing the 40-byte EOA / 72-byte contract encoding contract. - `store.go`: Extends `pendingAccountWrite` with `isDelete`, making row deletion an explicit pending-state transition rather than an implicit encoding convention. - `store_read.go`: `Get()` now returns `(nil, false)` for pending or committed all-zero account rows, aligning pre-commit reads with post-commit storage. Consolidates storage/code/legacy read paths through a shared KV helper while preserving delete-vs-not-found behavior. - `store_write.go`: `ApplyChangeSets()` flips `isDelete` on full account clears and clears it on any subsequent write, which guarantees delete-then-recreate ordering within the same block. LtHash baselines capture `nil` once a row is logically deleted, preventing phantom MixOut during cross-apply recreation. `commitBatches()` now issues `batch.Delete()` for deleted account rows and shares local-meta encoding across all DB batch builders. The combined LtHash pair slice is preallocated to avoid append aliasing across DB-specific pair lists. ## Test plan `lthash_correctness_test.go`: - **AccountRowDelete**: Full nonce+codehash delete removes the row and keeps LtHash equal to a full scan. - **AccountDeleteThenRecreate**: Delete in one `ApplyChangeSets`, recreate in a later `ApplyChangeSets` in the same block, and verify LtHash uses `nil` as the baseline after the logical delete. - **AccountPartialDeletePreservesRow**: Delete only codehash and verify the row remains encoded as an EOA. - **AccountPendingReadPartialDelete**: Confirms partial deletes keep `isDelete=false` and pending reads still expose surviving fields. - **AccountRowDeleteGetBeforeCommit**: Confirms full deletes return not-found from `Get()` and `Has()` before commit. `store_write_test.go`: - Updates existing delete-semantics and cross-apply ordering tests to assert row absence rather than persisted zero nonce rows. - **AccountRowDeletedWhenAllFieldsZero**: End-to-end row GC after commit. - **AccountRowPersistsWhenPartiallyZero**: Partial field zeroing preserves the row. - **AccountRowDeleteThenRecreate**: Delete in one block and recreate in a later block. `snapshot_test.go`: - Updates reopen expectations so deleted all-zero accounts remain absent after restart. - **AccountRowDeletePersistsAfterReopen**: Reopen preserves both deletion and LtHash. - **AccountRowDeleteSurvivesWALReplay**: Simulates a crash after per-DB writes but before the global watermark advances, then verifies WAL catchup replays the delete correctly. - **AccountRowDeleteAfterSnapshotRollback**: Rolling back to a pre-delete snapshot restores the row. `keys_test.go`: - Replaces removed helper coverage with **AccountValueIsEmpty**, covering zero, partial, and fully populated account values.
Summary
Implements Account Row GC for FlatKV
accountDB. Field deletes for nonce and codehash now collapse to a physical row delete when the fullAccountValuebecomes zero, so pending reads, committed state, LtHash updates, WAL replay, and rollback all treat an all-zero account as absent rather than as a persisted zero-value row.keys.go: Replaces field-specific account helpers withAccountValue.IsEmpty(), which enforces the row-deletion invariant across balance, nonce, and codehash. SimplifiesEncodeAccountValue()to write directly into a fixed-size output buffer without changing the 40-byte EOA / 72-byte contract encoding contract.store.go: ExtendspendingAccountWritewithisDelete, making row deletion an explicit pending-state transition rather than an implicit encoding convention.store_read.go:Get()now returns(nil, false)for pending or committed all-zero account rows, aligning pre-commit reads with post-commit storage. Consolidates storage/code/legacy read paths through a shared KV helper while preserving delete-vs-not-found behavior.store_write.go:ApplyChangeSets()flipsisDeleteon full account clears and clears it on any subsequent write, which guarantees delete-then-recreate ordering within the same block. LtHash baselines capturenilonce a row is logically deleted, preventing phantom MixOut during cross-apply recreation.commitBatches()now issuesbatch.Delete()for deleted account rows and shares local-meta encoding across all DB batch builders. The combined LtHash pair slice is preallocated to avoid append aliasing across DB-specific pair lists.Test plan
lthash_correctness_test.go:ApplyChangeSets, recreate in a laterApplyChangeSetsin the same block, and verify LtHash usesnilas the baseline after the logical delete.isDelete=falseand pending reads still expose surviving fields.Get()andHas()before commit.store_write_test.go:snapshot_test.go:keys_test.go: