db/state: prune TemporalMemBatch overlay entries past unwindToTxNum#20625
Merged
db/state: prune TemporalMemBatch overlay entries past unwindToTxNum#20625
Conversation
TemporalMemBatch tracks per-key overlay writes as []dataWithTxNum with
each write stamped by txNum. getLatest returns dataWithTxNums[len-1]
— the most recently appended entry — without comparing it against
sd.unwindToTxNum. Unwind() only recorded unwindToTxNum and an
unwindChangeset; it never touched sd.domains / sd.storage. Since
unwindChangeset is consulted only when the overlay misses, any key
still present in the overlay would keep returning a pre-unwind
write made inside the unwound txNum range.
Observed on post-Fusaka mainnet catch-up: after a forkchoice-driven
unwind, the first re-executed block reads a storage slot that was
first-written inside the unwound range. The overlay returns the
post-target write, flipping the SSTORE cost from SET (20000) to
RESET (2900) — exactly a 17100-gas shortfall per affected slot.
On mainnet block 24899403 this compounded to diff=-73829 across
several slots; block 24898955 showed diff=-34200.
Fix: on Unwind, walk sd.domains and sd.storage and drop any
dataWithTxNum whose txNum > unwindToTxNum. If a key's slice
empties out, delete the key so the unwindChangeset fallback (or
the underlying tx) can supply the pre-unwind answer.
Runs under sd.latestStateLock so the transition is visible
atomically to concurrent reads.
A regression test in db/state/execctx/domain_shared_test.go
(TestSharedDomain_UnwindDoesNotRestoreOverlayForNewKey) writes
a first-time storage value at txNum=100, calls Unwind(50), and
asserts the overlay no longer returns the post-target write.
Test fails on the pre-fix code and passes with this change.
Note: DB-layer siblings of this bug exist separately (missing
tombstone on domain unwind for []byte{} diffs; getLatestFromDb
fallthrough past in-range deletion markers, previously addressed
by #20483 and reverted by #20509). Those need their own surgical
fixes with dedicated regression guards and are intentionally
out-of-scope here.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
Member
files do have notion of deleted keys vs missing keys |
AskAlexSharov
approved these changes
Apr 17, 2026
3 tasks
github-merge-queue Bot
pushed a commit
that referenced
this pull request
Apr 18, 2026
…estFromDb (#20627) ## Summary Re-land the narrow core of #20483 (reverted by #20509), addressing the DB-layer siblings of the post-unwind stale-read bug. Complements #20625 which addressed the in-memory overlay side. Two linked bugs in the DB-level domain unwind and read paths caused stale data resurrection after an unwind that reverted a first-time write or a deletion inside the unwound range. ### Symptom observed on mainnet Post-Fusaka mainnet catch-up sync with a chaindata/ wipe (snapshots/ retained). On the first re-executed block after a forkchoice-driven unwind, execution returned less gas than the header — diffs observed of `-34200` (block 24,898,955), `-73829` (block 24,899,403), `-118872` (block 24,899,594). The diffs break down into multiples of `SSTORE_SET - SSTORE_RESET = 17100` plus cold-access flips of `2600`. Previous PR #20625 cleared the first two by pruning the in-memory overlay on Unwind. Block 24,899,594 still failed because the overlay was already flushed to DB at Unwind time — the stale-read path now was purely DB-layer, addressed here. ## Fixes ### 1. `unwind()` must restore empty tombstones — `db/state/domain.go:1317` (both DupSort and LargeValues paths) `DomainEntryDiff.Value` has three shapes, documented in `db/kv/helpers.go:247`: - `nil` — "different step": prev value lives at another step, skip restore (legacy V0 changeset shape) - `[]byte{}` — "no previous value": key was absent before this step; write an empty tombstone so the key appears absent again after the unwind completes - non-empty — restore the actual previous value The old guard `if len(value) > 0` skipped *both* `nil` and `[]byte{}`, leaving no tombstone after unwinding a first-time write. Corrected to `if value != nil`. ### 2. `getLatestFromDb` must treat empty values as authoritative — `db/state/domain.go:1665` Empty-value entries are deletion tombstones. The step-age guard previously discarded them when their step fell within the frozen file range, causing the caller to fall through to `getLatestFromFiles`. Frozen files encode deletions as absence, so the file returns the pre-deletion value — the exact resurrection the deletion was meant to prevent. Empty entries are now returned as `found=true` regardless of step age; the step-age guard still applies to non-empty entries. ## Relationship to #20483 / #20509 This is a deliberate re-land of the narrow core of #20483. Key differences: - **Excluded**: the LargeValues cross-check in `getLatest` (PR #20483 lines 1697–1737). That handled an interrupted-`PruneSmallBatches` edge case specific to `CodeDomain` / `RCacheDomain` and was the most likely source of the regressions that motivated #20509's revert. If it proves necessary, it can be added later as a separate PR with its own dedicated test. - **Included**: both matched fixes (write-side tombstone + read-side authoritativeness). They are a pair — neither is useful alone; staging them separately risks merging half and shipping a version that's still broken. ## Tests - `TestDomain_UnwindRestoresDeletionMarker` (DupSort + LargeValues subtests) — writes a key, deletes it, re-writes within the same step, builds files, then unwinds the re-write. **Fails on pre-fix code** (getLatest returns the stale post-unwind `value2` from the frozen file); passes with the fix. Exercises the write-side bug directly. - `TestDomain_DeletedKeyNotResurrectedByFiles` (DupSort + LargeValues subtests) — documents the read-side contract by writing a key and deleting it at a step that falls within file range. Passes on current `main` even without the fix (the file-build + prune semantics evolved since #20483 and no longer hit the specific stale-read in this exact test scenario), but retained as a forward regression guard and as documentation of the invariant. ## Test plan - [x] `go test -short ./db/state/...` — all pass - [x] `make lint` — 0 issues - [x] `make erigon` — builds clean - [x] Manual repro of the production symptom (mainnet sync from snapshots-only) in combination with #20625 — sync progresses past the catch-up / first-forkchoice-unwind window without a gas mismatch. (Re-verification run in progress alongside this PR.) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AskAlexSharov
pushed a commit
that referenced
this pull request
Apr 21, 2026
…ry orphan (#20716) Cherry-pick of #20710 to release/3.4. ## r3.4-specific adaptations - `sd1.Merge` call in the new test dropped `ctx` — release/3.4's signature is `Merge(uint64, *SharedDomains, uint64)`. - `TestSharedDomain_RepeatedUnwindAcrossStepBoundary` is skipped: it depends on the overlay-pruning semantics from PR #20625 (not present in release/3.4). The single-unwind variants (`TestSharedDomain_UnwindAcrossStepBoundary`, `TestSharedDomain_MergeUnwindAcrossStepBoundary`, `TestSharedDomain_UnwindWithDeleteAcrossStepBoundary`) still guard the per-step orphan-entry invariant this PR fixes. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: info@weblogix.biz <admin@10gbps.weblogix.it>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix a post-unwind stale-read in the in-memory domain overlay that causes gas-used mismatches on post-Fusaka mainnet catch-up.
TemporalMemBatchstores per-key overlay writes as[]dataWithTxNum, each entry stamped with its writetxNum.getLatestreturnsdataWithTxNums[len-1]— the most recently appended entry — without comparing it againstsd.unwindToTxNum.Unwind()only recordedunwindToTxNum+ anunwindChangeset; it never touchedsd.domains/sd.storage. SinceunwindChangesetis consulted only when the overlay misses, any key still present in the overlay kept returning a pre-unwind write made inside the unwoundtxNumrange.Observed symptom
Post-Fusaka mainnet catch-up, after a forkchoice-driven unwind. The first re-executed block reads a storage slot that was first-written inside the unwound range. The overlay returns the post-target write, flipping the SSTORE cost from
SET (20000)toRESET (2900)— exactly a 17100-gas shortfall per affected slot.diff=-34200(2 slots × 17100)diff=-73829(compound — several slots affected)Live trace instrumentation (not shipped) captured 3,082
SD_STALE_READevents between an unwind attxNum=3454259398and the resulting mismatch at block 24,899,403.Fix
On
Unwind, walksd.domainsandsd.storageand drop anydataWithTxNumwhosetxNum > unwindToTxNum. If a key's slice empties out, delete the key so theunwindChangesetfallback (or the underlying tx) supplies the pre-unwind answer. Runs undersd.latestStateLockso the transition is atomic to concurrent reads. Storage-btree mutations are staged afterScanto respect btree iterator rules.Regression test
TestSharedDomain_UnwindDoesNotRestoreOverlayForNewKeyindb/state/execctx/domain_shared_test.go:txNum=100Unwind(50)Test fails on pre-fix code with the exact error that mirrors the mainnet symptom; passes with this change.
Test plan
go test -short ./db/state/...— all passmake lint— 0 issuesmake erigon— builds cleanchaindata/wiped andsnapshots/retained (same repro that produced block-24,899,403 mismatch) — sync progresses past the catch-up / first-forkchoice-unwind window without a gas mismatch.Known adjacent issues, out of scope
DB-layer siblings of this bug exist separately and are not addressed here:
db/state/domain.go:1317— on-disk unwind currently conflatesnil("different step, skip") and[]byte{}("key was absent, write tombstone") viaif len(value) > 0, so first-time writes in the unwound range leave no restoring tombstone.db/state/domain.go:1665—getLatestFromDbdiscards deletion markers at a step within file range, so the caller falls through togetLatestFromFiles, which has no concept of deletions and returns stale pre-deletion data.Both were previously addressed by #20483 and reverted by #20509 while regressions were investigated. They need their own narrower fixes with dedicated regression guards and should be staged as separate PRs so they're independently revertible.
🤖 Generated with Claude Code