Skip to content

execution/execmodule: allow reorgs on canonical chain up to finalised hash#20825

Merged
taratorio merged 5 commits intomainfrom
worktree-fcu-reorg-canon-786
Apr 29, 2026
Merged

execution/execmodule: allow reorgs on canonical chain up to finalised hash#20825
taratorio merged 5 commits intomainfrom
worktree-fcu-reorg-canon-786

Conversation

@taratorio
Copy link
Copy Markdown
Member

@taratorio taratorio commented Apr 26, 2026

closes #20888

implements ethereum/execution-apis#786

which is needed for glamsterdam-devnet-0

@yperbasis yperbasis added the Glamsterdam https://eips.ethereum.org/EIPS/eip-7773 label Apr 26, 2026
Copy link
Copy Markdown
Member

@yperbasis yperbasis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major

  1. Nil-pointer panic when finalizedHash is unknown to the EL — execution/execmodule/forkchoice.go:298-302 blockNum, err := e.blockReader.HeaderNumber(ctx, tx, finalizedHash) if err != nil { return sendForkchoiceErrorWithoutWaiting(...) } finalisedBlockNum = *blockNum
    BlockReader.HeaderNumber (db/snapshotsync/freezeblocks/block_reader.go:572) returns (nil, nil) when the block is absent — the existing FCU code at forkchoice.go:545 explicitly guards
    with if headNumber != nil. Here the bare *blockNum will panic if the CL ever sends a finalized hash the EL hasn't seen via newPayload. Recommend either treating blockNum == nil as the
    "no known finalized" branch, or returning InvalidForkchoiceState.

Minor
2. Spec-compliance gap when finalizedHash is zero — same block

The spec wording is "if there is a known finalizedBlockHash …". The implementation falls back to finishProgressBefore (execution-stage progress) when finalizedHash == 0x0, which is a
different proxy than "ancestor of finalized". Pragmatic for fresh devnets where finality hasn't kicked in yet, but worth either:

  • adding a comment explaining why this fallback is used, or
  • not allowing the no-op skip at all when finalized is unset (strictly spec-conforming, slightly less efficient at startup).

The current inline comment "is at or behind last finalized hash, treat as no-op" is also a little misleading on the fallback branch — the fallback is "behind execution progress", not
"behind finalized".

  1. Stale log message — forkchoice.go:369

e.logger.Warn("reorg requested too low, capping to the minimum unwindable block", ...)
return sendForkchoiceResultWithoutWaiting(outcomeCh, ForkChoiceResult{Status: ExecutionStatusReorgTooDeep}, false)

The warning still says "capping to the minimum unwindable block", but the code no longer caps — it returns ReorgTooDeep. I confirmed via the test run that the test trace still shows
this misleading WARN line. Suggest something like "reorg target below minimum unwindable block, returning ReorgTooDeep".

  1. Test asserts Error, not the specific code — engine_api_reorg_test.go:371

TestFcuReturnsReorgTooDeepCode38006 ends with require.Error(t, err) — given the test name explicitly calls out code 38006, it would be stronger to assert on the message/code, e.g.
require.ErrorContains(t, err, "Too deep reorg") or unwrap the JSON-RPC error and check Code == -38006. Right now any error along the way would make the test pass.

  1. Proto enum vs native enum drift — node/interfaces/execution/execution.proto:10

ExecutionStatusReorgTooDeep = 6 is added to the native enum, but the proto enum still stops at Busy = 5. The native enum's comment says "The numeric values intentionally match the
proto constants for easy conversion." That invariant is now broken. The proto isn't actually used for status conversion in this code path today, so it's not a runtime issue, but for
consistency consider either adding ReorgTooDeep = 6; to the proto or dropping the comment.

  1. convertGrpcStatusToEngineStatus doesn't cover the new status — engine_server.go:999

The early-return at engine_server.go:1135-1137 means we never reach the switch with ExecutionStatusReorgTooDeep, so the panic("unhandled execution status") is unreachable today. This
matches the existing pattern for InvalidForkchoice, so it's consistent — just flagging since a future caller of convertGrpcStatusToEngineStatus could trip it.

@taratorio
Copy link
Copy Markdown
Member Author

@yperbasis addressed


#1-5 addressed ✅
#6 Flag (convertGrpcStatusToEngineStatus) ✅ ─
Untouched, consistent with the InvalidForkchoice early-return pattern — reviewer's
own characterization.

Copy link
Copy Markdown
Member

@yperbasis yperbasis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remaining observations

Minor — possible regression on duplicate FCUs for current head. The old condition canonicalHash == blockHash && fcuHeader.Number >= finishProgressBefore happened to short-circuit the
common "CL re-sends FCU for the head we already have" case. The new spec-compliant condition <= *finalisedBlockNum does not — when head == finalised+k (k > 0), a duplicate FCU for the
current canonical head now falls through to the unwind path and computes unwindTarget = currentParentNumber, then calls pipelineExecutor.UnwindTo(unwindTarget). Worth a quick check
whether UnwindTo cheaply no-ops when the target is at or above the current execution progress; otherwise this could mean an unnecessary unwind+re-execute cycle on each duplicate FCU.
The spec change itself is the intended behavior, but if UnwindTo doesn't already short-circuit, an explicit "blockHash is current head" early-return alongside the new condition would
preserve the old optimization.

Minor — finalisedBlockNum == nil semantics. When finalizedHash is non-zero but unknown to the EL (HeaderNumber returns (nil, nil)), the code currently just skips the no-op path.
That's safe (no panic) but the FCU continues into the unwind logic, where the unknown finalised hash isn't otherwise validated until the existing checks at forkchoice.go:545+. This is
consistent with prior behavior — flagging only because it's the only branch where finalisedBlockNum ends up nil and an explicit comment ("intentionally fall through; existing path
validates finalised") would help future readers.

Nit — test name vs setup. TestFcuAllowsReorgBackOnCanonicalChainWhenAfterFinalisedHash reads as if a real finalised hash is present, but MockCl.UpdateForkChoice (mock_cl.go:230)
hard-codes FinalizedBlockHash = cl.genesis. Effectively the test exercises "finalised = genesis(0)" so b2 is trivially "after" finalised. Functionally fine — and the test does
validate the right behavior — but the name suggests a richer scenario than is being tested. A second test variant where the CL advances finalised to a non-genesis block before the
reorg would more directly cover the spec wording.

Nit — unused locals in TestFcuReturnsReorgTooDeepCode38006. b3Canon, b4Canon, b5Canon are only used for VerifyTxnsInclusion and never referenced again. The BuildCanonicalBlock calls
are needed for side-effects, so the structure is fine; could shorten to _, err = ... after the verify but stylistic only.

@taratorio taratorio enabled auto-merge April 29, 2026 10:46
@taratorio taratorio added this pull request to the merge queue Apr 29, 2026
Merged via the queue into main with commit 67ece3f Apr 29, 2026
37 checks passed
@taratorio taratorio deleted the worktree-fcu-reorg-canon-786 branch April 29, 2026 11:57
Sahil-4555 pushed a commit to Sahil-4555/erigon that referenced this pull request May 6, 2026
…BlockNum (erigontech#21007)

follow up to erigontech#20825
best to use the last known finalised hash from the db and not from the
request, otherwise the logic can be circumvented by a bad request
also if there is no known finalised hash the finalisedBlockNum should be
0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Glamsterdam https://eips.ethereum.org/EIPS/eip-7773

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[glamsterdam-devnet-0] allow reorgs on canonical chain up to finalised hash

3 participants