Skip to content

fix builder withdrawal type detection#697

Merged
pk910 merged 2 commits into
masterfrom
pk910/withdrawals-tagging
May 18, 2026
Merged

fix builder withdrawal type detection#697
pk910 merged 2 commits into
masterfrom
pk910/withdrawals-tagging

Conversation

@pk910
Copy link
Copy Markdown
Member

@pk910 pk910 commented May 15, 2026

Fix builder withdrawal type detection at epoch boundaries

Summary

Fixes two bugs in the state simulator (indexer/beacon/state_sim.go) that caused
withdrawals to be classified incorrectly on Gloas chains:

  1. Delayed builder payments at the start of an epoch had their reference slots
    in descending order
    instead of slot-ascending order.
  2. The direct builder payment for the parent block was tagged as a builder
    sweep (BuilderFullWithdrawal)
    instead of a builder payment, mainly
    affecting the first block of each epoch.

Both bugs are visible at e.g. https://dora.glamsterdam-devnet-3.ethpandaops.io/slot/65184#withdrawals
(slot 0 of an epoch): indices 9629–9632 are delayed payments with reversed
ref slots, and index 9633 is the parent block's direct payment mis-tagged as
a builder sweep.

Root cause

Bug 1 — reversed delayed-payment ref slots

resolveDelayedPaymentRefSlots and resolveInitialDirectRefs both have a DB
fallback path (used once the source epoch falls out of the in-memory cache).
That path called db.GetSlotsRange, which returns rows ordered slot DESC,
and iterated the result forward. The matching loop then paired the first
delayed entry (appended in slot-ascending order by process_builder_pending_payments)
with the highest-slot missed block, producing reverse ordering.

The cache path was already iterating ascending, so the bug only showed up on
finalized/pruned epochs.

Bug 2 — parent's direct payment mis-classified

Per the Gloas spec, the queue at the time of process_withdrawals is:

[unconsumed_directs_from_prev_epoch..., delayed_from_epoch_transition..., parent_direct]

The trailing parent_direct is appended by process_parent_execution_payload
(spec: gloas/beacon-chain.md#new-process_parent_execution_payload), which runs
immediately before process_withdrawals. In the simulator that step was modelled
implicitly by applyBlock(parent) appending the parent's UID after draining —
fine for non-first-of-epoch blocks.

But getParentBlocks filters parents to the current epoch only. For:

  • the first block of an epoch, or
  • any block whose immediate parent lies in the previous epoch (e.g. missed
    slots at an epoch boundary, multi-slot gaps),

parentBlocks is empty, applyBlock never runs for the parent, and the
parent's direct payment is missing from the simulator's queue. The
classifier's BuilderPaymentCount is therefore off by 1, and the extra
builder withdrawal in the block's payload falls through
classifyWithdrawalType's isBuilder → WithdrawalTypeBuilderFullWithdrawal
branch.

Fix

  1. DB iteration order — iterate dbSlots in reverse in both
    resolveInitialDirectRefs and resolveDelayedPaymentRefSlots, so slots
    are walked ascending regardless of the SQL order.

  2. Per-entry type tracking — add a Type field on
    trackedBuilderWithdrawal (WithdrawalTypeBuilderPayment /
    WithdrawalTypeBuilderDelayedPayment). Initial entries are tagged in
    resetState based on DelayedBuilderPaymentCount; entries appended during
    replay are tagged direct. This is necessary because the post-fix queue can
    have three sections — [direct, delayed, direct] — which position-based
    classification cannot express.

  3. Cross-epoch parent's direct payment — in replayWithdrawalState, when
    len(parentBlocks) == 0, look up the immediate parent (cache → DB) via the
    new helper resolveParentDirectEntry and append its direct payment to the
    queue if it had a delivered, canonical, builder-produced payload. This
    mirrors process_parent_execution_payload for the spec cases that
    applyBlock doesn't cover.

  4. Classifier rewriteclassifyBuilderPayments now reads the per-entry
    Type field instead of inferring direct/delayed from position. The
    delayed-ref-slot resolution iterates the queue once to collect delayed
    entries in queue order, matching against missed-payload blocks in
    slot-ascending order.

  5. builderDelayedCount removed — no longer needed now that each entry
    carries its own type. applyBlock's drain simplifies to truncating the
    queue (no delayed-count accounting).

Compatibility with other withdrawal types

The fix preserves the section order required by
get_expected_withdrawals (spec: gloas/beacon-chain.md#modified-get_expected_withdrawals):

Spec source isBuilder Classifier branch
get_builder_withdrawals yes idx < BuilderPaymentCount → from sim
get_pending_partial_withdrawals no idx < BuilderPaymentCount + PartialCount
get_builders_sweep_withdrawals yes idx ≥ … + isBuilderBuilderFullWithdrawal
get_validators_sweep_withdrawals no exit check / sweep

The new queue entry I append lives strictly inside section 1 (it's in the
pending-withdrawals queue, where the spec puts it). Builder sweep withdrawals
in section 3 are produced from a different source (state.builders sweep
cursor) and are still discriminated purely by index position relative to
BuilderPaymentCount + PartialCount. No regression for blocks that have both
a parent direct payment and a builder sweep in the same slot.

@pk910 pk910 enabled auto-merge May 18, 2026 11:11
@pk910 pk910 merged commit b692969 into master May 18, 2026
4 checks passed
@pk910 pk910 deleted the pk910/withdrawals-tagging branch May 18, 2026 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants