Skip to content

feat(derivation): deriveForce skip#967

Merged
tomatoishealthy merged 7 commits into
feat/sequencer-finalfrom
optimize-reorg
Jun 2, 2026
Merged

feat(derivation): deriveForce skip#967
tomatoishealthy merged 7 commits into
feat/sequencer-finalfrom
optimize-reorg

Conversation

@curryxbo
Copy link
Copy Markdown
Contributor

@curryxbo curryxbo commented Jun 1, 2026

…Number

When local verify sees batchInfo.lastBlockNumber missing locally, the new dispatcher distinguishes four cases:

  • A consistent: header present + versioned hash matches L1 → existing rebuildBlob path, safe advances as before.
  • B self-heal: header present + versioned hash mismatch → existing self-heal, now calling deriveForce(_, 0) to keep the full-overwrite semantics.
  • C fill-gap: header missing AND L2 head stays flat across the 3×2s retry window → pull the real L1 blob via fetchRollupDataByTxHash, then deriveForce(_, localLatestL2) writes only blocks above the local head, leaving existing blocks untouched (no EL reorg). Falls through to the shared verifyBatchRoots + advanceSafe.
  • D wait-next-poll: header missing AND L2 head grew within the window → continue, sequencer is alive and P2P will deliver the missing blocks before the next poll cycle.

deriveForce now takes a skipNumber so scenarios B and C share one implementation: skipNumber=0 means "overwrite everything", skipNumber>0 skips blocks already present locally and only appends the missing tail. The parent header anchor is max(firstNum-1, skipNumber) so the first block we actually write always has its parent in the EL.

The retry loop surfaces non-NotFound RPC errors from HeaderByNumber / BlockNumber so transient client failures are visible instead of being masked as "sequencer stopped".

corey and others added 2 commits June 1, 2026 14:05
…Number

When local verify sees batchInfo.lastBlockNumber missing locally, the
new dispatcher distinguishes four cases:

- A consistent: header present + versioned hash matches L1 → existing
  rebuildBlob path, safe advances as before.
- B self-heal: header present + versioned hash mismatch → existing
  self-heal, now calling deriveForce(_, 0) to keep the full-overwrite
  semantics.
- C fill-gap: header missing AND L2 head stays flat across the 3×2s
  retry window → pull the real L1 blob via fetchRollupDataByTxHash,
  then deriveForce(_, localLatestL2) writes only blocks above the
  local head, leaving existing blocks untouched (no EL reorg). Falls
  through to the shared verifyBatchRoots + advanceSafe.
- D wait-next-poll: header missing AND L2 head grew within the window
  → continue, sequencer is alive and P2P will deliver the missing
  blocks before the next poll cycle.

deriveForce now takes a skipNumber so scenarios B and C share one
implementation: skipNumber=0 means "overwrite everything", skipNumber>0
skips blocks already present locally and only appends the missing tail.
The parent header anchor is max(firstNum-1, skipNumber) so the first
block we actually write always has its parent in the EL.

The retry loop surfaces non-NotFound RPC errors from HeaderByNumber /
BlockNumber so transient client failures are visible instead of being
masked as "sequencer stopped".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Scenario C (header missing + L2 head flat) appends blocks above the
local head via deriveForce. Even though it doesn't reorg existing
blocks, blocksync / broadcast reactors may race with derivation on
the same height range as P2P delivers blocks for the missing tail.
Wrap the fill-gap block writes with StopReactorsBeforeReorg /
StartReactorsAfterReorg so the reactor lifecycle matches scenario B
self-heal: derivation owns the writes during the operation, then
reactors restart from the post-write head and catch up via P2P.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@curryxbo curryxbo requested a review from a team as a code owner June 1, 2026 06:28
@curryxbo curryxbo requested review from tomatoishealthy and removed request for a team June 1, 2026 06:28
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 1, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 119f4305-3fbf-476b-b0c2-bb072ad63792

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch optimize-reorg

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

The previous reactor stop/start pattern around deriveForce returned
early on derive errors, leaving consensus reactors permanently
stopped: Stop is idempotent on retry, but Start was never reached, so
blocksync / broadcast stayed paused indefinitely until process restart.

Wrap stop/derive/start in withReactorsQuiesced so Start runs in a
defer regardless of body outcome. Restart height is read from the L2
EL latest (truth source) rather than the lastHeader returned by
deriveForce — that header may be nil or stale if writes failed
mid-batch, but the EL always knows what's actually applied. Use
context.Background() for the deferred BlockNumber read so a cancelled
parent ctx doesn't block reactor recovery.

Both scenario B (self-heal) and scenario C (fill-gap) call sites now
share the same helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@curryxbo curryxbo changed the title feat(derivation): SPEC-005 §4.3 Path B entry-point + deriveForce skip… feat(derivation): deriveForce skip Jun 1, 2026
The previous implementation logged the post-write BlockNumber read
failure but still passed the zero-valued cur to StartReactorsAfterReorg,
which would have told blocksync to resume from height 0 and re-fetch
the entire chain from genesis.

withReactorsQuiesced now captures the pre-write height before stopping
reactors and uses it as a safe lower-bound fallback if the post-write
EL read fails. If even the pre-write read fails we don't stop reactors
at all and return the error so the batch retries next poll, rather
than committing to a quiesce window with no way to pick a sensible
restart height.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the inline 6s retry loop in the per-batch dispatcher with a
single observation captured once per derivationBlock pull. The
previous-pull's L2 latest is cached on Derivation.lastObservedL2Latest;
the next pull compares its current latest against the cache to derive
l2Grew, then runs through the per-batch loop with that single bool:

  header present                → A / B (rebuildBlob).
  header missing AND l2Grew     → D (alive, skip; next poll re-tries).
  header missing AND !l2Grew    → C (stopped → L1 blob fill-gap).

The poll interval (default 15s) is naturally the observation window —
no inline sleep, no extra RPCs in the per-batch loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
corey and others added 2 commits June 2, 2026 09:55
The Path B entry-point's alive-vs-stopped signal is the cross-pull
movement of L2 latest. Capture it once per pull, just above the
`for _, lg := range logs` loop, so:

  - All batches in a pull share the same verdict — fillGap's own
    block writes can't pollute later batches' growth comparison.
    (The previous fix attempt moved the read into the per-batch
    branch, which broke multi-batch sequencer-stopped recovery: only
    one batch was rescued per poll instead of all accumulated logs.)

  - Snapshot is fresher than placing it at the top of derivationBlock,
    where the L1 prep RPCs (getLatestConfirmedBlockNumber,
    fetchRollupLog, LastCommittedBatchIndex) sit between the read
    and the consumer.

  - Reads are colocated with their use site for easier review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tomatoishealthy tomatoishealthy merged commit dac0b43 into feat/sequencer-final Jun 2, 2026
6 of 7 checks passed
@tomatoishealthy tomatoishealthy deleted the optimize-reorg branch June 2, 2026 03:55
tomatoishealthy pushed a commit that referenced this pull request Jun 2, 2026
Co-authored-by: corey <corey.zhang@bitget.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants