feat(pipeline): allow syncing blocks ontop of the proposed chain #21025

Open
Maddiaa0 wants to merge 19 commits into merge-train/spartan from md/pipelining-syncing

Member

@Maddiaa0 Maddiaa0 commented Mar 3, 2026

Overview

Key contributions:

  • In the PR above (feat(pipeline): introduce pipeline views for building #21026), publishing was a blocking action. This PR makes publishing non-blocking: a publisher can schedule when it should start trying to publish a block.
  • It keeps track of valid checkpoints that are pending and not yet settled on L1, and allows building on top of them.

Adds a second p2p callback that separates what runs on all nodes from what runs only on validator nodes.

Testing

epochs_mbps.pipeline now expects 3 blocks per checkpoint, just like the original epochs_mbps test, now that it is fully pipelined.

Upcoming

  • Updating the timetable to allow more time for building in the slot. This PR does not extend the time allocated to block building.
  • Handling rollbacks when the pendingCheckpoint needs to be rolled back or cleared.

@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch from 8de6157 to 31f941d Compare March 3, 2026 20:51
@Maddiaa0 Maddiaa0 force-pushed the md/update-epoch-cache-for-buildahead branch from 3a75fdb to 1ca98d8 Compare March 3, 2026 20:51
Comment on lines +74 to +76
// Atomically set the pending checkpoint number alongside the block if provided
pendingCheckpointNumber !== undefined &&
this.store.blockStore.setPendingCheckpointNumber(pendingCheckpointNumber),
Contributor

This doesn't match the comment saying that this sets the pending checkpoint number (quorum-attested but not yet L1-confirmed), right?

That said, let's discuss the model. Weirdly, I like the concept of having an uncheckpointed checkpoint. But it seems like we have two different things:

  • A checkpoint-being-built, which is the checkpoint being built via proposed blocks. This is just a checkpoint number, since a Checkpoint object needs all its blocks to be ready. Today we already have this, but don't need to explicitly store it.
  • A checkpoint-being-proposed, which is the Checkpoint object for which the current proposer has sent a checkpoint proposal, and would ultimately make it onto L1.

We need to define which ones we expose to clients of the archiver, and also to users via APIs like getL2Tips.
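
As a rough sketch only, the two notions described above could be modelled as a discriminated union: the checkpoint being built carries just a number, while the checkpoint being proposed carries a fixed block range. All type and function names here (`CheckpointBeingBuilt`, `CheckpointBeingProposed`, `describeTip`) are hypothetical and do not come from the PR.

```typescript
// Hypothetical types for illustration; none of these names come from the PR.

/** A checkpoint still accumulating proposed blocks: only its number is known. */
type CheckpointBeingBuilt = { kind: 'being-built'; checkpointNumber: number };

/** A checkpoint whose proposal has been sent: its block range is fixed. */
type CheckpointBeingProposed = {
  kind: 'being-proposed';
  checkpointNumber: number;
  startBlock: number;
  blockCount: number;
};

type UncheckpointedCheckpoint = CheckpointBeingBuilt | CheckpointBeingProposed;

/** Renders a short description of an uncheckpointed checkpoint. */
function describeTip(c: UncheckpointedCheckpoint): string {
  switch (c.kind) {
    case 'being-built':
      return `checkpoint ${c.checkpointNumber} (still receiving blocks)`;
    case 'being-proposed': {
      const lastBlock = c.startBlock + c.blockCount - 1;
      return `checkpoint ${c.checkpointNumber} (blocks ${c.startBlock}..${lastBlock}, awaiting L1)`;
    }
  }
}
```

Keeping the two cases as separate variants would force callers of something like getL2Tips to state which one they mean.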

@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch 2 times, most recently from 6a16e98 to 28a0520 Compare March 6, 2026 17:19
@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch 2 times, most recently from 27f8a4b to f5c6308 Compare March 9, 2026 12:12
@Maddiaa0 Maddiaa0 force-pushed the md/update-epoch-cache-for-buildahead branch from b1b1b14 to ce0e422 Compare March 9, 2026 12:12
@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch from f5c6308 to ae6b651 Compare March 9, 2026 12:26
@Maddiaa0 Maddiaa0 marked this pull request as ready for review March 9, 2026 12:29
@Maddiaa0 Maddiaa0 force-pushed the md/update-epoch-cache-for-buildahead branch from ce0e422 to d8b8d31 Compare March 9, 2026 16:01
@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch 2 times, most recently from 3a47fbf to 4bb6845 Compare March 9, 2026 16:07
@Maddiaa0 Maddiaa0 force-pushed the md/update-epoch-cache-for-buildahead branch from d8b8d31 to 6973d24 Compare March 9, 2026 16:41
@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch from 4bb6845 to 0e9b52e Compare March 9, 2026 16:41
@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch from 0e9b52e to dee426a Compare March 12, 2026 15:52
@Maddiaa0 Maddiaa0 force-pushed the md/update-epoch-cache-for-buildahead branch from 6973d24 to 21cdcd5 Compare March 12, 2026 15:52
@Maddiaa0 Maddiaa0 force-pushed the md/update-epoch-cache-for-buildahead branch from 21cdcd5 to 9543a61 Compare March 16, 2026 18:08
@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch from 2d31671 to 2591475 Compare March 19, 2026 21:03
// Trigger syncs to flush any queued blocks, retrying until we find the data or give up.
if (!blockData) {
  blockData = await retryUntil(
    async () => {
Collaborator

I think we probably want to:

  1. Not call syncImmediate here. That's going to force the archiver to query L1 aggressively. When the block is pushed to the archiver it already calls that function, so we aren't going to make things go any faster.
  2. Use a more appropriate timeout. Presumably that would be the end of the slot?
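
For the second point, a minimal sketch of deriving the timeout from the time left in the current slot. The function name `secondsUntilSlotEnd` and its parameters are hypothetical; the sketch assumes fixed-length slots counted from a genesis timestamp, which may not match how the timetable is actually computed.

```typescript
/**
 * Seconds remaining until the current slot ends.
 * Assumes fixed-length slots starting at `genesisTimeSec` (an assumption of this sketch).
 */
function secondsUntilSlotEnd(nowSec: number, genesisTimeSec: number, slotDurationSec: number): number {
  if (slotDurationSec <= 0) {
    throw new Error('slotDurationSec must be positive');
  }
  if (nowSec < genesisTimeSec) {
    throw new Error('now is before genesis');
  }
  // Time already spent inside the current slot.
  const elapsedInSlot = (nowSec - genesisTimeSec) % slotDurationSec;
  return slotDurationSec - elapsedInSlot;
}
```

The result could be passed as the retry timeout in place of a fixed constant.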

Contributor

@spalladino spalladino left a comment

Awesome work man

Comment on lines +77 to +85
/** Storage format for a pending checkpoint (attested but not yet L1-confirmed). */
type PendingCheckpointStore = {
  header: Buffer;
  checkpointNumber: number;
  startBlock: number;
  blockCount: number;
  totalManaUsed: string;
  feeAssetPriceModifier: string;
};
Contributor

Why can't we capture archive, outhash, or all data that's not L1 or attestations?

Also, nit: rename to PendingCheckpointStorage for consistency with the other types here.

Member Author

@Maddiaa0 Maddiaa0 Mar 20, 2026

it can, I just kept the minimum; will add

Comment on lines +201 to +213
// The same check as above but for checkpoints. Accept the block if either the confirmed
// checkpoint or the pending (locally validated but not yet confirmed) checkpoint matches.
const expectedCheckpointNumber = blockCheckpointNumber - 1;
if (
  !opts.force &&
  previousCheckpointNumber !== expectedCheckpointNumber &&
  pendingCheckpointNumber !== expectedCheckpointNumber
) {
  const [reported, source]: [CheckpointNumber, 'confirmed' | 'pending'] =
    pendingCheckpointNumber > previousCheckpointNumber
      ? [pendingCheckpointNumber, 'pending']
      : [previousCheckpointNumber, 'confirmed'];
  throw new CheckpointNumberNotSequentialError(blockCheckpointNumber, reported, source);
Contributor

Is there any situation where addProposedBlock would add a block for the pending checkpoint? My understanding is we add proposed blocks, then throw a checkpoint proposal on top to flag those as "pending checkpointing", and then keep adding proposed blocks for the next one.

Contributor

Also, adding a block to a pending checkpoint breaks the blockCount property of the PendingCheckpointStore. Seems like we should not allow that.

Member Author

No, we should not end up adding directly to the pending checkpoint, only above it.

This case is to allow building on top of the pending checkpoint, not adding to it. But it looks like it may allow what you have mentioned; I'll make it stricter.
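
One possible shape for a stricter check, as a sketch under assumptions rather than the change actually made in the PR: when a pending checkpoint exists, a new proposed block may only belong to the checkpoint right after it; otherwise it must belong to the one right after the confirmed checkpoint. The names `CheckpointTips`, `expectedCheckpointForNewBlock`, and `canAddProposedBlock` are hypothetical.

```typescript
// Hypothetical types and helpers; plain numbers stand in for CheckpointNumber.
type CheckpointTips = {
  /** Latest checkpoint confirmed on L1. */
  confirmed: number;
  /** Latest pending checkpoint, if any (attested but not yet on L1). */
  pending?: number;
};

/**
 * Returns the only checkpoint number a new proposed block may belong to.
 * If a pending checkpoint is ahead of the confirmed one, blocks go on top of it.
 */
function expectedCheckpointForNewBlock(tips: CheckpointTips): number {
  const hasPendingAhead = tips.pending !== undefined && tips.pending > tips.confirmed;
  const base = hasPendingAhead ? (tips.pending as number) : tips.confirmed;
  return base + 1;
}

function canAddProposedBlock(blockCheckpointNumber: number, tips: CheckpointTips): boolean {
  return blockCheckpointNumber === expectedCheckpointForNewBlock(tips);
}
```

With `confirmed = 4` and `pending = 5`, a block for checkpoint 5 is rejected (it would add to the pending checkpoint and invalidate its blockCount), while a block for checkpoint 6 is accepted.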

block: { number: provenBlockNumber, hash: provenBlockData.blockHash.toString() },
checkpoint: provenCheckpointId,
},
pendingCheckpoint: pendingCheckpointBlockData
Contributor

We definitely need to rethink naming. The rollup contract already has the concept of "pending checkpoint" which is actually the checkpointed checkpoint. We need to change one of them.

Other thoughts, in random order, some mutually exclusive:

  • Do we want to bundle the pending checkpoint with the proposed chain tip? It'll break the property that, in a chain tip, the reported block matches the last block of the reported checkpoint. But maybe it's fine.
  • Do we want to differentiate attested vs non-attested pending checkpoints? Personally I don't think so.
  • Should the pendingCheckpoint tip return the checkpointed one if there's no pending checkpoint block data? It's the only chain tip that may be undefined.
  • Do we want to make a bigger rename? It seems like we're dealing with "pending checkpoints" and "checkpointed checkpoints". Maybe "checkpoint" was the bad name, and we should be talking about "pending bundles" and "checkpointed bundles" or something like that?

No need to make these naming changes in this PR, but let's give it a good thought before closing this epic.

}

// What's the slot of the first uncheckpointed block?
// Don't prune blocks that are covered by a pending checkpoint (awaiting L1 submission from pipelining)
Contributor

Looking forward to it!

Comment on lines +1025 to +1036
const current = await this.getPendingCheckpointNumber();
if (pending.checkpointNumber <= current) {
  this.#log.warn(`Ignoring stale pending checkpoint number ${pending.checkpointNumber} (current: ${current})`);
  return;
}
const confirmed = await this.getLatestCheckpointNumber();
if (pending.checkpointNumber !== confirmed + 1) {
  this.#log.warn(
    `Ignoring pending checkpoint ${pending.checkpointNumber}: expected ${confirmed + 1} (confirmed + 1)`,
  );
  return;
}
Contributor

Unless we can think of legitimate situations for this, I'd throw instead of warning. It will help us catch inconsistencies more easily.
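
A minimal sketch of the throwing variant, reduced to plain numbers so it can be checked in isolation. The error class `InvalidPendingCheckpointError` and the function `assertPendingCheckpointIsNext` are hypothetical names, not part of the PR.

```typescript
// Hypothetical error type for this sketch.
class InvalidPendingCheckpointError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'InvalidPendingCheckpointError';
  }
}

/**
 * Throws unless `pending` is strictly newer than the current pending checkpoint
 * and is exactly one past the latest confirmed checkpoint.
 */
function assertPendingCheckpointIsNext(pending: number, currentPending: number, confirmed: number): void {
  if (pending <= currentPending) {
    throw new InvalidPendingCheckpointError(`Stale pending checkpoint ${pending} (current: ${currentPending})`);
  }
  if (pending !== confirmed + 1) {
    throw new InvalidPendingCheckpointError(
      `Pending checkpoint ${pending} is not confirmed + 1 (expected ${confirmed + 1})`,
    );
  }
}
```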

if (this.interrupted) {
  return undefined;
}
return this.sendRequests();
Contributor

Should we re-simulate before sending, this time with the actual state of L1, instead of the manual overrides? This should help catch scenarios where the previous checkpoint didn't behave as we expected (not to mention the ones where it didn't land).

Member Author

@Maddiaa0 Maddiaa0 Mar 20, 2026

In #21250 I add a preCheck hook to each submission that runs after the enqueue sleep, without the overrides.

Comment on lines +298 to +302
const grandparentCheckpointNumber = CheckpointNumber(this.checkpointNumber - 2);
const [grandparentCheckpoint, manaTarget] = await Promise.all([
  rollup.getCheckpoint(grandparentCheckpointNumber),
  rollup.getManaTarget(),
]);
Contributor

I guess this works because we're guaranteed to have a grandparent checkpoint if there's a non-undefined pendingCheckpointData? Otherwise I see this failing in the first checkpoint(s)?
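
If the guarantee does not hold, a guard along these lines could make the early-chain case explicit. This is a sketch only: the function name `grandparentCheckpointNumberOrUndefined` is hypothetical, and the default `firstCheckpointNumber = 0` is an assumption about where checkpoint numbering starts.

```typescript
/**
 * Returns the grandparent checkpoint number, or undefined when the chain is
 * too short to have one. `firstCheckpointNumber` is an assumption of this sketch.
 */
function grandparentCheckpointNumberOrUndefined(
  checkpointNumber: number,
  firstCheckpointNumber = 0,
): number | undefined {
  const grandparent = checkpointNumber - 2;
  return grandparent >= firstCheckpointNumber ? grandparent : undefined;
}
```

A caller would then skip the grandparent lookup, or fall back to some default, when the result is undefined.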

Comment on lines +373 to +389
if (invalidateCheckpoint) {
  // After invalidation, L1 will roll back to checkpoint N-1. The archive at N-1 already
  // exists on L1, so we just pass the matching archive (the lastArchive of the invalid checkpoint).
  archiveForCheck = invalidateCheckpoint.lastArchive;
  l1Overrides.forcePendingCheckpointNumber = invalidateCheckpoint.forcePendingCheckpointNumber;
  this.metrics.recordPipelineDepth(0);
} else if (this.epochCache.isProposerPipeliningEnabled() && syncedTo.hasPendingCheckpoint) {
  // Parent checkpoint hasn't landed on L1 yet. Override both the pending checkpoint number
  // and the archive at that checkpoint so L1 simulation sees the correct chain tip.
  const parentCheckpointNumber = CheckpointNumber(checkpointNumber - 1);
  l1Overrides.forcePendingCheckpointNumber = parentCheckpointNumber;
  l1Overrides.forceArchive = { checkpointNumber: parentCheckpointNumber, archive: syncedTo.archive };
  this.metrics.recordPipelineDepth(1);

  this.log.verbose(
    `Building on top of pending checkpoint (pending=${syncedTo.pendingCheckpointData?.checkpointNumber})`,
  );
Contributor

Shouldn't the order of these ifs be switched? If there's a pending checkpoint AND the tip of the checkpointed chain is invalid, we should expect the pending checkpoint to perform the invalidation when it lands, so we should build on top of the pending checkpoint instead.

Member Author

I've dug into this and yes, you're right: the parent should have bundled it with their checkpoint.
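
The reordering discussed in this thread could look roughly like the sketch below, where the pending-checkpoint branch is evaluated before the invalidation branch. The types `L1Overrides` and `BuildInputs` and the function `computeL1Overrides` are simplified, hypothetical stand-ins (archives are plain strings, checkpoint numbers plain numbers), and the depth of 0 in the final fallthrough is an assumption.

```typescript
// Simplified, hypothetical stand-ins for the real types in the PR.
type L1Overrides = {
  forcePendingCheckpointNumber?: number;
  forceArchive?: { checkpointNumber: number; archive: string };
};

type BuildInputs = {
  checkpointNumber: number;
  pipeliningEnabled: boolean;
  /** Archive of the pending parent checkpoint, if one exists. */
  pendingParentArchive?: string;
  /** Set when the tip of the checkpointed chain needs invalidating. */
  invalidate?: { forcePendingCheckpointNumber: number };
};

function computeL1Overrides(inputs: BuildInputs): { overrides: L1Overrides; pipelineDepth: number } {
  const overrides: L1Overrides = {};
  if (inputs.pipeliningEnabled && inputs.pendingParentArchive !== undefined) {
    // Pending checkpoint first: it is expected to carry the invalidation when it lands.
    const parent = inputs.checkpointNumber - 1;
    overrides.forcePendingCheckpointNumber = parent;
    overrides.forceArchive = { checkpointNumber: parent, archive: inputs.pendingParentArchive };
    return { overrides, pipelineDepth: 1 };
  }
  if (inputs.invalidate) {
    // No pending parent: this proposer performs the invalidation itself.
    overrides.forcePendingCheckpointNumber = inputs.invalidate.forcePendingCheckpointNumber;
    return { overrides, pipelineDepth: 0 };
  }
  // Neither case applies; depth 0 here is an assumption of the sketch.
  return { overrides, pipelineDepth: 0 };
}
```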

Base automatically changed from merge-train/spartan to next March 20, 2026 22:34
@Maddiaa0 Maddiaa0 changed the base branch from next to graphite-base/21025 March 25, 2026 15:59
@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch from 7802979 to b371deb Compare March 25, 2026 15:59
@Maddiaa0 Maddiaa0 changed the base branch from graphite-base/21025 to merge-train/spartan March 25, 2026 15:59
Comment on lines +165 to +167
public getProposedCheckpoint(): Promise<ProposedCheckpointData | undefined> {
  return this.store.blockStore.getProposedCheckpoint();
}
Contributor

I know I should've asked this earlier: is it possible that there's more than one "proposed checkpoint" in flight at any given time? E.g., let's say that we're close to the end of slot N. The L1 tx for posting checkpoint N hasn't landed yet, but slot N+1 is already built and its checkpoint proposal has already been broadcast via p2p. Wouldn't that lead to two checkpoints awaiting checkpointing?

Member Author

I have an issue for this. I wanted to cement the happy path first.

Member Author

To prevent this from happening in this PR, there's a max pipelining depth. The proposer should not build if their checkpoint is 2 ahead of what is currently on L1.

For Byzantine nodes, the proposed checkpoint is not stored if it is too far ahead of the current checkpointed block.

In the future we can relax these constraints. But right now there is a max depth to prevent reorgs from getting too big.

Comment on lines +273 to +281
// Don't prune blocks that are covered by a proposed checkpoint (awaiting L1 submission from pipelining)
const firstUncheckpointedBlockNumber = BlockNumber(lastCheckpointedBlockNumber + 1);
if (proposedCheckpoint) {
  const lastPendingBlock = BlockNumber(proposedCheckpoint.startBlock + proposedCheckpoint.blockCount - 1);
  if (lastPendingBlock >= firstUncheckpointedBlockNumber) {
    this.log.trace(`Skipping prune: proposed checkpoint covers blocks up to ${lastPendingBlock}`);
    return;
  }
}
Contributor

Shouldn't this check depend on the slot numbers, as opposed to just the block numbers? Is it possible we have moved way past the slots in which the proposed checkpoint should have been mined, but we don't prune it because this check does not look at anything time-dependent?

Member Author

It's fixed in the PR on top, but I can move it here.

Contributor

Don't, I'm fine with tackling it later
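
For reference, a time-aware version of the prune guard raised in this thread might look like the sketch below. It is not the fix from the follow-up PR, which is not shown here. The `slotNumber` field, the `graceSlots` parameter, and the function name `shouldSkipPrune` are all hypothetical.

```typescript
// Hypothetical shape; `slotNumber` is an assumed field, not shown in the PR excerpt.
type ProposedCheckpointInfo = {
  startBlock: number;
  blockCount: number;
  /** Slot the proposed checkpoint was built for. */
  slotNumber: number;
};

/**
 * Skip pruning only while the proposed checkpoint both covers the first
 * uncheckpointed block and is still within `graceSlots` of its own slot.
 */
function shouldSkipPrune(
  proposed: ProposedCheckpointInfo | undefined,
  firstUncheckpointedBlock: number,
  currentSlot: number,
  graceSlots = 1,
): boolean {
  if (!proposed) {
    return false;
  }
  const lastProposedBlock = proposed.startBlock + proposed.blockCount - 1;
  const coversBlocks = lastProposedBlock >= firstUncheckpointedBlock;
  const stillTimely = currentSlot <= proposed.slotNumber + graceSlots;
  return coversBlocks && stillTimely;
}
```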

Comment on lines +643 to +644
totalManaUsed: BigInt(stored.totalManaUsed ?? '0'),
feeAssetPriceModifier: BigInt(stored.feeAssetPriceModifier ?? '0'),
Contributor

When could these be undefined?

Member Author

Never, this is legacy; will clean up.

Comment on lines +361 to +366
if (
  previousBlock.checkpointNumber === block.checkpointNumber &&
  previousBlock.indexWithinCheckpoint !== block.indexWithinCheckpoint - 1
) {
  throw new BlockIndexNotSequentialError(block.indexWithinCheckpoint, previousBlock.indexWithinCheckpoint);
}
Contributor

Missing a check for when we cross from one checkpoint to the next (checkpoint number should be +1, index within checkpoint should be zero).
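
Both cases together could be expressed as a single predicate, sketched here with plain numbers. `BlockPosition` and `isValidSuccessor` are hypothetical names used only for this illustration.

```typescript
// Hypothetical minimal view of a block's position in the chain.
type BlockPosition = { checkpointNumber: number; indexWithinCheckpoint: number };

/**
 * A block validly follows `previous` if it is either the next index within the
 * same checkpoint, or index 0 of the immediately following checkpoint.
 */
function isValidSuccessor(previous: BlockPosition, next: BlockPosition): boolean {
  if (next.checkpointNumber === previous.checkpointNumber) {
    return next.indexWithinCheckpoint === previous.indexWithinCheckpoint + 1;
  }
  return next.checkpointNumber === previous.checkpointNumber + 1 && next.indexWithinCheckpoint === 0;
}
```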

Comment on lines +621 to +633
// Register checkpoint proposal handler for all nodes.
// Validates proposals before setting proposed checkpoint on archiver.
const getValidatorAddresses = validatorClient
  ? () => validatorClient.getValidatorAddresses().map(a => a.toString())
  : undefined;
createCheckpointProposalHandler(config, {
  checkpointsBuilder: validatorCheckpointsBuilder,
  blockSource: archiver,
  l1ToL2MessageSource: archiver,
  epochCache,
  dateProvider,
  telemetry,
}).register(p2pClient, archiver, getValidatorAddresses);
Contributor

Damn, in #21999 I moved checkpoint proposal logic to the block proposal handler, and renamed it to "proposal handler". Sorry for the conflict. We should unify it. I'm fine with either approach.

Comment on lines +132 to +133
// When pipelining, force the proposed checkpoint number and fee header to the parent so that
// the fee computation matches what L1 will see when the previous pipelined checkpoint has landed.
Contributor

Just to confirm: the LAG for updating fees is enough that we can safely precompute it this much in advance?

@Maddiaa0 Maddiaa0 force-pushed the md/pipelining-syncing branch from bfccd52 to 69756aa Compare March 27, 2026 18:00
p2pClient.registerBlockProposalHandler(blockHandler);

// Register checkpoint proposal handler if blob uploads are enabled and we are reexecuting
if (this.blobClient.canUpload() && shouldReexecute) {
Member Author

I register it for everyone, but I can guard this with pipeliningEnabled.

Comment on lines +116 to +117
// These tests verify parity with Solidity FeeLib.sol (computeNewEthPerFeeAsset, clampedAdd).
// If FeeLib.sol changes, these tests must be updated to match.
Contributor

I think we'll have overlap with #22116. You merge first on this one.

3 participants