feat(slasher): per-slot data-withholding watcher (A-523, A-525)#23116
Conversation
spalladino left a comment
Looks good! Left a few comments. Aside from them, something I'm worried about is race conditions that cause the TxPool to evict txs before the data-withholding watcher has had time to check availability. For instance: say an epoch gets proven really fast, so its txs get evicted from the mempool (since they are no longer needed) faster than the data-withholding tolerance. When the watcher kicks in, the txs will have been removed from the tx pool, and the slash will kick in. Or even worse: in the event of a prune due to a missing proof, the txs for all blocks in the epoch become unprotected (or removed if they're invalid, as they may have been anchored to pruned blocks), and the data-withholding check will kick in a few slots after the prune and want to slash all attesters of the last few checkpoints in the epoch. I'm not sure what the best way to fix this is, though, and apologies for not catching this earlier.
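One possible shape for a guard against this race, purely as a sketch: before rendering a verdict, the watcher could ask the L2 block source whether the checkpoint's epoch was already proven or pruned, and skip slashing in either case. Both accessor names below are assumptions for illustration, not the PR's actual API.

```ts
// Hypothetical guard for the race described above: if the checkpoint's epoch was
// proven (txs legitimately evicted from the mempool) or pruned (txs unprotected or
// dropped as invalid) before the tolerance elapsed, missing txs are not evidence
// of withholding. Both source methods are assumed names, not the real interface.
interface EpochStatusSource {
  getProvenL2EpochNumber(): Promise<bigint>;
  getLastPrunedEpoch(): Promise<bigint | undefined>;
}

async function shouldSkipVerdict(source: EpochStatusSource, checkpointEpoch: bigint): Promise<boolean> {
  if (checkpointEpoch <= (await source.getProvenL2EpochNumber())) {
    return true; // epoch proven: mempool eviction is expected
  }
  const lastPruned = await source.getLastPrunedEpoch();
  return lastPruned !== undefined && checkpointEpoch <= lastPruned; // epoch pruned
}
```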
```ts
// Staleness: the proposer can only publish this checkpoint within its wallclock slot,
// which is `slot - pipeliningOffset`.
const currentSlot = this.epochCache.getSlotNow();
const lastAttestableWallclockSlot = slotNumber - this.epochCache.pipeliningOffset();
```
We should use the same helpers as in the attestation_validator here, which I think are more lenient.
More generally, when in doubt we should probably broadcast our attestation anyway: better to be penalized on p2p than slashed due to inactivity.
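As a sketch of the lenient direction (the grace margin and gate shape are illustrative assumptions, not the attestation_validator's actual helpers):

```ts
// Sketch only: a more lenient staleness gate. Refuse to attest only when we are
// unambiguously past the proposer's wallclock window; on the borderline, attest
// anyway and let p2p scoring absorb a truly stale proposal. GRACE_SLOTS and the
// function shape are assumptions for illustration.
const GRACE_SLOTS = 1n;

function shouldAttestToSlot(slotNumber: bigint, currentSlot: bigint, pipeliningOffset: bigint): boolean {
  const lastAttestableWallclockSlot = slotNumber - pipeliningOffset;
  return currentSlot <= lastAttestableWallclockSlot + GRACE_SLOTS;
}
```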
```ts
// before we declare its data missing. For checkpoint slot S, we therefore process S
// only once we are in slot `S + tolerance + 1` or later.
const tolerance = this.config.slashDataWithholdingToleranceSlots;
const currentSlot = this.epochCache.getSlotNow();
```
I'd change to `L2BlockSource.getSyncedL2SlotNumber` in case the archiver is lagging behind for some reason.
```ts
this.log.warn(`Detected ${missingTxs.length} missing txs at slot ${slot} but no recoverable attesters`, {
  slot,
  missingTxs: missingTxs.map(h => h.toString()),
});
```
This is weird: the publisher itself should at least be an attester. But fine to leave this in just as a guard.
```ts
// 4. Wait until the proposer is the designated current proposer.
t.logger.warn(`Waiting for ${proposerAddress} to be the current proposer`);
```
With pipelining you should wait until proposerAddress is due to be the proposer in the next slot according to the rollup contract. I'm surprised the test passes anyway.
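A sketch of what that wait might look like, assuming a `retryUntil` helper as in `@aztec/foundation`; the rollup-contract accessor names (`getSlotNumber`, `getProposerAt`) are assumptions, and `rollup` / `proposerAddress` come from the test context:

```ts
import { retryUntil } from '@aztec/foundation/retry';

// With pipelining, the proposer for slot S publishes during slot S - pipeliningOffset,
// so wait until proposerAddress is the proposer of the *next* slot per the rollup
// contract, not merely the current designated proposer.
async function waitUntilNextSlotProposer(
  rollup: { getSlotNumber(): Promise<bigint>; getProposerAt(slot: bigint): Promise<{ toString(): string }> },
  proposerAddress: string,
): Promise<void> {
  await retryUntil(
    async () => {
      const nextSlot = (await rollup.getSlotNumber()) + 1n;
      const nextProposer = await rollup.getProposerAt(nextSlot);
      return nextProposer.toString().toLowerCase() === proposerAddress.toLowerCase();
    },
    'wait-for-next-slot-proposer',
    /* timeoutSeconds */ 120,
  );
}
```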
```ts
// 7. Wait for the slash to execute on L1. Advance to the slashing tally epoch so B/C/D
// can start voting on the offense round.
```
I'd argue we should start dropping this part from all e2e slashing tests, so they run faster, and keep it in only one of them.
```ts
const proposerP2pService: any = (proposerNode as any).p2pClient.p2pService;
const originalPropagate = proposerP2pService.propagate.bind(proposerP2pService);
jest.spyOn(proposerP2pService, 'propagate').mockImplementation(((msg: any) => {
  if (msg instanceof Tx) {
    t.logger.info(`Suppressing outbound tx gossip from proposer ${proposerAddress}`);
    return Promise.resolve();
  }
  return originalPropagate(msg);
}) as any);
```
I prefer adding test-only config flags to control this behaviour, so e2e tests don't need to mess with internals. But it's a personal preference.
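For illustration, the config-flag alternative might look like the sketch below; the flag name and wiring are hypothetical, not anything the codebase defines.

```ts
// Hypothetical test-only flag consulted inside the p2p service's propagate path,
// so e2e tests set config instead of jest-mocking internals.
interface TestOnlyP2POptions {
  suppressOutboundTxGossip?: boolean;
}

function makePropagate(
  doPropagate: (msg: unknown) => Promise<void>,
  isTx: (msg: unknown) => boolean,
  testOnly: TestOnlyP2POptions = {},
) {
  return (msg: unknown): Promise<void> => {
    if (testOnly.suppressOutboundTxGossip && isTx(msg)) {
      return Promise.resolve(); // silently drop outbound tx gossip, like the mock above
    }
    return doPropagate(msg);
  };
}
```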
A-523: Replace the L1-prune-time data-withholding check with a per-slot watcher that runs once `slashDataWithholdingToleranceSlots` full slots have elapsed after a published checkpoint, probes the local mempool via `ITxProvider.hasTxs`, and emits a slot-keyed DATA_WITHHOLDING offense for every attester (union of L1-published + p2p-pool) that signed the checkpoint when any of its txs are still missing. Restart-floor at boot slot, ticks at quarter-eth-slot cadence.

A-525: Remove the legacy VALID_EPOCH_PRUNED slashing path: `EpochPruneWatcher`, `slashPrunePenalty`, and `OffenseType.VALID_EPOCH_PRUNED`.

Protocol fixes that close the data-withholding loophole:
- `validateCheckpointProposal`: refuse to attest if the block's enclosing checkpoint is already on L1; we cannot legitimately attest based on L1-synced data we did not witness on p2p.
- `shouldAttestToSlot`: refuse a proposal whose proposer's wallclock window has elapsed (`currentSlot > slot - pipeliningOffset`).
- Block-sync timeout in `validateCheckpointProposal`: use `getReexecutionDeadline` so the deadline is the end of the proposal's own slot when pipelining is off, and the start of the target slot when it is on.

P2P:
- `ITxProvider.hasTxs(txHashes)` added for the watcher's mempool probe.
- Optional `p2pMissingTxCollectionDeadlineSlots` clamped up to the tolerance window, so collection never gives up before the slash verdict.

E2E:
- Rewrite `data_withholding_slash` to model the realistic attack: stub proposer outbound tx gossip + two blind attesters + one honest member who naturally refuses via the new gates. Retry epoch advance until a full committee is found.
- Patch `duplicate_attestation_slash`: stub `notifyOwnCheckpointProposal` on the malicious nodes to break the self-loopback into local validation (with `skipPushProposedBlocksToArchiver` the proposer's block never lands in its archiver, so the loopback hung the proposer past the staleness gate and suppressed all duplicate attestations).
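A condensed sketch of the per-slot decision the A-523 description walks through. The interface shapes are simplified stand-ins (and the review's `getSyncedL2SlotNumber` suggestion is folded in); only the tolerance gating, `hasTxs` probe, and per-attester slot-keyed offense follow the PR.

```ts
type TxHash = string; // simplified stand-ins for the real stdlib types
type EthAddress = string;

interface CheckpointInfo {
  slotNumber: bigint;
  txHashes: TxHash[];
  attesters: EthAddress[]; // union of L1-published and p2p-pool attestations
}

interface Offense {
  offenseType: 'DATA_WITHHOLDING';
  slotNumber: bigint; // slot-keyed: offense identity is per-checkpoint
  validator: EthAddress;
}

async function checkCheckpoint(
  checkpoint: CheckpointInfo,
  txProvider: { hasTxs(hashes: TxHash[]): Promise<boolean[]> },
  l2BlockSource: { getSyncedL2SlotNumber(): Promise<bigint> },
  toleranceSlots: bigint,
): Promise<Offense[]> {
  // Process checkpoint slot S only once slot S + tolerance + 1 has started, using
  // the archiver-synced slot rather than the wallclock (per review).
  const currentSlot = await l2BlockSource.getSyncedL2SlotNumber();
  if (currentSlot <= checkpoint.slotNumber + toleranceSlots) {
    return [];
  }
  const present = await txProvider.hasTxs(checkpoint.txHashes);
  if (present.every(Boolean)) {
    return []; // all tx data available: nothing to slash
  }
  // One slot-keyed DATA_WITHHOLDING offense per attester who signed the checkpoint.
  return checkpoint.attesters.map(validator => ({
    offenseType: 'DATA_WITHHOLDING' as const,
    slotNumber: checkpoint.slotNumber,
    validator,
  }));
}
```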
```ts
  }

  public start(): Promise<void> {
    this.initialSlot = this.epochCache.getSlotNow();
```
Should be `l2BlockSource.getSyncedL2SlotNumber()` here as well; sorry I missed it on the first pass.
```ts
 * driven externally via `removeBefore` (typically by the proposal handler, once a
 * checkpoint reaches L1 finality).
 */
export class InMemoryCheckpointReexecutionTracker implements CheckpointReexecutionTracker {
```
Personally I've started removing intermediate interfaces when we have a single implementation, since the indirection makes code navigation more annoying. We should agree on whether it's a good policy or not.
```ts
// We successfully re-executed every block in this checkpoint locally, record for any observers
this.reexecutionTracker.recordReexecuted(checkpointNumber, proposal.archive);
```
Should we add a `recordTxsCollected` before we reexecute, in case we do manage to collect the txs but fail to reexecute within the deadline?
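Roughly the ordering being suggested, as a fragment in the same method context as the quoted code; `recordTxsCollected` and `reexecuteBlocks` are hypothetical names here.

```ts
// Sketch: record successful tx collection *before* re-execution, so a checkpoint
// whose txs arrived but whose re-execution blew the deadline is not mistaken
// for data withholding. recordTxsCollected and reexecuteBlocks are assumed names.
this.reexecutionTracker.recordTxsCollected(checkpointNumber, proposal.archive);
const reexecuted = await this.reexecuteBlocks(proposal);
if (reexecuted) {
  this.reexecutionTracker.recordReexecuted(checkpointNumber, proposal.archive);
}
```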
Mirror the same archiver-synced-slot logic used in `work()`: use `l2BlockSource.getSyncedL2SlotNumber()` rather than the wallclock to floor processing on start. Avoids treating slots as already started before the archiver has ingested them. Per PR review.
…kpoint_proposal_slash test

`slashPrunePenalty` was removed from `SetupOptions` as part of A-523/A-525. This e2e test still referenced it, breaking the TS build.
Summary
Moves the data-withholding slash from the L1-prune path to a per-slot check at `slotStart(checkpoint.slot + slashDataWithholdingToleranceSlots)`, and removes the now-unnecessary `VALID_EPOCH_PRUNED` offense and `EpochPruneWatcher`.

Per AZIP-7: validators are responsible for making tx data available, not for ensuring proofs land. The new `DataWithholdingWatcher` ticks at quarter-eth-slot cadence and, for each published checkpoint older than `dataWithholdingToleranceSlots` (default 3), probes the local mempool for the txs in the checkpoint's blocks. Missing txs trigger a `DATA_WITHHOLDING` slash for the validators who actually attested to that checkpoint.

Highlights
- New `DataWithholdingWatcher` (`yarn-project/slasher/src/watchers/data_withholding_watcher.ts`) with full unit-test coverage. Sentinel-style tick + restart floor (no KV).
- `DATA_WITHHOLDING` moved from `'epoch'` to `'slot'` in `getTimeUnitForOffense` (sketched below). Offense identity is now per-checkpoint, not per-epoch.
- `P2PClient.collectingMissingTxs` anchors its tx-collection deadline to `slotStart(block.slot + slashDataWithholdingToleranceSlots)` so the collection effort runs to exactly the wall-clock instant the watcher renders its verdict. The ad-hoc `p2pMissingTxCollectionDeadlineMs` is removed.
- Removed: `OffenseType.VALID_EPOCH_PRUNED`, `slashPrunePenalty` config + env var + spartan plumbing, `EpochPruneWatcher` class + tests, `valid_epoch_pruned_slash.test.ts`.
- `data_withholding_slash.test.ts`: 4 validators, `slashSelfAllowed`, tx is mined normally then stubbed missing on every node, watcher fires, slash executes on-chain, committee is kicked. Asserts slot-keyed offense identity + on-chain effect.
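The keying flip in `getTimeUnitForOffense` amounts to something like the following sketch; the other enum arms are elided and the function shape is an assumption.

```ts
type TimeUnit = 'slot' | 'epoch' | 'round';

// Sketch of the flip: DATA_WITHHOLDING is now identified by the checkpoint's slot,
// so each withheld checkpoint is its own offense instead of collapsing per epoch.
// The remaining offense types (elided here) keep their existing units.
function getTimeUnitForOffense(offenseType: string): TimeUnit {
  switch (offenseType) {
    case 'DATA_WITHHOLDING':
      return 'slot'; // previously 'epoch'
    default:
      return 'epoch'; // placeholder for the elided arms
  }
}
```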
Test plan

- `yarn workspace @aztec/slasher test`: 76 tests pass (incl. 9 new `DataWithholdingWatcher` tests).
- `yarn workspace @aztec/stdlib test src/slashing`: 55 tests pass after the keying flip.
- `yarn workspace @aztec/p2p test src/client/p2p_client.test.ts`: 20 tests pass with the new slot-anchored deadline.
- `yarn build` clean across the monorepo.
- The e2e test (`e2e_p2p_data_withholding_slash`) is in place but only runs in CI.

Out of scope
- The `L2PruneUnproven` event/emitter is left in place; nothing subscribes to it after this PR, but the event itself stays available for future observers.

Closes A-523.
Closes A-525.