fix(epoch-cache): use TTL-based caching with finalization tracking and correct lag by spalladino · Pull Request #22204 · AztecProtocol/aztec-packages

spalladino · 2026-04-01T00:34:28Z

Motivation

PR #22153 introduced a hard "finalized block guard" that refuses to compute committees if L1 data isn't finalized. While the safety goal is valid (preventing L1 reorgs from invalidating cached committees), it breaks many tests that don't properly set L1 finalized time and would cause the chain to stall if L1 stops finalizing. This PR takes a different approach that preserves safety while maintaining liveness.

Also fixes the lag parameter: the old code used lagInEpochsForValidatorSet (the looser constraint) instead of lagInEpochsForRandao (the binding one), and computed the sampling timestamp from the slot rather than the epoch start.
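The difference between sampling from the slot and sampling from the epoch start can be sketched as below. This is an illustrative sketch only: the `L1Constants` shape, the body of `getStartTimestampForEpoch`, the `samplingTimestampFromSlot` helper, and the formula "epoch start minus lag epochs" are assumptions made for the example, not the code in this PR or the L1 contract.

```typescript
// Hypothetical constants shape; field names are assumptions for illustration.
interface L1Constants {
  l1GenesisTime: number; // seconds
  slotDuration: number; // seconds per L2 slot
  epochDuration: number; // L2 slots per epoch
}

// Stand-in for the helper named in the PR description; the real one may differ.
function getStartTimestampForEpoch(epoch: number, c: L1Constants): number {
  return c.l1GenesisTime + epoch * c.epochDuration * c.slotDuration;
}

// Assumed rule: sample `lagInEpochs` whole epochs before the epoch start.
function getSamplingTimestamp(epoch: number, lagInEpochs: number, c: L1Constants): number {
  const epochSeconds = c.epochDuration * c.slotDuration;
  return getStartTimestampForEpoch(epoch, c) - lagInEpochs * epochSeconds;
}

// Hypothetical slot-based variant, shown only to illustrate the problem:
// two slots in the same epoch yield different sampling timestamps.
function samplingTimestampFromSlot(slot: number, lagInEpochs: number, c: L1Constants): number {
  const epochSeconds = c.epochDuration * c.slotDuration;
  return c.l1GenesisTime + slot * c.slotDuration - lagInEpochs * epochSeconds;
}

const exampleConstants: L1Constants = { l1GenesisTime: 1000, slotDuration: 36, epochDuration: 32 };
// Epoch 5 starts at slot 160; slot 170 is in the same epoch.
const fromEpochStart = getSamplingTimestamp(5, 2, exampleConstants); // 4456
const fromSlot = samplingTimestampFromSlot(170, 2, exampleConstants); // 4816
```

With the epoch-based computation every slot in an epoch maps to the same sampling timestamp, so the cached committee for that epoch is keyed to a single L1 point in time.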

Fixes A-680

Approach

Instead of refusing to serve committee data that isn't finalized, use a TTL-based cache: finalized entries are cached permanently, non-finalized entries expire after one Ethereum slot (12s) and get re-fetched from L1. The cache map stores both resolved entries and in-flight promises directly, so concurrent callers for the same epoch coalesce on a single L1 query. On fetch failure, the previous stale entry is restored so the next caller retries cleanly.
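A minimal sketch of the approach described above, assuming a generic key/value cache rather than the actual epoch-cache types. The `TtlCache` class, the `Fetcher` signature, and the injected clock are hypothetical names introduced for illustration; the real implementation also tracks L1 block provenance, which is omitted here.

```typescript
type Entry<V> = { value: V; finalized: boolean; fetchedAtMs: number };
type Fetcher<V> = (key: number) => Promise<{ value: V; finalized: boolean }>;

class TtlCache<V> {
  // The map holds either a resolved entry or the in-flight promise for it.
  private readonly entries = new Map<number, Entry<V> | Promise<Entry<V>>>();
  private readonly fetcher: Fetcher<V>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(fetcher: Fetcher<V>, ttlMs: number = 12000, now: () => number = () => Date.now()) {
    this.fetcher = fetcher;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async get(key: number): Promise<V> {
    const cached = this.entries.get(key);
    if (cached instanceof Promise) {
      // Another caller is already fetching this key: coalesce on its promise.
      return (await cached).value;
    }
    if (cached && (cached.finalized || this.now() - cached.fetchedAtMs < this.ttlMs)) {
      // Finalized entries never expire; non-finalized ones live for one TTL.
      return cached.value;
    }
    const inFlight: Promise<Entry<V>> = this.fetcher(key).then(
      result => {
        const entry: Entry<V> = { ...result, fetchedAtMs: this.now() };
        this.entries.set(key, entry);
        return entry;
      },
      err => {
        // On failure, restore the stale entry (if any) so the next caller retries.
        if (cached) {
          this.entries.set(key, cached);
        } else {
          this.entries.delete(key);
        }
        throw err;
      },
    );
    this.entries.set(key, inFlight);
    return (await inFlight).value;
  }
}
```

Storing the promise in the same map as resolved entries means there is no separate "pending" structure to keep in sync, at the cost of a type check on every lookup.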

Changes

epoch-cache: Replaced the simple Map<EpochNumber, EpochCommitteeInfo> cache with Map<EpochNumber, CachedEpochEntry | Promise<CachedEpochEntry>>. Each resolved entry carries L1 block provenance metadata (number, hash, timestamp) and a finalized flag. Switched from lagInEpochsForValidatorSet to lagInEpochsForRandao, and the sampling timestamp is now computed from the epoch start via getStartTimestampForEpoch. Simplified isEscapeHatchOpen to delegate cache management to getCommittee.
epoch-cache (tests): Updated unit tests for the new cache structure. Added 4 new TTL tests: re-query after TTL, no re-query for finalized, concurrent coalescing, eventual finalization promotion.
epoch-cache (integration tests): New integration test suite against real Anvil with deployed L1 contracts and 4 validators. Tests finalized committee retrieval, non-finalized TTL refresh, and cache re-fetch after L1 reorg.
epoch-cache (README): Added comprehensive documentation covering committee computation, LAG values, RANDAO seed, proposer selection, escape hatch, TTL caching with finalization tracking, and configuration.

…d correct lag

The epoch cache previously cached committee data forever once fetched, and used lagInEpochsForValidatorSet (the looser constraint) for its staleness guard. This could serve stale data after L1 reorgs and was less strict than the L1 contract.

Switch to a TTL-based approach: finalized entries are cached permanently, while non-finalized entries expire after one Ethereum slot (12s) and get re-fetched. Use lagInEpochsForRandao (the binding constraint) and compute the sampling timestamp from the epoch start to match the L1 contract's logic. Concurrent requests for the same epoch coalesce on a single in-flight promise stored directly in the cache map.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…promises in LRU

When a stale non-finalized entry expires, do a lightweight refresh: query only the block hash at the original block number and the finalized block timestamp. If the hash matches (no reorg), keep the cached data and just update the timestamp and finalization flag, avoiding expensive getCommitteeAt and getSampleSeedAt calls. Only do a full re-fetch on hash mismatch (reorg).

Also cache empty committees (without the finalized flag, so they always get re-queried after TTL), and include in-flight promises in LRU purge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
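The lightweight refresh described in this commit could look roughly like the sketch below. The `L1Reader` interface and its method names are hypothetical stand-ins for the L1 client, the entry field names follow the renaming described in a later commit of this PR, and the finalization check (finalized block number has reached the queried block) is an assumption made for the example; the actual code may compare timestamps instead.

```typescript
interface L1Block {
  number: number;
  hash: string;
  timestamp: number;
}

interface CachedEntry<T> {
  data: T;
  lastQueryL1BlockNumber: number;
  lastQueryL1BlockHash: string;
  lastRefreshL1Timestamp: number;
  finalized: boolean;
}

// Hypothetical L1 access interface, not a real client API.
interface L1Reader<T> {
  getBlockHash(blockNumber: number): Promise<string | undefined>;
  getLatestBlock(): Promise<L1Block>;
  getFinalizedBlock(): Promise<L1Block>;
  fetchData(atBlock: L1Block): Promise<T>; // the expensive queries
}

async function refreshStaleEntry<T>(entry: CachedEntry<T>, l1: L1Reader<T>): Promise<CachedEntry<T>> {
  // Issue all cheap queries together to keep latency to one round trip.
  const [hash, latest, finalizedBlock] = await Promise.all([
    l1.getBlockHash(entry.lastQueryL1BlockNumber),
    l1.getLatestBlock(),
    l1.getFinalizedBlock(),
  ]);

  if (hash === entry.lastQueryL1BlockHash) {
    // No reorg: keep the cached data, only bump the timestamp and the flag.
    return {
      ...entry,
      lastRefreshL1Timestamp: latest.timestamp,
      finalized: finalizedBlock.number >= entry.lastQueryL1BlockNumber,
    };
  }

  // Hash mismatch means the queried block was reorged out: full re-fetch,
  // reusing the blocks that were already fetched above.
  const data = await l1.fetchData(latest);
  return {
    data,
    lastQueryL1BlockNumber: latest.number,
    lastQueryL1BlockHash: latest.hash,
    lastRefreshL1Timestamp: latest.timestamp,
    finalized: finalizedBlock.number >= latest.number,
  };
}
```

The point of the hash comparison is that an unchanged hash at the same block number proves the data read at that block is still on the canonical chain, so the expensive queries can be skipped.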

…ss prefetched timestamps

Rename l1BlockNumber/l1BlockHash to lastQueryL1BlockNumber/lastQueryL1BlockHash and l1BlockTimestamp to lastRefreshL1Timestamp for clarity on their roles. Move the latest block fetch into the initial Promise.all in refreshStaleEntry to minimize latency. Pass already-fetched latest and finalized timestamps from refreshStaleEntry to fetchAndCache on reorg, avoiding redundant L1 queries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…stead of timestamps

Avoids creating fake block objects with zero number/hash on the prefetched path. The refreshStaleEntry method already has the full latest and finalized blocks from its Promise.all, so just pass them through directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

spalladino and others added 4 commits March 31, 2026 21:28

PhilWindle approved these changes Apr 8, 2026


spalladino enabled auto-merge (squash) April 8, 2026 12:28

spalladino disabled auto-merge April 8, 2026 12:42

spalladino merged commit 2730d08 into merge-train/spartan Apr 8, 2026
12 checks passed

spalladino deleted the palla/epoch-cache-ttl branch April 8, 2026 12:42

spalladino mentioned this pull request Apr 8, 2026

fix(epoch-cache): use finalized L1 block and correct lag for committee guard #22153

Closed

AztecBot mentioned this pull request Apr 8, 2026

feat: merge-train/spartan #22352

Open
