fix(epoch-cache): use TTL-based caching with finalization tracking and correct lag #22204
Merged
spalladino merged 4 commits into merge-train/spartan on Apr 8, 2026
…d correct lag The epoch cache previously cached committee data forever once fetched, and used lagInEpochsForValidatorSet (the looser constraint) for its staleness guard. This could serve stale data after L1 reorgs and was less strict than the L1 contract. Switch to a TTL-based approach: finalized entries are cached permanently, while non-finalized entries expire after one Ethereum slot (12s) and get re-fetched. Use lagInEpochsForRandao (the binding constraint) and compute the sampling timestamp from the epoch start to match the L1 contract's logic. Concurrent requests for the same epoch coalesce on a single in-flight promise stored directly in the cache map. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…promises in LRU When a stale non-finalized entry expires, do a lightweight refresh: query only the block hash at the original block number and the finalized block timestamp. If the hash matches (no reorg), keep the cached data and just update the timestamp and finalization flag — avoiding expensive getCommitteeAt and getSampleSeedAt calls. Only do a full re-fetch on hash mismatch (reorg). Also cache empty committees (without the finalized flag, so they always get re-queried after TTL), and include in-flight promises in LRU purge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ss prefetched timestamps Rename l1BlockNumber/l1BlockHash to lastQueryL1BlockNumber/lastQueryL1BlockHash and l1BlockTimestamp to lastRefreshL1Timestamp for clarity on their roles. Move the latest block fetch into the initial Promise.all in refreshStaleEntry to minimize latency. Pass already-fetched latest and finalized timestamps from refreshStaleEntry to fetchAndCache on reorg, avoiding redundant L1 queries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…stead of timestamps Avoids creating fake block objects with zero number/hash on the prefetched path. The refreshStaleEntry method already has the full latest and finalized blocks from its Promise.all, so just pass them through directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PhilWindle
approved these changes
Apr 8, 2026
Motivation
PR #22153 introduced a hard "finalized block guard" that refuses to compute committees if L1 data isn't finalized. While the safety goal is valid (preventing L1 reorgs from invalidating cached committees), it breaks many tests that don't properly set L1 finalized time and would cause the chain to stall if L1 stops finalizing. This PR takes a different approach that preserves safety while maintaining liveness.
Also fixes the lag parameter: the old code used lagInEpochsForValidatorSet (the looser constraint) instead of lagInEpochsForRandao (the binding one), and computed the sampling timestamp from the slot rather than the epoch start.
Fixes A-680
Approach
Instead of refusing to serve committee data that isn't finalized, use a TTL-based cache: finalized entries are cached permanently, non-finalized entries expire after one Ethereum slot (12s) and get re-fetched from L1. The cache map stores both resolved entries and in-flight promises directly, so concurrent callers for the same epoch coalesce on a single L1 query. On fetch failure, the previous stale entry is restored so the next caller retries cleanly.
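The approach above can be illustrated with a minimal sketch. This is not the actual epoch cache code: the class name `TtlEpochCache`, the `Entry` shape, the `Fetcher` callback, and the injectable wall clock are all hypothetical and chosen only to keep the example self-contained. It shows the three behaviours described: finalized entries never expire, non-finalized entries expire after 12s, concurrent callers share one in-flight promise, and a failed fetch restores the previous stale entry.

```typescript
// Illustrative sketch only; all names here are hypothetical, not the PR's API.
type Entry = { committee: string[]; finalized: boolean; fetchedAtMs: number };
type Fetcher = (epoch: number) => Promise<{ committee: string[]; finalized: boolean }>;

const TTL_MS = 12_000; // one Ethereum slot

class TtlEpochCache {
  // The map holds either a resolved entry or the promise of one in flight.
  private cache = new Map<number, Entry | Promise<Entry>>();
  private fetcher: Fetcher;
  private now: () => number;

  constructor(fetcher: Fetcher, now: () => number = () => Date.now()) {
    this.fetcher = fetcher;
    this.now = now;
  }

  get(epoch: number): Promise<Entry> {
    const cached = this.cache.get(epoch);
    // Another caller is already fetching this epoch: share its promise.
    if (cached instanceof Promise) {
      return cached;
    }
    // Finalized entries are kept forever; others only within the TTL.
    if (cached && (cached.finalized || this.now() - cached.fetchedAtMs < TTL_MS)) {
      return Promise.resolve(cached);
    }
    const promise = this.fetcher(epoch).then(
      (res) => {
        const entry: Entry = { ...res, fetchedAtMs: this.now() };
        this.cache.set(epoch, entry);
        return entry;
      },
      (err: unknown) => {
        // Put the stale entry back (or clear the slot) so the next caller retries.
        if (cached) {
          this.cache.set(epoch, cached);
        } else {
          this.cache.delete(epoch);
        }
        throw err;
      },
    );
    this.cache.set(epoch, promise);
    return promise;
  }
}
```

Storing the promise in the same map as resolved entries means a single lookup answers both "is there data?" and "is someone already fetching it?", which is what lets concurrent callers coalesce without a separate pending-requests table.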
Changes
Replaced the Map<EpochNumber, EpochCommitteeInfo> cache with Map<EpochNumber, CachedEpochEntry | Promise<CachedEpochEntry>>. Each resolved entry carries L1 block provenance metadata (number, hash, timestamp) and a finalized flag. Switched from lagInEpochsForValidatorSet to lagInEpochsForRandao and now compute the sampling timestamp from the epoch start via getStartTimestampForEpoch. Simplified isEscapeHatchOpen to delegate cache management to getCommittee.
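The provenance metadata is what makes the lightweight refresh from the second commit possible. The sketch below is one way it could look, under stated assumptions: the field names lastQueryL1BlockNumber, lastQueryL1BlockHash, and lastRefreshL1Timestamp come from the commit messages, but the `L1Reader` interface, the `fullFetch` callback, the use of plain numbers, and the rule for setting the finalized flag (finalized block number at or past the query block) are hypothetical simplifications rather than the merged implementation.

```typescript
// Illustrative sketch only; L1Reader, fullFetch and the finalization rule are assumptions.
type L1Block = { number: number; hash: string; timestamp: number };

interface CachedEpochEntry {
  committee: string[];
  lastQueryL1BlockNumber: number;
  lastQueryL1BlockHash: string;
  lastRefreshL1Timestamp: number;
  finalized: boolean;
}

// Hypothetical minimal view of an L1 client.
interface L1Reader {
  getBlockHash(blockNumber: number): Promise<string>;
  getLatestBlock(): Promise<L1Block>;
  getFinalizedBlock(): Promise<L1Block>;
}

async function refreshStaleEntry(
  stale: CachedEpochEntry,
  l1: L1Reader,
  // Full (expensive) re-fetch, given the blocks that were already retrieved.
  fullFetch: (latest: L1Block, finalized: L1Block) => Promise<CachedEpochEntry>,
): Promise<CachedEpochEntry> {
  // Issue all three cheap queries together to minimize latency.
  const [hashNow, latest, finalized] = await Promise.all([
    l1.getBlockHash(stale.lastQueryL1BlockNumber),
    l1.getLatestBlock(),
    l1.getFinalizedBlock(),
  ]);

  // Hash mismatch at the original block number means a reorg: re-fetch everything.
  if (hashNow !== stale.lastQueryL1BlockHash) {
    return fullFetch(latest, finalized);
  }

  // No reorg: keep the committee, bump the refresh timestamp, update the flag.
  // Empty committees are never marked finalized, so they are re-queried after the TTL.
  return {
    ...stale,
    lastRefreshL1Timestamp: latest.timestamp,
    finalized: stale.committee.length > 0 && finalized.number >= stale.lastQueryL1BlockNumber,
  };
}
```

Passing the already-fetched latest and finalized blocks into `fullFetch` mirrors the later commits, which avoid redundant L1 queries on the reorg path.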