Add cross-cycle presence cache to reduce repair HeadObject calls #192
Merged
raymondjacobson merged 1 commit into OpenAudio:main on Apr 4, 2026
Repair cycles call bucket.Attributes (HeadObject) for every locally-held CID to verify presence. On S3-compatible backends this is the dominant source of metadata API calls. Add an imcache LRU (500K entries, no TTL) that remembers confirmed-present keys across cycles. Non-cleanup repair cycles check the cache first and skip the Attributes call on a hit. Cleanup cycles (every 4th) clear the cache via RemoveAll and do full verification, because they need ModTime for over-replication decisions and run blob validation. The cache is populated after all validation passes (so corrupt blobs are never cached) and on replicateToMyBucket writes; entries are invalidated on dropFromMyBucket deletes and cleanup validation deletes.
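The cycle logic described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `presenceCache` is a hand-rolled LRU set standing in for the imcache-backed cache, `head` stands in for bucket.Attributes, and treating `cycle%4 == 0` as the cleanup cycle is an assumption. Blob validation is omitted, so a successful `head` is treated as "validation passed".

```go
package main

import (
	"container/list"
	"fmt"
)

// presenceCache is a small bounded LRU set of keys. It is a hypothetical
// stand-in for the imcache-backed knownPresent cache, not the PR's code.
type presenceCache struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> node in order
}

func newPresenceCache(capacity int) *presenceCache {
	return &presenceCache{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Has reports whether key is cached and marks it as recently used.
func (c *presenceCache) Has(key string) bool {
	el, ok := c.items[key]
	if ok {
		c.order.MoveToFront(el)
	}
	return ok
}

// Add records key as confirmed present, evicting the least recently used
// key when the cache is full.
func (c *presenceCache) Add(key string) {
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return
	}
	if c.capacity > 0 && c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(string))
	}
	c.items[key] = c.order.PushFront(key)
}

// Remove invalidates a single key (for example after a delete).
func (c *presenceCache) Remove(key string) {
	if el, ok := c.items[key]; ok {
		c.order.Remove(el)
		delete(c.items, key)
	}
}

// RemoveAll drops every entry.
func (c *presenceCache) RemoveAll() {
	c.order.Init()
	c.items = make(map[string]*list.Element)
}

// repairCycle checks every key and returns how many head calls it issued.
func repairCycle(cycle int, keys []string, cache *presenceCache, head func(string) bool) int {
	cleanup := cycle%4 == 0 // assumption: which cycle counts as "every 4th"
	if cleanup {
		cache.RemoveAll() // cleanup cycles re-verify everything
	}
	calls := 0
	for _, k := range keys {
		if !cleanup && cache.Has(k) {
			continue // cache hit: skip the metadata call
		}
		calls++
		if head(k) {
			cache.Add(k)
		} else {
			cache.Remove(k)
		}
	}
	return calls
}

func main() {
	keys := []string{"cid-a", "cid-b", "cid-c"}
	cache := newPresenceCache(500_000)
	head := func(string) bool { return true }
	for cycle := 1; cycle <= 4; cycle++ {
		fmt.Printf("cycle %d: %d head calls\n", cycle, repairCycle(cycle, keys, cache, head))
	}
}
```

In this sketch only the cold cycle and the cleanup cycle issue head calls; the cycles in between are served entirely from the cache.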
raymondjacobson (Contributor) approved these changes on Apr 3, 2026 and left a comment:
nice little PR here! The once every 4 cycles cleanup makes sense to me.
we'll see how the 500k limit fares. it may be worth exposing this in a var because archive nodes that volunteer to store the whole catalog would make use of a larger cache too
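One way the suggestion above might look, as a rough sketch only: the capacity is read from an environment variable with the 500K default as a fallback. The variable name `KNOWN_PRESENT_CACHE_SIZE` and the helper `parseKnownPresentCap` are made up for illustration and do not appear in the PR.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultKnownPresentCap mirrors the 500K entry limit mentioned in the PR.
const defaultKnownPresentCap = 500_000

// parseKnownPresentCap turns a raw setting into a capacity. Empty, invalid,
// or non-positive values fall back to the default. (Hypothetical helper.)
func parseKnownPresentCap(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return defaultKnownPresentCap
	}
	return n
}

func main() {
	// KNOWN_PRESENT_CACHE_SIZE is an invented name, used here only as an example.
	capacity := parseKnownPresentCap(os.Getenv("KNOWN_PRESENT_CACHE_SIZE"))
	fmt.Println("knownPresent capacity:", capacity)
}
```

An archive node storing the whole catalog could then raise the limit without a rebuild, while other nodes keep the default.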
RolfAris added a commit to RolfAris/go-openaudio that referenced this pull request on Apr 6, 2026:
Move the listing-derived presence index from cleanup-only to all repair cycles (uploads, previews, qm_cids). Always on — no flag needed. The index replaces per-key HeadObject with a single ListObjects pagination at cycle start. Staleness between listings is covered by the existing knownPresent write-path cache (PR OpenAudio#192). On build failure, falls back to per-key HeadObject (same as before).
Summary
- Repair cycles call bucket.Attributes (HeadObject) for every locally-held CID. On S3-compatible backends, this is the dominant source of metadata API calls.
- Add an imcache LRU cache (knownPresent, 500K entries, no TTL) that remembers confirmed-present keys across cycles. Non-cleanup cycles check the cache and skip Attributes on hit.
- Cleanup cycles (every 4th) clear the cache via RemoveAll() and do full verification; they need ModTime for over-replication decisions and run blob integrity validation.
- The cache is populated after all validation passes and on replicateToMyBucket writes. Invalidated on dropFromMyBucket and cleanup validation deletes.
- haveInMyBucket and serving paths are unchanged.

Design notes
- Uses imcache with WithNoExpiration() and LRU eviction, consistent with the four existing caches on MediorumServer. No new dependencies.
- knownPresent.Remove inside the cleanup validation delete block is currently unreachable due to a pre-existing bug where err is checked instead of errVal (see Fix two bugs in repairCid: dead cleanup validation and wrong polarity #175). It becomes live when #175 merges.
- repair_known_present counter tracks cache hits per cycle. known_present_size is logged at cycle completion.
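The unreachable knownPresent.Remove mentioned in the design notes is an instance of a common Go mistake: testing an earlier error variable that is already known to be nil. The sketch below only illustrates that class of bug; `headObject`, `validateBlob`, and both `dropIfInvalid*` functions are hypothetical and are not taken from repairCid or from #175.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins: the presence check succeeds, validation fails.
func headObject(key string) error   { return nil }
func validateBlob(key string) error { return errors.New("hash mismatch") }

// dropIfInvalidBuggy checks err where it means errVal, so the delete branch
// can never run once the earlier err check has passed.
func dropIfInvalidBuggy(key string, knownPresent map[string]bool) bool {
	err := headObject(key)
	if err != nil {
		return false
	}
	errVal := validateBlob(key)
	if err != nil { // bug: err is always nil here
		fmt.Println("dropping", key, errVal)
		delete(knownPresent, key)
		return true
	}
	return false
}

// dropIfInvalidFixed checks the validation error, so the cache entry is
// removed when the blob fails validation.
func dropIfInvalidFixed(key string, knownPresent map[string]bool) bool {
	if err := headObject(key); err != nil {
		return false
	}
	if errVal := validateBlob(key); errVal != nil {
		fmt.Println("dropping", key, errVal)
		delete(knownPresent, key)
		return true
	}
	return false
}

func main() {
	cache := map[string]bool{"cid-a": true}
	fmt.Println("buggy dropped:", dropIfInvalidBuggy("cid-a", cache), "still cached:", cache["cid-a"])
	fmt.Println("fixed dropped:", dropIfInvalidFixed("cid-a", cache), "still cached:", cache["cid-a"])
}
```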