cache blessed base manifests and hot sets in memory #37
Merged
Base images (bases/*) are immutable after bless, so there is no reason to fetch them from S3 on every fork. Add bounded in-memory caches (64 entries) on ExportRouter for deserialized VolumeManifests and parsed hot set indices. Cache hits clone an Arc (~0ns) instead of an S3 round-trip (~100ms).

- Lazy population: first fork from a base fills the cache, subsequent forks hit it
- Pre-warm on startup: after discover_exports, scan bases/ in each unique S3 prefix and load manifests + hot sets concurrently (8-wide)
- Only download up to remaining cache capacity during pre-warm

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
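For illustration, the caching behavior described above might look roughly like the sketch below. This is not the code from this PR: `BoundedCache`, `get_or_fetch`, and `remaining_capacity` are hypothetical names, the value type is generic rather than the real `VolumeManifest`, and the "refuse inserts once full" policy is an assumption, since the description does not say how the 64-entry bound is enforced.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical bounded cache for immutable values keyed by base name.
/// Illustrative only; not the actual type added to ExportRouter.
pub struct BoundedCache<V> {
    capacity: usize,
    entries: Mutex<HashMap<String, Arc<V>>>,
}

impl<V> BoundedCache<V> {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, entries: Mutex::new(HashMap::new()) }
    }

    /// Cache hit: clone the Arc (a reference-count bump, no copy of V).
    pub fn get(&self, key: &str) -> Option<Arc<V>> {
        self.entries.lock().unwrap().get(key).cloned()
    }

    /// Insert unless the cache is full. Assumption: no eviction, because
    /// blessed bases are immutable and never go stale.
    pub fn insert(&self, key: &str, value: Arc<V>) -> bool {
        let mut map = self.entries.lock().unwrap();
        if map.len() >= self.capacity && !map.contains_key(key) {
            return false;
        }
        map.insert(key.to_string(), value);
        true
    }

    /// How many more entries a pre-warm pass could still load.
    pub fn remaining_capacity(&self) -> usize {
        let len = self.entries.lock().unwrap().len();
        self.capacity.saturating_sub(len)
    }

    /// Lazy population: the first caller pays for `fetch` (standing in for
    /// the S3 round-trip); later callers get the cached Arc.
    /// The lock is not held during `fetch`, so two concurrent misses may
    /// both fetch; that is harmless for immutable data.
    pub fn get_or_fetch<F>(&self, key: &str, fetch: F) -> Arc<V>
    where
        F: FnOnce() -> V,
    {
        if let Some(hit) = self.get(key) {
            return hit;
        }
        let value = Arc::new(fetch());
        self.insert(key, Arc::clone(&value));
        value
    }
}
```

In this sketch a pre-warm pass would call `remaining_capacity()` first and list at most that many bases, which matches the "only download up to remaining cache capacity" bullet in spirit.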
jaredLunde added a commit that referenced this pull request May 16, 2026
Pure mechanical move. Nine handoff-specific methods come off `ExportRouter` and onto a new `HandoffCoordinator` in `glidefs/src/handoff/coordinator.rs`:

- handoff_snapshot → HandoffCoordinator::snapshot
- freeze_all, unfreeze_all
- set_all_caches_freeze
- take_ublk_server, recover_handoff_devices, revive_after_failed_handoff
- is_per_io_daemon_supported
- get_handler_sync (also kept ExportRouter::get_handler async variant for NBD)

The coordinator wraps `Arc<ExportRouter>` and reaches per-export state through three new `pub(crate)` accessors on the router:

- `exports_map() -> &DashMap<String, ExportState>`
- `cache_dir_path() -> &Path`
- `ublk_server_mutex() -> &Mutex<UblkServer>` (cfg-gated)

The single `cache.inner.manifest_etag.lock()` reach in `recover_handoff_devices` is replaced by a new `pub(crate)` method `WriteCache::set_manifest_etag(Option<String>)`, so the coordinator never reaches into `pub(super) inner`.

`PredecessorCutoverCtx` and `SuccessorTakeoverCtx` now carry `Arc<HandoffCoordinator>` instead of `Arc<ExportRouter>`. CRH and the trait's default `get_handler` impl follow. `run_predecessor` and `run_successor` take `Arc<HandoffCoordinator>`.

`cli/server.rs` constructs the coordinator next to the router build (both predecessor SIGHUP path and successor entry point).

`router.rs` shrinks from ~3232 lines of handoff cruft to its actual job: per-export I/O dispatch.

handoff_sequential_50_crh ✓ (this run; 575s, 50 clean handoffs, fio do_verify clean, oracle scan zero corrupt blocks)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>