feat: pluggable index cache via CacheBackend trait by wjones127 · Pull Request #6222 · lance-format/lance

wjones127 · 2026-03-18T17:02:59Z

Previously the Session's index cache was hardcoded to Moka. This adds a
CacheBackend trait so users can provide their own cache implementation,
with MokaCacheBackend as the default.

Changes:

- Split cache.rs into cache/{mod,backend,keys,moka}.rs
- Add CacheBackend trait with get, insert, get_or_insert, invalidate_prefix, and stats methods
- Rename CacheKey::type_id() to type_name() (returns &'static str)
- Accept custom CacheBackend in DatasetBuilder
- Document why get_or_insert uses Pin<Box<dyn Future>> (object safety)

Serialization and zero-IO reconstruction are in the follow-up PR #6223.

The Session's index cache was hardcoded to use Moka. This adds a CacheBackend trait so users can provide their own cache implementation (e.g. Redis-backed, disk-backed, shared across processes).

Two-layer design:

- CacheBackend: object-safe async trait with opaque byte keys. This is what plugin authors implement (get, insert, invalidate_prefix, clear, num_entries, size_bytes).
- LanceCache: typed wrapper handling key construction (prefix + type tag), type-safe get/insert, DeepSizeOf size computation, hit/miss stats, and concurrent load deduplication.

MokaCacheBackend is the default, preserving existing behavior. Custom backends are wired through Session::with_index_cache_backend() or DatasetBuilder::with_index_cache_backend().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
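To make the two-layer split concrete, here is a minimal, synchronous sketch using only the standard library. It is not the code in this PR: `ByteBackend`, `MapBackend`, and `TypedCache` are hypothetical names, the real CacheBackend is described as async, and the key layout (prefix, user key, NUL, type tag) is an assumption chosen for illustration.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Type-erased cache value, as a byte-keyed backend would store it.
type Entry = Arc<dyn Any + Send + Sync>;

/// Hypothetical synchronous stand-in for the backend layer: opaque byte keys only.
trait ByteBackend: Send + Sync {
    fn get(&self, key: &[u8]) -> Option<Entry>;
    fn insert(&self, key: Vec<u8>, value: Entry);
}

/// In-memory backend used only for this sketch.
#[derive(Default)]
struct MapBackend {
    map: Mutex<HashMap<Vec<u8>, Entry>>,
}

impl ByteBackend for MapBackend {
    fn get(&self, key: &[u8]) -> Option<Entry> {
        self.map.lock().unwrap().get(key).cloned()
    }
    fn insert(&self, key: Vec<u8>, value: Entry) {
        self.map.lock().unwrap().insert(key, value);
    }
}

/// Typed wrapper: builds keys, downcasts values, and counts hits and misses.
struct TypedCache {
    backend: Arc<dyn ByteBackend>,
    prefix: String,
    hits: AtomicU64,
    misses: AtomicU64,
}

impl TypedCache {
    fn new(backend: Arc<dyn ByteBackend>, prefix: &str) -> Self {
        Self {
            backend,
            prefix: prefix.to_string(),
            hits: AtomicU64::new(0),
            misses: AtomicU64::new(0),
        }
    }

    /// Assumed key layout: prefix + user key + NUL + type tag.
    fn make_key(&self, user_key: &str, type_tag: &str) -> Vec<u8> {
        let mut key = Vec::new();
        key.extend_from_slice(self.prefix.as_bytes());
        key.extend_from_slice(user_key.as_bytes());
        key.push(0);
        key.extend_from_slice(type_tag.as_bytes());
        key
    }

    fn insert<T: Any + Send + Sync>(&self, user_key: &str, type_tag: &str, value: Arc<T>) {
        self.backend.insert(self.make_key(user_key, type_tag), value);
    }

    /// A failed downcast is reported as a miss rather than a panic.
    fn get<T: Any + Send + Sync>(&self, user_key: &str, type_tag: &str) -> Option<Arc<T>> {
        let found = self
            .backend
            .get(&self.make_key(user_key, type_tag))
            .and_then(|entry| entry.downcast::<T>().ok());
        let counter = if found.is_some() { &self.hits } else { &self.misses };
        counter.fetch_add(1, Ordering::Relaxed);
        found
    }
}
```

The backend never sees a concrete value type, which is what would let a plugin author swap in a different store without touching the typed layer.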

github-actions · 2026-03-18T17:04:59Z

PR Review: feat: pluggable index cache via CacheBackend trait

P0: TOCTOU race in `get_or_insert` — dedup is broken

The check-then-register pattern in get_or_insert has a race between the two separately-locked critical sections:

// Block 1: check (drops lock at end of block)
{
    let map = self.in_flight.lock().await;
    if let Some(rx) = map.get(&cache_key) { ... }
}
// <-- gap: another task can also see an empty map here

// Block 2: register as leader (re-acquires lock)
let (tx, rx) = tokio::sync::watch::channel(None);
{
    let mut map = self.in_flight.lock().await;
    map.insert(cache_key.clone(), rx);
}

Between dropping the lock in block 1 and re-acquiring it in block 2, N tasks can all see an empty in-flight map and all decide they are the "leader." The last one to insert overwrites the earlier receivers, so:

- The loader runs N times (dedup completely defeated).
- Tasks that cloned an earlier receiver get RecvError when that sender drops without the value propagating through the map.

Fix: Merge the check and register into a single critical section:

let mut map = self.in_flight.lock().await;
if let Some(rx) = map.get(&cache_key) {
    let mut rx = rx.clone();
    drop(map);
    // ... wait for leader ...
} else {
    let (tx, rx) = tokio::sync::watch::channel(None);
    map.insert(cache_key.clone(), rx);
    drop(map);
    // ... run loader ...
}

The test_get_or_insert_dedup test passes by luck because tokio::task::yield_now() with a broadcast barrier doesn't reliably interleave into this gap. A test with a longer sleep or explicit synchronization at the gap point would expose this.
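For readers who want to see the single-critical-section idea run end to end, the following is a thread-based analog written against the standard library only. It is a sketch under stated assumptions, not the PR's implementation: it uses `std::sync::Mutex` and `Condvar` instead of tokio's async mutex and watch channel, the names `DedupCache`, `Role`, and `claim` are hypothetical, values are fixed to `u64`, and a panicking loader is not handled.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

/// Slot shared by the leader and any followers waiting on the same key.
type Slot = Arc<(Mutex<Option<u64>>, Condvar)>;

#[derive(Default)]
struct Inner {
    cache: HashMap<String, u64>,
    in_flight: HashMap<String, Slot>,
}

/// What a caller learned during its single trip through the critical section.
enum Role {
    Hit(u64),
    Follower(Slot),
    Leader(Slot),
}

#[derive(Default)]
struct DedupCache {
    inner: Mutex<Inner>,
}

impl DedupCache {
    /// Check the cache, check the in-flight map, and register as leader,
    /// all under ONE lock acquisition, so no second caller can slip in between.
    fn claim(&self, key: &str) -> Role {
        let mut inner = self.inner.lock().unwrap();
        if let Some(v) = inner.cache.get(key) {
            return Role::Hit(*v);
        }
        if let Some(slot) = inner.in_flight.get(key) {
            return Role::Follower(Arc::clone(slot));
        }
        let slot: Slot = Arc::new((Mutex::new(None), Condvar::new()));
        inner.in_flight.insert(key.to_string(), Arc::clone(&slot));
        Role::Leader(slot)
    }

    fn get_or_insert(&self, key: &str, loader: impl FnOnce() -> u64) -> u64 {
        match self.claim(key) {
            Role::Hit(v) => v,
            Role::Follower(slot) => {
                // Wait until the leader publishes a value into the shared slot.
                let (value, ready) = &*slot;
                let mut guard = value.lock().unwrap();
                while guard.is_none() {
                    guard = ready.wait(guard).unwrap();
                }
                (*guard).expect("leader stored a value")
            }
            Role::Leader(slot) => {
                let loaded = loader(); // runs outside the map lock
                {
                    // Publish to the cache and retire the in-flight entry together.
                    let mut inner = self.inner.lock().unwrap();
                    inner.cache.insert(key.to_string(), loaded);
                    inner.in_flight.remove(key);
                }
                let (value, ready) = &*slot;
                *value.lock().unwrap() = Some(loaded);
                ready.notify_all();
                loaded
            }
        }
    }
}
```

Because a caller always finds the key either in the cache or in the in-flight map once a leader exists, the loader runs once per key regardless of how the threads interleave.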

P1: `type_tag` relies on unspecified pointer identity

fn type_tag<T: 'static>() -> [u8; 8] {
    (std::any::type_name::<T>().as_ptr() as u64).to_le_bytes()
}

Rust does not guarantee that type_name::<T>().as_ptr() returns the same address across calls, nor that distinct types produce distinct addresses (deduplication of string data is left to the compiler, and the standard library documents that type_name output is not a unique identifier of a type). The old code used TypeId as a key component, which has correct identity semantics.

If two different types happen to collide (same pointer or same string literal address), the cache will return an entry of the wrong type and downcast::<T>().unwrap() will panic at runtime.

Consider using TypeId bytes instead. TypeId implements Hash and Eq, so you could hash it to get a stable discriminator, or use transmute to extract its bytes (it's currently 128 bits).
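The hashing variant of that suggestion could look like the sketch below. It keeps the `type_tag` name and `[u8; 8]` return type from the snippet above; the choice of `DefaultHasher` is an assumption made for illustration, not something this review prescribes.

```rust
use std::any::TypeId;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive an 8-byte discriminator from `TypeId` instead of a string pointer.
/// `DefaultHasher::new()` uses fixed keys, so repeated calls within one
/// process agree with each other.
fn type_tag<T: 'static>() -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    TypeId::of::<T>().hash(&mut hasher);
    hasher.finish().to_le_bytes()
}
```

One caveat worth keeping in mind: `TypeId` values are not stable across compiler versions or builds, so a tag derived this way is only suitable for an in-process cache, not for keys that are persisted or shared between processes.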

P1: `invalidate_prefix` is fire-and-forget

pub fn invalidate_prefix(&self, prefix: &str) {
    let prefix_bytes = self.make_prefix(prefix);
    let cache = self.cache.clone();
    tokio::spawn(async move {
        cache.invalidate_prefix(&prefix_bytes).await;
    });
}

This means invalidation is not guaranteed to complete before subsequent cache reads. A caller that invalidates then immediately reads could get stale data. The old code was synchronous. If this must remain sync (non-async signature), at minimum document this caveat prominently, or consider changing the signature to async fn.

Minor notes

- DeepSizeOf for LanceCache now returns 0, which is a silent behavioral regression for any code that relies on deep_size_of() for memory accounting. Consider at least delegating to approx_size_bytes().
- WeakLanceCache::get_or_insert lost all dedup behavior: concurrent loads for the same key will all run the loader independently. The old code used moka's optionally_get_with, which handled this.

Add type_name()/type_id() to CacheKey and UnsizedCacheKey traits so backends can identify the type of cached entries. Add parse_cache_key() utility for backends to extract (user_key, type_id) from opaque key bytes. CacheKey-based methods now pipe the key's type_id through to the backend. Non-CacheKey methods use type_id_of::<T>() as a sentinel. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1. Remove #[cfg(test)] convenience methods; tests now use CacheKey via a TestKey helper, eliminating the parallel method hierarchy.
2. Fix dedup race condition: re-check the cache while holding the in-flight lock so no two tasks can both become leader for the same key.
3. Use Arc::try_unwrap on the leader error path to preserve the original error type when possible.
4. Make invalidate_prefix async instead of fire-and-forget spawn.
5. Replace type_name().as_ptr() with a hash of std::any::TypeId for stable type discrimination. Defined once in type_id_of() and used by CacheKey::type_id() default.
6. Add dedup to WeakLanceCache::get_or_insert, sharing the in-flight map from the parent LanceCache.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-03-19T06:44:03Z

Codecov Report

❌ Patch coverage is 83.43465% with 109 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance-core/src/cache/mod.rs | 87.89% | 53 Missing and 5 partials ⚠️ |
| rust/lance/src/session.rs | 30.43% | 16 Missing ⚠️ |
| rust/lance-core/src/cache/moka.rs | 76.08% | 11 Missing ⚠️ |
| rust/lance/src/dataset/builder.rs | 46.66% | 8 Missing ⚠️ |
| rust/lance-core/src/cache/keys.rs | 56.25% | 7 Missing ⚠️ |
| rust/lance-core/src/cache/backend.rs | 0.00% | 6 Missing ⚠️ |
| rust/lance/src/session/index_caches.rs | 66.66% | 3 Missing ⚠️ |

Address feedback:

1. Move get_or_insert() onto CacheBackend. The method takes a pinned future (not a closure), so LanceCache can type-erase the user's non-'static loader before passing it to the backend. Default impl does simple get-then-insert; MokaCacheBackend uses moka's built-in optionally_get_with for dedup. This eliminates duplicated dedup logic and the manual watch-channel machinery.
2. Restore type_name().as_ptr() for type_id derivation on CacheKey. Remove standalone type_id_of() function. The derivation lives in one place: CacheKey::type_id()/UnsizedCacheKey::type_id().
3. Remove approx_size_bytes from CacheBackend trait and Session debug output. Only approx_num_entries remains.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
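As an illustration of why a pinned, boxed future keeps the trait usable as `dyn`, here is a self-contained sketch with a default get-then-insert `get_or_insert`. The trait name `Backend`, the `Bytes` alias, `MemoryBackend`, and every signature are assumptions for this example rather than the PR's actual API, and `block_on` is a toy busy-polling executor included only so the snippet runs without an async runtime.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;
type Bytes = Arc<Vec<u8>>;

/// Hypothetical backend trait. No generic type parameters appear on any
/// method, so `dyn Backend` is allowed.
trait Backend: Send + Sync {
    fn get<'a>(&'a self, key: &'a [u8]) -> BoxFuture<'a, Option<Bytes>>;
    fn insert<'a>(&'a self, key: &'a [u8], value: Bytes) -> BoxFuture<'a, ()>;

    /// Default: plain get-then-insert with no deduplication of concurrent
    /// loads. The loader is an already type-erased future, not a closure.
    fn get_or_insert<'a>(
        &'a self,
        key: &'a [u8],
        loader: BoxFuture<'a, Bytes>,
    ) -> BoxFuture<'a, Bytes> {
        Box::pin(async move {
            if let Some(hit) = self.get(key).await {
                return hit;
            }
            let value = loader.await;
            self.insert(key, Arc::clone(&value)).await;
            value
        })
    }
}

#[derive(Default)]
struct MemoryBackend {
    map: Mutex<HashMap<Vec<u8>, Bytes>>,
}

impl Backend for MemoryBackend {
    fn get<'a>(&'a self, key: &'a [u8]) -> BoxFuture<'a, Option<Bytes>> {
        Box::pin(async move {
            let map = self.map.lock().unwrap();
            map.get(key).cloned()
        })
    }
    fn insert<'a>(&'a self, key: &'a [u8], value: Bytes) -> BoxFuture<'a, ()> {
        Box::pin(async move {
            self.map.lock().unwrap().insert(key.to_vec(), value);
        })
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop until the future completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

A backend with built-in single-flight support would override `get_or_insert` instead of inheriting this default, which is the role the text above assigns to MokaCacheBackend.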

Remove all methods that bypass CacheKey from WeakLanceCache (get, insert, get_or_insert, get_unsized, insert_unsized). Remove insert_unsized/get_unsized from LanceCache. Remove type_tag helper. All cache access now goes through CacheKey/UnsizedCacheKey. Make parse_cache_key return (empty, 0) instead of panicking on short keys. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Restore approx_size_bytes on CacheBackend so DeepSizeOf on LanceCache reports actual cache memory usage (used by Session::size_bytes). Fixes test_metadata_cache_size Python test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The type_name().as_ptr() approach for type discrimination was unstable across crate boundaries due to monomorphization. Replace with an explicit fn type_id() -> &'static str that each CacheKey impl provides as a short human-readable literal (e.g. 'Vec<IndexMetadata>', 'Manifest'). Key format changes from user_key\0<8 LE bytes> to user_key\0<type_id str>. parse_cache_key() now returns (&[u8], &str).
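The new key layout can be sketched as below. Only the format (user key, NUL, type id string) and the `(&[u8], &str)` return type come from the commit message; `make_cache_key` is a hypothetical helper, and splitting on the last NUL and returning an empty pair for malformed input are assumptions made for this example.

```rust
/// Build a key as `user_key`, a NUL separator, then the type id string.
fn make_cache_key(user_key: &[u8], type_id: &str) -> Vec<u8> {
    let mut key = Vec::with_capacity(user_key.len() + 1 + type_id.len());
    key.extend_from_slice(user_key);
    key.push(0);
    key.extend_from_slice(type_id.as_bytes());
    key
}

/// Split a key back into `(user_key, type_id)`.
/// Malformed input yields an empty pair instead of panicking (assumed behavior).
fn parse_cache_key(key: &[u8]) -> (&[u8], &str) {
    // Split on the LAST NUL so a user key may itself contain NUL bytes;
    // this assumes the type id string never does.
    let Some(sep) = key.iter().rposition(|&b| b == 0) else {
        return (&key[..0], "");
    };
    match std::str::from_utf8(&key[sep + 1..]) {
        Ok(type_id) => (&key[..sep], type_id),
        Err(_) => (&key[..0], ""),
    }
}
```

Unlike a pointer or a `TypeId` hash, a literal such as 'Manifest' means the same thing in every crate and every process, which matters once keys are handed to an external backend.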

Add IvfIndexState struct and serialization to lance-index, enabling IVFIndex to export its reconstructable state (IVF model, quantizer metadata) without non-serializable handles. Add reconstruct_vector_index() which rebuilds an IVFIndex from cached state by re-opening FileReaders (cheap with warm metadata cache) instead of re-fetching global buffers from object storage. Also adds IvfQuantizationStorage::from_cached() to skip global buffer reads during reconstruction, and Session::file_metadata_cache() to expose the metadata cache for the reconstruction context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Reconstructed VectorIndex instances need the original cache key prefix to share partition entries with the two-tier cache backend. Also adds LanceCache::with_backend_and_prefix() and WeakLanceCache::prefix(). Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Previously, the disk cache codec reconstructed `Arc<dyn VectorIndex>` from `IvfIndexState` during deserialization, requiring a `ReconstructionContext` with deferred OnceLock initialization and sync-to-async runtime juggling. The ObjectStore in that context also lacked proper credential wrappers.

Now the cache stores `Arc<dyn VectorIndexData>` (serializable state) instead of `Arc<dyn VectorIndex>` (live index). Lance's `open_vector_index()` detects cached state and reconstructs using its own ObjectStore (with credentials) and metadata cache. This eliminates the ReconstructionContext, OnceLock pattern, and runtime juggling.

Changes:

- Add VectorIndexData trait (lance-index) with write_to/as_any/tag
- Add DeepSizeOf impl for IvfIndexState
- Change VectorIndexCacheKey::ValueType to dyn VectorIndexData
- Add reconstruction-from-cache path in open_vector_index()
- Fix panicking downcast in LanceCache::get_with_id (return None)
- Add Debug/Clone/Copy derives to SubIndexType

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Split cache.rs into submodules (backend, keys, moka, mod)
- Rename CacheKey::type_id() to type_name() across all implementors
- Improve CacheBackend and get_or_insert docs
- Add Spillable trait with writer-based serialize for partition_serde
- Cache file metadata and file sizes to enable zero-IO reconstruction
- Add test_reconstruct_from_cache_zero_io test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move VectorIndexData, IvfIndexState, partition_serde, cacheable_state, and zero-IO reconstruction out of this PR to keep it focused on the pluggable cache backend. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The method was renamed in lance-format#6209 but the test call site in v2.rs was not updated during the merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The file_sizes parameter on IVFIndex::try_new and the file_size_map() usage in open_vector_index were from merged PR lance-format#5497, not the serialization PR. Restoring them avoids unnecessary HEAD requests. Also restores vector index cache check in open_generic_index. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions bot added the enhancement label (New feature or request) on Mar 18, 2026

wjones127 and others added 2 commits March 18, 2026 20:04

wjones127 force-pushed the feat/pluggable-index-cache branch from 56a3273 to 00867ad on March 19, 2026 16:56

wjones127 and others added 3 commits March 19, 2026 10:05

cleanup

74fdc2c

cleanup

1ba4ac3

wjones127 force-pushed the feat/pluggable-index-cache branch from 369afe2 to 1ba4ac3 on March 19, 2026 18:29

wjones127 marked this pull request as ready for review March 19, 2026 19:05

wjones127 and others added 6 commits March 19, 2026 17:07

feat: add partition serde for all quantizer types (PR lance-format#6223)

2e7602e

chore: make index_caches module public for downstream codec registration

f1ed934

merge

9c9f7b8

wjones127 marked this pull request as draft March 20, 2026 18:58

wjones127 and others added 3 commits March 20, 2026 12:32

est

33429e9

fix

4fdbe51

wjones127 commented Mar 20, 2026

wjones127 and others added 2 commits March 20, 2026 16:06

wjones127 mentioned this pull request Mar 20, 2026

feat: VectorIndex serialization and zero-IO reconstruction #6223

Open

wjones127 and others added 3 commits March 20, 2026 16:59

Merge remote-tracking branch 'upstream/main' into feat/pluggable-inde…

bc02d58

…x-cache

fix: update commit_existing_index to commit_existing_index_segments

c85f9b5

wjones127 wants to merge 21 commits into lance-format:main from wjones127:feat/pluggable-index-cache
