jamesgober/iqdb-cache

iqdb-cache
iQDB IN-PROCESS CACHE

iqdb-cache is an in-process caching layer for search results. For large indexes that do not fit in RAM, a well-tuned cache turns a repeated query into a memory read instead of a fresh scan.

It wraps any IndexCore as a CachedIndex — itself a drop-in IndexCore — and is purely an opt-in optimization: a database is correct with no cache at all, and wrapping an index never changes what a search returns, only how fast a repeat returns.



MSRV is 1.87+ (Rust 2024 edition). LRU result cache. Mutation-exact invalidation. Optional TTL. Off by default.

Status: stable (1.0). The public API is committed under SemVer for the 1.x series — no breaking changes until 2.0. See CHANGELOG.md.


What it does

  • Transparent wrapper — CachedIndex<I> implements IndexCore, so it slots in anywhere the wrapped index does, including behind Box<dyn IndexCore>
  • Result memoization — identical searches (same query, same SearchParams) are served from an in-memory cache instead of re-running
  • Mutation-exact invalidation — every insert / insert_batch / delete clears the cache, so a search never observes a stale result
  • Optional TTL — give entries an expiry to bound staleness from changes the wrapper can't see; off by default, and verified deterministically with a mock clock
  • Four eviction policies — LRU (default), LFU, FIFO, and ARC, selectable through one config knob; all arena-backed with amortized O(1) operations and bounded to the configured capacity
  • Off by default — size the cache, or disable it with capacity 0 for a pure passthrough to A/B the cache's effect without touching call sites
  • Hit/miss/eviction stats — CacheStats exposes lifetime hit, miss, and eviction counters plus a hit_rate for tuning
  • Zero unsafe — the whole crate is #![forbid(unsafe_code)]
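
To make "same query, same SearchParams" concrete, here is a minimal sketch of one way a memoization key could be built. It is not the crate's implementation: `Params`, `QueryKey`, and `memoized` are hypothetical names, and keying on the query's exact bit pattern is an assumption (`f32` implements neither `Eq` nor `Hash`, so some such conversion is needed).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real `SearchParams`: just `k` and a metric tag.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Params {
    pub k: usize,
    pub metric: u8,
}

/// Hypothetical cache key: the query's exact bit pattern plus the parameters.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
pub struct QueryKey {
    bits: Vec<u32>,
    params: Params,
}

impl QueryKey {
    pub fn new(query: &[f32], params: Params) -> Self {
        // `to_bits` gives a hashable, exactly comparable view of each float.
        let bits = query.iter().map(|x| x.to_bits()).collect();
        Self { bits, params }
    }
}

/// Returns the result for `key` and whether it was served from the cache.
/// `compute` runs only on a miss.
pub fn memoized<F>(
    cache: &mut HashMap<QueryKey, Vec<u64>>,
    key: QueryKey,
    compute: F,
) -> (Vec<u64>, bool)
where
    F: FnOnce() -> Vec<u64>,
{
    if let Some(hit) = cache.get(&key) {
        return (hit.clone(), true);
    }
    let result = compute();
    cache.insert(key, result.clone());
    (result, false)
}
```

Bitwise keys are conservative: `0.0` and `-0.0` hash differently, which can cost an extra miss but cannot return a result for the wrong query.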

Installation

[dependencies]
iqdb-cache = "1.0"

Quick start

Wrap any index and let repeated searches come from memory:

use iqdb_cache::CachedIndex;
use iqdb_index::IndexCore;
use iqdb_types::{DistanceMetric, SearchParams};

// `stub_index()` stands in for a real `iqdb-flat` / `iqdb-hnsw` index.
let cached = CachedIndex::new(iqdb_cache::doc_stub::stub_index());
let params = SearchParams::new(3, DistanceMetric::Cosine);

let cold = cached.search(&[1.0, 0.0, 0.0], &params).expect("search");
let warm = cached.search(&[1.0, 0.0, 0.0], &params).expect("search"); // served from cache
assert_eq!(cold, warm);

let stats = cached.cache_stats();
assert_eq!(stats.hits, 1);
assert_eq!(stats.misses, 1);
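
For tuning, the counters above reduce to a single ratio. The sketch below shows the usual definition, hits / (hits + misses). `Stats` is a hypothetical stand-in for `CacheStats`, and returning 0.0 before any lookup is an assumption rather than documented behaviour.

```rust
/// Hypothetical mirror of the counters `CacheStats` is described as exposing.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Stats {
    pub hits: u64,
    pub misses: u64,
    pub evictions: u64,
}

impl Stats {
    /// Fraction of lookups served from the cache, in [0.0, 1.0].
    /// Assumption: defined as 0.0 when no lookup has happened yet.
    pub fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 {
            0.0
        } else {
            self.hits as f64 / total as f64
        }
    }
}
```

With the quick-start counters (one miss, then one hit) this definition gives 0.5.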

Size the cache, or disable it entirely:

use iqdb_cache::CachedIndex;

// Hold the 4096 most-recent distinct searches.
let sized = CachedIndex::with_capacity(iqdb_cache::doc_stub::stub_index(), 4096);
assert_eq!(sized.capacity(), 4096);

// Capacity 0 is a pure passthrough — useful for measuring the cache's effect.
let bypass = CachedIndex::with_capacity(iqdb_cache::doc_stub::stub_index(), 0);
assert!(!bypass.is_enabled());

A write invalidates the cache, so the next search reflects it — never a stale result:

use std::sync::Arc;

use iqdb_cache::CachedIndex;
use iqdb_index::IndexCore;
use iqdb_types::{DistanceMetric, SearchParams, VectorId};

let mut cached = CachedIndex::new(iqdb_cache::doc_stub::stub_index());
let params = SearchParams::new(10, DistanceMetric::Cosine);

let before = cached.search(&[0.0, 0.0, 0.0], &params).expect("search");
cached
    .insert(VectorId::from(42u64), Arc::from(&[0.0, 0.0, 0.0][..]), None)
    .expect("insert");
let after = cached.search(&[0.0, 0.0, 0.0], &params).expect("search");

// The new vector is visible immediately; the cached result was discarded.
assert_eq!(after.len(), before.len() + 1);
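
The invalidation rule itself is small enough to sketch with the standard library alone. Everything here is hypothetical (`ToyIndex`, `Cached`, a cache keyed only by `k`), and it uses `&mut self` for searches where the real wrapper searches through a shared reference. It only illustrates the "every write clears the cache" behaviour described above.

```rust
use std::collections::HashMap;

/// Hypothetical toy index: "search" returns the first `k` ids and counts scans.
#[derive(Default)]
pub struct ToyIndex {
    ids: Vec<u64>,
    pub scans: usize,
}

impl ToyIndex {
    pub fn insert(&mut self, id: u64) {
        self.ids.push(id);
    }

    pub fn search(&mut self, k: usize) -> Vec<u64> {
        self.scans += 1;
        self.ids.iter().copied().take(k).collect()
    }
}

/// Hypothetical wrapper: memoizes searches, clears everything on a write.
pub struct Cached {
    inner: ToyIndex,
    cache: HashMap<usize, Vec<u64>>,
}

impl Cached {
    pub fn new(inner: ToyIndex) -> Self {
        Self { inner, cache: HashMap::new() }
    }

    pub fn search(&mut self, k: usize) -> Vec<u64> {
        if let Some(hit) = self.cache.get(&k) {
            return hit.clone();
        }
        let out = self.inner.search(k);
        self.cache.insert(k, out.clone());
        out
    }

    pub fn insert(&mut self, id: u64) {
        self.inner.insert(id);
        // Any cached result may now be stale, so drop them all.
        self.cache.clear();
    }

    pub fn scans(&self) -> usize {
        self.inner.scans
    }
}
```

Clearing the whole cache is coarse but makes the correctness argument trivial: no result computed before a write can be returned after it.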

Give entries a time-to-live to bound staleness from changes made behind the wrapper's back — through a CacheConfig (the Tier-2 path):

use std::time::Duration;

use iqdb_cache::{CacheConfig, CachedIndex};

let config = CacheConfig::new()
    .capacity(4096)
    .ttl(Duration::from_secs(300)); // results reused within 5 min are hits

let cached = CachedIndex::with_config(iqdb_cache::doc_stub::stub_index(), config);
assert_eq!(cached.ttl(), Some(Duration::from_secs(300)));
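
The reason a mock clock makes TTL testable can be shown with a sketch in which the caller supplies the current time. `TtlCache` is hypothetical and unrelated to clock-lib's API. Treating an entry as expired once its age reaches the TTL is an assumption about the boundary, not documented behaviour.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical TTL cache. Time is passed in as "elapsed since start",
/// which is what lets a test drive it deterministically.
pub struct TtlCache {
    ttl: Option<Duration>,
    entries: HashMap<u64, (Vec<u64>, Duration)>, // value + insertion time
}

impl TtlCache {
    pub fn new(ttl: Option<Duration>) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    pub fn put(&mut self, key: u64, value: Vec<u64>, now: Duration) {
        self.entries.insert(key, (value, now));
    }

    /// Returns the value if present and not expired; drops expired entries.
    pub fn get(&mut self, key: u64, now: Duration) -> Option<Vec<u64>> {
        let expired = match (self.entries.get(&key), self.ttl) {
            (None, _) => return None,
            // Assumption: an entry expires once its age reaches the TTL.
            (Some((_, at)), Some(ttl)) => now.saturating_sub(*at) >= ttl,
            (Some(_), None) => false, // no TTL configured: never expires
        };
        if expired {
            self.entries.remove(&key);
            return None;
        }
        self.entries.get(&key).map(|(v, _)| v.clone())
    }
}
```

With a 300-second TTL, a lookup at 299 s is a hit and a lookup at 300 s is a miss, with no sleeping in the test.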

Choose an eviction policy to match the workload — LRU (default), LFU, FIFO, or ARC:

use iqdb_cache::{CacheConfig, CachedIndex, EvictionPolicy};

// LFU favours a stable hot-set where a few queries dominate.
let cached = CachedIndex::with_config(
    iqdb_cache::doc_stub::stub_index(),
    CacheConfig::new().capacity(4096).policy(EvictionPolicy::Lfu),
);
assert_eq!(cached.policy(), EvictionPolicy::Lfu);
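
How the policies differ is easiest to see on a tiny trace. The sketch below contrasts only LRU and FIFO, using a hypothetical `TinyCache` built on a `VecDeque` with O(n) lookups. It is not the arena-backed O(1) structure the crate describes, and LFU and ARC are omitted for brevity.

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Policy {
    Lru,
    Fifo,
}

/// Hypothetical key-only cache; the front of `order` is the next victim.
pub struct TinyCache {
    policy: Policy,
    capacity: usize,
    order: VecDeque<u32>,
}

impl TinyCache {
    pub fn new(policy: Policy, capacity: usize) -> Self {
        Self { policy, capacity, order: VecDeque::new() }
    }

    /// Records an access to `key`; returns true on a hit.
    pub fn access(&mut self, key: u32) -> bool {
        if self.capacity == 0 {
            return false; // capacity 0: pure passthrough, nothing is stored
        }
        if let Some(pos) = self.order.iter().position(|&k| k == key) {
            if self.policy == Policy::Lru {
                // LRU refreshes recency on a hit; FIFO keeps insertion order.
                self.order.remove(pos);
                self.order.push_back(key);
            }
            return true;
        }
        if self.order.len() == self.capacity {
            self.order.pop_front(); // evict the victim
        }
        self.order.push_back(key);
        false
    }

    pub fn contains(&self, key: u32) -> bool {
        self.order.contains(&key)
    }
}
```

On the trace 1, 2, 1, 3 with capacity 2, LRU evicts key 2 (key 1 was just reused) while FIFO evicts key 1 (it was inserted first).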

Errors

CachedIndex introduces no errors of its own: every fallible call forwards the wrapped index's iqdb_types::Result unchanged. A search that errors is not cached, so a later identical search re-runs against the index.
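
The "errors are not cached" rule falls out naturally if the result is stored only after the fallible call succeeds. A minimal sketch, with a hypothetical `cached_call` helper and `String` standing in for the real error type:

```rust
use std::collections::HashMap;

/// Hypothetical helper: serve `key` from `cache`, else run `run` and store
/// the result only if it succeeded.
pub fn cached_call<F>(
    cache: &mut HashMap<u64, Vec<u64>>,
    key: u64,
    run: F,
) -> Result<Vec<u64>, String>
where
    F: FnOnce() -> Result<Vec<u64>, String>,
{
    if let Some(hit) = cache.get(&key) {
        return Ok(hit.clone());
    }
    // `?` forwards the error unchanged and returns before anything is stored.
    let out = run()?;
    cache.insert(key, out.clone());
    Ok(out)
}
```

A failed call leaves the cache untouched, so the next identical call runs again instead of replaying the failure.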


Examples

Runnable examples live in examples/:

cargo run --example quickstart   # wrap an index; first search misses, the repeat hits
cargo run --example policies     # the four eviction policies side by side
cargo run --example tuning       # capacity + TTL + policy via CacheConfig, and invalidation

Status

v1.0.0 (stable). This release ships the CachedIndex wrapper, mutation-exact invalidation, an optional per-entry TTL (via clock-lib, tested deterministically with a mock clock), and four eviction policies (LRU, LFU, FIFO, ARC) behind one config knob. The public API is committed under SemVer for the 1.x series (no breaking changes until 2.0; the frozen surface is recorded in the ROADMAP).

Verification:

  • Every core invariant is property-tested against a brute-force reference index under every policy (the cache is transparent and bounded; a write is never stale)
  • The shared-cache path is loom-model-checked across thread interleavings
  • The cache is validated against a realistic consumer workload (hot-set queries + a read/write mix)
  • cargo audit and cargo deny pass
  • It is verified on Windows + Linux across stable and the 1.87 MSRV

The hit path is benchmarked: on the reference machine a 10k-vector / dim-64 search costs ~234 µs uncached versus ~250 ns from cache with the fastest policy (FIFO), a ~940× speedup. Per policy, a hit costs ~250 ns (FIFO), ~278 ns (LRU), ~387 ns (ARC), or ~1.17 µs (LFU); a TTL adds ~29 ns. The full surface is documented in docs/API.md.
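
The per-hit speedup is not the end-to-end speedup, because misses still pay for a full search. As a back-of-the-envelope sketch (plain arithmetic on the figures quoted above, not a measurement), mean latency can be modelled as a weighted average. The model assumes a miss costs about the same as an uncached search and ignores lookup and insert overhead. The function and constant names are hypothetical.

```rust
/// Figures quoted in this README for the reference machine, in nanoseconds.
pub const UNCACHED_NS: f64 = 234_000.0;
pub const LRU_HIT_NS: f64 = 278.0;

/// Mean latency for a given hit rate in [0.0, 1.0].
/// Assumption: a miss costs `miss_ns`, i.e. cache overhead on a miss is ignored.
pub fn expected_latency_ns(hit_rate: f64, hit_ns: f64, miss_ns: f64) -> f64 {
    hit_rate * hit_ns + (1.0 - hit_rate) * miss_ns
}
```

Under this model a 90% hit rate with LRU gives about 23,650 ns per search, roughly a 10× improvement over the uncached 234 µs. The remaining misses dominate, which is why hit_rate is the number to tune.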



Where It Fits

iqdb-cache sits above the index family and below the database. It builds on iqdb-types and iqdb-index, and is consumed by iqdb:

  • iqdb-types — core types (VectorId, Hit, SearchParams, DistanceMetric, Filter)
  • iqdb-index — the IndexCore trait it wraps
  • iqdb — exposes caching via the database builder

It is unblocked today: its first-party dependencies (iqdb-types, iqdb-index, and clock-lib for TTL) are all stable at 1.0.


Standards

Built to the iQDB Rust standard. See REPS.md (Rust Efficiency & Performance Standards) and dev/DIRECTIVES.md for the engineering law and the definition of done. Before a PR: cargo fmt --all, cargo clippy --all-targets --all-features -- -D warnings, and cargo test --all-features must be clean.


License

Licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT license (LICENSE-MIT)

at your option.

COPYRIGHT © 2026 JAMES GOBER.
