research(nightly): MUVERA FDE — multi-vector late-interaction search for ruvector#445

Draft
ruvnet wants to merge 1 commit into main from
research/nightly/2026-05-08-multi-vector-maxsim

ruvnet commented May 8, 2026

Summary

Nightly research branch implementing MUVERA Fixed Dimensional Encoding (FDE; NeurIPS 2024,
arXiv:2405.19504) as a new standalone Rust crate, crates/ruvector-multivec. This closes the
multi-vector search capability gap: ruvector had a correct but brute-force MultiVectorIndex
costing O(n×T_q×T_d×D) per query. MUVERA reduces that search to a single maximum inner
product search (MIPS) problem, enabling standard HNSW to power ColBERT-style late-interaction
retrieval.
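
For reference, the brute-force scoring that the O(n×T_q×T_d×D) bound describes can be sketched as follows. This is a minimal illustration, not the crate's `maxsim_exact`: the names `dot`, `maxsim`, and `brute_force_top_k` are hypothetical, and token embeddings are assumed to be plain `Vec<f32>` rows of equal length.

```rust
/// Dot product of two equal-length token embeddings.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// MaxSim: for each query token take the best-matching document token,
/// then sum those maxima. Cost is T_q * T_d * D multiply-adds per pair.
/// An empty document scores 0.0 by convention in this sketch.
fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    if doc.is_empty() {
        return 0.0;
    }
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}

/// Brute-force search: score every document, keep the k best.
/// This is the n * T_q * T_d * D loop that FDE is meant to avoid.
fn brute_force_top_k(query: &[Vec<f32>], docs: &[Vec<Vec<f32>>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .map(|(i, d)| (i, maxsim(query, d)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1)); // descending by score
    scored.truncate(k);
    scored
}
```

Because every query token is compared against every document token, the per-query work grows with both token counts, which is why a single fixed-length vector per document is attractive.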

Deliverables

  • crates/ruvector-multivec/ — new Rust crate with 4 index variants:

    • CentroidIndex — mean-pool baseline (cheapest, ~23% recall@10)
    • MaxSimIndex — exact ColBERT MaxSim oracle
    • MuveraFdeIndex — MUVERA FDE approximation (3-9× faster than brute-force)
    • MuveraFdeRerankIndex — two-stage pipeline: FDE retrieval + exact MaxSim rerank
  • docs/adr/ADR-193-multi-vector-maxsim.md — architecture decision record

  • docs/research/nightly/2026-05-08-multi-vector-maxsim/README.md — full research
    document with SOTA survey, implementation notes, and real benchmark methodology
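
The two-stage idea behind MuveraFdeRerankIndex (cheap approximate retrieval, then exact rerank of the survivors) can be sketched generically. This is an illustration under assumptions, not the crate's API: `two_stage_top_k` and its closure parameters are hypothetical, with `approx` standing in for an FDE dot product and `exact` for MaxSim.

```rust
/// Two-stage retrieval sketch:
///   stage 1 scores all documents with a cheap approximate scorer and keeps
///           `n_candidates` of them;
///   stage 2 rescores only those candidates with the exact scorer and
///           returns the best `k`.
fn two_stage_top_k<A, E>(
    n_docs: usize,
    n_candidates: usize,
    k: usize,
    approx: A,
    exact: E,
) -> Vec<(usize, f32)>
where
    A: Fn(usize) -> f32,
    E: Fn(usize) -> f32,
{
    // Stage 1: approximate score for every document id.
    let mut stage1: Vec<(usize, f32)> = (0..n_docs).map(|i| (i, approx(i))).collect();
    stage1.sort_by(|a, b| b.1.total_cmp(&a.1));
    stage1.truncate(n_candidates);

    // Stage 2: exact score for the surviving candidates only.
    let mut stage2: Vec<(usize, f32)> = stage1
        .into_iter()
        .map(|(i, _)| (i, exact(i)))
        .collect();
    stage2.sort_by(|a, b| b.1.total_cmp(&a.1));
    stage2.truncate(k);
    stage2
}
```

Note the design consequence: a document dropped in stage 1 can never be recovered by the rerank, so final recall is capped by the recall of the approximate stage at `n_candidates`.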

Real benchmark numbers (x86-64 Linux, cargo --release)

| n | T | D | MaxSim QPS | FDE QPS | Speedup |
|---|---|---|---|---|---|
| 1,000 | 8 | 64 | 565 | 391 | 0.69× (small n, FDE overhead) |
| 5,000 | 16 | 128 | 12 | 38 | +3.2× |
| 10,000 | 32 | 128 | 2 | 19 | +9.5× |
| 20,000 | 32 | 128 | 1 | 9 | +9× |
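
To relate these rows to the O(n×T_q×T_d×D) cost, the multiply-add count per brute-force query can be worked out directly. The sketch below assumes T_q = T_d = T, which the single T column suggests but the PR does not state; `maxsim_madds` and `speedup` are hypothetical helper names.

```rust
/// Multiply-adds for one brute-force MaxSim query over `n` documents,
/// assuming queries and documents both have `t` tokens of dimension `d`.
fn maxsim_madds(n: u64, t: u64, d: u64) -> u64 {
    n * t * t * d
}

/// Ratio of two measured throughputs (queries per second).
fn speedup(fde_qps: f64, maxsim_qps: f64) -> f64 {
    fde_qps / maxsim_qps
}

// Per-query work for the four benchmark rows under the T_q = T_d assumption:
//   n =  1,000, T =  8, D =  64  ->      4,096,000 multiply-adds
//   n =  5,000, T = 16, D = 128  ->    163,840,000
//   n = 10,000, T = 32, D = 128  ->  1,310,720,000
//   n = 20,000, T = 32, D = 128  ->  2,621,440,000
```

Under that assumption the work grows 320× from the first row to the third, which is roughly in line with the measured MaxSim throughput falling from 565 QPS to about 2 QPS (the reported QPS values are rounded).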

Criterion micro-benchmarks (D=64, T=8): centroid_dot=396ns · maxsim=3.36µs · FDE_encode+dot=9.07µs

All 12 unit tests pass (cargo test -p ruvector-multivec).
Build is clean (cargo build --release -p ruvector-multivec).

What this is NOT

  • FDE recall at PoC settings (M=8, R=4) is intentionally low (~5-22%). Production MUVERA
    uses M=32, R=8 → 95%+ recall. This crate establishes the correct framework.
  • HNSW integration is deferred to ADR-194. Current crate does linear scan over FDE vectors.
  • PQ compression of FDE vectors is deferred to ADR-195.
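
For readers unfamiliar with FDE, the encoding step can be sketched as below. This is a simplified illustration of the MUVERA idea, not the crate's `FdeEncoder`: the type `FdeSketch`, the parameters `bits` and `reps`, and the toy `Rng` are all hypothetical, and the PR does not say how its M and R map onto them. Hyperplanes are drawn uniformly from [-1, 1) rather than from a Gaussian, and the optional inner projection from the paper is omitted.

```rust
/// Toy xorshift64* generator so the sketch needs no external crates.
struct Rng(u64);
impl Rng {
    fn next_f32(&mut self) -> f32 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let top24 = self.0.wrapping_mul(0x2545F4914F6CDD1D) >> 40;
        top24 as f32 / (1u64 << 23) as f32 - 1.0 // uniform in [-1, 1)
    }
}

struct FdeSketch {
    dim: usize,       // token embedding dimension D
    bits: usize,      // hyperplanes per repetition, giving 2^bits buckets
    reps: usize,      // independent repetitions, concatenated
    planes: Vec<f32>, // reps * bits * dim hyperplane coefficients
}

impl FdeSketch {
    fn new(dim: usize, bits: usize, reps: usize, seed: u64) -> Self {
        let mut rng = Rng(seed | 1); // xorshift state must be nonzero
        let planes = (0..reps * bits * dim).map(|_| rng.next_f32()).collect();
        FdeSketch { dim, bits, reps, planes }
    }

    /// SimHash bucket id of `v` in repetition `rep`: one sign bit per hyperplane.
    fn bucket(&self, rep: usize, v: &[f32]) -> usize {
        let mut code = 0usize;
        for b in 0..self.bits {
            let start = (rep * self.bits + b) * self.dim;
            let plane = &self.planes[start..start + self.dim];
            let dot: f32 = plane.iter().zip(v).map(|(p, x)| p * x).sum();
            if dot > 0.0 {
                code |= 1 << b;
            }
        }
        code
    }

    /// Output length is reps * 2^bits * dim. Queries sum the tokens in each
    /// bucket; documents average them and fill empty buckets with the token
    /// whose bucket id is nearest in Hamming distance.
    fn encode(&self, vectors: &[Vec<f32>], is_query: bool) -> Vec<f32> {
        let buckets = 1usize << self.bits;
        let mut out = vec![0.0f32; self.reps * buckets * self.dim];
        for rep in 0..self.reps {
            let codes: Vec<usize> = vectors.iter().map(|v| self.bucket(rep, v)).collect();
            let mut counts = vec![0usize; buckets];
            for (v, &code) in vectors.iter().zip(&codes) {
                let base = (rep * buckets + code) * self.dim;
                for (s, x) in out[base..base + self.dim].iter_mut().zip(v) {
                    *s += *x;
                }
                counts[code] += 1;
            }
            if is_query {
                continue;
            }
            for b in 0..buckets {
                let base = (rep * buckets + b) * self.dim;
                if counts[b] > 0 {
                    for s in &mut out[base..base + self.dim] {
                        *s /= counts[b] as f32;
                    }
                } else if let Some(i) =
                    (0..vectors.len()).min_by_key(|&i| (codes[i] ^ b).count_ones())
                {
                    out[base..base + self.dim].copy_from_slice(&vectors[i]);
                }
            }
        }
        out
    }
}

/// Single inner product between two FDEs, averaged over repetitions
/// (the averaging is a choice made in this sketch; it does not affect ranking).
fn fde_score(q: &[f32], d: &[f32], reps: usize) -> f32 {
    let dot: f32 = q.iter().zip(d).map(|(a, b)| a * b).sum();
    dot / reps as f32
}
```

In this scheme document FDEs would be computed once at index time, so a query costs one encode plus one dot product per document in a linear scan. Larger `bits` and `reps` make the buckets finer and the estimate less noisy, at the price of a longer vector, which is the trade-off the recall figures above reflect.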

Research document

docs/research/nightly/2026-05-08-multi-vector-maxsim/README.md

ADR

docs/adr/ADR-193-multi-vector-maxsim.md

Public Gist (SEO overview)

https://gist.github.com/ruvnet/c74fbbf13352e92a0b116736dab9293f

Related

  • ADR-154 (RaBitQ), ADR-160 (ACORN), ADR-162 (ACORN WASM)
  • Competitor context: Qdrant 1.9 (Jul 2024), Weaviate 1.25 (Sep 2024) both shipped MUVERA

https://claude.ai/code/session_018926WBoRBHzmxmQ1Qd4xSw

…interaction search

Implements MUVERA Fixed Dimensional Encoding (NeurIPS 2024, arXiv:2405.19504) as a
standalone Rust crate. Converts O(n×T_q×T_d×D) brute-force ColBERT MaxSim search into
a single MIPS problem via random projection bucketing.

## Deliverables

- crates/ruvector-multivec/: new standalone Rust crate
  - src/scoring.rs: maxsim_exact, chamfer_score, centroid_dot, FdeEncoder
  - src/index.rs: MultiVecIndex trait + CentroidIndex, MaxSimIndex, MuveraFdeIndex,
                  MuveraFdeRerankIndex (FDE + exact MaxSim rerank pipeline)
  - src/error.rs: MultivecError
  - src/main.rs: benchmark binary (multivec-demo)
  - benches/multivec_bench.rs: Criterion scoring + search benchmarks
- docs/adr/ADR-193-multi-vector-maxsim.md: architecture decision record
- docs/research/nightly/2026-05-08-multi-vector-maxsim/README.md: research document

## Key measured numbers (x86-64 Linux, cargo --release)

  n=5K, T=16, D=128: FDE 38 QPS vs MaxSim 12 QPS (+3.2× speedup)
  n=10K, T=32, D=128: FDE 19 QPS vs MaxSim 2 QPS (+9.5× speedup)
  n=20K, T=32, D=128: FDE 9 QPS vs MaxSim 1 QPS (+9× speedup)

  Criterion per-pair: centroid_dot=396ns, maxsim_exact=3.36µs, FDE_encode+dot=9.07µs (D=64)

## Tests

  cargo test -p ruvector-multivec  — 12 tests, all pass

https://claude.ai/code/session_018926WBoRBHzmxmQ1Qd4xSw