Once a vector segment is sealed and SQ8 quantization is built (automatic at ≥ 1000 live vectors), every subsequent search iterates every vector in the segment. HNSW's logarithmic advantage is discarded. Per-query latency scales linearly with segment size.
Summary
VectorCollection::search has two branches per sealed segment. When seg.sq8 is None, it calls seg.index.search(query, top_k, ef) which uses the HNSW graph correctly. When seg.sq8 is Some(_) — which is the common case after a segment crosses DEFAULT_SEAL_THRESHOLD and live_count() >= 1000 — it stops using the graph entirely and performs a full linear scan of the SQ8 bytes, then reranks.
Current code
nodedb-vector/src/collection/search.rs:22-48
let results = if let Some((codec, sq8_data)) = &seg.sq8 {
// Quantized two-phase search.
let rerank_k = top_k.saturating_mul(3).max(20);
let mut candidates: Vec<(u32, f32)> = Vec::with_capacity(seg.index.len());
let dim = seg.index.dim();
for i in 0..seg.index.len() { // ← iterates EVERY vector
if seg.index.is_deleted(i as u32) {
continue;
}
let sq8_vec = &sq8_data[i * dim..(i + 1) * dim];
let d = match self.params.metric {
DistanceMetric::L2 => codec.asymmetric_l2(query, sq8_vec),
DistanceMetric::Cosine => codec.asymmetric_cosine(query, sq8_vec),
DistanceMetric::InnerProduct => codec.asymmetric_ip(query, sq8_vec),
_ => {
let dequant = codec.dequantize(sq8_vec);
distance(query, &dequant, self.params.metric)
}
};
candidates.push((i as u32, d));
}
if candidates.len() > rerank_k {
candidates.select_nth_unstable_by(rerank_k, |a, b| { ... });
candidates.truncate(rerank_k);
}
// … phase-2 FP32 rerank on candidates …
} else {
seg.index.search(query, top_k, ef) // ← only HNSW path
};
Vec::with_capacity(seg.index.len()) and for i in 0..seg.index.len() confirm the intent: full linear scan. There is no graph traversal in this branch.
SQ8 auto-build trigger: nodedb-vector/src/collection/lifecycle.rs:231 — any sealed segment with live_count() >= 1000 gets SQ8 built automatically.
Why it's broken
- HNSW's purpose is
O(log N) ANN search. This branch is O(N) per sealed segment per query.
- At 1 M vectors × 768 dim, each query reads ~768 MB of SQ8 bytes sequentially and computes 1 M distance calls before reranking. Per-query latency goes from millisecond to seconds.
- With multiple sealed segments, the cost multiplies.
- The two-phase design is correct in principle (approximate → rerank), but the candidate-generation step must use the HNSW graph (with SQ8 as the in-graph distance function), not a linear scan.
- This also wastes all the memory spent maintaining the HNSW graph for sealed segments — the graph is built, stored, and then never consulted once SQ8 is on.
Reproduction
- Build a collection; insert ≥ ~100 k vectors so a segment seals and SQ8 is constructed (see
lifecycle.rs:231).
- Run a single nearest-neighbour query.
- Profile: a flamegraph shows ~100 % of time in the
for i in 0..seg.index.len() loop at collection/search.rs:27.
- Compare against the same collection with SQ8 disabled (or small enough not to trigger seal): HNSW path runs in milliseconds.
Notes
- Found during a CPU/memory audit sweep of
nodedb-vector/src/*.
- Easy to verify with a
criterion benchmark comparing sq8=None vs sq8=Some paths on the same dataset.
Once a vector segment is sealed and SQ8 quantization is built (automatic at ≥ 1000 live vectors), every subsequent search iterates every vector in the segment. HNSW's logarithmic advantage is discarded. Per-query latency scales linearly with segment size.
Summary
VectorCollection::searchhas two branches per sealed segment. Whenseg.sq8isNone, it callsseg.index.search(query, top_k, ef)which uses the HNSW graph correctly. Whenseg.sq8isSome(_)— which is the common case after a segment crossesDEFAULT_SEAL_THRESHOLDandlive_count() >= 1000— it stops using the graph entirely and performs a full linear scan of the SQ8 bytes, then reranks.Current code
nodedb-vector/src/collection/search.rs:22-48Vec::with_capacity(seg.index.len())andfor i in 0..seg.index.len()confirm the intent: full linear scan. There is no graph traversal in this branch.SQ8 auto-build trigger:
nodedb-vector/src/collection/lifecycle.rs:231— any sealed segment withlive_count() >= 1000gets SQ8 built automatically.Why it's broken
O(log N)ANN search. This branch isO(N)per sealed segment per query.Reproduction
lifecycle.rs:231).for i in 0..seg.index.len()loop atcollection/search.rs:27.Notes
nodedb-vector/src/*.criterionbenchmark comparingsq8=Nonevssq8=Somepaths on the same dataset.