perf: avoid cold-start embedding load on BM25-only sharded indexes #652
laynepenney merged 2 commits into sprint-17
|
Review requested. This is the first Sprint 17 |
|
QA review — LGTM with one observation. Core fix is correct. Gating provider load on Observation (not blocking): After Tests cover the BM25-gate case and content hash correctness. No test for provider activation after background build — consistent with the 'next session' design, just worth documenting explicitly. Approve to merge. |
|
Addressed the non-blocking review note in I removed the dead in-memory reset path from
|
Second QA pass after Atlas's follow-up commit (7d99df0): dead state resets removed, BM25-gate lifetime behavior documented in comments. Both concerns from first review resolved. LGTM — ready to merge. Needs one more review to satisfy two-reviewer policy. |
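A minimal sketch of the gate and its "next session" lifetime, as discussed in the two QA passes. It is an illustration only: the `chunk_embeddings` table, the `Session` class, and the `"bm25"` / `"hybrid"` mode names are hypothetical, and SQLite is assumed as the store. Only the idea of checking for existing chunk embeddings before loading a provider comes from this PR.

```python
import sqlite3


def has_embeddings(conn: sqlite3.Connection) -> bool:
    """Return True if at least one embedding row exists (hypothetical schema)."""
    row = conn.execute("SELECT 1 FROM chunk_embeddings LIMIT 1").fetchone()
    return row is not None


class Session:
    """Toy search session that decides once, at construction, whether to load a provider."""

    def __init__(self, conn, load_provider):
        self.conn = conn
        # The gate is evaluated a single time. Embeddings that appear later
        # (for example from a background build) do not flip this session.
        self.provider = load_provider() if has_embeddings(conn) else None

    @property
    def mode(self) -> str:
        return "hybrid" if self.provider is not None else "bm25"


def demo_gate():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chunk_embeddings (chunk_id INTEGER PRIMARY KEY, vec BLOB)")

    loads = []  # records every (expensive) provider load

    def load_provider():
        loads.append(1)
        return object()  # stand-in for a real embedding provider

    first = Session(conn, load_provider)  # no embeddings yet: provider is skipped
    conn.execute("INSERT INTO chunk_embeddings VALUES (1, x'00')")  # build finishes
    second = Session(conn, load_provider)  # next session sees embeddings and loads
    return first.mode, second.mode, len(loads)
```

Under these assumptions `demo_gate()` returns `("bm25", "hybrid", 1)`: the first session stays on BM25 even after embeddings appear, and only the next session pays the provider load.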
laynepenney left a comment
Opus review of recall#652 (cold-start perf)
Solid fix. The core insight is correct: if the DB has no chunk embeddings yet, skip the expensive model load entirely and stay on the BM25-only path. The 10.9s to 3.45s cold-start improvement confirms the bottleneck was the embedding provider initialization.
Strengths:
- `has_embeddings()` check gates the entire provider import; no model load means no cold-start penalty
- Background build uses a fresh DB handle (`_open_background_db`) instead of sharing the main connection across threads
- Content-hash invalidation fix in sharded_db prevents stale embedding cache from blocking rebuilds
- Good test coverage for the lazy-load path
One note:
- The `_build_embeddings_background` method now opens its own provider via `get_embedding_provider()` instead of using `self._embed_provider`. This is correct for the BM25-gated case (where `_embed_provider` is None), but means the background thread pays the provider init cost. That's fine since it's off the hot path.
Premium boundary: OSS (recall core search primitives). Correct.
LGTM. Second review complete.
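The background-build pattern described in this review can be sketched roughly as follows. This is not the code from the PR: `open_background_db` and `fake_provider` are hypothetical stand-ins for `_open_background_db` and `get_embedding_provider()`, the table layout is invented, and SQLite is assumed as the store. The point it illustrates is that the worker thread opens its own connection and its own provider rather than borrowing either from the main thread.

```python
import os
import sqlite3
import tempfile
import threading


def open_background_db(path):
    """Open a fresh connection for the worker thread (hypothetical stand-in)."""
    return sqlite3.connect(path)


def fake_provider():
    """Hypothetical stand-in for an embedding provider: maps text to one float."""
    return lambda text: float(len(text))


def build_embeddings_background(path):
    # The worker never touches the main thread's connection or provider.
    conn = open_background_db(path)
    try:
        embed = fake_provider()  # init cost is paid here, off the hot path
        rows = conn.execute("SELECT id, text FROM chunks").fetchall()
        conn.executemany(
            "INSERT OR REPLACE INTO chunk_embeddings VALUES (?, ?)",
            [(chunk_id, embed(text)) for chunk_id, text in rows],
        )
        conn.commit()
    finally:
        conn.close()


def demo_background_build():
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        main = sqlite3.connect(path)
        main.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT)")
        main.execute("CREATE TABLE chunk_embeddings (chunk_id INTEGER PRIMARY KEY, vec REAL)")
        main.executemany(
            "INSERT INTO chunks VALUES (?, ?)", [(1, "gr2 apply"), (2, "cold start")]
        )
        main.commit()

        worker = threading.Thread(target=build_embeddings_background, args=(path,))
        worker.start()
        worker.join()

        # The main connection sees the rows committed by the worker.
        count = main.execute("SELECT COUNT(*) FROM chunk_embeddings").fetchone()[0]
        main.close()
        return count
    finally:
        os.remove(path)
```

Python's `sqlite3` connections refuse cross-thread use by default, which is one reason a per-thread handle is the simpler design in a sketch like this.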
|
Apollo review: LGTM Solid cold-start optimization. Key points verified:
Boundary: OSS recall infrastructure. |
Summary
`index.db`
Verification
- `PYTHONPATH=src pytest tests/recall/test_sharded_db.py tests/recall/test_lazy_embeddings.py tests/recall/test_content_hash.py -q`
- `PYTHONPATH=src python3 -m synapt.recall.cli benchmark --iterations 1 --queries "gr2 apply" --index /Users/layne/Development/synapt-codex/.synapt/recall/index`
- `10.9s` cold start / `2.06s` query to `3.45s` cold start / `149ms` query on this 60K-chunk no-chunk-embeddings index
Boundary
OSS-only recall infrastructure and local search performance. No identity/org/premium behavior.
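For a quick read of the benchmark figures reported in the verification notes, the speedup ratios can be worked out as below. The numbers are the ones quoted in this PR; the `speedup` helper is only an illustrative name.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Ratio of the old timing to the new timing, both in seconds."""
    return before_s / after_s


# Timings quoted in the PR description (single iteration, one query).
cold_start = speedup(10.9, 3.45)  # cold start: 10.9s -> 3.45s
query = speedup(2.06, 0.149)      # query: 2.06s -> 149ms

print(f"cold start: {cold_start:.1f}x faster")
print(f"query:      {query:.1f}x faster")
```

That works out to roughly 3.2x on cold start and 13.8x on the query, from a single-iteration run on one index, so the figures are indicative rather than statistically robust.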