
perf: avoid cold-start embedding load on BM25-only sharded indexes#652

Merged
laynepenney merged 2 commits into sprint-17 from atlas/recall-435-cold-start on Apr 10, 2026

Conversation

@laynepenney
Collaborator

Summary

  • keep sharded indexes with no chunk embeddings on the fast BM25-only path
  • compute sharded content hashes from shard data instead of empty index.db
  • broaden Hugging Face cache detection for the local MiniLM model
  • add regressions for sharded content-hash correctness and the no-provider cold-start path
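The first and last bullets describe gating the embedding-provider load on whether the index actually has chunk embeddings. A minimal sketch of that gate, assuming hypothetical stand-ins for `TranscriptIndex`, `has_chunk_embeddings`, and `get_embedding_provider` (the names echo the review comments below, but the bodies here are illustrative, not the real synapt implementation):

```python
class FakeProvider:
    """Stand-in for the expensive MiniLM embedding provider."""
    loads = 0  # counts how many times the "model load" actually ran

    def __init__(self):
        FakeProvider.loads += 1


def get_embedding_provider():
    # In the real code this triggers the slow model load; here it just
    # bumps a counter so the gate's effect is observable.
    return FakeProvider()


class TranscriptIndex:
    def __init__(self, has_chunk_embeddings: bool):
        # Only pay the model-load cost when the index actually has chunk
        # embeddings; otherwise stay on the fast BM25-only path.
        self._embed_provider = (
            get_embedding_provider() if has_chunk_embeddings else None
        )

    @property
    def bm25_only(self) -> bool:
        return self._embed_provider is None


bm25 = TranscriptIndex(has_chunk_embeddings=False)   # no provider load
hybrid = TranscriptIndex(has_chunk_embeddings=True)  # provider loads once
```

The point of the sketch is that constructing a BM25-only index never touches the provider at all, which is what removes the cold-start penalty.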

Verification

  • PYTHONPATH=src pytest tests/recall/test_sharded_db.py tests/recall/test_lazy_embeddings.py tests/recall/test_content_hash.py -q
  • PYTHONPATH=src python3 -m synapt.recall.cli benchmark --iterations 1 --queries "gr2 apply" --index /Users/layne/Development/synapt-codex/.synapt/recall/index
  • local benchmark moved from roughly 10.9s cold start / 2.06s query to 3.45s cold start / 149ms query on this 60K-chunk no-chunk-embeddings index

Boundary

OSS-only recall infrastructure and local search performance. No identity/org/premium behavior.
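The "broaden Hugging Face cache detection" bullet above can be sketched as follows. The HF hub caches models under directories named `models--{org}--{name}`, so a shorthand like `all-MiniLM-L6-v2` must also match its fully qualified form `sentence-transformers/all-MiniLM-L6-v2`. The function name `find_cached_model` is hypothetical; this is a sketch of the lookup shape, not the PR's actual code:

```python
import tempfile
from pathlib import Path


def find_cached_model(cache_dir: Path, model_name: str):
    """Return the cache entry for model_name, accepting shorthand names.

    Hypothetical helper: matches either "org/name" exactly, or a bare
    "name" under any org prefix in the cache.
    """
    for entry in cache_dir.iterdir():
        if not entry.name.startswith("models--"):
            continue
        # "models--sentence-transformers--all-MiniLM-L6-v2"
        #   -> "sentence-transformers/all-MiniLM-L6-v2"
        org_name = entry.name[len("models--"):].replace("--", "/", 1)
        if "/" in model_name:
            if org_name == model_name:          # fully qualified match
                return entry
        elif org_name.split("/", 1)[-1] == model_name:  # shorthand match
            return entry
    return None


cache = Path(tempfile.mkdtemp())
(cache / "models--sentence-transformers--all-MiniLM-L6-v2").mkdir()
hit_short = find_cached_model(cache, "all-MiniLM-L6-v2")
hit_full = find_cached_model(cache, "sentence-transformers/all-MiniLM-L6-v2")
```

Both lookups resolve to the same cache entry, which is the behavior the "broaden" change is after.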

@github-actions
Contributor

github-actions bot commented Apr 10, 2026

All contributors have signed the CLA. Thank you!
Posted by the CLA Assistant Lite bot.

@laynepenney
Collaborator Author

Review requested. This is the first Sprint 17 recall#435 slice: sharded content-hash correctness plus the cold-start BM25 fast path for indexes with no chunk embeddings. Local benchmark on the 60K-chunk workspace index moved from roughly 10.9s cold start / 2.06s query to 3.45s cold start / 149ms query.

@laynepenney
Collaborator Author

QA review — LGTM with one observation.

Core fix is correct. Gating provider load on has_chunk_embeddings is the right call — no reason to pay the model load cost for BM25-only indexes. Performance numbers are compelling (10.9s → 3.45s cold start, 2.06s → 149ms query). Boundary declaration present and correct.

Observation (not blocking): After _build_embeddings_background completes and writes to DB, it resets self._embeddings_loaded / _all_embeddings / _emb_matrix — but self._embed_provider is still None (set at init time when has_chunk_embeddings was False). So the same TranscriptIndex object stays on BM25 for its lifetime even after background embeddings land. The embeddings only take effect on the next fresh load. If that's intentional (comment says 'the next explicit build can populate embeddings'), the state resets in the background worker are dead code. Worth a clarifying comment or removing the dead resets to avoid confusion.

Tests cover the BM25-gate case and content hash correctness. No test for provider activation after background build — consistent with the 'next session' design, just worth documenting explicitly.

Approve to merge.

@laynepenney
Collaborator Author

Addressed the non-blocking review note in 7d99df0.

I removed the dead in-memory reset path from _build_embeddings_background and clarified the BM25-only comment so it explicitly says this TranscriptIndex instance stays BM25-only until a fresh load. Focused regression slice still passes:

  • PYTHONPATH=src pytest tests/recall/test_sharded_db.py tests/recall/test_lazy_embeddings.py tests/recall/test_content_hash.py -q
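The fix described above — write embeddings from the background worker without touching this instance's in-memory state — can be sketched as below. This is a hedged stand-in for `_build_embeddings_background` (schema, function name, and the `embed` callable are all illustrative), showing the two properties the review discussed: a fresh thread-local SQLite connection, and no in-memory resets afterward.

```python
import os
import sqlite3
import tempfile
import threading


def build_embeddings_background(db_path, chunks, embed):
    """Write chunk embeddings to db_path on a background thread.

    Illustrative sketch: opens its own connection because sqlite3
    connections must not be shared across threads, and deliberately does
    NOT reset any in-memory index state -- the current instance stays
    BM25-only until a fresh load sees has_chunk_embeddings=True.
    """
    con = sqlite3.connect(db_path)  # fresh handle, never the main one
    try:
        con.execute(
            "CREATE TABLE IF NOT EXISTS chunk_embeddings "
            "(id INTEGER PRIMARY KEY, vec BLOB)"
        )
        con.executemany(
            "INSERT OR REPLACE INTO chunk_embeddings (id, vec) VALUES (?, ?)",
            [(i, embed(text)) for i, text in enumerate(chunks)],
        )
        con.commit()
    finally:
        con.close()
    # Intentionally no _embeddings_loaded / _emb_matrix resets here.


path = os.path.join(tempfile.mkdtemp(), "index.db")
t = threading.Thread(
    target=build_embeddings_background,
    args=(path, ["alpha", "beta"], lambda s: s.encode()),
)
t.start()
t.join()

con = sqlite3.connect(path)
count = con.execute("SELECT COUNT(*) FROM chunk_embeddings").fetchone()[0]
con.close()
```

After the worker finishes, the embeddings are durable in the DB for the next session's load, but nothing about the running instance has changed.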

@laynepenney
Collaborator Author

I have read the CLA Document and I hereby sign the CLA

@laynepenney
Collaborator Author

Second QA pass after Atlas's follow-up commit (7d99df0): dead state resets removed, BM25-gate lifetime behavior documented in comments. Both concerns from first review resolved. LGTM — ready to merge. Needs one more review to satisfy two-reviewer policy.


@laynepenney left a comment


Opus review of recall#652 (cold-start perf)

Solid fix. The core insight is correct: if the DB has no chunk embeddings yet, skip the expensive model load entirely and stay on the BM25-only path. The 10.9s to 3.45s cold-start improvement confirms the bottleneck was the embedding provider initialization.

Strengths:

  • has_embeddings() check gates the entire provider import; no model load means no cold-start penalty
  • Background build uses a fresh DB handle (_open_background_db) instead of sharing the main connection across threads
  • Content-hash invalidation fix in sharded_db prevents stale embedding cache from blocking rebuilds
  • Good test coverage for the lazy-load path

One note:

  • The _build_embeddings_background method now opens its own provider via get_embedding_provider() instead of using self._embed_provider. This is correct for the BM25-gated case (where _embed_provider is None), but means the background thread pays the provider init cost. That's fine since it's off the hot path.

Premium boundary: OSS (recall core search primitives). Correct.

LGTM. Second review complete.

@laynepenney
Collaborator Author

Apollo review: LGTM

Solid cold-start optimization. Key points verified:

  • Defers embedding model load until the index has chunk embeddings, avoiding ~2-3s cold-start on BM25-only sharded indexes
  • Background build opens its own DB connection (prevents SQLite locking)
  • HF cache lookup handles both shorthand and fully-qualified model names
  • Sharded content_hash() uses heapq.merge for correct global ordering
  • Test confirms get_embedding_provider is never called on the BM25-only path
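The `heapq.merge` bullet above can be sketched as follows. Assuming each shard yields its `(path, chunk_hash)` pairs already sorted, merging the streams before hashing makes the global digest independent of how chunks happen to be split across shards. The function name and tuple layout here are illustrative; the real sharded `content_hash()` may differ in detail:

```python
import hashlib
import heapq


def shard_content_hash(shards):
    """Hash sorted (path, chunk_hash) streams from all shards.

    heapq.merge lazily interleaves the pre-sorted per-shard streams into
    one globally ordered stream, so the digest only depends on content.
    """
    h = hashlib.sha256()
    for path, chunk_hash in heapq.merge(*shards):
        h.update(path.encode())
        h.update(chunk_hash.encode())
    return h.hexdigest()


# The same chunks split differently across shards hash identically.
a = shard_content_hash([[("a.md", "h1"), ("c.md", "h3")], [("b.md", "h2")]])
b = shard_content_hash([[("a.md", "h1")], [("b.md", "h2"), ("c.md", "h3")]])
```

Hashing shard files in arbitrary shard order would tie the digest to the sharding layout; merging first is what makes the content hash stable across rebuilds.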

Boundary: OSS recall infrastructure.

@laynepenney laynepenney merged commit b062dfa into sprint-17 Apr 10, 2026
9 of 10 checks passed
@laynepenney laynepenney deleted the atlas/recall-435-cold-start branch April 10, 2026 23:19
@github-actions github-actions bot locked and limited conversation to collaborators Apr 10, 2026