Add semantic and hybrid search via local vector embeddings#277
Conversation
Proposes hybrid BM25 + vector search over the email corpus with embeddings generated via OpenAI-compatible endpoints (Ollama, LM Studio, llama.cpp-server). Vectors live in a separate ~/.msgvault/vectors.db attached at query time so hybrid RRF fusion stays a single SQLite query. Search is exposed through CLI, HTTP, and stdio MCP; the TUI is unchanged. Reranker and RAG are deferred to follow-up specs.
- Track an active index generation with a model:dimension fingerprint so search refuses to run on a mismatched corpus instead of silently comparing cross-model vectors. Model rotation is an explicit msgvault embed --full-rebuild, never an auto-invalidation. Remove the --model one-shot override from msgvault search. - Rewrite the fused SQL so account, date, label, and attachment filters push into both signal CTEs. The previous post-fusion filter starved recall on narrow filters. - Extend the existing msgvault search CLI and MCP server instead of treating them as greenfield. Gmail-style query syntax is preserved; --mode and --explain flags are additive. All 8 existing MCP tools are unchanged; search_messages gains optional mode/explain args and find_similar_messages is the only new tool. - Drop total from vector/hybrid response shape and use returned/has_more instead. total has no stable definition without a similarity threshold or full scan. fts responses still report total. - Add a Compatibility and versioning section enumerating every affected public surface and its additive delta.
- Dual-enqueue during rebuild: sync writes pending_embeddings for every non-retired generation, not just active. The building generation now stays current with ingest throughout a long rebuild, so the atomic swap can't drop messages that arrived mid-build. - Backend contract carries GenerationID on Upsert, Search, Delete, Stats, plus new CreateGeneration / ActivateGeneration / RetireGeneration methods. Removes the ambiguity when two same-dim generations coexist in vectors_vec_dN. - One explicit pagination rule per mode: fts keeps page/page_size as today; vector/hybrid accepts only page=1 with page_size capped at max_page_size_hybrid. page>1 returns HTTP 400 pagination_unsupported. CLI --offset errors for non-fts modes. - index_building now only fires when no active generation exists (first-ever build). Rebuilds with an active generation are transparent to search. - Default mode is fts everywhere at the public API. Removed default_mode from config.
- Switch the embedding worker from claim-by-delete to claim-by-mark: pending_embeddings gains claimed_at and claim_token; completion transactionally DELETEs the row and writes the embedding. Startup reclaim pass releases stale claims. The previous recovery scan was unsafe under dual-enqueue because it relied on needs_embedding as an external source of truth. - Drop the needs_embedding column from messages. pending_embeddings is the only source of truth for pending work. Initial enqueue on CreateGeneration seeds the queue from SELECT id FROM messages in the same transaction that creates the generation, so there is no window where a generation exists with an incomplete queue. - Rewrite the sample fused SQL against the actual msgvault schema: messages.sender_id -> participants, messages.source_id -> sources, EXISTS on message_labels. Filter values are resolved to IDs at the Go layer (matching internal/query) and the SQL takes JSON arrays of IDs. Adds deleted_at IS NULL exclusion. - Fix stale package-layout note: /api/v1/search/hybrid and /vector sub-endpoints were removed in the previous revision; handlers.go gains a mode query param on the existing /api/v1/search handler. MCP server location and command wiring updated accordingly.
- Disallow --account with --full-rebuild. Generations are corpus-wide by definition (the activation flips the entire active index), so a partial rebuild cannot safely replace the current generation. The CLI rejects the combination with a clear error. --account remains valid on incremental runs as a drain-prioritization filter. - Fix the deletion column in the sample SQL: existing search paths in internal/query/sqlite.go and internal/search/parser.go exclude deleted_from_source_at IS NULL, not deleted_at. The previous wording would have changed which messages are searchable. - Unify the /api/v1/stats extension: one optional vector_search sub-object containing active_generation, building_generation, and pending_embeddings_total. Both the compatibility table and the observability section now reference the same shape.
Unify on deleted_from_source_at IS NULL as the one deletion predicate used everywhere: initial generation seed, sync-side enqueue, and the fused search SQL. This matches existing msgvault search behavior in internal/query/sqlite.go and internal/search/parser.go. Also document the policy for messages that are marked deleted after embedding: not actively purged in MVP (search hides them via the same predicate; the next --full-rebuild drops them naturally). Active purging is a storage-hygiene improvement, not a correctness fix.
Decomposes the design spec into 30 TDD tasks across 11 phases: foundation (config, backend interface, sqlite-vec extension, schema); generations and storage ops (Create/Activate/Retire, Upsert, Search, Delete, Stats, FusedSearch); embedding pipeline (HTTP client, preprocessing, claim-mark-complete queue, worker, reclaim, sync-side enqueuer); hybrid RRF and search engine; CLI extensions (--mode, --explain, msgvault embed); HTTP and MCP surfaces; scheduler wiring; documentation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces internal/vector/sqlitevec, a small package that wires the sqlite-vec SQLite extension into mattn/go-sqlite3 when the sqlite_vec build tag is set. Callers invoke RegisterExtension() once at startup; it calls sqlite_vec.Auto() (a process-global sqlite3_auto_extension hook) and registers a "sqlite3_vec" driver alias for database/sql. The stub (!sqlite_vec) returns ErrNotBuilt and the default "sqlite3" driver name so the rest of the tree compiles without the extension. Parametrizes the Makefile on BUILD_TAGS so build, build-release, install, test, test-v, and bench all pick up fts5 and sqlite_vec uniformly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Define baseline SQLite schema for the vector-search co-located vectors.db (generations, embeddings, pending queue, embed runs, schema_version) plus an embed-backed Migrate() that applies it, enables WAL + foreign keys, and ensures a dim-specific vec0 table. EnsureVectorTable builds vectors_vec_dN on demand so multiple dimensions can coexist during model rollover.
…tire) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add ErrUnknownGeneration sentinel so callers can distinguish "no such generation" from generic DB errors. Wrap BeginTx and PrepareContext errors in Upsert. Expand test coverage for empty chunks, unknown generation, multi-chunk with mixed Truncated flags, and re-upsert replacement. Make existing tests capture query errors instead of swallowing them. Document ErrUnknownGeneration / ErrDimensionMismatch contract on Upsert's godoc. vec0 virtual tables reject INSERT OR REPLACE on primary-key conflicts, so Upsert now DELETEs the existing vec row before inserting to support idempotent re-embedding. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds Filter.IsEmpty() and the sqlitevec Search path: look up the generation's dimension, resolve any structured filter into a message_id IN (...) fragment via a main-DB query, then run the vec0 MATCH query and return ranked hits ordered by ascending distance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… harden test helper Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Delete removes entries from embeddings and the dimension-specific vec0 table in one transaction; unknown generation wraps ErrUnknownGeneration; empty IDs is a no-op. Stats returns EmbeddingCount and PendingCount for the given generation, or aggregated across all generations when gen == 0. StorageBytes is left for the caller to derive from file size.
Adds the hybrid BM25 + ANN CTE (spec §5.3) using a fresh SQLite connection per query that opens msgvault.db and ATTACHes vectors.db under the "vec" alias. Filters resolve to JSON ID arrays and push into both signal branches via json_each(). RRF fusion is computed in SQL via FULL OUTER JOIN. Uses FTS5's built-in `rank` column instead of `bm25(messages_fts)` because the latter errors with "unable to use function bm25 in the requested context" when the FTS table participates in a JOIN. `k` for sqlite-vec and `LIMIT` for the BM25 CTE are interpolated from trusted integers on the request. Adds `MainPath` to Options, `Path() string` on Backend, and a stub `applySubjectBoost` — the real boost logic lives in Task 11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SQLite ATTACH is per-connection, but *sql.DB is a pool. A query dispatched after ATTACH could check out a different physical connection and fail with "no such table: vec.vectors_vec_d768". Pin the pool to a single connection via SetMaxOpenConns(1) / SetMaxIdleConns(1) so every subsequent query reuses the attached connection. Add FusedSearch tests for FTS-only (VectorScore NaN), vector-only (BM25Score NaN), no-signals error, unknown generation, and dimension mismatch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SQLite's bm25() function cannot be used inside subqueries or after joins
("unable to use function bm25 in the requested context"). The FTS5
built-in fts.rank column aliases to the default bm25 score, so reading
fts.rank inside the CTE is semantically identical and correct.
Discovered while implementing Task 9 (FusedSearch); spec updated so
Task 11 (subject boost) and Task 17 (hybrid engine orchestration) don't
copy the broken pattern.
New internal/vector/embed package providing a Client.Embed method that
posts to {endpoint}/embeddings with the OpenAI-compatible JSON schema.
Supports optional Bearer auth, per-request timeout, exponential-backoff
retry on 5xx and network errors, and immediate failure on 4xx. Verifies
returned vector dimensions and that every input index is populated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous sentinel (empty want = skip) silently hid the EmptySubjectAndBody case: it legitimately expects "" so the check short-circuited. Replace with a dedicated boolean so the empty-both assertion actually runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps the pending_embeddings table with a crash-safe Claim/Complete/ Release/ReclaimStale API. Each claim stamps rows with a crypto-random 16-hex token and timestamp; Complete/Release only act on rows whose token matches, so a stale or foreign token is a silent no-op. Stale claims can be reclaimed to the available pool via a cutoff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add Worker that drains pending_embeddings for a building generation by claiming a batch, fetching subject/body from the main DB, preprocessing, calling the embedding client, and upserting chunks into the backend. On embed or upsert error, claimed rows are released so another worker can retry them. Honors context cancellation and applies defaults for BatchSize (32) and Log. Drop the sqlite_vec build tag from queue.go since Queue is pure SQL and has no sqlite-vec dependency; keeps the queue usable by worker.go, which is tag-neutral. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds StaleThreshold field to WorkerDeps (default 10 minutes) and a ReclaimStale method that delegates to Queue.ReclaimStale. Callers invoke this at startup before RunOnce so rows claimed by crashed workers are released and retried. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds Enqueuer in internal/vector/embed that inserts message IDs into pending_embeddings for every non-retired generation. Wires the optional EmbedEnqueuer interface into the Syncer at the batch-commit point in both full and incremental sync paths, plus the rarer history-label-add path. Enqueue failures are logged and treated as non-fatal; missed IDs get picked up on the next full rebuild. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements internal/vector/hybrid.Fuse: combines BM25 and vector ranked lists via Reciprocal Rank Fusion (RRF), with an optional per-hit subject-term boost for messages whose subject matches any query term. BM25Score/VectorScore are NaN when a signal is absent, matching the FusingBackend contract. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GitHub's windows-latest runner doesn't ship sqlite3.h, which sqlite-vec's CGo bindings #include unconditionally with -DSQLITE_CORE. Build the Windows CI and release artifacts with -tags "fts5" only; vector search on Windows remains available to developers who provide the header themselves (e.g. via MSYS2's mingw-w64-x86_64-sqlite3 package). Documented the limitation in docs/vector-search.md and scripts/build.ps1. Linux release now installs libsqlite3-dev so the same header is available during the containerised build; mattn/go-sqlite3 still links its bundled amalgamation at runtime, so this only affects compile-time header resolution. Updated the Nix vendorHash to reflect the current go.sum (the new sqlite-vec-go-bindings dependency shifted it).
roborev: Combined Review (
|
The previous commit dropped sqlite_vec from Windows builds as a shortcut, which meant Windows binaries would have silently lost vector search. windows-latest runners ship MSYS2, and installing mingw-w64-x86_64-sqlite3 + pointing CGO_CFLAGS at its include path is enough to satisfy sqlite-vec's #include <sqlite3.h>. Build Windows artifacts with -tags "fts5 sqlite_vec" again, restore the docs, and have scripts/build.ps1 auto-configure CGO_CFLAGS when the MSYS2 header is present.
roborev: Combined Review (
|
Windows sqlite_vec builds hit undefined-reference link errors for ArrowArrayIsReleased, ArrowSchemaRelease, and other helpers from arrow-go/v18's arrow/c/helpers.h. The header declares those as plain `inline` rather than `static inline`, so C99 semantics require one TU to emit an external definition — and nothing in the cdata package does. GNU89 inline semantics always emit an external copy, which resolves the link. Apply the flag only on Windows, where the MinGW toolchain trips on this; Linux and macOS builds continue with their existing flags.
roborev: Combined Review (
|
With only -fgnu89-inline, every translation unit that pulls in arrow-go/v18's helpers.h emits an external copy of each helper, so MinGW's ld then fails on duplicate definitions of ArrowArrayIsReleased et al. Passing --allow-multiple-definition to the linker lets it keep the first definition and drop the rest, which is the standard pairing for projects that can't patch the upstream header.
roborev: Combined Review (
|
Empty-filter Search now loops instead of stopping at a fixed 2× over- fetch: each pass either satisfies k live hits or doubles fetch up to the generation's embedded count. Two new regression tests lock this in — one where >k deletions force an expansion round, and one where the full corpus produces fewer live hits than k and Search must terminate cleanly. scripts/build.ps1 now appends the required MSYS2 include path, -fgnu89-inline, and --allow-multiple-definition flags regardless of pre-existing CGO_CFLAGS/CGO_LDFLAGS exports, so users with their own SQLite include paths or linker flags don't silently skip the MinGW workaround. docs/vector-search.md replaces the Unix-shell snippet with PowerShell syntax and lists all three flags the Windows source build needs.
roborev: Combined Review (
|
- F2: drop unused registerErr from sqlitevec/ext.go and document the forward-compatible nil return. - F6: isUniqueConstraintErr now uses errors.As against sqlite3.ErrConstraintUnique / ErrConstraintPrimaryKey instead of matching error text — robust against driver locale or formatting changes. - F8: search_messages MCP tool replaces the shared withOffset() with an inline parameter that explicitly states offset>0 is rejected for mode=vector and mode=hybrid. - F9: pure vector mode no longer reports a misleading rrf score (RRF needs two signals to fuse). vectorHitsToFused sets RRFScore=NaN, scoreBreakdown.RRF becomes a *float64 omitted when NaN, and the API/MCP docs note that rrf is absent in mode=vector.
…tivation enqueue - F12: TestFusedSearch_NullSubjectExcludedBySubjectFilter pins down the intentional drop of NULL-subject messages when a subject filter is supplied — exercises both the fused CTE path and the Search filter path so any future change has to touch them together. - F13: TestEmbedJob_Run_PostActivationEnqueueDrainsOnNextRun confirms the activation gate's eventual-consistency property: a pending row enqueued after pendingCount==0 still gets drained on the next tick via the active-generation top-up path. Added a comment in embed_job.go documenting why the gate is intentionally non-atomic.
- F5: Queue.Complete and Queue.Release replace per-id Exec loops with a single statement bound to json_each(message_ids). At default BatchSize=32 the change is small; at larger batch sizes it removes N round-trips against the vectors.db write lock. - F7: Enqueuer.EnqueueMessages uses one INSERT-OR-IGNORE … FROM json_each(?) per non-retired generation instead of N×M ExecContext calls. A 5,000-message incremental sync with two non-retired generations now issues 2 writes instead of 10,000, so sync flushes no longer starve the embed worker's Claim.
The fused CTE applied LIMIT in SQL and then did the subject-boost re-rank in Go, so any boost-eligible hit ranked at pre-boost rank Limit+1 was invisible — it never made it into the candidate set the Go code re-sorted. Over-fetch by subjectBoostOverfetchFactor (4×) when boost is active, apply the boost, re-sort, then trim back to the requested limit. TestFusedSearch_SubjectBoostOverFetchesBeyondLimit covers the case: seven non-boost-eligible subjects rank ahead of a single boost- eligible subject in pure ANN order. With Limit=3 and 5× boost, the boost-eligible hit is now promoted into the visible page; without the over-fetch the SQL LIMIT would have hidden it before re-rank ran.
…ed stale threshold - F1: ErrEmbeddingTimeout sentinel wraps context.DeadlineExceeded from the embed call; HTTP returns 503 embedding_timeout, MCP returns the matching tool error. Clients can now retry transient embedder slowness instead of treating it as a permanent failure. - F3: Document the FULL OUTER JOIN identity that makes the fused CTE's pool-size correlated subqueries safe to read on the first result row only (and zero-default safe when fused is empty). Add TestFusedSearch_EmptyFilteredSetReportsNotSaturated to lock the contract. - F14: Auto-derive Worker.StaleThreshold from EmbedTimeout × (MaxRetries + 1) × 2 with a 10-minute floor. Users who configure a long embed timeout (slow remote endpoint) no longer need to remember to bump StaleThreshold themselves; a ReclaimStale tick can no longer pull live work out from under a worker that is still inside its retry budget.
roborev: Combined Review (
|
… retries default - R1 (14845): CLI rendered NaN RRFScore as the literal string "NaN" in table mode and panicked encoding/json in JSON mode. Pure vector mode intentionally sets RRFScore=NaN to mark "no fusion happened"; the CLI now mirrors the API/MCP convention — formatOptionalScore for table output, omit the rrf_score key when NaN for JSON output. - R2 (14848): Subject boost was over-fetching 4 × Limit before reranking, which left correctness gaps for large boost factors or large KPerSignal. Replaced the heuristic with a full-pool fetch (2 × KPerSignal, the natural cap of bm25_used FULL OUTER JOIN ann_used). Adds TestFusedSearch_SubjectBoostPromotesDeepRankHit: 100x boost surfaces a hit ranked at the deepest pre-boost position into the top page. - R3 (14849 MED): Documented why chi/v5's Timeout middleware does not preempt our structured 503 embedding_timeout response — it wraps ctx with a deadline and only writes 504 in a deferred function after the inline handler returns, by which point our response is already on the wire. Made RequestTimeout configurable via ServerOptions and added TestHandleSearch_HybridEmbeddingTimeoutFiresChi with a 100ms cap + blocking embedder; locks the contract so a future chi switch to http.TimeoutHandler-style preemption fails the test instead of silently breaking the JSON error path. - R4 (14849 LOW): derivedStaleThreshold treated EmbedMaxRetries=0 as one attempt while embed.NewClient defaults zero to 3 retries — a worker built with EmbedTimeout set but EmbedMaxRetries left zero could derive a stale threshold shorter than the client's real retry budget and reclaim live work. Normalize zero to 3 and add a regression case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
roborev: Combined Review (
|
- R1 (14853 MED): derivedStaleThreshold mis-modeled embed.Client's MaxRetries semantics — the client treats it as the TOTAL number of HTTP attempts (the for-loop runs `for attempt := 1; attempt <= MaxRetries` and the docstring says "MaxRetries total attempts"), not retries-after- the-first. The worker was computing `attempts := maxRetries + 1`, over-deriving the budget by one timeout window for every caller. Aligned the math (`attempts := maxRetries`) and updated the WorkerDeps comment to call out the total-attempts semantics. Adjusted test expectations: 5m × 3 attempts × 2 = 30m (was 40m), and 10m × 3 × 2 = 60m for the zero-default case (was 80m). - R2 (14853 LOW): http.Server.WriteTimeout was hardcoded to 60s, so a caller setting RequestTimeout > 60s would still lose the race to the underlying socket cutoff and never deliver the structured error. Tied WriteTimeout to s.requestTimeout + 5s; the buffer covers chi's deferred WriteHeader and the response flush. Documented the contract on ServerOptions.RequestTimeout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
roborev: Combined Review (
|
Vector search documentation lives in the docs site now. Delete docs/vector-search.md, revert the vector additions to docs/api.md, and point the README link at https://msgvault.io/usage/vector-search/. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
roborev: Combined Review (
|
The new name follows the existing verb-first CLI pattern (build-cache, sync-full, import-emlx) and is discoverable to users who don't know what an embedding is. "embed" stays as a cobra alias so existing cronjobs and scripts keep working. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
roborev: Combined Review (
|
The HTTP API reference now lives on msgvault.io/api-server/. Keeping an in-repo copy invites drift. Point the README link at the docs site instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
roborev: Combined Review (
|
The vector search feature has never been released, so there is no shipped "embed" command for users to be compatible with. Drop the alias and the Long-text mention of it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
roborev: Combined Review (
|
Reconciles 172 ahead / 37 behind state with upstream wesm/msgvault. Strategy: accept upstream wholesale for connector code (M365, iMessage, gvoice, IMAP XOAUTH2) where upstream's implementations are more battle-tested and already cover the fork's bug fixes. Hand-resolve store/sync/search/cmd/build files to union both feature sets. Preserved from fork: - SQLCipher encryption-at-rest (passphrase, AES-GCM token encryption) - Advisory file locking (tryLock, lockFile, syscall.Flock) - AI Archive Intelligence subsystem (internal/embedding/, vec_messages, pipeline_runs, --semantic search) - Web UI (React/TypeScript SPA in web/) - Hot-path search tokenizer (dispatchToken, toLowerFast, parseSizeFast) - migrateAddContentID, InitVectorTable, content_id attachment column Adopted from upstream: - Dialect interface + loggedDB wrapper + structured logging pipeline (wesm#276 PostgreSQL dialect refactor foundation) - OpenReadOnly() for MCP read-only access - IsBusyError, SchemaStale helpers - Unified text import (wesm#238) — M365 OAuth (wesm#228), iMessage (wesm#224), Google Voice (wesm#225) — all wholesale - Search enhancements: regex, FTS5 snippets, sorting (wesm#252), domain normalization (normalizeAddr, looksLikeDomain, gTLDs) - rebuild-fts command (wesm#287), 8 bug fixes from wesm#254 - IMAP date filtering (wesm#222), greeting wait (wesm#248) - Vector subsystem (wesm#277) — coexists with fork's AI Archive Intelligence as parallel implementation; future cleanup needed Build/runtime fixes applied during merge: - Replaced mattn/go-sqlite3 imports with mutecomm/go-sqlcipher/v4 (drop-in API-compatible) to resolve duplicate symbol linker errors - Dropped sqlite_vec from default BUILD_TAGS (requires SQLite 3.38+ APIs sqlcipher v4.4.2 does not expose; re-enable when sqlcipher upgrades) - safeRowsAffected helper in db_logger.go: defer recover around RowsAffected() call (sqlcipher returns nil internal Result for multi-statement DDL) - Wired normalizeAddr into hot-path tokenizer for from:/to:/cc:/bcc: Stubbed under unreachable build tag (need follow-up decision): - cmd/msgvault/cmd/sync_gvoice.go — fork's sync API obsolete vs upstream's import-based gvoice - cmd/msgvault/cmd/sync_imessage.go — same situation Verified: go build ./... passing, go vet clean, 45/45 test packages pass with 0 failures. See MERGE_REPORT.md for file-by-file resolution notes.
Summary
Adds optional semantic and hybrid search to msgvault, using vector embeddings generated against any OpenAI-compatible endpoint (Ollama, llama.cpp, LM Studio, etc.). Keyword (FTS5) search remains the default; vector support is opt-in behind a build tag and a config block.
What's new
Search modes
mode=vector— pure semantic search; the free-text query is embedded and nearest neighbors are returned.mode=hybrid— BM25 and vector results fused via Reciprocal Rank Fusion, with an optional subject boost.mode=fts— existing keyword search, still the default.All three modes are available through:
msgvault search "..." --mode hybrid [--explain] [--json]GET /api/v1/search?q=...&mode=hybridsearch_messagestool (mode+explainarguments)Filter operators (
from:,to:,label:,has:attachment,after:,before:,account:) work uniformly across all three modes.Find similar messages
New MCP tool
find_similar_messagestakes a seedmessage_idand returns nearest-neighbor messages, excluding the seed itself. Supportsaccount,after,before,has_attachmentfilters.Embedding lifecycle
msgvault build-embeddings [--full-rebuild] [--yes]— backfill or rebuild the index. Resumable: rerun to pick up where a prior run left off.model:dimensionfingerprint. The old generation keeps serving queries until the new one is fully built, then activates atomically.msgvault serveruns the worker on a cron ([vector.embed.schedule] cron = "...") and optionally after every successful scheduled sync (run_after_sync = true).Stats and observability
GET /api/v1/statsincludes avector_searchsub-object: active generation, fingerprint, message count, pending-queue depth, building generation (if any).--explainon search surfaces per-hit scores (BM25, vector distance, RRF, subject boost).vector_not_enabled,index_stale,index_building,missing_free_text,pagination_unsupported,invalid_mode,embedding_timeout.Storage
vectors.db(sqlite-vec ANN index + queue + generations table) next tomsgvault.db. The main database is unchanged.foreign_keys=ONon every open.Enabling
make buildalready does this):[vector]block to~/.msgvault/config.toml:Full guide: msgvault.io/usage/vector-search/. HTTP response shapes: msgvault.io/api-server/.
Compatibility
-tags sqlite_vec, or configs without[vector] enabled = true, behave exactly as before.🤖 Generated with Claude Code