SLayer 0.6.3: Speed up and clean up embedding search

@ZmeiGorynych ZmeiGorynych released this 13 May 17:06
· 314 commits to main since this release
62693d3

SLayer 0.6.3

Release theme: embedding-search hot paths run faster on YAML-backed stores, plus a new datasource filter on search and several correctness fixes around memory id allocation and cascade-delete that affected the embedding-search subsystem introduced in 0.6.0.


Highlights

New: optional datasource filter on search

search(...) and all four surfaces (MCP tool, REST POST /search, CLI slayer search, SlayerClient.search) gain an optional datasource: Optional[str] = None argument. When set, all three retrieval channels pre-filter their corpora to that one datasource:

  • Entity hits (tantivy + embedding-cosine channels) include only docs rooted at the requested datasource: exact name match or strict dotted-path descendant (<ds>.<model>, <ds>.<model>.<leaf>). Character-prefix matches do not qualify, so datasource="prod" excludes a sibling datasource named prod_v2.
  • Memory hits include any memory whose entities list contains at least one entity rooted at the requested datasource. Memories spanning multiple datasources surface from each.
  • BM25 / IDF statistics in channels 1 and 2 and the cosine matrix in channel 3 reflect only the filtered subset (pre-filter, not post-filter).

Unknown datasource names raise ValueError (HTTP 400 on REST). Validation runs before any corpus walk so typos surface fast.

This builds on the canonical-id namespace rule shipped in this same release: dotted datasource names are rejected, so the "rooted at" prefix match is unambiguous. Helper slayer.memories.resolver.canonical_id_rooted_at encodes the same dotted-namespace rule used by the embedding cascade-delete.
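
As a rough illustration of the "rooted at" rule described above, here is a minimal sketch. The function names rooted_at and memory_in_scope are hypothetical stand-ins, not the real slayer.memories.resolver code, which may differ in detail.

```python
from typing import Iterable


def rooted_at(canonical_id: str, datasource: str) -> bool:
    """Hypothetical stand-in for the dotted-namespace rule.

    True when canonical_id is the datasource itself or a strict
    dotted-path descendant such as <ds>.<model> or <ds>.<model>.<leaf>.
    """
    return canonical_id == datasource or canonical_id.startswith(datasource + ".")


def memory_in_scope(entities: Iterable[str], datasource: str) -> bool:
    """A memory is kept when at least one of its entities is rooted at the datasource."""
    return any(rooted_at(entity, datasource) for entity in entities)
```

Because the check appends a literal dot before comparing, rooted_at("prod_v2.orders", "prod") is False while rooted_at("prod.orders", "prod") is True. This only stays unambiguous because datasource names themselves can no longer contain a dot.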

Embedding storage moved to a SQLite sidecar in YAMLStorage

Before 0.6.3, YAMLStorage persisted every embedding row to a single embeddings.yaml list. Each save/get/list/delete operation read and re-wrote the entire file, with the embedding vector (typically 768-1536 floats per row) inline. On every slayer ingest, EmbeddingService._apply_pending re-parsed embeddings.yaml M times for hash-skip and then M more times for writes (M = model + visible columns + named measures + custom aggregations). Even moderate corpora made ingest unusably slow.

0.6.3 introduces a SidecarEmbeddingStore helper backed by SQLite. YAMLStorage now persists embeddings to <base_dir>/embeddings.db while keeping models, datasources, datasource priority, and memories in their original git-diffable YAML form. SQLiteStorage delegates to the same helper. The SQL lives in exactly one place.

Net effect on a JaffleShop-sized model tree: slayer ingest goes from "minutes of YAML round-tripping" to "one batched read + one batched write per refresh."
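
To make the sidecar idea concrete, the following is a minimal sketch of a SQLite-backed store with batched save and chunked lookup. The table layout, the JSON vector encoding, the chunk size, and the function names are all assumptions made for illustration; the real SidecarEmbeddingStore may be organised differently.

```python
import json
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical row shape: (canonical_id, embedding_model_name, vector).
Row = Tuple[str, str, List[float]]

# Keep each IN (...) list well below SQLite's bound-parameter limit
# (999 by default on builds older than 3.32).
CHUNK = 500


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the sidecar database with an assumed schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings ("
        " canonical_id TEXT NOT NULL,"
        " model_name TEXT NOT NULL,"
        " vector TEXT NOT NULL,"
        " PRIMARY KEY (canonical_id, model_name))"
    )
    return conn


def save_many(conn: sqlite3.Connection, rows: List[Row]) -> None:
    """Write all rows in one transaction; do nothing for empty input."""
    if not rows:
        return
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO embeddings VALUES (?, ?, ?)",
            [(cid, model, json.dumps(vec)) for cid, model, vec in rows],
        )


def get_many(
    conn: sqlite3.Connection, canonical_ids: List[str], model_name: str
) -> Dict[str, List[float]]:
    """Fetch vectors for the given ids, chunking the IN (...) list."""
    found: Dict[str, List[float]] = {}
    for start in range(0, len(canonical_ids), CHUNK):
        chunk = canonical_ids[start:start + CHUNK]
        marks = ",".join("?" * len(chunk))
        cursor = conn.execute(
            "SELECT canonical_id, vector FROM embeddings"
            f" WHERE model_name = ? AND canonical_id IN ({marks})",
            [model_name, *chunk],
        )
        for cid, vec in cursor:
            found[cid] = json.loads(vec)
    return found
```

Ids with no stored row are simply absent from the returned dict, so callers can treat a missing key as "needs embedding".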

New batched embedding APIs

The StorageBackend ABC gained two methods, plumbed all the way through EmbeddingService:

  • save_embeddings(rows: List[Embedding]) -> None
  • get_embeddings_for_canonical_ids(*, canonical_ids: List[str], embedding_model_name: str) -> Dict[str, Embedding]

Default implementations fall back to M-iteration over the existing single-row methods, so third-party storage backends continue to work without modification. The bundled SQLite + YAML backends override them to issue single batched round-trips via SidecarEmbeddingStore.save_many / get_many. EmbeddingService._apply_pending (used by every save_memory, edit_model, and slayer ingest) now makes exactly one batched read and one batched write per call, independent of subtree size.
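
The fallback behaviour can be pictured roughly as follows. This is a synchronous sketch under stated assumptions: the class name StorageBackendSketch, the single-row method names save_embedding / get_embedding, and the Embedding fields are hypothetical, chosen only to show how a backend that implements single-row methods inherits working batched ones.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Embedding:
    # Hypothetical minimal shape; the real model may carry more fields.
    canonical_id: str
    embedding_model_name: str
    vector: List[float]


class StorageBackendSketch(ABC):
    # Single-row methods (names assumed for this sketch).
    @abstractmethod
    def save_embedding(self, row: Embedding) -> None: ...

    @abstractmethod
    def get_embedding(
        self, *, canonical_id: str, embedding_model_name: str
    ) -> Optional[Embedding]: ...

    # Batched methods with default per-row fallbacks.
    def save_embeddings(self, rows: List[Embedding]) -> None:
        for row in rows:
            self.save_embedding(row)

    def get_embeddings_for_canonical_ids(
        self, *, canonical_ids: List[str], embedding_model_name: str
    ) -> Dict[str, Embedding]:
        found: Dict[str, Embedding] = {}
        for cid in canonical_ids:
            row = self.get_embedding(
                canonical_id=cid, embedding_model_name=embedding_model_name
            )
            if row is not None:
                found[cid] = row
        return found


class DictBackend(StorageBackendSketch):
    """Toy third-party backend that only implements the single-row methods."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], Embedding] = {}

    def save_embedding(self, row: Embedding) -> None:
        self._rows[(row.canonical_id, row.embedding_model_name)] = row

    def get_embedding(
        self, *, canonical_id: str, embedding_model_name: str
    ) -> Optional[Embedding]:
        return self._rows.get((canonical_id, embedding_model_name))
```

A backend with a real batch primitive would override the two batched methods and skip the loop entirely, which is what the bundled backends are described as doing.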

Cascade-delete fix: descendants only, not character prefixes

StorageBackend.delete_embeddings_for_canonical(canonical_id_prefix=X) previously used LIKE X || '%', which silently matched anything starting with the same characters. Concrete consequences:

  • delete_memory(4) cascaded to delete_embeddings_for_canonical("memory:4"), which also wiped memory:42, memory:43, memory:400, and so on.
  • delete_datasource("orders") also wiped embeddings rooted at sibling datasources like orders_archive, orders123.
  • delete_model("orders", "customers") also wiped embeddings rooted at orders.customers_v2, orders.customers123.

The cascade now matches the canonical id exactly OR strict dotted-path descendants (<root>.<...>). Pinned by regression tests.
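
The difference between the two matching rules can be reproduced against an in-memory SQLite table. The table name, column name, and the substr-based predicate below are assumptions for illustration; the release notes do not show the actual SQL used by the fix.

```python
import sqlite3
from typing import Iterable, Set


def fresh_table(ids: Iterable[str]) -> sqlite3.Connection:
    """Build a throwaway table holding the given canonical ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE embeddings (canonical_id TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO embeddings VALUES (?)", [(i,) for i in ids])
    return conn


def delete_prefix_greedy(conn: sqlite3.Connection, root: str) -> None:
    # Old behaviour as described above: plain character-prefix match.
    conn.execute("DELETE FROM embeddings WHERE canonical_id LIKE ? || '%'", (root,))


def delete_exact_or_descendant(conn: sqlite3.Connection, root: str) -> None:
    # Exact id, or ids that start with "<root>." (strict dotted descendants).
    # substr comparison avoids LIKE wildcards such as "_" in the root.
    child_prefix = root + "."
    conn.execute(
        "DELETE FROM embeddings"
        " WHERE canonical_id = ? OR substr(canonical_id, 1, ?) = ?",
        (root, len(child_prefix), child_prefix),
    )


def remaining(conn: sqlite3.Connection) -> Set[str]:
    return {row[0] for row in conn.execute("SELECT canonical_id FROM embeddings")}
```

With ids memory:4, memory:42, and memory:400 in the table, the greedy variant removes all three for root "memory:4", while the namespace-aware variant removes only memory:4.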

Datasource-name validation tightened

To make the dotted-path namespace unambiguous:

  • DatasourceConfig.name now rejects ., /, \, NUL, empty/whitespace-only. __ is deliberately allowed (datasource names never appear in SQL alias positions).
  • SlayerModel.data_source rejects the same set (was previously missing the ., whitespace, and NUL checks).
  • Storage-layer _validate_path_component (used by get_model/delete_model/get_datasource/delete_datasource) likewise rejects a literal . in the component.
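
The rules listed above could be checked along these lines. This is a standalone sketch with a hypothetical function name, not the actual validator attached to DatasourceConfig or SlayerModel.

```python
# Characters the release notes list as rejected in datasource names.
FORBIDDEN_CHARS = (".", "/", "\\", "\x00")


def validate_datasource_name(name: str) -> str:
    """Return the name unchanged if it is acceptable, else raise ValueError."""
    if not name or not name.strip():
        raise ValueError("datasource name must not be empty or whitespace-only")
    for ch in FORBIDDEN_CHARS:
        if ch in name:
            raise ValueError(f"datasource name must not contain {ch!r}")
    # A double underscore is deliberately not checked: it stays legal.
    return name
```

Under this sketch, "prod__eu" passes, while "prod.v2", "a/b", and an empty string all raise ValueError.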

Upgrade note: if any of your existing datasource names contain a ., save/load will start failing validation on 0.6.3. Rename them before upgrading. __ is still fine.

Memory ids no longer use a separate counter store

counters.yaml (YAML) and the id_counters table (SQLite) are gone. The next id is derived directly from the memories corpus:

  • YAML: max(int_ids) + 1 over the current rows in memories.yaml (or 1 for an empty file).
  • SQLite: INSERT ... RETURNING id against the memories table. Id assignment happens inside SQLite's write lock atomically with the insert, so two concurrent save_memory calls can never collide on the same id.
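
Both allocation strategies can be sketched in a few lines. The table layout (an INTEGER PRIMARY KEY without AUTOINCREMENT) and the function names are assumptions for illustration, and RETURNING needs SQLite 3.35 or newer.

```python
import sqlite3
from typing import Iterable


def next_yaml_id(existing_ids: Iterable[int]) -> int:
    """YAML-style allocation: max of the current ids plus one, or 1 when empty."""
    return max(existing_ids, default=0) + 1


def open_memories(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Assumed schema: without AUTOINCREMENT, SQLite picks max(rowid) + 1.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def save_memory(conn: sqlite3.Connection, body: str) -> int:
    """SQLite-style allocation: the insert itself assigns and returns the id."""
    with conn:
        cursor = conn.execute(
            "INSERT INTO memories (body) VALUES (?) RETURNING id", (body,)
        )
        rows = cursor.fetchall()  # drain the cursor before the commit
    return rows[0][0]


def delete_memory(conn: sqlite3.Connection, memory_id: int) -> None:
    with conn:
        conn.execute("DELETE FROM memories WHERE id = ?", (memory_id,))
```

In this sketch, deleting the highest id and saving again hands the same id back, which matches the id-reuse behaviour change described in this release.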

Behaviour change: memory ids of deleted memories may now be reused by future saves. Cascade-on-delete in delete_memory already removes the matching embedding row, so reuse never strands data. (Pre-0.6.3 documentation claimed "ids are never reused"; that contract is gone, replaced with "ids increase monotonically while the corpus grows; freed ids may be reused.")

Fixed: concurrent save_memory race on SQLite

The pre-0.6.3 SQLiteStorage._next_seq_sync + _save_memory_sync flow used two separate SQLite connections. Concurrent save_memory calls could both read the same MAX(id) + 1 before either inserted, then both INSERT OR REPLACE the same id, silently clobbering one of the two memories. 0.6.3 collapses both steps into one INSERT ... RETURNING id transaction. Pinned by a regression test firing 25 concurrent saves and asserting unique ids.
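
A regression test of the kind mentioned above might look roughly like this. It is a sketch, not the project's test: the schema, the helper names, and the use of BEGIN IMMEDIATE with one connection per call are assumptions, and RETURNING needs SQLite 3.35 or newer.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import List


def save_memory(db_path: str, body: str) -> int:
    """Insert one memory on its own connection and return the assigned id."""
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        # Take the write lock up front so id assignment and insert are one step.
        conn.execute("BEGIN IMMEDIATE")
        rows = conn.execute(
            "INSERT INTO memories (body) VALUES (?) RETURNING id", (body,)
        ).fetchall()
        conn.execute("COMMIT")
        return rows[0][0]
    finally:
        conn.close()


def run_concurrent_saves(n: int = 25) -> List[int]:
    """Fire n saves from n threads against a temporary database file."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "memories.db")
        setup = sqlite3.connect(db_path)
        setup.execute(
            "CREATE TABLE memories (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
        )
        setup.commit()
        setup.close()
        with ThreadPoolExecutor(max_workers=n) as pool:
            return list(pool.map(lambda i: save_memory(db_path, f"note {i}"), range(n)))
```

The assertion of interest is that the returned ids contain no duplicates. A read-then-insert flow over two connections cannot guarantee that, because two callers may observe the same maximum before either writes.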


Migration

YAMLStorage.__init__ performs an idempotent one-time rename on first open at 0.6.3:

If present                    Renamed to
<base_dir>/embeddings.yaml    embeddings.yaml.legacy
<base_dir>/counters.yaml      counters.yaml.legacy

If a .legacy file already exists, neither file is touched (idempotent, never clobbers an existing backup). Embeddings are regeneratable artifacts; re-run slayer ingest (or rely on slayer serve --ingest-on-startup, available since 0.6.1) to repopulate embeddings.db. Until then, the embedding-similarity channel of search contributes nothing and emits a single warning. Tantivy full-text and BM25 entity-overlap continue to work.
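
The rename step can be approximated with pathlib. This is a sketch of the described behaviour only; the function names are hypothetical and YAMLStorage.__init__ may handle edge cases differently.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple

LEGACY_CANDIDATES = ("embeddings.yaml", "counters.yaml")


def rename_legacy_files(base_dir: Path) -> List[str]:
    """Rename each legacy file to <name>.legacy unless a backup already exists."""
    renamed = []
    for name in LEGACY_CANDIDATES:
        src = base_dir / name
        dst = base_dir / (name + ".legacy")
        if src.exists() and not dst.exists():
            src.rename(dst)
            renamed.append(name)
    return renamed


def demo() -> Tuple[List[str], List[str], List[str]]:
    """Run the rename twice to show that an existing backup is never clobbered."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "embeddings.yaml").write_text("[]")
        first = rename_legacy_files(base)
        (base / "embeddings.yaml").write_text("[]")  # a stale file reappears
        second = rename_legacy_files(base)
        return first, second, sorted(p.name for p in base.iterdir())
```

On the second call the existing embeddings.yaml.legacy blocks the rename, so both files are left exactly as they are.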

For SQLiteStorage, no migration is required. The legacy id_counters table, if present from a pre-0.6.3 database, is left in place as harmless dead data and is never queried.


Public API surface

Added (concrete on StorageBackend, with default M-iteration impls):

  • save_embeddings(rows)
  • get_embeddings_for_canonical_ids(*, canonical_ids, embedding_model_name)

Behaviour change:

  • delete_embeddings_for_canonical(canonical_id_prefix=...): semantics narrowed to exact-id OR strict dotted-path descendant. Previously a character prefix; now namespace-aware.
  • Memory ids of deleted memories may be reused.
  • DatasourceConfig.name and SlayerModel.data_source reject ., leading/trailing whitespace, NUL bytes.
  • search(...) gains optional datasource argument across MCP, REST, CLI, and the Python client.

Internal (not a public contract, but documented for backend authors):

  • New helper class slayer.storage.sidecar_embedding_store.SidecarEmbeddingStore.
  • New mixin SidecarEmbeddingsMixin providing the embedding CRUD forwards by delegating to self._embeddings_store. Both bundled backends inherit it.
  • New pure helper slayer.memories.resolver.canonical_id_rooted_at(canonical_id, datasource) -> bool.

Removed (internal):

  • SQLiteStorage._next_seq_sync, _seed_counter_sync, _COUNTER_SEED_TABLES: counter machinery removed as dead code.
  • YAMLStorage._read_counters, _write_counters, _max_memory_id.
  • id_counters table creation in SQLiteStorage._init_db (existing tables left untouched).

Docs

  • docs/configuration/storage.md: documents the new embeddings.db sidecar, the .legacy rename, and the new memory-id allocation.
  • docs/concepts/memories.md: id-reuse note, no more counter-store language.
  • docs/concepts/search.md: embedding sidecar paragraph, cascade-semantics clarification, and the datasource filter section.
  • CLAUDE.md: memories + embeddings + search sections updated.

Test coverage added

  • tests/test_sidecar_embedding_store.py (new): helper CRUD round-trips, batched APIs (incl. empty-input short-circuit verified by sqlite3.connect spy), the prefix-greedy regression set, and the get_many chunked-IN regression at 2000 ids.
  • tests/test_embeddings_storage.py (extended): parametrised across both backends; YAML legacy-rename idempotency; same prefix-greedy regression set via the public ABC.
  • tests/test_memories_storage.py (extended): id reuse on tail delete; empty-corpus first id is 1; new save never collides with existing id; concurrent save_memory produces 25 unique ids; delete_datasource / get_datasource reject bad inputs.
  • tests/test_models.py (extended): full validator coverage for DatasourceConfig.name and SlayerModel.data_source.
  • tests/test_embeddings_service.py (extended): pins that _apply_pending issues exactly one batched read + one batched write per call.
  • tests/test_canonical_id_helpers.py (new): dotted-namespace rule coverage for the new helper.
  • tests/test_search_datasource_filter.py (new): memory scoping (kept / cross-datasource / dropped / untagged), entity scoping at channel 2, validation, recency fallback, empty-corpus, None == no filter.
  • tests/test_search_surfaces.py (extended): acceptance + rejection of datasource across MCP, REST, CLI, and the Python client.

Total: 2584 unit tests, all green; ruff check clean.


Contributors

Egor Kraev. Generated with Claude Code.