Add embedding-based semantic search (third retrieval channel)#114
Conversation
Two-channel `search` tool that finds both memories and entities (datasources / non-hidden models / non-hidden columns / named measures / aggregations). Channel 1 reuses the entity-overlap BM25 path from `recall_memories`; channel 2 builds a per-call in-memory tantivy index over memories ∪ entities. Memory hits from both channels are fused via RRF (k=60); entity hits surface their raw tantivy BM25 score.

Surfaces: MCP `search`, REST `POST /search`, CLI `slayer search [--entity ...] [--question ...] [--query ...]`, and `SlayerClient.search`. Plus `slayer search refresh-samples` and the in-process refresh wired through `slayer ingest` / `edit_model`.

Storage adds a v5→v6 schema bump (no-op forward) for a new optional `Column.sampled: Optional[str]` cache, written through `StorageBackend.update_column_sampled` on YAML / SQLite / JoinSync. The sampled snapshot is the human-readable per-column profile string the search index renders into each entity's text field; previously this was recomputed by `inspect_model` on every call.

Tests cover the index builder, RRF fusion, render helpers, profiling, service-level orchestration, storage write-back, end-to-end MCP / REST / CLI / client surfaces (`tests/test_search_surfaces.py`), and a live-DB integration suite that drives the real `ingest_datasource_idempotent` path on SQLite (`tests/integration/test_integration.py::test_search_*`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
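The RRF fusion step described above can be sketched roughly as follows. This is a minimal illustrative stand-in, not the real helper in `slayer/search/service.py` (whose exact signature this PR does not show); it only demonstrates the `1 / (k + rank)` scoring with k=60.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked id lists: score(id) = sum over channels of 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Higher fused score first; ties keep first-seen order (sorted is stable).
    return sorted(scores, key=scores.__getitem__, reverse=True)


bm25 = ["mem-2", "mem-1", "mem-3"]      # channel 1: entity-overlap BM25
tantivy = ["mem-1", "mem-2", "mem-4"]   # channel 2: in-memory tantivy
fused = rrf_fuse(rankings=[bm25, tantivy])
# mem-1 and mem-2 appear in both channels, so they outrank the singletons.
```

Because each channel contributes only rank positions, the fused score is comparable across channels regardless of their raw score scales.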
* `slayer/client/slayer_client.py`: defer `slayer.search.service` import to the `search()` method. The service transitively imports `tantivy`, which isn't in the client extras (`httpx + pandas`); remote-only installs that never call `.search()` should not blow up at module-import time.
* `slayer/engine/ingestion.py`: wrap the post-ingest sample-value refresh in `try/except Exception` so a raise inside `refresh_all_table_backed_sampled` is captured as an `IngestionError` rather than killing the whole idempotent ingest call.
* `slayer/mcp/server.py`: skip the live `_collect_measure_profile` query when every column was served from the `Column.sampled` cache. A fully warmed cache no longer hits the database on `inspect_model`.
* `slayer/search/index.py`: filter hidden models out of the datasource doc input. Without this, a hidden model's name leaked into the parent datasource doc and surfaced via tantivy queries.
* `slayer/search/service.py`: hoist the `list_memories(entities=None)` fetch above the channel branches so both channels share one corpus scan instead of two.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…exity threshold

* `slayer/cli.py` — split monolithic `_run_search` (cognitive complexity 42) into a 4-line dispatcher plus `_run_search_refresh_samples`, `_refresh_samples_async`, `_run_search_query`, and a `_print_search_response_text` helper. Hoist the function-local `import asyncio`, `slayer.engine.profiling.refresh_*`, `SlayerQueryEngine`, and `SearchService` imports to module top per the project's "imports at the top" rule.
* `slayer/search/service.py` — split `SearchService.search` (cognitive complexity 54) into `_resolve_inputs`, `_recency_fallback`, `_run_channel_1`, `_run_channel_2`, plus a module-level `_fuse_memory_hits` helper. The orchestrator function is now a 60-line linear walk.
* `slayer/engine/profiling.py` — extract `_profile_categorical_column` and `_profile_numeric_temporal_columns` from `_collect_dim_profile` (cognitive complexity 26).
* `slayer/search/index.py` — extract `_add_datasource_docs`, `_add_model_subtree_docs`, and `_add_memory_docs` from `build_in_memory_index` (cognitive complexity 18).

No behaviour changes — every existing test passes unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* `tests/test_search_render.py` — fix vacuous `or` assertion: the test for "no sampled section when sampled is None" passed regardless of what render_column_text emitted because `or` short-circuited. Replaced with two `not in text` assertions (the actual contract).
* `tests/test_search_service.py` — replace tautological assertion: `assert any(h.id in mem_ids for h in response.memories)` was always true when `response.memories` was non-empty (mem_ids was constructed from the same list). Replaced with a content check on memory text.
* `tests/test_v6_migration.py` — the two storage round-trip tests were saving a current-version `SlayerModel(...)` which already has `version=6`, so `get_model` never had to run the v5→v6 migration on load. Now seed each backend with a raw v5 dict (YAML directly on disk; SQL `INSERT INTO models` on the SQLite backend) and assert `loaded.version == 6` and `sampled is None` per column.
* `tests/test_engine_profiling.py` — drop unused `asyncio.get_event_loop_policy().new_event_loop()` (allocated but never installed or closed); hoist `import asyncio` and `import sqlite3` to module top per the project's import-placement rule.
* `tests/test_search_surfaces.py` — `code, out = ...` → `code, _ = ...` in the refresh-samples-help test (Sonar S1481 / CodeRabbit nit).
* `CLAUDE.md` — add `--query` to the compact `slayer search` CLI mention (was missing — `--question` and `--entity` were listed but not `--query`).
* `docs/concepts/models.md` — update the field-table default and the schema-version narrative for `SlayerModel` from `3` to `6` (DEV-1361 bumped to 5, DEV-1375 bumped to 6); `SlayerQuery` stays at 3 and `DatasourceConfig` stays at 1.
* `docs/concepts/search.md` — annotate the bare ` ``` ` fence containing the RRF formula with `text` to satisfy markdownlint MD040.
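The vacuous-`or` trap fixed in the first bullet is easy to reproduce in miniature. The string literals below are illustrative, not the real test's: an `or` between two membership checks passes whenever either side holds, so it cannot pin down both.

```python
text = "Sample values: foo, bar"  # the renderer DID emit a sampled section

# Vacuous form: True because the second clause happens to hold, even though
# the first clause (the one the test actually cared about) is violated.
vacuous = ("Sample values" not in text) or ("nonexistent" not in text)

# The real contract needs both conditions checked independently:
strict = ("Sample values" not in text) and ("nonexistent" not in text)
```

Splitting the check into two separate `assert ... not in text` statements, as the fix does, gives the same strictness plus a precise failure message per clause.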
* `slayer/search/render.py` — fix the module docstring's claim that the model renderer mentions hidden columns: `render_model_text` actually filters via `visible_columns`, so a hidden column never appears in any indexed text. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…findings

* `tests/test_search_writeback.py:182,183` — Sonar S125 (commented-out code) flagged the `# sampled=None` annotations next to two Column definitions. The annotations document the test's contract (these columns intentionally have no cached profile so the writeback path fires). Suppress with NOSONAR(S125).
* `tests/test_search_surfaces.py:173` — Sonar S5754 (reraise to stop the application) flagged the `except SystemExit` branch in `_run_cli`. The test deliberately captures the exit code rather than reraising — that's how it asserts on `slayer search --help` exit status. Suppress with NOSONAR(S5754).
* `tests/test_search_surfaces.py:53` — Sonar S3776 (cognitive complexity 18→15) on the `_call_mcp_tool` test helper. The helper branches over three FastMCP result shapes (tuple / list / object); splitting it would hurt readability for an 18-vs-15 overshoot. Suppress with NOSONAR(S3776).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* `slayer/search/service.py` — split the round-1 `_run_channel_2` helper (cognitive complexity 18 after the Group B refactor) into `_run_channel_2` + two module-level helpers `_split_tantivy_hits` and `_backfill_memory_by_id`. Brings `_run_channel_2` under 15.
* `tests/test_search_writeback.py` — tighten the refresh assertion per CodeRabbit: was `errors == [] or all(isinstance(e, str) for e in errors)` (vacuous when refresh emits any string error); now `errors == []` with a diagnostic message. Also hoist the function-local `from slayer.engine.profiling import refresh_all_table_backed_sampled` to module top.
* `tests/test_v6_migration.py` — hoist `from slayer.core.models import DatasourceConfig` to module top per the import-placement rule. Add `# NOSONAR(S7493)` on the sync `open(v5_path, "w")` — test seeds a tempfile and mirrors the YAMLStorage sync-in-async pattern; aiofiles is overkill for test setup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* `slayer/search/service.py::MemoryHit` — fix docstring/code mismatch: ``score`` is *always* the RRF-fused value (the code routes single-channel results through ``rrf_fuse`` too, so the previous "raw score when only one channel contributed" claim was misleading). Document the formula and that the value is comparable across channels.
* `slayer/search/service.py` — convert positional args to keyword form on four multi-parameter helper calls per the project's "use keyword arguments for functions with more than 1 parameter" rule: ``resolve_entity(raw=...)``, ``extract_entities_from_query(query=...)``, ``bm25_rank(memories=..., query_entities=...)``, ``_backfill_memory_by_id(memory_by_id=..., all_memories=..., mem_ids=...)``. ``_backfill_memory_by_id`` is now keyword-only at its signature, since every call site supplies kwargs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Channel 1 of search already wraps the same BM25 ranker recall_memories
used, so the two retrieval surfaces were redundant. Hard-remove
recall_memories (no deprecation shim — too few users) and reshape
search so query-bearing memories surface in their own example_queries
bucket with an independent max_example_queries=2 cap, preventing bulky
example queries from crowding out small learning-only notes. Also adds
resolved_input_entities to SearchResponse for diagnostics.
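The bucketing described above can be sketched as follows. All names (`partition_hits`, the dict shape, the `query` field) are illustrative assumptions, not the real service's types; the point is that query-bearing memories fill their own capped bucket instead of competing with notes for the same slots.

```python
from typing import Dict, List, Optional, Tuple

Hit = Dict[str, Optional[dict]]


def partition_hits(
    hits: List[Hit],
    max_memories: int = 5,
    max_example_queries: int = 2,
) -> Tuple[List[Hit], List[Hit]]:
    """Split ranked hits into learning notes vs query-bearing examples."""
    memories: List[Hit] = []
    example_queries: List[Hit] = []
    for hit in hits:  # hits arrive already ranked (RRF order)
        bucket = example_queries if hit.get("query") else memories
        bucket.append(hit)
    # Independent caps: bulky example queries cannot crowd out small notes.
    return memories[:max_memories], example_queries[:max_example_queries]


ranked = [
    {"id": 1, "query": {"model": "orders"}},     # query-bearing
    {"id": 2, "query": None},                    # learning-only note
    {"id": 3, "query": {"model": "customers"}},  # query-bearing
    {"id": 4, "query": {"model": "orders"}},     # query-bearing, over cap
]
mems, eqs = partition_hits(hits=ranked)
# eqs keeps only the top-2 query-bearing hits; mems keeps the note.
```

With a single shared cap, three bulky example queries would have pushed the lone note out entirely; with separate caps it always survives.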
Removed surfaces: MCP recall_memories tool, REST POST /memories/recall +
RecallMemoriesRequest, CLI `slayer memory recall` subcommand,
SlayerClient.recall_memories, RecallHit/RecallResponse Pydantic models,
MemoryService.recall_memories + helpers.
Docs (CLAUDE.md, docs/concepts/{memories,search}.md, slayer-overview
skill) and tests (drop recall test classes, extend search tests for the
new shape) updated accordingly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Group A — `slayer/search/service.py`: extract `_build_memory_hit` helper to bring `_fuse_memory_hits` cognitive complexity from 22 to under the Sonar S3776 threshold of 15. The helper centralises the learning-vs-example-query partition + matched_entities + text-fallback logic and is reused by `_recency_fallback` for DRYness.

Group B — `slayer/cli.py`: wrap `_run_search_query`'s `asyncio.run` in a try/except for `EntityResolutionError` / `AmbiguousModelError` / `ValueError`, matching the friendly `_exit_with_error` path used by the other memory subcommands. Move the three inline `slayer.core.errors` imports in `cli.py` to module top (user's import-placement rule). Prefix `slayer search` help-text examples with `poetry run` per repo convention.

Group C — `tests/test_search_service.py`: tighten two assertions. Channel-2 entity-hit isinstance check now also asserts the list is non-empty (was vacuous on empty list). RRF outranking test now finds BOTH the dual-channel and single-channel memory indices and asserts the dual-channel one ranks ahead — the previous assertion only checked that the dual-channel memory exists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A new dense-embedding channel runs alongside tantivy and entity-overlap BM25; memory rankings are now RRF-fused across all three channels, and entity hits — previously tantivy-only with raw scores — are RRF-fused across channels 2 and 3. The channel is gated behind the optional `embedding_search` extra (litellm + numpy); when the extra is not installed, no API key is configured, or no embedding rows exist for the active model, search degrades gracefully via tantivy + BM25 and emits a single warning.

Embeddings are persisted in a sidecar table keyed by `(canonical_id, embedding_model_name)`. Refresh is inline on the same write-side edges as `Column.sampled`: `slayer ingest`, `edit_model`, and `save_memory`. A SHA256 `content_hash` on each row makes idempotent re-runs cheap. The default model is `openai/text-embedding-3-small`, overridable via `SLAYER_EMBEDDING_MODEL`; provider credentials are read by litellm directly. Per-entity embed failures are non-fatal.

Storage logic refactored to keep cascade-delete in the ABC: backends now implement `_delete_model_row` / `_delete_datasource_row` and the ABC wrappers handle embedding-row cascade by canonical-id prefix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
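The content-hash gate that makes re-runs cheap can be sketched like this. The function name, dict-backed "table", and field names are illustrative, not the real sidecar schema; only the SHA256-compare-then-skip shape mirrors the description above.

```python
import hashlib
from typing import Callable, Dict, List


def refresh_embedding(
    rows: Dict[str, dict],  # stand-in for the sidecar table, keyed by id
    canonical_id: str,
    text: str,
    embed: Callable[[str], List[float]],  # stand-in for the litellm call
) -> bool:
    """Return True when a new embedding was computed, False on a hash hit."""
    content_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    row = rows.get(canonical_id)
    if row is not None and row["content_hash"] == content_hash:
        return False  # rendered text unchanged: idempotent re-run is a no-op
    rows[canonical_id] = {"content_hash": content_hash, "vector": embed(text)}
    return True


def fake_embed(text: str) -> List[float]:
    return [float(len(text))]  # deterministic stand-in vector


store: Dict[str, dict] = {}
first = refresh_embedding(rows=store, canonical_id="ds.orders",
                          text="orders table", embed=fake_embed)
second = refresh_embedding(rows=store, canonical_id="ds.orders",
                           text="orders table", embed=fake_embed)
# first computes and stores; second hits the hash and skips the API call.
```

Re-ingesting an unchanged datasource therefore costs one hash per entity rather than one embedding API call per entity.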
📝 Walkthrough

Adds a unified semantic search surface (BM25 + tantivy + optional embeddings fused via RRF), per-column sampled profiling and writeback, an embeddings sidecar with persistence and cascade deletes, ingestion/edit refresh hooks, storage ABC/backend changes, new API/CLI/MCP/Client search surfaces, a v6 schema bump, removal of recall_memories, and extensive tests and docs.

Changes

Semantic search, embeddings, and sampled cache
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant API as FastAPI /search
    participant SearchService
    participant Storage as StorageBackend
    participant Embeddings as EmbeddingService
    Client->>API: POST /search {entities, query, question}
    API->>SearchService: search(...)
    SearchService->>Storage: load models, memories, corpus metadata
    alt question present
        SearchService->>SearchService: build in-memory tantivy corpus & run full-text
    end
    alt embeddings enabled and question present
        SearchService->>Embeddings: fetch_corpus()
        SearchService->>Embeddings: embed_question(question)
        Embeddings->>Storage: read/write embedding rows (refresh)
        SearchService->>SearchService: compute cosine/top-k + fuse via RRF
    end
    SearchService-->>API: SearchResponse (memories, example_queries, entities, warnings)
    API-->>Client: JSON
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 13
🧹 Nitpick comments (8)
slayer/storage/v6_migration.py (1)
15-15: ⚡ Quick win — Use keyword arguments in migration registration decorator.
Line 15 uses two positional arguments; please switch to named arguments for consistency with repository Python style rules.
As per coding guidelines
**/*.py: Use keyword arguments for functions with more than 1 parameter.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/storage/v6_migration.py` at line 15, The decorator usage `@register_migration`("SlayerModel", 5) should use keyword arguments per style rules; update the decorator call to use named parameters (e.g., register_migration(name="SlayerModel", version=5)) so the migration registration is explicit and consistent with functions that take multiple parameters.

slayer/embeddings/client.py (1)
111-111: ⚡ Quick win — Use keyword args for the multi-parameter internal call.
Line 111 should pass `texts` explicitly for consistency with repo conventions.

💡 Suggested fix
```diff
- result = await embed_batch([text], model=resolved_model)
+ result = await embed_batch(texts=[text], model=resolved_model)
```

As per coding guidelines
**/*.py: Use keyword arguments for functions with more than 1 parameter.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/embeddings/client.py` at line 111, Call embed_batch using keyword arguments to follow the repo convention: change the call that currently uses positional args to pass texts explicitly (e.g., use texts=[text], model=resolved_model). Update the invocation of embed_batch in slayer/embeddings/client.py (the line with result = await embed_batch([text], model=resolved_model)) so it uses the texts= keyword while keeping model=resolved_model and the same variables (text, resolved_model).

tests/test_embeddings_storage.py (1)
32-38: ⚡ Quick win — Include `JoinSyncStorage` in this matrix too.

These tests only exercise the raw YAML/SQLite backends, but this PR also adds embedding-method delegation in `JoinSyncStorage`. Adding a wrapped-backend case here would catch a broken passthrough on the adapter path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_embeddings_storage.py` around lines 32 - 38, The parametrized storage fixture currently yields only YAMLStorage and SQLiteStorage; add cases that yield JoinSyncStorage wrappers to exercise delegation. Update the fixture (storage) to include wrapped variants (e.g., for YAML and SQLite) or extend params to include "join_yaml"/"join_sqlite", and when those values are seen, construct JoinSyncStorage(inner=YAMLStorage(...)) and JoinSyncStorage(inner=SQLiteStorage(...)) so tests exercise JoinSyncStorage's passthrough behavior alongside YAMLStorage and SQLiteStorage.

tests/test_search_index.py (1)
151-163: 💤 Low value — Weak assertions don't verify the documented expectation.
The comment states "Both groups should be non-empty" but the assertions only check `isinstance(..., list)`, which is always true. Consider asserting `len(memory_hits) > 0` and `len(entity_hits) > 0` to match the documented expectation.

Suggested fix
```diff
-    # Both groups should be non-empty for "customer" since memory 2 mentions
-    # "customers" and entities include customers/customer_id.
-    assert isinstance(memory_hits, list)  # type sanity
-    assert isinstance(entity_hits, list)
+    # Both groups should be non-empty for "customer" since memory 2 mentions
+    # "customers" and entities include customers/customer_id.
+    assert len(memory_hits) > 0, "expected at least one memory hit"
+    assert len(entity_hits) > 0, "expected at least one entity hit"
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_search_index.py` around lines 151 - 163, The test test_kind_field_split_for_memory_vs_entity documents that both memory_hits and entity_hits should be non-empty but only asserts their types; update the assertions in this test to assert len(memory_hits) > 0 and len(entity_hits) > 0 (instead of isinstance checks) so the behavior of search_index / build_in_memory_index is actually verified for the "customer" query.

slayer/engine/profiling.py (1)
87-96: 💤 Low value — Minor: redundant `or None` in data_source handling.

On line 94, `model.data_source or None` is redundant since `data_source` is already `Optional[str]`. If it's `None`, the expression evaluates to `None`; if it's an empty string, converting to `None` might be intentional for disambiguation.

If the intent is to treat an empty string as `None`, this is fine. Otherwise, it can be simplified.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/engine/profiling.py` around lines 87 - 96, The use of "model.data_source or None" when calling engine.execute is redundant because model.data_source is already Optional[str]; update the call in the try block (the SlayerQuery.model_validate / engine.execute usage) to pass model.data_source directly (i.e., remove the "or None" coalesce) unless you explicitly want empty-string values converted to None—in that case add a short comment clarifying that behavior.

slayer/search/service.py (1)
597-608: 💤 Low value — Accessing private method `_list_all_model_identities` outside the storage layer.

The `_collect_index_corpus` method accesses `storage._list_all_model_identities()`, which is a private method (prefixed with `_`). While this works since `SearchService` is part of the same codebase, it couples the service to storage internals.

Consider whether a public method like `list_all_models()` or `list_model_identities()` would be more appropriate for the storage ABC, or document this as an intentional internal coupling.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/search/service.py` around lines 597 - 608, _collect_index_corpus is calling the private storage method _list_all_model_identities which couples SearchService to storage internals; replace that direct private call with a public storage API (e.g., add and use list_model_identities or list_all_models on the storage ABC) and update implementations accordingly so _collect_index_corpus calls storage.list_model_identities() (keep using storage.list_datasources() and storage.get_model(name, data_source=ds) unchanged); alternatively, if this coupling is intentional, add a clear comment/docstring on SearchService._collect_index_corpus noting the deliberate use of the private API.

slayer/search/index.py (1)
121-169: 💤 Low value — Consider extracting entity-type renderers to reduce cognitive complexity.
SonarCloud flagged cognitive complexity of 16 (limit 15). While the function is readable as-is, extracting the inner loops for columns/measures/aggregations into helper functions would address this and improve testability.
However, given the function's straightforward iteration pattern and the minor threshold violation, this is low priority.
♻️ Optional refactor sketch
```python
def _collect_model_children_pairs(
    model: SlayerModel,
    model_canonical: str,
) -> List[Tuple[str, str, str]]:
    """Extract column/measure/aggregation pairs for a single model."""
    out: List[Tuple[str, str, str]] = []
    for column in model.columns:
        if column.hidden:
            continue
        out.append((
            f"{model_canonical}.{column.name}",
            "column",
            render_column_text(model=model, column=column),
        ))
    for measure in model.measures:
        if measure.name is None:
            continue
        out.append((
            f"{model_canonical}.{measure.name}",
            "measure",
            render_measure_text(model=model, measure=measure),
        ))
    for aggregation in model.aggregations:
        out.append((
            f"{model_canonical}.{aggregation.name}",
            "aggregation",
            render_aggregation_text(model=model, aggregation=aggregation),
        ))
    return out
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/search/index.py` around lines 121 - 169, The function _collect_render_pairs exceeds SonarCloud's cognitive complexity threshold due to the inlined loops for model children; extract the column/measure/aggregation handling into a helper (e.g., _collect_model_children_pairs) that accepts a SlayerModel and its canonical id and returns List[Tuple[str, str, str]]; replace the three inner loops inside _collect_render_pairs with a single call to that helper and ensure the helper applies the same filters (skip hidden columns and measures with name None) and uses render_column_text, render_measure_text, and render_aggregation_text so behavior remains identical.

tests/test_search_writeback.py (1)
170-198: 💤 Low value — Unused `monkeypatch` fixture parameter.

The `monkeypatch` fixture is declared in the function signature but never used in this test. Consider removing it to avoid confusion.

♻️ Proposed fix
```diff
 @pytest.mark.asyncio
 async def test_inspect_model_writes_back_on_sampled_miss(
-    sqlite_table_setup, monkeypatch,
+    sqlite_table_setup,
 ) -> None:
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_search_writeback.py` around lines 170 - 198, The test function test_inspect_model_writes_back_on_sampled_miss declares an unused monkeypatch fixture parameter; remove monkeypatch from the function signature (or use it if there was an intended patch) so the test signature becomes async def test_inspect_model_writes_back_on_sampled_miss(sqlite_table_setup) -> None: and update any call sites if needed; ensure no other references to monkeypatch exist in this function.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.claude/skills/slayer-overview.md:
- Line 35: Update the search description that currently calls it "two-channel"
to document three retrieval channels: entity-overlap BM25 over memories, tantivy
full-text over memories ∪ entities, and dense embedding similarity (optional)
over memories ∪ entities; keep the note that the query-bearing memories are
partitioned into `example_queries` and reference the optional embedding behavior
consistent with docs/concepts/search.md so the `search` overview matches the
full design.
In `@CLAUDE.md`:
- Around line 115-116: Update the CLAUDE.md synopsis for the slayer search CLI
to match the semantic-search docs by changing "two-channel" to "three-channel"
and adding the new flag to the usage string and description: include
--max-example-queries in the usage for `slayer search [--entity ENT ...]
[--query JSON_OR_@FILE] [--question TEXT] [--max-memories N] [--max-entities N]
[--max-example-queries N] [--format json|text]` and mention the third channel
(example queries) in the short description for the `slayer search` command so it
aligns with the semantic-search section.
In `@docs/concepts/ingestion.md`:
- Line 160: Update the sentence describing embedding refresh to clarify that
setting SLAYER_EMBEDDING_MODEL is optional: when the embedding_search extra is
installed and either a SLAYER_EMBEDDING_MODEL is configured or the runtime
default embedding model (e.g., openai/text-embedding-3-small) is used, the
embedding refresh will re-run for the datasource doc plus every visible model
and its visible children; mention SLAYER_EMBEDDING_MODEL as an optional override
so readers understand the default behavior and that re-ingests remain cheap due
to the SHA256 content_hash on each row.
In `@slayer/cli.py`:
- Around line 605-607: Calls to helper functions _run_search_refresh_samples,
_run_search_query, and _refresh_samples_async currently pass multiple positional
arguments; update each call to use keyword arguments (e.g.,
_run_search_refresh_samples(args=..., storage=...), _run_search_query(args=...,
storage=...), _refresh_samples_async(args=..., storage=...)) so the parameter
order is explicit and follows the repository Python guideline for functions with
more than one parameter.
- Around line 624-626: The loop that calls storage.get_model(model_name,
data_source=ds) currently continues silently when m is None, which lets typos in
a user-specified --model filter go unnoticed; change this by tracking unresolved
model names (e.g., collect model_name when storage.get_model returns None) and
after the loop, if the user provided --model and any unresolved names exist,
surface them to the user (print/log a clear error or warning and exit non-zero)
instead of proceeding to a “completed successfully” message; update the code
paths that reference model_name, m, storage.get_model and the --model handling
to fail-fast or fail loudly when requested models are missing.
In `@slayer/client/slayer_client.py`:
- Line 253: Update the stale docstring that currently reads "Two-channel
semantic search over memories + canonical entities." in slayer_client.py to
reflect the new optional third channel (embeddings) added by this PR: change
wording to indicate "Two- or three-channel semantic search" and enumerate the
channels (memories, canonical entities, and optional embeddings when the
embeddings channel/parameter is enabled), and mention any relevant parameter
name that enables the third channel so the docstring matches actual retrieval
behavior.
In `@slayer/embeddings/client.py`:
- Around line 108-120: The cache implementation using _QUERY_CACHE and
_QUERY_CACHE_MAX currently behaves FIFO because hits don't update recency;
modify the hit path in the function that looks up key (before returning cached)
to refresh recency by moving the key to the most-recent position (e.g., remove
and reinsert the key or use collections.OrderedDict and call move_to_end) so the
eviction logic that pops next(iter(_QUERY_CACHE)) effectively implements LRU;
keep the existing embed_batch, vec extraction, and eviction code but ensure hits
update the cache order before returning.
In `@slayer/mcp/server.py`:
- Around line 1892-1896: The current gate only marks column upserts and a few
source/filter changes, so pure measure/aggregation/join edits never set
model_level_change and thus skip handle_edit_refresh in edit_model; update the
edit-path to treat non-column semantic edits as model-level changes by setting
model_level_change = True (or otherwise marking the change) when encountering
edits to measures, aggregations, joins or similar non-column metadata, and
ensure changed_columns and model_level_change are passed into the existing
post-save/refresh logic (handle_edit_refresh) so the semantic-search/index
refresh path runs for these edits (refer to changed_columns, model_level_change,
handle_edit_refresh, and edit_model to locate where to set the flag and
propagate it).
- Around line 2574-2594: The docstring for the search function is out of
date—update the Slayer MCP `search` docstring to describe the new optional
embedding retrieval channel and how its results are fused with the existing
channels; explicitly mention that search now has a third (optional) embeddings
channel, when it runs, that fused ranking uses Reciprocal Rank Fusion (k=60)
across channels (and that entity hits remain channel-2 only), and keep the
existing notes about partitioning query-bearing memories into example_queries vs
memories (SlayerQuery) and capping behavior; reference the `search` function,
the embedding retrieval path, Reciprocal Rank Fusion (k=60), and
`SlayerQuery`/`example_queries` so readers can find the related logic.
- Around line 2128-2139: The post-save call to handle_edit_refresh should be
made truly best-effort: wrap the await handle_edit_refresh(...) call in a
try/except so any exception is caught and does not abort after the model is
persisted, log a warning including the exception details and context (engine,
saved_model.name, data_source), and capture the return value (e.g., errors =
await handle_edit_refresh(...)); if that return contains non-empty errors, log
them as warnings and append them to the MCP response warnings payload (or to a
warnings list on the current save result/context) so the API response surfaces
these refresh warnings instead of dropping them.
- Around line 1258-1279: The code collects measure profiles via
_collect_measure_profile(model=model, engine=engine) into measure_profile but
never persists them; update the logic after measure_profile is assigned to
iterate its entries and call storage.update_column_sampled(...) for each
measured column (similar to the loop that persists dim profile values),
formatting values with the same _format_dim_profile_value or a suitable
formatter and catching exceptions to log warnings; reference
_collect_measure_profile, measure_profile, storage.update_column_sampled,
_format_dim_profile_value and Column.sampled so the persisted sampled values are
saved for numeric/date/time columns.
In `@slayer/search/__init__.py`:
- Around line 4-7: Update the package docstring in slayer/search/__init__.py so
it accurately describes SearchService as a three-channel orchestrator
(entity-overlap BM25 over memories, tantivy full-text over the unioned corpus,
plus an optional embedding-based retrieval channel) and mention that results are
fused (e.g., via RRF) and that the embedding channel is conditional; update any
brief usage notes to reflect the optional embeddings flag and fusion behavior.
Use the symbol SearchService to locate the orchestrator and edit the top-level
module docstring accordingly.
In `@tests/integration/test_integration.py`:
- Around line 3135-3143: The assertion currently allows unrelated hits like
"test_sqlite.customers.region"; tighten it so the test requires at least one
actual amount column. In the block that computes column_hits and expected,
remove "test_sqlite.customers.region" from expected and assert that there exists
a hit with id equal to "test_sqlite.orders.amount" or that endswith(".amount")
(i.e., replace the current combined any(...) or any(...endswith(".amount"))
check with a single assertion requiring any(h.id.endswith(".amount") or h.id ==
"test_sqlite.orders.amount") against column_hits) so only actual amount columns
satisfy the test.
---
Nitpick comments:
In `@slayer/embeddings/client.py`:
- Line 111: Call embed_batch using keyword arguments to follow the repo
convention: change the call that currently uses positional args to pass texts
explicitly (e.g., use texts=[text], model=resolved_model). Update the invocation
of embed_batch in slayer/embeddings/client.py (the line with result = await
embed_batch([text], model=resolved_model)) so it uses the texts= keyword while
keeping model=resolved_model and the same variables (text, resolved_model).
In `@slayer/engine/profiling.py`:
- Around line 87-96: The use of "model.data_source or None" when calling
engine.execute is redundant because model.data_source is already Optional[str];
update the call in the try block (the SlayerQuery.model_validate /
engine.execute usage) to pass model.data_source directly (i.e., remove the "or
None" coalesce) unless you explicitly want empty-string values converted to
None—in that case add a short comment clarifying that behavior.
In `@slayer/search/index.py`:
- Around line 121-169: The function _collect_render_pairs exceeds SonarCloud's
cognitive complexity threshold due to the inlined loops for model children;
extract the column/measure/aggregation handling into a helper (e.g.,
_collect_model_children_pairs) that accepts a SlayerModel and its canonical id
and returns List[Tuple[str, str, str]]; replace the three inner loops inside
_collect_render_pairs with a single call to that helper and ensure the helper
applies the same filters (skip hidden columns and measures with name None) and
uses render_column_text, render_measure_text, and render_aggregation_text so
behavior remains identical.
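The extraction suggested above could look like the sketch below, trimmed to two child kinds for brevity. `SlayerModel`, `Column`, and the `render_*` helpers are minimal stand-ins inferred from the comment, not the project's real classes:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Column:
    name: str
    hidden: bool = False

@dataclass
class SlayerModel:
    name: str
    columns: List[Column] = field(default_factory=list)
    measures: List[Column] = field(default_factory=list)

def render_column_text(col: Column) -> str:
    return f"column {col.name}"

def render_measure_text(m: Column) -> str:
    return f"measure {m.name}"

def _collect_model_children_pairs(model: SlayerModel, canonical_id: str) -> List[Tuple[str, str, str]]:
    """One helper replaces the inlined loops, keeping the same filters."""
    pairs: List[Tuple[str, str, str]] = []
    for col in model.columns:
        if col.hidden:
            continue  # same filter as before: skip hidden columns
        pairs.append((f"{canonical_id}.{col.name}", "column", render_column_text(col)))
    for m in model.measures:
        if m.name is None:
            continue  # same filter as before: skip unnamed measures
        pairs.append((f"{canonical_id}.{m.name}", "measure", render_measure_text(m)))
    return pairs

pairs = _collect_model_children_pairs(
    SlayerModel("orders", columns=[Column("amount"), Column("secret", hidden=True)]),
    "test_sqlite.orders",
)
print(pairs)
```

The caller's loop body shrinks to a single call per model, which is what drops the cognitive-complexity score without changing behavior.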
In `@slayer/search/service.py`:
- Around line 597-608: _collect_index_corpus is calling the private storage
method _list_all_model_identities which couples SearchService to storage
internals; replace that direct private call with a public storage API (e.g., add
and use list_model_identities or list_all_models on the storage ABC) and update
implementations accordingly so _collect_index_corpus calls
storage.list_model_identities() (keep using storage.list_datasources() and
storage.get_model(name, data_source=ds) unchanged); alternatively, if this
coupling is intentional, add a clear comment/docstring on
SearchService._collect_index_corpus noting the deliberate use of the private
API.
In `@slayer/storage/v6_migration.py`:
- Line 15: The decorator usage `@register_migration("SlayerModel", 5)` should use
keyword arguments per style rules; update the decorator call to use named
parameters (e.g., register_migration(name="SlayerModel", version=5)) so the
migration registration is explicit and consistent with functions that take
multiple parameters.
In `@tests/test_embeddings_storage.py`:
- Around line 32-38: The parametrized storage fixture currently yields only
YAMLStorage and SQLiteStorage; add cases that yield JoinSyncStorage wrappers to
exercise delegation. Update the fixture (storage) to include wrapped variants
(e.g., for YAML and SQLite) or extend params to include
"join_yaml"/"join_sqlite", and when those values are seen, construct
JoinSyncStorage(inner=YAMLStorage(...)) and
JoinSyncStorage(inner=SQLiteStorage(...)) so tests exercise JoinSyncStorage's
passthrough behavior alongside YAMLStorage and SQLiteStorage.
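One way to extend the fixture is to route construction through a small factory that a parametrized fixture dispatches on; the storage classes below are minimal stand-ins for the project's real ones, and the fixture decoration (`@pytest.fixture(params=["yaml", "sqlite", "join_yaml", "join_sqlite"])`) is left as a comment:

```python
from dataclasses import dataclass
from pathlib import Path

# Minimal stand-ins for the real storage classes.
@dataclass
class YAMLStorage:
    root: Path

@dataclass
class SQLiteStorage:
    path: Path

@dataclass
class JoinSyncStorage:
    inner: object  # delegates every StorageBackend call to `inner`

def make_storage(kind: str, tmp_path: Path):
    # A pytest fixture with params=["yaml", "sqlite", "join_yaml", "join_sqlite"]
    # would call this with request.param, so every storage test also runs
    # through the JoinSync passthrough.
    if kind == "yaml":
        return YAMLStorage(root=tmp_path)
    if kind == "sqlite":
        return SQLiteStorage(path=tmp_path / "db.sqlite")
    if kind == "join_yaml":
        return JoinSyncStorage(inner=YAMLStorage(root=tmp_path))
    if kind == "join_sqlite":
        return JoinSyncStorage(inner=SQLiteStorage(path=tmp_path / "db.sqlite"))
    raise ValueError(f"unknown storage kind: {kind}")

wrapped = make_storage("join_sqlite", Path("/tmp/demo"))
print(type(wrapped).__name__, type(wrapped.inner).__name__)
```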
In `@tests/test_search_index.py`:
- Around line 151-163: The test test_kind_field_split_for_memory_vs_entity
documents that both memory_hits and entity_hits should be non-empty but only
asserts their types; update the assertions in this test to assert
len(memory_hits) > 0 and len(entity_hits) > 0 (instead of isinstance checks) so
the behavior of search_index / build_in_memory_index is actually verified for
the "customer" query.
In `@tests/test_search_writeback.py`:
- Around line 170-198: The test function
test_inspect_model_writes_back_on_sampled_miss declares an unused monkeypatch
fixture parameter; remove monkeypatch from the function signature (or use it if
there was an intended patch) so the test signature becomes async def
test_inspect_model_writes_back_on_sampled_miss(sqlite_table_setup) -> None: and
update any call sites if needed; ensure no other references to monkeypatch exist
in this function.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 877745c5-8486-4efa-a343-3b3508cfd27e
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (61)
.claude/skills/slayer-overview.md, CLAUDE.md, docs/concepts/ingestion.md, docs/concepts/memories.md, docs/concepts/models.md, docs/concepts/queries.md, docs/concepts/references.md, docs/concepts/schema-drift.md, docs/concepts/search.md, docs/configuration/storage.md, pyproject.toml, slayer/api/server.py, slayer/cli.py, slayer/client/slayer_client.py, slayer/core/models.py, slayer/embeddings/__init__.py, slayer/embeddings/client.py, slayer/embeddings/models.py, slayer/embeddings/ranker.py, slayer/embeddings/service.py, slayer/engine/ingestion.py, slayer/engine/profiling.py, slayer/mcp/server.py, slayer/memories/__init__.py, slayer/memories/models.py, slayer/memories/ranker.py, slayer/memories/service.py, slayer/search/__init__.py, slayer/search/index.py, slayer/search/render.py, slayer/search/rrf.py, slayer/search/service.py, slayer/storage/base.py, slayer/storage/join_sync.py, slayer/storage/migrations.py, slayer/storage/sqlite_storage.py, slayer/storage/v6_migration.py, slayer/storage/yaml_storage.py, tests/integration/test_integration.py, tests/test_embeddings_client.py, tests/test_embeddings_service.py, tests/test_embeddings_storage.py, tests/test_engine_profiling.py, tests/test_memories_cli.py, tests/test_memories_client.py, tests/test_memories_inspect.py, tests/test_memories_mcp.py, tests/test_memories_rest.py, tests/test_memories_storage.py, tests/test_migrations.py, tests/test_search_index.py, tests/test_search_render.py, tests/test_search_rrf.py, tests/test_search_service.py, tests/test_search_surfaces.py, tests/test_search_three_channel.py, tests/test_search_writeback.py, tests/test_storage_sampled.py, tests/test_storage_type_refinement.py, tests/test_v4_migration.py, tests/test_v6_migration.py
…c-search-using-embeddings (conflicts: tests/integration/test_integration.py)
♻️ Duplicate comments (1)
CLAUDE.md (1)
115-116: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win — Fix CLI synopsis to reflect the third channel and complete flag set.
This line still says "two-channel" and omits `--max-example-queries` from the usage string, creating inconsistency with the semantic-search section above (line 94), which describes three channels, and with line 96, which lists the flag.
📝 Proposed documentation fix:
-- `slayer search [--entity ENT ...] [--query JSON_OR_@FILE] [--question TEXT] [--max-memories N] [--max-entities N] [--format json|text]` (DEV-1375) — two-channel semantic search over memories + canonical entities. See [docs/concepts/search.md](docs/concepts/search.md).
+- `slayer search [--entity ENT ...] [--query JSON_OR_@FILE] [--question TEXT] [--max-memories N] [--max-example-queries N] [--max-entities N] [--format json|text]` (DEV-1375/DEV-1386) — up to three-channel semantic search over memories + canonical entities (BM25 + tantivy + optional embeddings). See [docs/concepts/search.md](docs/concepts/search.md).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CLAUDE.md` around lines 115 - 116, Update the CLI synopsis string in the `slayer search` usage line to reflect the third channel and include the missing flag: change "two-channel semantic search over memories + canonical entities" to "three-channel semantic search over memories + canonical entities + examples" and add the `--max-example-queries N` (or `--max-example-queries` form used elsewhere) into the flags list for the `slayer search [--entity ...] [--query ...] [--question ...] [--max-memories N] [--max-entities N] [--format json|text]` synopsis so it matches the semantic-search section and the flags listed elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@CLAUDE.md`:
- Around line 115-116: Update the CLI synopsis string in the `slayer search`
usage line to reflect the third channel and include the missing flag: change
"two-channel semantic search over memories + canonical entities" to
"three-channel semantic search over memories + canonical entities + examples"
and add the `--max-example-queries N` (or `--max-example-queries` form used
elsewhere) into the flags list for the `slayer search [--entity ...] [--query
...] [--question ...] [--max-memories N] [--max-entities N] [--format
json|text]` synopsis so it matches the semantic-search section and the flags
listed elsewhere.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d608308b-cf0a-4405-b0d3-7cd5ea8f5967
📒 Files selected for processing (5)
CLAUDE.md, docs/concepts/queries.md, docs/concepts/references.md, pyproject.toml, tests/integration/test_integration.py
✅ Files skipped from review due to trivial changes (2)
- docs/concepts/queries.md
- docs/concepts/references.md
🚧 Files skipped from review as they are similar to previous changes (2)
- pyproject.toml
- tests/integration/test_integration.py
Group A — valid fixes:
- slayer/search/index.py: extract _render_model_subtree_pairs to drop _collect_render_pairs cognitive complexity from 16 to ~9 (S3776).
- tests/test_embeddings_client.py: remove the dead `fail` stub + trivially-true `called == []` assertion in the empty-input test (S5603 / S7503).
- slayer/cli.py: collapse chained startswith into a tuple form (S8513, pre-existing).

Group B — NOSONAR suppressions for false-positive S7503 on async stubs that deliberately match production async signatures (litellm.aembedding, our embed_batch, our embed_query). Each suppression includes the rule key + reason per the project convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@slayer/cli.py`:
- Around line 523-526: The refresh-samples command is ignoring the shared
--format flag; update the refresh-samples handler to check args.format and emit
JSON when args.format == "json" (otherwise keep text), i.e. where the command
currently prints text for results replace the unconditional text output with a
conditional that serializes the result to JSON when args.format == "json" (use
json.dumps) and otherwise prints the existing text format; also apply the same
change to the analogous block referenced at the other location (around lines
640-649) so both refresh paths honor args.format.
- Line 4: Remove the top-level "import asyncio" and instead import run_sync from
slayer.async_utils, then replace the two calls to asyncio.run(...) in
slayer/cli.py with run_sync(...) (use run_sync from slayer.async_utils), passing
the coroutine via the named parameter (e.g., coro=YourAsyncEntry(...)) and
convert any positional arguments you were passing into keyword arguments on that
coroutine call; update both occurrences (the ones previously calling asyncio.run
at the two spots mentioned) to use run_sync and remove the now-unused asyncio
import.
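The `--format` fix from the first comment reduces to a small conditional at the output boundary; this sketch assumes a simple result dict, since the real `RefreshSamplesResult` envelope is a Pydantic model:

```python
import json

def emit_refresh_result(*, result: dict, fmt: str) -> str:
    # Honor the shared --format flag: serialize to JSON on request,
    # otherwise keep the existing human-readable text output.
    if fmt == "json":
        return json.dumps(result)
    return f"refreshed {result['refreshed']} columns, {result['skipped']} skipped"

print(emit_refresh_result(result={"refreshed": 12, "skipped": 3}, fmt="json"))
print(emit_refresh_result(result={"refreshed": 12, "skipped": 3}, fmt="text"))
```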
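The second comment's `run_sync` lives in `slayer.async_utils` and its real implementation is not shown here; a common shape for such a helper, sketched under that assumption, dispatches on whether a loop is already running:

```python
import asyncio
import concurrent.futures
from typing import Any, Coroutine

def run_sync(*, coro: Coroutine) -> Any:
    """Plain asyncio.run when no loop is running; otherwise run the
    coroutine on a fresh loop in a worker thread, so calls from Jupyter
    or async test runners don't hit 'asyncio.run() cannot be called from
    a running event loop'."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro)
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()

async def answer() -> int:
    return 42

print(run_sync(coro=answer()))
```

This is why the review asks for `run_sync(...)` over bare `asyncio.run(...)`: the latter raises immediately inside an already-running loop.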
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3dfe7cb2-60e0-4d19-a25e-431b59b443e5
📒 Files selected for processing (5)
slayer/cli.py, slayer/search/index.py, tests/test_embeddings_client.py, tests/test_embeddings_service.py, tests/test_search_three_channel.py
🚧 Files skipped from review as they are similar to previous changes (4)
- tests/test_embeddings_client.py
- slayer/search/index.py
- tests/test_search_three_channel.py
- tests/test_embeddings_service.py
…l failing" CI without OPENAI_API_KEY was producing per-entity refresh warnings that bubbled up to save_memory / ingest / edit_model responses, breaking three tests that assert empty warnings on the happy path.

The root cause: ``is_available()`` only checked that ``litellm`` could be imported, not that the configured embedding model had a usable key. With the extra installed but no key configured, ``embed_batch`` failed per-entity at the litellm boundary, and the write side surfaces those as warnings — which is correct for runtime failures but wrong for a "feature not configured" case.

Changes:
- ``is_available()`` now also calls ``litellm.validate_environment`` to confirm the active model has a key in the environment. Both "extra not installed" and "extra installed but no key" yield False. Unknown models trust the user and let the actual call surface any error.
- ``EmbeddingService.refresh_*`` returns an empty warning list when ``is_available()`` is False (was emitting an "extra not installed" warning that bubbled up). Genuine per-entity runtime failures still bubble up to the response per the spec.
- The search-side ``_run_channel_3`` warning now covers both cases in one message: "extra not installed or no API key configured".
- ``tests/conftest.py`` adds a session-wide autouse fixture that stubs ``is_available`` to ``False``. Tests that exercise the embedding path override it locally via their own fixtures. This stops unit tests from hitting the real OpenAI API on a developer machine that has ``OPENAI_API_KEY`` set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
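The ``is_available()`` gate described in that commit could be sketched as below. The ``litellm.validate_environment`` call and its ``keys_in_environment`` return key match litellm's public API, but the surrounding structure is an assumption about this project's code, not a copy of it:

```python
def is_available(model: str = "openai/text-embedding-3-small") -> bool:
    """Embedding channel is usable only if the litellm extra is installed
    AND the active model's provider key is present in the environment."""
    try:
        import litellm
    except ImportError:
        return False  # extra not installed
    try:
        env = litellm.validate_environment(model=model)
    except Exception:
        # Unknown model: trust the user and let the real call surface errors.
        return True
    return bool(env.get("keys_in_environment"))

print(is_available())
```

Both failure modes ("extra not installed" and "extra installed but no key") collapse to ``False``, which is what lets the refresh paths return an empty warning list instead of per-entity noise.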
Group A — "two-channel" docstring rot:
- .claude/skills/slayer-overview.md, CLAUDE.md, search/__init__.py, slayer_client.py, mcp/server.py: every doc/docstring that still says "two-channel" now says "three-channel" with the optional embedding caveat. CLAUDE.md's CLI synopsis also gains the missing --max-example-queries flag.
- docs/concepts/ingestion.md: clarify that SLAYER_EMBEDDING_MODEL is an optional override of the default, not a hard precondition; align the failure-mode language with the "channel disabled vs failing" split from round 2.

Group B — CLI hardening:
- slayer/cli.py: switch _run_search_refresh_samples / _run_search_query / _refresh_samples_async to keyword arguments per the project convention. New RefreshSamplesResult Pydantic envelope.
- slayer/cli.py: `slayer search refresh-samples --model foo` now tracks unresolved user-specified model names and exits non-zero with a clear error instead of silently reporting success.

Group C — edit_model refresh path correctness:
- slayer/mcp/server.py: new model_doc_changed flag fires the refresh hook for measure/aggregation/join edits too. Without this, those edits left the embedding-channel state stale even though the stored model changed. Sample-value refresh is still gated on filter/sql changes; the new flag triggers only the embedding subtree refresh.
- slayer/mcp/server.py: wrap handle_edit_refresh in try/except so a flaky embedding API can't fail the tool after a successful save. Captured errors land in response["warnings"] instead of being dropped silently.
- slayer/mcp/server.py inspect_model: persist measure-side (numeric / temporal) profile values to Column.sampled too, mirroring the dim-profile persist loop. Future inspect_model / search calls now hit the cache instead of re-running the live profile query.

Group D — code correctness:
- slayer/embeddings/client.py: the query-embedding cache now refreshes recency on a hit (pop + reinsert), so eviction is true LRU instead of FIFO. Frequently used keys no longer age out.
- tests/integration/test_integration.py: tighten test_search_question_finds_column to require an `.amount` column hit. Previously a `customers.region` hit also satisfied the assertion, masking relevance regressions.
- slayer/storage/v6_migration.py: @register_migration now uses kwargs to match the project convention (>1 param → keyword).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
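The pop-and-reinsert trick from Group D is the standard way to turn an insertion-ordered dict into a true LRU; a minimal sketch (class name and value types are illustrative, not the project's):

```python
from collections import OrderedDict
from typing import Optional

class QueryEmbeddingCache:
    """Tiny LRU cache: a hit refreshes recency via pop + reinsert, so
    eviction removes the least-recently-USED key, not the oldest-inserted."""

    def __init__(self, max_size: int = 256) -> None:
        self._max_size = max_size
        self._data: "OrderedDict[str, list[float]]" = OrderedDict()

    def get(self, key: str) -> Optional[list]:
        if key not in self._data:
            return None
        # Pop and reinsert: the key moves to the "most recent" end.
        value = self._data.pop(key)
        self._data[key] = value
        return value

    def put(self, key: str, value: list) -> None:
        if key in self._data:
            self._data.pop(key)
        self._data[key] = value
        if len(self._data) > self._max_size:
            self._data.popitem(last=False)  # evict least-recently-used

cache = QueryEmbeddingCache(max_size=2)
cache.put("a", [1.0])
cache.put("b", [2.0])
cache.get("a")        # refreshes "a", so "b" is now least recent
cache.put("c", [3.0])  # evicts "b", not "a"
print(cache.get("b"), cache.get("a"))
```

Without the refresh on `get`, the same sequence would evict `"a"` — that is the FIFO bug the commit fixes. (`OrderedDict.move_to_end` is an equivalent one-call alternative to pop + reinsert.)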
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@slayer/mcp/server.py`:
- Line 2628: Update the docstring summary line that currently reads "Two-channel
semantic search over memories + canonical entities." to accurately reflect the
implementation described in the body (three channels); specifically change that
first-line summary to mention "Three-channel semantic search" (or a short phrase
like "Three-channel semantic search over memories, canonical entities, and the
third channel described below") so the one-line summary matches the docstring
body for the function whose docstring begins with "Two-channel semantic search
over memories + canonical entities."
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a04e748b-defc-4366-94f8-e1e9a4d43553
📒 Files selected for processing (10)
.claude/skills/slayer-overview.md, CLAUDE.md, docs/concepts/ingestion.md, slayer/cli.py, slayer/client/slayer_client.py, slayer/embeddings/client.py, slayer/mcp/server.py, slayer/search/__init__.py, slayer/storage/v6_migration.py, tests/integration/test_integration.py
✅ Files skipped from review due to trivial changes (2)
- .claude/skills/slayer-overview.md
- docs/concepts/ingestion.md
🚧 Files skipped from review as they are similar to previous changes (4)
- slayer/storage/v6_migration.py
- slayer/embeddings/client.py
- tests/integration/test_integration.py
- slayer/cli.py
- slayer/cli.py: replace the two asyncio.run() call sites I added in rounds 1 and 3 with run_sync() from slayer.async_utils, matching every other async dispatch in this file. Required for CLI use from inside an already-running event loop (Jupyter, async test runners). The top-level `import asyncio` is now unused; dropped.
- slayer/cli.py: `slayer search refresh-samples --format json` now emits the RefreshSamplesResult envelope as JSON instead of silently falling back to the text path; text remains the default.
- slayer/mcp/server.py: fix the stale "Two-channel" first line of the search() tool's docstring — the body already describes all three channels.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `search`: dense embedding similarity via litellm, RRF-fused with the existing tantivy + entity-overlap BM25 channels. Entity hits — previously tantivy-only — are now RRF-fused across channels 2 and 3.
- Rows keyed by `(canonical_id, embedding_model_name)`. Refresh inline on `slayer ingest`, `edit_model`, and `save_memory`; SHA256 `content_hash` makes idempotent re-runs cheap.
- Optional `embedding_search` extra (litellm + numpy). When the extra is missing, no API key is configured, or there are no rows for the active model, the channel emits one warning and search degrades gracefully via tantivy + BM25.
- `delete_model` / `delete_datasource` cascade into the `StorageBackend` ABC so embedding cleanup is backend-agnostic; backends implement only the row primitives.
Configuration
- Default embedding model: `openai/text-embedding-3-small`. Override via `SLAYER_EMBEDDING_MODEL`. Provider credentials (`OPENAI_API_KEY`, etc.) are read by litellm directly.
- After switching models, run `slayer ingest` (or re-save memories) to populate the new model's rows.
Test plan
- `poetry run pytest -m "not integration"` — 2348 passed.
- `poetry run ruff check slayer/ tests/`.
- `poetry install -E embedding_search`, set `OPENAI_API_KEY`, `slayer datasources create demo --ingest`, `slayer memory save "..."`, then `slayer search --question "..."` — confirm channel 3 surfaces both memory and relevant entity.
- `poetry install` (no extra), same search — confirm the response has a warning about missing embedding support and tantivy + BM25 hits still surface.
- `SLAYER_EMBEDDING_MODEL=openai/text-embedding-3-large`, re-run search — confirm channel 3 emits a "no embedding rows" warning, then re-ingest and re-run to confirm hits return.

🤖 Generated with Claude Code
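The RRF fusion used across channels (the PR description pins k=60) follows the standard reciprocal-rank formula; a minimal sketch, not the project's `slayer/search/rrf.py`:

```python
def rrf_fuse(rankings: list, k: int = 60) -> list:
    """Reciprocal Rank Fusion: each channel's ranked list contributes
    1 / (k + rank) per document; documents are sorted by summed score,
    so agreement across channels beats a single high rank."""
    scores: dict = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

# A doc ranked first by two channels beats docs ranked by only one.
fused = rrf_fuse([["mem:churn", "col:amount"], ["mem:churn", "col:region"]])
print(fused)
```

The constant k dampens the gap between adjacent ranks (1/61 vs 1/62 at k=60), which is what makes RRF robust when the channels' raw scores — tantivy BM25 vs cosine similarity — live on incomparable scales.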