
Add embedding-based semantic search (third retrieval channel)#114

Merged
ZmeiGorynych merged 15 commits into main from egor/dev-1386-semantic-search-using-embeddings on May 12, 2026

Conversation

@ZmeiGorynych
Member

@ZmeiGorynych ZmeiGorynych commented May 11, 2026

Summary

  • Add a third retrieval channel to search: dense embedding similarity via litellm, RRF-fused with the existing tantivy + entity-overlap BM25 channels. Entity hits — previously tantivy-only — are now RRF-fused across channels 2 and 3.
  • Persist embeddings in a sidecar table keyed by (canonical_id, embedding_model_name). Refresh inline on slayer ingest, edit_model, and save_memory; SHA256 content_hash makes idempotent re-runs cheap.
  • Gate behind the optional embedding_search extra (litellm + numpy). When the extra is missing, no API key is configured, or there are no rows for the active model, the channel emits one warning and search degrades gracefully via tantivy + BM25.
  • Refactor delete_model / delete_datasource cascade into the StorageBackend ABC so embedding cleanup is backend-agnostic; backends implement only the row primitives.
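The graceful-degradation gating described above can be sketched as a module-level import guard plus a warn-once check; the flag and function names below are illustrative, not the PR's actual code:

```python
# Hypothetical sketch of gating an optional dependency: the embedding channel
# runs only when the extra's imports succeed, and callers get a single warning
# otherwise. Names are illustrative, not the PR's real module layout.
import warnings

try:
    import numpy  # noqa: F401  # part of the optional embedding_search extra
    _EMBEDDINGS_AVAILABLE = True
except ImportError:
    _EMBEDDINGS_AVAILABLE = False

_warned = False


def embedding_channel_enabled() -> bool:
    """True if the dense-embedding channel can run; warn exactly once if not."""
    global _warned
    if _EMBEDDINGS_AVAILABLE:
        return True
    if not _warned:
        warnings.warn(
            "embedding_search extra not installed; "
            "search degrades to tantivy + BM25"
        )
        _warned = True
    return False
```

The warn-once flag matters because the channel check runs on every search call; without it a missing extra would spam one warning per request.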

Configuration

  • Default model: openai/text-embedding-3-small. Override via SLAYER_EMBEDDING_MODEL. Provider credentials (OPENAI_API_KEY, etc.) read by litellm directly.
  • Switching the active model leaves prior rows inert; re-run slayer ingest (or re-save memories) to populate the new model's rows.
  • Per-entity embed failures (rate limits, transient network errors, bad keys) are non-fatal and surface as warnings.
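The model-resolution rule above (env override, else default) can be sketched in a few lines; the constant name and helper are hypothetical, while the env var name and default value are as documented in this PR:

```python
import os

# Default from the PR description; SLAYER_EMBEDDING_MODEL is the documented
# override. The function name here is illustrative.
DEFAULT_EMBEDDING_MODEL = "openai/text-embedding-3-small"


def active_embedding_model() -> str:
    """Resolve the active embedding model: env override, else the default."""
    return os.environ.get("SLAYER_EMBEDDING_MODEL", DEFAULT_EMBEDDING_MODEL)
```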

Test plan

  • Unit suite green: poetry run pytest -m "not integration" — 2348 passed.
  • Lint clean: poetry run ruff check slayer/ tests/.
  • Manual smoke (with extra): poetry install -E embedding_search, set OPENAI_API_KEY, slayer datasources create demo --ingest, slayer memory save "...", then slayer search --question "..." — confirm channel 3 surfaces both memory and relevant entity.
  • Manual smoke (without extra): fresh venv, poetry install (no extra), same search — confirm the response has a warning about missing embedding support and tantivy + BM25 hits still surface.
  • Manual smoke (model change): with the extra installed, set SLAYER_EMBEDDING_MODEL=openai/text-embedding-3-large, re-run search — confirm channel 3 emits a "no embedding rows" warning, then re-ingest and re-run to confirm hits return.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Three‑channel semantic search (BM25 + full‑text + optional embeddings) with fused ranking; new REST /search, CLI search (+ refresh-samples), MCP tool and client search API.
    • Per‑column sampled snapshots persisted and refreshable; embedding sidecar persistence with cascade-delete behavior.
  • Changes

    • Memory retrieval unified into search (separate recall surfaces removed); save memory now triggers best‑effort embedding refresh.
    • Schema version bump for models (v6) and related migration notes.
  • Tests

    • Extensive new unit and end‑to‑end tests for search, embeddings, sampled refresh, and migrations.

Review Change Stack

ZmeiGorynych and others added 10 commits May 10, 2026 18:02
Two-channel `search` tool that finds both memories and entities
(datasources / non-hidden models / non-hidden columns / named measures /
aggregations). Channel 1 reuses the entity-overlap BM25 path from
`recall_memories`; channel 2 builds a per-call in-memory tantivy index
over memories ∪ entities. Memory hits from both channels are fused via
RRF (k=60); entity hits surface their raw tantivy BM25 score.
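The RRF fusion step above follows the standard reciprocal-rank formula, score(d) = Σ 1/(k + rank_c(d)) over channels c, with k=60. A minimal self-contained sketch (the function name is illustrative; the PR's `slayer/search/rrf.py` may differ):

```python
from typing import Dict, Hashable, List


def rrf_fuse(rankings: List[List[Hashable]], k: int = 60) -> List[Hashable]:
    """Fuse several ranked lists with Reciprocal Rank Fusion.

    Each item's fused score is sum(1 / (k + rank)) over the channels in
    which it appears (rank is 1-based); an item seen by multiple channels
    therefore outranks a single-channel item at a similar position.
    """
    scores: Dict[Hashable, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)
```

For example, with channels `[["m1", "m2"], ["m2", "m3"]]`, `m2` wins: it collects 1/62 + 1/61, beating `m1`'s single 1/61.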

Surfaces: MCP `search`, REST `POST /search`, CLI
`slayer search [--entity ...] [--question ...] [--query ...]`, and
`SlayerClient.search`. Plus `slayer search refresh-samples` and the
in-process refresh wired through `slayer ingest` / `edit_model`.

Storage adds a v5→v6 schema bump (no-op forward) for a new optional
`Column.sampled: Optional[str]` cache, written through
`StorageBackend.update_column_sampled` on YAML / SQLite / JoinSync. The
sampled snapshot is the human-readable per-column profile string the
search index renders into each entity's text field; previously this was
recomputed by `inspect_model` on every call.

Tests cover the index builder, RRF fusion, render helpers, profiling,
service-level orchestration, storage write-back, end-to-end MCP / REST /
CLI / client surfaces (`tests/test_search_surfaces.py`), and a live-DB
integration suite that drives the real `ingest_datasource_idempotent`
path on SQLite (`tests/integration/test_integration.py::test_search_*`).
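The `Column.sampled` cache described above is a read-through pattern: serve the stored profile string when present, otherwise compute once and write back. A sketch under assumed shapes (dict column, callback write-back) rather than the real `StorageBackend.update_column_sampled` API:

```python
from typing import Callable, Dict


def get_sampled_profile(
    column: Dict,
    compute_profile: Callable[[Dict], str],
    write_back: Callable[[Dict], None],
) -> str:
    """Read-through cache sketch for the per-column sampled snapshot.

    Serve the cached profile string when present; otherwise compute it once
    and persist it, so the next inspect-style call is served entirely from
    the cache. The dict shape and callbacks are illustrative.
    """
    cached = column.get("sampled")
    if cached is not None:
        return cached
    profile = compute_profile(column)
    column["sampled"] = profile
    write_back(column)
    return profile
```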

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* `slayer/client/slayer_client.py`: defer `slayer.search.service` import to
  the `search()` method. The service transitively imports `tantivy`, which
  isn't in the client extras (`httpx + pandas`); remote-only installs that
  never call `.search()` should not blow up at module-import time.
* `slayer/engine/ingestion.py`: wrap the post-ingest sample-value refresh
  in `try/except Exception` so a raise inside `refresh_all_table_backed_sampled`
  is captured as an `IngestionError` rather than killing the whole
  idempotent ingest call.
* `slayer/mcp/server.py`: skip the live `_collect_measure_profile` query
  when every column was served from the `Column.sampled` cache. A fully
  warmed cache no longer hits the database on `inspect_model`.
* `slayer/search/index.py`: filter hidden models out of the datasource
  doc input. Without this, a hidden model's name leaked into the
  parent datasource doc and surfaced via tantivy queries.
* `slayer/search/service.py`: hoist the `list_memories(entities=None)`
  fetch above the channel branches so both channels share one corpus
  scan instead of two.
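The deferred-import fix in the first bullet above keeps heavy transitive dependencies (like `tantivy`) out of the client's module import path; a minimal sketch with hypothetical class and module names:

```python
from importlib import import_module


class RemoteOnlyClientSketch:
    """Sketch of deferring a heavy import to the call site.

    Importing this module (or constructing the client) never touches the
    search-only dependencies; only callers of `search()` pay for them.
    Class and module names are illustrative, not the PR's actual code.
    """

    def search(self, question: str):
        # Deferred: a remote-only install without the search extras fails
        # here, at call time, rather than at module-import time.
        service_module = import_module("slayer.search.service")
        return service_module.SearchService().search(question=question)
```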

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…exity threshold

* `slayer/cli.py` — split monolithic `_run_search` (cognitive complexity
  42) into a 4-line dispatcher plus `_run_search_refresh_samples`,
  `_refresh_samples_async`, `_run_search_query`, and a
  `_print_search_response_text` helper. Hoist the function-local
  `import asyncio`, `slayer.engine.profiling.refresh_*`,
  `SlayerQueryEngine`, and `SearchService` imports to module top per
  the project's "imports at the top" rule.
* `slayer/search/service.py` — split `SearchService.search` (cognitive
  complexity 54) into `_resolve_inputs`, `_recency_fallback`,
  `_run_channel_1`, `_run_channel_2`, plus a module-level `_fuse_memory_hits`
  helper. The orchestrator function is now a 60-line linear walk.
* `slayer/engine/profiling.py` — extract `_profile_categorical_column`
  and `_profile_numeric_temporal_columns` from `_collect_dim_profile`
  (cognitive complexity 26).
* `slayer/search/index.py` — extract `_add_datasource_docs`,
  `_add_model_subtree_docs`, and `_add_memory_docs` from
  `build_in_memory_index` (cognitive complexity 18).

No behaviour changes — every existing test passes unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* `tests/test_search_render.py` — fix vacuous `or` assertion: the test
  for "no sampled section when sampled is None" passed regardless of
  what render_column_text emitted because `or` short-circuited. Replaced
  with two `not in text` assertions (the actual contract).
* `tests/test_search_service.py` — replace tautological assertion:
  `assert any(h.id in mem_ids for h in response.memories)` was always
  true when `response.memories` was non-empty (mem_ids was constructed
  from the same list). Replaced with a content check on memory text.
* `tests/test_v6_migration.py` — the two storage round-trip tests were
  saving a current-version `SlayerModel(...)` which already has
  `version=6`, so `get_model` never had to run the v5→v6 migration on
  load. Now seed each backend with a raw v5 dict (YAML directly on
  disk; SQL `INSERT INTO models` on the SQLite backend) and assert
  `loaded.version == 6` and `sampled is None` per column.
* `tests/test_engine_profiling.py` — drop unused
  `asyncio.get_event_loop_policy().new_event_loop()` (allocated but
  never installed or closed); hoist `import asyncio` and `import sqlite3`
  to module top per the project's import-placement rule.
* `tests/test_search_surfaces.py` — `code, out = ...` → `code, _ = ...`
  in the refresh-samples-help test (Sonar S1481 / CodeRabbit nit).
* `CLAUDE.md` — add `--query` to the compact `slayer search` CLI mention
  (was missing — `--question` and `--entity` were listed but not
  `--query`).
* `docs/concepts/models.md` — update the field-table default and the
  schema-version narrative for `SlayerModel` from `3` to `6` (DEV-1361
  bumped to 5, DEV-1375 bumped to 6); `SlayerQuery` stays at 3 and
  `DatasourceConfig` stays at 1.
* `docs/concepts/search.md` — annotate the bare ` ``` ` fence containing
  the RRF formula with `text` to satisfy markdownlint MD040.
* `slayer/search/render.py` — fix the module docstring's claim that the
  model renderer mentions hidden columns: `render_model_text` actually
  filters via `visible_columns`, so a hidden column never appears in
  any indexed text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…findings

* `tests/test_search_writeback.py:182,183` — Sonar S125 (commented-out
  code) flagged the `# sampled=None` annotations next to two Column
  definitions. The annotations document the test's contract (these
  columns intentionally have no cached profile so the writeback path
  fires). Suppress with NOSONAR(S125).
* `tests/test_search_surfaces.py:173` — Sonar S5754 (reraise to stop
  the application) flagged the `except SystemExit` branch in
  `_run_cli`. The test deliberately captures the exit code rather than
  reraising — that's how it asserts on `slayer search --help` exit
  status. Suppress with NOSONAR(S5754).
* `tests/test_search_surfaces.py:53` — Sonar S3776 (cognitive
  complexity 18→15) on the `_call_mcp_tool` test helper. The helper
  branches over three FastMCP result shapes (tuple / list / object);
  splitting it would hurt readability for an 18-vs-15 overshoot.
  Suppress with NOSONAR(S3776).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* `slayer/search/service.py` — split the round-1 `_run_channel_2`
  helper (cognitive complexity 18 after the Group B refactor) into
  `_run_channel_2` + two module-level helpers `_split_tantivy_hits`
  and `_backfill_memory_by_id`. Brings `_run_channel_2` under 15.
* `tests/test_search_writeback.py` — tighten the refresh assertion
  per CodeRabbit: was `errors == [] or all(isinstance(e, str) for e
  in errors)` (vacuous when refresh emits any string error); now
  `errors == []` with a diagnostic message. Also hoist the
  function-local `from slayer.engine.profiling import
  refresh_all_table_backed_sampled` to module top.
* `tests/test_v6_migration.py` — hoist `from slayer.core.models
  import DatasourceConfig` to module top per the import-placement
  rule. Add `# NOSONAR(S7493)` on the sync `open(v5_path, "w")` —
  test seeds a tempfile and mirrors the YAMLStorage sync-in-async
  pattern; aiofiles is overkill for test setup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* `slayer/search/service.py::MemoryHit` — fix docstring/code mismatch:
  ``score`` is *always* the RRF-fused value (the code routes single-
  channel results through ``rrf_fuse`` too, so the previous "raw score
  when only one channel contributed" claim was misleading). Document the
  formula and that the value is comparable across channels.
* `slayer/search/service.py` — convert positional args to keyword form
  on three multi-parameter helper calls per the project's "use keyword
  arguments for functions with more than 1 parameter" rule:
  ``resolve_entity(raw=...)``, ``extract_entities_from_query(query=...)``,
  ``bm25_rank(memories=..., query_entities=...)``,
  ``_backfill_memory_by_id(memory_by_id=..., all_memories=..., mem_ids=...)``.
  ``_backfill_memory_by_id`` is now keyword-only at its signature, since
  every call site supplies kwargs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Channel 1 of search already wraps the same BM25 ranker recall_memories
used, so the two retrieval surfaces were redundant. Hard-remove
recall_memories (no deprecation shim — too few users) and reshape
search so query-bearing memories surface in their own example_queries
bucket with an independent max_example_queries=2 cap, preventing bulky
example queries from crowding out small learning-only notes. Also adds
resolved_input_entities to SearchResponse for diagnostics.

Removed surfaces: MCP recall_memories tool, REST POST /memories/recall +
RecallMemoriesRequest, CLI `slayer memory recall` subcommand,
SlayerClient.recall_memories, RecallHit/RecallResponse Pydantic models,
MemoryService.recall_memories + helpers.

Docs (CLAUDE.md, docs/concepts/{memories,search}.md, slayer-overview
skill) and tests (drop recall test classes, extend search tests for the
new shape) updated accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Group A — `slayer/search/service.py`: extract `_build_memory_hit` helper
to bring `_fuse_memory_hits` cognitive complexity from 22 to under the
Sonar S3776 threshold of 15. The helper centralises the
learning-vs-example-query partition + matched_entities + text-fallback
logic and is reused by `_recency_fallback` for DRYness.

Group B — `slayer/cli.py`: wrap `_run_search_query`'s `asyncio.run` in a
try/except for `EntityResolutionError` / `AmbiguousModelError` /
`ValueError`, matching the friendly `_exit_with_error` path used by the
other memory subcommands. Move the three inline `slayer.core.errors`
imports in `cli.py` to module top (user's import-placement rule).
Prefix `slayer search` help-text examples with `poetry run` per repo
convention.

Group C — `tests/test_search_service.py`: tighten two assertions.
Channel-2 entity-hit isinstance check now also asserts the list is
non-empty (was vacuous on empty list). RRF outranking test now finds
BOTH the dual-channel and single-channel memory indices and asserts
the dual-channel one ranks ahead — the previous assertion only checked
that the dual-channel memory exists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A new dense-embedding channel runs alongside tantivy and entity-overlap
BM25; memory rankings are now RRF-fused across all three channels, and
entity hits — previously tantivy-only with raw scores — are RRF-fused
across channels 2 and 3. The channel is gated behind the optional
`embedding_search` extra (litellm + numpy); when the extra is not
installed, no API key is configured, or no embedding rows exist for the
active model, search degrades gracefully via tantivy + BM25 and emits a
single warning.

Embeddings are persisted in a sidecar table keyed by
`(canonical_id, embedding_model_name)`. Refresh is inline on the same
write-side edges as `Column.sampled`: `slayer ingest`, `edit_model`,
and `save_memory`. A SHA256 `content_hash` on each row makes idempotent
re-runs cheap. The default model is `openai/text-embedding-3-small`,
overridable via `SLAYER_EMBEDDING_MODEL`; provider credentials are read
by litellm directly. Per-entity embed failures are non-fatal.
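The content-hash skip that makes re-runs cheap can be sketched as follows; the helper names are illustrative, but the hashing scheme (SHA256 of the rendered text) is as described above:

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """SHA256 of the rendered entity/memory text, stored on each row."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_reembed(stored_hash: Optional[str], text: str) -> bool:
    """True when the (paid) embedding API call cannot be skipped.

    On re-ingest, rows whose rendered text is unchanged hash to the same
    value, so an idempotent re-run costs only hashing, not API calls.
    """
    return stored_hash != content_hash(text)
```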

Storage logic refactored to keep cascade-delete in the ABC: backends
now implement `_delete_model_row` / `_delete_datasource_row` and the
ABC wrappers handle embedding-row cascade by canonical-id prefix.
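The cascade refactor above is a template-method pattern: the ABC owns the cascade ordering, and concrete backends implement only the row primitives. A toy sketch (method names follow the description; bodies and the prefix convention are illustrative):

```python
from abc import ABC, abstractmethod


class StorageBackendSketch(ABC):
    """Template-method sketch of backend-agnostic cascade delete."""

    @abstractmethod
    def _delete_model_row(self, canonical_id: str) -> None: ...

    @abstractmethod
    def _delete_embedding_rows_by_prefix(self, prefix: str) -> None: ...

    def delete_model(self, canonical_id: str) -> None:
        # Backend primitive first, then the shared embedding cascade: the
        # model doc and its children (columns, measures, aggregations) all
        # share the model's canonical-id prefix.
        self._delete_model_row(canonical_id)
        self._delete_embedding_rows_by_prefix(canonical_id)


class InMemoryBackend(StorageBackendSketch):
    """Toy backend showing that only the two primitives need implementing."""

    def __init__(self) -> None:
        self.models = {"ds.orders"}
        self.embeddings = {"ds.orders", "ds.orders.total", "ds.customers"}

    def _delete_model_row(self, canonical_id: str) -> None:
        self.models.discard(canonical_id)

    def _delete_embedding_rows_by_prefix(self, prefix: str) -> None:
        self.embeddings = {e for e in self.embeddings if not e.startswith(prefix)}
```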

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a unified semantic search surface (BM25 + tantivy + optional embeddings fused via RRF), per-column sampled profiling and writeback, an embeddings sidecar with persistence and cascade deletes, ingestion/edit refresh hooks, storage ABC/backend changes, new API/CLI/MCP/Client search surfaces, v6 schema bump, removal of recall_memories, and extensive tests and docs.

Changes

Semantic search, embeddings, and sampled cache

Layer / File(s) Summary
Data & migrations
slayer/core/models.py, slayer/storage/migrations.py, slayer/storage/v6_migration.py, pyproject.toml
Add Column.sampled, bump SlayerModel default version to 6, register v5→v6 no-op migrator and Embedding entity; add tantivy and optional litellm/numpy extras and embedding_search.
Embeddings subsystem
slayer/embeddings/models.py, slayer/embeddings/client.py, slayer/embeddings/ranker.py, slayer/embeddings/service.py, slayer/embeddings/__init__.py
Persisted Embedding model; lazy litellm-based async embed client with query cache; NumPy top-k cosine ranker; EmbeddingService for refresh/read with content-hash skipping and best-effort warnings; limited re-exports.
Storage & backends
slayer/storage/base.py, slayer/storage/sqlite_storage.py, slayer/storage/yaml_storage.py, slayer/storage/join_sync.py
Extend StorageBackend ABC with embedding sidecar methods and update_column_sampled; implement embedding tables/files, CRUD, prefix deletion; cascade deletes for models/datasources/memories; delegate changes in JoinSync.
Search orchestration
slayer/search/render.py, slayer/search/index.py, slayer/search/rrf.py, slayer/search/service.py, slayer/search/__init__.py
Renderers for entities/memories, in-memory Tantivy corpus/index, RRF fusion, SearchService implementing BM25 memory ranker, tantivy full-text, optional embedding similarity, fused SearchResponse (memories / example_queries / entities).
Engine profiling & ingestion
slayer/engine/profiling.py, slayer/engine/ingestion.py
Centralized profiling helpers for Column.sampled, refresh helpers and edit/ingest hooks to backfill/persist sampled values and invoke embedding refresh best-effort.
Surfaces & clients
slayer/api/server.py, slayer/cli.py, slayer/client/slayer_client.py, slayer/mcp/server.py
Add POST /search endpoint and SearchRequest, slayer search CLI + refresh-samples, SlayerClient.search, MCP search tool; remove /memories/recall, CLI memory recall, and recall_memories client; update help/docs wiring.
Memories write-side
slayer/memories/*
Narrow memory package to write-only surfaces (save_memory, forget_memory), remove recall models/APIs, and perform best-effort embedding refresh during saves.
Docs & config
docs/**, CLAUDE.md, .claude/skills/slayer-overview.md
Document semantic search design, channels, fusion rules, embedding gating and failure modes, sampled cache semantics and refresh triggers, migration notes, and CLI usage.
Tests
tests/**
Add/extend tests: embeddings client/service/storage, search render/index/rrf/service, profiling/writeback, surfaces (API/CLI/MCP/client), migrations (v6), sampled persistence; remove/adjust recall-related tests.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant API as FastAPI /search
  participant SearchService
  participant Storage as StorageBackend
  participant Embeddings as EmbeddingService
  Client->>API: POST /search {entities, query, question}
  API->>SearchService: search(...)
  SearchService->>Storage: load models, memories, corpus metadata
  alt question present
    SearchService->>SearchService: build in-memory tantivy corpus & run full-text
  end
  alt embeddings enabled and question present
    SearchService->>Embeddings: fetch_corpus()
    SearchService->>Embeddings: embed_question(question)
    Embeddings->>Storage: read/write embedding rows (refresh)
    SearchService->>SearchService: compute cosine/top-k + fuse via RRF
  end
  SearchService-->>API: SearchResponse (memories, example_queries, entities, warnings)
  API-->>Client: JSON

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Suggested reviewers

  • whimo
  • AivanF

Poem

I burrow through indices with fluffy feet,
Fuse ranks with a twitch—RRF makes results neat.
I sample columns, stash snapshots so sweet,
Embed tiny vectors where memories meet.
Recall naps—now Search hops in to greet! 🐇



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 13

🧹 Nitpick comments (8)
slayer/storage/v6_migration.py (1)

15-15: ⚡ Quick win

Use keyword arguments in migration registration decorator.

Line 15 uses two positional arguments; please switch to named arguments for consistency with repository Python style rules.

As per coding guidelines **/*.py: Use keyword arguments for functions with more than 1 parameter.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/storage/v6_migration.py` at line 15, The decorator usage
`@register_migration`("SlayerModel", 5) should use keyword arguments per style
rules; update the decorator call to use named parameters (e.g.,
register_migration(name="SlayerModel", version=5)) so the migration registration
is explicit and consistent with functions that take multiple parameters.
slayer/embeddings/client.py (1)

111-111: ⚡ Quick win

Use keyword args for the multi-parameter internal call.

Line 111 should pass texts explicitly for consistency with repo conventions.

💡 Suggested fix
-    result = await embed_batch([text], model=resolved_model)
+    result = await embed_batch(texts=[text], model=resolved_model)

As per coding guidelines **/*.py: Use keyword arguments for functions with more than 1 parameter.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/embeddings/client.py` at line 111, Call embed_batch using keyword
arguments to follow the repo convention: change the call that currently uses
positional args to pass texts explicitly (e.g., use texts=[text],
model=resolved_model). Update the invocation of embed_batch in
slayer/embeddings/client.py (the line with result = await embed_batch([text],
model=resolved_model)) so it uses the texts= keyword while keeping
model=resolved_model and the same variables (text, resolved_model).
tests/test_embeddings_storage.py (1)

32-38: ⚡ Quick win

Include JoinSyncStorage in this matrix too.

These tests only exercise the raw YAML/SQLite backends, but this PR also adds embedding-method delegation in JoinSyncStorage. Adding a wrapped-backend case here would catch a broken passthrough on the adapter path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_embeddings_storage.py` around lines 32 - 38, The parametrized
storage fixture currently yields only YAMLStorage and SQLiteStorage; add cases
that yield JoinSyncStorage wrappers to exercise delegation. Update the fixture
(storage) to include wrapped variants (e.g., for YAML and SQLite) or extend
params to include "join_yaml"/"join_sqlite", and when those values are seen,
construct JoinSyncStorage(inner=YAMLStorage(...)) and
JoinSyncStorage(inner=SQLiteStorage(...)) so tests exercise JoinSyncStorage's
passthrough behavior alongside YAMLStorage and SQLiteStorage.
tests/test_search_index.py (1)

151-163: 💤 Low value

Weak assertions don't verify the documented expectation.

The comment states "Both groups should be non-empty" but the assertions only check isinstance(..., list), which is always true. Consider asserting len(memory_hits) > 0 and len(entity_hits) > 0 to match the documented expectation.

Suggested fix
-    # Both groups should be non-empty for "customer" since memory 2 mentions
-    # "customers" and entities include customers/customer_id.
-    assert isinstance(memory_hits, list)  # type sanity
-    assert isinstance(entity_hits, list)
+    # Both groups should be non-empty for "customer" since memory 2 mentions
+    # "customers" and entities include customers/customer_id.
+    assert len(memory_hits) > 0, "expected at least one memory hit"
+    assert len(entity_hits) > 0, "expected at least one entity hit"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_search_index.py` around lines 151 - 163, The test
test_kind_field_split_for_memory_vs_entity documents that both memory_hits and
entity_hits should be non-empty but only asserts their types; update the
assertions in this test to assert len(memory_hits) > 0 and len(entity_hits) > 0
(instead of isinstance checks) so the behavior of search_index /
build_in_memory_index is actually verified for the "customer" query.
slayer/engine/profiling.py (1)

87-96: 💤 Low value

Minor: redundant or None in data_source handling.

On line 94, model.data_source or None is redundant since data_source is already Optional[str]. If it's None, the expression evaluates to None; if it's an empty string, converting to None might be intentional for disambiguation.

If the intent is to treat empty string as None, this is fine. Otherwise, it can be simplified.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/engine/profiling.py` around lines 87 - 96, The use of
"model.data_source or None" when calling engine.execute is redundant because
model.data_source is already Optional[str]; update the call in the try block
(the SlayerQuery.model_validate / engine.execute usage) to pass
model.data_source directly (i.e., remove the "or None" coalesce) unless you
explicitly want empty-string values converted to None—in that case add a short
comment clarifying that behavior.
slayer/search/service.py (1)

597-608: 💤 Low value

Accessing private method _list_all_model_identities outside storage layer.

The _collect_index_corpus method accesses storage._list_all_model_identities(), which is a private method (prefixed with _). While this works since SearchService is part of the same codebase, it couples the service to storage internals.

Consider whether a public method like list_all_models() or list_model_identities() would be more appropriate for the storage ABC, or document this as an intentional internal coupling.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/search/service.py` around lines 597 - 608, _collect_index_corpus is
calling the private storage method _list_all_model_identities which couples
SearchService to storage internals; replace that direct private call with a
public storage API (e.g., add and use list_model_identities or list_all_models
on the storage ABC) and update implementations accordingly so
_collect_index_corpus calls storage.list_model_identities() (keep using
storage.list_datasources() and storage.get_model(name, data_source=ds)
unchanged); alternatively, if this coupling is intentional, add a clear
comment/docstring on SearchService._collect_index_corpus noting the deliberate
use of the private API.
slayer/search/index.py (1)

121-169: 💤 Low value

Consider extracting entity-type renderers to reduce cognitive complexity.

SonarCloud flagged cognitive complexity of 16 (limit 15). While the function is readable as-is, extracting the inner loops for columns/measures/aggregations into helper functions would address this and improve testability.

However, given the function's straightforward iteration pattern and the minor threshold violation, this is low priority.

♻️ Optional refactor sketch
def _collect_model_children_pairs(
    model: SlayerModel,
    model_canonical: str,
) -> List[Tuple[str, str, str]]:
    """Extract column/measure/aggregation pairs for a single model."""
    out: List[Tuple[str, str, str]] = []
    for column in model.columns:
        if column.hidden:
            continue
        out.append((
            f"{model_canonical}.{column.name}", "column",
            render_column_text(model=model, column=column),
        ))
    for measure in model.measures:
        if measure.name is None:
            continue
        out.append((
            f"{model_canonical}.{measure.name}", "measure",
            render_measure_text(model=model, measure=measure),
        ))
    for aggregation in model.aggregations:
        out.append((
            f"{model_canonical}.{aggregation.name}", "aggregation",
            render_aggregation_text(model=model, aggregation=aggregation),
        ))
    return out
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/search/index.py` around lines 121 - 169, The function
_collect_render_pairs exceeds SonarCloud's cognitive complexity threshold due to
the inlined loops for model children; extract the column/measure/aggregation
handling into a helper (e.g., _collect_model_children_pairs) that accepts a
SlayerModel and its canonical id and returns List[Tuple[str, str, str]]; replace
the three inner loops inside _collect_render_pairs with a single call to that
helper and ensure the helper applies the same filters (skip hidden columns and
measures with name None) and uses render_column_text, render_measure_text, and
render_aggregation_text so behavior remains identical.
tests/test_search_writeback.py (1)

170-198: 💤 Low value

Unused monkeypatch fixture parameter.

The monkeypatch fixture is declared in the function signature but never used in this test. Consider removing it to avoid confusion.

♻️ Proposed fix
 `@pytest.mark.asyncio`
 async def test_inspect_model_writes_back_on_sampled_miss(
-    sqlite_table_setup, monkeypatch,
+    sqlite_table_setup,
 ) -> None:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_search_writeback.py` around lines 170 - 198, The test function
test_inspect_model_writes_back_on_sampled_miss declares an unused monkeypatch
fixture parameter; remove monkeypatch from the function signature (or use it if
there was an intended patch) so the test signature becomes async def
test_inspect_model_writes_back_on_sampled_miss(sqlite_table_setup) -> None: and
update any call sites if needed; ensure no other references to monkeypatch exist
in this function.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.claude/skills/slayer-overview.md:
- Line 35: Update the search description that currently calls it "two-channel"
to document three retrieval channels: entity-overlap BM25 over memories, tantivy
full-text over memories ∪ entities, and dense embedding similarity (optional)
over memories ∪ entities; keep the note that the query-bearing memories are
partitioned into `example_queries` and reference the optional embedding behavior
consistent with docs/concepts/search.md so the `search` overview matches the
full design.

In `@CLAUDE.md`:
- Around line 115-116: Update the CLAUDE.md synopsis for the slayer search CLI
to match the semantic-search docs by changing "two-channel" to "three-channel"
and adding the new flag to the usage string and description: include
--max-example-queries in the usage for `slayer search [--entity ENT ...]
[--query JSON_OR_@FILE] [--question TEXT] [--max-memories N] [--max-entities N]
[--max-example-queries N] [--format json|text]` and mention the third channel
(example queries) in the short description for the `slayer search` command so it
aligns with the semantic-search section.

In `@docs/concepts/ingestion.md`:
- Line 160: Update the sentence describing embedding refresh to clarify that
setting SLAYER_EMBEDDING_MODEL is optional: when the embedding_search extra is
installed and either a SLAYER_EMBEDDING_MODEL is configured or the runtime
default embedding model (e.g., openai/text-embedding-3-small) is used, the
embedding refresh will re-run for the datasource doc plus every visible model
and its visible children; mention SLAYER_EMBEDDING_MODEL as an optional override
so readers understand the default behavior and that re-ingests remain cheap due
to the SHA256 content_hash on each row.
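The cheap-re-ingest claim above rests on the content-hash idempotency described in the PR summary: each embedding row stores a SHA256 of the rendered entity text, and refresh skips rows whose text is unchanged. A minimal sketch of that check (helper names here are illustrative, not the repo's actual functions):

```python
import hashlib
from typing import Optional

def content_hash(text: str) -> str:
    """Stable SHA256 hex digest of the rendered entity text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def needs_refresh(stored_hash: Optional[str], text: str) -> bool:
    """Re-embed only when the rendered text actually changed."""
    return stored_hash != content_hash(text)
```

On a no-op re-ingest every row's stored hash matches, so no embedding API calls are made.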

In `@slayer/cli.py`:
- Around line 605-607: Calls to helper functions _run_search_refresh_samples,
_run_search_query, and _refresh_samples_async currently pass multiple positional
arguments; update each call to use keyword arguments (e.g.,
_run_search_refresh_samples(args=..., storage=...), _run_search_query(args=...,
storage=...), _refresh_samples_async(args=..., storage=...)) so the parameter
order is explicit and follows the repository Python guideline for functions with
more than one parameter.
- Around line 624-626: The loop that calls storage.get_model(model_name,
data_source=ds) currently continues silently when m is None, which lets typos in
a user-specified --model filter go unnoticed; change this by tracking unresolved
model names (e.g., collect model_name when storage.get_model returns None) and
after the loop, if the user provided --model and any unresolved names exist,
surface them to the user (print/log a clear error or warning and exit non-zero)
instead of proceeding to a “completed successfully” message; update the code
paths that reference model_name, m, storage.get_model and the --model handling
to fail-fast or fail loudly when requested models are missing.
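The fail-fast behavior this comment asks for can be sketched as follows; `storage.get_model` mirrors the call named in the comment, while `resolve_models` and the fake storage are hypothetical stand-ins:

```python
def resolve_models(storage, model_names, data_source):
    """Resolve user-specified model names, failing loudly on typos."""
    resolved, unresolved = [], []
    for name in model_names:
        m = storage.get_model(name, data_source=data_source)
        if m is None:
            unresolved.append(name)  # track instead of `continue` silently
        else:
            resolved.append(m)
    if unresolved:
        # surface the typo and exit non-zero instead of reporting success
        raise SystemExit(f"Unknown model(s): {', '.join(sorted(unresolved))}")
    return resolved
```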

In `@slayer/client/slayer_client.py`:
- Line 253: Update the stale docstring that currently reads "Two-channel
semantic search over memories + canonical entities." in slayer_client.py to
reflect the new optional third channel (embeddings) added by this PR: change
wording to indicate "Two- or three-channel semantic search" and enumerate the
channels (memories, canonical entities, and optional embeddings when the
embeddings channel/parameter is enabled), and mention any relevant parameter
name that enables the third channel so the docstring matches actual retrieval
behavior.

In `@slayer/embeddings/client.py`:
- Around line 108-120: The cache implementation using _QUERY_CACHE and
_QUERY_CACHE_MAX currently behaves FIFO because hits don't update recency;
modify the hit path in the function that looks up key (before returning cached)
to refresh recency by moving the key to the most-recent position (e.g., remove
and reinsert the key or use collections.OrderedDict and call move_to_end) so the
eviction logic that pops next(iter(_QUERY_CACHE)) effectively implements LRU;
keep the existing embed_batch, vec extraction, and eviction code but ensure hits
update the cache order before returning.

In `@slayer/mcp/server.py`:
- Around line 1892-1896: The current gate only marks column upserts and a few
source/filter changes, so pure measure/aggregation/join edits never set
model_level_change and thus skip handle_edit_refresh in edit_model; update the
edit-path to treat non-column semantic edits as model-level changes by setting
model_level_change = True (or otherwise marking the change) when encountering
edits to measures, aggregations, joins or similar non-column metadata, and
ensure changed_columns and model_level_change are passed into the existing
post-save/refresh logic (handle_edit_refresh) so the semantic-search/index
refresh path runs for these edits (refer to changed_columns, model_level_change,
handle_edit_refresh, and edit_model to locate where to set the flag and
propagate it).
- Around line 2574-2594: The docstring for the search function is out of
date—update the Slayer MCP `search` docstring to describe the new optional
embedding retrieval channel and how its results are fused with the existing
channels; explicitly mention that search now has a third (optional) embeddings
channel, when it runs, that fused ranking uses Reciprocal Rank Fusion (k=60)
across channels (and that entity hits remain channel-2 only), and keep the
existing notes about partitioning query-bearing memories into example_queries vs
memories (SlayerQuery) and capping behavior; reference the `search` function,
the embedding retrieval path, Reciprocal Rank Fusion (k=60), and
`SlayerQuery`/`example_queries` so readers can find the related logic.
- Around line 2128-2139: The post-save call to handle_edit_refresh should be
made truly best-effort: wrap the await handle_edit_refresh(...) call in a
try/except so any exception is caught and does not abort after the model is
persisted, log a warning including the exception details and context (engine,
saved_model.name, data_source), and capture the return value (e.g., errors =
await handle_edit_refresh(...)); if that return contains non-empty errors, log
them as warnings and append them to the MCP response warnings payload (or to a
warnings list on the current save result/context) so the API response surfaces
these refresh warnings instead of dropping them.
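The best-effort pattern this comment describes can be sketched like so; `refresh_best_effort` is a hypothetical wrapper, and the response shape is illustrative:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

async def refresh_best_effort(handle_edit_refresh, response: dict, **ctx) -> None:
    """Run the post-save refresh without letting it abort a persisted save."""
    try:
        errors = await handle_edit_refresh(**ctx)
    except Exception as exc:  # the model is already saved; warn, don't raise
        logger.warning("edit refresh failed (%s): %s", ctx, exc)
        response.setdefault("warnings", []).append(f"refresh failed: {exc}")
        return
    for err in errors or []:  # surface partial-failure warnings too
        logger.warning("edit refresh warning: %s", err)
        response.setdefault("warnings", []).append(err)
```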
- Around line 1258-1279: The code collects measure profiles via
_collect_measure_profile(model=model, engine=engine) into measure_profile but
never persists them; update the logic after measure_profile is assigned to
iterate its entries and call storage.update_column_sampled(...) for each
measured column (similar to the loop that persists dim profile values),
formatting values with the same _format_dim_profile_value or a suitable
formatter and catching exceptions to log warnings; reference
_collect_measure_profile, measure_profile, storage.update_column_sampled,
_format_dim_profile_value and Column.sampled so the persisted sampled values are
saved for numeric/date/time columns.

In `@slayer/search/__init__.py`:
- Around line 4-7: Update the package docstring in slayer/search/__init__.py so
it accurately describes SearchService as a three-channel orchestrator
(entity-overlap BM25 over memories, tantivy full-text over the unioned corpus,
plus an optional embedding-based retrieval channel) and mention that results are
fused (e.g., via RRF) and that the embedding channel is conditional; update any
brief usage notes to reflect the optional embeddings flag and fusion behavior.
Use the symbol SearchService to locate the orchestrator and edit the top-level
module docstring accordingly.

In `@tests/integration/test_integration.py`:
- Around line 3135-3143: The assertion currently allows unrelated hits like
"test_sqlite.customers.region"; tighten it so the test requires at least one
actual amount column. In the block that computes column_hits and expected,
remove "test_sqlite.customers.region" from expected and assert that there exists
a hit with id equal to "test_sqlite.orders.amount" or that endswith(".amount")
(i.e., replace the current combined any(...) or any(...endswith(".amount"))
check with a single assertion requiring any(h.id.endswith(".amount") or h.id ==
"test_sqlite.orders.amount") against column_hits) so only actual amount columns
satisfy the test.

---

Nitpick comments:
In `@slayer/embeddings/client.py`:
- Line 111: Call embed_batch using keyword arguments to follow the repo
convention: change the call that currently uses positional args to pass texts
explicitly (e.g., use texts=[text], model=resolved_model). Update the invocation
of embed_batch in slayer/embeddings/client.py (the line with result = await
embed_batch([text], model=resolved_model)) so it uses the texts= keyword while
keeping model=resolved_model and the same variables (text, resolved_model).

In `@slayer/engine/profiling.py`:
- Around line 87-96: The use of "model.data_source or None" when calling
engine.execute is redundant because model.data_source is already Optional[str];
update the call in the try block (the SlayerQuery.model_validate /
engine.execute usage) to pass model.data_source directly (i.e., remove the "or
None" coalesce) unless you explicitly want empty-string values converted to
None—in that case add a short comment clarifying that behavior.

In `@slayer/search/index.py`:
- Around line 121-169: The function _collect_render_pairs exceeds SonarCloud's
cognitive complexity threshold due to the inlined loops for model children;
extract the column/measure/aggregation handling into a helper (e.g.,
_collect_model_children_pairs) that accepts a SlayerModel and its canonical id
and returns List[Tuple[str, str, str]]; replace the three inner loops inside
_collect_render_pairs with a single call to that helper and ensure the helper
applies the same filters (skip hidden columns and measures with name None) and
uses render_column_text, render_measure_text, and render_aggregation_text so
behavior remains identical.

In `@slayer/search/service.py`:
- Around line 597-608: _collect_index_corpus is calling the private storage
method _list_all_model_identities which couples SearchService to storage
internals; replace that direct private call with a public storage API (e.g., add
and use list_model_identities or list_all_models on the storage ABC) and update
implementations accordingly so _collect_index_corpus calls
storage.list_model_identities() (keep using storage.list_datasources() and
storage.get_model(name, data_source=ds) unchanged); alternatively, if this
coupling is intentional, add a clear comment/docstring on
SearchService._collect_index_corpus noting the deliberate use of the private
API.

In `@slayer/storage/v6_migration.py`:
- Line 15: The decorator usage `@register_migration`("SlayerModel", 5) should use
keyword arguments per style rules; update the decorator call to use named
parameters (e.g., register_migration(name="SlayerModel", version=5)) so the
migration registration is explicit and consistent with functions that take
multiple parameters.

In `@tests/test_embeddings_storage.py`:
- Around line 32-38: The parametrized storage fixture currently yields only
YAMLStorage and SQLiteStorage; add cases that yield JoinSyncStorage wrappers to
exercise delegation. Update the fixture (storage) to include wrapped variants
(e.g., for YAML and SQLite) or extend params to include
"join_yaml"/"join_sqlite", and when those values are seen, construct
JoinSyncStorage(inner=YAMLStorage(...)) and
JoinSyncStorage(inner=SQLiteStorage(...)) so tests exercise JoinSyncStorage's
passthrough behavior alongside YAMLStorage and SQLiteStorage.
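The parametrization being suggested could look like the sketch below. The storage class names come from the comment, but the stub classes and `build_storage` helper are illustrative; the real constructors take paths/connections:

```python
import pytest

# Stand-ins for the real backends; real constructors take paths/connections.
class YAMLStorage: ...
class SQLiteStorage: ...

class JoinSyncStorage:
    """Wrapper that delegates every storage call to ``inner``."""
    def __init__(self, inner):
        self.inner = inner

def build_storage(param: str):
    """Map a fixture param name to a plain or JoinSyncStorage-wrapped backend."""
    inner = YAMLStorage() if "yaml" in param else SQLiteStorage()
    return JoinSyncStorage(inner=inner) if param.startswith("join_") else inner

@pytest.fixture(params=["yaml", "sqlite", "join_yaml", "join_sqlite"])
def storage(request):
    yield build_storage(request.param)
```

Every test that takes `storage` then runs four times, exercising the wrapper's delegation for free.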

In `@tests/test_search_index.py`:
- Around line 151-163: The test test_kind_field_split_for_memory_vs_entity
documents that both memory_hits and entity_hits should be non-empty but only
asserts their types; update the assertions in this test to assert
len(memory_hits) > 0 and len(entity_hits) > 0 (instead of isinstance checks) so
the behavior of search_index / build_in_memory_index is actually verified for
the "customer" query.

In `@tests/test_search_writeback.py`:
- Around line 170-198: The test function
test_inspect_model_writes_back_on_sampled_miss declares an unused monkeypatch
fixture parameter; remove monkeypatch from the function signature (or use it if
there was an intended patch) so the test signature becomes async def
test_inspect_model_writes_back_on_sampled_miss(sqlite_table_setup) -> None: and
update any call sites if needed; ensure no other references to monkeypatch exist
in this function.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 877745c5-8486-4efa-a343-3b3508cfd27e

📥 Commits

Reviewing files that changed from the base of the PR and between 013fca7 and 2dfadb7.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (61)
  • .claude/skills/slayer-overview.md
  • CLAUDE.md
  • docs/concepts/ingestion.md
  • docs/concepts/memories.md
  • docs/concepts/models.md
  • docs/concepts/queries.md
  • docs/concepts/references.md
  • docs/concepts/schema-drift.md
  • docs/concepts/search.md
  • docs/configuration/storage.md
  • pyproject.toml
  • slayer/api/server.py
  • slayer/cli.py
  • slayer/client/slayer_client.py
  • slayer/core/models.py
  • slayer/embeddings/__init__.py
  • slayer/embeddings/client.py
  • slayer/embeddings/models.py
  • slayer/embeddings/ranker.py
  • slayer/embeddings/service.py
  • slayer/engine/ingestion.py
  • slayer/engine/profiling.py
  • slayer/mcp/server.py
  • slayer/memories/__init__.py
  • slayer/memories/models.py
  • slayer/memories/ranker.py
  • slayer/memories/service.py
  • slayer/search/__init__.py
  • slayer/search/index.py
  • slayer/search/render.py
  • slayer/search/rrf.py
  • slayer/search/service.py
  • slayer/storage/base.py
  • slayer/storage/join_sync.py
  • slayer/storage/migrations.py
  • slayer/storage/sqlite_storage.py
  • slayer/storage/v6_migration.py
  • slayer/storage/yaml_storage.py
  • tests/integration/test_integration.py
  • tests/test_embeddings_client.py
  • tests/test_embeddings_service.py
  • tests/test_embeddings_storage.py
  • tests/test_engine_profiling.py
  • tests/test_memories_cli.py
  • tests/test_memories_client.py
  • tests/test_memories_inspect.py
  • tests/test_memories_mcp.py
  • tests/test_memories_rest.py
  • tests/test_memories_storage.py
  • tests/test_migrations.py
  • tests/test_search_index.py
  • tests/test_search_render.py
  • tests/test_search_rrf.py
  • tests/test_search_service.py
  • tests/test_search_surfaces.py
  • tests/test_search_three_channel.py
  • tests/test_search_writeback.py
  • tests/test_storage_sampled.py
  • tests/test_storage_type_refinement.py
  • tests/test_v4_migration.py
  • tests/test_v6_migration.py

…c-search-using-embeddings

# Conflicts:
#	tests/integration/test_integration.py

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
CLAUDE.md (1)

115-116: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix CLI synopsis to reflect the third channel and complete flag set.

This line still says "two-channel" and omits --max-example-queries from the usage string, creating inconsistency with the semantic-search section above (line 94) which describes three channels and with line 96 which lists the flag.

📝 Proposed documentation fix
-- `slayer search [--entity ENT ...] [--query JSON_OR_@FILE] [--question TEXT] [--max-memories N] [--max-entities N] [--format json|text]` (DEV-1375) — two-channel semantic search over memories + canonical entities. See [docs/concepts/search.md](docs/concepts/search.md).
+- `slayer search [--entity ENT ...] [--query JSON_OR_@FILE] [--question TEXT] [--max-memories N] [--max-example-queries N] [--max-entities N] [--format json|text]` (DEV-1375/DEV-1386) — up to three-channel semantic search over memories + canonical entities (BM25 + tantivy + optional embeddings). See [docs/concepts/search.md](docs/concepts/search.md).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CLAUDE.md` around lines 115 - 116, Update the CLI synopsis string in the
`slayer search` usage line to reflect the third channel and include the missing
flag: change "two-channel semantic search over memories + canonical entities" to
"three-channel semantic search over memories + canonical entities + examples"
and add the `--max-example-queries N` (or `--max-example-queries` form used
elsewhere) into the flags list for the `slayer search [--entity ...] [--query
...] [--question ...] [--max-memories N] [--max-entities N] [--format
json|text]` synopsis so it matches the semantic-search section and the flags
listed elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@CLAUDE.md`:
- Around line 115-116: Update the CLI synopsis string in the `slayer search`
usage line to reflect the third channel and include the missing flag: change
"two-channel semantic search over memories + canonical entities" to
"three-channel semantic search over memories + canonical entities + examples"
and add the `--max-example-queries N` (or `--max-example-queries` form used
elsewhere) into the flags list for the `slayer search [--entity ...] [--query
...] [--question ...] [--max-memories N] [--max-entities N] [--format
json|text]` synopsis so it matches the semantic-search section and the flags
listed elsewhere.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d608308b-cf0a-4405-b0d3-7cd5ea8f5967

📥 Commits

Reviewing files that changed from the base of the PR and between 2dfadb7 and eccfa88.

📒 Files selected for processing (5)
  • CLAUDE.md
  • docs/concepts/queries.md
  • docs/concepts/references.md
  • pyproject.toml
  • tests/integration/test_integration.py
✅ Files skipped from review due to trivial changes (2)
  • docs/concepts/queries.md
  • docs/concepts/references.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • pyproject.toml
  • tests/integration/test_integration.py

Group A — valid fixes:
- slayer/search/index.py: extract _render_model_subtree_pairs to drop
  _collect_render_pairs cognitive complexity from 16 to ~9 (S3776).
- tests/test_embeddings_client.py: remove the dead `fail` stub +
  trivially-true `called == []` assertion in the empty-input test
  (S5603 / S7503).
- slayer/cli.py: collapse chained startswith into a tuple form (S8513,
  pre-existing).

Group B — NOSONAR suppressions for false-positive S7503 on async stubs
that deliberately match production async signatures (litellm.aembedding,
our embed_batch, our embed_query). Each suppression includes the rule
key + reason per the project convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@slayer/cli.py`:
- Around line 523-526: The refresh-samples command is ignoring the shared
--format flag; update the refresh-samples handler to check args.format and emit
JSON when args.format == "json" (otherwise keep text), i.e. where the command
currently prints text for results replace the unconditional text output with a
conditional that serializes the result to JSON when args.format == "json" (use
json.dumps) and otherwise prints the existing text format; also apply the same
change to the analogous block referenced at the other location (around lines
640-649) so both refresh paths honor args.format.
- Line 4: Remove the top-level "import asyncio" and instead import run_sync from
slayer.async_utils, then replace the two calls to asyncio.run(...) in
slayer/cli.py with run_sync(...) (use run_sync from slayer.async_utils), passing
the coroutine via the named parameter (e.g., coro=YourAsyncEntry(...)) and
convert any positional arguments you were passing into keyword arguments on that
coroutine call; update both occurrences (the ones previously calling asyncio.run
at the two spots mentioned) to use run_sync and remove the now-unused asyncio
import.
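A plausible shape for such a `run_sync` helper (the repo's actual `slayer.async_utils.run_sync` may differ; this sketch shows why it beats bare `asyncio.run` inside a live loop):

```python
import asyncio
import concurrent.futures

def run_sync(coro):
    """Run a coroutine to completion from synchronous code.

    Unlike asyncio.run(), this also works when an event loop is already
    running (Jupyter, async test runners): it dispatches asyncio.run to
    a worker thread, which creates its own fresh loop.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro)  # no running loop: plain asyncio.run is fine
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```

Calling `asyncio.run` directly from a running loop raises `RuntimeError: asyncio.run() cannot be called from a running event loop`, which is the failure mode this swap avoids.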

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3dfe7cb2-60e0-4d19-a25e-431b59b443e5

📥 Commits

Reviewing files that changed from the base of the PR and between eccfa88 and a1faf7d.

📒 Files selected for processing (5)
  • slayer/cli.py
  • slayer/search/index.py
  • tests/test_embeddings_client.py
  • tests/test_embeddings_service.py
  • tests/test_search_three_channel.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • tests/test_embeddings_client.py
  • slayer/search/index.py
  • tests/test_search_three_channel.py
  • tests/test_embeddings_service.py

ZmeiGorynych and others added 2 commits May 11, 2026 14:01
…l failing"

CI without OPENAI_API_KEY was producing per-entity refresh warnings that
bubbled up to save_memory / ingest / edit_model responses, breaking
three tests that assert empty warnings on the happy path.

The root cause: ``is_available()`` only checked that ``litellm`` could
be imported, not that the configured embedding model had a usable key.
With the extra installed but no key configured, ``embed_batch`` failed
per-entity at the litellm boundary, and the write-side surfaces those
as warnings — which is correct for runtime failures but wrong for a
"feature not configured" case.

Changes:

- ``is_available()`` now also calls ``litellm.validate_environment`` to
  confirm the active model has a key in the environment. Both
  "extra not installed" and "extra installed but no key" yield False.
  Unknown models trust the user and let the actual call surface any
  error.
- ``EmbeddingService.refresh_*`` returns an empty warning list when
  ``is_available()`` is False (was emitting an "extra not installed"
  warning that bubbled up). Genuine per-entity runtime failures
  still bubble up to the response per the spec.
- The search-side ``_run_channel_3`` warning now covers both cases in
  one message: "extra not installed or no API key configured".
- ``tests/conftest.py`` adds a session-wide autouse fixture that stubs
  ``is_available`` to ``False``. Tests that exercise the embedding
  path override it locally via their own fixtures. This stops unit
  tests from hitting the real OpenAI API on a developer machine that
  has ``OPENAI_API_KEY`` set.
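The availability gate described above can be sketched as follows. `litellm.validate_environment` reports whether the provider's key variables are set for a given model; the exact return shape (`keys_in_environment`) is per litellm's documentation and should be treated as an assumption here:

```python
def is_available(model: str = "openai/text-embedding-3-small") -> bool:
    """True only when litellm is importable AND the model has a usable key.

    Both failure modes return False: "extra not installed" and "extra
    installed but no key". Unknown models trust the user (return True)
    and let the actual embedding call surface any error.
    """
    try:
        import litellm  # the optional embedding_search extra
    except ImportError:
        return False
    try:
        # Shape of the returned dict is per litellm's docs (an assumption).
        env = litellm.validate_environment(model=model)
    except Exception:
        return True  # unrecognized model: defer to the real call
    return bool(env.get("keys_in_environment"))
```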

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Group A — "two-channel" docstring rot:
- .claude/skills/slayer-overview.md, CLAUDE.md, search/__init__.py,
  slayer_client.py, mcp/server.py: every doc/docstring that still says
  "two-channel" now says "three-channel" with the optional embedding
  caveat. CLAUDE.md's CLI synopsis also gains the missing
  --max-example-queries flag.
- docs/concepts/ingestion.md: clarify that SLAYER_EMBEDDING_MODEL is
  an optional override of the default, not a hard precondition; align
  the failure-mode language with the "channel disabled vs failing"
  split from round 2.

Group B — CLI hardening:
- slayer/cli.py: switch _run_search_refresh_samples / _run_search_query
  / _refresh_samples_async to keyword arguments per the project
  convention. New RefreshSamplesResult Pydantic envelope.
- slayer/cli.py: `slayer search refresh-samples --model foo` now
  tracks unresolved user-specified model names and exits non-zero
  with a clear error instead of silently reporting success.

Group C — edit_model refresh path correctness:
- slayer/mcp/server.py: new model_doc_changed flag fires the refresh
  hook for measure/aggregation/join edits too. Without this, those
  edits left the embedding-channel state stale even though the stored
  model changed. Sample-value refresh still gated on filter/sql
  changes; the new flag triggers only the embedding subtree refresh.
- slayer/mcp/server.py: wrap handle_edit_refresh in try/except so a
  flaky embedding API can't fail the tool after a successful save.
  Captured errors land in response["warnings"] instead of being
  dropped silently.
- slayer/mcp/server.py inspect_model: persist measure-side (numeric /
  temporal) profile values to Column.sampled too, mirroring the
  dim-profile persist loop. Future inspect_model / search calls now
  hit the cache instead of re-running the live profile query.

Group D — code correctness:
- slayer/embeddings/client.py: the query-embedding cache now refreshes
  recency on a hit (pop + reinsert), so eviction is true LRU instead
  of FIFO. Frequently-used keys no longer age out.
- tests/integration/test_integration.py: tighten
  test_search_question_finds_column to require an `.amount` column
  hit. Previously a `customers.region` hit also satisfied the
  assertion, masking relevance regressions.
- slayer/storage/v6_migration.py: @register_migration now uses kwargs
  to match the project convention (>1 param → keyword).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@slayer/mcp/server.py`:
- Line 2628: Update the docstring summary line that currently reads "Two-channel
semantic search over memories + canonical entities." to accurately reflect the
implementation described in the body (three channels); specifically change that
first-line summary to mention "Three-channel semantic search" (or a short phrase
like "Three-channel semantic search over memories, canonical entities, and the
third channel described below") so the one-line summary matches the docstring
body for the function whose docstring begins with "Two-channel semantic search
over memories + canonical entities."

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a04e748b-defc-4366-94f8-e1e9a4d43553

📥 Commits

Reviewing files that changed from the base of the PR and between ca8d67f and dfa47f0.

📒 Files selected for processing (10)
  • .claude/skills/slayer-overview.md
  • CLAUDE.md
  • docs/concepts/ingestion.md
  • slayer/cli.py
  • slayer/client/slayer_client.py
  • slayer/embeddings/client.py
  • slayer/mcp/server.py
  • slayer/search/__init__.py
  • slayer/storage/v6_migration.py
  • tests/integration/test_integration.py
✅ Files skipped from review due to trivial changes (2)
  • .claude/skills/slayer-overview.md
  • docs/concepts/ingestion.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • slayer/storage/v6_migration.py
  • slayer/embeddings/client.py
  • tests/integration/test_integration.py
  • slayer/cli.py

- slayer/cli.py: replace the two asyncio.run() call sites I added in
  rounds 1 and 3 with run_sync() from slayer.async_utils, matching
  every other async dispatch in this file. Required for CLI use from
  inside an already-running event loop (Jupyter, async test runners).
  The top-level `import asyncio` is now unused; dropped.
- slayer/cli.py: `slayer search refresh-samples --format json` now
  emits the RefreshSamplesResult envelope as JSON instead of silently
  falling back to the text path. text remains the default.
- slayer/mcp/server.py: fix the stale "Two-channel" first line of the
  search() tool's docstring — the body already describes all three
  channels.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sonarqubecloud

@ZmeiGorynych ZmeiGorynych merged commit 5b0c2ac into main May 12, 2026
4 checks passed
