Egor/dev 1265 improve model introspection and help tool in mcp server#36
Conversation
Introduce a conceptual-help surface that complements the schema-focused tool docstrings: the ten things an agent needs to know that aren't in a parameter description — colon syntax, the self-join vs window-function transform trade-off, cross-model measure sub-queries, the three meanings of "last", auto-routing of filters to WHERE/HAVING/post-filter, etc.

Topics: queries, formulas, aggregations, transforms, time, filters, joins, models, extending, workflow. Content lives as one .md per topic in `slayer/help/topics/` and is discovered dynamically at module load — adding a new topic is just dropping a new `NN_name.md` file in, with no Python changes. The `NN_` prefix controls teaching order.

MCP: new `help(topic=None)` tool. The topic list is embedded in the tool description so agents see it in the tool manifest without calling. The FastMCP instructions string now points at `help()` upfront.

CLI: new `slayer help [TOPIC]` subcommand mirroring the MCP behaviour; argparse's built-in `-h`/`--help` is unchanged.

Tests statically validate every fenced `json` snippet in every help body — `SlayerQuery.model_validate` for structure, `parse_formula` for field formulas, and the engine's `extract_filter_transforms` + `parse_filter` pipeline for filter strings. This caught two real doc-drift issues (colon syntax in `order.column`, missing transform names in the filter pipeline) that are fixed in the help content here.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds a file-backed, topic-driven conceptual help system surfaced via CLI (`slayer help [TOPIC]`) and the MCP server (`help(topic=None)` tool).
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant MCP_Server
    participant Storage
    participant Engine
    participant Database
    Client->>MCP_Server: call inspect_model(model_name, num_rows?, format?)
    MCP_Server->>Storage: load model metadata (model_name)
    MCP_Server->>Engine: build sample query args (measures, agg choices)
    Engine->>Database: execute sample & profile queries
    Database-->>Engine: rows / errors
    Engine-->>MCP_Server: sample rows, dim profiles, row_count
    MCP_Server->>MCP_Server: render markdown or JSON (tables, pruned columns)
    MCP_Server-->>Client: return rendered document or JSON
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 7
🧹 Nitpick comments (2)
slayer/cli.py (1)
249-249: Move new `slayer.help` imports to module top-level. The new local imports work, but they violate the repo Python import placement rule.
Suggested refactor
```diff
 import argparse
 import os
 import sys

 from slayer.async_utils import run_sync
+from slayer.help import TOPIC_SUMMARY_LINE, render_help
@@
-    from slayer.help import TOPIC_SUMMARY_LINE
-
     help_parser = subparsers.add_parser(
@@
 def _run_help(args):
-    from slayer.help import render_help
-
     print(render_help(topic=args.topic))
```

As per coding guidelines: "**/*.py: Place imports at the top of files".

Also applies to: 292-292
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/cli.py` at line 249, The new local imports from slayer.help (e.g. TOPIC_SUMMARY_LINE) should be moved to the module top-level alongside the other imports instead of being imported inside functions; locate the places where TOPIC_SUMMARY_LINE (and any other slayer.help names added) are imported locally and remove those in-function imports, add a single top-level "from slayer.help import TOPIC_SUMMARY_LINE" with the other imports, and update any references to use that top-level symbol.

slayer/help/__init__.py (1)
66-73: Guard against silent topic-key collisions during discovery. If two files normalize to the same key, later entries overwrite earlier ones silently. Please fail fast on duplicate derived keys (and duplicate intro) to avoid hidden docs regressions.
Suggested hardening
```diff
         key = _strip_numeric_prefix(stem)
         body = entry.read_text(encoding="utf-8").rstrip() + "\n"
         if key == INTRO_KEY:
+            if intro_body:
+                raise ValueError("Duplicate intro topic detected in slayer/help/topics")
             intro_body = body
         else:
+            if key in bodies:
+                raise ValueError(f"Duplicate help topic key detected: '{key}'")
             bodies[key] = body
             order.append(key)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/help/__init__.py` around lines 66 - 73, The discovery loop currently lets later files overwrite earlier ones when _strip_numeric_prefix(stem) yields duplicate keys (and also allows duplicate INTRO_KEY to be overwritten); modify the logic in the block that computes key and body (the code using _strip_numeric_prefix, INTRO_KEY, intro_body, bodies, order) to detect duplicates and fail fast: if key == INTRO_KEY and intro_body is already set, raise an error (e.g., ValueError) naming the duplicate entry; if key != INTRO_KEY and key already exists in bodies, raise an error indicating the duplicate derived key and the conflicting filename(s) instead of overwriting; ensure the raised error includes the stem or entry path so callers can locate the offending files.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/api_gaps.md`:
- Around line 17-25: The feature matrix in docs/api_gaps.md incorrectly flags
CLI support for several actions; update the table rows for "Create datasource",
"Delete datasource", "Test datasource connection", "Ingest: include specific
tables", and "Ingest: exclude specific tables" to mark CLI as supported (change
the CLI column from — to Y) and adjust any notes (e.g., keep "Test datasource
connection" note that it's included in `describe_datasource`, and preserve the
REST-only note for `exclude_tables` if still applicable) so the table accurately
reflects existing CLI commands/flags.
In `@slayer/help/topics/03_aggregations.md`:
- Around line 64-74: The documentation is inconsistent about the
custom-aggregation placeholder: the prose says `{expr}` but the example YAML
uses `{value}`; pick one and make both match. Update the narrative sentence that
currently mentions `{expr}` and/or the `formula:` line in the `aggregations`
snippet for `trimmed_mean` so they both use the same placeholder (e.g., change
`{value}` to `{expr}` throughout or vice versa) and ensure the `params:` section
still references the correct parameter names (`lo`, `hi`) accordingly.
In `@slayer/help/topics/05_time.md`:
- Around line 59-62: The documentation incorrectly states that
main_time_dimension is always highest priority; update the bullet list to
reflect actual behavior: when a query has a single entry in time_dimensions that
dimension is used automatically for time-based functions; when the query has 2+
time_dimensions, main_time_dimension is used to disambiguate (or the model's
default_time_dimension if that default is present among the query's
time_dimensions); if the query has no time_dimensions at all, fall back to the
model's default_time_dimension. Reference the symbols main_time_dimension,
time_dimensions, and default_time_dimension in the revised bullets.
In `@slayer/help/topics/10_workflow.md`:
- Around line 8-11: Add language identifiers (e.g., "text") to the fenced code
blocks that currently start with numbered lists such as the block containing "1.
datasource_summary() ... 2. inspect_model(...)" and the other blocks that start
with "1. create_datasource(...)" and "1. create_datasource(...,
auto_ingest=false) ..." (also update the similar blocks at the other occurrences
noted). Replace each opening triple backtick (```) with a language-tagged
backtick fence (e.g., ```text) so all fenced code blocks have a language
identifier and markdownlint MD040 is satisfied.
- Around line 30-35: The example for create_datasource uses the wrong parameter
name; change the call to pass driver="postgres" instead of type="postgres" so it
matches the MCP signature (update the create_datasource(...) example in this
file), and scan for other occurrences of type="..." in this document to replace
with driver="..." where the MCP create_datasource() or similar functions are
shown (ensure the parameter name matches the function signature).
In `@slayer/mcp/server.py`:
- Around line 574-589: The public function inspect_model currently returns
markdown and dropped the show_sql parameter, breaking existing JSON consumers;
restore backward compatibility by adding a format parameter (e.g., format: str =
"markdown" with allowed "json"|"markdown") and accept the deprecated show_sql
argument (e.g., show_sql: Optional[bool] = None) in inspect_model, implementing
logic that: if show_sql is provided, map it to format="json" for compatibility
(while logging/deprecating), otherwise honor format; keep the existing markdown
generation path and add a JSON serialization branch that produces the prior
machine-readable payload shape (reuse the same internal helpers that build model
metadata, dimensions, measures, joins, profiles, and sample rows) so callers
requesting "json" or passing show_sql continue to work.
- Around line 109-142: _build_sample_query_args currently picks any allowed
aggregation or falls back to "avg", which can pick aggregations that require
extra args or context; change it to only select zero-argument, context-free
aggs. Define a safe set of aggs (e.g.
{"avg","sum","min","max","count","count_distinct"}), and when processing each
measure use these rules: if m.allowed_aggregations is not None, pick the first
entry from m.allowed_aggregations that is in the safe set (skip the measure if
none match); if m.allowed_aggregations is None, consult dim_types: if
dim_types.get(m.name) is one of ("string","boolean","date","time") choose
"count_distinct", otherwise pick the first available numeric-safe agg from
["avg","sum","min","max"] (skip if none available); append only when an agg was
chosen to fields. This targets the logic inside _build_sample_query_args,
referencing model.measures, m.allowed_aggregations, dim_types and fields.
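A minimal sketch of the back-compat shim requested for `inspect_model`, assuming the parameter names from the comment above (the payload-building step is a stand-in, not the real server internals):

```python
import json
import warnings
from typing import Optional


def inspect_model(model_name: str, format: str = "markdown",
                  show_sql: Optional[bool] = None) -> str:
    """Render model introspection as markdown (default) or JSON."""
    if show_sql is not None:
        # Deprecated flag: old JSON consumers passed show_sql, so map it
        # to the machine-readable path instead of breaking them.
        warnings.warn("show_sql is deprecated; use format='json'",
                      DeprecationWarning, stacklevel=2)
        format = "json"
    payload = {"model": model_name}  # stand-in for metadata/dims/measures
    if format == "json":
        return json.dumps(payload)
    return f"# Model: {model_name}\n"
```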
---
Nitpick comments:
In `@slayer/cli.py`:
- Line 249: The new local imports from slayer.help (e.g. TOPIC_SUMMARY_LINE)
should be moved to the module top-level alongside the other imports instead of
being imported inside functions; locate the places where TOPIC_SUMMARY_LINE (and
any other slayer.help names added) are imported locally and remove those
in-function imports, add a single top-level "from slayer.help import
TOPIC_SUMMARY_LINE" with the other imports, and update any references to use
that top-level symbol.
In `@slayer/help/__init__.py`:
- Around line 66-73: The discovery loop currently lets later files overwrite
earlier ones when _strip_numeric_prefix(stem) yields duplicate keys (and also
allows duplicate INTRO_KEY to be overwritten); modify the logic in the block
that computes key and body (the code using _strip_numeric_prefix, INTRO_KEY,
intro_body, bodies, order) to detect duplicates and fail fast: if key ==
INTRO_KEY and intro_body is already set, raise an error (e.g., ValueError)
naming the duplicate entry; if key != INTRO_KEY and key already exists in
bodies, raise an error indicating the duplicate derived key and the conflicting
filename(s) instead of overwriting; ensure the raised error includes the stem or
entry path so callers can locate the offending files.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7a81759e-2561-4e55-8c8b-d239dbcd1027
📒 Files selected for processing (23)
- docs/api_gaps.md
- docs/interfaces/cli.md
- docs/interfaces/mcp.md
- docs/reference/mcp.md
- slayer/cli.py
- slayer/core/enums.py
- slayer/engine/enrichment.py
- slayer/help/__init__.py
- slayer/help/topics/00_intro.md
- slayer/help/topics/01_queries.md
- slayer/help/topics/02_formulas.md
- slayer/help/topics/03_aggregations.md
- slayer/help/topics/04_transforms.md
- slayer/help/topics/05_time.md
- slayer/help/topics/06_filters.md
- slayer/help/topics/07_joins.md
- slayer/help/topics/08_models.md
- slayer/help/topics/09_extending.md
- slayer/help/topics/10_workflow.md
- slayer/mcp/server.py
- tests/integration/test_mcp_inspect.py
- tests/test_help.py
- tests/test_mcp_server.py
describe_datasource absorbs list_tables — new `list_tables: bool = True` and `schema_name: str = ""` args append the table listing to the existing connection probe, so one call shows connection health + schemas + tables. The standalone `list_tables` MCP tool is removed.

datasource_summary() is replaced by models_summary(datasource_name). The old catalogue-everything JSON is gone; the new tool returns a brief markdown overview of just the requested datasource's non-hidden models (name, description, dim/measure name+description tables, list of joined model names) — much lighter than inspect_model and a better picking tool.

inspect_model folds the per-dimension data profile into the Dimensions table as a new `sampled` column (distinct values for string/boolean, `min .. max` for number/date/time, `> 20 distinct` for overflow). The standalone "Dimension profile (sampled)" section is removed.

_markdown_table auto-prunes columns whose cells are all empty/None and collapses to a comma-separated backticked list when only one column survives. On an auto-ingested order_items this shrinks the Measures table from 6 columns to 3 and turns the models_summary dim/measure sections into dense name lists rather than tables of em-dashes.

Docs + help topics + notebook intro/summary updated everywhere the old tool names or profile section appeared; 08_mcp_introspect notebook newly tracked with its current executed output reflecting all four changes. 677 tests pass; ruff clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
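The column-pruning behaviour described above can be sketched like this (a simplified stand-in for the real `_markdown_table`, which also escapes cell contents):

```python
def markdown_table(headers, rows):
    """Render a markdown table, pruning columns whose cells are all
    empty/None; collapse to a backticked name list when only one
    column survives."""
    kept = [i for i in range(len(headers))
            if any(row[i] not in (None, "") for row in rows)]
    if len(kept) == 1:
        col = kept[0]
        return ", ".join(f"`{row[col]}`" for row in rows)
    head = "| " + " | ".join(headers[i] for i in kept) + " |"
    sep = "| " + " | ".join("---" for _ in kept) + " |"
    body = ["| " + " | ".join(
        str(row[i]) if row[i] not in (None, "") else "" for i in kept
    ) + " |" for row in rows]
    return "\n".join([head, sep] + body)
```

With every description cell empty, a two-column table collapses to a dense `` `a`, `b` `` list — the "tables of em-dashes" fix mentioned in the commit.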
♻️ Duplicate comments (1)
slayer/mcp/server.py (1)
163-196: ⚠️ Potential issue | 🟡 Minor

Sample aggregation selection improved but still has edge cases. The implementation now correctly handles string/boolean/date/time types by falling back to `count_distinct`. However, when `allowed_aggregations` is restricted and `avg` is not in the list, the code picks `allowed[0]`, which could be a custom aggregation requiring parameters (like `weighted_avg`), causing the sample query to fail. Consider filtering to only safe zero-arg aggregations:
Suggested improvement
```diff
+SAFE_SAMPLE_AGGS = {"avg", "sum", "count", "count_distinct", "min", "max"}
+
 def _build_sample_query_args(model: SlayerModel, num_rows: int) -> Dict[str, Any]:
     ...
     for m in model.measures:
         if m.hidden:
             continue
         allowed = m.allowed_aggregations
         if allowed is not None and "avg" not in allowed:
             if not allowed:
                 continue
-            agg = allowed[0]
+            safe = [a for a in allowed if a in SAFE_SAMPLE_AGGS]
+            if not safe:
+                continue
+            agg = safe[0]
         else:
             ...
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/mcp/server.py` around lines 163 - 196, The sample-aggregation logic in _build_sample_query_args can pick a parameterized custom aggregation from m.allowed_aggregations (e.g., "weighted_avg") which will break the generated query; update the branch that handles a restricted allowed list to first filter m.allowed_aggregations to a whitelist of safe zero-argument aggs (e.g. {"count","count_distinct","sum","min","max","avg"}) and then pick the first entry from that filtered list, and if none remain skip the measure; keep the existing avg-vs-count_distinct type guard for the avg case in the other branch.
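The selection rule the comment asks for boils down to something like this (a sketch; the helper name `pick_sample_agg` is illustrative, not the server's actual function):

```python
SAFE_SAMPLE_AGGS = ["avg", "sum", "min", "max", "count", "count_distinct"]
CATEGORICAL_TYPES = {"string", "boolean", "date", "time"}


def pick_sample_agg(allowed, dim_type):
    """Choose a zero-argument aggregation for sample queries.

    Returns None when the measure should be skipped because only
    parameterized custom aggregations (e.g. weighted_avg) are allowed.
    """
    if allowed is not None:
        # Restricted list: take the first entry that is known safe.
        for agg in allowed:
            if agg in SAFE_SAMPLE_AGGS:
                return agg
        return None  # nothing safe in the restricted list
    if dim_type in CATEGORICAL_TYPES:
        return "count_distinct"
    return "avg"
```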
🧹 Nitpick comments (3)
tests/test_mcp_server.py (1)
315-318: Verify expected behavior for CRLF escaping. The test expects `\r\n` to become `"  "` (double space) because `\r` and `\n` are each replaced with a space independently. This appears intentional based on the `_escape_md_cell` implementation (line 109 in server.py: `.replace("\r", " ").replace("\n", " ")`), but the comment could clarify that this produces a double space for CRLF sequences.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_mcp_server.py` around lines 315 - 318, The test's expectation that "\r\n" becomes two spaces should be clarified: update the _escape_md_cell implementation's comment (near the .replace("\r", " ").replace("\n", " ") calls inside function _escape_md_cell) or add an inline test comment explaining that CRLF sequences produce a double space because CR and LF are replaced independently; ensure the test case comment reflects this intended behavior so reviewers understand the double space is deliberate.

docs/examples/08_mcp_introspect/mcp_introspect_nb.ipynb (1)
51-63: Consider using a relative import pattern or package installation. The `sys.path.insert` manipulation for importing `setup_jaffle` works but is fragile. If the notebook is moved or the directory structure changes, it will break silently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/examples/08_mcp_introspect/mcp_introspect_nb.ipynb` around lines 51 - 63, Replace the fragile sys.path.insert hack by importing setup_jaffle as a proper package or using a relative-package import: ensure the module providing ensure_jaffle_shop is installed or the notebook is run with the project package on PYTHONPATH, then replace the sys.path manipulation and plain import of setup_jaffle / ensure_jaffle_shop with a standard import (e.g., from your_project.setup_jaffle import ensure_jaffle_shop) or install the package into the environment so the call to ensure_jaffle_shop() and references to engine, storage, models work reliably without modifying sys.path at runtime.

slayer/mcp/server.py (1)
300-387: Consider adding a timeout or query limit for dimension profiling. The `_collect_dim_profile` function runs one query per categorical dimension plus one batched query for numeric/temporal dimensions. On models with many dimensions or large tables, this could be slow. The `max_dims=10` cap helps, but consider adding a per-query timeout or row limit to prevent long-running profile queries from blocking `inspect_model`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/mcp/server.py` around lines 300 - 387, The profiling can hang on slow queries; update _collect_dim_profile to enforce per-query timeouts and stricter limits: wrap calls to engine.execute in a bounded timeout (e.g. via asyncio.wait_for) and make the numeric/temporal batched SlayerQuery include an explicit LIMIT (or otherwise constrain the query) so it only returns the single summary row; do this for both the categorical loop (where you already set "limit": max_values+1) and the numeric/temporal batched query, and surface the timeout as a configurable parameter (e.g. profile_timeout_secs) so callers of _collect_dim_profile can control it while preserving the existing exception-swallowing behavior around engine.execute.
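A hedged sketch of the bounded-profiling idea (the engine interface and the `profile_timeout_secs` name come from the comment above, not from the actual code):

```python
import asyncio


async def profile_query(engine, query_args, profile_timeout_secs: float = 5.0):
    """Run one profiling query with a hard time bound.

    Errors and timeouts both degrade to None, so one slow or broken
    dimension cannot block the whole inspect_model call.
    """
    try:
        # wait_for cancels the underlying task once the timeout elapses.
        return await asyncio.wait_for(engine.execute(query_args),
                                      timeout=profile_timeout_secs)
    except Exception:
        return None
```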
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@slayer/mcp/server.py`:
- Around line 163-196: The sample-aggregation logic in _build_sample_query_args
can pick a parameterized custom aggregation from m.allowed_aggregations (e.g.,
"weighted_avg") which will break the generated query; update the branch that
handles a restricted allowed list to first filter m.allowed_aggregations to a
whitelist of safe zero-argument aggs (e.g.
{"count","count_distinct","sum","min","max","avg"}) and then pick the first
entry from that filtered list, and if none remain skip the measure; keep the
existing avg-vs-count_distinct type guard for the avg case in the other branch.
---
Nitpick comments:
In `@docs/examples/08_mcp_introspect/mcp_introspect_nb.ipynb`:
- Around line 51-63: Replace the fragile sys.path.insert hack by importing
setup_jaffle as a proper package or using a relative-package import: ensure the
module providing ensure_jaffle_shop is installed or the notebook is run with the
project package on PYTHONPATH, then replace the sys.path manipulation and plain
import of setup_jaffle / ensure_jaffle_shop with a standard import (e.g., from
your_project.setup_jaffle import ensure_jaffle_shop) or install the package into
the environment so the call to ensure_jaffle_shop() and references to engine,
storage, models work reliably without modifying sys.path at runtime.
In `@slayer/mcp/server.py`:
- Around line 300-387: The profiling can hang on slow queries; update
_collect_dim_profile to enforce per-query timeouts and stricter limits: wrap
calls to engine.execute in a bounded timeout (e.g. via asyncio.wait_for) and
make the numeric/temporal batched SlayerQuery include an explicit LIMIT (or
otherwise constrain the query) so it only returns the single summary row; do
this for both the categorical loop (where you already set "limit": max_values+1)
and the numeric/temporal batched query, and surface the timeout as a
configurable parameter (e.g. profile_timeout_secs) so callers of
_collect_dim_profile can control it while preserving the existing
exception-swallowing behavior around engine.execute.
In `@tests/test_mcp_server.py`:
- Around line 315-318: The test's expectation that "\r\n" becomes two spaces
should be clarified: update the _escape_md_cell implementation's comment (near
the .replace("\r", " ").replace("\n", " ") calls inside function
_escape_md_cell) or add an inline test comment explaining that CRLF sequences
produce a double space because CR and LF are replaced independently; ensure the
test case comment reflects this intended behavior so reviewers understand the
double-space is deliberate.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: dc065658-a1d9-4f16-8940-9cc58daed376
📒 Files selected for processing (13)
- .claude/skills/slayer-overview.md
- README.md
- docs/api_gaps.md
- docs/concepts/models.md
- docs/examples/08_mcp_introspect/mcp_introspect_nb.ipynb
- docs/getting-started/mcp.md
- docs/interfaces/mcp.md
- docs/reference/mcp.md
- slayer/help/topics/08_models.md
- slayer/help/topics/10_workflow.md
- slayer/mcp/server.py
- tests/integration/test_mcp_inspect.py
- tests/test_mcp_server.py
✅ Files skipped from review due to trivial changes (5)
- docs/concepts/models.md
- README.md
- docs/api_gaps.md
- slayer/help/topics/10_workflow.md
- slayer/help/topics/08_models.md
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/integration/test_mcp_inspect.py
📝 CodeRabbit Chat: Implement requested code changes
Actionable comments posted: 1
🧹 Nitpick comments (3)
docs/interfaces/mcp.md (1)
133-156: Add language specifiers to fenced code blocks. Same as in docs/reference/mcp.md, these workflow example blocks should have language specifiers (e.g., `text` or `plaintext`) for linting compliance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/interfaces/mcp.md` around lines 133 - 156, The fenced example blocks in docs/interfaces/mcp.md lack language specifiers; update each triple-backtick fence that wraps the workflows and query examples to include a plain-text specifier (e.g., ```text or ```plaintext) so linting passes — specifically locate the blocks showing create_datasource, models_summary, inspect_model, describe_datasource, ingest_datasource_models, list_datasources and query and add the language tag to the opening ``` for each block.

slayer/mcp/server.py (1)
424-425: Consider using `collections.deque` for the BFS queue. `queue.pop(0)` is O(n) for a list. While the max_depth=5 cap keeps n small, `deque.popleft()` is the idiomatic O(1) choice for BFS.

🔧 Suggested fix
```diff
+from collections import deque
+
 async def _collect_reachable_fields(
     model: SlayerModel,
     storage: StorageBackend,
     *,
     max_depth: int = 5,
 ) -> Tuple[List[str], List[str]]:
     ...
     reachable_dims: set[str] = set()
     reachable_measures: set[str] = set()
     visited: set[str] = set()
-    queue: List[Tuple[str, str]] = []  # (full_path, target_model_name)
+    queue: deque[Tuple[str, str]] = deque()  # (full_path, target_model_name)
     ...
     for j in model.joins:
         path = _derive_path("", j)
         if path not in visited:
             queue.append((path, j.target_model))  # append is the same for deque
     while queue:
-        path, target_name = queue.pop(0)
+        path, target_name = queue.popleft()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/mcp/server.py` around lines 424 - 425, The BFS loop currently uses a list and calls queue.pop(0) (in the while queue: / path, target_name = queue.pop(0) block), which is O(n); switch to collections.deque to get O(1) popleft: add "from collections import deque" at the top, initialize queue = deque(...) wherever the list is created, and replace queue.pop(0) with queue.popleft(). Ensure any code that appends uses queue.append(...) (or queue.appendleft(...) if intended) and keep the rest of the BFS logic unchanged.docs/reference/mcp.md (1)
115-138: Add language specifiers to fenced code blocks. The code blocks in the workflow examples are missing language specifiers. While these are pseudo-code/prose examples, adding a specifier (e.g., `text` or `plaintext`) satisfies linting and improves rendering consistency.

🔧 Suggested fix
````diff
-```
+```text
 1. create_datasource(name="mydb", type="postgres", host="localhost",
                      database="app", username="user", password="pass")
    # auto_ingest=true by default — models are generated automatically
 2. models_summary(datasource_name="mydb")  # see what was generated
 3. inspect_model(model_name="orders")      # see schema + sample data
````

Apply the same to the other unlabeled code blocks (lines 125, 133).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/reference/mcp.md` around lines 115 - 138, Add language specifiers (e.g., text or plaintext) to each fenced code block that contains the workflow examples so linting renders them consistently; update the three unlabeled blocks showing the sequences with functions like create_datasource, models_summary, inspect_model, describe_datasource, ingest_datasource_models, list_datasources and query by changing the opening triple-backtick lines to include a language tag (for example ```text).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@slayer/mcp/server.py`:
- Around line 143-152: The single-column rendering block (when len(kept) == 1) can produce invalid Markdown if a cell contains backticks; in the loop that builds text for each row (variables col, rows and helper _cell_is_present), escape backticks before wrapping the value in inline code by adding a replacement like ``text = text.replace("`", "\\`")`` (i.e. include backtick escaping in the existing replace chain) so values with backtick characters are safe when returned as ``f"`{text}`"``. Ensure this change is applied where rendered is built and the return remains ", ".join(rendered).
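The escaping the comment describes amounts to roughly this (a sketch of the replace chain, not the exact `_escape_md_cell`/rendering code):

```python
def escape_inline_cell(value) -> str:
    """Escape characters that would break a markdown table cell
    before wrapping the value in inline code."""
    text = (str(value)
            .replace("|", "\\|")   # pipes would split the table cell
            .replace("\r", " ")
            .replace("\n", " ")    # newlines would break the row
            .replace("`", "\\`"))  # backticks would close the code span
    return f"`{text}`"
```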
Nitpick comments:
In `@docs/interfaces/mcp.md`:
- Around line 133-156: The fenced example blocks in docs/interfaces/mcp.md lack language specifiers; update each triple-backtick fence that wraps the workflows and query examples to include a plain-text specifier (e.g., ```text or ```plaintext) so linting passes — specifically locate the blocks showing create_datasource, models_summary, inspect_model, describe_datasource, ingest_datasource_models, list_datasources and query and add the language tag to the opening ``` for each block.

In `@docs/reference/mcp.md`:
- Around line 115-138: Add language specifiers (e.g., text or plaintext) to each fenced code block that contains the workflow examples so linting renders them consistently; update the three unlabeled blocks showing the sequences with functions like create_datasource, models_summary, inspect_model, describe_datasource, ingest_datasource_models, list_datasources and query by changing the opening triple-backtick lines to include a language tag (for example ```text).

In `@slayer/mcp/server.py`:
- Around line 424-425: The BFS loop currently uses a list and calls queue.pop(0) (in the while queue: / path, target_name = queue.pop(0) block), which is O(n); switch to collections.deque to get O(1) popleft: add "from collections import deque" at the top, initialize queue = deque(...) wherever the list is created, and replace queue.pop(0) with queue.popleft(). Ensure any code that appends uses queue.append(...) (or queue.appendleft(...) if intended) and keep the rest of the BFS logic unchanged.

🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID:
15d4da1e-2ce9-4c5b-a9e7-4bd491880a18

📒 Files selected for processing (4)
- docs/api_gaps.md
- docs/interfaces/mcp.md
- docs/reference/mcp.md
- slayer/mcp/server.py

✅ Files skipped from review due to trivial changes (1)
- docs/api_gaps.md
- docs/api_gaps.md: correct 4 CLI capability flags (datasource
create/delete/test and ingest --exclude all exist in the CLI)
- help/topics/03_aggregations.md: fix placeholder name {expr} → {value}
to match BUILTIN_AGGREGATION_FORMULAS
- help/topics/05_time.md: rewrite time-dimension priority bullets to
match actual behavior (single dim used directly; main_time_dimension
only disambiguates when 2+ are present; 0 dims falls back to model
default)
- help/topics/10_workflow.md: add ```text language tags to bare fenced
blocks (markdownlint MD040)
- mcp/server.py: prefer safe zero-arg aggregations (avg/sum/min/max/
count/count_distinct/median) in sample-data extraction before falling
back to parametrized ones; escape backticks in single-column markdown
table rendering to prevent broken formatting
- tests: 3 new cases covering the safe-agg preference, parametrized
fallback, and backtick escaping
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
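The corrected 05_time.md priority above can be read as roughly this resolution order (an illustrative sketch mirroring the symbols `main_time_dimension`, `time_dimensions`, and `default_time_dimension`; not the engine code):

```python
def resolve_time_dimension(query_time_dims, main_time_dimension,
                           default_time_dimension):
    """Pick the time dimension used by time-based functions."""
    if len(query_time_dims) == 1:
        # A single time dimension in the query is used automatically.
        return query_time_dims[0]
    if len(query_time_dims) >= 2:
        # main_time_dimension only disambiguates when 2+ are present,
        # falling back to the model default if it is among them.
        if main_time_dimension in query_time_dims:
            return main_time_dimension
        if default_time_dimension in query_time_dims:
            return default_time_dimension
        return None
    # No time dimensions at all: fall back to the model default.
    return default_time_dimension
```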
…-in-mcp-server' of https://github.com/MotleyAI/slayer into egor/dev-1265-improve-model-introspection-and-help-tool-in-mcp-server
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_mcp_server.py`:
- Around line 336-339: The test asserts a CRLF ("line1\r\nline2") collapses to
two spaces which locks in an implementation quirk; update the test for
test_escape_pipes_and_newlines to expect a single space for CRLF (i.e. the same
result as "\n") so it matches the documented behavior of _escape_md_cell and
won't mark a future fix as a regression; locate the assertions in
test_escape_pipes_and_newlines and change the expected value for
"line1\r\nline2" to "line1 line2".
- Around line 240-242: The tests assume each element of args["dimensions"] is a
dict but _build_sample_query_args returns a flat list[str]; update the
assertions in tests/test_mcp_server.py to compare the dimensions list directly
(e.g., assert args["dimensions"] == ["status", "region"]) and similarly change
the other assertion pair at the later lines (around 294-295) to match the actual
shape returned by _build_sample_query_args so the tests no longer index strings
as dicts.
- Around line 439-460: The test test_auto_ingested_multi_hop_path expects a
BFS-derived multi-hop path "orders.customers.name" but the current helper in
slayer/mcp/server.py queues root joins as their direct target and only runs
_derive_path when traversing a target model's own joins, so update the test to
match current behavior: call _collect_reachable_fields with the root as before
and assert for "orders.customer_id" and "customers.name" (not
"orders.customers.name"), or alternatively modify the helper to invoke
_derive_path for root joins that reference dotted join_pairs (ModelJoin with
"orders.customer_id") so multi-hop paths are produced; pick the former (test
change) by replacing the failing assertion with assert "customers.name" in dims
and keeping assert "orders.customer_id" in dims.
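The `_escape_md_cell` contract referenced in the first comment above — pipes escaped, every flavor of line break collapsed to exactly one space — can be sketched as a stand-in (not the real implementation):

```python
import re

def escape_md_cell(value: str) -> str:
    """Stand-in for the documented _escape_md_cell contract: escape table
    pipes and collapse any line break (\\n, \\r\\n, \\r) to one space."""
    # Treat CRLF as a single break so "line1\r\nline2" yields one space, not two.
    collapsed = re.sub(r"\r\n|\r|\n", " ", value)
    return collapsed.replace("|", "\\|")
```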
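For the multi-hop comment above, a BFS that does derive dotted paths like `orders.customers.name` could look like this toy sketch (the join graph and field lists are invented for illustration; this is not the actual `_collect_reachable_fields` implementation):

```python
from collections import deque

# Invented toy join graph and field lists, purely for illustration.
JOINS = {"orders": ["customers"], "customers": ["regions"]}
FIELDS = {
    "orders": ["id", "customer_id", "amount"],
    "customers": ["name", "region_id"],
    "regions": ["name"],
}

def reachable_fields(root: str) -> list[str]:
    """BFS over the join graph, emitting dotted multi-hop paths such as
    'orders.customers.name' for fields reached through joins."""
    paths: list[str] = []
    queue = deque([(root, [root])])  # (current model, path components so far)
    seen = {root}
    while queue:
        model, chain = queue.popleft()
        for field in FIELDS.get(model, []):
            paths.append(".".join([*chain, field]))
        for target in JOINS.get(model, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, [*chain, target]))
    return paths
```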
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 62bfaeca-54b9-44d8-beba-fbb6072ddce8
📒 Files selected for processing (6)
- docs/api_gaps.md
- slayer/help/topics/03_aggregations.md
- slayer/help/topics/05_time.md
- slayer/help/topics/10_workflow.md
- slayer/mcp/server.py
- tests/test_mcp_server.py
✅ Files skipped from review due to trivial changes (4)
- docs/api_gaps.md
- slayer/help/topics/03_aggregations.md
- slayer/help/topics/05_time.md
- slayer/help/topics/10_workflow.md
🚧 Files skipped from review as they are similar to previous changes (1)
- slayer/mcp/server.py
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/test_mcp_server.py (1)
49-127: Add direct coverage for the new JSON branches. These tests exercise the markdown contract well, but neither `models_summary(..., format="json")` nor `inspect_model(..., format="json")` is asserted here. A small happy-path JSON test for each would lock down the new public surface and would have caught the current `inspect_model` sample-data omission immediately. Also applies to: 134-205
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_mcp_server.py` around lines 49-127, add explicit JSON-format tests for the new branches by adding two small happy-path tests (e.g., test_models_summary_json and test_inspect_model_json) that call models_summary(...) and inspect_model(...) via _call with format="json" and assert the returned payload is a parsed JSON object with the expected top-level keys (e.g., datasource name, models list, model details), counts and presence/absence of hidden models; ensure each test sets up the same fixtures used elsewhere (use YAMLStorage.save_datasource and save_model/SlayerModel with required fields) and for inspect_model include the minimal sample data fields the implementation expects so the JSON response contains the expected fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@slayer/mcp/server.py`:
- Around line 911-1005: inspect_model(format="json") currently returns only
metadata and drops the sample data assembled earlier (sample_result, sample_sql,
sample_section). Modify the JSON return payload (the dict inside the json.dumps)
to include the sample information: add fields like "sample": {"columns":
sample_result.columns, "rows": sample_result.data} and, when show_sql is true
and sample_sql is set, include "sample_sql": sample_sql (or include it under
"sample.sql"). Use the already-populated variables sample_result and sample_sql
(and show_sql) so the JSON output mirrors the markdown path instead of
discarding the sample_section.
- Around line 147-156: The current inline-code rendering in the single-column
branch builds code spans unsafely (see the len(kept)==1 block and the other
usage around line 294); create a shared helper (e.g.,
render_markdown_code_span(value: str)) and use it from both places: the helper
should keep the existing sanitization (replace("|", "\\|"), normalize CR/LF,
strip) then compute the longest contiguous backtick run in the sanitized value,
build a fence of backticks one longer than that run (fence = "`" * (max_run +
1)), and return the value wrapped in that fence (fence + value + fence); replace
the current backtick-replacement and direct-wrapping code with calls to this
helper in both locations (the len(kept)==1 block and the other renderer at/near
line 294).
---
Nitpick comments:
In `@tests/test_mcp_server.py`:
- Around line 49-127: Add explicit JSON-format tests for the new branches by
adding two small happy-path tests (e.g., test_models_summary_json and
test_inspect_model_json) that call models_summary(...) and inspect_model(...)
via _call with format="json" and assert the returned payload is a parsed JSON
object with the expected top-level keys (e.g., datasource name, models list,
model details), counts and presence/absence of hidden models; ensure each test
sets up the same fixtures used elsewhere (use YAMLStorage.save_datasource and
save_model/SlayerModel with required fields) and for inspect_model include the
minimal sample data fields the implementation expects so the JSON response
contains the expected fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 423b610a-0893-4408-9d09-223cf5c2f852
📒 Files selected for processing (2)
- slayer/mcp/server.py
- tests/test_mcp_server.py
…ev-1265-improve-model-introspection-and-help-tool-in-mcp-server
# Conflicts:
#	README.md
#	slayer/mcp/server.py
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests