S3: merge dimensions+measures into unified columns (#52)#61
Conversation
…rmulas (#52) Collapse the dim/measure split into a single SlayerModel.columns list of Column objects. Repurpose SlayerModel.measures to hold ModelMeasure named formulas (the shape v1 had on Query.fields). Rename SlayerQuery.fields → measures for consistency. A column carries the union of v1 Dimension+Measure metadata (type, primary_key, allowed_aggregations, filter, format, label, description, hidden, meta). Aggregation eligibility now flows from data type with type→aggregations defaults in slayer/core/enums.py (DEFAULT_AGGREGATIONS_BY_TYPE), an explicit allowed_aggregations whitelist overriding it, and a built-in PRIMARY_KEY_AGGREGATIONS rule restricting PKs to count/count_distinct. The SlayerModel disjoint-name validator forbids overlap between column names and measure-formula names so query name resolution is unambiguous. A v1→v2 read-time migration (slayer/storage/v2_migration.py) handles all existing storage: merges old dimensions+measures into columns, defaults old Measure types to NUMBER, drops the legacy 'type: sum' alias (carrying intent into allowed_aggregations), renames Query.fields→measures, and recurses into inline ModelExtension blocks. The detection heuristic skips fresh v2 dicts so it never clobbers caller payloads. Engine, MCP/API/CLI surfaces, dbt converter, and auto-ingestion updated. edit_model/create_model now take 'columns' and 'measures' (formulas) parameters; inspect_model emits a unified Columns table plus a Measures (formulas) table. The cross-entity {model_name, query_name} uniqueness validator from the issue is deferred to S2 (NamedQuery storage isn't merged on this branch). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughMerge model Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Server
participant Storage
participant Migrator
Client->>Server: POST /query or create model (payload uses measures/columns)
Server->>Storage: Read/persist model or query
Storage->>Migrator: Detect version < 2 (has dimensions/fields/measures v1)
Migrator->>Storage: Transform v1 {dimensions,measures,fields} -> v2 {columns,measures}
Storage-->>Server: Return v2-validated model/query
Server->>Server: Enrich/validate using columns + ModelMeasure
Server-->>Client: Respond with enriched/validated v2 model or query result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 17
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (9)
docs/index.md (1)
68-72:⚠️ Potential issue | 🟡 MinorUpdate the architecture diagram label from
fieldstomeasuresfor v2 consistency.The top example uses
measures, but this diagram still referencesfields.Suggested patch
- SlayerQuery (model, fields, dimensions, filters) + SlayerQuery (model, measures, dimensions, filters)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/index.md` around lines 68 - 72, Update the architecture diagram label inside the ASCII block that currently reads "SlayerQuery (model, fields, dimensions, filters)" to use "measures" instead of "fields" so it matches v2 terminology; locate the ASCII diagram near the "Agent --> MCP / REST API / Python SDK" line in docs/index.md and change the token "fields" to "measures" (i.e., "SlayerQuery (model, measures, dimensions, filters)").tests/perf/test_bench.py (1)
22-27:⚠️ Potential issue | 🟠 MajorUse
execute_sync()instead of the asyncexecute()method in the benchmark helper.The
_execute()function at line 26 callsengine.execute(query=query), which is an async method. In the synchronous benchmark context (line 167), this returns a coroutine instead of aSlayerResponse, invalidating the timing measurement.Suggested patch
def _execute(env: BenchEnv, **query_kwargs) -> SlayerResponse: """Execute a query and return the response.""" engine, _ = env query = SlayerQuery(source_model="orders", **query_kwargs) - return engine.execute(query=query) + return engine.execute_sync(query=query)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/perf/test_bench.py` around lines 22 - 27, The helper _execute currently calls the async method engine.execute(query=query) and returns a coroutine instead of a SlayerResponse; change it to call the synchronous API engine.execute_sync(query=query) so the function actually returns a SlayerResponse in the synchronous benchmark context, keeping the same signature (def _execute(env: BenchEnv, **query_kwargs) -> SlayerResponse) and still constructing the query with SlayerQuery(source_model="orders", **query_kwargs).slayer/cli.py (1)
51-51:⚠️ Potential issue | 🟡 MinorCLI help examples still use deprecated
fieldsquery key.Please update these sample commands to use
"measures"so users copy/paste valid v2 payloads.Suggested patch
- slayer query '{"source_model": "orders", "fields": [{"formula": "count"}]}' + slayer query '{"source_model": "orders", "measures": [{"formula": "count"}]}' @@ - slayer query '{"source_model": "orders", "fields": [{"formula": "count"}]}' + slayer query '{"source_model": "orders", "measures": [{"formula": "count"}]}' @@ - slayer query '{"source_model": "orders", "fields": [{"formula": "count"}]}' --dry-run + slayer query '{"source_model": "orders", "measures": [{"formula": "count"}]}' --dry-runAlso applies to: 121-121, 127-127
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/cli.py` at line 51, Update the CLI help examples in slayer/cli.py to use the v2 payload key "measures" instead of the deprecated "fields": find the example strings containing '{"source_model": "orders", "fields": ...}' (and the other occurrences around the same examples) and replace "fields" with "measures" so the sample commands (e.g., the slayer query example) produce valid v2 payloads.docs/concepts/formulas.md (1)
19-26:⚠️ Potential issue | 🟡 MinorUpdate remaining
fieldswording tomeasuresin this section.The examples now correctly use
"measures", but the nearby prose still says formulas go infields. Please align the text to avoid schema confusion.Suggested doc patch
-Colon syntax is used everywhere measures appear: in `fields`, in arithmetic expressions, in transform function arguments, and in filters. +Colon syntax is used everywhere measures appear: in `measures`, in arithmetic expressions, in transform function arguments, and in filters. @@ -Field formulas define what data columns a query returns. They go in the `fields` parameter: +Field formulas define what data columns a query returns. They go in the `measures` parameter:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/concepts/formulas.md` around lines 19 - 26, Update the prose that still refers to `fields` to consistently use `measures`: change the "Field Formulas" heading and the sentence "Field formulas define what data columns a query returns. They go in the `fields` parameter:" to use "Measure Formulas" and "Measure formulas define what data columns a query returns. They go in the `measures` parameter:" (and replace any other stray `fields` occurrences in this section), ensuring examples already using "measures" match the surrounding text.docs/examples/07_aggregations/aggregations.md (1)
9-19:⚠️ Potential issue | 🟠 MajorModel schema examples still show v1-style
measuresinstead of v2columns.These YAML blocks define row-level SQL under
measures, which conflicts with the v2 schema where row-level definitions are incolumns. As written, users may copy invalid model config.Suggested doc direction
-measures: - - name: revenue - sql: amount +columns: + - name: revenue + sql: amount + type: numberApply the same change pattern to other model-definition snippets in this file.
Also applies to: 102-112
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/examples/07_aggregations/aggregations.md` around lines 9 - 19, Update the example YAML model snippets that use the v1 "measures" block to the v2 "columns" schema: replace the top-level "measures" list with a "columns" list and for each entry change the shape to a column object containing "name", "type: measure" (or appropriate measure type), and the row-level expression under "sql" (keeping the existing expressions like amount, unit_price, qty). Do the same replacement for the other model-definition snippets in this file that still show "measures" (including the block around the previously noted lines) so examples conform to the v2 model schema.tests/integration/test_integration_postgres.py (1)
499-511:⚠️ Potential issue | 🟠 MajorThis assertion still reflects the old dimension/measure split.
With v2,
orders.columnsis the merged row-level field list, soamountshould remain incolumnseven if ingestion also creates a saved measure for it. Keepingassert "amount" not in dim_nameswill fail against the intended schema shape.Suggested test update
- dim_names = [d.name for d in orders.columns] + column_names = [c.name for c in orders.columns] # Should have own columns only (no flattened joined dims) - assert "id" in dim_names - assert "customer_id" in dim_names - # Float-like columns (DECIMAL) get measures only, no dimension - assert "amount" not in dim_names + assert "id" in column_names + assert "customer_id" in column_names + assert "amount" in column_names # Joined dimensions are resolved via join graph, not pre-flattened - assert not any("." in name for name in dim_names) + assert not any("." in name for name in column_names)As per coding guidelines:
SlayerModel.columns: List[Column]replaces separate dimensions and measures.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_integration_postgres.py` around lines 499 - 511, Update the test_orders_has_own_dimensions_only assertion to reflect that SlayerModel.columns is the merged row-level field list: in the test function test_orders_has_own_dimensions_only replace the check asserting "amount" is absent from dim_names (which currently uses orders.columns and dim_names = [d.name for d in orders.columns]) so that "amount" is allowed/present; keep checks for "id" and "customer_id" and the no-dot check, but remove or invert the line assert "amount" not in dim_names so the test expects "amount" to be in orders.columns (or simply delete that assertion).tests/integration/test_integration.py (1)
136-145:⚠️ Potential issue | 🟠 MajorThese assertions are still pinning the internal
*_countalias.
SlayerResponse.datais the public result shape, so*:countshould be asserted asorders.count, notorders._count. The same stale key shows up in several tests below.🔁 Example update
- assert response.data[0]["orders._count"] == 6 + assert response.data[0]["orders.count"] == 6Based on learnings, when validating
*:countmeasure naming, the user-visible output/result column name usesorders.count, whileorders._countis only the internalEnrichedMeasure.alias.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_integration.py` around lines 136 - 145, The test is asserting the internal alias `orders._count` instead of the public result column for the `*:count` measure; update the assertions in this test (and any similar ones) to check response.data[0]["orders.count"] (the public column name produced by the `*:count` measure) rather than `orders._count`, while keeping the same objects referenced (SlayerQuery, ModelMeasure, SlayerResponse, response). Ensure all other tests that validate `*:count` output are updated similarly.slayer/mcp/server.py (2)
1789-1790:⚠️ Potential issue | 🟡 MinorMinor: Missing space before
if.Proposed fix
- return "\n".join(lines)if lines else "" + return "\n".join(lines) if lines else ""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/mcp/server.py` around lines 1789 - 1790, The return expression has a missing space before the conditional operator: change the line "return \"\\n\".join(lines)if lines else \"\"" to include a space so it reads "return \"\\n\".join(lines) if lines else \"\"" (the variables involved are lines and measure_lines where lines is extended earlier).
1693-1698:⚠️ Potential issue | 🔴 CriticalBug:
m.dimensionsis not a valid attribute in v2 schema.This code references
m.dimensionswhich no longer exists in v2. The model now usesm.columns. This will cause anAttributeErrorat runtime.🐛 Proposed fix
lines = [f"Ingested {len(models)} model(s):"] for m in models: - lines.append(f"- {m.name} ({len(m.dimensions)} dims, {len(m.measures)} measures)") + lines.append(f"- {m.name} ({len(m.columns)} columns, {len(m.measures)} measures)") lines.append("")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/mcp/server.py` around lines 1693 - 1698, The summary builder references the removed attribute m.dimensions causing an AttributeError; update the loop that builds lines (iterating over models and using m.name, m.dimensions, m.measures) to use m.columns instead of m.dimensions (i.e., compute dims as len(m.columns)), leaving m.measures as-is so the output stays accurate for the v2 model schema.
🧹 Nitpick comments (1)
docs/getting-started/python.md (1)
143-143: Please normalize the rest of this page tomeasuresas well.Great fix here, but earlier SDK snippets still use
fields(Line 59,Line 92,Line 104). Consider updating them too so the page doesn’t mix v1/v2 terminology.Suggested patch
- fields=["*:count", "revenue:sum"], + measures=["*:count", "revenue:sum"], ... - fields=["*:count"], + measures=["*:count"], ... - fields=["*:count", "revenue:sum"], + measures=["*:count", "revenue:sum"],Based on learnings: “When renaming a field or changing response structure, grep all docs and skills for old names and update every occurrence.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/getting-started/python.md` at line 143, The page mixes v1/v2 terminology: update all query examples that use the old "fields" key to use "measures" instead so they match the demonstrated snippet (e.g., the example calling result = engine.execute(query={"source_model": "orders", "measures": ["*:count"]})); search the document for any occurrences of "fields" in SDK query examples and replace them with "measures", ensuring the surrounding structure (query={"source_model": ..., "measures": [...]}) and example outputs remain consistent with the v2 response shape.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/concepts/models.md`:
- Around line 79-90: The documentation incorrectly states that an explicit
allowed_aggregations whitelist overrides both the default-by-type rule and the
primary-key restriction; update the sentence so it clearly says that while
allowed_aggregations overrides the default set derived from
DEFAULT_AGGREGATIONS_BY_TYPE, it does NOT bypass the primary-key restriction —
primary-key columns remain limited to count and count_distinct regardless of any
allowed_aggregations whitelist.
In `@docs/concepts/queries.md`:
- Around line 10-12: The docs mix the old term "dimensions" with the new
"columns" for ModelExtension; update all occurrences of "dimensions" in this
file's ModelExtension examples/field lists to use "columns" to match the
parameter table (e.g., change any ModelExtension field named dimensions to
columns and update related examples and descriptions), and verify references to
symbols like source_model, measures, ModelExtension, ColumnRef are consistent
across the page so the v2 docs no longer contradict each other.
In `@docs/getting-started/rest-api.md`:
- Around line 61-65: The REST docs show a "measures" payload but the FastAPI
request model QueryRequest in slayer/api/server.py only defines fields, so
FastAPI will drop measures; update QueryRequest to include an optional measures:
List[str] (and any related types) so FastAPI preserves measures for
SlayerQuery's migration logic, and ensure SlayerQuery consumption still works
(or add a migration step there) before returning responses; alternatively, if
you don't want to change the API surface now, revert the docs to use only fields
examples across the referenced sections (61-65, 87-102, 144-145) so they match
the current QueryRequest behavior.
In `@docs/interfaces/cli.md`:
- Line 72: Update the dry-run example command string used in the docs example
for the CLI (the `slayer query '{"source_model": "orders", "measures":
["count"]}' --dry-run` snippet) to use colon aggregation syntax so the measure
is unambiguous—replace `"count"` with `"*:count"` so the example becomes `slayer
query '{"source_model": "orders", "measures": ["*:count"]}' --dry-run`, matching
v2 query syntax and the project's aggregation guidelines.
In `@docs/interfaces/rest-api.md`:
- Line 36: The docs example uses the key "measures" which doesn't match the
server request model; update the REST example to use the "fields" key instead
(for example replace "measures": ["*:count"] with "fields": ["*:count"]) so it
aligns with the QueryRequest type in QueryRequest (slayer/api/server.py);
alternatively, if you prefer to support the "measures" name, add mapping logic
in QueryRequest parsing to accept "measures" as an alias for "fields" and
normalize it to the existing field handling.
In `@docs/reference/cli.md`:
- Line 66: Update the CLI example queries that use a bare "count" measure to
explicitly specify the aggregation using the colon syntax (e.g., change
"measures": ["count"] to "measures": ["*:count"]) so measures vs aggregations
are unambiguous; locate the example query strings shown (the slayer query JSON
examples containing "measures": ["count"]) and replace each occurrence with the
explicit aggregation form "*:count".
In `@docs/reference/rest-api.md`:
- Line 36: Update the POST /query example payload in docs/reference/rest-api.md
to use the current request model's "fields" key and express aggregations using
colon syntax in formulas (e.g., replace the "measures": ["count"] entry with an
appropriate "fields" entry using "*:count" or "revenue:sum" style). Locate the
example payload where "measures" is used and change it to the "fields" array
with aggregation-suffixed field names so the example matches the server contract
and other count examples.
In `@slayer/core/enums.py`:
- Around line 152-179: DEFAULT_AGGREGATIONS_BY_TYPE currently exposes extra
aggregations beyond the v2 contract; update the mapping so numeric types
(DataType.NUMBER) only include "sum","avg","min","max","count","count_distinct";
string/date/timestamp types (DataType.STRING, DataType.DATE, DataType.TIMESTAMP)
only include "count","count_distinct","first","last" (remove "min"/"max" for
non-numeric types); boolean (DataType.BOOLEAN) should only include
"count","count_distinct" (remove "sum"); keep PRIMARY_KEY_AGGREGATIONS as-is.
Modify the DEFAULT_AGGREGATIONS_BY_TYPE dict to match these exact sets
referenced by the DataType enum names.
In `@slayer/core/query.py`:
- Around line 270-274: Update the docstring example in slayer/core/query.py that
shows measures and filters to use colon aggregation syntax required by v2
formulas: replace bare occurrences like "count" and "revenue / count" with
explicit aggregations such as "*:count" and "revenue:sum / *:count" in the
measures example (the docstring block showing measures=[{"formula": ...}, ...]);
keep the filters example unchanged and ensure the formulas follow the
"metric:aggregate" pattern used by the query formula contract.
In `@slayer/dbt/converter.py`:
- Around line 356-387: entity_col_names is computed once from cols and not
updated when you append new primary key Columns, causing duplicates (e.g., when
sm.primary_entity matches a column just added in the entity loop). Fix by
updating entity_col_names after you append a Column (or recomputing/using
current cols to check membership) so the shorthand primary_entity check uses the
up-to-date set; update the set inside the entity loop whenever you call
cols.append(...) (references: cols, entity_col_names, sm.entities,
sm.primary_entity, Column, DataType.NUMBER).
In `@slayer/engine/enrichment.py`:
- Around line 253-271: The current validation only blocks numeric-only
aggregations on strings but doesn't enforce the effective allowed set; update
the logic in the block around measure_def.allowed_aggregations to compute an
effective_allowed set: if measure_def.allowed_aggregations is not None use it,
else if measure_def.is_primary_key use PRIMARY_KEY_AGGREGATIONS, otherwise use
DEFAULT_AGGREGATIONS_BY_TYPE.get(str(measure_def.type), DEFAULT_FALLBACK) and
then validate aggregation_name against that set and raise the ValueError
referencing aggregation_name, measure_name, and model.name; keep or remove the
separate NUMERIC_ONLY_AGGREGATIONS check as redundant once the effective_allowed
set is enforced.
In `@slayer/engine/ingestion.py`:
- Around line 508-545: _columns_to_model currently builds only
SlayerModel.columns and returns SlayerModel without populating
SlayerModel.measures, which drops the auto-generated measure library; update
_columns_to_model to also build a measures list by iterating the same input
columns (skip joined columns and primary-key columns), creating ModelMeasure
instances for each non-ID column (use the same column alias logic for "_count"
where applicable) and attach that list to the returned SlayerModel.measures;
keep the measure naming consistent with the new auto-ingest contract (one
measure per non-ID column, do not generate old-style _sum/_avg suffixed
measures) and reference SlayerModel, ModelMeasure, and _columns_to_model when
locating where to add this logic.
In `@slayer/storage/v2_migration.py`:
- Around line 105-112: The detector currently assumes raw_measures from
data.get("measures") is subscriptable and indexable, which raises
KeyError/TypeError for malformed inputs like {"measures": {"name":"revenue"}};
update the guard to verify raw_measures is a sequence (e.g., list/tuple or
collections.abc.Sequence) and non-empty before doing first = raw_measures[0], so
that the function (in slayer/storage/v2_migration.py around the raw_measures
check and the subsequent "first" handling) returns False for bad shapes instead
of raising.
In `@tests/integration/test_mcp_inspect.py`:
- Around line 429-430: Remove the extra conflicting status column append and use
the existing status column on the model: delete the call that appends
Column(name="status", sql="status", type=DataType.NUMBER) to
env["model"].columns and do not create a second column with a different
DataType; simply persist env["model"] (env["storage"].save_model(env["model"]))
as-is and rely on the existing status column (and get_column("status")); apply
the same removal for the other duplicated append sites that add a numeric status
column.
In `@tests/test_api_server.py`:
- Around line 102-121: Update the test_hidden_columns_excluded_from_get test so
it asserts hidden columns are absent rather than only checking visible ones:
after calling client.get("/models/orders") and building col_names from
data["columns"], add negative assertions that "internal_flag" and "secret_sum"
are NOT in col_names (e.g., assert "internal_flag" not in col_names and assert
"secret_sum" not in col_names) to ensure GET /models/{name} filters v2 columns
labeled hidden.
In `@tests/test_dbt_converter.py`:
- Around line 297-301: The test is grabbing the first column
(result.models[0].columns[0]) which is a PK/entity column due to
_convert_semantic_model prepending entity columns, so the assertion doesn't
actually verify the converted measure; update the test to locate the measure
column by name (e.g., find the column whose name equals "revenue" in
result.models[0].columns) and then assert that that column's
allowed_aggregations is None; keep using DbtToSlayerConverter and result.models
to find the model, but select the measure by name instead of index.
In `@tests/test_mcp_server.py`:
- Around line 406-408: The test currently asserts every column in
parsed["columns"] contains a "filter" key, but _model_to_summary() only emits
"filter" for columns that define one (e.g., "val" but not "id"); update the test
loop to either conditionally check for "filter" (if "filter" in c: assert ...),
or restrict the assertion to the specific column that should have a filter (find
the column with name "val" in parsed["columns"] and assert "filter" in that
dict), leaving the "id" column unasserted for "filter".
---
Outside diff comments:
In `@docs/concepts/formulas.md`:
- Around line 19-26: Update the prose that still refers to `fields` to
consistently use `measures`: change the "Field Formulas" heading and the
sentence "Field formulas define what data columns a query returns. They go in
the `fields` parameter:" to use "Measure Formulas" and "Measure formulas define
what data columns a query returns. They go in the `measures` parameter:" (and
replace any other stray `fields` occurrences in this section), ensuring examples
already using "measures" match the surrounding text.
In `@docs/examples/07_aggregations/aggregations.md`:
- Around line 9-19: Update the example YAML model snippets that use the v1
"measures" block to the v2 "columns" schema: replace the top-level "measures"
list with a "columns" list and for each entry change the shape to a column
object containing "name", "type: measure" (or appropriate measure type), and the
row-level expression under "sql" (keeping the existing expressions like amount,
unit_price, qty). Do the same replacement for the other model-definition
snippets in this file that still show "measures" (including the block around the
previously noted lines) so examples conform to the v2 model schema.
In `@docs/index.md`:
- Around line 68-72: Update the architecture diagram label inside the ASCII
block that currently reads "SlayerQuery (model, fields, dimensions, filters)" to
use "measures" instead of "fields" so it matches v2 terminology; locate the
ASCII diagram near the "Agent --> MCP / REST API / Python SDK" line in
docs/index.md and change the token "fields" to "measures" (i.e., "SlayerQuery
(model, measures, dimensions, filters)").
In `@slayer/cli.py`:
- Line 51: Update the CLI help examples in slayer/cli.py to use the v2 payload
key "measures" instead of the deprecated "fields": find the example strings
containing '{"source_model": "orders", "fields": ...}' (and the other
occurrences around the same examples) and replace "fields" with "measures" so
the sample commands (e.g., the slayer query example) produce valid v2 payloads.
In `@slayer/mcp/server.py`:
- Around line 1789-1790: The return expression has a missing space before the
conditional operator: change the line "return \"\\n\".join(lines)if lines else
\"\"" to include a space so it reads "return \"\\n\".join(lines) if lines else
\"\"" (the variables involved are lines and measure_lines where lines is
extended earlier).
- Around line 1693-1698: The summary builder references the removed attribute
m.dimensions causing an AttributeError; update the loop that builds lines
(iterating over models and using m.name, m.dimensions, m.measures) to use
m.columns instead of m.dimensions (i.e., compute dims as len(m.columns)),
leaving m.measures as-is so the output stays accurate for the v2 model schema.
In `@tests/integration/test_integration_postgres.py`:
- Around line 499-511: Update the test_orders_has_own_dimensions_only assertion
to reflect that SlayerModel.columns is the merged row-level field list: in the
test function test_orders_has_own_dimensions_only replace the check asserting
"amount" is absent from dim_names (which currently uses orders.columns and
dim_names = [d.name for d in orders.columns]) so that "amount" is
allowed/present; keep checks for "id" and "customer_id" and the no-dot check,
but remove or invert the line assert "amount" not in dim_names so the test
expects "amount" to be in orders.columns (or simply delete that assertion).
In `@tests/integration/test_integration.py`:
- Around line 136-145: The test is asserting the internal alias `orders._count`
instead of the public result column for the `*:count` measure; update the
assertions in this test (and any similar ones) to check
response.data[0]["orders.count"] (the public column name produced by the
`*:count` measure) rather than `orders._count`, while keeping the same objects
referenced (SlayerQuery, ModelMeasure, SlayerResponse, response). Ensure all
other tests that validate `*:count` output are updated similarly.
In `@tests/perf/test_bench.py`:
- Around line 22-27: The helper _execute currently calls the async method
engine.execute(query=query) and returns a coroutine instead of a SlayerResponse;
change it to call the synchronous API engine.execute_sync(query=query) so the
function actually returns a SlayerResponse in the synchronous benchmark context,
keeping the same signature (def _execute(env: BenchEnv, **query_kwargs) ->
SlayerResponse) and still constructing the query with
SlayerQuery(source_model="orders", **query_kwargs).
---
Nitpick comments:
In `@docs/getting-started/python.md`:
- Line 143: The page mixes v1/v2 terminology: update all query examples that use
the old "fields" key to use "measures" instead so they match the demonstrated
snippet (e.g., the example calling result =
engine.execute(query={"source_model": "orders", "measures": ["*:count"]}));
search the document for any occurrences of "fields" in SDK query examples and
replace them with "measures", ensuring the surrounding structure
(query={"source_model": ..., "measures": [...]}) and example outputs remain
consistent with the v2 response shape.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: acae7bf5-20f8-4209-b93f-246bcfb2cd7c
📒 Files selected for processing (52)
CLAUDE.mddocs/concepts/formulas.mddocs/concepts/ingestion.mddocs/concepts/models.mddocs/concepts/queries.mddocs/dbt/slayer_vs_dbt.mddocs/examples/01_dynamic/dynamic.mddocs/examples/03_auto_ingest/auto_ingest.mddocs/examples/04_time/time.mddocs/examples/05_joined_measures/joined_measures.mddocs/examples/06_multistage_queries/multistage_queries.mddocs/examples/07_aggregations/aggregations.mddocs/getting-started/cli.mddocs/getting-started/python.mddocs/getting-started/rest-api.mddocs/index.mddocs/interfaces/cli.mddocs/interfaces/python-client.mddocs/interfaces/rest-api.mddocs/reference/cli.mddocs/reference/python-client.mddocs/reference/rest-api.mdslayer/cli.pyslayer/core/enums.pyslayer/core/models.pyslayer/core/query.pyslayer/dbt/converter.pyslayer/engine/enrichment.pyslayer/engine/ingestion.pyslayer/engine/query_engine.pyslayer/mcp/server.pyslayer/storage/migrations.pyslayer/storage/v2_migration.pytests/conftest.pytests/integration/test_integration.pytests/integration/test_integration_duckdb.pytests/integration/test_integration_postgres.pytests/integration/test_mcp_inspect.pytests/perf/conftest.pytests/perf/test_bench.pytests/test_api_server.pytests/test_client.pytests/test_dbt_converter.pytests/test_format_propagation.pytests/test_formula.pytests/test_join_sync.pytests/test_mcp_server.pytests/test_migrations.pytests/test_models.pytests/test_sql_generator.pytests/test_sqlite_storage.pytests/test_storage.py
| | `source_model` | string, SlayerModel, or ModelExtension | Yes | Source model name, inline model, or model extension (adds columns/measures/joins) | | ||
| | `measures` | list[ModelMeasure] | No | Computed/aggregated values — formulas, arithmetic, transforms. See [Formulas](formulas.md). | | ||
| | `dimensions` | list[ColumnRef] | No | Columns to group by. Supports dotted names for joined models (`customers.name`, `customers.regions.name`). | |
There was a problem hiding this comment.
This page still mixes columns and dimensions for ModelExtension.
The parameter table now says inline extensions add columns, but the later ModelExtension example/field list on this page still uses dimensions. Please update the remaining examples together so the v2 docs don't contradict themselves.
Based on learnings: "When renaming a field or changing response structure, grep all docs and skills for old names and update every occurrence".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/concepts/queries.md` around lines 10 - 12, The docs mix the old term
"dimensions" with the new "columns" for ModelExtension; update all occurrences
of "dimensions" in this file's ModelExtension examples/field lists to use
"columns" to match the parameter table (e.g., change any ModelExtension field
named dimensions to columns and update related examples and descriptions), and
verify references to symbols like source_model, measures, ModelExtension,
ColumnRef are consistent across the page so the v2 docs no longer contradict
each other.
There was a problem hiding this comment.
Not actionable. The ModelExtension example here uses columns, not dimensions (see line 312: "columns": [{"name": "tier", ...}]), and the field-list at line 320 explicitly says "source_name, columns, measures, joins". The "dimensions": ["tier"] at line 315 is the surrounding SlayerQuery parameter (per the table at line 12, dimensions is a top-level query field) — not a ModelExtension field. Those are two distinct schemas. No change needed.
There was a problem hiding this comment.
@ZmeiGorynych, you're right — I misread the example. The "dimensions": ["tier"] is a top-level SlayerQuery field (correctly documented in the parameter table), while the ModelExtension object itself already uses "columns" and the field list at line 320 is accurate. No contradiction exists. I apologize for the noise.
🧠 Learnings used
Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/models.py : SlayerQuery.source_model accepts a model name, inline `SlayerModel`, or `ModelExtension` for extending models
Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/models.py : Provide `SlayerResponse.attributes` as `ResponseAttributes` with `.dimensions` and `.measures` dicts, each mapping column alias to `FieldMetadata(label, format)`
Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/models.py : SlayerModel, SlayerQuery, and DatasourceConfig must carry a `version: int` field (currently `1`) with schema versioning via converter chain in `slayer/storage/migrations.py`
Learnt from: AivanF
Repo: MotleyAI/slayer PR: 25
File: docs/getting-started/python.md:57-61
Timestamp: 2026-04-10T12:12:08.571Z
Learning: In the MotleyAI/slayer repo docs, use language-agnostic “JSON/dict syntax” when showing query objects on non-Python pages (e.g., REST API, CLI, MCP). When the page is specifically demonstrating the Python SDK directly—such as `docs/getting-started/python.md`—it is acceptable (and preferred) to show Python constructor syntax like `SlayerQuery(...)` instead of JSON/dict syntax. Avoid mixing constructors into language-agnostic examples.
Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 59
File: slayer/mcp/server.py:1725-1740
Timestamp: 2026-04-29T13:25:08.874Z
Learning: In MotleyAI/slayer, `ResponseAttributes` (in `slayer/engine/query_engine.py`) is a Pydantic BaseModel with `.dimensions: Dict[str, FieldMetadata]` and `.measures: Dict[str, FieldMetadata]` fields, and a custom `.get(column: str) -> Optional[FieldMetadata]` method (lines 76-78) that looks up a column key across both dicts. `SlayerResponse.attributes` is always a `ResponseAttributes` instance (never None) because it uses `default_factory=ResponseAttributes`. Do not flag `attrs.get(col_name)` as a crash — it is a valid, intentional API call.
Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 46
File: tests/test_sql_generator.py:3087-3103
Timestamp: 2026-04-28T11:23:09.032Z
Learning: In `tests/test_sql_generator.py` and the slayer codebase (Python), there are two distinct naming layers for `*:count`:
1. **User-facing result column name** (what CLAUDE.md documents): `*:count → orders.count` (star-colon prefix stripped for display).
2. **Internal `EnrichedMeasure.alias`** (used inside the enrichment pipeline and SQL generator): `orders._count` (leading underscore retained). Documented at `slayer/sql/generator.py` lines ~1003-1011 and `enrichment.py` line ~339.
Do NOT flag an assertion of `orders._count` as wrong when checking `EnrichedMeasure.alias` — it is the correct internal alias. Only `orders.count` applies to the user-visible output column name.
Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 61
File: slayer/engine/ingestion.py:508-545
Timestamp: 2026-04-29T13:04:25.377Z
Learning: Applies to **/ingestion.py (MotleyAI/slayer): In v2, auto-ingestion leaves `SlayerModel.measures` (named formula library) empty. All introspected columns are emitted as `Column` entries on `SlayerModel.columns`. Named formula measures are an opt-in library that users populate themselves — not auto-generated. The v1 behavior of generating one `Measure` per non-ID column no longer exists. There is an explicit test `test_orders_has_no_named_measures_after_ingest` in `tests/integration/test_integration_duckdb.py` asserting `orders.measures == []` after ingestion.
Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/*.py : Use bare column names for Dimension/measure SQL (e.g., `"amount"`); use `model_name.column_name` for complex expressions (e.g., `"orders.amount * orders.quantity"`)
Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/*.py : Treat measures and aggregations as separate concepts: measures are named row-level SQL expressions; aggregations are specified at query time using colon syntax (e.g., `"revenue:sum"`, `"*:count"`)
Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 22
File: slayer/core/models.py:138-143
Timestamp: 2026-04-10T10:45:29.437Z
Learning: In MotleyAI/slayer, the `Measure.type` (deprecated DataType) field has been intentionally and permanently removed from the `Measure` model in `slayer/core/models.py`. The project is early-stage and the maintainer (ZmeiGorynych) opted for a clean break — no backward-compatibility shim or DeprecationWarning path is desired. Do not flag the absence of `Measure.type` as a backward-compatibility issue.
Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 26
File: slayer/engine/ingestion.py:493-500
Timestamp: 2026-04-13T08:02:30.828Z
Learning: Applies to **/slayer/**/*ingest*.py : The column renaming rule mapping a physical column named "count" to measure name "count_value" is deprecated and no longer applied. Auto-ingestion emits measures using the raw column name as-is (except `_count` which is still renamed to `count_col`). Do not flag the absence of "count" → "count_value" renaming as a bug.
Learnt from: AivanF
Repo: MotleyAI/slayer PR: 25
File: docs/getting-started/mcp.md:54-54
Timestamp: 2026-04-10T12:15:07.864Z
Learning: When reviewing MotleyAI/slayer documentation (e.g., docs/getting-started/mcp.md), do not apply the JSON/dict syntax rule to illustrative/prose descriptions of an agent’s actions. Specifically, if the text uses conversational or narrative formatting (such as blockquoted lines like “**Agent:** *calls `query(...)`*”), treat it as descriptive prose rather than a real code example/snippet. Only enforce the JSON/dict syntax rule within actual code blocks or real code snippets; keyword-argument-looking syntax in prose should not be flagged as a violation.
- Notebooks: replace v1 ``Dimension``/``Measure`` constructors and
``model.dimensions`` / ``model.measures`` (row-level) accesses with the
v2 ``Column`` and ``model.columns`` shape across:
- docs/examples/02_sql_vs_dsl/sql_vs_dsl_nb.ipynb
- docs/examples/03_auto_ingest/auto_ingest_nb.ipynb
- docs/examples/05_joins/joins_nb.ipynb
- docs/examples/07_aggregations/aggregations_nb.ipynb
Notebook outputs are cleared so the executor regenerates them on next
run, and explanatory comments describe the new schema (column-level
``allowed_aggregations``, type-default eligibility, ``measures`` as
named-formula library, primary-key restriction to count/count_distinct).
- Postgres rollup-ingestion tests: replace v1-shape assertions
("amount not in dim_names", "amount in measure_names") with v2-shape
assertions (every non-joined column lives on ``columns``;
``model.measures`` is empty after auto-ingest because it's now the
named-formula library, not the row-level list). Same fix already
applied to the duckdb suite.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
docs/examples/07_aggregations/aggregations_nb.ipynb (1)
412-447: Consider exercising the custom aggregation here.This registers
weighted_avg_price, but the notebook never actually runs it. A tiny query/output would make the example self-validating.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/examples/07_aggregations/aggregations_nb.ipynb` around lines 412 - 447, Add a short example that actually exercises the registered Aggregation named "weighted_avg_price": after the items_model definition, build and execute a small query that selects the weighted_avg_price aggregation (e.g., reference items_model and its aggregation by name or items_model.aggregations[0]) against the jaffle_shop/order_items table and print the result so the notebook validates the custom aggregation (use the project’s usual model query/execute API such as items_model.select / items_model.query / execute_query — whichever method your codebase uses to run model queries).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@docs/examples/07_aggregations/aggregations_nb.ipynb`:
- Around line 412-447: Add a short example that actually exercises the
registered Aggregation named "weighted_avg_price": after the items_model
definition, build and execute a small query that selects the weighted_avg_price
aggregation (e.g., reference items_model and its aggregation by name or
items_model.aggregations[0]) against the jaffle_shop/order_items table and print
the result so the notebook validates the custom aggregation (use the project’s
usual model query/execute API such as items_model.select / items_model.query /
execute_query — whichever method your codebase uses to run model queries).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3cccf4f7-0803-41f4-916b-2b8973176d1c
📒 Files selected for processing (5)
docs/examples/02_sql_vs_dsl/sql_vs_dsl_nb.ipynbdocs/examples/03_auto_ingest/auto_ingest_nb.ipynbdocs/examples/05_joins/joins_nb.ipynbdocs/examples/07_aggregations/aggregations_nb.ipynbtests/integration/test_integration_postgres.py
Fixes 7 CodeRabbit comments on #61 plus a parallel doc sweep. The REST QueryRequest still declared the v1 `fields` key, so any client following the new docs would have measures silently stripped by FastAPI before SlayerQuery's v1→v2 migration ever ran. Rename to `measures` and set extra="allow" on the request model so legacy `fields` payloads still flow through to the existing migration. Sweep `*:count` and `measures` usage across all docs and the SlayerQuery docstring; fix ModelExtension docs that still showed `dimensions` for what is now `columns`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes 3 of 4 CodeRabbit test-fix comments on #61. The fourth (test_mcp_server.py:408) was a false positive — the inspect_model JSON path unconditionally emits `filter` when show_sql=True, so the loop passes as written. Also fixes a real API bug surfaced by the test_api_server review: GET /models/{name} was filtering legacy `dimensions`/`measures` keys that no longer exist on v2 SlayerModel and ignoring the new `columns` list, so hidden columns were leaking through. Switch the filter to `columns`. The test that previously masked this with a "visible ones present" check now strictly asserts hidden columns are absent. Other test fixes: drop spurious `status` Column appends in test_mcp_inspect.py — the env fixture already declares `status` as a DataType.STRING column, so adding a second NUMBER copy made the model internally contradictory. Pick `revenue` by name in test_dbt_converter instead of by index — `_convert_semantic_model` prepends entity/PK columns, so columns[0] was the PK, not the converted measure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/getting-started/python.md`:
- Around line 57-61: The example uses the async SlayerQueryEngine.execute()
without awaiting, causing a coroutine to be shown in synchronous docs; update
the snippet(s) to call the synchronous API instead
(SlayerQueryEngine.execute_sync() or engine.execute_sync(...)) wherever
SlayerQueryEngine.execute() is used in synchronous examples (e.g., the shown
query and the other occurrence at line ~143) so the code runs correctly in
CLI/notebook/script contexts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 36beb557-8723-46a0-8bfd-8e41c8062f0f
📒 Files selected for processing (12)
docs/concepts/models.mddocs/concepts/queries.mddocs/getting-started/python.mddocs/getting-started/rest-api.mddocs/interfaces/cli.mddocs/interfaces/mcp.mddocs/reference/cli.mddocs/reference/mcp.mddocs/reference/rest-api.mdslayer/api/server.pyslayer/core/query.pytests/test_api_server.py
✅ Files skipped from review due to trivial changes (6)
- docs/interfaces/mcp.md
- docs/reference/mcp.md
- docs/interfaces/cli.md
- docs/reference/cli.md
- docs/reference/rest-api.md
- docs/getting-started/rest-api.md
🚧 Files skipped from review as they are similar to previous changes (2)
- docs/concepts/models.md
- docs/concepts/queries.md
Two of three CodeRabbit code-bug comments on #61 were valid; the third (engine/ingestion.py:545 "drops the auto-generated measure library") was applying v1 expectations — v2 deliberately leaves measures empty after ingestion, as test_orders_has_no_named_measures_after_ingest documents. dbt converter: when a semantic model declares both a primary entity in `entities` and the same column via the `primary_entity` shorthand, the shorthand block was checking against a stale `entity_col_names` set captured before the entity loop ran, so the PK column was appended a second time. Update the set after each append. v2 migration detector: `_looks_like_v1_model` subscripted `raw_measures[0]` after only checking truthiness, so a malformed payload like `{"measures": {"name": "revenue"}}` crashed with `KeyError: 0` inside the detector instead of falling through to Pydantic validation. Guard with isinstance before subscripting. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/integration/test_mcp_inspect.py (1)
423-466:⚠️ Potential issue | 🟠 MajorUpdate query payloads to use
measuresinstead of legacyfieldskey.These tests currently validate the v1-to-v2 migration path for
SlayerQuery, which converts"fields"→"measures"automatically. Tests should exercise the v2 contract directly rather than rely on backward-compatibility migration, making the test intent clearer and avoiding brittleness if migrations change.🔧 Proposed fix
def _run(self, env, formula: str) -> None: from slayer.core.query import SlayerQuery q = SlayerQuery.model_validate({ "source_model": "orders", - "fields": [{"formula": formula}], + "measures": [{"formula": formula}], }) await env["engine"].execute(query=q) @@ q = SlayerQuery.model_validate({ "source_model": "orders", - "fields": [{"formula": "status:min"}, {"formula": "status:max"}], + "measures": [{"formula": "status:min"}, {"formula": "status:max"}], }) @@ q = SlayerQuery.model_validate({ "source_model": "orders", - "fields": [{"formula": "status:count"}, {"formula": "status:count_distinct"}], + "measures": [{"formula": "status:count"}, {"formula": "status:count_distinct"}], }) @@ q = SlayerQuery.model_validate({ "source_model": "orders", - "fields": [{"formula": "amount:sum"}, {"formula": "amount:avg"}], + "measures": [{"formula": "amount:sum"}, {"formula": "amount:avg"}], })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_mcp_inspect.py` around lines 423 - 466, The test payloads are using the legacy "fields" key instead of the v2 "measures" key; update all usages in this file (notably inside the helper _run() and in the SlayerQuery.model_validate calls in tests like test_numeric_only_aggregations_rejected_on_string, test_min_max_allowed_on_string, test_count_and_count_distinct_allowed_on_string, and test_numeric_aggregations_allowed_on_numeric_measure) to replace "fields": [...] with "measures": [...], keeping the same inner formula objects so the tests exercise the v2 contract directly.tests/test_api_server.py (1)
168-176:⚠️ Potential issue | 🟠 MajorMake the test async and await the model save.
storage.save_model()is async, but the test calls it without awaiting. This creates a coroutine that is immediately discarded—the model is never persisted. Per coding guidelines, convert the test toasync defand await the save, or seed the fixture through the API instead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_api_server.py` around lines 168 - 176, The test test_query_missing_datasource currently calls the async method storage.save_model without awaiting it, so convert the test function to async def test_query_missing_datasource(...) and await storage.save_model(...) before calling client.post (or alternatively seed the model via the API); ensure you still use the TestClient to POST to "/query" and assert resp.status_code == 400 after the awaited save completes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@slayer/dbt/converter.py`:
- Around line 496-506: When appending the new filtered Column (filtered_column)
to slayer_model.columns ensure you first detect name collisions with existing
columns: iterate slayer_model.columns and check for any column.name ==
metric.name (or same case-normalized key) and handle the collision by either
skipping append, replacing the existing column, or raising/logging an error
depending on the intended behavior; update the logic around Column creation in
converter.py (the filtered_column, metric.name, and slayer_model.columns sites)
so duplicate names are not blindly appended and ambiguous formula resolution is
prevented.
---
Outside diff comments:
In `@tests/integration/test_mcp_inspect.py`:
- Around line 423-466: The test payloads are using the legacy "fields" key
instead of the v2 "measures" key; update all usages in this file (notably inside
the helper _run() and in the SlayerQuery.model_validate calls in tests like
test_numeric_only_aggregations_rejected_on_string,
test_min_max_allowed_on_string, test_count_and_count_distinct_allowed_on_string,
and test_numeric_aggregations_allowed_on_numeric_measure) to replace "fields":
[...] with "measures": [...], keeping the same inner formula objects so the
tests exercise the v2 contract directly.
In `@tests/test_api_server.py`:
- Around line 168-176: The test test_query_missing_datasource currently calls
the async method storage.save_model without awaiting it, so convert the test
function to async def test_query_missing_datasource(...) and await
storage.save_model(...) before calling client.post (or alternatively seed the
model via the API); ensure you still use the TestClient to POST to "/query" and
assert resp.status_code == 400 after the awaited save completes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2e2bc7f5-3913-45c1-b695-a7a9085a35ff
📒 Files selected for processing (7)
slayer/api/server.pyslayer/dbt/converter.pyslayer/storage/v2_migration.pytests/integration/test_mcp_inspect.pytests/test_api_server.pytests/test_dbt_converter.pytests/test_migrations.py
🚧 Files skipped from review as they are similar to previous changes (2)
- slayer/storage/v2_migration.py
- tests/test_migrations.py
Two of three CodeRabbit aggregation-eligibility comments were valid; the third (enums.py:179 "narrow defaults") was applying invented constraints not in the PR description, and would have removed real useful aggregations. The validator at engine/enrichment.py was only blocking numeric-only aggregations on string columns, leaving the PK rule and the type-default rule unenforced. Per CLAUDE.md "Primary-key columns are always restricted to count/count_distinct aggregations regardless of type" — but the code didn't actually do that. Replace the partial check with the documented three-tier logic: PK rule wins (always), then explicit allowed_aggregations whitelist, then type defaults. Custom model-level aggregations bypass the type-default check (they're intentional extensions). The type-probe query in query_engine and the min/max profiler in mcp/server were both using `:max` blanket-style across all columns; under the new strict validator that broke on BOOLEAN columns since max/min weren't in BOOLEAN's defaults. Two fixes: - _build_type_probe_query now picks an aggregation per column from its effective allowed set (max preferred, fallback to first allowed). - BOOLEAN defaults extended to include min/max/first/last (False<True ordering makes these meaningful and the probe path needs them). Doc: models.md:90 now correctly states allowed_aggregations overrides the type-default map but NOT the primary-key restriction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
docs/concepts/models.md (1)
250-258:⚠️ Potential issue | 🟡 MinorUse JSON/dict syntax for the query object in this docs example.
This snippet still uses
SlayerQuery(...)constructor syntax; docs examples should keep query objects in portable JSON/dict form.📝 Suggested doc update
engine.create_model_from_query( - query=SlayerQuery( - source_model="orders", - time_dimensions=[...], - measures=["*:count", "amount:sum"], - ), + query={ + "source_model": "orders", + "time_dimensions": [...], + "measures": ["*:count", "amount:sum"], + }, name="monthly_summary", )As per coding guidelines, use JSON/dict syntax for all query objects in docs and examples — not Python class constructors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/concepts/models.md` around lines 250 - 258, The docs example uses the Python constructor SlayerQuery in the call to engine.create_model_from_query; replace the SlayerQuery(...) constructor form with a JSON/dict literal for the query argument (e.g., a dict with keys source_model, time_dimensions, measures) so the example is language-agnostic and portable; update the call site engine.create_model_from_query(query=..., name="monthly_summary") to pass that dict instead of SlayerQuery and ensure field names match the SlayerQuery schema (source_model, time_dimensions, measures).slayer/engine/enrichment.py (1)
571-630:⚠️ Potential issue | 🟠 MajorNamed saved measures are still unqueryable here.
SlayerQueryEngine._auto_move_fields_to_dimensions()now deliberately keeps bare formulas whenmodel.get_measure(name)exists, but this loop never expands that saved measure before callingparse_formula(). A query likemeasures=[{"formula": "gross_margin"}]will still fall through to_ensure_measure_from_spec()and fail with “Bare measure name ... is not valid”, so the newSlayerModel.measureslibrary cannot actually be used.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/engine/enrichment.py` around lines 571 - 630, The loop over query.measures parses qfield.formula directly and never expands bare saved-measure names; fix by checking before parse_formula whether qfield.formula is a bare name that corresponds to a saved measure (call model.get_measure(qfield.formula)), and if so replace/expand it into the saved measure's formula/spec (or parse the saved measure's formula) so the subsequent logic (including _ensure_aggregated_measure, _flatten_spec, and label handling) operates on the resolved spec; ensure you preserve qfield.label when substituting and keep the existing calls to parse_formula, _ensure_aggregated_measure, and _flatten_spec unchanged for non-saved formulas.slayer/engine/query_engine.py (1)
389-395:⚠️ Potential issue | 🟠 MajorUse
SlayerModel.model_validate()instead ofmodel_copy()to ensure validators run.In Pydantic v2,
model_copy()bypasses all field and model validators. The collision check inSlayerModel._validate_column_measure_disjoint()will not run, allowing aModelExtensionto inject a column that collides with an existing measure name (or vice versa). This invalid state only fails later during enrichment rather than at model construction.Proposed fix
- return base.model_copy( - update={ - "columns": list(base.columns) + extra_cols, - "measures": list(base.measures) + extra_measures, - "joins": list(base.joins) + extra_joins, - } - ) + return SlayerModel.model_validate({ + **base.model_dump(), + "columns": [c.model_dump() for c in [*base.columns, *extra_cols]], + "measures": [m.model_dump() for m in [*base.measures, *extra_measures]], + "joins": [j.model_dump() for j in [*base.joins, *extra_joins]], + })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/engine/query_engine.py` around lines 389 - 395, Replace the call to base.model_copy(...) with base.model_validate(...) so Pydantic v2 validators run and the collision check in SlayerModel._validate_column_measure_disjoint() executes at model construction; call model_validate with an update dict containing the same keys ("columns", "measures", "joins") built from list(base.columns) + extra_cols, list(base.measures) + extra_measures, and list(base.joins) + extra_joins respectively so ModelExtension cannot inject colliding names undetected.
♻️ Duplicate comments (1)
slayer/engine/query_engine.py (1)
805-826:⚠️ Potential issue | 🟠 MajorCross-model
column:aggstill bypasses the new eligibility gates.Local refs now enforce primary-key restriction, explicit
allowed_aggregations, andDEFAULT_AGGREGATIONS_BY_TYPEinslayer/engine/enrichment.py, but this branch still only rejects numeric-only aggregations on strings.customers.id:sum,customers.created_at:avg, or a dotted column with an explicit whitelist can still slip through here.🔧 Suggested fix
- if ( - aggregation_name - and aggregation_name in NUMERIC_ONLY_AGGREGATIONS - and str(col_def.type) == "string" - ): + if col_def.primary_key: + allowed = PRIMARY_KEY_AGGREGATIONS + elif col_def.allowed_aggregations is not None: + allowed = frozenset(col_def.allowed_aggregations) + else: + is_custom_agg = target_model.get_aggregation(aggregation_name) is not None + allowed = ( + frozenset({aggregation_name}) + if is_custom_agg + else DEFAULT_AGGREGATIONS_BY_TYPE.get(col_def.type, frozenset()) + ) + if aggregation_name not in allowed: raise ValueError( - f"Aggregation '{aggregation_name}' is not applicable to " - f"string column '{measure_name}' in model '{target_model_name}'." + f"Aggregation '{aggregation_name}' not allowed for column " + f"'{measure_name}' in model '{target_model_name}'. Allowed: {sorted(allowed)}" )Also add
PRIMARY_KEY_AGGREGATIONSto the enum import at the top of the file.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/engine/query_engine.py` around lines 805 - 826, The cross-model column lookup branch currently only blocks numeric-only aggregations on string columns; update it to enforce the same eligibility gates used for local refs by checking PRIMARY_KEY_AGGREGATIONS, explicit allowed_aggregations on the column, and DEFAULT_AGGREGATIONS_BY_TYPE based on column type before accepting a dotted measure; specifically, when resolving measure_name via target_model.get_column(...) in query_engine.py also import and consult PRIMARY_KEY_AGGREGATIONS and DEFAULT_AGGREGATIONS_BY_TYPE (and any column.allowed_aggregations) the same way enrichment.py does, reject aggregations not permitted by those rules (e.g., primary-key-only aggregations on non-PK columns, aggregations not in allowed_aggregations, and type-disallowed aggregations) and raise the same ValueError messages.
🧹 Nitpick comments (1)
tests/integration/test_mcp_inspect.py (1)
479-493: Usemeasureshere so this new test covers the v2 query API.These assertions currently go through the v1
fieldsmigration alias, so they won’t catch a regression in directSlayerQuery.measuresparsing. Switching the payloads tomeasuresmakes the new primary-key contract test hit the actual public surface.🧪 Suggested change
- "fields": [{"formula": "id:sum"}], + "measures": [{"formula": "id:sum"}], ... - "fields": [{"formula": "id:count"}], + "measures": [{"formula": "id:count"}],🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_mcp_inspect.py` around lines 479 - 493, The test uses the v1 alias "fields" instead of the v2 public API key "measures", so update both query payloads in test_count_on_pk_allowed (and the previous test that asserts ValueError) to use "measures" instead of "fields" so the assertions exercise SlayerQuery.measures parsing; locate the SlayerQuery.model_validate calls in test_count_on_pk_allowed and the preceding test and replace the "fields": [...] entries with "measures": [...] preserving the same formula objects (e.g., {"formula": "id:count"} and {"formula": "id:sum"}) and leave the await env["engine"].execute(query=q) call unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/concepts/models.md`:
- Around line 3-6: The page mixes v1 and v2 model schemas; update the docs to be
consistent with v2 by removing or replacing any v1-era references (e.g.,
"version: 1", "dimensions", and measure examples using raw `sql`) with v2
concepts: use `columns: List[Column]` and formula-based `measures:
List[ModelMeasure]` as used by SlayerModel, update examples to show `columns`
entries (with types, allowed_aggregations, filters/PK flags) and `measures` as
named formulas (not SQL snippets), and ensure any mention of `version` reflects
v2 (or is removed if not applicable).
In `@slayer/engine/query_engine.py`:
- Around line 288-292: The code currently uses a truthiness check on
c.allowed_aggregations which causes an explicit empty list to fall back to
DEFAULT_AGGREGATIONS_BY_TYPE; change the conditional to check for None
explicitly (e.g., if c.allowed_aggregations is not None) so that an explicit
empty whitelist remains empty, keeping allowed = list(c.allowed_aggregations)
when provided and only using DEFAULT_AGGREGATIONS_BY_TYPE when
allowed_aggregations is actually unset; update the block around
c.allowed_aggregations, allowed, and DEFAULT_AGGREGATIONS_BY_TYPE in
query_engine.py to implement this check so downstream functions like
get_column_types() behave correctly.
---
Outside diff comments:
In `@docs/concepts/models.md`:
- Around line 250-258: The docs example uses the Python constructor SlayerQuery
in the call to engine.create_model_from_query; replace the SlayerQuery(...)
constructor form with a JSON/dict literal for the query argument (e.g., a dict
with keys source_model, time_dimensions, measures) so the example is
language-agnostic and portable; update the call site
engine.create_model_from_query(query=..., name="monthly_summary") to pass that
dict instead of SlayerQuery and ensure field names match the SlayerQuery schema
(source_model, time_dimensions, measures).
In `@slayer/engine/enrichment.py`:
- Around line 571-630: The loop over query.measures parses qfield.formula
directly and never expands bare saved-measure names; fix by checking before
parse_formula whether qfield.formula is a bare name that corresponds to a saved
measure (call model.get_measure(qfield.formula)), and if so replace/expand it
into the saved measure's formula/spec (or parse the saved measure's formula) so
the subsequent logic (including _ensure_aggregated_measure, _flatten_spec, and
label handling) operates on the resolved spec; ensure you preserve qfield.label
when substituting and keep the existing calls to parse_formula,
_ensure_aggregated_measure, and _flatten_spec unchanged for non-saved formulas.
In `@slayer/engine/query_engine.py`:
- Around line 389-395: Replace the call to base.model_copy(...) with
base.model_validate(...) so Pydantic v2 validators run and the collision check
in SlayerModel._validate_column_measure_disjoint() executes at model
construction; call model_validate with an update dict containing the same keys
("columns", "measures", "joins") built from list(base.columns) + extra_cols,
list(base.measures) + extra_measures, and list(base.joins) + extra_joins
respectively so ModelExtension cannot inject colliding names undetected.
---
Duplicate comments:
In `@slayer/engine/query_engine.py`:
- Around line 805-826: The cross-model column lookup branch currently only
blocks numeric-only aggregations on string columns; update it to enforce the
same eligibility gates used for local refs by checking PRIMARY_KEY_AGGREGATIONS,
explicit allowed_aggregations on the column, and DEFAULT_AGGREGATIONS_BY_TYPE
based on column type before accepting a dotted measure; specifically, when
resolving measure_name via target_model.get_column(...) in query_engine.py also
import and consult PRIMARY_KEY_AGGREGATIONS and DEFAULT_AGGREGATIONS_BY_TYPE
(and any column.allowed_aggregations) the same way enrichment.py does, reject
aggregations not permitted by those rules (e.g., primary-key-only aggregations
on non-PK columns, aggregations not in allowed_aggregations, and type-disallowed
aggregations) and raise the same ValueError messages.
---
Nitpick comments:
In `@tests/integration/test_mcp_inspect.py`:
- Around line 479-493: The test uses the v1 alias "fields" instead of the v2
public API key "measures", so update both query payloads in
test_count_on_pk_allowed (and the previous test that asserts ValueError) to use
"measures" instead of "fields" so the assertions exercise SlayerQuery.measures
parsing; locate the SlayerQuery.model_validate calls in test_count_on_pk_allowed
and the preceding test and replace the "fields": [...] entries with "measures":
[...] preserving the same formula objects (e.g., {"formula": "id:count"} and
{"formula": "id:sum"}) and leave the await env["engine"].execute(query=q) call
unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c588135c-f03b-41ba-83e7-cdb216178a25
📒 Files selected for processing (5)
docs/concepts/models.mdslayer/core/enums.pyslayer/engine/enrichment.pyslayer/engine/query_engine.pytests/integration/test_mcp_inspect.py
✅ Files skipped from review due to trivial changes (1)
- slayer/core/enums.py
- query_engine: distinguish empty allowed_aggregations from None when building the type probe, so an explicit `[]` whitelist no longer falls back to type defaults - dbt converter: detect column-name collisions before appending a filtered metric, emit a ConversionWarning and skip instead of silently producing duplicate columns - docs/getting-started/python.md: use execute_sync() in synchronous examples and replace v1 model.dimensions with model.columns - docs/concepts/models.md: bump version 1→2 throughout, drop the v1 dimensions row from the model-fields reference, replace it with columns, and rewrite the measures row as the v2 named-formula library Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The S7 PR (#60) merged before S3 (#61) but added integration tests for median/percentile that used v1 syntax — Field(formula=...) and fields= on SlayerQuery. The S3 merge into main brought v2 alongside but didn't update these S7 tests, leaving the merged main with code that fails ruff check (F821: undefined name 'Field' — never imported). Until now egor/S3a was based on origin-S3 (pre-merge), so locally the v1 lines didn't exist. Merged origin/main in (commit 72b0ff4) so the file now reflects the post-merge tree CI sees, then: - tests/integration/test_integration.py: replace 6 Field(formula=...) uses with ModelMeasure(formula=...). Switch fields=[...] to measures=[...] on the same SlayerQuery calls. - tests/integration/test_integration_duckdb.py and tests/integration/test_integration_postgres.py: switch fields=[...] to measures=[...] in the median/percentile tests added by S7. Their dict-form measure literals already worked via the v1→v2 migration but the kwarg name itself was still v1. Verified locally: ruff clean, 5/5 SQLite median/percentile tests pass. This is also a partial down-payment on Major 7 (eradicate `fields`) in the bigger codex follow-up list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #52.
Summary
dimensionsandmeasureslists onSlayerModelinto a singlecolumns: List[Column]. Each column carries the union of fields and is usable either as a group-by dim or as an aggregation source, gated byallowed_aggregations, type-default eligibility, and a built-in PK rule (count/count_distinct only).SlayerModel.measuresto holdModelMeasureformulas (the v1Query.fieldsshape) — a library of named saved metrics that queries can reference by bare name.SlayerQuery.fields→measuresfor consistency.SlayerModel/SlayerQueryschema to v2 and add a v1→v2 read-time migration (slayer/storage/v2_migration.py).What the migration does
dimensions+ v1measures→ v2columns(preservingprimary_key,allowed_aggregations,filter,label,description,format,hidden,meta).sql/filter/allowed_aggregationscarry over;typedefaults tonumber(since auto-ingest only emitted Measures for numeric-ish columns); the deprecatedtype: sum/avg/...alias becomesallowed_aggregations: [...]and the field is dropped.SlayerQuery.fields→measuresis a pure key rename; nested inlineModelExtensionis recursively migrated.Out of scope
{model_name, query_name}cross-entity uniqueness validator from the issue is deferred to S2 (NamedQuery storage isn't on this branch yet).Columnobjects.Test plan
poetry run pytest— 959 unit tests pass (incl. 33 migration tests covering the dim+measure→columns merge, fields→measures rename, legacy-type aliasing, name-collision rejection, YAML/SQLite round-trips, and forward-version pass-through).poetry run pytest tests/integration/test_integration.py -m integration— 54/54 SQLite integration tests pass.poetry run pytest tests/integration/test_integration_duckdb.py -m integration— 22/22 DuckDB integration tests pass.pg_ctlfailed to spawn the temp Postgres in this run; rerun on CI.docs/examples/still use v1 patterns (e.g.fields=[...]); update separately.poetry run ruff check slayer/ tests/— clean.🤖 Generated with Claude Code
Summary by CodeRabbit
Breaking Changes
New Features
Documentation