Skip to content

S3: merge dimensions+measures into unified columns (#52)#61

Merged
ZmeiGorynych merged 7 commits into
mainfrom
S3
Apr 29, 2026
Merged

S3: merge dimensions+measures into unified columns (#52)#61
ZmeiGorynych merged 7 commits into
mainfrom
S3

Conversation

@ZmeiGorynych
Copy link
Copy Markdown
Member

@ZmeiGorynych ZmeiGorynych commented Apr 29, 2026

Closes #52.

Summary

  • Collapse v1's separate dimensions and measures lists on SlayerModel into a single columns: List[Column]. Each column carries the union of fields and is usable either as a group-by dim or as an aggregation source, gated by allowed_aggregations, type-default eligibility, and a built-in PK rule (count/count_distinct only).
  • Repurpose SlayerModel.measures to hold ModelMeasure formulas (the v1 Query.fields shape) — a library of named saved metrics that queries can reference by bare name.
  • Rename SlayerQuery.fieldsmeasures for consistency.
  • Bump SlayerModel/SlayerQuery schema to v2 and add a v1→v2 read-time migration (slayer/storage/v2_migration.py).

What the migration does

  • v1 dimensions + v1 measures → v2 columns (preserving primary_key, allowed_aggregations, filter, label, description, format, hidden, meta).
  • v1 Measures lose nothing: sql/filter/allowed_aggregations carry over; type defaults to number (since auto-ingest only emitted Measures for numeric-ish columns); the deprecated type: sum/avg/... alias becomes allowed_aggregations: [...] and the field is dropped.
  • Detection skips fresh v2 constructions so no caller payload gets clobbered.
  • SlayerQuery.fieldsmeasures is a pure key rename; nested inline ModelExtension is recursively migrated.
  • Old YAML and SQLite stores upgrade transparently — round-trip tests cover both backends.

Out of scope

  • The {model_name, query_name} cross-entity uniqueness validator from the issue is deferred to S2 (NamedQuery storage isn't on this branch yet).
  • Auto-ingest strategy revisits stay in S4 — this PR keeps the existing heuristic, just emits Column objects.

Test plan

  • poetry run pytest — 959 unit tests pass (incl. 33 migration tests covering the dim+measure→columns merge, fields→measures rename, legacy-type aliasing, name-collision rejection, YAML/SQLite round-trips, and forward-version pass-through).
  • poetry run pytest tests/integration/test_integration.py -m integration — 54/54 SQLite integration tests pass.
  • poetry run pytest tests/integration/test_integration_duckdb.py -m integration — 22/22 DuckDB integration tests pass.
  • Postgres integration tests — local pg_ctl failed to spawn the temp Postgres in this run; rerun on CI.
  • Notebook tests — 5 notebooks under docs/examples/ still use v1 patterns (e.g. fields=[...]); update separately.
  • poetry run ruff check slayer/ tests/ — clean.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Breaking Changes

    • "fields" → "measures" in queries; models use a unified "columns" schema; storage schema bumped to v2 with automatic v1→v2 migration; CLI/API/tooling now report columns instead of legacy dims/measures.
  • New Features

    • Per-column aggregation whitelists and column-level filters.
    • Named formula measures (referencable).
    • Added percentile aggregation and collision-avoidance for count names.
  • Documentation

    • All docs, examples, CLI/REST/Python snippets updated to the new schema.

…rmulas (#52)

Collapse the dim/measure split into a single SlayerModel.columns list of
Column objects. Repurpose SlayerModel.measures to hold ModelMeasure named
formulas (the shape v1 had on Query.fields). Rename SlayerQuery.fields →
measures for consistency.

A column carries the union of v1 Dimension+Measure metadata (type,
primary_key, allowed_aggregations, filter, format, label, description,
hidden, meta). Aggregation eligibility now flows from data type with
type→aggregations defaults in slayer/core/enums.py
(DEFAULT_AGGREGATIONS_BY_TYPE), an explicit allowed_aggregations whitelist
overriding it, and a built-in PRIMARY_KEY_AGGREGATIONS rule restricting PKs
to count/count_distinct.

The SlayerModel disjoint-name validator forbids overlap between column
names and measure-formula names so query name resolution is unambiguous.

A v1→v2 read-time migration (slayer/storage/v2_migration.py) handles all
existing storage: merges old dimensions+measures into columns, defaults old
Measure types to NUMBER, drops the legacy 'type: sum' alias (carrying
intent into allowed_aggregations), renames Query.fields→measures, and
recurses into inline ModelExtension blocks. The detection heuristic skips
fresh v2 dicts so it never clobbers caller payloads.

Engine, MCP/API/CLI surfaces, dbt converter, and auto-ingestion updated.
edit_model/create_model now take 'columns' and 'measures' (formulas)
parameters; inspect_model emits a unified Columns table plus a Measures
(formulas) table.

The cross-entity {model_name, query_name} uniqueness validator from the
issue is deferred to S2 (NamedQuery storage isn't merged on this branch).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 29, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Merge model dimensions + measures into a unified columns schema, repurpose model-level measures as named/formula objects, rename query fieldsmeasures, and apply schema, engine, ingestion, MCP/API, storage migrations, docs, CLI, and tests updates to v2.

Changes

Cohort / File(s) Summary
Core schema & query types
slayer/core/models.py, slayer/core/enums.py, slayer/core/query.py
Introduce Column and ModelMeasure, bump SlayerModel/SlayerQuery to version 2, replace dimensions with columns, change query fieldsmeasures, and add aggregation-eligibility/PK restriction constants.
Engine, enrichment & ingestion
slayer/engine/query_engine.py, slayer/engine/enrichment.py, slayer/engine/ingestion.py
Rework probing/enrichment/ingestion to operate on columns + ModelMeasure, enforce allowed_aggregations and PK aggregation rules, adjust format inference and auto-ingest naming.
DBT conversion
slayer/dbt/converter.py
Convert dbt outputs into unified columns, consolidate measures into columns, and resolve measure formulas by scanning columns.
Storage & migrations
slayer/storage/migrations.py, slayer/storage/v2_migration.py
Bump persisted schema versions to v2 and add v1→v2 migrations that merge dimensions/measurescolumns, convert legacy measure aliases, rename query fieldsmeasures, and detect/raise name collisions.
MCP / API / CLI surface
slayer/mcp/server.py, slayer/api/server.py, slayer/cli.py
MCP sampling/profiling/inspect and API endpoints now expose/use columns/measures; QueryRequest accepts measures; CLI status prints column counts.
Docs & examples
docs/**, docs/examples/**, docs/getting-started/**, docs/interfaces/**, docs/reference/**
Mass-update examples and narrative from fieldsmeasures, document unified columns model and measure-as-formula semantics, add percentile aggregation, and update auto-ingest/collision wording.
Tests & notebooks
tests/**, docs/examples/*_nb.ipynb
Comprehensive test updates to use SlayerModel(columns=[...]), ModelMeasure for formulas and SlayerQuery.measures, added migration tests for v1→v2, and notebook cells updated to inspect columns.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Server
    participant Storage
    participant Migrator
    Client->>Server: POST /query or create model (payload uses measures/columns)
    Server->>Storage: Read/persist model or query
    Storage->>Migrator: Detect version < 2 (has dimensions/fields/measures v1)
    Migrator->>Storage: Transform v1 {dimensions,measures,fields} -> v2 {columns,measures}
    Storage-->>Server: Return v2-validated model/query
    Server->>Server: Enrich/validate using columns + ModelMeasure
    Server-->>Client: Respond with enriched/validated v2 model or query result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related issues

  • #48 — Auto-ingest behavior: changes to slayer/engine/ingestion.py (stop auto-emitting row-level measures and rename count column) align with the auto-ingestion behavior described in that issue.

Possibly related PRs

  • PR #17 — Overlaps on query enrichment/refactor and touches the same enrichment surfaces.
  • PR #26 — Overlaps on numeric formatting propagation through models/ingestion/query responses.
  • PR #15 — Related changes to query coercion/BeforeValidator behavior and fields→measures renaming.

Suggested reviewers

  • AivanF

Poem

🐰 I stitched two lists into one tidy row,
Columns now hop where dims and measures grow.
Named formulas nestle, ready for a chase,
Queries ask for measures, tidy in their place.
Carrots, v2, and a joyful bounding pace! 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 64.03% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'S3: merge dimensions+measures into unified columns (#52)' accurately summarizes the main change: consolidating the v1 dimension/measure split into a unified columns schema.
Linked Issues check ✅ Passed All major coding objectives from issue #52 are met: Column model created, SlayerModel.columns replaces dimensions/measures, SlayerQuery.fields renamed to measures, type-based aggregation eligibility rules implemented, v1→v2 migration added, and all affected codepaths updated.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the schema v1→v2 migration: model definitions, query structures, storage migration, documentation updates, and comprehensive test coverage. No unrelated refactoring or unplanned feature additions detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch S3

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
docs/index.md (1)

68-72: ⚠️ Potential issue | 🟡 Minor

Update the architecture diagram label from fields to measures for v2 consistency.

The top example uses measures, but this diagram still references fields.

Suggested patch
-         SlayerQuery (model, fields, dimensions, filters)
+         SlayerQuery (model, measures, dimensions, filters)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/index.md` around lines 68 - 72, Update the architecture diagram label
inside the ASCII block that currently reads "SlayerQuery (model, fields,
dimensions, filters)" to use "measures" instead of "fields" so it matches v2
terminology; locate the ASCII diagram near the "Agent --> MCP / REST API /
Python SDK" line in docs/index.md and change the token "fields" to "measures"
(i.e., "SlayerQuery (model, measures, dimensions, filters)").
tests/perf/test_bench.py (1)

22-27: ⚠️ Potential issue | 🟠 Major

Use execute_sync() instead of the async execute() method in the benchmark helper.

The _execute() function at line 26 calls engine.execute(query=query), which is an async method. In the synchronous benchmark context (line 167), this returns a coroutine instead of a SlayerResponse, invalidating the timing measurement.

Suggested patch
 def _execute(env: BenchEnv, **query_kwargs) -> SlayerResponse:
     """Execute a query and return the response."""
     engine, _ = env
     query = SlayerQuery(source_model="orders", **query_kwargs)
-    return engine.execute(query=query)
+    return engine.execute_sync(query=query)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/perf/test_bench.py` around lines 22 - 27, The helper _execute currently
calls the async method engine.execute(query=query) and returns a coroutine
instead of a SlayerResponse; change it to call the synchronous API
engine.execute_sync(query=query) so the function actually returns a
SlayerResponse in the synchronous benchmark context, keeping the same signature
(def _execute(env: BenchEnv, **query_kwargs) -> SlayerResponse) and still
constructing the query with SlayerQuery(source_model="orders", **query_kwargs).
slayer/cli.py (1)

51-51: ⚠️ Potential issue | 🟡 Minor

CLI help examples still use deprecated fields query key.

Please update these sample commands to use "measures" so users copy/paste valid v2 payloads.

Suggested patch
-  slayer query '{"source_model": "orders", "fields": [{"formula": "count"}]}'
+  slayer query '{"source_model": "orders", "measures": [{"formula": "count"}]}'
@@
-  slayer query '{"source_model": "orders", "fields": [{"formula": "count"}]}'
+  slayer query '{"source_model": "orders", "measures": [{"formula": "count"}]}'
@@
-  slayer query '{"source_model": "orders", "fields": [{"formula": "count"}]}' --dry-run
+  slayer query '{"source_model": "orders", "measures": [{"formula": "count"}]}' --dry-run

Also applies to: 121-121, 127-127

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/cli.py` at line 51, Update the CLI help examples in slayer/cli.py to
use the v2 payload key "measures" instead of the deprecated "fields": find the
example strings containing '{"source_model": "orders", "fields": ...}' (and the
other occurrences around the same examples) and replace "fields" with "measures"
so the sample commands (e.g., the slayer query example) produce valid v2
payloads.
docs/concepts/formulas.md (1)

19-26: ⚠️ Potential issue | 🟡 Minor

Update remaining fields wording to measures in this section.

The examples now correctly use "measures", but the nearby prose still says formulas go in fields. Please align the text to avoid schema confusion.

Suggested doc patch
-Colon syntax is used everywhere measures appear: in `fields`, in arithmetic expressions, in transform function arguments, and in filters.
+Colon syntax is used everywhere measures appear: in `measures`, in arithmetic expressions, in transform function arguments, and in filters.
@@
-Field formulas define what data columns a query returns. They go in the `fields` parameter:
+Field formulas define what data columns a query returns. They go in the `measures` parameter:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/concepts/formulas.md` around lines 19 - 26, Update the prose that still
refers to `fields` to consistently use `measures`: change the "Field Formulas"
heading and the sentence "Field formulas define what data columns a query
returns. They go in the `fields` parameter:" to use "Measure Formulas" and
"Measure formulas define what data columns a query returns. They go in the
`measures` parameter:" (and replace any other stray `fields` occurrences in this
section), ensuring examples already using "measures" match the surrounding text.
docs/examples/07_aggregations/aggregations.md (1)

9-19: ⚠️ Potential issue | 🟠 Major

Model schema examples still show v1-style measures instead of v2 columns.

These YAML blocks define row-level SQL under measures, which conflicts with the v2 schema where row-level definitions are in columns. As written, users may copy invalid model config.

Suggested doc direction
-measures:
-  - name: revenue
-    sql: amount
+columns:
+  - name: revenue
+    sql: amount
+    type: number

Apply the same change pattern to other model-definition snippets in this file.

Also applies to: 102-112

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/examples/07_aggregations/aggregations.md` around lines 9 - 19, Update
the example YAML model snippets that use the v1 "measures" block to the v2
"columns" schema: replace the top-level "measures" list with a "columns" list
and for each entry change the shape to a column object containing "name", "type:
measure" (or appropriate measure type), and the row-level expression under "sql"
(keeping the existing expressions like amount, unit_price, qty). Do the same
replacement for the other model-definition snippets in this file that still show
"measures" (including the block around the previously noted lines) so examples
conform to the v2 model schema.
tests/integration/test_integration_postgres.py (1)

499-511: ⚠️ Potential issue | 🟠 Major

This assertion still reflects the old dimension/measure split.

With v2, orders.columns is the merged row-level field list, so amount should remain in columns even if ingestion also creates a saved measure for it. Keeping assert "amount" not in dim_names will fail against the intended schema shape.

Suggested test update
-        dim_names = [d.name for d in orders.columns]
+        column_names = [c.name for c in orders.columns]
         # Should have own columns only (no flattened joined dims)
-        assert "id" in dim_names
-        assert "customer_id" in dim_names
-        # Float-like columns (DECIMAL) get measures only, no dimension
-        assert "amount" not in dim_names
+        assert "id" in column_names
+        assert "customer_id" in column_names
+        assert "amount" in column_names
         # Joined dimensions are resolved via join graph, not pre-flattened
-        assert not any("." in name for name in dim_names)
+        assert not any("." in name for name in column_names)

As per coding guidelines: SlayerModel.columns: List[Column] replaces separate dimensions and measures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_integration_postgres.py` around lines 499 - 511,
Update the test_orders_has_own_dimensions_only assertion to reflect that
SlayerModel.columns is the merged row-level field list: in the test function
test_orders_has_own_dimensions_only replace the check asserting "amount" is
absent from dim_names (which currently uses orders.columns and dim_names =
[d.name for d in orders.columns]) so that "amount" is allowed/present; keep
checks for "id" and "customer_id" and the no-dot check, but remove or invert the
line assert "amount" not in dim_names so the test expects "amount" to be in
orders.columns (or simply delete that assertion).
tests/integration/test_integration.py (1)

136-145: ⚠️ Potential issue | 🟠 Major

These assertions are still pinning the internal *_count alias.

SlayerResponse.data is the public result shape, so *:count should be asserted as orders.count, not orders._count. The same stale key shows up in several tests below.

🔁 Example update
-    assert response.data[0]["orders._count"] == 6
+    assert response.data[0]["orders.count"] == 6

Based on learnings, when validating *:count measure naming, the user-visible output/result column name uses orders.count, while orders._count is only the internal EnrichedMeasure.alias.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_integration.py` around lines 136 - 145, The test is
asserting the internal alias `orders._count` instead of the public result column
for the `*:count` measure; update the assertions in this test (and any similar
ones) to check response.data[0]["orders.count"] (the public column name produced
by the `*:count` measure) rather than `orders._count`, while keeping the same
objects referenced (SlayerQuery, ModelMeasure, SlayerResponse, response). Ensure
all other tests that validate `*:count` output are updated similarly.
slayer/mcp/server.py (2)

1789-1790: ⚠️ Potential issue | 🟡 Minor

Minor: Missing space before if.

Proposed fix
-    return "\n".join(lines)if lines else ""
+    return "\n".join(lines) if lines else ""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/mcp/server.py` around lines 1789 - 1790, The return expression has a
missing space before the conditional operator: change the line "return
\"\\n\".join(lines)if lines else \"\"" to include a space so it reads "return
\"\\n\".join(lines) if lines else \"\"" (the variables involved are lines and
measure_lines where lines is extended earlier).

1693-1698: ⚠️ Potential issue | 🔴 Critical

Bug: m.dimensions is not a valid attribute in v2 schema.

This code references m.dimensions which no longer exists in v2. The model now uses m.columns. This will cause an AttributeError at runtime.

🐛 Proposed fix
         lines = [f"Ingested {len(models)} model(s):"]
         for m in models:
-            lines.append(f"- {m.name} ({len(m.dimensions)} dims, {len(m.measures)} measures)")
+            lines.append(f"- {m.name} ({len(m.columns)} columns, {len(m.measures)} measures)")
         lines.append("")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/mcp/server.py` around lines 1693 - 1698, The summary builder
references the removed attribute m.dimensions causing an AttributeError; update
the loop that builds lines (iterating over models and using m.name,
m.dimensions, m.measures) to use m.columns instead of m.dimensions (i.e.,
compute dims as len(m.columns)), leaving m.measures as-is so the output stays
accurate for the v2 model schema.
🧹 Nitpick comments (1)
docs/getting-started/python.md (1)

143-143: Please normalize the rest of this page to measures as well.

Great fix here, but earlier SDK snippets still use fields (Line 59, Line 92, Line 104). Consider updating them too so the page doesn’t mix v1/v2 terminology.

Suggested patch
-    fields=["*:count", "revenue:sum"],
+    measures=["*:count", "revenue:sum"],
...
-    fields=["*:count"],
+    measures=["*:count"],
...
-    fields=["*:count", "revenue:sum"],
+    measures=["*:count", "revenue:sum"],

Based on learnings: “When renaming a field or changing response structure, grep all docs and skills for old names and update every occurrence.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/getting-started/python.md` at line 143, The page mixes v1/v2
terminology: update all query examples that use the old "fields" key to use
"measures" instead so they match the demonstrated snippet (e.g., the example
calling result = engine.execute(query={"source_model": "orders", "measures":
["*:count"]})); search the document for any occurrences of "fields" in SDK query
examples and replace them with "measures", ensuring the surrounding structure
(query={"source_model": ..., "measures": [...]}) and example outputs remain
consistent with the v2 response shape.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/concepts/models.md`:
- Around line 79-90: The documentation incorrectly states that an explicit
allowed_aggregations whitelist overrides both the default-by-type rule and the
primary-key restriction; update the sentence so it clearly says that while
allowed_aggregations overrides the default set derived from
DEFAULT_AGGREGATIONS_BY_TYPE, it does NOT bypass the primary-key restriction —
primary-key columns remain limited to count and count_distinct regardless of any
allowed_aggregations whitelist.

In `@docs/concepts/queries.md`:
- Around line 10-12: The docs mix the old term "dimensions" with the new
"columns" for ModelExtension; update all occurrences of "dimensions" in this
file's ModelExtension examples/field lists to use "columns" to match the
parameter table (e.g., change any ModelExtension field named dimensions to
columns and update related examples and descriptions), and verify references to
symbols like source_model, measures, ModelExtension, ColumnRef are consistent
across the page so the v2 docs no longer contradict each other.

In `@docs/getting-started/rest-api.md`:
- Around line 61-65: The REST docs show a "measures" payload but the FastAPI
request model QueryRequest in slayer/api/server.py only defines fields, so
FastAPI will drop measures; update QueryRequest to include an optional measures:
List[str] (and any related types) so FastAPI preserves measures for
SlayerQuery's migration logic, and ensure SlayerQuery consumption still works
(or add a migration step there) before returning responses; alternatively, if
you don't want to change the API surface now, revert the docs to use only fields
examples across the referenced sections (61-65, 87-102, 144-145) so they match
the current QueryRequest behavior.

In `@docs/interfaces/cli.md`:
- Line 72: Update the dry-run example command string used in the docs example
for the CLI (the `slayer query '{"source_model": "orders", "measures":
["count"]}' --dry-run` snippet) to use colon aggregation syntax so the measure
is unambiguous—replace `"count"` with `"*:count"` so the example becomes `slayer
query '{"source_model": "orders", "measures": ["*:count"]}' --dry-run`, matching
v2 query syntax and the project's aggregation guidelines.

In `@docs/interfaces/rest-api.md`:
- Line 36: The docs example uses the key "measures" which doesn't match the
server request model; update the REST example to use the "fields" key instead
(for example replace "measures": ["*:count"] with "fields": ["*:count"]) so it
aligns with the QueryRequest type in QueryRequest (slayer/api/server.py);
alternatively, if you prefer to support the "measures" name, add mapping logic
in QueryRequest parsing to accept "measures" as an alias for "fields" and
normalize it to the existing field handling.

In `@docs/reference/cli.md`:
- Line 66: Update the CLI example queries that use a bare "count" measure to
explicitly specify the aggregation using the colon syntax (e.g., change
"measures": ["count"] to "measures": ["*:count"]) so measures vs aggregations
are unambiguous; locate the example query strings shown (the slayer query JSON
examples containing "measures": ["count"]) and replace each occurrence with the
explicit aggregation form "*:count".

In `@docs/reference/rest-api.md`:
- Line 36: Update the POST /query example payload in docs/reference/rest-api.md
to use the current request model's "fields" key and express aggregations using
colon syntax in formulas (e.g., replace the "measures": ["count"] entry with an
appropriate "fields" entry using "*:count" or "revenue:sum" style). Locate the
example payload where "measures" is used and change it to the "fields" array
with aggregation-suffixed field names so the example matches the server contract
and other count examples.

In `@slayer/core/enums.py`:
- Around line 152-179: DEFAULT_AGGREGATIONS_BY_TYPE currently exposes extra
aggregations beyond the v2 contract; update the mapping so numeric types
(DataType.NUMBER) only include "sum","avg","min","max","count","count_distinct";
string/date/timestamp types (DataType.STRING, DataType.DATE, DataType.TIMESTAMP)
only include "count","count_distinct","first","last" (remove "min"/"max" for
non-numeric types); boolean (DataType.BOOLEAN) should only include
"count","count_distinct" (remove "sum"); keep PRIMARY_KEY_AGGREGATIONS as-is.
Modify the DEFAULT_AGGREGATIONS_BY_TYPE dict to match these exact sets
referenced by the DataType enum names.

In `@slayer/core/query.py`:
- Around line 270-274: Update the docstring example in slayer/core/query.py that
shows measures and filters to use colon aggregation syntax required by v2
formulas: replace bare occurrences like "count" and "revenue / count" with
explicit aggregations such as "*:count" and "revenue:sum / *:count" in the
measures example (the docstring block showing measures=[{"formula": ...}, ...]);
keep the filters example unchanged and ensure the formulas follow the
"metric:aggregate" pattern used by the query formula contract.

In `@slayer/dbt/converter.py`:
- Around line 356-387: entity_col_names is computed once from cols and not
updated when you append new primary key Columns, causing duplicates (e.g., when
sm.primary_entity matches a column just added in the entity loop). Fix by
updating entity_col_names after you append a Column (or recomputing/using
current cols to check membership) so the shorthand primary_entity check uses the
up-to-date set; update the set inside the entity loop whenever you call
cols.append(...) (references: cols, entity_col_names, sm.entities,
sm.primary_entity, Column, DataType.NUMBER).

In `@slayer/engine/enrichment.py`:
- Around line 253-271: The current validation only blocks numeric-only
aggregations on strings but doesn't enforce the effective allowed set; update
the logic in the block around measure_def.allowed_aggregations to compute an
effective_allowed set: if measure_def.allowed_aggregations is not None use it,
else if measure_def.is_primary_key use PRIMARY_KEY_AGGREGATIONS, otherwise use
DEFAULT_AGGREGATIONS_BY_TYPE.get(str(measure_def.type), DEFAULT_FALLBACK) and
then validate aggregation_name against that set and raise the ValueError
referencing aggregation_name, measure_name, and model.name; keep or remove the
separate NUMERIC_ONLY_AGGREGATIONS check as redundant once the effective_allowed
set is enforced.

In `@slayer/engine/ingestion.py`:
- Around line 508-545: _columns_to_model currently builds only
SlayerModel.columns and returns SlayerModel without populating
SlayerModel.measures, which drops the auto-generated measure library; update
_columns_to_model to also build a measures list by iterating the same input
columns (skip joined columns and primary-key columns), creating ModelMeasure
instances for each non-ID column (use the same column alias logic for "_count"
where applicable) and attach that list to the returned SlayerModel.measures;
keep the measure naming consistent with the new auto-ingest contract (one
measure per non-ID column, do not generate old-style _sum/_avg suffixed
measures) and reference SlayerModel, ModelMeasure, and _columns_to_model when
locating where to add this logic.

In `@slayer/storage/v2_migration.py`:
- Around line 105-112: The detector currently assumes raw_measures from
data.get("measures") is subscriptable and indexable, which raises
KeyError/TypeError for malformed inputs like {"measures": {"name":"revenue"}};
update the guard to verify raw_measures is a sequence (e.g., list/tuple or
collections.abc.Sequence) and non-empty before doing first = raw_measures[0], so
that the function (in slayer/storage/v2_migration.py around the raw_measures
check and the subsequent "first" handling) returns False for bad shapes instead
of raising.

In `@tests/integration/test_mcp_inspect.py`:
- Around line 429-430: Remove the extra conflicting status column append and use
the existing status column on the model: delete the call that appends
Column(name="status", sql="status", type=DataType.NUMBER) to
env["model"].columns and do not create a second column with a different
DataType; simply persist env["model"] (env["storage"].save_model(env["model"]))
as-is and rely on the existing status column (and get_column("status")); apply
the same removal for the other duplicated append sites that add a numeric status
column.

In `@tests/test_api_server.py`:
- Around line 102-121: Update the test_hidden_columns_excluded_from_get test so
it asserts hidden columns are absent rather than only checking visible ones:
after calling client.get("/models/orders") and building col_names from
data["columns"], add negative assertions that "internal_flag" and "secret_sum"
are NOT in col_names (e.g., assert "internal_flag" not in col_names and assert
"secret_sum" not in col_names) to ensure GET /models/{name} filters v2 columns
labeled hidden.

In `@tests/test_dbt_converter.py`:
- Around line 297-301: The test is grabbing the first column
(result.models[0].columns[0]) which is a PK/entity column due to
_convert_semantic_model prepending entity columns, so the assertion doesn't
actually verify the converted measure; update the test to locate the measure
column by name (e.g., find the column whose name equals "revenue" in
result.models[0].columns) and then assert that that column's
allowed_aggregations is None; keep using DbtToSlayerConverter and result.models
to find the model, but select the measure by name instead of index.

In `@tests/test_mcp_server.py`:
- Around line 406-408: The test currently asserts every column in
parsed["columns"] contains a "filter" key, but _model_to_summary() only emits
"filter" for columns that define one (e.g., "val" but not "id"); update the test
loop to either conditionally check for "filter" (if "filter" in c: assert ...),
or restrict the assertion to the specific column that should have a filter (find
the column with name "val" in parsed["columns"] and assert "filter" in that
dict), leaving the "id" column unasserted for "filter".

---

Outside diff comments:
In `@docs/concepts/formulas.md`:
- Around line 19-26: Update the prose that still refers to `fields` to
consistently use `measures`: change the "Field Formulas" heading and the
sentence "Field formulas define what data columns a query returns. They go in
the `fields` parameter:" to use "Measure Formulas" and "Measure formulas define
what data columns a query returns. They go in the `measures` parameter:" (and
replace any other stray `fields` occurrences in this section), ensuring examples
already using "measures" match the surrounding text.

In `@docs/examples/07_aggregations/aggregations.md`:
- Around line 9-19: Update the example YAML model snippets that use the v1
"measures" block to the v2 "columns" schema: replace the top-level "measures"
list with a "columns" list and for each entry change the shape to a column
object containing "name", "type: measure" (or appropriate measure type), and the
row-level expression under "sql" (keeping the existing expressions like amount,
unit_price, qty). Do the same replacement for the other model-definition
snippets in this file that still show "measures" (including the block around the
previously noted lines) so examples conform to the v2 model schema.

In `@docs/index.md`:
- Around line 68-72: Update the architecture diagram label inside the ASCII
block that currently reads "SlayerQuery (model, fields, dimensions, filters)" to
use "measures" instead of "fields" so it matches v2 terminology; locate the
ASCII diagram near the "Agent --> MCP / REST API / Python SDK" line in
docs/index.md and change the token "fields" to "measures" (i.e., "SlayerQuery
(model, measures, dimensions, filters)").

In `@slayer/cli.py`:
- Line 51: Update the CLI help examples in slayer/cli.py to use the v2 payload
key "measures" instead of the deprecated "fields": find the example strings
containing '{"source_model": "orders", "fields": ...}' (and the other
occurrences around the same examples) and replace "fields" with "measures" so
the sample commands (e.g., the slayer query example) produce valid v2 payloads.

In `@slayer/mcp/server.py`:
- Around line 1789-1790: The return expression has a missing space before the
conditional operator: change the line "return \"\\n\".join(lines)if lines else
\"\"" to include a space so it reads "return \"\\n\".join(lines) if lines else
\"\"" (the variables involved are lines and measure_lines where lines is
extended earlier).
- Around line 1693-1698: The summary builder references the removed attribute
m.dimensions causing an AttributeError; update the loop that builds lines
(iterating over models and using m.name, m.dimensions, m.measures) to use
m.columns instead of m.dimensions (i.e., compute dims as len(m.columns)),
leaving m.measures as-is so the output stays accurate for the v2 model schema.

In `@tests/integration/test_integration_postgres.py`:
- Around line 499-511: Update the test_orders_has_own_dimensions_only assertion
to reflect that SlayerModel.columns is the merged row-level field list: in the
test function test_orders_has_own_dimensions_only replace the check asserting
"amount" is absent from dim_names (which currently uses orders.columns and
dim_names = [d.name for d in orders.columns]) so that "amount" is
allowed/present; keep checks for "id" and "customer_id" and the no-dot check,
but remove or invert the line assert "amount" not in dim_names so the test
expects "amount" to be in orders.columns (or simply delete that assertion).

In `@tests/integration/test_integration.py`:
- Around line 136-145: The test is asserting the internal alias `orders._count`
instead of the public result column for the `*:count` measure; update the
assertions in this test (and any similar ones) to check
response.data[0]["orders.count"] (the public column name produced by the
`*:count` measure) rather than `orders._count`, while keeping the same objects
referenced (SlayerQuery, ModelMeasure, SlayerResponse, response). Ensure all
other tests that validate `*:count` output are updated similarly.

In `@tests/perf/test_bench.py`:
- Around line 22-27: The helper _execute currently calls the async method
engine.execute(query=query) and returns a coroutine instead of a SlayerResponse;
change it to call the synchronous API engine.execute_sync(query=query) so the
function actually returns a SlayerResponse in the synchronous benchmark context,
keeping the same signature (def _execute(env: BenchEnv, **query_kwargs) ->
SlayerResponse) and still constructing the query with
SlayerQuery(source_model="orders", **query_kwargs).

---

Nitpick comments:
In `@docs/getting-started/python.md`:
- Line 143: The page mixes v1/v2 terminology: update all query examples that use
the old "fields" key to use "measures" instead so they match the demonstrated
snippet (e.g., the example calling result =
engine.execute(query={"source_model": "orders", "measures": ["*:count"]}));
search the document for any occurrences of "fields" in SDK query examples and
replace them with "measures", ensuring the surrounding structure
(query={"source_model": ..., "measures": [...]}) and example outputs remain
consistent with the v2 response shape.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: acae7bf5-20f8-4209-b93f-246bcfb2cd7c

📥 Commits

Reviewing files that changed from the base of the PR and between 128b3d1 and eb9c499.

📒 Files selected for processing (52)
  • CLAUDE.md
  • docs/concepts/formulas.md
  • docs/concepts/ingestion.md
  • docs/concepts/models.md
  • docs/concepts/queries.md
  • docs/dbt/slayer_vs_dbt.md
  • docs/examples/01_dynamic/dynamic.md
  • docs/examples/03_auto_ingest/auto_ingest.md
  • docs/examples/04_time/time.md
  • docs/examples/05_joined_measures/joined_measures.md
  • docs/examples/06_multistage_queries/multistage_queries.md
  • docs/examples/07_aggregations/aggregations.md
  • docs/getting-started/cli.md
  • docs/getting-started/python.md
  • docs/getting-started/rest-api.md
  • docs/index.md
  • docs/interfaces/cli.md
  • docs/interfaces/python-client.md
  • docs/interfaces/rest-api.md
  • docs/reference/cli.md
  • docs/reference/python-client.md
  • docs/reference/rest-api.md
  • slayer/cli.py
  • slayer/core/enums.py
  • slayer/core/models.py
  • slayer/core/query.py
  • slayer/dbt/converter.py
  • slayer/engine/enrichment.py
  • slayer/engine/ingestion.py
  • slayer/engine/query_engine.py
  • slayer/mcp/server.py
  • slayer/storage/migrations.py
  • slayer/storage/v2_migration.py
  • tests/conftest.py
  • tests/integration/test_integration.py
  • tests/integration/test_integration_duckdb.py
  • tests/integration/test_integration_postgres.py
  • tests/integration/test_mcp_inspect.py
  • tests/perf/conftest.py
  • tests/perf/test_bench.py
  • tests/test_api_server.py
  • tests/test_client.py
  • tests/test_dbt_converter.py
  • tests/test_format_propagation.py
  • tests/test_formula.py
  • tests/test_join_sync.py
  • tests/test_mcp_server.py
  • tests/test_migrations.py
  • tests/test_models.py
  • tests/test_sql_generator.py
  • tests/test_sqlite_storage.py
  • tests/test_storage.py

Comment thread docs/concepts/models.md Outdated
Comment thread docs/concepts/queries.md
Comment on lines +10 to +12
| `source_model` | string, SlayerModel, or ModelExtension | Yes | Source model name, inline model, or model extension (adds columns/measures/joins) |
| `measures` | list[ModelMeasure] | No | Computed/aggregated valuesformulas, arithmetic, transforms. See [Formulas](formulas.md). |
| `dimensions` | list[ColumnRef] | No | Columns to group by. Supports dotted names for joined models (`customers.name`, `customers.regions.name`). |
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

This page still mixes columns and dimensions for ModelExtension.

The parameter table now says inline extensions add columns, but the later ModelExtension example/field list on this page still uses dimensions. Please update the remaining examples together so the v2 docs don't contradict themselves.

Based on learnings: "When renaming a field or changing response structure, grep all docs and skills for old names and update every occurrence".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/concepts/queries.md` around lines 10 - 12, The docs mix the old term
"dimensions" with the new "columns" for ModelExtension; update all occurrences
of "dimensions" in this file's ModelExtension examples/field lists to use
"columns" to match the parameter table (e.g., change any ModelExtension field
named dimensions to columns and update related examples and descriptions), and
verify references to symbols like source_model, measures, ModelExtension,
ColumnRef are consistent across the page so the v2 docs no longer contradict
each other.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not actionable. The ModelExtension example here uses columns, not dimensions (see line 312: "columns": [{"name": "tier", ...}]), and the field-list at line 320 explicitly says "source_name, columns, measures, joins". The "dimensions": ["tier"] at line 315 is the surrounding SlayerQuery parameter (per the table at line 12, dimensions is a top-level query field) — not a ModelExtension field. Those are two distinct schemas. No change needed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ZmeiGorynych, you're right — I misread the example. The "dimensions": ["tier"] is a top-level SlayerQuery field (correctly documented in the parameter table), while the ModelExtension object itself already uses "columns" and the field list at line 320 is accurate. No contradiction exists. I apologize for the noise.


🧠 Learnings used
Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/models.py : SlayerQuery.source_model accepts a model name, inline `SlayerModel`, or `ModelExtension` for extending models

Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/models.py : Provide `SlayerResponse.attributes` as `ResponseAttributes` with `.dimensions` and `.measures` dicts, each mapping column alias to `FieldMetadata(label, format)`

Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/models.py : SlayerModel, SlayerQuery, and DatasourceConfig must carry a `version: int` field (currently `1`) with schema versioning via converter chain in `slayer/storage/migrations.py`

Learnt from: AivanF
Repo: MotleyAI/slayer PR: 25
File: docs/getting-started/python.md:57-61
Timestamp: 2026-04-10T12:12:08.571Z
Learning: In the MotleyAI/slayer repo docs, use language-agnostic “JSON/dict syntax” when showing query objects on non-Python pages (e.g., REST API, CLI, MCP). When the page is specifically demonstrating the Python SDK directly—such as `docs/getting-started/python.md`—it is acceptable (and preferred) to show Python constructor syntax like `SlayerQuery(...)` instead of JSON/dict syntax. Avoid mixing constructors into language-agnostic examples.

Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 59
File: slayer/mcp/server.py:1725-1740
Timestamp: 2026-04-29T13:25:08.874Z
Learning: In MotleyAI/slayer, `ResponseAttributes` (in `slayer/engine/query_engine.py`) is a Pydantic BaseModel with `.dimensions: Dict[str, FieldMetadata]` and `.measures: Dict[str, FieldMetadata]` fields, and a custom `.get(column: str) -> Optional[FieldMetadata]` method (lines 76-78) that looks up a column key across both dicts. `SlayerResponse.attributes` is always a `ResponseAttributes` instance (never None) because it uses `default_factory=ResponseAttributes`. Do not flag `attrs.get(col_name)` as a crash — it is a valid, intentional API call.

Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 46
File: tests/test_sql_generator.py:3087-3103
Timestamp: 2026-04-28T11:23:09.032Z
Learning: In `tests/test_sql_generator.py` and the slayer codebase (Python), there are two distinct naming layers for `*:count`:
1. **User-facing result column name** (what CLAUDE.md documents): `*:count → orders.count` (star-colon prefix stripped for display).
2. **Internal `EnrichedMeasure.alias`** (used inside the enrichment pipeline and SQL generator): `orders._count` (leading underscore retained). Documented at `slayer/sql/generator.py` lines ~1003-1011 and `enrichment.py` line ~339.
Do NOT flag an assertion of `orders._count` as wrong when checking `EnrichedMeasure.alias` — it is the correct internal alias. Only `orders.count` applies to the user-visible output column name.

Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 61
File: slayer/engine/ingestion.py:508-545
Timestamp: 2026-04-29T13:04:25.377Z
Learning: Applies to **/ingestion.py (MotleyAI/slayer): In v2, auto-ingestion leaves `SlayerModel.measures` (named formula library) empty. All introspected columns are emitted as `Column` entries on `SlayerModel.columns`. Named formula measures are an opt-in library that users populate themselves — not auto-generated. The v1 behavior of generating one `Measure` per non-ID column no longer exists. There is an explicit test `test_orders_has_no_named_measures_after_ingest` in `tests/integration/test_integration_duckdb.py` asserting `orders.measures == []` after ingestion.

Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/*.py : Use bare column names for Dimension/measure SQL (e.g., `"amount"`); use `model_name.column_name` for complex expressions (e.g., `"orders.amount * orders.quantity"`)

Learnt from: CR
Repo: MotleyAI/slayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-29T09:40:09.576Z
Learning: Applies to **/*.py : Treat measures and aggregations as separate concepts: measures are named row-level SQL expressions; aggregations are specified at query time using colon syntax (e.g., `"revenue:sum"`, `"*:count"`)

Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 22
File: slayer/core/models.py:138-143
Timestamp: 2026-04-10T10:45:29.437Z
Learning: In MotleyAI/slayer, the `Measure.type` (deprecated DataType) field has been intentionally and permanently removed from the `Measure` model in `slayer/core/models.py`. The project is early-stage and the maintainer (ZmeiGorynych) opted for a clean break — no backward-compatibility shim or DeprecationWarning path is desired. Do not flag the absence of `Measure.type` as a backward-compatibility issue.

Learnt from: ZmeiGorynych
Repo: MotleyAI/slayer PR: 26
File: slayer/engine/ingestion.py:493-500
Timestamp: 2026-04-13T08:02:30.828Z
Learning: Applies to **/slayer/**/*ingest*.py : The column renaming rule mapping a physical column named "count" to measure name "count_value" is deprecated and no longer applied. Auto-ingestion emits measures using the raw column name as-is (except `_count` which is still renamed to `count_col`). Do not flag the absence of "count" → "count_value" renaming as a bug.

Learnt from: AivanF
Repo: MotleyAI/slayer PR: 25
File: docs/getting-started/mcp.md:54-54
Timestamp: 2026-04-10T12:15:07.864Z
Learning: When reviewing MotleyAI/slayer documentation (e.g., docs/getting-started/mcp.md), do not apply the JSON/dict syntax rule to illustrative/prose descriptions of an agent’s actions. Specifically, if the text uses conversational or narrative formatting (such as blockquoted lines like “**Agent:** *calls `query(...)`*”), treat it as descriptive prose rather than a real code example/snippet. Only enforce the JSON/dict syntax rule within actual code blocks or real code snippets; keyword-argument-looking syntax in prose should not be flagged as a violation.

Comment thread docs/getting-started/rest-api.md
Comment thread docs/interfaces/cli.md Outdated
Comment thread docs/interfaces/rest-api.md
Comment thread slayer/storage/v2_migration.py
Comment thread tests/integration/test_mcp_inspect.py Outdated
Comment thread tests/test_api_server.py Outdated
Comment thread tests/test_dbt_converter.py
Comment thread tests/test_mcp_server.py
- Notebooks: replace v1 ``Dimension``/``Measure`` constructors and
  ``model.dimensions`` / ``model.measures`` (row-level) accesses with the
  v2 ``Column`` and ``model.columns`` shape across:
  - docs/examples/02_sql_vs_dsl/sql_vs_dsl_nb.ipynb
  - docs/examples/03_auto_ingest/auto_ingest_nb.ipynb
  - docs/examples/05_joins/joins_nb.ipynb
  - docs/examples/07_aggregations/aggregations_nb.ipynb

  Notebook outputs are cleared so the executor regenerates them on next
  run, and explanatory comments describe the new schema (column-level
  ``allowed_aggregations``, type-default eligibility, ``measures`` as
  named-formula library, primary-key restriction to count/count_distinct).

- Postgres rollup-ingestion tests: replace v1-shape assertions
  ("amount not in dim_names", "amount in measure_names") with v2-shape
  assertions (every non-joined column lives on ``columns``;
  ``model.measures`` is empty after auto-ingest because it's now the
  named-formula library, not the row-level list). Same fix already
  applied to the duckdb suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
docs/examples/07_aggregations/aggregations_nb.ipynb (1)

412-447: Consider exercising the custom aggregation here.

This registers weighted_avg_price, but the notebook never actually runs it. A tiny query/output would make the example self-validating.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/examples/07_aggregations/aggregations_nb.ipynb` around lines 412 - 447,
Add a short example that actually exercises the registered Aggregation named
"weighted_avg_price": after the items_model definition, build and execute a
small query that selects the weighted_avg_price aggregation (e.g., reference
items_model and its aggregation by name or items_model.aggregations[0]) against
the jaffle_shop/order_items table and print the result so the notebook validates
the custom aggregation (use the project’s usual model query/execute API such as
items_model.select / items_model.query / execute_query — whichever method your
codebase uses to run model queries).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@docs/examples/07_aggregations/aggregations_nb.ipynb`:
- Around line 412-447: Add a short example that actually exercises the
registered Aggregation named "weighted_avg_price": after the items_model
definition, build and execute a small query that selects the weighted_avg_price
aggregation (e.g., reference items_model and its aggregation by name or
items_model.aggregations[0]) against the jaffle_shop/order_items table and print
the result so the notebook validates the custom aggregation (use the project’s
usual model query/execute API such as items_model.select / items_model.query /
execute_query — whichever method your codebase uses to run model queries).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3cccf4f7-0803-41f4-916b-2b8973176d1c

📥 Commits

Reviewing files that changed from the base of the PR and between eb9c499 and a270d41.

📒 Files selected for processing (5)
  • docs/examples/02_sql_vs_dsl/sql_vs_dsl_nb.ipynb
  • docs/examples/03_auto_ingest/auto_ingest_nb.ipynb
  • docs/examples/05_joins/joins_nb.ipynb
  • docs/examples/07_aggregations/aggregations_nb.ipynb
  • tests/integration/test_integration_postgres.py

ZmeiGorynych and others added 2 commits April 29, 2026 14:50
Fixes 7 CodeRabbit comments on #61 plus a parallel doc sweep. The REST
QueryRequest still declared the v1 `fields` key, so any client following
the new docs would have measures silently stripped by FastAPI before
SlayerQuery's v1→v2 migration ever ran. Rename to `measures` and set
extra="allow" on the request model so legacy `fields` payloads still flow
through to the existing migration. Sweep `*:count` and `measures` usage
across all docs and the SlayerQuery docstring; fix ModelExtension docs
that still showed `dimensions` for what is now `columns`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes 3 of 4 CodeRabbit test-fix comments on #61. The fourth
(test_mcp_server.py:408) was a false positive — the inspect_model JSON
path unconditionally emits `filter` when show_sql=True, so the loop
passes as written.

Also fixes a real API bug surfaced by the test_api_server review:
GET /models/{name} was filtering legacy `dimensions`/`measures` keys
that no longer exist on v2 SlayerModel and ignoring the new `columns`
list, so hidden columns were leaking through. Switch the filter to
`columns`. The test that previously masked this with a "visible ones
present" check now strictly asserts hidden columns are absent.

Other test fixes: drop spurious `status` Column appends in
test_mcp_inspect.py — the env fixture already declares `status` as a
DataType.STRING column, so adding a second NUMBER copy made the model
internally contradictory. Pick `revenue` by name in test_dbt_converter
instead of by index — `_convert_semantic_model` prepends entity/PK
columns, so columns[0] was the PK, not the converted measure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/getting-started/python.md`:
- Around line 57-61: The example uses the async SlayerQueryEngine.execute()
without awaiting, causing a coroutine to be shown in synchronous docs; update
the snippet(s) to call the synchronous API instead
(SlayerQueryEngine.execute_sync() or engine.execute_sync(...)) wherever
SlayerQueryEngine.execute() is used in synchronous examples (e.g., the shown
query and the other occurrence at line ~143) so the code runs correctly in
CLI/notebook/script contexts.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 36beb557-8723-46a0-8bfd-8e41c8062f0f

📥 Commits

Reviewing files that changed from the base of the PR and between a270d41 and 0582c23.

📒 Files selected for processing (12)
  • docs/concepts/models.md
  • docs/concepts/queries.md
  • docs/getting-started/python.md
  • docs/getting-started/rest-api.md
  • docs/interfaces/cli.md
  • docs/interfaces/mcp.md
  • docs/reference/cli.md
  • docs/reference/mcp.md
  • docs/reference/rest-api.md
  • slayer/api/server.py
  • slayer/core/query.py
  • tests/test_api_server.py
✅ Files skipped from review due to trivial changes (6)
  • docs/interfaces/mcp.md
  • docs/reference/mcp.md
  • docs/interfaces/cli.md
  • docs/reference/cli.md
  • docs/reference/rest-api.md
  • docs/getting-started/rest-api.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/concepts/models.md
  • docs/concepts/queries.md

Comment thread docs/getting-started/python.md Outdated
Two of three CodeRabbit code-bug comments on #61 were valid; the third
(engine/ingestion.py:545 "drops the auto-generated measure library")
was applying v1 expectations — v2 deliberately leaves measures empty
after ingestion, as test_orders_has_no_named_measures_after_ingest
documents.

dbt converter: when a semantic model declares both a primary entity in
`entities` and the same column via the `primary_entity` shorthand, the
shorthand block was checking against a stale `entity_col_names` set
captured before the entity loop ran, so the PK column was appended a
second time. Update the set after each append.

v2 migration detector: `_looks_like_v1_model` subscripted
`raw_measures[0]` after only checking truthiness, so a malformed payload
like `{"measures": {"name": "revenue"}}` crashed with `KeyError: 0`
inside the detector instead of falling through to Pydantic validation.
Guard with isinstance before subscripting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/integration/test_mcp_inspect.py (1)

423-466: ⚠️ Potential issue | 🟠 Major

Update query payloads to use measures instead of legacy fields key.

These tests currently validate the v1-to-v2 migration path for SlayerQuery, which converts "fields""measures" automatically. Tests should exercise the v2 contract directly rather than rely on backward-compatibility migration, making the test intent clearer and avoiding brittleness if migrations change.

🔧 Proposed fix
 def _run(self, env, formula: str) -> None:
     from slayer.core.query import SlayerQuery
 
     q = SlayerQuery.model_validate({
         "source_model": "orders",
-        "fields": [{"formula": formula}],
+        "measures": [{"formula": formula}],
     })
     await env["engine"].execute(query=q)
@@
         q = SlayerQuery.model_validate({
             "source_model": "orders",
-            "fields": [{"formula": "status:min"}, {"formula": "status:max"}],
+            "measures": [{"formula": "status:min"}, {"formula": "status:max"}],
         })
@@
         q = SlayerQuery.model_validate({
             "source_model": "orders",
-            "fields": [{"formula": "status:count"}, {"formula": "status:count_distinct"}],
+            "measures": [{"formula": "status:count"}, {"formula": "status:count_distinct"}],
         })
@@
         q = SlayerQuery.model_validate({
             "source_model": "orders",
-            "fields": [{"formula": "amount:sum"}, {"formula": "amount:avg"}],
+            "measures": [{"formula": "amount:sum"}, {"formula": "amount:avg"}],
         })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_mcp_inspect.py` around lines 423 - 466, The test
payloads are using the legacy "fields" key instead of the v2 "measures" key;
update all usages in this file (notably inside the helper _run() and in the
SlayerQuery.model_validate calls in tests like
test_numeric_only_aggregations_rejected_on_string,
test_min_max_allowed_on_string, test_count_and_count_distinct_allowed_on_string,
and test_numeric_aggregations_allowed_on_numeric_measure) to replace "fields":
[...] with "measures": [...], keeping the same inner formula objects so the
tests exercise the v2 contract directly.
tests/test_api_server.py (1)

168-176: ⚠️ Potential issue | 🟠 Major

Make the test async and await the model save.

storage.save_model() is async, but the test calls it without awaiting. This creates a coroutine that is immediately discarded—the model is never persisted. Per coding guidelines, convert the test to async def and await the save, or seed the fixture through the API instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_api_server.py` around lines 168 - 176, The test
test_query_missing_datasource currently calls the async method
storage.save_model without awaiting it, so convert the test function to async
def test_query_missing_datasource(...) and await storage.save_model(...) before
calling client.post (or alternatively seed the model via the API); ensure you
still use the TestClient to POST to "/query" and assert resp.status_code == 400
after the awaited save completes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@slayer/dbt/converter.py`:
- Around line 496-506: When appending the new filtered Column (filtered_column)
to slayer_model.columns ensure you first detect name collisions with existing
columns: iterate slayer_model.columns and check for any column.name ==
metric.name (or same case-normalized key) and handle the collision by either
skipping append, replacing the existing column, or raising/logging an error
depending on the intended behavior; update the logic around Column creation in
converter.py (the filtered_column, metric.name, and slayer_model.columns sites)
so duplicate names are not blindly appended and ambiguous formula resolution is
prevented.

---

Outside diff comments:
In `@tests/integration/test_mcp_inspect.py`:
- Around line 423-466: The test payloads are using the legacy "fields" key
instead of the v2 "measures" key; update all usages in this file (notably inside
the helper _run() and in the SlayerQuery.model_validate calls in tests like
test_numeric_only_aggregations_rejected_on_string,
test_min_max_allowed_on_string, test_count_and_count_distinct_allowed_on_string,
and test_numeric_aggregations_allowed_on_numeric_measure) to replace "fields":
[...] with "measures": [...], keeping the same inner formula objects so the
tests exercise the v2 contract directly.

In `@tests/test_api_server.py`:
- Around line 168-176: The test test_query_missing_datasource currently calls
the async method storage.save_model without awaiting it, so convert the test
function to async def test_query_missing_datasource(...) and await
storage.save_model(...) before calling client.post (or alternatively seed the
model via the API); ensure you still use the TestClient to POST to "/query" and
assert resp.status_code == 400 after the awaited save completes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2e2bc7f5-3913-45c1-b695-a7a9085a35ff

📥 Commits

Reviewing files that changed from the base of the PR and between 0582c23 and 76080fc.

📒 Files selected for processing (7)
  • slayer/api/server.py
  • slayer/dbt/converter.py
  • slayer/storage/v2_migration.py
  • tests/integration/test_mcp_inspect.py
  • tests/test_api_server.py
  • tests/test_dbt_converter.py
  • tests/test_migrations.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • slayer/storage/v2_migration.py
  • tests/test_migrations.py

Comment thread slayer/dbt/converter.py
Two of three CodeRabbit aggregation-eligibility comments were valid;
the third (enums.py:179 "narrow defaults") was applying invented
constraints not in the PR description, and would have removed real
useful aggregations.

The validator at engine/enrichment.py was only blocking numeric-only
aggregations on string columns, leaving the PK rule and the
type-default rule unenforced. Per CLAUDE.md "Primary-key columns are
always restricted to count/count_distinct aggregations regardless of
type" — but the code didn't actually do that. Replace the partial
check with the documented three-tier logic: PK rule wins (always), then
explicit allowed_aggregations whitelist, then type defaults. Custom
model-level aggregations bypass the type-default check (they're
intentional extensions).

The type-probe query in query_engine and the min/max profiler in
mcp/server were both using `:max` blanket-style across all columns;
under the new strict validator that broke on BOOLEAN columns since
max/min weren't in BOOLEAN's defaults. Two fixes:
- _build_type_probe_query now picks an aggregation per column from its
  effective allowed set (max preferred, fallback to first allowed).
- BOOLEAN defaults extended to include min/max/first/last (False<True
  ordering makes these meaningful and the probe path needs them).

Doc: models.md:90 now correctly states allowed_aggregations overrides
the type-default map but NOT the primary-key restriction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
docs/concepts/models.md (1)

250-258: ⚠️ Potential issue | 🟡 Minor

Use JSON/dict syntax for the query object in this docs example.

This snippet still uses SlayerQuery(...) constructor syntax; docs examples should keep query objects in portable JSON/dict form.

📝 Suggested doc update
 engine.create_model_from_query(
-    query=SlayerQuery(
-        source_model="orders",
-        time_dimensions=[...],
-        measures=["*:count", "amount:sum"],
-    ),
+    query={
+        "source_model": "orders",
+        "time_dimensions": [...],
+        "measures": ["*:count", "amount:sum"],
+    },
     name="monthly_summary",
 )

As per coding guidelines, use JSON/dict syntax for all query objects in docs and examples — not Python class constructors.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/concepts/models.md` around lines 250 - 258, The docs example uses the
Python constructor SlayerQuery in the call to engine.create_model_from_query;
replace the SlayerQuery(...) constructor form with a JSON/dict literal for the
query argument (e.g., a dict with keys source_model, time_dimensions, measures)
so the example is language-agnostic and portable; update the call site
engine.create_model_from_query(query=..., name="monthly_summary") to pass that
dict instead of SlayerQuery and ensure field names match the SlayerQuery schema
(source_model, time_dimensions, measures).
slayer/engine/enrichment.py (1)

571-630: ⚠️ Potential issue | 🟠 Major

Named saved measures are still unqueryable here.

SlayerQueryEngine._auto_move_fields_to_dimensions() now deliberately keeps bare formulas when model.get_measure(name) exists, but this loop never expands that saved measure before calling parse_formula(). A query like measures=[{"formula": "gross_margin"}] will still fall through to _ensure_measure_from_spec() and fail with “Bare measure name ... is not valid”, so the new SlayerModel.measures library cannot actually be used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/engine/enrichment.py` around lines 571 - 630, The loop over
query.measures parses qfield.formula directly and never expands bare
saved-measure names; fix by checking before parse_formula whether qfield.formula
is a bare name that corresponds to a saved measure (call
model.get_measure(qfield.formula)), and if so replace/expand it into the saved
measure's formula/spec (or parse the saved measure's formula) so the subsequent
logic (including _ensure_aggregated_measure, _flatten_spec, and label handling)
operates on the resolved spec; ensure you preserve qfield.label when
substituting and keep the existing calls to parse_formula,
_ensure_aggregated_measure, and _flatten_spec unchanged for non-saved formulas.
slayer/engine/query_engine.py (1)

389-395: ⚠️ Potential issue | 🟠 Major

Use SlayerModel.model_validate() instead of model_copy() to ensure validators run.

In Pydantic v2, model_copy() bypasses all field and model validators. The collision check in SlayerModel._validate_column_measure_disjoint() will not run, allowing a ModelExtension to inject a column that collides with an existing measure name (or vice versa). This invalid state only fails later during enrichment rather than at model construction.

Proposed fix
-            return base.model_copy(
-                update={
-                    "columns": list(base.columns) + extra_cols,
-                    "measures": list(base.measures) + extra_measures,
-                    "joins": list(base.joins) + extra_joins,
-                }
-            )
+            return SlayerModel.model_validate({
+                **base.model_dump(),
+                "columns": [c.model_dump() for c in [*base.columns, *extra_cols]],
+                "measures": [m.model_dump() for m in [*base.measures, *extra_measures]],
+                "joins": [j.model_dump() for j in [*base.joins, *extra_joins]],
+            })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/engine/query_engine.py` around lines 389 - 395, Replace the call to
base.model_copy(...) with base.model_validate(...) so Pydantic v2 validators run
and the collision check in SlayerModel._validate_column_measure_disjoint()
executes at model construction; call model_validate with an update dict
containing the same keys ("columns", "measures", "joins") built from
list(base.columns) + extra_cols, list(base.measures) + extra_measures, and
list(base.joins) + extra_joins respectively so ModelExtension cannot inject
colliding names undetected.
♻️ Duplicate comments (1)
slayer/engine/query_engine.py (1)

805-826: ⚠️ Potential issue | 🟠 Major

Cross-model column:agg still bypasses the new eligibility gates.

Local refs now enforce primary-key restriction, explicit allowed_aggregations, and DEFAULT_AGGREGATIONS_BY_TYPE in slayer/engine/enrichment.py, but this branch still only rejects numeric-only aggregations on strings. customers.id:sum, customers.created_at:avg, or a dotted column with an explicit whitelist can still slip through here.

🔧 Suggested fix
-            if (
-                aggregation_name
-                and aggregation_name in NUMERIC_ONLY_AGGREGATIONS
-                and str(col_def.type) == "string"
-            ):
+            if col_def.primary_key:
+                allowed = PRIMARY_KEY_AGGREGATIONS
+            elif col_def.allowed_aggregations is not None:
+                allowed = frozenset(col_def.allowed_aggregations)
+            else:
+                is_custom_agg = target_model.get_aggregation(aggregation_name) is not None
+                allowed = (
+                    frozenset({aggregation_name})
+                    if is_custom_agg
+                    else DEFAULT_AGGREGATIONS_BY_TYPE.get(col_def.type, frozenset())
+                )
+            if aggregation_name not in allowed:
                 raise ValueError(
-                    f"Aggregation '{aggregation_name}' is not applicable to "
-                    f"string column '{measure_name}' in model '{target_model_name}'."
+                    f"Aggregation '{aggregation_name}' not allowed for column "
+                    f"'{measure_name}' in model '{target_model_name}'. Allowed: {sorted(allowed)}"
                 )

Also add PRIMARY_KEY_AGGREGATIONS to the enum import at the top of the file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/engine/query_engine.py` around lines 805 - 826, The cross-model column
lookup branch currently only blocks numeric-only aggregations on string columns;
update it to enforce the same eligibility gates used for local refs by checking
PRIMARY_KEY_AGGREGATIONS, explicit allowed_aggregations on the column, and
DEFAULT_AGGREGATIONS_BY_TYPE based on column type before accepting a dotted
measure; specifically, when resolving measure_name via
target_model.get_column(...) in query_engine.py also import and consult
PRIMARY_KEY_AGGREGATIONS and DEFAULT_AGGREGATIONS_BY_TYPE (and any
column.allowed_aggregations) the same way enrichment.py does, reject
aggregations not permitted by those rules (e.g., primary-key-only aggregations
on non-PK columns, aggregations not in allowed_aggregations, and type-disallowed
aggregations) and raise the same ValueError messages.
🧹 Nitpick comments (1)
tests/integration/test_mcp_inspect.py (1)

479-493: Use measures here so this new test covers the v2 query API.

These assertions currently go through the v1 fields migration alias, so they won’t catch a regression in direct SlayerQuery.measures parsing. Switching the payloads to measures makes the new primary-key contract test hit the actual public surface.

🧪 Suggested change
-            "fields": [{"formula": "id:sum"}],
+            "measures": [{"formula": "id:sum"}],
...
-            "fields": [{"formula": "id:count"}],
+            "measures": [{"formula": "id:count"}],
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_mcp_inspect.py` around lines 479 - 493, The test uses
the v1 alias "fields" instead of the v2 public API key "measures", so update
both query payloads in test_count_on_pk_allowed (and the previous test that
asserts ValueError) to use "measures" instead of "fields" so the assertions
exercise SlayerQuery.measures parsing; locate the SlayerQuery.model_validate
calls in test_count_on_pk_allowed and the preceding test and replace the
"fields": [...] entries with "measures": [...] preserving the same formula
objects (e.g., {"formula": "id:count"} and {"formula": "id:sum"}) and leave the
await env["engine"].execute(query=q) call unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/concepts/models.md`:
- Around line 3-6: The page mixes v1 and v2 model schemas; update the docs to be
consistent with v2 by removing or replacing any v1-era references (e.g.,
"version: 1", "dimensions", and measure examples using raw `sql`) with v2
concepts: use `columns: List[Column]` and formula-based `measures:
List[ModelMeasure]` as used by SlayerModel, update examples to show `columns`
entries (with types, allowed_aggregations, filters/PK flags) and `measures` as
named formulas (not SQL snippets), and ensure any mention of `version` reflects
v2 (or is removed if not applicable).

In `@slayer/engine/query_engine.py`:
- Around line 288-292: The code currently uses a truthiness check on
c.allowed_aggregations which causes an explicit empty list to fall back to
DEFAULT_AGGREGATIONS_BY_TYPE; change the conditional to check for None
explicitly (e.g., if c.allowed_aggregations is not None) so that an explicit
empty whitelist remains empty, keeping allowed = list(c.allowed_aggregations)
when provided and only using DEFAULT_AGGREGATIONS_BY_TYPE when
allowed_aggregations is actually unset; update the block around
c.allowed_aggregations, allowed, and DEFAULT_AGGREGATIONS_BY_TYPE in
query_engine.py to implement this check so downstream functions like
get_column_types() behave correctly.

---

Outside diff comments:
In `@docs/concepts/models.md`:
- Around line 250-258: The docs example uses the Python constructor SlayerQuery
in the call to engine.create_model_from_query; replace the SlayerQuery(...)
constructor form with a JSON/dict literal for the query argument (e.g., a dict
with keys source_model, time_dimensions, measures) so the example is
language-agnostic and portable; update the call site
engine.create_model_from_query(query=..., name="monthly_summary") to pass that
dict instead of SlayerQuery and ensure field names match the SlayerQuery schema
(source_model, time_dimensions, measures).

In `@slayer/engine/enrichment.py`:
- Around line 571-630: The loop over query.measures parses qfield.formula
directly and never expands bare saved-measure names; fix by checking before
parse_formula whether qfield.formula is a bare name that corresponds to a saved
measure (call model.get_measure(qfield.formula)), and if so replace/expand it
into the saved measure's formula/spec (or parse the saved measure's formula) so
the subsequent logic (including _ensure_aggregated_measure, _flatten_spec, and
label handling) operates on the resolved spec; ensure you preserve qfield.label
when substituting and keep the existing calls to parse_formula,
_ensure_aggregated_measure, and _flatten_spec unchanged for non-saved formulas.

In `@slayer/engine/query_engine.py`:
- Around line 389-395: Replace the call to base.model_copy(...) with
base.model_validate(...) so Pydantic v2 validators run and the collision check
in SlayerModel._validate_column_measure_disjoint() executes at model
construction; call model_validate with an update dict containing the same keys
("columns", "measures", "joins") built from list(base.columns) + extra_cols,
list(base.measures) + extra_measures, and list(base.joins) + extra_joins
respectively so ModelExtension cannot inject colliding names undetected.

---

Duplicate comments:
In `@slayer/engine/query_engine.py`:
- Around line 805-826: The cross-model column lookup branch currently only
blocks numeric-only aggregations on string columns; update it to enforce the
same eligibility gates used for local refs by checking PRIMARY_KEY_AGGREGATIONS,
explicit allowed_aggregations on the column, and DEFAULT_AGGREGATIONS_BY_TYPE
based on column type before accepting a dotted measure; specifically, when
resolving measure_name via target_model.get_column(...) in query_engine.py also
import and consult PRIMARY_KEY_AGGREGATIONS and DEFAULT_AGGREGATIONS_BY_TYPE
(and any column.allowed_aggregations) the same way enrichment.py does, reject
aggregations not permitted by those rules (e.g., primary-key-only aggregations
on non-PK columns, aggregations not in allowed_aggregations, and type-disallowed
aggregations) and raise the same ValueError messages.

---

Nitpick comments:
In `@tests/integration/test_mcp_inspect.py`:
- Around line 479-493: The test uses the v1 alias "fields" instead of the v2
public API key "measures", so update both query payloads in
test_count_on_pk_allowed (and the previous test that asserts ValueError) to use
"measures" instead of "fields" so the assertions exercise SlayerQuery.measures
parsing; locate the SlayerQuery.model_validate calls in test_count_on_pk_allowed
and the preceding test and replace the "fields": [...] entries with "measures":
[...] preserving the same formula objects (e.g., {"formula": "id:count"} and
{"formula": "id:sum"}) and leave the await env["engine"].execute(query=q) call
unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c588135c-f03b-41ba-83e7-cdb216178a25

📥 Commits

Reviewing files that changed from the base of the PR and between 76080fc and e296b1b.

📒 Files selected for processing (5)
  • docs/concepts/models.md
  • slayer/core/enums.py
  • slayer/engine/enrichment.py
  • slayer/engine/query_engine.py
  • tests/integration/test_mcp_inspect.py
✅ Files skipped from review due to trivial changes (1)
  • slayer/core/enums.py

Comment thread docs/concepts/models.md
Comment thread slayer/engine/query_engine.py Outdated
- query_engine: distinguish empty allowed_aggregations from None when
  building the type probe, so an explicit `[]` whitelist no longer falls
  back to type defaults
- dbt converter: detect column-name collisions before appending a filtered
  metric, emit a ConversionWarning and skip instead of silently producing
  duplicate columns
- docs/getting-started/python.md: use execute_sync() in synchronous
  examples and replace v1 model.dimensions with model.columns
- docs/concepts/models.md: bump version 1→2 throughout, drop the v1
  dimensions row from the model-fields reference, replace it with columns,
  and rewrite the measures row as the v2 named-formula library

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ZmeiGorynych ZmeiGorynych merged commit 072f08b into main Apr 29, 2026
3 checks passed
ZmeiGorynych added a commit that referenced this pull request Apr 30, 2026
The S7 PR (#60) merged before S3 (#61) but added integration tests for
median/percentile that used v1 syntax — Field(formula=...) and fields=
on SlayerQuery. The S3 merge into main brought v2 alongside but didn't
update these S7 tests, leaving the merged main with code that fails
ruff check (F821: undefined name 'Field' — never imported).

Until now egor/S3a was based on origin-S3 (pre-merge), so locally the
v1 lines didn't exist. Merged origin/main in (commit 72b0ff4) so the
file now reflects the post-merge tree CI sees, then:

- tests/integration/test_integration.py: replace 6 Field(formula=...)
  uses with ModelMeasure(formula=...). Switch fields=[...] to
  measures=[...] on the same SlayerQuery calls.
- tests/integration/test_integration_duckdb.py and
  tests/integration/test_integration_postgres.py: switch fields=[...]
  to measures=[...] in the median/percentile tests added by S7. Their
  dict-form measure literals already worked via the v1→v2 migration
  but the kwarg name itself was still v1.

Verified locally: ruff clean, 5/5 SQLite median/percentile tests pass.
This is also a partial down-payment on Major 7 (eradicate `fields`)
in the bigger codex follow-up list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

S3: Schema redesign — merge dimensions+measurescolumns; repurpose model-level measures to hold formulas; rename query fieldsmeasures

1 participant