Skip to content

Auto-promote window functions in filters; reject raw OVER in formulas (DEV-1336)#91

Merged
ZmeiGorynych merged 5 commits into
mainfrom
egor/dev-1336-window-functions-in-where-on-single-stage-sqlite-queries
May 5, 2026
Merged

Auto-promote window functions in filters; reject raw OVER in formulas (DEV-1336)#91
ZmeiGorynych merged 5 commits into
mainfrom
egor/dev-1336-window-functions-in-where-on-single-stage-sqlite-queries

Conversation

@ZmeiGorynych
Copy link
Copy Markdown
Member

@ZmeiGorynych ZmeiGorynych commented May 4, 2026

Summary

  • Fixes DEV-1336: window function (OVER (...)) in a SLayer filter no longer crashes with near "OVER": syntax error.
  • A model Column.sql containing a window function is auto-promoted: SLayer materializes the column under its alias in the base CTE and applies the predicate as a post-aggregation outer WHERE (existing _filtered wrap; no multi-stage model needed).
  • Raw OVER (...) SQL inside a ModelMeasure.formula or filter string is rejected at validation/parse time with an actionable error pointing at the rank() / first() / last() / lag() / lead() transforms or a Column.sql-with-window pattern.

Linear: https://linear.app/motley-ai/issue/DEV-1336

Test plan

  • poetry run pytest -m "not integration" — 1527 passed, no regressions
  • poetry run pytest tests/integration/test_integration.py tests/integration/test_integration_duckdb.py -m integration — SQLite + DuckDB end-to-end pass
  • poetry run ruff check slayer/ tests/ — clean
  • New tests added: 5 in TestWindowFunctionInFilter (tests/test_sql_generator.py), 1 ModelMeasure.formula validator (tests/test_models.py), 1 SQLite + 1 DuckDB + 1 Postgres integration test (top-3 by mass)
  • Manual dry-run verification: emitted SQL has window function aliased in inner SELECT and WHERE "<model>.<col>" <= 3 in outer wrap

Docs touched

CLAUDE.md, docs/concepts/queries.md, docs/concepts/formulas.md, docs/concepts/models.md, .claude/skills/slayer-query.md, .claude/skills/slayer-models.md, slayer/help/topics/04_transforms.md.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Filters referencing window-function-backed columns are auto-materialized and applied as outer post-aggregation predicates; pagination now runs after that post-filter wrap.
  • Documentation

    • Clarified window-function guidance, recommended top-N via rank(...) and other transforms, and documented that raw OVER(...) in formulas/filters is rejected with an actionable error and workarounds.
  • Tests

    • Added SQLite/DuckDB/Postgres integration tests and unit tests covering window-filter behavior and error conditions.

… (DEV-1336)

A SLayer query whose filter resolved to SQL containing a window function
crashed every dialect with `near "OVER": syntax error` — window functions
cannot live in WHERE. Three reproduction routes existed:

- A `Column.sql` containing `row_number() over (...)` was inlined verbatim
  into the inner WHERE when filtered on.
- A `ModelMeasure.formula` containing raw `OVER` text broke the Python-AST
  filter parser and surfaced a misleading "Invalid filter syntax" error.
- An inline `OVER` in a query filter string failed the same way.

This commit auto-promotes Route A and rejects Routes B/C with an actionable
error pointing at the `rank()` / `first()` / `last()` / `lag()` / `lead()`
transforms or a `Column.sql`-with-window pattern.

For Route A, enrichment now detects model columns whose `sql` contains
`OVER (...)` (`slayer/sql/window_detect.py`), threads the set through
`resolve_filter_columns` and `_classify_one_filter`, and emits a new
`EnrichedQuery.windowed_filter_columns` list. The base SELECT materializes
each as `<sql> AS "<model>.<name>"` (NOT in GROUP BY); the existing
post-filter wrap in `_generate_with_computed` then references the alias in
an outer `SELECT * FROM (...) AS _filtered WHERE ...`. `generate()` routes
through that wrap whenever any filter is post-classified, even if there
are no expressions or transforms to layer.

For Routes B/C, `parse_filter` and `extract_filter_transforms` reject
raw `OVER` before the AST parser sees it, with a message pointing at the
recommended escape hatches. A new `ModelMeasure.formula` validator catches
the same shape at construction time so the error surfaces at `edit_model`
rather than at first query.

Tested:

- 5 new unit tests in `tests/test_sql_generator.py:TestWindowFunctionInFilter`
  cover auto-promote shape, combined base+window filters, the SELECT-only
  control case, and both error-message paths.
- 1 new `test_models.py` test pins the validator at construction time.
- New SQLite, DuckDB, and Postgres integration tests run the planets/top-3
  end-to-end against each Tier 1 dialect.
- Full unit suite: 1527 passed; ruff clean.

Docs updated in CLAUDE.md, docs/concepts/{queries,formulas,models}.md,
.claude/skills/slayer-{query,models}.md, and slayer/help/topics/04_transforms.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 4, 2026

Warning

Rate limit exceeded

@ZmeiGorynych has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 5 minutes and 37 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5e4977b5-e6c8-434e-a7fb-8d53cef505c3

📥 Commits

Reviewing files that changed from the base of the PR and between b389198 and 2a4b317.

📒 Files selected for processing (3)
  • docs/concepts/formulas.md
  • slayer/sql/generator.py
  • tests/test_sql_generator.py
📝 Walkthrough

Walkthrough

Detects and forbids raw SQL window OVER (...) in filters and measure formulas, materializes model columns whose sql contains window functions into the base CTE when referenced by filters, and applies those predicates as outer post-aggregation WHERE. Documentation and tests updated across dialects.

Changes

Window-function-aware filtering

Layer / File(s) Summary
Detection & Validation
slayer/sql/window_detect.py, slayer/core/formula.py, slayer/core/models.py
Add has_window_function() and WINDOW_IN_FILTER_ERROR. parse_filter() and a ModelMeasure.formula validator pre-check and reject raw OVER (...) with an actionable error.
Enrichment / Engine State
slayer/engine/enriched.py, slayer/engine/enrichment.py
EnrichedQuery gains windowed_filter_columns. enrich_query() computes model columns whose sql contains window functions, passes windowed_column_names into filter resolution, and materializes referenced windowed columns into windowed_filter_columns instead of inlining their SQL into WHERE. resolve_filter_columns() signature extended to accept windowed_column_names.
SQL Generation / Execution Flow
slayer/sql/generator.py
generate() treats post-filters as a reason to use the computed/post-filter path. _generate_base() projects windowed_filter_columns (SELECT-only) into the base CTE and preserves joins referenced by their SQL. _generate_with_computed() exposes these aliases to outer layers and applies ORDER/LIMIT/OFFSET after the outer _filtered post-filter wrapper.
Filter Transform Extraction
slayer/engine/enrichment.py, slayer/core/formula.py
extract_filter_transforms and parse_filter() reject raw OVER (...) early (before AST parsing) to provide clearer errors.
Tests
tests/test_models.py, tests/test_sql_generator.py, tests/integration/*
Unit tests assert rejection of raw OVER (...) in formulas/filters; SQL-generator tests verify materialization and post-filter wrapping, pagination ordering, and join-pruning behavior. Integration tests (SQLite, DuckDB, Postgres) validate top-N filtering via windowed Column.sql.
Documentation
.claude/skills/*.md, CLAUDE.md, docs/concepts/*.md, slayer/help/topics/04_transforms.md
Docs added/expanded to explain disallowed raw OVER (...) usage, the auto-promotion/materialization behavior, recommended rank()/transform alternatives, and the Column.sql window pattern with examples.

Sequence Diagram(s)

sequenceDiagram
    participant Q as Query
    participant F as Filter Parser
    participant E as Enrichment
    participant G as SQLGenerator
    participant DB as Database

    Q->>F: submit filters/formulas
    F->>F: reject raw OVER(...) (if present)
    F->>E: parsed filters
    E->>E: detect windowed model columns, populate windowed_filter_columns
    E->>G: enriched query (with windowed_filter_columns + parsed filters)
    G->>G: materialize windowed columns in base CTE
    G->>G: wrap with outer _filtered WHERE for window predicates
    G->>DB: execute generated SQL
    DB-->>Q: results
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • MotleyAI/slayer#89: Modifies slayer/engine/enrichment.py (resolve_filter_columns/enrich_query signature/behavior) and overlaps enrichment changes.
  • MotleyAI/slayer#78: Touches SQL generation for windowed/CTE-based expressions; related to generator changes.
  • MotleyAI/slayer#64: Related enrichment and SQL generation changes handling windowed expressions and CTEs.

Suggested reviewers

  • AivanF

Poem

🐰 I hopped through SQL with ears a-flutter,

No raw OVER in filters to clutter,
I tunneled windows into the base,
Wrapped post-filter magic—neat little space,
Top-N now dances without a stutter.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the two main changes: auto-promotion of window functions in filters and rejection of raw OVER syntax in formulas, directly matching the core implementation work described in the raw_summary.
Docstring Coverage ✅ Passed Docstring coverage is 89.19% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch egor/dev-1336-window-functions-in-where-on-single-stage-sqlite-queries

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
slayer/engine/enrichment.py (2)

725-776: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Resolve JOINs for materialized window-filter columns before building the base CTE.

_resolve_joins() runs before windowed_filter_columns is derived, so a windowed Column.sql that references join aliases never contributes its paths to _collect_needed_paths(). A filter like ["rn <= 3"] will therefore materialize rn without the JOINs its SQL depends on, and the query breaks as soon as the window expression touches customers__.../other joined aliases. Feed these columns into join planning before calling _resolve_joins().

As per coding guidelines, "Auto-promote filters on Column with window functions: materialize the column in the base CTE, then apply the predicate as a post-aggregation outer WHERE."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/engine/enrichment.py` around lines 725 - 776, Compute
windowed_column_names and derive windowed_filter_columns (the
referenced_windowed mapping -> list) before calling _resolve_joins, and pass
those columns into join planning so their SQL path dependencies are included in
_collect_needed_paths; specifically move the logic that builds
referenced_windowed (using model.get_column and EnrichedDimension with alias
model_name_str) to precede the call to _resolve_joins, then update the
_resolve_joins invocation to accept the windowed_filter_columns (or otherwise
ensure it consumes windowed_column_names/referenced_windowed) so joins required
by windowed Column.sql are planned before resolving joins.

1176-1267: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Dotted windowed column filters still inline raw OVER (...) SQL.

The new fast-path only recognizes bare source-model names in windowed_column_names. Filters on dotted refs such as customers.rn <= 3 fall into the join-walk branch, which resolves the column back to its raw sql and recreates the invalid window predicate instead of treating it as a post-filter alias. This makes the feature incomplete for joined-model columns that are otherwise filterable.

As per coding guidelines, "Reject raw OVER (...) SQL in filter strings and ModelMeasure.formula with an actionable error message. Auto-promote filters on Column with window functions: materialize the column in the base CTE, then apply the predicate as a post-aggregation outer WHERE."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/engine/enrichment.py` around lines 1176 - 1267, The bug is that dotted
refs (e.g., "customers.rn") that are windowed still get inlined with their raw
dim.sql; update resolve_filter_columns so that inside the join-walk branch (the
loop that builds path_parts/current_model and the block that finds dim in
current_model) you first check whether the ORIGINAL dotted col_name is in
windowed_column_names and, if so, do NOT use dim.sql but instead substitute a
canonical aliased reference (use the same alias convention used elsewhere —
e.g., table_alias = "__".join(path_parts) and qualified =
f"{table_alias}.{dim_name}") into resolved_sql via regex, append the original
dotted col_name to resolved_columns (to preserve join requirements), and
continue; otherwise keep the existing behavior of using dim.sql when safe. This
ensures dotted windowed columns are promoted to post-filter aliases rather than
inlining OVER(...) SQL.
🧹 Nitpick comments (4)
slayer/core/models.py (1)

187-187: ⚡ Quick win

Keep window_detect imports at file top.

This validator is solid functionally, but the inline import should be hoisted to module scope to follow repo conventions.

♻️ Proposed fix
@@
 from slayer.storage.migrations import migrate as _migrate_schema
+from slayer.sql.window_detect import WINDOW_IN_FILTER_ERROR, has_window_function
@@ class ModelMeasure(BaseModel):
-        from slayer.sql.window_detect import WINDOW_IN_FILTER_ERROR, has_window_function
         if has_window_function(v):
             raise ValueError(f"ModelMeasure formula '{v}' {WINDOW_IN_FILTER_ERROR}")

As per coding guidelines: **/*.py: Place imports at the top of files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/core/models.py` at line 187, The inline import of
WINDOW_IN_FILTER_ERROR and has_window_function should be moved from inside the
function to the module's top-level imports to follow repo conventions; locate
the inline statement "from slayer.sql.window_detect import
WINDOW_IN_FILTER_ERROR, has_window_function" (currently around the validator
code) and hoist it into the file header with the other imports, then remove the
local import so the validator uses the top-level names.
docs/concepts/formulas.md (1)

218-218: ⚡ Quick win

Use JSON/dict example syntax here instead of a Python constructor.

On a concepts page, Column(name=..., sql=..., type=...) is SDK-specific; please switch to portable JSON/dict-style example text.

As per coding guidelines: docs/**/*.{md,rst,markdown}: Use JSON/dict syntax for all query objects in docs and examples — not Python class constructors.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/concepts/formulas.md` at line 218, The docs line uses a Python
constructor example Column(name="rn", sql="row_number() over (order by mass
desc)", type=NUMBER); replace that constructor with a language-neutral
JSON/dict-style example (e.g., {"name": "rn", "sql": "row_number() over (order
by mass desc)", "type": "NUMBER"}) so the docs follow the rule to use JSON/dict
syntax for query objects; keep references to ModelMeasure.formula, the rank()
transform, and the row_number() over(...) SQL text but present the column
example as a JSON/dict literal rather than a Python class instantiation.
slayer/core/formula.py (1)

846-846: ⚡ Quick win

Move the window-detect import to module scope.

The runtime import works, but it breaks the repo import-placement rule and adds avoidable per-call overhead.

♻️ Proposed fix
@@
 from slayer.core.enums import BUILTIN_AGGREGATIONS
+from slayer.sql.window_detect import WINDOW_IN_FILTER_ERROR, has_window_function
@@ def parse_filter(
-    from slayer.sql.window_detect import WINDOW_IN_FILTER_ERROR, has_window_function
     if has_window_function(formula):
         raise ValueError(f"Filter '{formula}' {WINDOW_IN_FILTER_ERROR}")

As per coding guidelines: **/*.py: Place imports at the top of files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@slayer/core/formula.py` at line 846, The import of WINDOW_IN_FILTER_ERROR and
has_window_function is currently done at runtime inside a function, causing
unnecessary per-call overhead and violating the import-at-top rule; move the
statement "from slayer.sql.window_detect import WINDOW_IN_FILTER_ERROR,
has_window_function" to the module scope (top of the file) so the symbols are
imported once when the module loads and existing references to
WINDOW_IN_FILTER_ERROR and has_window_function elsewhere in this module continue
to work without runtime imports.
tests/test_sql_generator.py (1)

5892-5897: ⚡ Quick win

Assert actionable guidance in the model-formula OVER error message.

Right now this test only checks for a generic “window function” mention. Since DEV-1336 requires actionable guidance, also assert one of the recommended paths (e.g. rank(/lag(/Column.sql/multi-stage) to prevent regressions in error quality.

Suggested assertion hardening
         msg = str(excinfo.value)
         assert "window function" in msg.lower(), (
             f"Error message should mention 'window function'.\nmsg: {msg}"
         )
+        msg_l = msg.lower()
+        assert any(
+            k in msg_l
+            for k in ("rank(", "first(", "last(", "lag(", "lead(", "column.sql", "multi-stage")
+        ), f"Error must suggest a supported transform or Column.sql/multi-stage path.\nmsg: {msg}"
         assert "Perhaps you forgot a comma" not in msg

As per coding guidelines: "Reject raw OVER (...) SQL in filter strings and ModelMeasure.formula with an actionable error message. Auto-promote filters on Column with window functions..."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_sql_generator.py` around lines 5892 - 5897, Update the test
assertion to check that the error message not only mentions "window function"
but also includes actionable guidance by asserting that the message contains at
least one recommended remediation phrase such as "rank(", "lag(", "Column.sql",
or "multi-stage"; modify the block using msg = str(excinfo.value) and the
subsequent asserts to require that msg.lower() contains "window function" and
(msg contains "rank(" or "lag(" or "column.sql" or "multi-stage") while still
ensuring "Perhaps you forgot a comma" is not present.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@slayer/sql/generator.py`:
- Around line 1040-1049: The new windowed projection
(enriched.windowed_filter_columns / wcol.sql) is added to select_columns but not
considered by _generate_base()’s alternate scopes (skip_isolated=True pruning
and has_first_or_last ranked-source switch), causing missing joins; fix by
wiring these windowed columns into the same scope decisions: in the path that
builds the base CTE (the _generate_base logic), treat any
windowed_filter_columns with non-null sql as non-isolated by (a) parsing their
SQL (like you do with sqlglot.parse_one for wcol.sql) to collect referenced __
aliases and add those aliases to the set used to decide isolation/used joins,
and (b) ensure the has_first_or_last branch includes these windowed columns when
switching to the ranked subquery (i.e., propagate them into the inputs/selected
columns for the ranked source instead of dropping them); update the logic that
computes isolated_columns/used_aliases and the has_first_or_last source
selection to include enriched.windowed_filter_columns so joins referenced by
window expressions are preserved.
- Around line 288-301: The branch that handles "no measure CTEs, just computed
columns or post-filters" calls _generate_with_computed(enriched=enriched,
base_sql=base_sql) which causes ORDER BY/LIMIT/OFFSET to be applied before the
outer filtered wrapper and double-applies pagination for pure post-filter
queries; fix by ensuring the base_sql passed into _generate_with_computed is the
unpaginated version (i.e., strip/avoid the LIMIT/OFFSET/ORDER BY that
_generate_base/_generate_pagination would add) so pagination is only applied
after post-filtering, for example by passing a base_sql generated without
pagination or adding a flag parameter to _generate_with_computed to disable
inner pagination, and update callers and helper logic in _generate_with_computed
and any pagination helper to respect that flag.

In `@tests/test_sql_generator.py`:
- Around line 5870-5872: The test function
test_window_function_in_named_measure_used_as_filter_raises is declared async
and accepts an unused fixture parameter generator; change its signature from
"async def test_window_function_in_named_measure_used_as_filter_raises(self,
generator: SQLGenerator) -> None:" to a synchronous "def
test_window_function_in_named_measure_used_as_filter_raises(self) -> None:" and
remove any unused "generator" parameter from the body and any related unused
imports so the test runs synchronously without unused fixture injection.

---

Outside diff comments:
In `@slayer/engine/enrichment.py`:
- Around line 725-776: Compute windowed_column_names and derive
windowed_filter_columns (the referenced_windowed mapping -> list) before calling
_resolve_joins, and pass those columns into join planning so their SQL path
dependencies are included in _collect_needed_paths; specifically move the logic
that builds referenced_windowed (using model.get_column and EnrichedDimension
with alias model_name_str) to precede the call to _resolve_joins, then update
the _resolve_joins invocation to accept the windowed_filter_columns (or
otherwise ensure it consumes windowed_column_names/referenced_windowed) so joins
required by windowed Column.sql are planned before resolving joins.
- Around line 1176-1267: The bug is that dotted refs (e.g., "customers.rn") that
are windowed still get inlined with their raw dim.sql; update
resolve_filter_columns so that inside the join-walk branch (the loop that builds
path_parts/current_model and the block that finds dim in current_model) you
first check whether the ORIGINAL dotted col_name is in windowed_column_names
and, if so, do NOT use dim.sql but instead substitute a canonical aliased
reference (use the same alias convention used elsewhere — e.g., table_alias =
"__".join(path_parts) and qualified = f"{table_alias}.{dim_name}") into
resolved_sql via regex, append the original dotted col_name to resolved_columns
(to preserve join requirements), and continue; otherwise keep the existing
behavior of using dim.sql when safe. This ensures dotted windowed columns are
promoted to post-filter aliases rather than inlining OVER(...) SQL.

---

Nitpick comments:
In `@docs/concepts/formulas.md`:
- Line 218: The docs line uses a Python constructor example Column(name="rn",
sql="row_number() over (order by mass desc)", type=NUMBER); replace that
constructor with a language-neutral JSON/dict-style example (e.g., {"name":
"rn", "sql": "row_number() over (order by mass desc)", "type": "NUMBER"}) so the
docs follow the rule to use JSON/dict syntax for query objects; keep references
to ModelMeasure.formula, the rank() transform, and the row_number() over(...)
SQL text but present the column example as a JSON/dict literal rather than a
Python class instantiation.

In `@slayer/core/formula.py`:
- Line 846: The import of WINDOW_IN_FILTER_ERROR and has_window_function is
currently done at runtime inside a function, causing unnecessary per-call
overhead and violating the import-at-top rule; move the statement "from
slayer.sql.window_detect import WINDOW_IN_FILTER_ERROR, has_window_function" to
the module scope (top of the file) so the symbols are imported once when the
module loads and existing references to WINDOW_IN_FILTER_ERROR and
has_window_function elsewhere in this module continue to work without runtime
imports.

In `@slayer/core/models.py`:
- Line 187: The inline import of WINDOW_IN_FILTER_ERROR and has_window_function
should be moved from inside the function to the module's top-level imports to
follow repo conventions; locate the inline statement "from
slayer.sql.window_detect import WINDOW_IN_FILTER_ERROR, has_window_function"
(currently around the validator code) and hoist it into the file header with the
other imports, then remove the local import so the validator uses the top-level
names.

In `@tests/test_sql_generator.py`:
- Around line 5892-5897: Update the test assertion to check that the error
message not only mentions "window function" but also includes actionable
guidance by asserting that the message contains at least one recommended
remediation phrase such as "rank(", "lag(", "Column.sql", or "multi-stage";
modify the block using msg = str(excinfo.value) and the subsequent asserts to
require that msg.lower() contains "window function" and (msg contains "rank(" or
"lag(" or "column.sql" or "multi-stage") while still ensuring "Perhaps you
forgot a comma" is not present.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3577c0e5-7d0a-4f55-9b94-853e16ee938d

📥 Commits

Reviewing files that changed from the base of the PR and between d650f70 and 1277a40.

📒 Files selected for processing (18)
  • .claude/skills/slayer-models.md
  • .claude/skills/slayer-query.md
  • CLAUDE.md
  • docs/concepts/formulas.md
  • docs/concepts/models.md
  • docs/concepts/queries.md
  • slayer/core/formula.py
  • slayer/core/models.py
  • slayer/engine/enriched.py
  • slayer/engine/enrichment.py
  • slayer/help/topics/04_transforms.md
  • slayer/sql/generator.py
  • slayer/sql/window_detect.py
  • tests/integration/test_integration.py
  • tests/integration/test_integration_duckdb.py
  • tests/integration/test_integration_postgres.py
  • tests/test_models.py
  • tests/test_sql_generator.py

Comment thread slayer/sql/generator.py
Comment thread slayer/sql/generator.py
Comment thread tests/test_sql_generator.py Outdated
- generator.py: pure post-filter queries no longer double-apply pagination
  or filter the wrong page — the `_filtered` wrap now sits inside the
  pagination wrapper, and `_generate_base` skips inner ORDER/LIMIT when
  post-filters are present (CR Thread 1).
- generator.py: when `skip_isolated=True`, joins referenced by a windowed
  `Column.sql` (e.g. `partition by systems.galaxy_id`) are now preserved
  in the `_base` CTE instead of being pruned (CR Thread 2).
- test_sql_generator.py: drop unused `async`/`generator` fixture from
  test_window_function_in_named_measure_used_as_filter_raises (CR Thread
  3 / Sonar S7503), and add three regression tests for the fixes above.
- Hoist `slayer.sql.window_detect` imports to module scope in
  core/models.py, core/formula.py, and engine/enrichment.py (CR nitpick).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/test_sql_generator.py (1)

5803-5803: ⚡ Quick win

Move newly added imports to module scope

These new in-function imports should be hoisted to the file header to match repo import rules.

♻️ Suggested change
 import tempfile
 from contextlib import asynccontextmanager
+import re as _re

 import pytest
 import sqlglot

 from slayer.core.enums import DataType, TimeGranularity
 from slayer.core.models import Aggregation, AggregationParam, Column, ModelJoin, ModelMeasure, SlayerModel
 from slayer.core.query import ColumnRef, OrderItem, SlayerQuery, TimeDimension
 from slayer.engine.enriched import EnrichedMeasure, EnrichedQuery
+from slayer.engine.enrichment import enrich_query
 from slayer.engine.query_engine import SlayerQueryEngine
 from slayer.sql.generator import SQLGenerator, _validate_agg_param_value
 from slayer.storage.yaml_storage import YAMLStorage
-        import re as _re
         for match in _re.finditer(r"\bWHERE\b", upper):
-        from slayer.engine.enrichment import enrich_query
-
         planets_with_time = SlayerModel(
-        from slayer.engine.enrichment import enrich_query
-
         systems = SlayerModel(

As per coding guidelines: "Place imports at the top of files".

Also applies to: 5945-5945, 6000-6000

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_sql_generator.py` at line 5803, The test adds in-function imports
like "import re as _re" (and similar imports found near the occurrences
referenced) which violates the repo rule to keep imports at module scope; move
these in-function imports to the top of the file header alongside other imports
so they are module-level imports, remove the redundant local import statements
inside the functions, and update any references to _re (or the other imported
aliases) to use the module-scope names (locate usages by the symbol "_re" and
the other newly added import names such as those around the noted occurrences).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/test_sql_generator.py`:
- Line 5803: The test adds in-function imports like "import re as _re" (and
similar imports found near the occurrences referenced) which violates the repo
rule to keep imports at module scope; move these in-function imports to the top
of the file header alongside other imports so they are module-level imports,
remove the redundant local import statements inside the functions, and update
any references to _re (or the other imported aliases) to use the module-scope
names (locate usages by the symbol "_re" and the other newly added import names
such as those around the noted occurrences).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1cc3846d-4535-4e20-95b0-c2a24d87a48e

📥 Commits

Reviewing files that changed from the base of the PR and between 1277a40 and 99f212f.

📒 Files selected for processing (5)
  • slayer/core/formula.py
  • slayer/core/models.py
  • slayer/engine/enrichment.py
  • slayer/sql/generator.py
  • tests/test_sql_generator.py
✅ Files skipped from review due to trivial changes (1)
  • slayer/engine/enrichment.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • slayer/sql/generator.py

…ions-in-where-on-single-stage-sqlite-queries

# Conflicts:
#	slayer/engine/enrichment.py
#	tests/integration/test_integration.py
#	tests/integration/test_integration_duckdb.py
#	tests/integration/test_integration_postgres.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
slayer/sql/generator.py (1)

1059-1068: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Windowed filter columns can break under the first/last ranked-source path.

At Line 1059-Line 1068, windowed Column.sql is materialized in the outer base select. When has_first_or_last is active, FROM is switched to a ranked subquery (Line 1013+ path) that does not expose joined-table aliases. A windowed column SQL referencing joined aliases can then fail at runtime with missing table/column scope.

As per coding guidelines, derived Column.sql is allowed to reference join-graph columns via single-dot or __ notation, so this scope must be preserved in the ranked-source branch too.

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/concepts/formulas.md`:
- Line 220: Replace the Python-constructor-style example Column(name="rn",
sql="row_number() over (order by mass desc)", type=NUMBER) with a
language-agnostic dict/JSON representation, e.g. {"name": "rn", "sql":
"row_number() over (order by mass desc)", "type": "NUMBER"}, so the docs follow
the guideline; keep references to ModelMeasure.formula, use of rank(), and the
explanatory text unchanged, only swap the Column(...) constructor example to the
JSON/dict form.

In `@slayer/sql/generator.py`:
- Around line 1064-1068: Replace direct calls to sqlglot.parse_one when parsing
windowed filter column SQL with the generator's wrapper so SQLite-specific AST
rewrites run: in the loop over enriched.windowed_filter_columns (where you parse
wcol.sql and do select_columns.append(wcol_expr.as_(wcol.alias))) use
self._parse(...) / SQLGenerator._parse instead of sqlglot.parse_one(...). Do the
same change at the other occurrence around the second block (the code referenced
at lines ~1136-1140) so json_extract rewrites for SQLite are preserved.

In `@tests/test_sql_generator.py`:
- Around line 5943-5946: Split the generated SQL string stored in norm at the
token "AS _filtered" (use a case-sensitive split on that substring) into
inner_sql and outer_sql, then assert that "Pluto" is present in inner_sql and
not present in outer_sql (e.g., assert "Pluto" in inner_sql and "Pluto" not in
outer_sql); keep the existing check that the window predicate ('"planets.rn" <=
3' or '"planets"."rn" <= 3') appears in the outer SQL segment to validate the
window filter is applied on the alias.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 26050aca-187c-454e-8946-592d568bde67

📥 Commits

Reviewing files that changed from the base of the PR and between 99f212f and b389198.

📒 Files selected for processing (12)
  • .claude/skills/slayer-models.md
  • .claude/skills/slayer-query.md
  • CLAUDE.md
  • docs/concepts/formulas.md
  • docs/concepts/models.md
  • docs/concepts/queries.md
  • slayer/engine/enrichment.py
  • slayer/sql/generator.py
  • tests/integration/test_integration.py
  • tests/integration/test_integration_duckdb.py
  • tests/integration/test_integration_postgres.py
  • tests/test_sql_generator.py
✅ Files skipped from review due to trivial changes (4)
  • tests/integration/test_integration_duckdb.py
  • docs/concepts/queries.md
  • .claude/skills/slayer-models.md
  • docs/concepts/models.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • .claude/skills/slayer-query.md
  • tests/integration/test_integration.py
  • CLAUDE.md
  • slayer/engine/enrichment.py

Comment thread docs/concepts/formulas.md Outdated
Comment thread slayer/sql/generator.py
Comment thread tests/test_sql_generator.py Outdated
ZmeiGorynych and others added 2 commits May 5, 2026 11:11
CodeRabbit nitpick: per the global "imports at the top" rule, no
inline imports should live inside function bodies. Hoisted 69 inline
imports (16x enrich_query, 23x tempfile, 23x YAMLStorage, 7x re as _re,
plus single occurrences of ParsedFilter, Aggregation, EnrichedDimension,
CrossModelMeasure, _cte_name_from_alias, AsyncMock/MagicMock/patch).

Also added NOSONAR(S7503) on the `resolve_join_target` callback in
test_windowed_filter_column_referencing_join_with_cross_model_measure
— Sonar flagged async-with-no-await but the async signature is
required by enrich_query's callback contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- docs/concepts/formulas.md: switch the windowed Column.sql example
  from Python-constructor syntax to dict syntax per docs/CLAUDE.md
  (CR Thread 1).
- generator.py: parse `wcol.sql` via `self._parse(...)` instead of
  `sqlglot.parse_one(...)` so SQLite's `json_extract` rewrite still
  applies to windowed columns containing JSON path access (CR Thread 2,
  2 sites).
- test_sql_generator.py: strengthen the Pluto assertion in
  test_filter_on_windowed_column_combined_with_base_filter — split the
  SQL at `AS _filtered` and check that the base filter lives in the
  inner SELECT only, not the outer wrap (CR Thread 3).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 5, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant