DEV-1501: parametric first/last in ORDER BY / HAVING — materialise + outer trim#159
Conversation
…outer trim
Hidden first/last aggregates referenced only by ORDER BY or HAVING used to
render inline against a dangling `_last_rn`, collapsing distinct
explicit-time-column specs into a single broken term. The architecture's
documented design (planning.md:89-93,155-159) already mandates the fix:
materialise hidden order/filter slots in the base CTE, trim them via an
outer SELECT. The transform path already did this; this commit extends the
same pattern to the no-transform path.
Five coordinated changes:
* planning.py `_canonical_name` routes AggregateKey through
`canonical_agg_name` so parametric aggregates (`revenue:last(created_at)`,
`revenue:percentile(p=0.5)`) carry their arg/kwarg signature in the
declared name — distinct hidden slots get distinct base-CTE aliases.
* generator.py `_collect_base_aux_slot_ids` learns `aggregates_only=True`
and walks AGGREGATE-phase filter deps too. The no-transform path uses it
to pull hidden AggregateKey order/filter operands into base_render_order
without dragging ROW leaves into GROUP BY.
* generator.py `_build_outer_trim_wrap_sql` wraps the base SELECT when any
hidden materialised slot is present, projecting exactly the public
projection (mirroring the transform path's outer wrap, cycling
aliases_by_slot_id for C13 duplicates) and moving ORDER BY / LIMIT /
OFFSET to the outer level. `_apply_order_limit_from_planned` resolves
hidden AggregateKey order targets via the materialised aliases.
* generator.py `_build_first_last_base_select` collects full aliases per
sid (multi-alias list) and pass-2 cycles per visit — pre-existing C13
bug exposed by the wrap.
* generator.py `_build_agg` suffix guard no longer requires
`default_time_col`; FirstLastRenderState threads the rn maps from the
base builder into `_build_where_having_from_planned` so HAVING
aggregates reference the same `_first_rn` / `_last_rn{suffix}` columns
the base projects.
Tests: 27 new in test_sql_generator (TestDev1501HiddenFirstLastRender +
TestDev1501BroadTriggerAndGuards + multi-dialect parametric) and
test_projection_planner (TestDev1501CanonicalNameWithAggArgs) plus one
SQLite integration test against real data. test_projection_trim's
`test_declared_projection_order_and_hidden_exclusion` second assertion
relaxed to "no quantity_sum in OUTER projection" to match the materialise
+trim shape per docs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughImplements DEV-1501: canonicalizes parametric aggregate hidden-slot names, threads FirstLastRenderState through base/ranked plans into HAVING/cross-model rendering, detects hidden-materialized base slots and emits an outer-trim wrapper projecting only public aliases, and adds SQL-introspection helpers plus extensive tests. ChangesDEV-1501: Hidden Parametric Aggregates and Outer-Trim Wrapping
Sequence Diagram(s)sequenceDiagram
participant Planner as ProjectionPlanner.plan
participant Generator as generate_from_planned
participant BaseSelect as _build_base_select_for_planned
participant Wrap as _build_outer_trim_wrap_sql
participant ApplyOrder as _apply_order_limit_from_planned
Planner->>Generator: produce planned query (includes hidden AggregateKeys)
Generator->>BaseSelect: build base SQL (+ first_last_state)
BaseSelect-->>Generator: base_sql, first_last_state, aliases_by_slot_id
Generator->>Generator: inspect base for hidden-materialized slots
alt hidden materialized
Generator->>Wrap: emit outer trim projecting public aliases
Wrap-->>Generator: wrapped_sql
Generator->>ApplyOrder: apply ORDER/LIMIT/OFFSET using aliases_by_slot_id
ApplyOrder-->>Generator: final_sql
else no hidden materialized
Generator->>ApplyOrder: apply ORDER/LIMIT/OFFSET to base
ApplyOrder-->>Generator: final_sql
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
slayer/sql/generator.py (1)
5648-5656:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThread the ranked first/last state into cross-model HAVING rendering too.
_render_cross_model_cte()computesrn_suffix_map/filtered_rn_map/filtered_match_mapfor first/last just above, but routed HAVING filters rebuild the aggregate through_render_filter_value_key_in_target_scope()with a plain_build_agg(synth). For filtered or explicit-time-columnfirst()/last(), that makes the HAVING expression diverge from the CTE's SELECT expression and can bind to the wrong_first_rn/_last_rn{suffix}columns.Also applies to: 5772-5801
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/sql/generator.py` around lines 5648 - 5656, The HAVING expression reconstruction for routed filters diverges for first()/last() because `_render_filter_value_key_in_target_scope()` calls `_build_agg(synth)` without the ranked-first/last context computed in `_render_cross_model_cte()` (rn_suffix_map / filtered_rn_map / filtered_match_map), causing wrong `_first_rn`/`_last_rn{suffix}` bindings; fix by threading the ranked state into the routed-HAVING path: add parameters to `_render_filter_value_key_in_target_scope()` (and if needed `_build_agg()`) to accept rn_suffix_map/filtered_rn_map/filtered_match_map (or a single ranked_state object) and use those when rendering aggregates for first()/last(), then when calling `_collect_routed_filters()` / `cte_select.having(...)` pass the computed ranked_state from `_render_cross_model_cte()` so the HAVING expression uses the same suffixes and match columns as the CTE SELECT.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@slayer/sql/generator.py`:
- Around line 2862-2879: The no-transform branch (when no
planned_query.transform_layers) currently materializes Phase.POST dependencies
via extra_materialize_ids/_collect_base_aux_slot_ids but never applies POST
filters because _build_where_having_from_planned skips POST; fix by invoking the
post-phase rendering path (call _render_post_phase_filter_conditions or ensure
_build_where_having_from_planned handles Phase.POST) inside the no_transform
branch after extra_materialize_ids are computed so POST filters are applied to
base_render_order; apply the same change to the other occurrence noted (around
the second block at lines 2946-2969) so both no-transform codepaths run the POST
filter rendering.
---
Outside diff comments:
In `@slayer/sql/generator.py`:
- Around line 5648-5656: The HAVING expression reconstruction for routed filters
diverges for first()/last() because `_render_filter_value_key_in_target_scope()`
calls `_build_agg(synth)` without the ranked-first/last context computed in
`_render_cross_model_cte()` (rn_suffix_map / filtered_rn_map /
filtered_match_map), causing wrong `_first_rn`/`_last_rn{suffix}` bindings; fix
by threading the ranked state into the routed-HAVING path: add parameters to
`_render_filter_value_key_in_target_scope()` (and if needed `_build_agg()`) to
accept rn_suffix_map/filtered_rn_map/filtered_match_map (or a single
ranked_state object) and use those when rendering aggregates for first()/last(),
then when calling `_collect_routed_filters()` / `cte_select.having(...)` pass
the computed ranked_state from `_render_cross_model_cte()` so the HAVING
expression uses the same suffixes and match columns as the CTE SELECT.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0041868d-f881-48f2-b25d-32764b508181
📒 Files selected for processing (8)
docs/architecture/planning.mddocs/architecture/sql-generation.mdslayer/engine/planning.pyslayer/sql/generator.pytests/integration/test_integration.pytests/test_projection_planner.pytests/test_projection_trim.pytests/test_sql_generator.py
…hygiene Codex + CodeRabbit + SonarQube review of PR #159 surfaced two genuine HAVING render bugs and a defensive POST-walk gap, all rooted in incomplete plumbing of the ranked-subquery state into the HAVING render paths added by the original Change 5. Group A — HAVING ranked-state threading (2 real bugs): * Host HAVING: ``_render_value_key_for_filter``'s AggregateKey branch built the synth with placeholder ``full_alias="__having_ref__"``, so ``_build_agg``'s ``filtered_rn_map[alias]`` / ``filtered_match_map [alias]`` lookups missed and the term fell back to unfiltered ``_last_rn`` + raw ``filter_sql`` for filtered first/last. Thread ``aliases_by_slot_id`` into ``_build_where_having_from_planned`` → ``_render_value_key_for_filter`` and look up the materialised slot's full alias as the synth alias. * Cross-model HAVING: ``_collect_routed_filters`` → ``_render_filter_value_key_in_target_scope`` rebuilt the aggregate via ``_build_agg(synth)`` with no rn-state context, so a cross-model filtered first/last in HAVING bound to bare ``_last_rn`` instead of the CTE's ``_last_rn_f0``. Extend ``FirstLastRenderState`` with ``agg_synth_alias`` (single-aggregate cross-model CTE case), ``_render_cross_model_cte`` builds the state from its existing rn maps + the projected aggregate's ``full_agg_alias``, threaded through ``_collect_routed_filters`` / ``_render_filter_value_key_in_target_scope`` recursion. Group B — defensive guard: * ``_collect_base_aux_slot_ids`` no longer walks POST-phase filter deps unless ``transform_layers`` exist. POST-phase requires TransformKey operands, so POST in no-transform queries is planner-unreachable; walking the deps without applying the filter would silently materialise helper slots. Group C — Sonar S3776 NOSONAR markers on 5 functions over the complexity threshold (``_canonical_name`` planning.py:631; ``generate_from_planned``, ``_collect_base_aux_slot_ids``, ``_render_value_key_for_filter``, ``_apply_order_limit_from_planned`` generator.py), each with a rationale matching the existing codebase pattern. Pre-existing test helper at tests/test_sql_generator.py:88 left untouched. Group D — two small Sonar test fixes: relax the reluctant-quantifier HAVING regex from ``r"HAVING(.*?)(ORDER BY|$)"`` to ``r"HAVING(.*)$"`` (inner SELECT has no ORDER BY after outer wrap); drop ``async`` from ``test_hidden_composite_order_rejected_at_input_validation`` (does no ``await`` — just an OrderItem input-validation check). Tests: 2 new regression tests for the HAVING bugs (``test_filtered_first_last_in_having_uses_filtered_rn``, ``test_cross_model_filtered_last_in_having``); all 32 DEV-1501 tests pass; full non-integration suite 3931 pass; SQLite + DuckDB integration suites pass; ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@coderabbitai re your outside-diff comment on
Regression test: |
|
Tip For best results, initiate chat on the files or code changes.
|
… + NOSONAR Codex round-2 review caught a residual gap from the round-1 fix: ``_render_value_key_for_filter``'s recursive calls for ``ScalarCallKey`` / ``BetweenKey`` / ``InKey`` threaded ``first_last_state`` but DROPPED ``aliases_by_slot_id``. So a filtered ``paid_amount:last(created_at)`` nested under ``coalesce(...)``, ``IN (...)``, or ``BETWEEN`` rebuilt the synth with placeholder ``__having_ref__`` → ``filtered_rn_map[alias]`` missed → fell back to unfiltered ``_last_rn`` + raw ``filter_sql``. The original round-1 regression test only exercised the ``>`` (ArithmeticKey) path. Fix: pass ``aliases_by_slot_id`` through the three missing recursion sites in ``_render_value_key_for_filter``. New regression test pins a ``coalesce(paid_amount:last(created_at), 0) > 0`` HAVING shape to prevent re-occurrence. Sonar S3776 NOSONAR marker on ``_render_filter_value_key_in_target_scope`` (the function grew from the round-1 Group A.3 edits; complexity 26). Also: reply on the two stale CodeRabbit threads (POST-drop guard + cross-model HAVING rn-state) that were addressed in commit dc7a775 but not auto-resolved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…n-state
Codex round-3 review caught the last rn-state plumbing gap from the
DEV-1501 series: composite aggregate slots (ArithmeticKey /
ScalarCallKey of aggregates) projected on the first/last-ranked path
render via ``_render_aggregate_composite_expr`` which previously did
NOT receive ``rn_suffix_map`` / ``default_time_col_sql`` /
``filtered_rn_map`` / ``filtered_match_map``. A composite like
``revenue:last(created_at) + revenue:last(updated_at)`` therefore
emitted ``MAX(CASE WHEN _last_rn = 1 …) + MAX(CASE WHEN _last_rn = 1
…)`` — both operands collapsed to bare ``_last_rn`` regardless of
their explicit time arg.
Two coordinated fixes:
* ``_render_aggregate_composite_expr`` now accepts the four rn-state
params (defaults to ``None`` so non-first/last call sites are
unaffected) and passes them to ``_build_agg`` in the AggregateKey
branch + threads through ArithmeticKey / ScalarCallKey recursion.
* ``_build_first_last_base_select`` gathers composite-operand
first/last AggregateKey leaves (via the new module-level
``_iter_first_last_leaves`` walker) and includes their synth specs
in the ranked-subquery synth list — so each operand's distinct time
column contributes its own ``_first_rn`` / ``_last_rn{suffix}``
column. The operands are NOT projected as base SELECT columns
(they're inlined inside the composite render); they only participate
in the rn-column set.
New regression test: ``test_composite_first_last_in_projection_uses_
correct_suffixes`` exercises a query with both a direct projected
``last(created_at)`` AND a composite ``last(created_at) +
last(updated_at)``, asserting the composite renders each operand with
a DIFFERENT ``_last_rn{suffix}``.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
slayer/sql/generator.py (2)
8365-8413:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftFiltered first/last leaves inside HAVING composites still miss their materialized alias.
For composite HAVING expressions,
slot_by_keyis intentionally absent for the inlined first/last leaves, sohaving_full_aliasstays__having_ref__. But_build_first_last_base_select()materializes those filtered leaves under synthetic aliases like..._composite_op_N, andfiltered_rn_map/filtered_match_mapare keyed by that alias. The lookup therefore misses here and falls back to rawfilter_sql, which can reference columns that are out of scope outside the ranked subquery. You'll need to carry the composite-leaf alias mapping intoFirstLastRenderStateand use it here.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/sql/generator.py` around lines 8365 - 8413, The composite HAVING leaves' synthetic aliases (created in _build_first_last_base_select) need to be threaded into the FirstLastRenderState and consulted when resolving having_full_alias: extend FirstLastRenderState (the class created/returned by _build_first_last_base_select) to include the composite-leaf alias mapping (e.g. composite_alias_map keyed by the inlined leaf identifier or slot/key), populate that map where _build_first_last_base_select creates synthetic aliases, and then modify this block (where having_full_alias is computed using slot_by_key/aliases_by_slot_id) to also consult first_last_state.composite_alias_map when slot_by_key is missing for an inlined composite leaf so the lookup uses the materialised alias before falling back to "__having_ref__" (then pass that alias through to _build_agg as currently done).
5956-6018:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReroot cross-model HAVING aggregates the same way as the CTE projection path.
This branch only strips the
ColumnKeysource path._render_cross_model_cte()reroots the source, explicit time args, and kwargs before synthesizing the projected aggregate, so a HAVING on a joined-targetfirst/lastor parametric aggregate can still resolve against host-path aliases here or hit the derived-timeNotImplementedErroreven though the CTE itself rendered the aggregate successfully. Reuse the same reroot logic before calling_build_agg_render_spec_from_planned.As per coding guidelines,
**/{*join*,*generator*,*cross*model*}.py: "Cross-model measures use dotted syntax with colon aggregation ... and multi-hop ...".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/sql/generator.py` around lines 5956 - 6018, The HAVING-aggregate path only strips ColumnKey.path but misses the cross-model "reroot" steps used in _render_cross_model_cte, causing parametric/first-last aggregates to resolve incorrectly; before calling _build_agg_render_spec_from_planned, apply the same reroot logic used in _render_cross_model_cte to local_agg (and tmp_slot/key) so the source is rerooted to the host path, explicit time args and kwargs are translated to the host scope, and column_filter_key (and any derived-time handling) is preserved; locate the reroot implementation in _render_cross_model_cte and invoke it (or inline the same transformations) on local_agg/value_key/tmp_slot so synth is built with the rerooted source, args, and kwargs.
🧹 Nitpick comments (1)
tests/test_sql_generator.py (1)
5713-5723: ⚡ Quick winHoist the
pydanticimport to module scope.This new function-local import breaks the file-level import rule and is the only import in the test body.
As per coding guidelines, "Place imports at the top of files".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_sql_generator.py` around lines 5713 - 5723, The test function test_hidden_composite_order_rejected_at_input_validation currently contains a function-local import "from pydantic import ValidationError as PydanticValidationError"; move this import to the module top-level (alongside other imports) and remove the in-function import so the test uses the module-scoped PydanticValidationError symbol instead, preserving the exact alias and ensuring no other behavior changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_sql_generator.py`:
- Around line 5276-5282: The fixture defines a boolean Column named "active"
(DataType.BOOLEAN) but the filtered Column "active_score" uses filter="active =
1", which causes Postgres type errors; update the Column(...) for active_score
to use a boolean literal in its filter (e.g., "active = TRUE") so the filter
matches DataType.BOOLEAN and embedded SQL (generated in functions that consume
Column.filter) is valid.
---
Outside diff comments:
In `@slayer/sql/generator.py`:
- Around line 8365-8413: The composite HAVING leaves' synthetic aliases (created
in _build_first_last_base_select) need to be threaded into the
FirstLastRenderState and consulted when resolving having_full_alias: extend
FirstLastRenderState (the class created/returned by
_build_first_last_base_select) to include the composite-leaf alias mapping (e.g.
composite_alias_map keyed by the inlined leaf identifier or slot/key), populate
that map where _build_first_last_base_select creates synthetic aliases, and then
modify this block (where having_full_alias is computed using
slot_by_key/aliases_by_slot_id) to also consult
first_last_state.composite_alias_map when slot_by_key is missing for an inlined
composite leaf so the lookup uses the materialised alias before falling back to
"__having_ref__" (then pass that alias through to _build_agg as currently done).
- Around line 5956-6018: The HAVING-aggregate path only strips ColumnKey.path
but misses the cross-model "reroot" steps used in _render_cross_model_cte,
causing parametric/first-last aggregates to resolve incorrectly; before calling
_build_agg_render_spec_from_planned, apply the same reroot logic used in
_render_cross_model_cte to local_agg (and tmp_slot/key) so the source is
rerooted to the host path, explicit time args and kwargs are translated to the
host scope, and column_filter_key (and any derived-time handling) is preserved;
locate the reroot implementation in _render_cross_model_cte and invoke it (or
inline the same transformations) on local_agg/value_key/tmp_slot so synth is
built with the rerooted source, args, and kwargs.
---
Nitpick comments:
In `@tests/test_sql_generator.py`:
- Around line 5713-5723: The test function
test_hidden_composite_order_rejected_at_input_validation currently contains a
function-local import "from pydantic import ValidationError as
PydanticValidationError"; move this import to the module top-level (alongside
other imports) and remove the in-function import so the test uses the
module-scoped PydanticValidationError symbol instead, preserving the exact alias
and ensuring no other behavior changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2e9e563f-5c80-42d1-97d0-44889b97a639
📒 Files selected for processing (3)
slayer/engine/planning.pyslayer/sql/generator.pytests/test_sql_generator.py
🚧 Files skipped from review as they are similar to previous changes (1)
- slayer/engine/planning.py
…anked subquery
Codex round-4 review caught the entry-gate gap from round 3's composite
fix: ``_has_first_last_aggregate`` only walked top-level ``AggregateKey``
slots. A query whose ONLY first/last reference is inside a composite
slot (no direct first/last sibling) — e.g.
``measures=[ModelMeasure("revenue:last(created_at) + revenue:last(updated_at)")]``
— took the regular base path, the composite render emitted ``MAX(CASE
WHEN _last_rn = 1 …)`` referencing a column the bare-FROM never
projected.
Fix: extend ``_has_first_last_aggregate`` to also walk composite slot
keys via ``_iter_first_last_leaves``. Combined with round 3's
composite-operand synth gathering and rn-state threading, composite-only
queries now correctly build the ranked subquery and render each operand
with its own ``_first_rn`` / ``_last_rn{suffix}``.
New regression test:
``test_composite_only_first_last_triggers_ranked_subquery``.
Also: Sonar S3776 NOSONAR(S3776) on ``_iter_first_last_leaves`` (the
walker is the per-type recursion contract — extracting per-type helpers
would scatter it); Sonar S3626 redundant ``return`` removed at the end
of the InKey branch.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ined-path discovery Codex round-5 review caught two more composite-aware code paths the round 3/4 fixes missed: 1. ``_build_first_last_base_select``'s "no-default ranking time column" validation only walked top-level ``AggregateKey`` slots — a composite like ``revenue:last + 1`` (no explicit time arg, no model default_time_dimension) bypassed the error and then ``_build_unfiltered_rn_columns`` emitted ``ORDER BY None``. Extend the validation to also walk composite slots via ``_iter_first_last_leaves``. 2. ``_collect_joined_paths_for_base`` walked explicit time args only on top-level ``AggregateKey`` slots. A composite operand ``revenue:last(customers.signed_up_at) + 1`` orders the ranked subquery by ``customers.signed_up_at`` but the ``customers`` join was never pulled into the FROM, producing invalid SQL. Extend path discovery to walk composite-slot first/last leaves. Two new regression tests: * ``test_composite_first_last_without_default_time_dim_raises`` * ``test_composite_first_last_with_joined_time_arg_adds_join`` Also: CodeRabbit minor — the round-2 cross-model HAVING regression test used ``filter="active = 1"`` on a BOOLEAN column, which Postgres rejects (no implicit bool↔int cast). Switched to ``= TRUE``. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@slayer/sql/generator.py`:
- Around line 6294-6313: The loop that collects join paths for first/last
composite leaves only handles ColumnKey args, missing ColumnSqlKey which
_resolve_explicit_time_col() can handle; update the inner check in the
Phase.AGGREGATE handling (the fl_keys loop) to treat ColumnSqlKey the same as
ColumnKey (e.g., isinstance(a, (ColumnKey, ColumnSqlKey))) and call _add(a.path)
for both so derived explicit time args like ColumnSqlKey aliases get their joins
added; ensure you reference the AggregateKey/Phase.AGGREGATE block and the
_resolve_explicit_time_col behavior when making this change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: dbd32776-1f17-4e05-9ede-233589008fe2
📒 Files selected for processing (2)
slayer/sql/generator.pytests/test_sql_generator.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_sql_generator.py
… filter Codex round-6 review caught two more edge cases: 1. Filtered composite operands rebuilt their synth with the placeholder ``full_alias="__op__"`` in ``_render_aggregate_composite_expr``, but ``_build_first_last_base_select`` keyed ``filtered_rn_map`` by the per-leaf alias it minted at synth time (``_composite_op_N``). The lookup missed → bare ``_last_rn`` + raw filter. Thread a ``composite_alias_by_key`` map so the operand synth's alias matches the rn-map key. 2. Cross-model queries with a LOCAL first/last HAVING filter (``filters=["revenue:last(created_at) > 100"]``) had the ``_render_with_cross_model_plans`` ``base_render_order`` construction skip AGGREGATE-phase filter deps. The host ``_base`` CTE wasn't building the ranked subquery → HAVING emitted dangling ``_last_rn``. Add an unconditional ``_collect_base_aux_slot_ids(aggregates_only= True, include_order=False)`` pass to the cross-model base-order construction. The fix surfaces the existing ``NotImplementedError`` for the combination of local first/last + cross-model aggregate (the in-place guard was already there), which is a strict improvement over silently dangling-``_last_rn`` SQL. NOSONAR(S3776) on ``_collect_joined_paths_for_base`` (round 5 extension bumped its complexity to 30). Two regression tests: * ``test_filtered_composite_first_last_uses_filtered_rn`` * ``test_cross_model_query_with_local_first_last_having_filter`` Also: reply on CodeRabbit's POST-drop re-flag thread referencing the ``has_transforms`` guard in dc7a775. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
slayer/sql/generator.py (2)
4672-4763:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPropagate
composite_alias_by_keythrough theScalarCallKeybranch.The new alias map only survives the
ArithmeticKeyrecursion today. A filteredfirst/lastleaf nested under a scalar call still rebuilds with__op__, so itsfiltered_rn_map/filtered_match_maplookup misses and the SQL falls back to the wrong ranked column path.Suggested fix
if isinstance(key, ScalarCallKey): args = [] any_agg = False for a in key.args: if isinstance(a, (AggregateKey, ArithmeticKey, ScalarCallKey, LiteralKey)): e, ag = self._render_aggregate_composite_expr( key=a, slot=slot, source_model=source_model, source_relation=source_relation, bundle=bundle, rn_suffix_map=rn_suffix_map, default_time_col=default_time_col, filtered_rn_map=filtered_rn_map, filtered_match_map=filtered_match_map, + composite_alias_by_key=composite_alias_by_key, ) args.append(e) any_agg = any_agg or ag🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/sql/generator.py` around lines 4672 - 4763, The ScalarCallKey branch fails to forward composite_alias_by_key into its recursive calls, causing filtered first/last leaves inside ScalarCallKey to rebuild with the placeholder "__op__" and miss filtered_rn_map/filtered_match_map lookups; update the recursive invocation of _render_aggregate_composite_expr inside the ScalarCallKey loop (the call that binds e, ag) to include composite_alias_by_key=composite_alias_by_key so the per-leaf alias is preserved (same approach used in the ArithmeticKey branch).
4897-4968:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep routed
_cm_*filters out of the host first/last base path.This change now enables
_baseto take the ranked first/last path in cross-model queries, but that helper still rebuilds its innerWHEREwithoutskip_filter_ids, and_render_with_cross_model_plans()dropsbase_where_consumed. Filters routed to a_cm_*CTE can therefore still leak into the host ranked subquery or get re-applied outside it.Suggested direction
- _base_where_consumed, + base_where_consumed, base_first_last_state, ) = self._build_base_select_for_planned( planned_query=planned_query, bundle=bundle, source_model=source_model, @@ - if base_where is not None: + if base_where is not None and not base_where_consumed: base_select = base_select.where(base_where)You’ll also want to thread
skip_filter_ids=routed_idsinto_build_first_last_base_select()and use it in that helper’s_build_where_having_from_planned(...)call so the inner ranked subquery and the outer routing logic stay in sync.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@slayer/sql/generator.py` around lines 4897 - 4968, The host first/last base can still rebuild a WHERE that re-applies routed `_cm_*` filters because `_build_first_last_base_select()` is not receiving the aggregated routed_ids; pass the same routed_ids (computed here) into the call to _build_first_last_base_select(planned_query=..., base_render_order=..., skip_filter_ids=routed_ids, ...) and update _build_first_last_base_select to thread its skip_filter_ids argument into its internal call to _build_where_having_from_planned(...) so the inner ranked subquery and outer routing logic use the same skip list; ensure the symbol names affected are routed_ids, _build_first_last_base_select, and _build_where_having_from_planned.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@slayer/sql/generator.py`:
- Around line 4672-4763: The ScalarCallKey branch fails to forward
composite_alias_by_key into its recursive calls, causing filtered first/last
leaves inside ScalarCallKey to rebuild with the placeholder "__op__" and miss
filtered_rn_map/filtered_match_map lookups; update the recursive invocation of
_render_aggregate_composite_expr inside the ScalarCallKey loop (the call that
binds e, ag) to include composite_alias_by_key=composite_alias_by_key so the
per-leaf alias is preserved (same approach used in the ArithmeticKey branch).
- Around line 4897-4968: The host first/last base can still rebuild a WHERE that
re-applies routed `_cm_*` filters because `_build_first_last_base_select()` is
not receiving the aggregated routed_ids; pass the same routed_ids (computed
here) into the call to _build_first_last_base_select(planned_query=...,
base_render_order=..., skip_filter_ids=routed_ids, ...) and update
_build_first_last_base_select to thread its skip_filter_ids argument into its
internal call to _build_where_having_from_planned(...) so the inner ranked
subquery and outer routing logic use the same skip list; ensure the symbol names
affected are routed_ids, _build_first_last_base_select, and
_build_where_having_from_planned.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8509f6ec-8488-49b0-a2c9-f30b3c6967a2
📒 Files selected for processing (2)
slayer/sql/generator.pytests/test_sql_generator.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_sql_generator.py
…+ NOSONARs Three follow-ups from round-7 review: 1. Codex MAJOR: ``_render_aggregate_composite_expr``'s ``ScalarCallKey`` branch dropped ``composite_alias_by_key`` when recursing into operands (the ``ArithmeticKey`` branch forwarded it; only the ``ScalarCallKey`` recursion missed it). A filtered first/last operand inside ``coalesce(...)`` would then fall back to the placeholder ``__op__`` alias and miss the ``filtered_rn_map`` lookup. Forward ``composite_alias_by_key`` consistently in all three recursion branches. 2. Sonar quality-gate ERROR on ``new_duplicated_lines_density`` (4.0% > 3.0%): the round-6 cross-model AGG-phase filter materialization block was near-identical to the round-5 transform-deps block. Factor both into a single ``_add_local_aux_slots(include_order=, aggregates_only=)`` inner helper. Same control flow, one body. 3. Sonar S3776 NOSONAR markers on ``_render_aggregate_composite_expr`` (complexity 37 — sequential ValueKey dispatch with rn-state + composite-alias-by-key threading) and ``_render_with_cross_model_plans`` (complexity 80 — orchestration of host ``_base`` CTE + per-plan ``_cm_*`` CTEs + combined SELECT + transform chain + outer wrap). Also: reply on the CodeRabbit POST-drop re-flag thread (the third time this round) — the ``has_transforms`` guard from dc7a775 remains sufficient; POST in no-transform is planner-unreachable. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three follow-ups from round-8 review: 1. Codex MAJOR (slayer/sql/generator.py:6304+): a DERIVED first/last time arg (``ColumnSqlKey``) skipped join discovery. ``_resolve_explicit_time_col`` expands ``net_signed_at.sql = "customers.signed_up_at"`` so the ranked subquery's ORDER BY emits ``customers.signed_up_at`` — but ``_collect_joined_paths_for_base`` only walked ``ColumnKey`` args, so the ``customers`` join was not added to ``_base``. Extend the helper to accept ``source_model`` / ``source_relation`` / ``bundle`` and walk ``ColumnSqlKey`` args via the existing ``_expand_derived_column_sql`` + ``_joined_paths_in_sql`` pair (same machinery the ROW-derived TimeTruncKey path already uses). + regression test ``TestDev1501HiddenFirstLastRender.test_derived_time_arg_pulls_in_referenced_join``. 2. Sonar S3776 NOSONAR on ``_projection_rn_by_alias`` test helper (complexity 21 > 15) — sequential isinstance dispatch over outermost-SELECT projection layers (Alias / Cast / Max / Case / EQ / Column unwrap). Each layer is a structural-shape predicate whose failure short-circuits to the next projection; extracting per-layer helpers would scatter the predicate. 3. Sonar duplication gate (4.0% > 3.0% threshold, test_sql_generator.py 5.1% per-file). Extract ``_persist_and_engine(*models)`` async context-manager helper plus a ``_orders_with_paid_amount_model`` fixture; apply across the DEV-1501 test classes (TestDev1501HiddenFirstLastRender, TestDev1501BroadTriggerAndGuards, TestParameterizedAggCanonicalDistinct, TestTransformAmbiguousTimeDimension). One-line per call site (vs the prior 5-line ``tempfile + YAMLStorage + save_datasource + save_model + SlayerQueryEngine`` block) — drops ~140 net lines of new-line duplication. 3939 unit tests pass; ruff clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codex MAJOR (slayer/sql/generator.py:6031+): the cross-model HAVING ``AggregateKey`` reconstruction in ``_render_filter_value_key_in_target_scope`` rerooted ``source`` to ``path=()`` but copied ``args`` / ``kwargs`` unchanged. The projection path (lines ~5685-5730) already handles all three symmetrically via ``_reroot_kwarg``. Mirror that logic here so a multi-hop routed filter (``customers.regions.amount:last(customers.regions.opened_at) > 0``) qualifies its time arg under the local CTE relation instead of the host-rooted ``__``-path alias that doesn't exist inside the target CTE. Inert for single-hop cases — both pre-fix and post-fix render ``customers.signed_up_at`` because ``_resolve_explicit_time_col``'s ``"__".join(path)`` happens to collapse to the same string when the target relation has the same single hop. Real fix lands for multi-hop shapes (``customers.regions.opened_at``) that the existing test suite doesn't yet exercise — kept the fix mechanical (same shape as the projection-path reroot) rather than introducing a multi-model test fixture that exercises an unrelated routing concern. 3939 unit tests pass; ruff clean. Sonar gate is fully green (duplication 0.9% < 3% threshold, 0 OPEN issues, 0 hotspots, no CI failures) after round 8's ``_persist_and_engine`` refactor. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
ffd85bc
into
egor/dev-1484-dev-1452-stage-c-migrate-pre-existing-tests-off-legacy



Summary
first()/last()aggregates referenced only by ORDER BY or HAVING used to render inline against a dangling_last_rn, collapsing distinct explicit-time-column specs (revenue:last(created_at)vsrevenue:last(updated_at)) into a single broken term. Companion bug:_build_agg's suffix guard silently collapsed distinct-time-column specs even in the projected case when the model had nodefault_time_dimension.docs/architecture/planning.md:89-93,155-159) already mandates the fix: materialise hidden order/filter slots in the base CTE, trim them via an outer SELECT. The transform path already did this; this PR extends the same pattern to the no-transform path._canonical_name; aggregate-only materialisation of hidden order/filter deps + AGGREGATE-phase filter coverage; conditional outer-trim wrapper with materialised-alias ORDER BY routing; suffix-guard drops thedefault_time_colrequirement;FirstLastRenderStatethreads rn maps into HAVING render. C13 multi-public-alias support in_build_first_last_base_selectfixed as a pre-existing bug surfaced by the outer wrap.Test plan
tests/test_sql_generator.py:4372flipped to expected-pass with strengthened assertions.test_sql_generator.py+ 3367 elsewhere = 3929 unit tests).poetry run ruff check slayer/ tests/clean.test_projection_trim.py::TestProvenance::test_declared_projection_order_and_hidden_exclusion) had its second assertion relaxed to "noquantity_sumin OUTER projection" to match the materialise+trim shape per docs — done with explicit approval.initdbnot on local PATH — CI environment will run it).🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests