DEV-1339: regression tests for multi-hop derived column qualification by ZmeiGorynych · Pull Request #93 · MotleyAI/slayer

ZmeiGorynych · 2026-05-05T08:55:29Z

Summary

DEV-1339 reported that derived columns reached via a multi-hop join path were emitted with unqualified inner refs. Investigation showed the bug is already fixed by PR #89 (DEV-1333) — the issue's exact reproduction (adapted to the real target_model/join_pairs API; the original used hypothetical alias/on fields that don't exist on ModelJoin) now produces correctly qualified SQL across every entry point.

This PR lands six regression tests in tests/test_cross_model_derived_columns.py pinning the fixed behavior so the contract can't silently regress:

multi-hop query dim ref → derived target column
multi-hop cross-model measure aggregation → derived target column
multi-hop filter → derived target column
multi-hop time-dim → derived target column
verbatim DEV-1339 solar_panels repro
diamond-path cross-model measure → derived target (asserts the non-queried diamond arm doesn't leak)

No production code changes — the existing expand_derived_refs machinery in slayer/engine/column_expansion.py already handles every scenario the issue describes.

Test plan

poetry run pytest tests/test_cross_model_derived_columns.py — 24/24 pass
poetry run pytest -m "not integration" — 1591/1591 pass
poetry run ruff check slayer/ tests/ — clean

Closes DEV-1339.

🤖 Generated with Claude Code

Summary by CodeRabbit

Tests
- Added extensive regression tests for multi-hop join queries with derived columns, ensuring correct qualification and aliasing across dimension selection, measure aggregation, filter predicates, and time-dimension enrichment.
- Included repro and diamond-join scenarios to prevent cross-path identifier leakage and validate multi-hop measure SQL generation.

DEV-1339 reported that querying a derived ``Column.sql`` reached via a multi-hop join path leaves the inner bare base-column refs unqualified in emitted SQL. Investigation showed the bug is already fixed by PR #89 (DEV-1333) — running the issue's reproduction now produces correctly qualified SQL across all entry points: dim refs, cross-model measure aggregations, filters, time dimensions, and diamond join paths. Land six regression tests pinning the fixed behavior so the qualification contract can't silently regress: multi-hop dim, multi-hop cross-model measure, multi-hop filter, multi-hop time-dim, the verbatim DEV-1339 solar_panels repro (adapted from the issue's hypothetical alias/on fields to the real target_model/join_pairs API), and a diamond-path cross-model measure that asserts the non-queried arm of the diamond never leaks into the measure CTE. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ev-1339-multi-hop-alias-path-doesnt-qualify-joined-model-derived

coderabbitai · 2026-05-05T08:55:40Z

Warning

Rate limit exceeded

@ZmeiGorynych has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 11 minutes and 32 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fc554bfa-232a-41af-99f1-a63ae7b39757

📥 Commits

Reviewing files that changed from the base of the PR and between 6aa9ddd and d518963.

📒 Files selected for processing (1)

tests/test_cross_model_derived_columns.py

📝 Walkthrough

Walkthrough

Adds regression tests and a test fixture validating recursive inlining and canonical alias qualification of cross-model derived Column.sql across multi-hop joins (A→B→C), including dimension selection, measures, filters, time-dim enrichment, a solar-panels repro, and a diamond-join case.

Changes

Multi-Hop Derived Column Test Suite

Layer / File(s)	Summary
Test Fixture / Data Shape `tests/test_cross_model_derived_columns.py`	Adds `_DEFAULT_C_COLUMNS` and `_save_a_b_c(...)` to persist a three-model chain A→B→C where C defines derived columns (e.g., `x_derived = raw_c * 2`).
Query-side Dimension / Filter `tests/test_cross_model_derived_columns.py`	Adds `test_multihop_query_dim_to_derived_target_column` and `test_multihop_filter_on_derived_target_column` asserting query-time expansion of `B.C.x_derived` is inlined and qualified to the canonical multi-hop alias (e.g., `B__C`) with no bare identifier leakage.
Measure Expansion / Parsing `tests/test_cross_model_derived_columns.py`	Adds `test_multihop_cross_model_measure_over_derived_target` which generates measure SQL (e.g., `B.C.x_derived:sum`), verifies sqlglot parses it, and asserts qualified inlining inside aggregation (`SUM(...)` contains qualified `raw_c * 2`).
Time-dimension Enrichment `tests/test_cross_model_derived_columns.py`	Adds `test_multihop_time_dim_to_derived_target_column` asserting time-dimension SQL that targets `B.C.shifted_ts` is expanded with canonical multi-hop qualifiers.
Regression & Complex Join Topologies `tests/test_cross_model_derived_columns.py`	Adds `test_dev_1339_solar_panels_repro` (solar_panels.panel_efficiency derived from bare inner bases) and `test_diamond_path_cross_model_measure_over_derived_target` (diamond join) to ensure inner base references and diamond-arm aliases are correctly qualified during cross-model derived expansion.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

MotleyAI/slayer#9: Implements multi-hop join aliasing and derived column resolution logic that these tests directly validate.
MotleyAI/slayer#89: Introduces recursive derived reference expansion and enrichment used by the canonical alias qualification behavior under test.

Poem

🐰
Hopping through joins from A to C,
I inline the maths for all to see,
No bare names wander, aliases hold,
B__C keeps the SQL tidy and bold,
Rabbit-approved tests—safe and free.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically identifies the main change: adding regression tests for multi-hop derived column qualification, with the ticket reference DEV-1339 providing context.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch egor/dev-1339-multi-hop-alias-path-doesnt-qualify-joined-model-derived

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (3)

tests/test_cross_model_derived_columns.py (3)
650-905: ⚡ Quick win

Add one parse-level assertion for a new multi-hop scenario.

These regressions mostly validate aliasing with substring checks. A malformed query shape could still satisfy those assertions, so extending the existing sqlglot.parse(...) sanity check to one representative multi-hop measure/time-dimension path would tighten the safety net for DEV-1339.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cross_model_derived_columns.py` around lines 650 - 905, Add a sql
parsing sanity check to one representative multi-hop test to catch malformed SQL
that might pass substring checks: e.g. in
test_multihop_cross_model_measure_over_derived_target (after generating sql via
_gen_sql and normalizing to norm) call sqlglot.parse(sql) or
sqlglot.parse_one(norm) to assert the SQL is syntactically valid; alternatively
do the same in test_multihop_time_dim_to_derived_target_column after sql/norm is
produced. Ensure you import sqlglot at top of the test file if missing and place
the parse assertion before the other substring/assert checks so parse failures
fail early.
659-659: ⚡ Quick win

Use keyword arguments for the new _gen_sql calls.

These new cases pass engine, query, and model positionally, which breaks the repo’s Python calling convention and makes the setup blocks harder to scan.

As per coding guidelines "Use keyword arguments for functions with more than 1 parameter".

Also applies to: 686-686, 714-714, 774-774, 818-823, 829-834, 891-891
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cross_model_derived_columns.py` at line 659, Several tests call
_gen_sql with positional args (e.g., _gen_sql(engine, query, model_a)); update
those calls to use keyword arguments to follow the project's convention: call
_gen_sql(engine=engine, query=query, model=model_a) (or model=model_b as
appropriate). Replace every positional invocation of _gen_sql in this file
(including the instances flagged around the noted ranges) so the parameters are
passed as keywords: engine=..., query=..., model=.... This applies to all
occurrences (the call at the shown line and the other listed occurrences).
730-731: ⚡ Quick win

Move these imports to module scope.

TimeGranularity and TimeDimension are static dependencies of this test module, so keeping them inside the test body works against the repo’s import convention.

As per coding guidelines "Place imports at the top of files".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cross_model_derived_columns.py` around lines 730 - 731, Move the
two local imports out of the test body and place them at module scope at the top
of the test module: import TimeGranularity from slayer.core.enums and
TimeDimension from slayer.core.query so they are module-level imports; update
any existing duplicate imports to avoid redundancy and run tests to ensure no
import-time side effects.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/test_cross_model_derived_columns.py`:
- Around line 650-905: Add a sql parsing sanity check to one representative
multi-hop test to catch malformed SQL that might pass substring checks: e.g. in
test_multihop_cross_model_measure_over_derived_target (after generating sql via
_gen_sql and normalizing to norm) call sqlglot.parse(sql) or
sqlglot.parse_one(norm) to assert the SQL is syntactically valid; alternatively
do the same in test_multihop_time_dim_to_derived_target_column after sql/norm is
produced. Ensure you import sqlglot at top of the test file if missing and place
the parse assertion before the other substring/assert checks so parse failures
fail early.
- Line 659: Several tests call _gen_sql with positional args (e.g.,
_gen_sql(engine, query, model_a)); update those calls to use keyword arguments
to follow the project's convention: call _gen_sql(engine=engine, query=query,
model=model_a) (or model=model_b as appropriate). Replace every positional
invocation of _gen_sql in this file (including the instances flagged around the
noted ranges) so the parameters are passed as keywords: engine=..., query=...,
model=.... This applies to all occurrences (the call at the shown line and the
other listed occurrences).
- Around line 730-731: Move the two local imports out of the test body and place
them at module scope at the top of the test module: import TimeGranularity from
slayer.core.enums and TimeDimension from slayer.core.query so they are
module-level imports; update any existing duplicate imports to avoid redundancy
and run tests to ensure no import-time side effects.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5349daaa-af36-4691-a1bf-eac440c7715b

📥 Commits

Reviewing files that changed from the base of the PR and between cce8b08 and dfc942d.

📒 Files selected for processing (1)

tests/test_cross_model_derived_columns.py

- Parameterise ``_save_a_b_c`` with a ``c_columns`` kwarg so ``test_multihop_time_dim_to_derived_target_column`` can reuse the A/B scaffold instead of redefining all three models inline. Eliminates the 22-line duplicate block flagged by the SonarCloud quality gate (drops new-duplicated-lines density below the 3% threshold). - Add a ``sqlglot.parse(...)`` sanity assertion at the head of ``test_multihop_cross_model_measure_over_derived_target`` so structurally broken SQL short-circuits before the substring matchers — addresses the CodeRabbit review-summary nitpick on the prior commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (1)

tests/test_cross_model_derived_columns.py (1)
740-741: 💤 Low value

Consider moving imports to the top of the file.

These imports are placed inside the test function. Per coding guidelines, imports should be at the top of files. However, this is common in tests for isolation when only one test needs specific imports.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cross_model_derived_columns.py` around lines 740 - 741, Move the
local imports for TimeGranularity and TimeDimension out of the test body and
place them at the module top so they follow the project's import guidelines;
find usages of TimeGranularity and TimeDimension in
tests/test_cross_model_derived_columns.py and replace the in-test "from
slayer.core.enums import TimeGranularity" and "from slayer.core.query import
TimeDimension" with module-level imports at the top of the file, ensuring no
name conflicts and keeping test isolation intact if any setup previously
depended on the late import.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/test_cross_model_derived_columns.py`:
- Around line 740-741: Move the local imports for TimeGranularity and
TimeDimension out of the test body and place them at the module top so they
follow the project's import guidelines; find usages of TimeGranularity and
TimeDimension in tests/test_cross_model_derived_columns.py and replace the
in-test "from slayer.core.enums import TimeGranularity" and "from
slayer.core.query import TimeDimension" with module-level imports at the top of
the file, ensuring no name conflicts and keeping test isolation intact if any
setup previously depended on the late import.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 831f6488-34f8-4e63-be20-69548655b80e

📥 Commits

Reviewing files that changed from the base of the PR and between dfc942d and 6aa9ddd.

📒 Files selected for processing (1)

tests/test_cross_model_derived_columns.py

Per the project import-style rule ("place all imports at the top of the file") and the CodeRabbit nitpick on PR #93, lift the in-function imports out of ``test_multihop_time_dim_to_derived_target_column``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sonarqubecloud · 2026-05-05T09:25:01Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

ZmeiGorynych and others added 2 commits May 5, 2026 10:51

Merge branch 'main' of https://github.com/MotleyAI/slayer into egor/d…

dfc942d

…ev-1339-multi-hop-alias-path-doesnt-qualify-joined-model-derived

coderabbitai Bot reviewed May 5, 2026

View reviewed changes

ZmeiGorynych merged commit 006e03e into main May 5, 2026
4 checks passed

This was referenced May 6, 2026

Type-aware CAST emission + sqlglot-aligned DataType + v5 migration (DEV-1361) #103

Merged

fix(enrichment): refuse filters/dims/measures that reference unreachable models (DEV-1367) #106

Closed

coderabbitai Bot mentioned this pull request May 14, 2026

DEV-1410: expand bare same-model derived refs in Column.sql #123

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DEV-1339: regression tests for multi-hop derived column qualification#93

DEV-1339: regression tests for multi-hop derived column qualification#93
ZmeiGorynych merged 4 commits into
mainfrom
egor/dev-1339-multi-hop-alias-path-doesnt-qualify-joined-model-derived

ZmeiGorynych commented May 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 5, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

sonarqubecloud Bot commented May 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ZmeiGorynych commented May 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

sonarqubecloud Bot commented May 5, 2026

Quality Gate passed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ZmeiGorynych commented May 5, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 5, 2026 •

edited

Loading