Conversation
20 tests validating that measures evaluate against their defining table and are immune to row duplication from joins. Covers all aggregate types (SUM, AVG, COUNT, MIN, MAX, STDDEV, VARIANCE, MEDIAN, STRING_AGG, COUNT DISTINCT, MODE, PRODUCT, BIT_XOR, KURTOSIS, SKEWNESS, ENTROPY, LIST, BOOL_AND, BOOL_OR) and ratio measures across 1:N, M:N, and LEFT join cardinalities with grouped and filtered query shapes.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 825b0b42d1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
test/sql/measures.test
Outdated
| # Fan-out wouldn't change these with this data, but the measure must still | ||
| # evaluate against the defining table, not the joined rows. | ||
|
|
||
| query TT | ||
| SEMANTIC SELECT AGGREGATE(all_premium), AGGREGATE(any_trial) |
There was a problem hiding this comment.
Use fan-out-sensitive data for BOOL_AND/BOOL_OR test
This test cannot detect a regression in fan-out prevention because the chosen boolean values produce the same false/true result even if the measure is incorrectly evaluated on duplicated join rows (or rows filtered by the join). As written, it validates function wiring but not the stated fan-out immunity, so a real duplication bug for boolean measures could slip through while this case still passes.
Useful? React with 👍 / 👎.
… name collision Redesign boolean test data so Dave (no orders) is the only non-premium and only trial user. If fan-out prevention fails, both results flip. Rename avg_age to avg_cust_age in fan-out tests to avoid colliding with the downstream customers_qualified_v measure of the same name.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: db5f31dd09
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
test/sql/measures.test
Outdated
| SEMANTIC SELECT AGGREGATE(youngest), AGGREGATE(oldest) | ||
| FROM fanout_customers_minmax_v c | ||
| JOIN fanout_orders o ON c.cust_id = o.cust_id; | ||
| ---- | ||
| 25 40 |
There was a problem hiding this comment.
Make MIN/MAX fan-out test observable
This MIN/MAX case cannot detect a fan-out regression because MIN(age) and MAX(age) are identical on both the base rows and the duplicated join rows in this dataset. If measure evaluation accidentally shifts to joined rows, this test would still pass, so it does not validate the claimed fan-out protection for semi-additive aggregates. Use values where the excluded or duplicated rows would change the min/max result.
Useful? React with 👍 / 👎.
test/sql/measures.test
Outdated
| SEMANTIC SELECT AGGREGATE(distinct_products) | ||
| FROM fanout_orders_cd_v o | ||
| JOIN fanout_customers c ON o.cust_id = c.cust_id; |
There was a problem hiding this comment.
Use a fan-out join in COUNT DISTINCT coverage
This COUNT(DISTINCT) test joins orders to customers on cust_id, which is many-to-one from fanout_orders and does not duplicate order rows, so the measure input set is unchanged. As a result, the assertion can pass even when fan-out prevention is broken, because no fan-out is actually introduced for the defining table of the measure.
Useful? React with 👍 / 👎.
| query I | ||
| SEMANTIC SELECT AGGREGATE(mode_age) | ||
| FROM fanout_ages_mode_v a | ||
| JOIN fanout_age_orders ao ON a.id = ao.id; | ||
| ---- |
There was a problem hiding this comment.
Make MODE test fail on joined-row evaluation
The MODE scenario is currently fan-out-insensitive: with this data, both the base table and the fanned-out join still produce mode 30, so a bug that evaluates the measure over joined rows would not be caught. This leaves non-decomposable fan-out behavior unverified despite the test passing.
Useful? React with 👍 / 👎.
MIN/MAX: add row with global-min age that has no join match, so MIN changes if measure evaluates over joined rows only (15 vs 25). COUNT DISTINCT: join orders to line_items (1:N from orders side) so order rows are actually duplicated; previous join was M:1 and never fanned out the defining table. MODE: redesign data so base mode=20 but fanned-out mode=10 (id=1 has 3 orders, inflating age=10 to appear 3x vs age=20 appearing 2x).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 20b5fe3a04
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
test/sql/measures.test
Outdated
| SEMANTIC SELECT c.name, c.age, COUNT(*) AS order_rows, AGGREGATE(avg_cust_age) | ||
| FROM fanout_customers_v c | ||
| JOIN fanout_orders o ON c.cust_id = o.cust_id | ||
| GROUP BY c.name, c.age; |
There was a problem hiding this comment.
Make grouped AVG fan-out check observable
This grouped case cannot detect a join fan-out regression because each group is a single customer with one constant age value; duplicating that customer's joined rows leaves AGGREGATE(avg_cust_age) unchanged at 30/25/40 either way. As written, the assertion can still pass when grouped measure evaluation accidentally runs over joined rows, so it does not verify fan-out immunity for grouped query shape.
Useful? React with 👍 / 👎.
test/sql/measures.test
Outdated
| SEMANTIC SELECT p.product, AGGREGATE(profit_margin), COUNT(*) AS region_count | ||
| FROM fanout_products_v p | ||
| JOIN fanout_product_regions pr ON p.product = pr.product | ||
| GROUP BY p.product; |
There was a problem hiding this comment.
Make grouped profit-margin test sensitive to fan-out
This grouped profit-margin query is also fan-out-insensitive with the current fixture: each product has only one base row, so duplicating Widget via the join scales both numerator and denominator equally and keeps the ratio at 0.6. That means the test can pass even if grouped measure computation incorrectly uses duplicated joined rows, leaving a regression path untested.
Useful? React with 👍 / 👎.
Grouped AVG: replace single-customer-per-group test with tier-based grouping where gold tier has Alice(30, 3 orders) + Bob(25, 1 order). Correct AVG=27.5; fan-out would weight Alice 3x giving 28.75. Grouped ratio: remove per-product grouped profit margin test since (SUM(rev)-SUM(cost))/SUM(rev) is scale-invariant under uniform row duplication within a group, making it fundamentally unobservable. The ungrouped ratio test already validates fan-out prevention.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7eec0f1142
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
test/sql/measures.test
Outdated
| SEMANTIC SELECT AGGREGATE(distinct_products) | ||
| FROM fanout_orders_cd_v o | ||
| JOIN fanout_line_items li ON o.order_id = li.order_id; |
There was a problem hiding this comment.
Make COUNT DISTINCT fan-out test fail on joined-row evaluation
This case is still fan-out-insensitive because COUNT(DISTINCT product) is unchanged by duplicating existing order rows; the joined rows contain the same three product values as the base table. If measure evaluation regresses to use joined rows, this query still returns 3, so the test would pass and miss the bug. Use fixture data where the join changes which distinct values are visible (for example, a unique product on a non-matching row) so incorrect joined-row evaluation produces a different result.
Useful? React with 👍 / 👎.
test/sql/measures.test
Outdated
| # Correct: MIN=15 (id=4, no tasks), MAX=40 | ||
| # Joined-only (wrong): MIN=25, MAX=40 |
There was a problem hiding this comment.
Make MAX assertion observable in MIN/MAX fan-out test
The setup only makes MIN sensitive to joined-row evaluation; MAX is 40 in both the correct and incorrect (joined-only) path, so a regression affecting MAX specifically would still pass this test. Because this block is intended to validate both MIN and MAX fan-out immunity, the fixture should also force MAX to differ under joined-row execution (for example by putting the global max on an unmatched row).
Useful? React with 👍 / 👎.
MIN/MAX: put both the global min (age=15) and global max (age=50) on unmatched rows so both values change if measure uses joined rows only (25/35 vs correct 15/50). COUNT DISTINCT: add an order with product 'Thingamajig' from cust_id=99 who has no match in the join target. Correct count is 3 distinct products; joined-only evaluation would give 2.
Summary