fix(tesseract): Prevent raw cube refs leaking into multi-stage pre-agg queries #10948
MazterQyou wants to merge 1 commit into
…g queries Signed-off-by: Alex Qyoun-ae <4062971+MazterQyou@users.noreply.github.com>
Claude finished @MazterQyou's task in 2m 33s

Code Review: PR #10948: Prevent raw cube refs leaking into multi-stage pre-agg queries

Summary
This PR fixes a correctness bug where multi-stage queries hitting an external pre-aggregation could route raw DWH-table SQL to CubeStore, causing …

Bug Analysis
The root cause was a pack/unpack asymmetry in … A second contributing factor was that …

What the fix does
1. … The symmetry between …
2. The …

Findings
No correctness bugs found. The fix is logically sound and well-targeted.

Positive notes: …

Suggestion (non-blocking): …

Overall: Looks good. The fix is correct, well-commented, and narrowly scoped. Merging without a regression test is reasonable given the existing test suite for adjacent scenarios, but adding one would strengthen confidence.
let (keys_subquery, source, dimension_subqueries) = if pre_aggregation_override.is_some() {
    (
        self.keys_subquery.clone(),
        self.source.clone(),
        self.dimension_subqueries.clone(),
    )
} else {
    (
        keys_subquery.unwrap().clone().into_logical_node()?,
        self.source.with_plan_node(source.unwrap().clone())?,
        dimension_subqueries
            .unwrap()
            .iter()
            .map(|itm| itm.clone().into_logical_node())
            .collect::<Result<Vec<_>, _>>()?,
    )
};
Observation (non-blocking): The .unwrap() calls on lines 53–54 are safe here because the else branch guarantees pre_aggregation_override is None, which means the unpacker always sets keys_subquery, source, and dimension_subqueries to Some. The logic is sound, but worth noting for future maintainers — the safety depends on the pack/unpack invariant staying in sync.
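The invariant described above can be sketched in isolation. The following is a minimal illustration only: `Node`, `Inputs`, `unpack`, and `pack` are hypothetical stand-ins, not the actual Tesseract planner types, and plain strings take the place of plan nodes. It shows one way to keep the two sides in sync, with the packer returning an error rather than panicking if the unpacker ever stops supplying a field.

```rust
// Hypothetical, simplified stand-ins for illustration; these are NOT the
// real Tesseract planner types. Strings stand in for plan nodes.
#[derive(Clone, Debug, PartialEq)]
pub struct Node {
    pub keys_subquery: String,
    pub source: String,
    pub dimension_subqueries: Vec<String>,
    pub pre_aggregation_override: Option<String>,
}

/// Children exposed to a rewriter. Each field is `None` when the node is
/// served from a pre-aggregation override and its children must stay untouched.
#[derive(Clone, Debug, PartialEq)]
pub struct Inputs {
    pub keys_subquery: Option<String>,
    pub source: Option<String>,
    pub dimension_subqueries: Option<Vec<String>>,
}

impl Node {
    /// Unpack side: expose children only when there is no override.
    pub fn unpack(&self) -> Inputs {
        if self.pre_aggregation_override.is_some() {
            Inputs { keys_subquery: None, source: None, dimension_subqueries: None }
        } else {
            Inputs {
                keys_subquery: Some(self.keys_subquery.clone()),
                source: Some(self.source.clone()),
                dimension_subqueries: Some(self.dimension_subqueries.clone()),
            }
        }
    }

    /// Pack side: mirrors `unpack`. With an override, the original children
    /// are reused; otherwise every field must have been supplied.
    pub fn pack(&self, inputs: Inputs) -> Result<Node, String> {
        let (keys_subquery, source, dimension_subqueries) =
            if self.pre_aggregation_override.is_some() {
                (
                    self.keys_subquery.clone(),
                    self.source.clone(),
                    self.dimension_subqueries.clone(),
                )
            } else {
                (
                    // An error instead of `unwrap()` surfaces a broken invariant
                    // without panicking.
                    inputs.keys_subquery
                        .ok_or_else(|| "keys_subquery was not unpacked".to_string())?,
                    inputs.source
                        .ok_or_else(|| "source was not unpacked".to_string())?,
                    inputs.dimension_subqueries
                        .ok_or_else(|| "dimension_subqueries was not unpacked".to_string())?,
                )
            };
        Ok(Node {
            keys_subquery,
            source,
            dimension_subqueries,
            pre_aggregation_override: self.pre_aggregation_override.clone(),
        })
    }
}
```

In this sketch, `node.pack(node.unpack())` returns a node equal to the original in both branches, which is the symmetry the comment asks future maintainers to preserve.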
  } else {
-     query.source().clone()
+     self.usages.truncate(saved_usages_len);
+     self.usage_counter = saved_counter;
+     return Ok(None);
  };
Good fix. The previous code fell through with query.source().clone(), silently allowing a LogicalJoin source to carry raw cube.table identifiers into the rewritten plan. The rollback + early return is the correct conservative choice — better to skip the rewrite than produce invalid SQL for CubeStore.
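The rollback-then-skip pattern praised here can be sketched on its own. This is an illustration under stated assumptions: `Source`, `Rewriter`, and `try_rewrite` are hypothetical names, and only the fields `usages` and `usage_counter` and the saved-length/saved-counter idea are taken from the diff above. The real rewriter's matching logic is not reproduced.

```rust
// Hypothetical stand-ins for illustration only; not the real planner types.
#[derive(Clone, Debug, PartialEq)]
pub enum Source {
    /// A source already backed by a pre-aggregation table.
    PreAggregation(String),
    /// A join over raw cube tables, which the external store cannot resolve.
    LogicalJoin(String),
}

#[derive(Default)]
pub struct Rewriter {
    pub usages: Vec<String>,
    pub usage_counter: usize,
}

impl Rewriter {
    /// Attempts a rewrite. Returns `Ok(None)` and restores the bookkeeping
    /// when the source cannot be expressed against a pre-aggregation.
    pub fn try_rewrite(
        &mut self,
        source: &Source,
        candidates: &[&str],
    ) -> Result<Option<Source>, String> {
        // Snapshot the state before any speculative bookkeeping.
        let saved_usages_len = self.usages.len();
        let saved_counter = self.usage_counter;

        // Speculatively record usages while exploring the rewrite.
        for name in candidates {
            self.usages.push(name.to_string());
            self.usage_counter += 1;
        }

        match source {
            Source::PreAggregation(name) => Ok(Some(Source::PreAggregation(name.clone()))),
            Source::LogicalJoin(_) => {
                // Conservative path: undo the speculative bookkeeping and skip
                // the rewrite instead of passing the raw source through.
                self.usages.truncate(saved_usages_len);
                self.usage_counter = saved_counter;
                Ok(None)
            }
        }
    }
}
```

Because the snapshot is taken before any usages are recorded, a skipped rewrite leaves `usages` and `usage_counter` exactly as they were, so a later successful rewrite is not affected by the abandoned attempt.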
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##           master   #10948      +/-   ##
==========================================
- Coverage   83.57%   78.98%   -4.60%
==========================================
  Files         254      470     +216
  Lines       76126    93132   +17006
  Branches        0     3454    +3454
==========================================
+ Hits        63622    73559    +9937
- Misses      12504    19070    +6566
- Partials        0      503     +503
Check List
Description of Changes Made
This PR fixes a bug in Tesseract where multi-stage queries hitting an external pre-aggregation could route raw DWH-table SQL to Cube Store, leading to the following error:
Internal: Error during planning: Table default.orders was not found