Skip to content

fix(tesseract): pre-aggregation table name corruption when one name is a prefix of another#11000

Merged
waralexrom merged 1 commit into
masterfrom
fix-preagg-table-name-replacement
Jun 3, 2026
Merged

fix(tesseract): pre-aggregation table name corruption when one name is a prefix of another#11000
waralexrom merged 1 commit into
masterfrom
fix-preagg-table-name-replacement

Conversation

@waralexrom
Copy link
Copy Markdown
Member

Summary

QueryCache.replacePreAggregationTableNames replaced pre-aggregation placeholder table names sequentially with a global string replace per entry, which corrupts the SQL when one placeholder name is a prefix of another. With per-usage partition pruning (Tesseract), entries are suffixed name0..name10..., so 11+ usages of the same rollup in one query deterministically produce broken SQL like (SELECT ...)0 AS "alias" and fail on CubeStore with ParserError("Expected: ), found: 0").

Sequential replacement is broken in any order: real target names are built as tableName + '_' + versions (see PreAggregations.targetTableName), so a source name also matches inside already inserted target names of other entries.

Changes

  • Replace all placeholder names in a single pass using one regex alternation (longest names first, since JS alternation matches left-to-right) with a Map-based replacement callback — inserted text is never rescanned
  • Add unit tests: prefix collision with 11+ suffixed entries, source name inside an inserted target name, QueryWithParams passthrough, empty replacements, literal $ in target names, input array immutability

Testing

  • yarn jest test/unit/ReplacePreAggregationTableNames.test.ts — 7 passed; the two collision tests fail against the previous implementation
  • Reproduced live with Tesseract (CUBEJS_TESSERACT_SQL_PLANNER=true, CUBEJS_TESSERACT_PRE_AGGREGATIONS=true) on a query with 10+ multi-stage measures from one view (11 usages of the same rollup); fix verified against that setup

@waralexrom waralexrom requested a review from a team as a code owner June 2, 2026 21:51
@github-actions github-actions Bot added the javascript Pull requests that update Javascript code label Jun 2, 2026
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jun 2, 2026

Claude finished @waralexrom's task in 59s —— View job


Code Review

Verdict: LGTM — correct fix for a real, deterministic bug. 0 high, 0 medium, 2 low.

Details

Tasks

  • Read PR diff
  • Verify root cause analysis
  • Audit new implementation for edge cases
  • Review tests

Correctness

The diagnosis is right. replacePreAggregationTableNames previously did N sequential String.replace(/name/g, target) calls, so:

  • If name1 is processed before name10, the name1 pass corrupts name10target1 + 0 (the bug the PR description shows).
  • If name10 is processed first, the name1 pass then matches name1 inside the just-inserted name10_<versions> target (since real targets are tableName + '_' + versions, see PreAggregations.targetTableName), corrupting it as well.

Both orderings break, so length-sorting alone wouldn't have been enough — single-pass replacement is the right fix.

Implementation (QueryCache.ts:422-450)

  • Longest-first sort is sufficient: JS regex alternation picks the first matching alternative at each position, so name10|name1 correctly matches name10 greedily. Equal-length names can't be prefixes of each other, so plain length sort is fine.
  • Using the replacer callback form ((match) => replacements.get(match)) avoids the $&/$1 interpretation of the old replaceAll, so the manual $$$$ escape is no longer needed. The "literal $ in target names" test guards this.
  • Inserted text is never rescanned (single pass), so the "source name inside inserted target name" collision is structurally impossible now.
  • The sorted.length guard correctly avoids new RegExp(''), which would match every position.
  • Input array immutability preserved via [...preAggregationsTablesToTempTables].sort(...).

Tests (ReplacePreAggregationTableNames.test.ts)

Good coverage — prefix collision (12 entries forces name1/name10/name11), source-inside-target shape, QueryWithParams passthrough, empty replacements, literal $, and array immutability. The negative assertion expect(result).not.toMatch(/\)\d+ AS/) is a nice tripwire for the original bug shape.

Minor / nits (non-blocking)

  1. QueryCache.ts:445replacements.get(match) as string: the cast is safe because match can only be a key from replacements (the regex is built from those same keys), but a non-null assertion (!) reads more naturally than a string cast since the value is already typed. Cosmetic.
  2. Duplicate tableName entries would silently lose all but one mapping (new Map([...]) keeps last). This matches prior reduce behavior (last write wins) so it's not a regression, but worth a one-liner comment if duplicates are ever realistic in practice. Probably they aren't.

Security / performance

Nothing of concern. Worst case the alternation regex grows with the number of pre-aggregations in a single query (dozens at most in practice), and matching stays linear in the SQL length. Net win over N full-string passes.

· `fix-preagg-table-name-replacement`

@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.54%. Comparing base (a2c65aa) to head (67b7e9a).
⚠️ Report is 2 commits behind head on master.

❗ There is a different number of reports uploaded between BASE (a2c65aa) and HEAD (67b7e9a). Click for more details.

HEAD has 1 upload less than BASE
Flag BASE (a2c65aa) HEAD (67b7e9a)
cubesql 1 0
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #11000       +/-   ##
===========================================
- Coverage   83.32%   58.54%   -24.78%     
===========================================
  Files         255      216       -39     
  Lines       77177    17157    -60020     
  Branches        0     3495     +3495     
===========================================
- Hits        64304    10044    -54260     
+ Misses      12873     6606     -6267     
- Partials        0      507      +507     
Flag Coverage Δ
cube-backend 58.54% <100.00%> (?)
cubesql ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread packages/cubejs-query-orchestrator/src/orchestrator/QueryCache.ts
@waralexrom waralexrom merged commit f401118 into master Jun 3, 2026
180 of 183 checks passed
@waralexrom waralexrom deleted the fix-preagg-table-name-replacement branch June 3, 2026 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

javascript Pull requests that update Javascript code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants