Skip to content

Include OFFSET in physical optimizer group-trim limit pushdown#18600

Merged
yashmayya merged 1 commit into
apache:masterfrom
yashmayya:fix-group-trim-offset-pushdown
May 28, 2026
Merged

Include OFFSET in physical optimizer group-trim limit pushdown#18600
yashmayya merged 1 commit into
apache:masterfrom
yashmayya:fix-group-trim-offset-pushdown

Conversation

@yashmayya
Copy link
Copy Markdown
Contributor

On the physical optimizer path (usePhysicalOptimizer=true), the group-trim rule pushed limit = fetch into the leaf/final aggregate and ignored the query OFFSET. Paginated DISTINCT / GROUP BY queries that hit group trim then trimmed away rows the outer OFFSET ... FETCH window still needed and came back short — silently wrong results, no error. It now pushes offset + fetch (clamped to avoid int overflow) so the leaf/final aggregate keeps enough groups.

The default/legacy planner isn't affected: its sort-exchange-copy rule already folds the offset into the inner sort's fetch before the aggregate rule runs, so that rule sees offset + fetch there. PINOT_POST_RULES_V2 has no sort-exchange rule, so PinotLogicalAggregateRule matches the raw outer sort and has to add the offset itself.

Tests

  • Plan-level (PhysicalOptimizerPlans.json): no-aggregate DISTINCT ... LIMIT n OFFSET m and hinted aggregate, asserting leaf/final limit=[offset+fetch].
  • Integration (GroupByOptionsTest, usePhysicalOptimizer=true): paginated DISTINCT and hinted-aggregate paths return a full page. Verified it fails (0 rows) without the fix.

Fixes #18592

The group-trim rule pushed limit=fetch and dropped the offset, so paginated
DISTINCT/GROUP BY queries on the physical optimizer (usePhysicalOptimizer=true)
came back short. Push offset+fetch instead so the leaf/final aggregate keeps
enough groups to cover the OFFSET window. The legacy path is unaffected since
its sort-exchange-copy rule already folds the offset before the aggregate rule.

Fixes apache#18592
@yashmayya yashmayya added bug Something is not working as expected multi-stage Related to the multi-stage query engine mse-physical-optimizer Multi-stage engine physical query optimizer labels May 27, 2026
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 27, 2026

Codecov Report

❌ Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.26%. Comparing base (26e7d12) to head (04222c6).
⚠️ Report is 19 commits behind head on master.

Files with missing lines Patch % Lines
...t/calcite/rel/rules/PinotLogicalAggregateRule.java 80.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18600      +/-   ##
============================================
- Coverage     64.31%   64.26%   -0.05%     
  Complexity     1137     1137              
============================================
  Files          3335     3335              
  Lines        205710   206046     +336     
  Branches      32085    32142      +57     
============================================
+ Hits         132295   132415     +120     
- Misses        62774    62976     +202     
- Partials      10641    10655      +14     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.26% <80.00%> (-0.05%) ⬇️
temurin 64.26% <80.00%> (-0.05%) ⬇️
unittests 64.26% <80.00%> (-0.05%) ⬇️
unittests1 56.78% <80.00%> (+0.05%) ⬆️
unittests2 36.79% <0.00%> (+0.99%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@yashmayya yashmayya merged commit c2561d4 into apache:master May 28, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something is not working as expected mse-physical-optimizer Multi-stage engine physical query optimizer multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Multi-stage engine (physical optimizer): group-trim LIMIT pushdown ignores OFFSET, under-counting paginated queries

3 participants