Background
#21828 implements OFFSET pushdown for parquet queries without filters. Queries with WHERE clauses still use GlobalLimitExec for offset handling because row counts may be inaccurate after filtering.
Problem
For queries like SELECT * FROM table WHERE date >= '2020-01-01' LIMIT 5 OFFSET 1000000, the offset is handled by GlobalLimitExec even when statistics prove all rows in some RGs satisfy the filter.
Opportunity
prune_by_statistics already marks RGs as is_fully_matched when column statistics prove ALL rows satisfy the predicate (e.g., min(date) >= '2020-01-01'). For these RGs, num_rows is the exact qualifying row count — safe to use for offset calculation.
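To make the "fully matched" condition concrete, here is a minimal sketch of the check for the predicate `date >= cutoff`. It is not DataFusion's actual pruning code: `RowGroupStats`, `Match`, and `classify` are hypothetical names, dates are assumed to be stored as integer days, and the sketch treats a row group as fully matched only when the null count is known to be zero (a NULL never satisfies the comparison).

```rust
/// Hypothetical, simplified statistics for the `date` column of one row group.
#[derive(Debug, Clone, Copy)]
pub struct RowGroupStats {
    pub num_rows: u64,
    /// Minimum value of `date` in the row group, if the writer recorded it.
    pub min_date: Option<i32>,
    /// Number of NULLs in `date`, if the writer recorded it.
    pub null_count: Option<u64>,
}

/// What the statistics can prove about the predicate `date >= cutoff`.
#[derive(Debug, PartialEq, Eq)]
pub enum Match {
    /// Every row satisfies the predicate, so `num_rows` is the exact qualifying count.
    Full,
    /// Nothing is proven; the qualifying row count is unknown until rows are read.
    Unknown,
}

pub fn classify(rg: &RowGroupStats, cutoff: i32) -> Match {
    match (rg.min_date, rg.null_count) {
        // The smallest value already passes the filter and there are no NULLs,
        // so every row in the row group passes.
        (Some(min), Some(0)) if min >= cutoff => Match::Full,
        // Missing statistics, NULLs, or a minimum below the cutoff.
        _ => Match::Unknown,
    }
}
```

Only row groups classified as `Full` can contribute their `num_rows` to offset arithmetic; anything else has to be read and filtered.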
Proposed approach
In prune_by_offset, skip leading fully-matched RGs whose cumulative rows fall within the offset (already implemented in feat: pushdown OFFSET to parquet for RG-level skipping #21828's prune_by_offset with the has_predicate flag)
Stop at the first non-fully-matched RG (qualifying row count unknown)
GlobalLimitExec handles the remaining offset (reduced by skipped rows)
Need a mechanism to communicate the skipped row count from the parquet opener back to GlobalLimitExec (to reduce its skip)
Challenge
The key difficulty is coordinating between parquet-level RG skipping and GlobalLimitExec's skip counter. The optimizer sets GlobalLimitExec(skip=N) at plan time, but the actual RG-level skipping happens at runtime. Options:
Shared counter between opener and GlobalLimitExec
Dynamic adjustment of GlobalLimitExec skip based on DataSourceExec's output
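The leading-RG skip and the shared-counter option can be sketched together as below. This is an illustration under stated assumptions, not the code in #21828: `RowGroup`, `skip_leading_row_groups`, `SkippedRows`, and `open_file` are hypothetical names, a single file read by a single partition is assumed, and the sketch does not address how the limit operator would know that the opener has finished recording before it reads the counter.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical per-row-group summary after statistics pruning.
#[derive(Debug, Clone, Copy)]
pub struct RowGroup {
    pub num_rows: u64,
    /// True when statistics prove every row satisfies the predicate.
    pub is_fully_matched: bool,
}

/// Skips leading fully-matched row groups that fit entirely inside `offset`.
/// Returns the index of the first row group to read and the rows skipped.
pub fn skip_leading_row_groups(row_groups: &[RowGroup], offset: u64) -> (usize, u64) {
    let mut skipped = 0u64;
    for (idx, rg) in row_groups.iter().enumerate() {
        // Stop at the first row group whose qualifying row count is unknown,
        // or whose rows would extend past the offset.
        if !rg.is_fully_matched || skipped.saturating_add(rg.num_rows) > offset {
            return (idx, skipped);
        }
        skipped += rg.num_rows;
    }
    (row_groups.len(), skipped)
}

/// Hypothetical shared counter: the opener records skipped rows at runtime,
/// and the limit operator subtracts them from the skip chosen at plan time.
#[derive(Debug, Default)]
pub struct SkippedRows(AtomicU64);

impl SkippedRows {
    pub fn record(&self, rows: u64) {
        self.0.fetch_add(rows, Ordering::Relaxed);
    }

    pub fn remaining_skip(&self, planned_skip: u64) -> u64 {
        planned_skip.saturating_sub(self.0.load(Ordering::Relaxed))
    }
}

/// Opener side: prune by offset, publish the skipped row count, and return
/// the index of the first row group that still has to be read.
pub fn open_file(row_groups: &[RowGroup], offset: u64, counter: &Arc<SkippedRows>) -> usize {
    let (first, skipped) = skip_leading_row_groups(row_groups, offset);
    counter.record(skipped);
    first
}
```

For example, with row groups of 400 rows each where the first two are fully matched and the third is not, an offset of 1000 skips two row groups (800 rows), and the limit operator is left with a remaining skip of 200. With several files or partitions, the ordering of row groups and the timing of counter updates would need more care than this sketch shows.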