feat(parquet): separate push decoder frontier state from row-group decoding #9804

Open
HippoBaro wants to merge 3 commits into apache:main from HippoBaro:frontier_row_group_selection

@HippoBaro
Contributor

Which issue does this PR close?

Rationale for this change

#9697 aims to make staged buffer management in the push decoder more explicit. In doing so, it exposes a structural problem: the logic for deciding whether a row group is still live, skipped, or unreachable is spread across several parts of the decoder.

This matters because row-group-level buffer release depends on a single question having a clear answer: can this row group ever need bytes again? That answer depends on the queued row groups, the remaining selection, the running offset/limit budget, and whether predicates require the decoder to stay conservative. Today, that state is split across multiple components, which makes the release policy difficult to centralize cleanly.

What changes are included in this PR?

This PR introduces a clearer ownership boundary in the push decoder:

  • cross-row-group scan state is now handled by a dedicated frontier/look-ahead mechanism
  • the row-group builder is reduced to current-row-group decode work only
  • offset/limit accounting and row-group selection advancement are centralized around that frontier/builder split

This does not implement row-group-level buffer release directly, but it establishes the structure needed for that follow-up work. It should also make future pruning rules easier to add and maintain.

Are these changes tested?

All existing tests pass, and the refactor adds focused coverage for the extracted budget logic and the frontier-driven `try_next_reader` path.

Are there any user-facing changes?

None.

Extract the push decoder offset/limit accounting into `RowBudget` and
use it when planning row-group reads.

This centralizes the row-count arithmetic needed to apply offset and
limit without changing decoder behavior. It also adds focused tests for
plain limit, offset+limit, and empty-selection cases so later frontier
work can reuse the same accounting safely.

Signed-off-by: Hippolyte Barraud <hippolyte.barraud@datadoghq.com>
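
The offset/limit arithmetic described in this commit can be illustrated with a small sketch. This is not the PR's `RowBudget`: the fields and the `new`, `consume`, and `is_exhausted` methods below are hypothetical names chosen for illustration, and the sketch assumes the budget is charged once per row group with the number of rows the selection picks in it.

```rust
/// Hypothetical sketch of offset/limit accounting; not the PR's actual `RowBudget`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RowBudget {
    /// Rows still to be skipped before any row is emitted.
    offset: usize,
    /// Rows still allowed to be emitted; `None` means no limit.
    limit: Option<usize>,
}

impl RowBudget {
    /// Build a budget from an optional offset and an optional limit.
    pub fn new(offset: Option<usize>, limit: Option<usize>) -> Self {
        Self { offset: offset.unwrap_or(0), limit }
    }

    /// True once the limit has been fully consumed.
    pub fn is_exhausted(&self) -> bool {
        self.limit == Some(0)
    }

    /// Charge `selected` rows against the budget and return how many are emitted.
    pub fn consume(&mut self, selected: usize) -> usize {
        // The offset swallows rows first.
        let skipped = selected.min(self.offset);
        self.offset -= skipped;
        // Whatever survives the offset is capped by the remaining limit.
        let available = selected - skipped;
        let emitted = match self.limit {
            Some(limit) => available.min(limit),
            None => available,
        };
        if let Some(limit) = self.limit.as_mut() {
            *limit -= emitted;
        }
        emitted
    }
}
```

With an offset of 120 and a limit of 30, two row groups of 100 selected rows each would emit 0 and then 30 rows under this sketch, after which the budget is exhausted.
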
github-actions bot added the `parquet` (Changes to the parquet crate) label on Apr 24, 2026
Move the cross-row-group scan state into a dedicated `RowGroupFrontier`.

The frontier now owns the queued row groups, the tail `RowSelection`,
the running `RowBudget`, and the conservative "has predicates" flag.
Reduce `RowGroupReaderBuilder` to current-row-group work only by
threading a budget snapshot into `next_row_group` and returning a typed
`RowGroupBuildResult`.

This also folds in the selection-frontier cleanup so queued selection
state is consumed in one place instead of through ad hoc split/clone
logic.

Signed-off-by: Hippolyte Barraud <hippolyte.barraud@datadoghq.com>
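
The ownership split described in this commit might look roughly like the sketch below. Every name here (`BudgetSnapshot`, `BuildResult`, `build_row_group`, `Frontier`) is a hypothetical stand-in, not the PR's `RowGroupFrontier`, `RowGroupReaderBuilder`, or `RowGroupBuildResult`. The tail `RowSelection` is simplified to a per-row-group count of selected rows, and predicate evaluation is not modelled.

```rust
use std::collections::VecDeque;

/// Hypothetical copy of the running offset/limit state handed to the builder.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BudgetSnapshot {
    pub offset: usize,
    pub limit: Option<usize>,
}

/// Hypothetical typed outcome of planning one row group.
#[derive(Debug, PartialEq, Eq)]
pub enum BuildResult {
    /// Nothing in this row group will be emitted.
    Skipped { budget: BudgetSnapshot },
    /// The row group must be decoded and yields `emit` rows.
    Ready { row_group: usize, emit: usize, budget: BudgetSnapshot },
}

/// Builder side: looks at one row group only and returns the updated budget.
pub fn build_row_group(row_group: usize, selected: usize, budget: BudgetSnapshot) -> BuildResult {
    let skipped = selected.min(budget.offset);
    let available = selected - skipped;
    let emit = budget.limit.map_or(available, |l| available.min(l));
    let budget = BudgetSnapshot {
        offset: budget.offset - skipped,
        limit: budget.limit.map(|l| l - emit),
    };
    if emit == 0 {
        BuildResult::Skipped { budget }
    } else {
        BuildResult::Ready { row_group, emit, budget }
    }
}

/// Frontier side: owns everything that spans row groups.
pub struct Frontier {
    /// Queued row groups as (index, rows selected in that row group).
    queued: VecDeque<(usize, usize)>,
    budget: BudgetSnapshot,
    /// Kept for completeness; this sketch does not model predicate evaluation.
    pub has_predicates: bool,
}

impl Frontier {
    pub fn new(queued: Vec<(usize, usize)>, budget: BudgetSnapshot, has_predicates: bool) -> Self {
        Self { queued: queued.into(), budget, has_predicates }
    }

    /// Pop the next queued row group, plan it, and adopt the returned budget.
    pub fn advance(&mut self) -> Option<BuildResult> {
        let (row_group, selected) = self.queued.pop_front()?;
        let result = build_row_group(row_group, selected, self.budget);
        self.budget = match &result {
            BuildResult::Skipped { budget } | BuildResult::Ready { budget, .. } => *budget,
        };
        Some(result)
    }
}
```

The point of the shape is that `build_row_group` never sees the queue or mutates shared state: it receives a snapshot and hands back a new one, so all cross-row-group bookkeeping stays in one place.
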
Teach the row-group frontier to seek ahead over queued row groups that
can be proven unreachable before instantiating the row-group builder.

Skip queued row groups when their selection slice is empty, when
offset/limit leaves no rows to read, or when the remaining limit is
already exhausted. Keep predicate-bearing row groups conservative and
stop at the first row group that may still need data.

Add a push decoder regression covering `try_next_reader` with
offset/limit so the frontier path is exercised directly.

Signed-off-by: Hippolyte Barraud <hippolyte.barraud@datadoghq.com>
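
The skip rules in this commit can be sketched as a loop over the queue. The types and the `seek_ahead` function are hypothetical, not the PR's code. One assumption is worth flagging: when predicates are present, this sketch only skips row groups with an empty selection, on the reasoning that offset and limit cannot be charged before filtering has run. The PR may draw that line differently.

```rust
use std::collections::VecDeque;

/// Hypothetical queue entry: a row group and the rows its selection slice picks.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct QueuedRowGroup {
    pub index: usize,
    pub selected: usize,
}

/// Hypothetical running offset/limit state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Budget {
    pub offset: usize,
    pub limit: Option<usize>,
}

/// Drop provably unreachable row groups from the front of the queue and
/// return the first one that may still need data (it stays queued).
pub fn seek_ahead(
    queued: &mut VecDeque<QueuedRowGroup>,
    budget: &mut Budget,
    has_predicates: bool,
) -> Option<QueuedRowGroup> {
    while let Some(&next) = queued.front() {
        // Rule 1: an empty selection slice never needs bytes.
        if next.selected == 0 {
            queued.pop_front();
            continue;
        }
        // Assumption: with predicates, row counts are unknown until filtering,
        // so stay conservative and stop here.
        if has_predicates {
            return Some(next);
        }
        // Rule 2: the remaining limit is already exhausted.
        if budget.limit == Some(0) {
            queued.pop_front();
            continue;
        }
        // Rule 3: the offset swallows every selected row in this row group.
        if next.selected <= budget.offset {
            budget.offset -= next.selected;
            queued.pop_front();
            continue;
        }
        return Some(next);
    }
    None
}
```

Because the loop runs before any row-group builder is created, a skipped row group costs only a queue pop and a budget update.
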