Background
#21828 implements OFFSET pushdown for single-file parquet queries. Multi-file queries still use `GlobalLimitExec` for offset handling.
Problem
For multi-file queries like `SELECT * FROM directory/ LIMIT 5 OFFSET 1000000`, the offset is handled by `GlobalLimitExec`, which reads all rows and then discards the first 1,000,000. With multiple files, we could instead skip entire files whose rows fall entirely within the offset, based on cumulative row counts.
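A minimal sketch of the cumulative-row-count arithmetic (plain Rust, no DataFusion types; the function name is hypothetical):

```rust
// Given per-file row counts in read order and a query offset, return
// (number of leading files to skip entirely, residual row offset to
// apply inside the first file that is actually read).
fn split_offset(file_row_counts: &[usize], offset: usize) -> (usize, usize) {
    let mut remaining = offset;
    for (i, &rows) in file_row_counts.iter().enumerate() {
        if remaining < rows {
            // The offset lands inside file i: read it, skipping `remaining` rows.
            return (i, remaining);
        }
        remaining -= rows;
    }
    // Offset is >= the total row count: every file can be skipped.
    (file_row_counts.len(), remaining)
}

fn main() {
    // Three files of 400k rows each; OFFSET 1_000_000 skips the first
    // two files entirely and 200k rows inside the third.
    let files = [400_000, 400_000, 400_000];
    assert_eq!(split_offset(&files, 1_000_000), (2, 200_000));
    assert_eq!(split_offset(&files, 0), (0, 0));
}
```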
Challenge
File read order is non-deterministic with `target_partitions > 1` and dynamic scheduling (#21351). A shared counter (`Arc<AtomicUsize>`) across file openers could work for single-partition sequential reads, but multi-partition ordering is undefined.
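The shared-counter idea could look something like this (a sketch only; `OffsetTracker` and `consume` are hypothetical names, not DataFusion API, and this is only sound when files are opened sequentially in a deterministic order):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Each file opener claims rows against a shared remaining-offset counter.
struct OffsetTracker {
    remaining: AtomicUsize,
}

impl OffsetTracker {
    fn new(offset: usize) -> Arc<Self> {
        Arc::new(Self { remaining: AtomicUsize::new(offset) })
    }

    /// Returns how many of this file's `file_rows` rows fall inside the
    /// offset. If the result equals `file_rows`, the whole file is skipped.
    fn consume(&self, file_rows: usize) -> usize {
        // Atomically subtract up to `file_rows` from the remaining offset
        // and report how much was actually claimed.
        let prev = self
            .remaining
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |r| {
                Some(r.saturating_sub(file_rows))
            })
            .unwrap();
        prev.min(file_rows)
    }
}

fn main() {
    let tracker = OffsetTracker::new(1_000_000);
    assert_eq!(tracker.consume(400_000), 400_000); // file 1: skip entirely
    assert_eq!(tracker.consume(400_000), 400_000); // file 2: skip entirely
    assert_eq!(tracker.consume(400_000), 200_000); // file 3: skip 200k rows
    assert_eq!(tracker.consume(400_000), 0);       // file 4: read fully
}
```

With out-of-order opens under multiple partitions, the counter would still converge to zero, but rows would be skipped from the wrong files, which is why this path is restricted to the single-partition case below.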
Proposed approach
- Single partition (`preserve_order=true`): files read in deterministic order → shared counter tracks consumed offset across files → skip entire files + RGs
- Multi-partition: keep `GlobalLimitExec` (order undefined without ORDER BY)
- Use file-level statistics (`PartitionedFile.statistics.num_rows`) to skip entire files before opening
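The statistics-based pruning in the last bullet has to stop as soon as a file's row count is missing or inexact, because the cumulative position becomes unknown. A sketch of that decision logic (`FileStat` is a hypothetical stand-in for `PartitionedFile.statistics.num_rows`):

```rust
// A file's row-count statistic; None models a missing/inexact statistic.
struct FileStat {
    num_rows: Option<usize>,
}

/// How many leading files (in read order) can be skipped without opening
/// them, for the given offset. Pruning stops at the first file whose row
/// count is unknown or whose rows extend past the offset.
fn prunable_prefix(files: &[FileStat], offset: usize) -> usize {
    let mut remaining = offset;
    let mut skipped = 0;
    for f in files {
        match f.num_rows {
            Some(rows) if rows <= remaining => {
                remaining -= rows;
                skipped += 1;
            }
            // Missing stats, or the offset lands inside this file: stop.
            _ => break,
        }
    }
    skipped
}

fn main() {
    let files = [
        FileStat { num_rows: Some(400_000) },
        FileStat { num_rows: Some(400_000) },
        FileStat { num_rows: None }, // unknown: pruning must stop here
        FileStat { num_rows: Some(400_000) },
    ];
    assert_eq!(prunable_prefix(&files, 1_000_000), 2);
}
```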
Related