Skip granules consumed by OFFSET when reading in order #106578

Open
raimannma wants to merge 1 commit into ClickHouse:master from raimannma:skip-granules-on-offset-read-in-order

@raimannma
Contributor

Closes: #92671

When a MergeTree table is read in primary key order with an OFFSET, the leading granules entirely consumed by the offset are still read, merged and then dropped by the downstream offset step. This adds a query-plan optimization that drops those leading granules during reading and reduces the downstream offset accordingly. It is gated by the new setting query_plan_optimize_read_in_order_skip_offset (enabled by default) and only applies when it is safe to do so (forward read-in-order, no FINAL/PREWHERE/lightweight deletes/row policies/sampling/parallel replicas, and a safe ascending primary key).

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

When reading a MergeTree table in primary key order with an OFFSET, skip reading the leading granules entirely consumed by the offset.

When a MergeTree table is read in primary key order with an OFFSET, the
leading granules entirely consumed by the offset are read, merged and then
dropped by the downstream offset step.

This adds a query-plan optimization that, for forward read-in-order, drops
the leading granules whose exact total row count does not exceed the offset
(when each is strictly separated in primary key space from the remaining
data) and reduces the downstream offset by the number of rows skipped.
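
The skip rule described above can be sketched in a few lines. This is only an illustrative model, not the code from this PR: the `Granule` class, its integer key bounds, and the `skip_leading_granules` function are hypothetical names, and the granules are assumed to arrive as a single list already sorted ascending by primary key.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Granule:
    """Hypothetical stand-in for a granule: exact row count and primary key range."""
    rows: int
    min_key: int
    max_key: int


def skip_leading_granules(granules: List[Granule], offset: int) -> Tuple[int, int]:
    """Return (number of leading granules to drop, remaining downstream offset).

    Assumes `granules` is sorted ascending by primary key (forward read-in-order).
    """
    skipped_rows = 0
    skipped_granules = 0
    for i, g in enumerate(granules):
        # Stop once dropping this granule would skip more rows than OFFSET asks for.
        if skipped_rows + g.rows > offset:
            break
        # Only drop a granule whose keys lie strictly below all remaining data;
        # otherwise its rows could interleave with rows that must be returned.
        if any(other.min_key <= g.max_key for other in granules[i + 1:]):
            break
        skipped_rows += g.rows
        skipped_granules += 1
    # The downstream offset step only needs to drop what was not skipped here.
    return skipped_granules, offset - skipped_rows


# Three non-overlapping granules of 8192 rows each, OFFSET 20000.
example = [Granule(8192, 0, 99), Granule(8192, 100, 199), Granule(8192, 200, 299)]
result = skip_leading_granules(example, 20000)  # (2, 3616)
```

In this example the first two granules (16384 rows) are dropped during reading and the downstream offset shrinks from 20000 to 3616. If the first granule's key range overlapped the second's, the sketch would skip nothing and leave the offset unchanged.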

Gated by the new setting query_plan_optimize_read_in_order_skip_offset
(default enabled). Applied only when it is safe: no FINAL, PREWHERE,
lightweight deletes / row policies, sampling or parallel replicas, and a
safe ascending primary key.

Closes: ClickHouse#92671
@raimannma force-pushed the skip-granules-on-offset-read-in-order branch from 0affc21 to 65ce224 on June 5, 2026 17:04

Development

Successfully merging this pull request may close these issues.

[RFC] Skip granules based on the OFFSET for read in order
