Skip to content

Optimization for deferring row policy and PREWHERE#102884

Merged
yariks5s merged 9 commits into
masterfrom
yarik/fix-deferred-filters-perf-regression
May 3, 2026
Merged

Optimization for deferring row policy and PREWHERE#102884
yariks5s merged 9 commits into
masterfrom
yarik/fix-deferred-filters-perf-regression

Conversation

@yariks5s
Copy link
Copy Markdown
Member

@yariks5s yariks5s commented Apr 16, 2026

With FINAL queries and apply_row_policy_after_final = 1 (default), this PR narrows when the row policy and PREWHERE are deferred to after FINAL

This is the explain without the patch: https://fiddle.clickhouse.com/6251834d-26e7-4a4e-8bc4-ba4723de4420

This is the explain after the patch: https://pastila.clickhouse.com/?00224304/0d1aaa7352c7f511fa80cd31f1ba8eef#snF8VfZFxcq4d+YOsS2qvA==GCM

For a FINAL read with a row policy (and optionally a PREWHERE):

Row policy PREWHERE Before After
SK-only, deterministic none row policy deferred row policy not deferred
SK-only, deterministic any row policy deferred, PREWHERE follows user setting neither deferred
SK-only, non-deterministic any row policy deferred row policy still deferred (unchanged)
non-SK none row policy deferred row policy deferred (unchanged)
non-SK any row policy deferred, PREWHERE force-deferred row policy deferred, PREWHERE force-deferred (unchanged)

PREWHERE is not deferred anymore in this case. This is the case where we apply PREWHERE on sorting key column, when we probably can avoid deferring

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Skip deferring the row policy after FINAL when it depends only on sorting-key columns and is deterministic, and skip the corresponding PREWHERE deferral that the row policy was forcing in that case

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 16, 2026

Workflow [PR], commit [2f6e799]

Summary:

job_name test_name status info comment
Performance Comparison (arm_release, master_head, 5/6) FAIL
distinct_in_order #6::old FAIL query history
distinct_in_order #6::new FAIL query history
logical_functions_medium #1::old FAIL query history
logical_functions_medium #1::new FAIL query history
logical_functions_small #7::old FAIL query history
logical_functions_small #7::new FAIL query history
logical_functions_small #10::old FAIL query history
logical_functions_small #10::new FAIL query history
logical_functions_small #12::old FAIL query history
logical_functions_small #12::new FAIL query history
2 more test cases not shown

AI Review

Summary

This PR narrows when row policy and PREWHERE are deferred for FINAL reads in ReadFromMergeTree, and adds/updates stateless tests for determinism and side-channel safety cases. The current implementation is coherent with the intended behavior change, and I did not find new correctness/safety issues beyond already-reported inline feedback.

Missing context
  • ⚠️ No CI run logs or benchmark results were available in this review context.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
No large/binary files
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-improvement Pull request with some product improvements label Apr 16, 2026
Comment thread src/Processors/QueryPlan/ReadFromMergeTree.cpp
Comment thread tests/queries/0_stateless/03742_apply_row_policy_after_final.sql
Comment thread tests/queries/0_stateless/03742_apply_row_policy_after_final.reference Outdated
Comment thread tests/queries/0_stateless/03742_apply_row_policy_after_final.reference Outdated
@ClickHouse ClickHouse locked and limited conversation to collaborators Apr 16, 2026
@ClickHouse ClickHouse unlocked this conversation Apr 16, 2026
@rienath rienath self-assigned this Apr 28, 2026
Copy link
Copy Markdown
Member

@rienath rienath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand the PR description? Right now it only mentions PREWHERE, but the PR also stops deferring row policies that are over the sorting key, which is arguably the bigger change.

It would also be great to walk through each case explicitly (row policy over SK with no PREWHERE, row policy over SK with PREWHERE, row policy on non-SK with PREWHERE over SK and the cases that don't change) and describe what the pipeline looked like before vs after for each one. A small table or list would make it much easier to follow than diffing two EXPLAIN outputs.

More importantly, the correctness argument should be written down somewhere (with details about the determinism caveat and guard handling). If something breaks around row policies / FINAL, this is the first place someone will look, and "we probably can avoid deferring" won't tell them whether the optimisation was sound or a guess.

Comment thread tests/queries/0_stateless/03742_apply_row_policy_after_final.sql
Comment thread src/Processors/QueryPlan/ReadFromMergeTree.cpp
Comment thread src/Processors/QueryPlan/ReadFromMergeTree.cpp Outdated
Copy link
Copy Markdown
Member

@rienath rienath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I resolved the points that were taken care of. Ping me when the rest are addressed, I will take another look @yariks5s

@pufit could you please review this PR too? It's a critical section, would be great to have component expert eyes look at it

Comment thread src/Processors/QueryPlan/ReadFromMergeTree.cpp Outdated
@rienath rienath requested a review from pufit April 29, 2026 13:06
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 30, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.00% -0.10%
Functions 91.10% 91.10% +0.00%
Branches 76.60% 76.50% -0.10%

Changed lines: 100.00% (27/27) | lost baseline coverage: 1 line(s) · Uncovered code

Full report · Diff report

Copy link
Copy Markdown
Member

@rienath rienath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the fact that this also incidentally solved row policy problem that we had before. If someone set SET apply_row_policy_after_final = 1, apply_prewhere_after_final = 0;, then we had PREWHERE → FINAL → ROW POLICY. PREWHERE was applied before ROW POLICY and we could probe for rows we haven't got access to. You can see example here. But this is solved now!

SELECT *
FROM employees
FINAL
PREWHERE throwIf(salary > 200000, 'leak') = 0
   ┌─dept_id─┬─name──┬─salary─┬─version─┐
1. │       1 │ Alice │  81000 │       2 │
   └─────────┴───────┴────────┴─────────┘

I think we should add this fiddle query as a test too so we don't degrade. In any case, looks good!

@yariks5s
Copy link
Copy Markdown
Member Author

yariks5s commented May 3, 2026

Thanks! I will add this test in the follow-up PR.

@yariks5s yariks5s added this pull request to the merge queue May 3, 2026
Merged via the queue into master with commit 2ebd247 May 3, 2026
163 of 165 checks passed
@yariks5s yariks5s deleted the yarik/fix-deferred-filters-perf-regression branch May 3, 2026 05:35
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 3, 2026
@fm4v fm4v added pr-must-backport Pull request should be backported intentionally. Use this label with great care! v25.10-must-backport v25.12-must-backport v26.2-must-backport labels May 4, 2026
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label May 4, 2026
robot-ch-test-poll added a commit that referenced this pull request May 4, 2026
Cherry pick #102884 to 26.3: Optimization for deferring row policy and PREWHERE
robot-ch-test-poll added a commit that referenced this pull request May 4, 2026
Cherry pick #102884 to 26.4: Optimization for deferring row policy and PREWHERE
yariks5s added a commit that referenced this pull request May 14, 2026
Cherry pick #102884 to 26.2: Optimization for deferring row policy and PREWHERE
@robot-clickhouse robot-clickhouse added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label May 14, 2026
@yariks5s yariks5s removed pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels May 18, 2026
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore labels May 18, 2026
yariks5s added a commit that referenced this pull request May 18, 2026
Backport #102884 to 26.2: Optimization for deferring row policy and PREWHERE
yariks5s added a commit that referenced this pull request May 19, 2026
Backport #102884 to 26.3: Optimization for deferring row policy and PREWHERE
yariks5s added a commit that referenced this pull request May 19, 2026
Backport #102884 to 26.4: Optimization for deferring row policy and PREWHERE
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-improvement Pull request with some product improvements pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo v25.10-must-backport v25.12-must-backport v26.2-must-backport

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants