Skip to content

compareTrackAt() in partial_merge_join for nullables#100945

Open
4ertus2 wants to merge 7 commits into
ClickHouse:masterfrom
4ertus2:compare_at_track
Open

compareTrackAt() in partial_merge_join for nullables#100945
4ertus2 wants to merge 7 commits into
ClickHouse:masterfrom
4ertus2:compare_at_track

Conversation

@4ertus2
Copy link
Copy Markdown
Contributor

@4ertus2 4ertus2 commented Mar 27, 2026

Changelog category (leave one):

  • performance improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

perf improvement in partial_merge_join

Missing part of #99013

@4ertus2
Copy link
Copy Markdown
Contributor Author

4ertus2 commented Mar 27, 2026

Tested with TPC-H witt all columns changed to Nullables.

./bin/clickhouse-benchmark --cumulative -q "SELECT l_orderkey, sum(l_extendedprice * (1 - l_discount)) AS revenue, o_orderdate, o_shippriority FROM nu.customer, nu.orders, nu.lineitem WHERE (c_mktsegment = 'BUILDING') AND (c_custkey = o_custkey) AND (l_orderkey = o_orderkey) AND (o_orderdate < toDate('1995-03-15')) AND (l_shipdate > toDate('1995-03-15')) GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate ASC LIMIT 10 SETTINGS join_algorithm = 'partial_merge', max_rows_in_join = 100000000000000, max_bytes_in_join = 100000000000000"

before

Queries executed: 11.

localhost:9000, queries: 11, QPS: 0.128, RPS: 11314168.217, MiB/s: 260.941, result RPS: 1.276, result MiB/s: 0.000.

0%              7.421 sec.
10%             7.556 sec.
20%             7.749 sec.
30%             7.802 sec.
40%             7.825 sec.
50%             7.838 sec.
60%             7.841 sec.
70%             7.906 sec.
80%             7.909 sec.
90%             8.107 sec.
95%             8.122 sec.
99%             8.122 sec.
99.9%           8.122 sec.
99.99%          8.122 sec.

after

Queries executed: 11.

localhost:9000, queries: 11, QPS: 0.161, RPS: 14257888.379, MiB/s: 328.832, result RPS: 1.608, result MiB/s: 0.000.

0%              6.035 sec.
10%             6.113 sec.
20%             6.119 sec.
30%             6.127 sec.
40%             6.143 sec.
50%             6.188 sec.
60%             6.217 sec.
70%             6.277 sec.
80%             6.287 sec.
90%             6.308 sec.
95%             6.463 sec.
99%             6.463 sec.
99.9%           6.463 sec.
99.99%          6.463 sec.

I have not test it with reall nullable data yet. Only with Nullable columns with 0 nulls inside.

@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Mar 27, 2026
@alexey-milovidov
Copy link
Copy Markdown
Member

Please label it as a performance improvement.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Mar 27, 2026

Workflow [PR], commit [f6069ad]

Summary:


AI Review

Summary

This PR improves nullable-key handling in partial_merge_join (nullableCompareAt, nullableCompareTrackAt, and sort nulls_direction) and pre-extracts right-side columns for copy loops. The core code changes look consistent with the intended behavior, but the PR still lacks a focused nullable regression test for these cursor-movement changes.

Findings
  • ⚠️ Majors
    • [src/Interpreters/MergeJoin.cpp:458] The PR still has no dedicated tests/queries/0_stateless regression coverage for nullable-key partial-merge behavior (NULL runs, NULL/non-NULL boundaries, and continuation via left_key_tail / skip).
    • Suggested fix: add one focused stateless test that exercises these three cases with small block settings to force continuation paths.
Tests
  • ⚠️ Add a 0_stateless regression test for nullable partial-merge join cursor advancement across: (1) NULL runs on both sides, (2) NULL/non-NULL transitions, and (3) right-block continuation (left_key_tail / skip).
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
No large/binary files
Final Verdict
  • Status: ⚠️ Request changes
  • Minimum required actions:
    1. Add the nullable-key partial_merge_join regression test described above in tests/queries/0_stateless.

@clickhouse-gh clickhouse-gh Bot added pr-not-for-changelog This PR should not be mentioned in the changelog pr-performance Pull request with some performance improvements and removed pr-not-for-changelog This PR should not be mentioned in the changelog labels Mar 27, 2026
Comment thread src/Interpreters/MergeJoin.cpp
@4ertus2
Copy link
Copy Markdown
Contributor Author

4ertus2 commented Mar 28, 2026

Add check for null-tracks. AI should be happy :)
It's expected to be better for many-null case and a bit slower for no-nulls case.

Queries executed: 8.

localhost:9000, queries: 8, QPS: 0.155, RPS: 13747537.579, MiB/s: 317.062, result RPS: 1.550, result MiB/s: 0.000.

0%              6.343 sec.
10%             6.347 sec.
20%             6.347 sec.
30%             6.396 sec.
40%             6.407 sec.
50%             6.419 sec.
60%             6.419 sec.
70%             6.424 sec.
80%             6.505 sec.
90%             6.505 sec.
95%             6.653 sec.
99%             6.653 sec.
99.9%           6.653 sec.
99.99%          6.653 sec.

Comment thread src/Interpreters/MergeJoin.cpp Outdated
Comment thread src/Interpreters/MergeJoin.cpp Outdated
@4ertus2 4ertus2 marked this pull request as draft March 28, 2026 09:24
@@ -406,7 +459,7 @@ void makeSortAndMerge(const Names & keys, SortDescription & sort, SortDescriptio
if (!unique_keys.contains(key_name))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change flips partial_merge_join key ordering to nulls_direction = -1 and rewrites nullableCompareAt/nullableCompareTrackAt semantics for nullable keys. Please add a regression test in tests/queries/0_stateless that covers nullable join keys across both streams (including runs of NULL, mixed NULL/non-NULL, and cross-block continuation) to lock in cursor advancement behavior.

Without such coverage, future refactors around compareTrackAt can easily reintroduce subtle mis-ordering/skip bugs.

Copy link
Copy Markdown
Contributor Author

@4ertus2 4ertus2 Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need get and improve these ones #99706

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. This thread is still live: the PR changes nullableCompareAt/nullableCompareTrackAt behavior and nulls_direction in partial_merge_join, but there is still no regression test in this PR that locks cursor advancement with nullable keys.

The pointer to #99706 does not resolve this PR by itself, because nothing in the current diff adds coverage for:

  1. runs of NULL on both sides,
  2. mixed NULL/non-NULL boundaries,
  3. continuation across right-block boundaries (left_key_tail/skip paths).

Please add a focused tests/queries/0_stateless test for these cases in this PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concern is still open in the current HEAD: this PR changes nullable cursor movement in nullableCompareAt, nullableCompareTrackAt, and null ordering in makeSortAndMerge, but still does not add a regression test under tests/queries/0_stateless.

Please add one focused test that validates partial-merge behavior for nullable join keys across:

  1. runs of NULL on both sides,
  2. NULL/non-NULL boundaries,
  3. continuation across right-block boundaries (left_key_tail/skip path with small block limits).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is still present in the current HEAD: this PR changes nullable-key cursor logic (nullableCompareAt, nullableCompareTrackAt, and makeSortAndMerge ordering), but it still does not add a dedicated tests/queries/0_stateless regression test for partial-merge joins on nullable keys.

The previous reply points to #99706, but there is still no coverage in this PR for:

  1. runs of NULL keys on both sides,
  2. NULL/non-NULL boundary transitions,
  3. continuation paths across right-block boundaries (left_key_tail / skip behavior).

Please add one focused stateless test in this PR to lock this behavior.

@4ertus2 4ertus2 force-pushed the compare_at_track branch from 97ad3fd to 8964e5b Compare May 11, 2026 20:41
@4ertus2 4ertus2 marked this pull request as ready for review May 11, 2026 21:47
@4ertus2
Copy link
Copy Markdown
Contributor Author

4ertus2 commented May 11, 2026

Ryzen 7840HS

Before (26.4.2.10)

Queries executed: 64.

localhost:9000, queries: 64, QPS: 1.786, RPS: 158353146.253, MiB/s: 3652.125, result RPS: 17.855, result MiB/s: 0.001.

0%              0.527 sec.
10%             0.533 sec.
20%             0.536 sec.
30%             0.538 sec.
40%             0.541 sec.
50%             0.545 sec.
60%             0.546 sec.
70%             0.550 sec.
80%             0.553 sec.
90%             0.567 sec.
95%             0.608 sec.
99%             0.792 sec.
99.9%           0.969 sec.
99.99%          0.969 sec.

After

Queries executed: 63.

localhost:9000, queries: 63, QPS: 2.220, RPS: 196917822.598, MiB/s: 4541.548, result RPS: 22.203, result MiB/s: 0.001.

0%              0.422 sec.
10%             0.425 sec.
20%             0.426 sec.
30%             0.428 sec.
40%             0.428 sec.
50%             0.430 sec.
60%             0.431 sec.
70%             0.433 sec.
80%             0.435 sec.
90%             0.438 sec.
95%             0.560 sec.
99%             0.693 sec.
99.9%           0.904 sec.
99.99%          0.904 sec.

Default HashJoin (slower!)

Queries executed: 79.

localhost:9000, queries: 79, QPS: 1.947, RPS: 172693076.147, MiB/s: 3982.849, result RPS: 19.472, result MiB/s: 0.001.

0%              0.480 sec.
10%             0.484 sec.
20%             0.485 sec.
30%             0.486 sec.
40%             0.488 sec.
50%             0.488 sec.
60%             0.489 sec.
70%             0.493 sec.
80%             0.501 sec.
90%             0.519 sec.
95%             0.530 sec.
99%             0.909 sec.
99.9%           0.980 sec.
99.99%          0.980 sec.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented May 12, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.00% 84.10% +0.10%
Functions 91.10% 91.10% +0.00%
Branches 76.50% 76.50% +0.00%

Changed lines: 100.00% (101/101) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-performance Pull request with some performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants