Skip to content

Transitive join predicates#98479

Merged
davenger merged 27 commits intomasterfrom
transitive_join_predicates
Apr 2, 2026
Merged

Transitive join predicates#98479
davenger merged 27 commits intomasterfrom
transitive_join_predicates

Conversation

@davenger
Copy link
Copy Markdown
Member

@davenger davenger commented Mar 2, 2026

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

The join order optimizer now infers transitive equi-join predicates from existing join conditions. For example, given A.x = B.x AND B.x = C.x, the equivalence A.x = C.x is recognized, allowing the optimizer to consider direct joins between transitively-connected tables. This can improve plan quality for star and snowflake schemas where dimension tables connect through a shared fact table. The feature is controlled by the new enable_join_transitive_predicates setting (off by default)

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@davenger davenger marked this pull request as draft March 2, 2026 09:08
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Mar 2, 2026

Workflow [PR], commit [294b356]

Summary:

job_name test_name status info comment
Integration tests (amd_asan_ubsan, targeted) failure
test_totp_auth/test_totp.py::test_interactive_totp_authentication FAIL cidb, issue ISSUE CREATED
Stress test (arm_msan) failure
Server died FAIL cidb IGNORED
MemorySanitizer: use-of-uninitialized-value (STID: 1003-358c) FAIL cidb, issue ISSUE EXISTS

AI Review

Summary

This PR adds transitive predicate inference for join-order optimization behind enable_join_transitive_predicates (default off), introduces a reusable EquivalenceClasses helper, and adds extensive stateless tests (including outer-join and edge-case coverage). I found only minor issues: two typos in comments and missing user documentation update for the new setting. No additional correctness, safety, or performance blockers were identified beyond these.

Findings
  • 💡 Nits
    • [src/Processors/QueryPlan/Optimizations/joinOrder.cpp:315] Typo in comment: Peridically should be Periodically.
      • Suggested fix: correct the comment spelling.
    • [src/Processors/QueryPlan/Optimizations/joinOrder.h:77] Typo in comment: JoinActionRef-s should be JoinActionRefs.
      • Suggested fix: adjust pluralization for readability.
    • [src/Processors/QueryPlan/Optimizations/QueryPlanOptimizationSettings.h:75] New user-facing setting enable_join_transitive_predicates was added, but PR template still leaves documentation unchecked.
      • Suggested fix: add docs describing behavior, limitations/safety notes (especially outer-join semantics), and rollout guidance/example.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality ⚠️ Documentation checkbox remains unchecked for new setting enable_join_transitive_predicates.
Safe rollout ⚠️ User docs are needed to safely roll out the new setting.
Compilation time
Final Verdict
  • Status: ⚠️ Request changes
  • Minimum required actions:
    • Fix the two comment typos.
    • Add/update user documentation for enable_join_transitive_predicates and mark documentation accordingly in PR metadata.

@clickhouse-gh clickhouse-gh Bot added the pr-improvement Pull request with some product improvements label Mar 2, 2026
/// Column equivalence classes derived from equi-join edges (e.g., A.x = B.x AND B.x = C.x
/// implies A.x, B.x, C.x are all equivalent). Used by the join order optimizer to detect
/// transitive connectivity between relations without synthesizing extra edges.
EquivalenceClasses<RelColumn, RelColumnHash> column_equivalences;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can’t we use JoinActionRef instead of RelColumn? Is it because JoinActionRef would be different due to aliases in the DAG?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it to use JoinActionRef-s

///
/// T must be equality-comparable; Hash must be a hash function for T.
template <typename T, typename Hash = std::hash<T>>
struct EquivalenceClasses
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s also another implementation of disjoint sets here, but I don’t think it’s worth unifying them or choosing one over the other. It probably wouldn’t make the code any cleaner, and performance isn’t critical in this cases.

/// Disjoint Set Union (Union-Find) data structure for JoinActionRef.
/// Used to find equivalent expressions in JOIN conditions.
/// For example, if (a = b) and (b = c), then a, b, c belong to the same equivalence class.
class EquivalentJoinKeySet
{
public:
JoinActionRef findOrAdd(JoinActionRef ref)
{
auto it = parent.find(ref);
if (it == parent.end())
{
parent.emplace(ref, ref);
return ref;
}
if (it->second == ref)
return ref;
JoinActionRef root = findOrAdd(it->second);
parent.insert_or_assign(ref, root);
return root;
}
JoinActionRef unite(JoinActionRef a, JoinActionRef b)
{
JoinActionRef root_a = findOrAdd(a);
JoinActionRef root_b = findOrAdd(b);
if (root_a == root_b)
return root_a;
size_t rank_a = rank[root_a];
size_t rank_b = rank[root_b];
if (rank_a < rank_b)
std::swap(root_a, root_b);
parent.insert_or_assign(root_b, root_a);
if (rank_a == rank_b)
++rank[root_a];
return root_a;
}
bool connected(JoinActionRef a, JoinActionRef b)
{
return findOrAdd(a) == findOrAdd(b);
}
std::unordered_map<JoinActionRef, std::vector<JoinActionRef>> getClasses()
{
std::unordered_map<JoinActionRef, std::vector<JoinActionRef>> classes;
for (auto & [ref, _] : parent)
classes[findOrAdd(ref)].push_back(ref);
return classes;
}
std::vector<JoinActionRef> getClass(JoinActionRef ref)
{
std::vector<JoinActionRef> res;
JoinActionRef root = findOrAdd(ref);
for (auto & [other_ref, _] : parent)
{
if (findOrAdd(other_ref) == root)
res.push_back(other_ref);
}
return res;
}
private:
std::unordered_map<JoinActionRef, JoinActionRef> parent;
std::unordered_map<JoinActionRef, size_t> rank;
};

davenger and others added 6 commits March 6, 2026 00:26
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SELECT * FROM t1 ALL INNER JOIN tj ON t1.key1 = tj.key1 AND t1.key2 = tj.key2 AND t1.key3 = tj.key3 AND t1.key1 = tj.key1; -- { serverError INCOMPATIBLE_TYPE_OF_JOIN }
SELECT '--- duplicate key dedup ---';
SELECT * FROM t1 ALL INNER JOIN tj ON t1.key1 = tj.key1 AND t1.key2 = tj.key2 AND t1.key3 = tj.key3 AND t1.key1 = tj.key1 SETTINGS enable_analyzer = 0; -- { serverError INCOMPATIBLE_TYPE_OF_JOIN }
SELECT * FROM t1 ALL INNER JOIN tj ON t1.key1 = tj.key1 AND t1.key2 = tj.key2 AND t1.key3 = tj.key3 AND t1.key1 = tj.key1 ORDER BY t1.key1 SETTINGS enable_analyzer = 1; -- Whith analyzer duplicate predicates are removed
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: Whith -> With in this test comment.

davenger and others added 4 commits March 13, 2026 12:09
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -194,7 +128,7 @@ struct DecorrelationContext
CorrelatedPlanStepMap correlated_plan_steps;
/// Equivalence classes stack for subqeiries. Equivalence classes should not be propagated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: subqeiries -> subqueries.

davenger and others added 5 commits March 13, 2026 13:25
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@davenger davenger force-pushed the transitive_join_predicates branch from 464a172 to 357db9b Compare March 13, 2026 16:49
UInt64 max_ndv = std::max(lhs_ndv, rhs_ndv);
if (max_ndv > 0)
selectivity = std::min(selectivity, 1.0 / static_cast<double>(max_ndv));
break;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

computeSelectivity currently uses the first matching pair from an equivalence class and then breaks. Because class traversal comes from unordered_map/unordered_set, that pair selection can vary and make selectivity estimates (and therefore join-order choice) unstable. Please evaluate all candidates across the class (or pick a deterministic key and strongest selectivity) before finalizing selectivity.

@@ -72,6 +72,9 @@ struct QueryPlanOptimizationSettings
/// Maximum number of tables in query graph to reorder
UInt64 query_plan_optimize_join_order_limit;

/// Infer transitive equi-join predicates (e.g., A.x=B.x AND B.x=C.x implies A.x=C.x)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a new user-visible optimization knob, but the PR template still leaves documentation unchecked. Please add/update docs for enable_join_transitive_predicates (what it changes, safety constraints, and an example) so users can discover and roll it out safely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@davenger davenger force-pushed the transitive_join_predicates branch from b9b508d to d8f8bce Compare March 14, 2026 07:45
@davenger davenger marked this pull request as ready for review March 14, 2026 12:28
@vdimir vdimir self-assigned this Mar 27, 2026
auto edge = getApplicableExpressions(left->relations, right->relations);
if (edge.empty() && best_plan)
bool connected = !edge.empty()
|| query_graph.areTransitivelyConnected(left->relations, right->relations);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, below we can add a join with an empty applied_edge, assuming the predicate will be added later via cleanupJoinPredicates. I was wondering what happens to the existing ones, and why the checks for non_applied_edges are still valid. It seems we will apply them anyway later, as soon as the required relations are present, and then remove them in cleanupJoinPredicates as well. It’s a bit tricky, but it should be okay.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is done the same way as for DPsize, but there we might have many entries in DP table that will not be included in the final plan, and creating real JoinPredicates for them upfront might be wasteful

@davenger davenger added this pull request to the merge queue Apr 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 1, 2026
Comment thread src/Processors/QueryPlan/Optimizations/joinOrder.cpp Outdated
@davenger davenger force-pushed the transitive_join_predicates branch from 79d57bc to af1d306 Compare April 1, 2026 19:28
/// Column equivalence classes derived from equi-join edges (e.g., A.x = B.x AND B.x = C.x
/// implies A.x, B.x, C.x are all equivalent). Used by the join order optimizer to detect
/// transitive connectivity between relations without synthesizing extra edges.
/// Stored as alias-resolved JoinActionRef-s pointing to INPUT nodes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: JoinActionRef-s should be JoinActionRefs (or JoinActionRef in plural context) for readability.

double computeSelectivity(const std::vector<JoinActionRef *> & edges, const BitSet & left, const BitSet & right);
size_t getColumnStats(BitSet rels, const String & column_name);

/// Peridically called from potentially long running optimization to check time limits and send progress
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: Peridically should be Periodically.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 2, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.00% 84.00% +0.00%
Functions 90.90% 90.90% +0.00%
Branches 76.50% 76.50% +0.00%

Changed lines: 93.35% (295/316) | lost baseline coverage: 48 line(s) · Uncovered code

Full report · Diff report

@davenger davenger added this pull request to the merge queue Apr 2, 2026
Merged via the queue into master with commit 2814147 Apr 2, 2026
160 of 163 checks passed
@davenger davenger deleted the transitive_join_predicates branch April 2, 2026 06:12
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label Apr 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants