Fix ARRAY_JOIN row count mismatch in partial evaluation during outer-to-inner join optimization #98464

Merged: alexey-milovidov merged 2 commits into master from fix-array-join-partial-eval-row-count-mismatch on Mar 2, 2026

@alexey-milovidov
Member

Summary

  • Fix LOGICAL_ERROR exception when arrayJoin in a WHERE clause causes row count mismatch during partial evaluation in the convertOuterJoinToInnerJoin optimization
  • arrayJoin expands a single-row constant array into multiple rows, breaking the invariant that all columns have input_rows_count rows
  • Skip ARRAY_JOIN expansion during partial evaluation (input_rows_count > 0), propagating "unknown" instead — the header evaluation path (input_rows_count=0) is unaffected
  • This is distinct from the null-column segfault fixed in #98147 (Fix segfault in outer-to-inner join optimization with arrayJoin in filter): here the input column is present, but expanding it breaks the row count invariant
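
The invariant described in the bullets above can be illustrated with a toy model. This is a sketch in Python, not the ClickHouse C++ code: columns are modeled as plain lists, and the helper names `array_join` and `execute_function` are hypothetical stand-ins. It only aims to show how expanding one row into five trips a per-argument row count check.

```python
# Toy model only: columns are plain Python lists, and the names below are
# hypothetical stand-ins, not ClickHouse APIs.

def array_join(column):
    """Expand each array-valued row into one output row per element."""
    return [element for array in column for element in array]


def execute_function(func, arguments, input_rows_count):
    """Apply func row by row, insisting every argument has input_rows_count rows."""
    for argument in arguments:
        if len(argument) != input_rows_count:
            raise RuntimeError(
                f"Expected the argument to have {input_rows_count} rows, "
                f"but it has {len(argument)}"
            )
    return [func(*row) for row in zip(*arguments)]


input_rows_count = 1
constant_array = [[1, 2, 3, 4, 5]]      # one row holding a 5-element array
expanded = array_join(constant_array)   # five rows, one per array element

try:
    # The downstream function still expects a single row per argument.
    execute_function(lambda x: x > 2, [expanded], input_rows_count)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
```

In this model the downstream call fails because the expanded column no longer matches `input_rows_count`, which mirrors the shape of the reported exception.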

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=a1e0dff89e0b4aa8c21455db0bda2ca751f3387e&name_0=MasterCI&name_1=AST%20fuzzer%20%28amd_ubsan%29

Test plan

  • Added stateless test 03836_array_join_in_filter_partial_evaluation reproducing the exception
  • Verified original fuzzer query no longer triggers exception

Changelog category (leave one):

  • Critical Bug Fix (crash, data loss, RBAC)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix LOGICAL_ERROR exception when arrayJoin is used in a filter expression with OUTER JOIN and join_use_nulls enabled.

🤖 Generated with Claude Code

…r-to-inner join optimization

`arrayJoin` in a filter expression changes the number of rows (e.g., a
constant array of 5 elements expands 1 row into 5 rows). When
`filterResultForNotMatchedRows` evaluates the filter with
`input_rows_count=1`, the `ARRAY_JOIN` case in
`executeActionForPartialResult` would produce a column with more rows
than expected. Downstream functions then received arguments with
mismatched sizes (5 vs 1), triggering a LOGICAL_ERROR exception:
"Expected the argument to have 1 rows, but it has 5".

This is distinct from the null-column segfault fixed in #98147 - here
the input column IS present but expanding it breaks the row count
invariant.

Skip `ARRAY_JOIN` expansion when `input_rows_count > 0` (partial
evaluation for optimization). The result propagates as "unknown",
which is safe. The header evaluation path (`input_rows_count=0`) is
unaffected.
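
The skip described above can be sketched in a simplified form. This is a toy Python model under stated assumptions, not the actual `executeActionForPartialResult` code: `UNKNOWN`, the string action names, and `evaluate_filter` are hypothetical, and columns are plain lists.

```python
# Toy model only; names are hypothetical, not the ClickHouse implementation.
UNKNOWN = None  # stands for "result cannot be determined at this stage"


def execute_action_for_partial_result(action, argument, input_rows_count):
    """Evaluate one action; return UNKNOWN when the result is not safe to compute."""
    if argument is UNKNOWN:
        return UNKNOWN  # unknown inputs stay unknown
    if action == "ARRAY_JOIN":
        if input_rows_count > 0:
            return UNKNOWN  # expansion would change the row count, so skip it
        # Header path: zero rows in, zero rows out.
        return [element for array in argument for element in array]
    if action == "GREATER_THAN_2":
        return [value > 2 for value in argument]
    raise ValueError(f"unsupported action: {action}")


def evaluate_filter(column, input_rows_count):
    """Chain ARRAY_JOIN into a comparison, as a filter expression might."""
    step = execute_action_for_partial_result("ARRAY_JOIN", column, input_rows_count)
    return execute_action_for_partial_result("GREATER_THAN_2", step, input_rows_count)


partial = evaluate_filter([[1, 2, 3, 4, 5]], 1)  # partial evaluation: UNKNOWN
header = evaluate_filter([], 0)                  # header evaluation: empty column
```

Returning "unknown" is the conservative choice in this model: the caller cannot prove anything about the filter result, so it has no grounds to apply the optimization.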

https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=a1e0dff89e0b4aa8c21455db0bda2ca751f3387e&name_0=MasterCI&name_1=AST%20fuzzer%20%28amd_ubsan%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@clickhouse-gh
Contributor

clickhouse-gh bot commented Mar 2, 2026

Workflow [PR], commit [0a7dff1]

The bug is specifically in tryConvertAnyOuterJoinToInnerJoin which only
runs for ANY strictness joins. A regular LEFT JOIN (ALL strictness)
goes through a different code path that is not affected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@alexey-milovidov self-assigned this on Mar 2, 2026
@alexey-milovidov merged commit 8c4ed1e into master on Mar 2, 2026
147 of 148 checks passed
@alexey-milovidov deleted the fix-array-join-partial-eval-row-count-mismatch branch on March 2, 2026 15:11
@robot-ch-test-poll3 added the pr-synced-to-cloud label on Mar 2, 2026
@Algunenano added the pr-must-backport and pr-critical-bugfix labels on Mar 13, 2026
@robot-clickhouse-ci-1 added the pr-must-backport-synced label on Mar 13, 2026
robot-clickhouse-ci-1 added a commit that referenced this pull request Mar 13, 2026
Cherry pick #98464 to 25.8: Fix ARRAY_JOIN row count mismatch in partial evaluation during outer-to-inner join optimization
robot-clickhouse added a commit that referenced this pull request Mar 13, 2026
… evaluation during outer-to-inner join optimization
robot-clickhouse-ci-1 added a commit that referenced this pull request Mar 13, 2026
Cherry pick #98464 to 25.12: Fix ARRAY_JOIN row count mismatch in partial evaluation during outer-to-inner join optimization
robot-clickhouse added a commit that referenced this pull request Mar 13, 2026
…l evaluation during outer-to-inner join optimization
robot-clickhouse-ci-1 added a commit that referenced this pull request Mar 13, 2026
Cherry pick #98464 to 26.1: Fix ARRAY_JOIN row count mismatch in partial evaluation during outer-to-inner join optimization
robot-clickhouse added a commit that referenced this pull request Mar 13, 2026
… evaluation during outer-to-inner join optimization
robot-clickhouse-ci-1 added a commit that referenced this pull request Mar 13, 2026
Cherry pick #98464 to 26.2: Fix ARRAY_JOIN row count mismatch in partial evaluation during outer-to-inner join optimization
robot-clickhouse added a commit that referenced this pull request Mar 13, 2026
… evaluation during outer-to-inner join optimization
robot-ch-test-poll added a commit that referenced this pull request Mar 13, 2026
Cherry pick #98464 to 25.3: Fix ARRAY_JOIN row count mismatch in partial evaluation during outer-to-inner join optimization
robot-clickhouse added a commit that referenced this pull request Mar 13, 2026
… evaluation during outer-to-inner join optimization
Labels

  • pr-backports-created: Backport PRs are successfully created, it won't be processed by CI script anymore
  • pr-critical-bugfix
  • pr-must-backport: Pull request should be backported intentionally. Use this label with great care!
  • pr-must-backport-synced: The `*-must-backport` labels are synced into the cloud Sync PR
  • pr-synced-to-cloud: The PR is synced to the cloud repo

4 participants