Fix projection matching regression from removeTrivialWrappers in appendExpression #104317

Open
amosbird wants to merge 1 commit into ClickHouse:master from amosbird:fix-projection-prewhere

Collaborator

@amosbird amosbird commented May 7, 2026

Summary

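PR #88798 added removeTrivialWrappers inside QueryDAG::appendExpression to strip materialize/identity wrappers, enabling aggregate projection matching through views. However, calling it during incremental DAG merging (once per expression step) breaks the name chain between steps: early stripping changes output names (e.g. materialize(val) → val) before the next mergeInplace call, which still expects the original names. This causes three different failures depending on query shape:

  • NOT_FOUND_COLUMN_IN_BLOCK with UNION ALL views (#104117)
  • LOGICAL_ERROR (block structure mismatch) with window functions (#104235)
  • AMBIGUOUS_COLUMN_NAME with alias-column name collisions (#104256)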

The fix moves removeTrivialWrappers from appendExpression (called per-step) to the end of build (after the full DAG is assembled and filter nodes are wired into outputs). This preserves the original fix for #32753 (projection matching through views) while avoiding the name resolution breakage during incremental merging.
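The name-chain breakage can be illustrated with a small toy model. This is only a sketch under stated assumptions, not ClickHouse code: `Step`, `stripTrivialWrapper`, `firstUnresolvedInput`, and `wrapperExample` are hypothetical names, columns are modeled as plain strings, and "merging" is reduced to checking that each step's inputs exist among the previous step's outputs.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model (not the real QueryDAG): each step reads columns
// by name and produces columns by name.
struct Step
{
    std::vector<std::string> inputs;
    std::vector<std::string> outputs;
};

// Strip one outer materialize(...) or identity(...) wrapper from a name.
std::string stripTrivialWrapper(const std::string & name)
{
    for (const std::string prefix : {"materialize(", "identity("})
    {
        const bool has_prefix = name.compare(0, prefix.size(), prefix) == 0;
        if (has_prefix && name.size() > prefix.size() + 1 && name.back() == ')')
            return name.substr(prefix.size(), name.size() - prefix.size() - 1);
    }
    return name;
}

// Merge steps one by one. Returns the first input name that cannot be found
// among the previous step's outputs, or an empty string if all resolve.
// With strip_per_step == true, wrappers are stripped after every step,
// which mimics the early stripping described above.
std::string firstUnresolvedInput(const std::vector<Step> & steps, bool strip_per_step)
{
    std::set<std::string> available;
    bool first = true;
    for (const auto & step : steps)
    {
        if (!first)
            for (const auto & input : step.inputs)
                if (available.count(input) == 0)
                    return input;
        first = false;

        available.clear();
        for (const auto & output : step.outputs)
            available.insert(strip_per_step ? stripTrivialWrapper(output) : output);
    }
    return "";
}

void wrapperExample()
{
    const std::vector<Step> steps = {
        Step{{"val"}, {"materialize(val)"}},
        Step{{"materialize(val)"}, {"materialize(val)"}},
    };

    // Per-step stripping renames the output to "val", so the second step
    // can no longer find "materialize(val)".
    assert(firstUnresolvedInput(steps, true) == "materialize(val)");

    // Without per-step stripping the chain resolves, and the wrapper can
    // still be removed once at the end.
    assert(firstUnresolvedInput(steps, false).empty());
    assert(stripTrivialWrapper(steps.back().outputs.front()) == "val");
}
```

In this model, deferring the strip to a single final pass keeps every intermediate lookup intact while still yielding the unwrapped name that projection matching needs.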

Note on #104256 empty results: The empty result for queries of the form SELECT toString(x) AS x ... WHERE x IN ... is pre-existing analyzer semantics: the alias shadows the source column name in WHERE regardless of projections. This is controlled by prefer_column_name_to_alias (default 0 = prefer alias). The regression from #88798 was only the exception (AMBIGUOUS_COLUMN_NAME), not the empty result.
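The alias-shadowing rule in the note above can be sketched as a toy resolver. This is an illustration only, not the ClickHouse analyzer: `resolveInWhere` and `aliasExample` are hypothetical names, and the model assumes every identifier also exists as a source column.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy resolver: `aliases` maps a SELECT alias to the expression
// it names. Assumption: the identifier is always a valid source column too.
std::string resolveInWhere(
    const std::string & identifier,
    const std::map<std::string, std::string> & aliases,
    bool prefer_column_name_to_alias)
{
    const auto it = aliases.find(identifier);
    if (it != aliases.end() && !prefer_column_name_to_alias)
        return it->second; // the alias shadows the column
    return identifier;     // the source column wins
}

void aliasExample()
{
    // Models SELECT toString(x) AS x ... WHERE x IN ...
    const std::map<std::string, std::string> aliases = {{"x", "toString(x)"}};

    // Default (0): x in WHERE refers to the alias, i.e. the string expression.
    assert(resolveInWhere("x", aliases, false) == "toString(x)");

    // With the setting enabled (1): x refers to the original column.
    assert(resolveInWhere("x", aliases, true) == "x");
}
```

Under the default, the WHERE clause in this model filters on the string expression rather than the original column, which is consistent with the empty result described in the note.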

Closes #104117
Closes #104235
Closes #104256

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix exceptions (NOT_FOUND_COLUMN_IN_BLOCK, LOGICAL_ERROR, AMBIGUOUS_COLUMN_NAME) when using projections with UNION ALL views, window functions, or alias-column name collisions. Regression from #88798.

Documentation entry for user-facing changes

  • Documentation is not required

…ppendExpression`

Move `removeTrivialWrappers` from `QueryDAG::appendExpression` (called
per-step during incremental DAG merging) to the end of `QueryDAG::build`
(after the full DAG is assembled).

When called during `appendExpression`, the early stripping of
`materialize`/`identity` wrappers changes output names before subsequent
`mergeInplace` calls, breaking name resolution across expression steps.
This caused NOT_FOUND_COLUMN_IN_BLOCK with UNION ALL views (ClickHouse#104117),
block structure mismatch with window functions (ClickHouse#104235), and
AMBIGUOUS_COLUMN_NAME with alias collisions (ClickHouse#104256).

Closes ClickHouse#104117
Closes ClickHouse#104235
Closes ClickHouse#104256
Contributor

clickhouse-gh Bot commented May 7, 2026

Workflow [PR], commit [9139e41]


AI Review

Summary

This PR moves removeTrivialWrappers from per-step QueryDAG::appendExpression to the end of QueryDAG::build, which matches the stated intent to avoid breaking incremental DAG merge/name resolution while keeping wrapper cleanup before projection analysis. The change is small, targeted, and accompanied by a new stateless regression test that covers the three reported exceptions; I did not find correctness, safety, or compatibility issues in the patch itself.

Missing context
  • ⚠️ No CI logs/results were available in this review context, so CI health could not be independently validated here.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
No large/binary files
Final Verdict

Status: ✅ Approve

clickhouse-gh Bot added the pr-bugfix label (Pull request with bugfix, not backported by default) May 7, 2026
Contributor

clickhouse-gh Bot commented May 7, 2026

LLVM Coverage Report

| Metric | Baseline | Current | Δ |
|---|---|---|---|
| Lines | 84.10% | 84.10% | +0.00% |
| Functions | 91.10% | 91.10% | +0.00% |
| Branches | 76.60% | 76.60% | +0.00% |

Changed lines: 100.00% (7/7) | lost baseline coverage: 1 line(s)