Skip to content

fix: update clickbench expected plan for NDV-aware optimization#21050

Merged
timsaucer merged 1 commit intoapache:mainfrom
asolimando:asolimando/fix-clickbench-ndv-plan
Mar 19, 2026
Merged

fix: update clickbench expected plan for NDV-aware optimization#21050
timsaucer merged 1 commit intoapache:mainfrom
asolimando:asolimando/fix-clickbench-ndv-plan

Conversation

@asolimando
Copy link
Member

Which issue does this PR close?

Fixes CI breakage on main introduced by #19957.

Rationale for this change

#19957 introduced NDV extraction from Parquet metadata. The optimizer now sees NDV=1 for HitColor, BrowserCountry, BrowserLanguage in the clickbench test file and short-circuits COUNT(DISTINCT) to a constant projection, skipping the full table scan.

What changes are included in this PR?

Updates the expected EXPLAIN plan in clickbench.slt to match the new (better) physical plan:

-   01)AggregateExec: mode=Single, gby=[], aggr=[count(DISTINCT hits.HitColor), ...]
-   02)--DataSourceExec: file_groups={1 group: [...]}, projection=[HitColor, BrowserLanguage, BrowserCountry], file_type=parquet
+   01)ProjectionExec: expr=[1 as count(DISTINCT hits.HitColor), 1 as count(DISTINCT hits.BrowserCountry), 1 as count(DISTINCT hits.BrowserLanguage)]
+   02)--PlaceholderRowExec

Are these changes tested?

This PR is the test fix. Verified locally with cargo test --profile ci -p datafusion-sqllogictest --test sqllogictests.

Are there any user-facing changes?

No.

apache#19957 introduced NDV extraction from Parquet metadata. The optimizer
now sees NDV=1 for HitColor, BrowserCountry, BrowserLanguage in the
clickbench test file and short-circuits COUNT(DISTINCT) to a constant
projection, skipping the full table scan.
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Mar 19, 2026
@asolimando
Copy link
Member Author

asolimando commented Mar 19, 2026

(cc: @timsaucer as you reported the problem in CI and you have rights to merge the PR if it looks OK to you, tested locally but I might have missed something, waiting for CI)

Copy link
Member

@timsaucer timsaucer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@timsaucer timsaucer added this pull request to the merge queue Mar 19, 2026
Merged via the queue into apache:main with commit 4ae19eb Mar 19, 2026
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants