[SPARK-50061][SQL] Enable analyze table for collated columns#48586

Closed
stevomitric wants to merge 3 commits intoapache:masterfrom
stevomitric:stevomitric/analyze-fix
@stevomitric
Contributor

What changes were proposed in this pull request?

This PR enables the ANALYZE TABLE command for collated string columns. The current implementation already collects column statistics through collation-aware Aggregate expressions, so this PR only enables that aggregation for collated strings.

Why are the changes needed?

To enable the ANALYZE TABLE command for collated string columns.

Does this PR introduce any user-facing change?

Yes. Currently, running:

ANALYZE TABLE test_table COMPUTE STATISTICS FOR COLUMNS c

where c is a collated string column, fails because the data type is reported as unsupported. This PR fixes that and enables the command.
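
For illustration, a minimal sketch of the scenario might look like the following. Only the table name `test_table` and the column `c` come from the description above; the `UTF8_LCASE` collation, the Parquet format, and the inserted rows are assumptions made for the example, and no particular statistics output is implied.

```sql
-- Hypothetical setup: the PR description does not show the table definition.
-- UTF8_LCASE is assumed here as an example of a non-default collation.
CREATE TABLE test_table (c STRING COLLATE UTF8_LCASE) USING PARQUET;

-- Illustrative rows only.
INSERT INTO test_table VALUES ('a'), ('A'), ('b');

-- The command from the description; before this change it failed
-- with an unsupported data type error for the collated column.
ANALYZE TABLE test_table COMPUTE STATISTICS FOR COLUMNS c;

-- Inspect the column-level statistics that were collected.
DESCRIBE EXTENDED test_table c;
```

Because the statistics come from collation-aware aggregation, values such as `distinct_count` should reflect the column's collation rather than a byte-wise comparison, though the exact numbers depend on the data and on Spark's approximate counting.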

How was this patch tested?

New test in this PR.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Oct 21, 2024
Member

@MaxGekk MaxGekk left a comment

Waiting for CI.

@HyukjinKwon
Member

I think you should sync your branch to the latest master.

@MaxGekk
Member

MaxGekk commented Oct 25, 2024

+1, LGTM. Merging to master.
Thank you, @stevomitric.

@MaxGekk MaxGekk closed this in 413a65b Oct 25, 2024
3 participants