[SPARK-51525][SQL] Collation field for Desc As JSON StringType #50290

Closed
wants to merge 3 commits into from

@asl3 asl3 commented Mar 17, 2025

What changes were proposed in this pull request?

Add a `collation` field to the `StringType` entry in the Desc As JSON (`DESCRIBE ... AS JSON`) output.

For example, a column with a non-default collation:

"columns":[{"name":"c1","type":{"name":"string","collation":"UNICODE_CI"}}]

or a column with the default collation value:

"columns":[{"name":"c1","type":{"name":"string","collation":"UTF8_BINARY"}}]
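
The output shape above can be illustrated with a small, self-contained sketch. This is not Spark's implementation: `ColumnType`, `Column`, and `DescJsonSketch` are hypothetical names invented for illustration, and the JSON is assembled by hand rather than with the JSON library Spark uses. It only shows how a `collation` entry sits next to `name` inside the `type` object.

```scala
// Hypothetical, simplified model of a described column; not Spark's classes.
final case class ColumnType(name: String, collation: Option[String] = None)
final case class Column(name: String, tpe: ColumnType)

object DescJsonSketch {
  // Quote a string as a JSON string literal (escapes backslash and double quote only).
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Render the "type" object; the "collation" key is emitted only when a collation is set.
  def typeJson(t: ColumnType): String = {
    val fields = Seq("name" -> t.name) ++ t.collation.map("collation" -> _)
    fields.map { case (k, v) => s"${quote(k)}:${quote(v)}" }.mkString("{", ",", "}")
  }

  // Render the "columns" array fragment in the same shape as the examples above.
  def columnsJson(cols: Seq[Column]): String =
    cols
      .map(c => s"""{"name":${quote(c.name)},"type":${typeJson(c.tpe)}}""")
      .mkString("\"columns\":[", ",", "]")
}
```

For instance, `DescJsonSketch.columnsJson(Seq(Column("c1", ColumnType("string", Some("UNICODE_CI")))))` produces the first fragment shown above. Modeling the collation as an `Option` is a choice made for this sketch; the PR's own snippet writes `stringType.collationName` unconditionally for string types.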

Why are the changes needed?

Desc As JSON did not previously report the collation of string columns. This change adds that information to the output.

Does this PR introduce any user-facing change?

Yes. The Desc As JSON output for string columns now includes a `collation` field.

How was this patch tested?

Added a test in `DescribeTableSuite`.

Was this patch authored or co-authored using generative AI tooling?

No

@asl3 asl3 changed the title Collation field for Desc As JSON StringType [SPARK-51525][SQL] Collation field for Desc As JSON StringType Mar 17, 2025
@github-actions github-actions bot added the SQL label Mar 17, 2025
columns = Some(List(
TableColumn("c1", Type("string", collation = Some("UNICODE_CI"))),
TableColumn("c2", Type("string", collation = Some("UNICODE_RTRIM"))),
TableColumn("c3", Type("string", collation = Some("fr"))),

Curious, why is the case not normalized? For builtin collations they should be, no?

@cloud-fan

There is still a test failure

case stringType: StringType =>
JObject(
"name" -> JString("string"),
"collation" -> JString(stringType.collationName)

Is the plan to add a table collation as a top-level field to JSON in a separate PR?


@asl3 asl3 Mar 17, 2025


Yes, I'll make a separate PR to add the top-level table collation field.

@cloud-fan

thanks, merging to master/4.0!

@cloud-fan cloud-fan closed this in 513a080 Mar 18, 2025
cloud-fan pushed a commit that referenced this pull request Mar 18, 2025
Closes #50290 from asl3/asl3/collation-descasjson.

Authored-by: Amanda Liu <amanda.liu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 513a080)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>