Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-45827] Disallow partitioning on Variant column #44742

Closed
wants to merge 2 commits into from

Conversation

cashmand
Copy link
Contributor

@cashmand cashmand commented Jan 15, 2024

What changes were proposed in this pull request?

Follow-up to #43984: we should not allow partitioning on VariantType. Even though it is is an atomic type, it represents a nested semi-structured value, so not partitioning is consistent with our decision to not allow partitioning on nested types. Also, for now at least, it is not even comparable, so attempting to partition fails with a confusing codegen error about a method named compare not being declared.

Why are the changes needed?

Improves error message when attempting to partition on Variant, and explicitly forbids a case that we do not intend to support.

Does this PR introduce any user-facing change?

Improved error message if a user tries to partition on Variant.

How was this patch tested?

Added unit test, which fails without the code change.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Jan 15, 2024
@cashmand
Copy link
Contributor Author

@cloud-fan, could you please review, or suggest a reviewer? I'm not sure what the On pull request update check failure means, or if there's anything I should do about it.

withTempDir { dir =>
val tempDir = new File(dir, "files").getCanonicalPath
intercept[AnalysisException] {
query.write.partitionBy("v").parquet(tempDir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we test more cases? e.g. .saveAsTable, and the SQL CTAS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, I added the extra cases to the test.

@cloud-fan
Copy link
Contributor

the failure is unrelated, merging to master!

@cloud-fan cloud-fan closed this in 689ab0e Jan 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants