[SPARK-48649][SQL] Add "ignoreInvalidPartitionPaths" and "spark.sql.files.ignoreInvalidPartitionPaths" configs to allow ignoring invalid partition paths #47006
Closed
sadikovi wants to merge 1 commit into apache:master from
Contributor
Author
cc @cloud-fan @dongjoon-hyun @gengliangwang for review. Thank you.
cloud-fan approved these changes on Jun 19, 2024
Contributor
thanks, merging to master!
Contributor
Author
Thanks @cloud-fan. I forgot to ask, do we need any documentation updates or migration/release note for this?
Contributor
This is not a breaking change (it's a new feature), so migration guide is not needed.
What changes were proposed in this pull request?
This PR adds a new data source option ignoreInvalidPartitionPaths and a SQL session configuration flag spark.sql.files.ignoreInvalidPartitionPaths to control whether invalid partition paths (base paths) are skipped.
When the config is enabled, invalid paths are skipped; for example, a path such as table/invalid will be ignored.
The data source option takes precedence over the SQL config, so when a read sets the option to true, the query ignores invalid partition paths (i.e. the flag is enabled) regardless of the SQL config value.
Both configs are disabled by default.
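The precedence rule and the skipping behaviour described above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not Spark's implementation: the helper names (resolve_ignore_flag, looks_like_partition_dir, select_partition_paths) and the directory names part=1 and part=2 are hypothetical, the key=value check is a deliberate simplification of partition path parsing, and raising an error when the flag is disabled is an assumption made for the sake of the example.

```python
from typing import Iterable, List, Mapping

# Names taken from the PR description.
OPTION_KEY = "ignoreInvalidPartitionPaths"
SQL_CONF_KEY = "spark.sql.files.ignoreInvalidPartitionPaths"


def _is_true(value: str) -> bool:
    """Interpret a string config value as a boolean."""
    return value.strip().lower() == "true"


def resolve_ignore_flag(options: Mapping[str, str],
                        sql_conf: Mapping[str, str]) -> bool:
    """Hypothetical helper: the data source option wins when present;
    otherwise fall back to the SQL config, which defaults to disabled."""
    if OPTION_KEY in options:
        return _is_true(options[OPTION_KEY])
    return _is_true(sql_conf.get(SQL_CONF_KEY, "false"))


def looks_like_partition_dir(name: str) -> bool:
    """Simplified assumption: a partition directory is named key=value."""
    key, sep, value = name.partition("=")
    return sep == "=" and key != "" and value != ""


def select_partition_paths(base: str, children: Iterable[str],
                           ignore_invalid: bool) -> List[str]:
    """Keep directories that parse as partitions. Invalid ones are skipped
    when the flag is on; when it is off this sketch raises (an assumption)."""
    kept = []
    for child in children:
        if looks_like_partition_dir(child):
            kept.append(f"{base}/{child}")
        elif not ignore_invalid:
            raise ValueError(f"invalid partition path: {base}/{child}")
    return kept


if __name__ == "__main__":
    # Option set to true while the SQL config is false: the option wins.
    flag = resolve_ignore_flag({OPTION_KEY: "true"}, {SQL_CONF_KEY: "false"})
    print(flag)
    print(select_partition_paths("table", ["invalid", "part=1", "part=2"], flag))
```

In this model, table/invalid is dropped from the result only because the option resolves to true; with neither setting present the flag resolves to false, matching the disabled-by-default behaviour.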
Why are the changes needed?
Allows ignoring invalid partition paths that cannot be parsed.
Does this PR introduce any user-facing change?
No. The added configs are disabled by default, so the behaviour is exactly the same as before.
How was this patch tested?
I added a unit test for this.
Was this patch authored or co-authored using generative AI tooling?
No.