Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-39417][SQL] Handle Null partition values in PartitioningUtils #36810

Closed
wants to merge 3 commits into from

Conversation

singhpk234
Copy link

What changes were proposed in this pull request?

We should not try casting everything returned by removeLeadingZerosFromNumberTypePartition to string, as it returns null value for the cases when partition has null value and is already replaced by DEFAULT_PARTITION_NAME

Why are the changes needed?

for null partitions where removeLeadingZerosFromNumberTypePartition is called it would throw a NPE and hence the query would fail.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added a UT, which would fail with an NPE otherwise.

@singhpk234
Copy link
Author

cc @srowen @HyukjinKwon

@singhpk234 singhpk234 changed the title SPARK-39417: Handle Null partition values in PartitioningUtils [SPARK-39417][SQL]: Handle Null partition values in PartitioningUtils Jun 8, 2022
@singhpk234 singhpk234 changed the title [SPARK-39417][SQL]: Handle Null partition values in PartitioningUtils [SPARK-39417][SQL] Handle Null partition values in PartitioningUtils Jun 8, 2022
@huaxingao
Copy link
Contributor

LGTM. Pending test results

@huaxingao
Copy link
Contributor

The Python doc generation failure seems to be irrelevant. All the other tests passed.

@cloud-fan
Copy link
Contributor

do we know which commit caused this issue? is it a 3.3 only bug?

@singhpk234
Copy link
Author

singhpk234 commented Jun 9, 2022

do we know which commit caused this issue? is it a 3.3 only bug?

@cloud-fan, This seems to be introduced via commit, SPARK-35561 and seems to only affect 3.3.0

@huaxingao
Copy link
Contributor

The python documentation failure is not related to this PR. All the other tests run successfully. I will merge this PR.

huaxingao pushed a commit that referenced this pull request Jun 9, 2022
### What changes were proposed in this pull request?

We should not try casting everything returned by `removeLeadingZerosFromNumberTypePartition` to string, as it returns null value for the cases when partition has null value and is already replaced by `DEFAULT_PARTITION_NAME`

### Why are the changes needed?

for null partitions where `removeLeadingZerosFromNumberTypePartition` is called it would throw a NPE and hence the query would fail.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added a UT, which would fail with an NPE otherwise.

Closes #36810 from singhpk234/psinghvk/fix-npe.

Authored-by: Prashant Singh <psinghvk@amazon.com>
Signed-off-by: huaxingao <huaxin_gao@apple.com>
(cherry picked from commit dcfd9f0)
Signed-off-by: huaxingao <huaxin_gao@apple.com>
@huaxingao huaxingao closed this in dcfd9f0 Jun 9, 2022
@huaxingao
Copy link
Contributor

Merged to master/3.3. Thanks a lot @singhpk234 and et al.

@huaxingao
Copy link
Contributor

@singhpk234 is this your first Spark contribution? I tried to assign the jira to you but somehow can't find you. Then I tried to add you to the Spark contributors list but found multiple users with the same name. Can you have your email?

@singhpk234
Copy link
Author

singhpk234 commented Jun 9, 2022

@singhpk234 is this your first Spark contribution?

yup

Can you have your email?

sure, my email is prashant010696@gmail.com

Many thanks @huaxingao for merging this :) !!!

Thank you all for your awesome reviews.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
7 participants