[SPARK-40470][SQL] Handle GetArrayStructFields and GetMapValue in "arrays_zip" function #37911

Closed


sadikovi
Contributor

What changes were proposed in this pull request?

This is a follow-up for #37833.

This PR fixes the column names produced by the arrays_zip function when its arguments are GetArrayStructFields or GetMapValue expressions (see the unit tests for details).

Before the patch, the column names would be indexes or, in the GetArrayStructFields example, an AnalysisException would be thrown.
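
To make the naming rule concrete, here is a minimal toy model in plain Python. It is not Spark's code: the classes only borrow the expression names from this PR, `zip_field_names` and the `patched` flag are hypothetical, and using the map key as the field name for GetMapValue is an assumption based on the description above.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Attribute:
    """A plain column reference, e.g. `col_a`."""
    name: str


@dataclass
class GetArrayStructFields:
    """Extracts one field from every struct in an array, e.g. `arr.field`."""
    child: Attribute
    field_name: str


@dataclass
class GetMapValue:
    """Looks up a key in a map column, e.g. `map_col.key`."""
    child: Attribute
    key: str


@dataclass
class Other:
    """Any expression with no obvious name, e.g. `array(1, 2)`."""
    sql: str


Expr = Union[Attribute, GetArrayStructFields, GetMapValue, Other]


def zip_field_names(args: List[Expr], patched: bool = True) -> List[str]:
    """Return the struct field name assigned to each arrays_zip argument.

    Hypothetical helper: with patched=False the two extractor expressions
    fall through to the positional index, which mimics the index-naming
    symptom described in this PR.
    """
    names = []
    for index, arg in enumerate(args):
        if isinstance(arg, Attribute):
            names.append(arg.name)
        elif patched and isinstance(arg, GetArrayStructFields):
            names.append(arg.field_name)
        elif patched and isinstance(arg, GetMapValue):
            # Assumption: the literal map key becomes the field name.
            names.append(arg.key)
        else:
            # Fallback: name the field after its position.
            names.append(str(index))
    return names
```

Calling `zip_field_names` with `patched=False` on a GetMapValue or GetArrayStructFields argument yields its positional index, while the default returns the key or field name. The sketch does not reproduce the AnalysisException mentioned above.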

Why are the changes needed?

Fixes an inconsistency in Spark 3.2 and later where the fields would be labeled with indexes instead of column names.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

I added unit tests that reproduce the issue and confirmed that the patch fixes them.

@github-actions bot added the SQL label on Sep 16, 2022
@HyukjinKwon changed the title from [SPARK-40470] Handle GetArrayStructFields and GetMapValue in "arrays_zip" function to [SPARK-40470][SQL] Handle GetArrayStructFields and GetMapValue in "arrays_zip" function on Sep 16, 2022
@AmplabJenkins

Can one of the admins verify this patch?

@HyukjinKwon
Member

HyukjinKwon commented Sep 16, 2022

Merged to master, branch-3.3 and branch-3.2.

HyukjinKwon pushed a commit that referenced this pull request Sep 16, 2022
…rays_zip" function

Closes #37911 from sadikovi/SPARK-40470.

Authored-by: Ivan Sadikov <ivan.sadikov@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 9b0f979)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@sadikovi
Contributor Author

Thank you, @HyukjinKwon!

LuciferYang pushed a commit to LuciferYang/spark that referenced this pull request Sep 20, 2022
…rays_zip" function

sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
…rays_zip" function
