Always resolve current field name in SparkSQL when creating Row objects inside of arrays #2158

Merged

@jbaiera jbaiera commented Nov 9, 2023

This PR is meant to fix #2157

The solution employed is to always resolve the current field name in the value reader when creating Row objects inside of arrays.

Tests were expanded to ensure that no other code was relying on this erroneous behavior. Every instance of it in the test environment has been confirmed as incorrect and cleaned up, and further validation has been put in place.
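
For readers unfamiliar with the reader internals, here is a rough, illustrative model of the idea in Python. It is not the connector's code: the class `RowReaderSketch`, its methods, and the `row_columns` schema layout are all hypothetical names made up for this sketch. It only shows the general pattern of looking up the current field name every time a row is created inside an array, and failing loudly when there is no current field.

```python
class RowReaderSketch:
    """Toy model (hypothetical, not the real reader) that builds ordered rows."""

    def __init__(self, row_columns):
        # Assumed schema layout: dotted field path -> ordered column names.
        # The empty string stands for the document root.
        self.row_columns = row_columns
        self.field_stack = []  # names of the fields currently being read
        self.in_array = 0      # array nesting depth

    def create_row(self):
        # Inside an array there must be a field that owns the array.
        if self.in_array and not self.field_stack:
            raise ValueError("no current field while creating a row in an array")
        # Resolve the field name from the stack on every call instead of
        # reusing whatever name was seen last.
        field = ".".join(self.field_stack)
        return {column: None for column in self.row_columns[field]}

    def read(self, value):
        if isinstance(value, dict):
            row = self.create_row()
            for name, child in value.items():
                self.field_stack.append(name)
                try:
                    row[name] = self.read(child)
                finally:
                    self.field_stack.pop()
            return row
        if isinstance(value, list):
            self.in_array += 1
            try:
                return [self.read(item) for item in value]
            finally:
                self.in_array -= 1
        return value


# An array of objects where each element contains an empty nested object.
schema = {"": ["id", "items"], "items": ["meta", "name"], "items.meta": []}
doc = {"id": 1, "items": [{"meta": {}, "name": "a"}, {"meta": {}, "name": "b"}]}
result = RowReaderSketch(schema).read(doc)
```

In this toy model the second array element still gets the `items` column order even though the last object read before it was the empty `meta` object, because the field name is recomputed from the stack rather than remembered.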

@masseyke masseyke left a comment

I had one question, but it looks good to me!

@jbaiera jbaiera added v8.11.2 and removed v8.11.1 labels Nov 16, 2023
@jbaiera jbaiera merged commit 668fcc1 into elastic:main Nov 16, 2023
3 checks passed
@jbaiera jbaiera deleted the fix-spark-nested-object-empty-object-bug branch November 16, 2023 23:32
jbaiera commented Nov 16, 2023

💚 All backports created successfully

Status Branch Result
8.11

Questions?

Please refer to the Backport tool documentation

jbaiera added a commit to jbaiera/elasticsearch-hadoop that referenced this pull request Nov 16, 2023
…ts inside of arrays (elastic#2158)

* Resolve current field every time when creating Row objects in arrays.

* Expand tests to ensure no breakages

* Mirror changes to the sql-20 source root

* Throw an exception for the case where there is no current field when creating a map for an array.

(cherry picked from commit 668fcc1)
jbaiera added a commit to jbaiera/elasticsearch-hadoop that referenced this pull request Nov 16, 2023
…ts inside of arrays (elastic#2158)

* Resolve current field every time when creating Row objects in arrays.

* Expand tests to ensure no breakages

* Mirror changes to the sql-20 source root

* Throw an exception for the case where there is no current field when creating a map for an array.

(cherry picked from commit 668fcc1)
@jbaiera jbaiera removed the v7.17.15 label Nov 27, 2023
breskeby pushed a commit to breskeby/elasticsearch-hadoop that referenced this pull request Dec 1, 2023
…ts inside of arrays (elastic#2158) (elastic#2166)

* Resolve current field every time when creating Row objects in arrays.

* Expand tests to ensure no breakages

* Mirror changes to the sql-20 source root

* Throw an exception for the case where there is no current field when creating a map for an array.

(cherry picked from commit 668fcc1)
Successfully merging this pull request may close these issues.

Nested objects fail parsing in Spark SQL when empty objects present
2 participants