-
Notifications
You must be signed in to change notification settings - Fork 1.8k
feat(spark): implement Spark map function map_from_entries
#17779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(spark): implement Spark map function map_from_entries
#17779
Conversation
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, some suggestions for more tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add tests for SELECT map_from_entries(NULL) and also when you have a null key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there is a test from ([], null) we can also have otherwise (null, [])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for examples!
Added some tests for input with nulls:
- nulltype instead of array (failed as expected)
- array with nulltype instead of struct (failed as expected)
- array with null key (failed as expected)
- array with null entries - this was failing instead of returning correct result
so added and rewritten some code to fix it
| ---- | ||
| {outer_key1: {inner_a: 1, inner_b: 2}, outer_key2: {inner_x: 10, inner_y: 20, inner_z: 30}} | ||
|
|
||
| # Test with duplicate keys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, last win strategy in action
comphead
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @SparkApplicationMaster I think it is LGTM
…tial offsets, add tests
Which issue does this PR close?
part of #15914
Rationale for this change
Migrate spark functions from https://github.com/lakehq/sail/ to datafusion engine to unify codebase
What changes are included in this PR?
implement spark udf
map_from_entrieshttps://spark.apache.org/docs/latest/api/sql/index.html#map_from_entries
Are these changes tested?
sqllogictests added
Are there any user-facing changes?
map_from_entries(key_value_struct_list)now can be called in queries