@SparkApplicationMaster
Contributor

Which issue does this PR close?

part of #15914

Rationale for this change

Migrate Spark functions from https://github.com/lakehq/sail/ to the DataFusion engine to unify the codebase.

What changes are included in this PR?

Implement the Spark UDF map_from_entries:
https://spark.apache.org/docs/latest/api/sql/index.html#map_from_entries

Are these changes tested?

Yes, sqllogictests were added.

Are there any user-facing changes?

map_from_entries(key_value_struct_list) can now be called in queries.

@github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and spark labels on Sep 25, 2025
Contributor

@Jefffrey left a comment

Looks good, with some suggestions for more tests.

Contributor

Perhaps add tests for SELECT map_from_entries(NULL) and also when you have a null key?

Contributor

Yeah, there is a test for ([], null); we could also have the reverse, (null, []).

Contributor Author

@SparkApplicationMaster commented on Sep 28, 2025

Thank you for the examples!
Added some tests for input with nulls:

  1. nulltype instead of array (failed as expected)
  2. array with nulltype instead of struct (failed as expected)
  3. array with null key (failed as expected)
  4. array with null entries: this was failing instead of returning the correct result,
    so I added and rewrote some code to fix it
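
For readers following along, the null cases 3 and 4 above can be sketched in plain Rust. This is only an illustration, not the code from this PR: `Entry`, `MapError`, and `map_from_entries_nulls` are hypothetical names, `Option` stands in for SQL NULL, and the sketch assumes the Spark-style rule that any null entry makes the whole result NULL (the thread does not spell out the expected result for case 4).

```rust
/// SQL NULL is modeled with `Option`: `None` is a null entry, key, or value.
/// (Hypothetical type for illustration only.)
type Entry = Option<(Option<String>, Option<i64>)>;

#[derive(Debug, PartialEq)]
enum MapError {
    /// A non-null entry carried a null key.
    NullKey,
}

/// Hypothetical helper: builds a list of (key, value) pairs from entries.
/// Returns `Ok(None)` for a NULL result and `Err` for a null key.
fn map_from_entries_nulls(
    entries: &[Entry],
) -> Result<Option<Vec<(String, Option<i64>)>>, MapError> {
    // Assumption: any null entry makes the whole result NULL.
    if entries.iter().any(|e| e.is_none()) {
        return Ok(None);
    }
    let mut out = Vec::with_capacity(entries.len());
    // All entries are non-null here, so `flatten` skips nothing.
    for (key, value) in entries.iter().flatten() {
        match key {
            // Null keys are rejected (case 3 in the list above).
            None => return Err(MapError::NullKey),
            // Null values are allowed and kept as-is.
            Some(k) => out.push((k.clone(), *value)),
        }
    }
    Ok(Some(out))
}
```

Under these assumptions, an array containing a null entry yields `Ok(None)`, while an entry with a null key yields `Err(MapError::NullKey)`.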

----
{outer_key1: {inner_a: 1, inner_b: 2}, outer_key2: {inner_x: 10, inner_y: 20, inner_z: 30}}

# Test with duplicate keys
Contributor

Nice, the last-wins strategy in action.
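
As a rough illustration of what last-wins means for duplicate keys, here is a minimal standalone sketch. It is not the implementation from this PR: `dedup_last_wins` is a hypothetical helper, and keeping each key at the position of its first occurrence is an assumption made for the example, not something the thread confirms.

```rust
use std::collections::HashMap;

/// Hypothetical helper illustrating last-wins deduplication of map entries.
/// When a key repeats, the later value replaces the earlier one.
fn dedup_last_wins(entries: Vec<(String, i64)>) -> Vec<(String, i64)> {
    // Maps each key to its index in `out`.
    let mut position: HashMap<String, usize> = HashMap::new();
    let mut out: Vec<(String, i64)> = Vec::new();
    for (key, value) in entries {
        match position.get(&key).copied() {
            // Key already seen: overwrite the stored value in place.
            // Assumption: the key keeps the position of its first occurrence.
            Some(idx) => out[idx].1 = value,
            // First time we see this key: append it and remember its index.
            None => {
                position.insert(key.clone(), out.len());
                out.push((key, value));
            }
        }
    }
    out
}
```

With entries `("a", 1), ("b", 2), ("a", 3)`, this sketch keeps `a` mapped to 3 and `b` mapped to 2.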

Contributor

@comphead left a comment

Thanks @SparkApplicationMaster, I think this LGTM.

@Jefffrey added this pull request to the merge queue on Sep 29, 2025
Merged via the queue into apache:main with commit 2d947b3 on Sep 29, 2025
28 checks passed
3 participants