
[AURON #2153] Implement native function of map_from_entries #2169

Merged
slfan1989 merged 1 commit into apache:master from weimingdiit:feat/map_from_entries-native-function on Apr 9, 2026



@weimingdiit
Contributor

Which issue does this PR close?

Closes #2153

Rationale for this change

map_from_entries(...) was not supported in Auron’s native execution path.
This PR extends native coverage for Spark map functions using the existing extension-function pattern already used for Spark-specific functions such as map_concat(...). The goal is to support this function natively while preserving Spark-compatible behavior.

What changes are included in this PR?

This PR:

  • adds MapFromEntries conversion in NativeConverters
  • passes Spark’s spark.sql.mapKeyDedupPolicy to the native implementation
  • registers Spark_MapFromEntries in datafusion-ext-functions
  • implements map_from_entries(...) in spark_map.rs
  • handles Spark-compatible semantics for:
    • null input array -> null result
    • null entry inside the input array -> null result
    • null key -> error
    • duplicate keys -> error by default
    • duplicate keys with LAST_WIN -> last value wins
    • null values -> allowed
  • adds Scala regression tests in AuronFunctionSuite
  • adds Rust unit tests in spark_map.rs
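
The semantics listed above can be sketched at row level in plain Rust. This is a minimal model, not the code in spark_map.rs: the native implementation works on Arrow arrays, and the names `DedupPolicy`, `Entries`, `map_from_entries_row`, and the error strings are assumptions made for illustration. The sketch also assumes that null entries are checked before null keys, and that LAST_WIN keeps a key's first position while overwriting its value.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Mirrors the two values of `spark.sql.mapKeyDedupPolicy` (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DedupPolicy {
    Exception,
    LastWin,
}

/// One input row: `None` is a null array; each entry may be null, and each
/// entry is a (key, value) pair whose key and value may be null.
pub type Entries<K, V> = Option<Vec<Option<(Option<K>, Option<V>)>>>;

/// Row-level model of the listed semantics (illustrative only).
pub fn map_from_entries_row<K, V>(
    row: Entries<K, V>,
    policy: DedupPolicy,
) -> Result<Option<Vec<(K, Option<V>)>>, String>
where
    K: Eq + Hash + Clone,
{
    // Null input array -> null result.
    let Some(entries) = row else {
        return Ok(None);
    };

    // A null entry anywhere in the array -> null result.
    // Assumption: this check runs before any key validation.
    if entries.iter().any(|e| e.is_none()) {
        return Ok(None);
    }

    // `index` maps each key to its position in `out`, so insertion order is kept.
    let mut index: HashMap<K, usize> = HashMap::new();
    let mut out: Vec<(K, Option<V>)> = Vec::new();

    for (key, value) in entries.into_iter().flatten() {
        // Null key -> error (hypothetical message).
        let key = key.ok_or_else(|| "map key cannot be null".to_string())?;

        match index.get(&key).copied() {
            Some(pos) => match policy {
                // Duplicate keys -> error by default (hypothetical message).
                DedupPolicy::Exception => {
                    return Err("duplicate map key found".to_string());
                }
                // LAST_WIN -> overwrite the earlier value in place.
                DedupPolicy::LastWin => out[pos].1 = value,
            },
            None => {
                index.insert(key.clone(), out.len());
                // Null values are allowed and stored as `None`.
                out.push((key, value));
            }
        }
    }

    Ok(Some(out))
}
```

Calling `map_from_entries_row` with the entries `(1, "a"), (2, "b"), (1, "c")` returns an error under `DedupPolicy::Exception` and the pairs `(1, "c"), (2, "b")` under `DedupPolicy::LastWin`.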

Are there any user-facing changes?

Queries using map_from_entries(arrayOfEntries) can now run through Auron’s native extension-function path instead of falling back or remaining unsupported.

How was this patch tested?

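CI, together with the Scala regression tests in AuronFunctionSuite and the Rust unit tests in spark_map.rs added by this PR.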

Contributor

Copilot AI left a comment


Pull request overview

Adds native execution-path support for Spark map_from_entries(...) via Auron’s extension-function mechanism, including Spark-compatible null/duplicate-key semantics and regression/unit test coverage.

Changes:

  • Adds Spark-side conversion for MapFromEntries, passing spark.sql.mapKeyDedupPolicy into the native call.
  • Registers and implements Spark_MapFromEntries in datafusion-ext-functions with Spark-compatible semantics.
  • Adds Scala regression tests and Rust unit tests covering null handling and dedup policies.
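
Since the dedup policy crosses from Spark into the native function, the native side has to turn the configured value into something it can branch on. The following is a small sketch of that step, assuming the policy arrives as a string. How the PR actually encodes the argument is not shown on this page, and `DedupPolicy` and `parse_dedup_policy` are hypothetical names. `EXCEPTION` and `LAST_WIN` are the two values Spark accepts for `spark.sql.mapKeyDedupPolicy`.

```rust
/// The two values of `spark.sql.mapKeyDedupPolicy` (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DedupPolicy {
    Exception,
    LastWin,
}

/// Parses the policy string case-insensitively; anything else is rejected
/// rather than silently defaulted, so a mismatch surfaces early.
pub fn parse_dedup_policy(raw: &str) -> Result<DedupPolicy, String> {
    match raw.trim().to_ascii_uppercase().as_str() {
        "EXCEPTION" => Ok(DedupPolicy::Exception),
        "LAST_WIN" => Ok(DedupPolicy::LastWin),
        other => Err(format!(
            "unsupported spark.sql.mapKeyDedupPolicy: {other}"
        )),
    }
}
```

Rejecting unknown values is a design choice of this sketch; an implementation could equally fall back to the `EXCEPTION` default.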

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| spark-extension/src/main/scala/org/apache/spark/sql/auron/NativeConverters.scala | Converts Spark MapFromEntries into Spark_MapFromEntries, passing dedup policy. |
| spark-extension-shims-spark/src/test/scala/org/apache/auron/AuronFunctionSuite.scala | Adds end-to-end Spark-vs-native regression tests for map_from_entries. |
| native-engine/datafusion-ext-functions/src/spark_map.rs | Implements native map_from_entries and adds Rust unit tests. |
| native-engine/datafusion-ext-functions/src/lib.rs | Registers Spark_MapFromEntries extension function. |


Signed-off-by: weimingdiit <weimingdiit@gmail.com>
weimingdiit force-pushed the feat/map_from_entries-native-function branch from 26c23de to 8caebc4 on April 8, 2026 07:47
weimingdiit marked this pull request as ready for review on April 8, 2026 07:51
slfan1989 merged commit 5814d40 into apache:master on Apr 9, 2026
123 checks passed
@slfan1989
Contributor

@weimingdiit Thanks for the contribution! Merged into master.

weimingdiit deleted the feat/map_from_entries-native-function branch on April 10, 2026 01:36
@weimingdiit
Contributor Author

@slfan1989 Thanks for your review and merge.



Development

Successfully merging this pull request may close these issues.

Implement native function of map_from_entries

3 participants