[FLINK-31691][table] Add built-in MAP_FROM_ENTRIES function. #26777

Open: wants to merge 1 commit into master

liuyongvs (Contributor) commented Jul 10, 2025

  • What is the purpose of the change
    This is an implementation of MAP_FROM_ENTRIES

  • Brief change log
    MAP_FROM_ENTRIES for Table API and SQL

map_from_entries(array_of_rows) - Returns a map created from an array of rows with two fields each. Each row must have exactly two fields (key, then value), and the key must not be null.

Syntax:
  map_from_entries(array_of_rows)

Arguments:
  array_of_rows: an array of rows with two fields each.

Returns:
  A map created from the array of two-field rows. Each row must have exactly two fields, and the key must not be null. Returns null if the argument is null.

> SELECT map_from_entries(ARRAY[ROW(1, 'a'), ROW(2, 'b')]);
 {1=a, 2=b}
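
As a rough illustration of the semantics described above (not the code in this PR), the following plain-Java sketch builds a map from two-field entries. `MapFromEntriesSketch` and `mapFromEntries` are hypothetical names, `Map.Entry` stands in for a two-field row, and letting a later duplicate key overwrite an earlier one is an assumption, since the description does not specify it.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapFromEntriesSketch {

    /** Hypothetical helper mirroring the documented contract; not the Flink implementation. */
    static <K, V> Map<K, V> mapFromEntries(List<Map.Entry<K, V>> entries) {
        if (entries == null) {
            return null; // a null argument yields null, as stated in the description
        }
        Map<K, V> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            if (e.getKey() == null) {
                // the description requires non-null keys
                throw new IllegalArgumentException("map key must not be null");
            }
            // ASSUMPTION: a later duplicate key overwrites the earlier value
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, String>> rows =
                List.of(Map.entry(1, "a"), Map.entry(2, "b"));
        System.out.println(mapFromEntries(rows)); // prints {1=a, 2=b}
    }
}
```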

See also
Presto: https://prestodb.io/docs/current/functions/map.html
Spark: https://spark.apache.org/docs/latest/api/sql/index.html#map_from_entries

  • Verifying this change
    This change added tests in MapFunctionITCase.

  • Does this pull request potentially affect one of the following parts:
    Dependencies (does it add or upgrade a dependency): (no)
    The public API, i.e., is any changed class annotated with @Public(Evolving): (yes)
    The serializers: (no)
    The runtime per-record code paths (performance sensitive): (no)
    Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
    The S3 file system connector: (no)

  • Documentation
    Does this pull request introduce a new feature? (yes)
    If yes, how is the feature documented? (docs)

flinkbot (Collaborator) commented Jul 10, 2025

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

* <pre>{@code
* table.select(
* mapFromEntries(
* array(row(key1, 1), row(key2, 2), row(key3, 3))

davidradl (Contributor) commented Jul 11, 2025

I am wondering what the constraints are for this:

  • What is the behaviour for duplicate keys? This should be documented; I assume we either take the first key's value or raise an error. We should add tests for this.
  • Are there any constraints on the key type? This should be documented and tests added. How would it fail if the key type was not a valid one? For example, I would think a nested row is not appropriate for a key, but primitives excluding boolean would seem reasonable.
  • Do we check that the rows are of the same shape? We should test how (or if) this fails and document it.
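
The two duplicate-key behaviours mentioned in the first bullet (keep the first value, or raise an error) can be sketched in plain Java as follows. This is only an illustration of the options, not what the PR implements; `DuplicateKeyPolicySketch`, `Policy`, and `build` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateKeyPolicySketch {

    /** Hypothetical policies matching the two options raised in the review. */
    enum Policy { FIRST_WINS, FAIL }

    static <K, V> Map<K, V> build(List<Map.Entry<K, V>> entries, Policy policy) {
        Map<K, V> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            if (result.containsKey(e.getKey())) {
                if (policy == Policy.FAIL) {
                    throw new IllegalStateException("duplicate map key: " + e.getKey());
                }
                continue; // FIRST_WINS: keep the value that is already stored
            }
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> rows =
                List.of(Map.entry("k", 1), Map.entry("k", 2));
        System.out.println(build(rows, Policy.FIRST_WINS)); // prints {k=1}
    }
}
```

A test in MapFunctionITCase could pin down whichever of these behaviours the function actually has, so that the documentation and the implementation cannot drift apart.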

github-actions bot added and then removed the community-reviewed label (PR has been reviewed by the community.) on Jul 11, 2025
Labels
community-reviewed PR has been reviewed by the community.
3 participants