[FLINK-31118][table] Add ARRAY_UNION function. #22483

Merged
merged 1 commit into from May 16, 2023

Conversation

liuyongvs
Contributor

@liuyongvs liuyongvs commented Apr 25, 2023

  • What is the purpose of the change
    This is an implementation of ARRAY_UNION

  • Brief change log
    ARRAY_UNION for Table API and SQL

Returns an array of the elements in the union of array1 and array2, without duplicates. 

Syntax:
array_union(array1, array2)

Arguments:
array1, array2: The ARRAYs to be combined.

Returns:
An ARRAY. If either array is null, the function returns null.
Examples:

> SELECT array_union(array(1, 2, 3), array(1, 3, 5));
 [1,2,3,5] 
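
The semantics described above can be sketched in plain Java. This is only an illustration, not the code added by this PR: the class `ArrayUnionSketch` and its method are hypothetical names, and keeping elements in first-seen order is an assumption that happens to match the example output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Flink.
class ArrayUnionSketch {

    // Returns the distinct elements of both lists in first-seen order,
    // or null if either input is null (as stated in the description).
    static <T> List<T> arrayUnion(List<T> array1, List<T> array2) {
        if (array1 == null || array2 == null) {
            return null;
        }
        // LinkedHashSet drops duplicates while preserving insertion order.
        Set<T> seen = new LinkedHashSet<>(array1);
        seen.addAll(array2);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // prints [1, 2, 3, 5]
        System.out.println(arrayUnion(Arrays.asList(1, 2, 3), Arrays.asList(1, 3, 5)));
        // prints null
        System.out.println(arrayUnion(null, Arrays.asList(1)));
    }
}
```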

See also
spark https://spark.apache.org/docs/latest/api/sql/index.html#array_union
presto https://prestodb.io/docs/current/functions/array.html

  • Verifying this change
    This change added tests in CollectionFunctionsITCase.

  • Does this pull request potentially affect one of the following parts:
    Dependencies (does it add or upgrade a dependency): (no)
    The public API, i.e., is any changed class annotated with @Public(Evolving): (yes)
    The serializers: (no)
    The runtime per-record code paths (performance sensitive): (no)
    Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
    The S3 file system connector: (no)

  • Documentation
    Does this pull request introduce a new feature? (yes)
    If yes, how is the feature documented? (docs)

@flinkbot
Collaborator

flinkbot commented Apr 25, 2023

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@liuyongvs
Contributor Author

liuyongvs commented Apr 25, 2023

Hi @snuyanzin, I submitted this new PR and closed #21958.
The new implementation handles type conversion and follows the Spark implementation: https://github.com/apache/spark/blob/50b652e241f7e31b99303359ec53e26a8989a4f0/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala#L4030
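
To illustrate why type conversion matters for a union, here is a minimal Java sketch. It is neither the Flink code from this PR nor the linked Spark code: the class and method names are hypothetical, and it assumes the two inputs are an INT-like list and a BIGINT-like list whose common type is `Long`. Without widening, `Integer` 2 and `Long` 2 are not equal in Java, so the duplicate survives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not Flink's or Spark's implementation.
class TypedUnionSketch {

    // Widens each Integer to Long, keeping null elements as null.
    static List<Long> widen(List<Integer> ints) {
        List<Long> out = new ArrayList<>();
        for (Integer i : ints) {
            out.add(i == null ? null : Long.valueOf(i.longValue()));
        }
        return out;
    }

    // Union after converting both sides to the assumed common type (Long).
    static List<Long> unionAfterWidening(List<Integer> ints, List<Long> longs) {
        if (ints == null || longs == null) {
            return null;
        }
        Set<Long> seen = new LinkedHashSet<>(widen(ints));
        seen.addAll(longs);
        return new ArrayList<>(seen);
    }

    // Union without conversion: Integer and Long values never compare equal.
    static List<Object> naiveUnion(List<Integer> ints, List<Long> longs) {
        Set<Object> seen = new LinkedHashSet<>(ints);
        seen.addAll(longs);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2);
        List<Long> b = Arrays.asList(2L, 3L);
        System.out.println(naiveUnion(a, b));         // prints [1, 2, 2, 3]
        System.out.println(unionAfterWidening(a, b)); // prints [1, 2, 3]
    }
}
```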

@@ -643,6 +643,9 @@ collection:
   - sql: ARRAY_REVERSE(haystack)
     table: haystack.arrayReverse()
     description: Returns an array in reverse order. If the array itself is null, the function will return null.
+  - sql: ARRAY_UNION(array1, array2)
+    table: haystack.arrayUnion(array)
Contributor

nit: haystack does not make sense in this context imo. It's a nice joke but imo works only for ARRAY_CONTAINS

Contributor

@dawidwys dawidwys left a comment

LGTM

3 participants