Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-41233][SQL][PYTHON] Add array_prepend function #38947

Closed
wants to merge 72 commits into from

Conversation

navinvishy
Copy link
Contributor

What changes were proposed in this pull request?

Adds a new array function array_prepend to catalyst.

Why are the changes needed?

This adds a function that exists in many SQL implementations, specifically Snowflake: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/api/snowflake.snowpark.functions.array_prepend.html

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

Added unit tests.

@navinvishy
Copy link
Contributor Author

Running ./dev/scalafmt produced a lot of formatting changes on these files. I've added comments to the portions that are relevant to the task so they are easy to find for reviewing.

@HyukjinKwon HyukjinKwon changed the title SPARK-41231: Adds an array_prepend function to catalyst [SPARK-41231][SQL] Adds an array_prepend function to catalyst Dec 7, 2022
@LuciferYang
Copy link
Contributor

LuciferYang commented Dec 7, 2022

The jira number should be SPARK-41233

@LuciferYang
Copy link
Contributor

@HyukjinKwon ArrayAppend and ArrayPrepend are special cases of ArrayInsert. Is it better to implement each function independently or consider them as a whole?

newRight: Expression): ArrayPrepend =
copy(left = newLeft, right = newRight)
override def dataType: DataType = left.dataType
override def checkInputDataTypes(): TypeCheckResult = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can refer to ArrayRemove#checkInputDataTypes here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did what ArrayContains does. Maybe we should consolidate this, since it makes sense for many of these to do the same thing? eg. ArrayContains, ArrayRemove, ArrayPrepend, ArrayAppend etc.

@navinvishy navinvishy changed the title [SPARK-41231][SQL] Adds an array_prepend function to catalyst [SPARK-41233][SQL] Adds an array_prepend function to catalyst Dec 7, 2022
@navinvishy
Copy link
Contributor Author

The jira number should be SPARK-41233

Changed. Thanks!

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-41233][SQL] Adds an array_prepend function to catalyst [SPARK-41233][SQL] Add array_prepend function Dec 11, 2022
@zhengruifeng
Copy link
Contributor

also cc @beliefer

@LuciferYang
Copy link
Contributor

@navinvishy cloud you resolve the conflict?

@beliefer
Copy link
Contributor

beliefer commented Dec 27, 2022

Could we wait array_append merged then this PR could follow some convention and make a better abstract ?
@zhengruifeng @LuciferYang @HyukjinKwon

@@ -4043,6 +4043,16 @@ object functions {
def array_compact(column: Column): Column = withExpr {
ArrayCompact(column.expr)
}
/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The indentation of comments is different from others

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LuciferYang thanks. I've made the updates. There were some failures that seem to be unrelated to this change. I reran the workflow and seems to have succeeded this time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All Passed now :)

Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM

@zhengruifeng
Copy link
Contributor

merged to master again 😄

MaxGekk pushed a commit that referenced this pull request Mar 20, 2023
…end`

### What changes were proposed in this pull request?
This pr re-generates golden files for `array_prepend` functions. It seems that the newly added case in #38947 is missing from the golden files due to lack of rebase when merging #40449.

### Why are the changes needed?
 Re-generates golden files for `array_prepend` functions to Pass GitHub Actions

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Pass GitHub Actions
- Manually checked with Scala 2.13

Closes #40492 from LuciferYang/SPARK-42791-FOLLOWUP.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
HyukjinKwon pushed a commit that referenced this pull request Mar 22, 2023
…hon client

### What changes were proposed in this pull request?

This is a follow-up of #38947.

Add `array_prepend` function to Spark Connect Python client.

### Why are the changes needed?

`array_prepend` was added at #38947 without Spark Connect Python client.

### Does this PR introduce _any_ user-facing change?

`array_prepend`  will be available in Spark Connect Python client.

### How was this patch tested?

Enabled the related test.

Closes #40514 from ueshin/issues/SPARK-41233/array_prepend.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants