[SPARK-43937][CONNECT][PYTHON] Add ifnull,isnotnull,equal_null,nullif,nvl,nvl2 to Scala and Python #41534
panbingkun wants to merge 11 commits into apache:master
…,nvl,nvl2 to Scala and Python
python/pyspark/sql/functions.py
if: newly added on the Scala side; however, it conflicts with the Python keyword if and cannot be added on the Python side?
I think we should apply a different name in this case, like map in Scala <-> create_map in Python.
cc @HyukjinKwon @beliefer do you have a good name for it?
I ran into the same problem. I want to add the any API to PySpark, but any is already a Python built-in name.
Personally, I use the name py_any instead. How about py_if or py_replaceable_if? cc @HyukjinKwon @zhengruifeng
Friendly ping @HyukjinKwon for the function names for any and if?
@beliefer @panbingkun sorry, but I'd suggest we exclude any and if for now: we have some/bool_or to replace any, and when/otherwise to replace if.
let's add them in separate PRs later if we really need them
ok, let me remove it.
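The naming clash discussed in this thread can be checked from Python itself. The following is a minimal sketch that uses only the standard-library `keyword` and `builtins` modules; the helper `classify_name` is a hypothetical name introduced for illustration and is not part of PySpark.

```python
import builtins
import keyword


def classify_name(name: str) -> str:
    """Classify a candidate function name (hypothetical helper, not a PySpark API)."""
    if keyword.iskeyword(name):
        # Reserved words such as `if` cannot be used as a function name at all.
        return "keyword"
    if hasattr(builtins, name):
        # Built-ins such as `any` can be redefined, but doing so shadows the original.
        return "builtin"
    return "free"


for candidate in ["if", "not", "any", "map", "ifnull", "nvl2"]:
    print(f"{candidate}: {classify_name(candidate)}")
```

Names reported as "keyword" cannot be defined with `def` at all, while names reported as "builtin" are legal but shadow an existing function, which is the situation behind the map <-> create_map precedent mentioned earlier in the thread.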
maybe name it after misc non-aggregate functions
python/pyspark/sql/functions.py
Currently ifnull uses the same expression as nvl; let's discuss in #41516 (comment) first.
Let's keep the current implementation, which directly invokes the alias ifnull.
python/pyspark/sql/functions.py
I don't know why EqualNull was put into the group misc_funcs. It seems EqualNull should be put into the group predicate_funcs. cc @cloud-fan @zhengruifeng
I see a few functions in FunctionRegistry (and other places) were placed in the wrong group, e.g. Abs should be in math instead of misc non-aggregate functions.
I think we should fix it together.
So, let's put these functions in 'predicate_funcs' first? Later I will propose a new PR to fix the grouping and other trivial issues. @zhengruifeng @beliefer
I think it is fine, let's fix the grouping in a separate PR.
Co-authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Let's also exclude if on the Scala side.
Let's not touch unrelated files.
OK, let me revert it.
     * @group predicates_funcs
     * @since 3.5.0
     */
    def ifnull(col1: Column, col2: Column): Column = withExpr {
the replacement in ifnull/nvl/nvl2/nullif is missing?
According to my understanding, 'replacement: Expression' is only an internal Spark mechanism that exists to cooperate with 'RuntimeReplaceable' and is designed to reuse expressions. I guess it is actually not needed in our function.
@zhengruifeng the default constructor with replacement: Expression will be replaced by rules. We only need to consider the other constructor without replacement.
oh, thanks for the explanation! @panbingkun @beliefer
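To illustrate the point about replacement, here is a toy model in plain Python, not Spark code. It assumes that a replaceable expression carries a `replacement` field which the user-facing constructor derives from the other arguments, and that a rewrite rule later swaps the node for that replacement. The classes `Literal`, `Coalesce` and `Nvl` and the functions `replace_expressions` and `evaluate` are hypothetical stand-ins written for this sketch; choosing a coalesce-style replacement for `Nvl` is also an assumption.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class Literal:
    value: Any


@dataclass(frozen=True)
class Coalesce:
    children: Tuple[Any, ...]


@dataclass(frozen=True)
class Nvl:
    """Toy stand-in for a runtime-replaceable expression (not Spark's class)."""
    left: Any
    right: Any
    replacement: Optional[Any] = None

    def __post_init__(self):
        # Plays the role of the constructor without `replacement`:
        # derive the replacement from the user-facing arguments.
        if self.replacement is None:
            object.__setattr__(
                self, "replacement", Coalesce((self.left, self.right))
            )


def replace_expressions(expr):
    """Toy rewrite rule: swap each replaceable node for its replacement."""
    if isinstance(expr, Nvl):
        return replace_expressions(expr.replacement)
    if isinstance(expr, Coalesce):
        return Coalesce(tuple(replace_expressions(c) for c in expr.children))
    return expr


def evaluate(expr):
    """Evaluate the rewritten tree; None stands in for SQL NULL."""
    if isinstance(expr, Literal):
        return expr.value
    if isinstance(expr, Coalesce):
        for child in expr.children:
            value = evaluate(child)
            if value is not None:
                return value
        return None
    raise TypeError(f"cannot evaluate {type(expr).__name__} before rewriting")


expr = Nvl(Literal(None), Literal(5))   # caller passes only two arguments
rewritten = replace_expressions(expr)   # the Nvl node disappears here
print(rewritten)
print(evaluate(rewritten))
```

In this model the caller never supplies `replacement`, which is why an API wrapper only needs the constructor that takes the user-facing arguments.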
merged to master, thank you @panbingkun @beliefer
…,nvl,nvl2 to Scala and Python

### What changes were proposed in this pull request?
Add following functions:
- ~~not: already exists on the Scala side, but is it still not on the Python side due to keyword conflicts?~~
- ~~if: new add on the scala side, however it conflicts with Python keywords `if` and cannot be added on the Python side?~~
- ifnull
- isnotnull
- equal_null
- nullif
- nvl
- nvl2

to:
- Scala API
- Python API
- Spark Connect Scala Client
- Spark Connect Python Client

### Why are the changes needed?
for parity

### Does this PR introduce _any_ user-facing change?
Yes, new functions.

### How was this patch tested?
- Add New UT.

Closes apache#41534 from panbingkun/SPARK-43937.

Lead-authored-by: panbingkun <pbk1982@gmail.com>
Co-authored-by: panbingkun <84731559@qq.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
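For readers unfamiliar with these functions, the sketch below models their null-handling behaviour on plain Python scalars, with `None` standing in for SQL NULL. It is an illustration under that assumption and not the Column-based API added by this PR; the parameter names are chosen for the example only.

```python
def nvl(col1, col2):
    """Return col2 when col1 is null, otherwise col1."""
    return col2 if col1 is None else col1


def ifnull(col1, col2):
    """Modelled here as an alias of nvl, matching the discussion above."""
    return nvl(col1, col2)


def nvl2(col1, col2, col3):
    """Return col2 when col1 is not null, otherwise col3."""
    return col2 if col1 is not None else col3


def nullif(col1, col2):
    """Return null when col1 equals col2, otherwise col1."""
    if col1 is not None and col2 is not None and col1 == col2:
        return None
    return col1


def isnotnull(col):
    """True when the value is not null."""
    return col is not None


def equal_null(col1, col2):
    """Null-safe equality: two nulls compare equal, one null compares unequal."""
    if col1 is None or col2 is None:
        return col1 is None and col2 is None
    return col1 == col2


print(ifnull(None, 8), nvl(3, 8), nvl2(None, "a", "b"))
print(nullif(2, 2), isnotnull(None), equal_null(None, None))
```

The difference between ordinary equality and `equal_null` only shows up when at least one side is null, which is why the thread above considers it a predicate.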