-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-26979][PYTHON][FOLLOW-UP] Make binary math/string functions take string as columns as well #24121
Conversation
Test build #103599 has finished for PR 24121 at commit
|
Test build #103604 has finished for PR 24121 at commit
|
I'm definitely in favour of the change in how math functions work. Somehow I missed that cast to float, so I thought they just passed the argument directly when given a string (which would've worked, since the Scala API has string overloads for all math functions). So this is something I should've included in my PR. I'll add some inline comments for stuff I disagree with. |
Test build #103612 has finished for PR 24121 at commit
|
I'll merge this one assuming we want this way if there are no more comments this week. adding @holdenk, @felixcheung, @BryanCutler, @ueshin, @viirya for visibility. |
Test build #103670 has finished for PR 24121 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just add a quick release-notes tag and Docs text to the JIRA for the one potential behavior change. This looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Merged to master. |
What changes were proposed in this pull request?
This is a followup of #23882 to handle binary math/string functions. For instance, see the cases below:
Before:
After:
Note that,
This PR causes a slight behaviour changes for math functions. For instance, numbers as strings (e.g.,
"1"
) were supported as arguments of binary math functions before. After this PR, it recognises it as column names.I also intentionally didn't document this behaviour changes since we're going ahead for Spark 3.0 and I don't think numbers as strings make much sense in math functions.
There is another exception
when
, which takes string as literal values as below. This PR doeesn't fix this ambiguity.This PR also fixes as below:
#23882 fixed it to:
_create_function
to_create_name_function
_create_function
to take strings as column names.This PR, I proposes to:
_create_name_function
name to_create_function
._create_function_over_column
to take strings as column names.How was this patch tested?
Some unit tests were added for binary math / string functions.