[SPARK-43943][SQL][PYTHON][CONNECT] Add SQL math functions to Scala and Python #41435
oh also might need to put them in Python reference doc

Review context (entries in the Python reference doc): `log`, `log10`, `log1p`, `log2`, `negative`
@HyukjinKwon they were added here, except `negate`; I will add it.
BTW, in the last commit I added `negate` (which only existed in Scala) to the Python side, for parity between Python and Scala.
LGTM
@HyukjinKwon it seems that
merged to master
I will investigate tomorrow, it is a little late today.
There is one case that may have started failing after this was merged, and I am trying to fix it in #41519.
…ction parity` in `DataFrameFunctionsSuite`

### What changes were proposed in this pull request?
This PR removes `ceiling`, `negative`, `std`, and `sign` from `excludedSqlFunctions` so that the `DataFrame function and SQL functon parity` test in `DataFrameFunctionsSuite` passes. These four functions were introduced into `sql.functions` in #41435, so the content of `excludedSqlFunctions` has to be updated at the same time.

### Why are the changes needed?
Fix `DataFrame function and SQL function parity` in `DataFrameFunctionsSuite`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manual check, run

```
build/sbt clean "sql/testOnly org.apache.spark.sql.DataFrameFunctionsSuite"
```

**Before**

```
[info] DataFrameFunctionsSuite:
23:20:51.858 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[info] - DataFrame function and SQL functon parity *** FAILED *** (340 milliseconds)
[info]   Set("ceiling", "negative", "std", "sign") was not empty (DataFrameFunctionsSuite.scala:115)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
[info]   at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
[info]   at org.apache.spark.sql.DataFrameFunctionsSuite.$anonfun$new$1(DataFrameFunctionsSuite.scala:115)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
[info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:221)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
[info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:67)
[info]   at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
[info]   at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
[info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:67)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
[info]   at scala.collection.immutable.List.foreach(List.scala:431)
[info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
[info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:268)
[info]   at org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Suite.run(Suite.scala:1114)
[info]   at org.scalatest.Suite.run$(Suite.scala:1096)
[info]   at org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1564)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run$(AnyFunSuiteLike.scala:272)
[info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:67)
[info]   at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
[info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
[info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
[info]   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:67)
[info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
[info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
[info]   at sbt.ForkMain$Run.lambda$runTest$1(ForkMain.java:413)
[info]   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[info]   at java.lang.Thread.run(Thread.java:750)
...
[info] Run completed in 27 seconds, 818 milliseconds.
[info] Total number of tests run: 123
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 122, failed 1, canceled 0, ignored 0, pending 0
```

**After**

```
[info] Run completed in 27 seconds, 338 milliseconds.
[info] Total number of tests run: 123
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 123, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
```

Closes #41519 from LuciferYang/fix-df-functions-suite.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
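The failure message above (`Set("ceiling", "negative", "std", "sign") was not empty`) suggests that the suite checks for names that are still listed as excluded even though the DataFrame API now provides them. The following is a minimal sketch of that idea in plain Python, not the actual Scala test: the helper `stale_exclusions`, the intersection-based check, and the shortened function sets (including the placeholder `sql_only_fn`) are assumptions made for illustration.

```python
def stale_exclusions(excluded_sql_functions, dataframe_functions):
    """Return names that are still excluded although the DataFrame API now has them."""
    return set(excluded_sql_functions) & set(dataframe_functions)


# Hypothetical, heavily shortened inputs; the real suite derives these from Spark itself.
dataframe_functions = {"ceil", "ceiling", "negative", "sign", "sqrt", "std"}
excluded_before = {"ceiling", "negative", "sign", "std", "sql_only_fn"}
excluded_after = excluded_before - {"ceiling", "negative", "sign", "std"}

# Before the fix the four newly added functions show up as stale exclusions.
print(sorted(stale_exclusions(excluded_before, dataframe_functions)))
# After removing them from the exclusion list nothing is left to report.
print(sorted(stale_exclusions(excluded_after, dataframe_functions)))
```

Under these assumptions, a check of this kind would start failing as soon as a function is added to the DataFrame API without being dropped from the exclusion list, which matches the sequence of events described in the commit message.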
…nd Python

### What changes were proposed in this pull request?
Add following functions:
* ceiling
* e
* pi
* ln
* negative
* positive
* power
* sign
* std
* width_bucket

to:
* Scala API
* Python API
* Spark Connect Scala Client
* Spark Connect Python Client

This PR also adds `negate` (which already exists in Scala API and SCSC) to Python API and SCPC.

### Why are the changes needed?
for parity

### Does this PR introduce _any_ user-facing change?
yes, new functions

### How was this patch tested?
added ut / doctest

Closes apache#41435 from zhengruifeng/sql_func_math.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
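### What changes were proposed in this pull request?
Add the following functions:
* ceiling
* e
* pi
* ln
* negative
* positive
* power
* sign
* std
* width_bucket

to:
* Scala API
* Python API
* Spark Connect Scala Client
* Spark Connect Python Client

This PR also adds `negate` (which already exists in Scala API and SCSC) to Python API and SCPC.

### Why are the changes needed?
For parity.

### Does this PR introduce any user-facing change?
Yes, new functions.

### How was this patch tested?
Added unit tests and doctests.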
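As a rough guide to what the added functions compute, here is a plain-Python sketch using only the standard library. It is not Spark code and does not call any Spark API. The scalar helpers below are hypothetical stand-ins, `std` is assumed to be the sample standard deviation, and `width_bucket` assumes `min_value < max_value` and a positive bucket count.

```python
import math
import statistics

# Plain-Python stand-ins for illustration only; Spark evaluates these on columns.
E = math.e    # counterpart of e()
PI = math.pi  # counterpart of pi()


def ceiling(x):
    return math.ceil(x)


def ln(x):
    return math.log(x)  # natural logarithm


def negative(x):
    return -x


def positive(x):
    return +x


def power(base, exponent):
    return math.pow(base, exponent)


def sign(x):
    # -1.0, 0.0 or 1.0 depending on the sign of x
    return float((x > 0) - (x < 0))


def std(values):
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(values)


def width_bucket(value, min_value, max_value, num_buckets):
    # Assumes min_value < max_value and num_buckets > 0.
    if value < min_value:
        return 0                # underflow bucket
    if value >= max_value:
        return num_buckets + 1  # overflow bucket
    fraction = (value - min_value) / (max_value - min_value)
    return math.floor(fraction * num_buckets) + 1


print(width_bucket(5.3, 0.2, 10.6, 5))
```

The sketch only covers scalar inputs; null handling, type coercion and descending bucket ranges are left out on purpose.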