[SPARK-43928][SQL][PYTHON][CONNECT] Add bit operations to Scala, Python and Connect API #41608

Closed
beliefer wants to merge 5 commits

beliefer
Contributor

What changes were proposed in this pull request?

This PR adds bit operations to the Scala, Python and Connect APIs.
The new functions are listed below.

  • bit_and
  • bit_count
  • bit_get
  • bit_or
  • bit_xor
  • getbit
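To make the intended behavior of these functions concrete, here is a minimal pure-Python sketch of their semantics. It is not the PySpark implementation: the helpers below are illustrative stand-ins that only share names with the new functions, they assume non-negative integers, they count bit positions from 0 at the least significant bit, and they do not model nulls. `getbit` is treated as an alias of `bit_get`, and `bit_and`, `bit_or` and `bit_xor` are modeled as aggregates over a non-empty list of values.

```python
from functools import reduce
from operator import and_, or_, xor


def bit_count(value: int) -> int:
    """Count the bits set to 1 (this sketch assumes value >= 0)."""
    return bin(value).count("1")


def bit_get(value: int, pos: int) -> int:
    """Return the bit (0 or 1) at position pos, where 0 is the least significant bit."""
    return (value >> pos) & 1


# Assumption of this sketch: getbit behaves exactly like bit_get.
getbit = bit_get


def bit_and(values):
    """Aggregate: bitwise AND of all values (values must be non-empty)."""
    return reduce(and_, values)


def bit_or(values):
    """Aggregate: bitwise OR of all values (values must be non-empty)."""
    return reduce(or_, values)


def bit_xor(values):
    """Aggregate: bitwise XOR of all values (values must be non-empty)."""
    return reduce(xor, values)
```

For example, 11 is `1011` in binary, so `bit_count(11)` is 3, `bit_get(11, 0)` is 1 and `bit_get(11, 2)` is 0. Over the values 3 (`011`) and 5 (`101`), the aggregates give 1, 7 and 6 respectively.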

Why are the changes needed?

Add bit operations to Scala, Python and Connect API

Does this PR introduce any user-facing change?

'No'.
New feature.

How was this patch tested?

New test cases.

| 1|
| 1|
+------------+
<BLANKLINE>
Contributor

Let's not add <BLANKLINE>; see #41610.

Contributor Author

OK

@@ -1829,6 +1829,112 @@ def bitwise_not(col: "ColumnOrName") -> Column:
    return _invoke_function_over_columns("bitwise_not", col)


@try_remote_functions
def bit_count(col: "ColumnOrName") -> Column:
Member

These functions should also be added to the *.rst files for the Python API reference documentation.

Contributor Author

Sorry. I forgot it.

@zhengruifeng
Contributor

@beliefer oh, please rebase to resolve the conflicts.

@zhengruifeng
Contributor

@beliefer would you mind re-triggering the failed test?

@zhengruifeng
Contributor

the failed tests should be unrelated

@zhengruifeng
Contributor

merged to master, thank you @beliefer so much!

@beliefer
Contributor Author

@zhengruifeng @HyukjinKwon Thank you!

LuciferYang pushed a commit to LuciferYang/spark that referenced this pull request Jun 17, 2023
…on and Connect API

### What changes were proposed in this pull request?
This PR adds bit operations to the Scala, Python and Connect APIs.
The new functions are listed below.

- bit_and
- bit_count
- bit_get
- bit_or
- bit_xor
- getbit

### Why are the changes needed?
Add bit operations to Scala, Python and Connect API

### Does this PR introduce _any_ user-facing change?
'No'.
New feature.

### How was this patch tested?
New test cases.

Closes apache#41608 from beliefer/SPARK-43928.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
@LuciferYang
Contributor

@beliefer

https://github.com/apache/spark/actions/runs/5296358122/jobs/9587419094

The test "DataFrame function and SQL functon parity" in `DataFrameFunctionsSuite` started failing after this PR was merged:

2023-06-17T07:30:23.4365440Z [info] - DataFrame function and SQL functon parity *** FAILED *** (22 milliseconds)
2023-06-17T07:30:23.4366767Z [info]   Set("getbit") was not empty (DataFrameFunctionsSuite.scala:115)
2023-06-17T07:30:23.4372881Z [info]   org.scalatest.exceptions.TestFailedException:
2023-06-17T07:30:23.4374160Z [info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
2023-06-17T07:30:23.4378197Z [info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
2023-06-17T07:30:23.4381810Z [info]   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
2023-06-17T07:30:23.4382914Z [info]   at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
2023-06-17T07:30:23.4384235Z [info]   at org.apache.spark.sql.DataFrameFunctionsSuite.$anonfun$new$1(DataFrameFunctionsSuite.scala:115)
2023-06-17T07:30:23.4392901Z [info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
2023-06-17T07:30:23.4394052Z [info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
2023-06-17T07:30:23.4395224Z [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
2023-06-17T07:30:23.4396114Z [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
2023-06-17T07:30:23.4396922Z [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
2023-06-17T07:30:23.4397607Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
2023-06-17T07:30:23.4401438Z [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:221)
2023-06-17T07:30:23.4405555Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
2023-06-17T07:30:23.4409387Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
2023-06-17T07:30:23.4416389Z [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
2023-06-17T07:30:23.4420746Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
2023-06-17T07:30:23.4422018Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
2023-06-17T07:30:23.4424339Z [info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:67)
2023-06-17T07:30:23.4429271Z [info]   at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
2023-06-17T07:30:23.4431653Z [info]   at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
2023-06-17T07:30:23.4432812Z [info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:67)
2023-06-17T07:30:23.4433639Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
2023-06-17T07:30:23.4438569Z [info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
2023-06-17T07:30:23.4441385Z [info]   at scala.collection.immutable.List.foreach(List.scala:431)
2023-06-17T07:30:23.4443550Z [info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
2023-06-17T07:30:23.4448505Z [info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
2023-06-17T07:30:23.4449817Z [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
2023-06-17T07:30:23.4450582Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:269)
2023-06-17T07:30:23.4461803Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:268)
2023-06-17T07:30:23.4463089Z [info]   at org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1564)
2023-06-17T07:30:23.4466240Z [info]   at org.scalatest.Suite.run(Suite.scala:1114)
2023-06-17T07:30:23.4473557Z [info]   at org.scalatest.Suite.run$(Suite.scala:1096)
2023-06-17T07:30:23.4479868Z [info]   at org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1564)
2023-06-17T07:30:23.4481336Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:273)
2023-06-17T07:30:23.4482271Z [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
2023-06-17T07:30:23.4483160Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.run(AnyFunSuiteLike.scala:273)
2023-06-17T07:30:23.4488064Z [info]   at org.scalatest.funsuite.AnyFunSuiteLike.run$(AnyFunSuiteLike.scala:272)
2023-06-17T07:30:23.4489064Z [info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:67)
2023-06-17T07:30:23.4493955Z [info]   at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
2023-06-17T07:30:23.4503997Z [info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
2023-06-17T07:30:23.4504906Z [info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
2023-06-17T07:30:23.4505589Z [info]   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:67)
2023-06-17T07:30:23.4510958Z [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
2023-06-17T07:30:23.4523118Z [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
2023-06-17T07:30:23.4523778Z [info]   at sbt.ForkMain$Run.lambda$runTest$1(ForkMain.java:413)
2023-06-17T07:30:23.4524394Z [info]   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
2023-06-17T07:30:23.4530327Z [info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
2023-06-17T07:30:23.4536282Z [info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
2023-06-17T07:30:23.4541460Z [info]   at java.lang.Thread.run(Thread.java:750)
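
The message `Set("getbit") was not empty` indicates that the test computes a set of function names and asserts that it is empty. The source of `DataFrameFunctionsSuite` is not shown in this thread, so the following Python sketch is only one plausible shape for such a parity check, not the suite's actual code. The function `parity_mismatches`, the variable names and the sample data (including `hypothetical_sql_only`) are all hypothetical. It assumes the check compares the SQL-only functions actually observed against an expected allow-list, so a stale allow-list entry or an unexpected SQL-only function both surface as leftovers.

```python
def parity_mismatches(sql_functions, dataframe_functions, expected_only_sql):
    """Return names where the expected SQL-only list disagrees with reality.

    All three arguments are collections of function names. The result is
    empty when the allow-list exactly matches the functions that exist in
    SQL but not in the DataFrame API.
    """
    actual_only_sql = set(sql_functions) - set(dataframe_functions)
    # Symmetric difference: entries that are expected but not observed,
    # or observed but not expected.
    return actual_only_sql ^ set(expected_only_sql)


# Hypothetical data: "getbit" now exists in both APIs, but the
# allow-list still names it as SQL-only.
sql_functions = {"bit_and", "getbit", "hypothetical_sql_only"}
dataframe_functions = {"bit_and", "getbit"}
expected_only_sql = {"getbit", "hypothetical_sql_only"}

mismatches = parity_mismatches(sql_functions, dataframe_functions, expected_only_sql)
```

With this sample data `mismatches` is `{"getbit"}`, so an assertion that the set is empty would fail in the same way as the log above. Whether the real failure came from a stale list or from the opposite direction cannot be determined from the log alone.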

@beliefer
Contributor Author

@LuciferYang Let me fix it.

@LuciferYang
Contributor

@beliefer I've opened a quick fix: #41639

zhengruifeng pushed a commit that referenced this pull request Jun 17, 2023
…tions` of `DataFrameFunctionsSuite`

### What changes were proposed in this pull request?
#41608 added bit operations to the Scala, Python and Connect APIs, but missed the corresponding tests in `DataFrameFunctionsSuite`.

### Why are the changes needed?
Fix missing tests

### Does this PR introduce _any_ user-facing change?
'No'.
New feature.

### How was this patch tested?
N/A

Closes #41640 from beliefer/SPARK-43928_followup.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
czxm pushed a commit to czxm/spark that referenced this pull request Jun 19, 2023
…on and Connect API

czxm pushed a commit to czxm/spark that referenced this pull request Jun 19, 2023
…tions` of `DataFrameFunctionsSuite`
