Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-42599][CONNECT][INFRA] Introduce dev/connect-jvm-client-mima-check instead of CompatibilitySuite #40191

Closed

Conversation

LuciferYang
Copy link
Contributor

@LuciferYang LuciferYang commented Feb 27, 2023

What changes were proposed in this pull request?

The main changes of this pr as follows:

  • Refactor CompatibilitySuite as a new tool CheckConnectJvmClientCompatibility and move it into tools module
  • Add a new shell script dev/connect-jvm-client-mima-check, it will use CheckConnectJvmClientCompatibility to check the mima compatibility of connect-jvm-client module.
  • Add dev/connect-jvm-client-mima-check to github task

Why are the changes needed?

For fix test error report in [VOTE] Release Apache Spark 3.4.0 (RC1) mail list.

Testing CompatibilitySuite with maven requires some pre-work:

build/mvn clean install -DskipTests -pl sql/core -am
build/mvn clean install -DskipTests -pl connector/connect/client/jvm -am 

So if we run build/mvn package test to test whole project as before, CompatibilitySuite will failed as follows:

CompatibilitySuite:
- compatibility MiMa tests *** FAILED ***
  java.lang.AssertionError: assertion failed: Failed to find the jar inside folder: /home/bjorn/spark-3.4.0/connector/connect/client/jvm/target
  at scala.Predef$.assert(Predef.scala:223)
  at org.apache.spark.sql.connect.client.util.IntegrationTestUtils$.findJar(IntegrationTestUtils.scala:67)
  at org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar$lzycompute(CompatibilitySuite.scala:57)
  at org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar(CompatibilitySuite.scala:53)
  at org.apache.spark.sql.connect.client.CompatibilitySuite.$anonfun$new$1(CompatibilitySuite.scala:69)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
  at org.scalatest.Transformer.apply(Transformer.scala:20)
  ...
- compatibility API tests: Dataset *** FAILED ***
  java.lang.AssertionError: assertion failed: Failed to find the jar inside folder: /home/bjorn/spark-3.4.0/connector/connect/client/jvm/target
  at scala.Predef$.assert(Predef.scala:223)
  at org.apache.spark.sql.connect.client.util.IntegrationTestUtils$.findJar(IntegrationTestUtils.scala:67)
  at org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar$lzycompute(CompatibilitySuite.scala:57)
  at org.apache.spark.sql.connect.client.CompatibilitySuite.clientJar(CompatibilitySuite.scala:53)
  at org.apache.spark.sql.connect.client.CompatibilitySuite.$anonfun$new$7(CompatibilitySuite.scala:110)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22) 

So we need to fix this problem for developers.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Pass GitHub Actions

@LuciferYang LuciferYang marked this pull request as draft February 27, 2023 09:39
@LuciferYang LuciferYang changed the title [SPARK-42554][CONNECT] Add dev/connect-jvm-client-mima-check tool to instead of CompatibilitySuite [SPARK-42554][CONNECT][INFRA] Add dev/connect-jvm-client-mima-check tool to instead of CompatibilitySuite Feb 27, 2023
@LuciferYang LuciferYang changed the title [SPARK-42554][CONNECT][INFRA] Add dev/connect-jvm-client-mima-check tool to instead of CompatibilitySuite [SPARK-42554][CONNECT][INFRA] Add dev/connect-jvm-client-mima-check instead of CompatibilitySuite Feb 27, 2023
@LuciferYang LuciferYang changed the title [SPARK-42554][CONNECT][INFRA] Add dev/connect-jvm-client-mima-check instead of CompatibilitySuite [SPARK-42554][CONNECT][INFRA] Introduce dev/connect-jvm-client-mima-check instead of CompatibilitySuite Feb 27, 2023
@LuciferYang LuciferYang changed the title [SPARK-42554][CONNECT][INFRA] Introduce dev/connect-jvm-client-mima-check instead of CompatibilitySuite [SPARK-42599][CONNECT][INFRA] Introduce dev/connect-jvm-client-mima-check instead of CompatibilitySuite Feb 27, 2023
@HyukjinKwon
Copy link
Member

cc @Yikun FYI

fail(
s"\nComparing client jar: $clientJar\nand sql jar: $sqlJar\n" +
problems.map(p => p.description("client")).mkString("\n"))
private def checkDatasetApiCompatibility(clientJar: File, sqlJar: File): Array[String] = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mima check failed now due to there are 4 incompatible Apis between Dataset in sql module and Dataset in connect-client-jvm module with current comparison way, this is the right result.

image

@github-actions github-actions bot added the SQL label Feb 27, 2023
@github-actions github-actions bot removed the SQL label Feb 27, 2023
@github-actions github-actions bot added the SQL label Feb 27, 2023
@LuciferYang LuciferYang marked this pull request as ready for review February 27, 2023 16:48
if test ! -z "$ERRORS"; then
cat .connect-mima-check-result
echo -e "connect-client-jvm module mima check failed."
echo -e "Exceptions to binary compatibility can be added in tools/CheckConnectJvmClientCompatibility.scala"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep the file CheckConnectJvmClientCompatibility.scala inside client/jvm and make the tools depends on the client test-jar? Then we do not need to copy past the findJar and recursiveListFiles methods. And also we can keep the code closer to the client?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the previous habit, the tools module is independent, and it does not rely on any other Spark module. I do not want to break this.

In addition, the compatibility API tests: Dataset related issues I mentioned in another pr, put it inside the client module may cause unexpected class loading.

If we are sure we don't need to check compatibility API tests: Dataset or there are other ways to solve it, I can try to put CheckConnectJvmClientCompatibility back into the client module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another pr #40213 keep CheckConnectJvmClientCompatibility in client/jvm, if we all recommend keeping CheckConnectJvmClientCompatibility in the client/jvm module, we can use that one

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I vote for change #40213 Thanks!

@LuciferYang
Copy link
Contributor Author

This pr often conflicts....

@HyukjinKwon
Copy link
Member

cc @hvanhovell too FYI

@@ -2739,7 +2739,7 @@ class Dataset[T] private[sql] (
sparkSession.analyze(plan, proto.AnalyzePlanRequest.AnalyzeCase.SCHEMA)
}

def collectResult(): SparkResult[T] = sparkSession.execute(plan, encoder)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why make this private[sql]? For advanced use cases this is a better way of interacting with results.

@hvanhovell
Copy link
Contributor

@LuciferYang can we close this one in favor of #40213?

@LuciferYang
Copy link
Contributor Author

@LuciferYang can we close this one in favor of #40213?

OK, let me close this one and focus on #40213

@LuciferYang LuciferYang closed this Mar 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants