Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-48088][PYTHON][CONNECT][TESTS][3.5] Skip tests that fail in 3.5 client <> 4.0 server #46334

Closed
wants to merge 1 commit into from

Conversation

HyukjinKwon
Copy link
Member

What changes were proposed in this pull request?

This PR proposes to skip the tests that fail with 3.5 client and 4.0 server in Spark Connect (by adding SPARK_SKIP_CONNECT_COMPAT_TESTS). This is a base work for #46298. This partially backports #45870

This PR also adds SPARK_CONNECT_TESTING_REMOTE environment variable so developers can run PySpark unittests against a Spark Connect server.

Why are the changes needed?

In order to set up the CI that tests 3.5 client and 4.0 server in Spark Connect.

Does this PR introduce any user-facing change?

No, test-only.

How was this patch tested?

Tested it in my fork, see #46298

Was this patch authored or co-authored using generative AI tooling?

No.

import unittest
from pyspark.sql import SparkSession
from pyspark.ml.tests.connect.test_legacy_mode_tuning import CrossValidatorTestsMixin


@unittest.skipIf("SPARK_SKIP_CONNECT_COMPAT_TESTS" in os.environ, "Requires JVM access")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to forward port a bit of changes into master to reduce the diff later.

@HyukjinKwon HyukjinKwon force-pushed the SPARK-48088 branch 5 times, most recently from c695fd4 to c8d13e8 Compare May 2, 2024 12:45
@HyukjinKwon HyukjinKwon marked this pull request as ready for review May 2, 2024 13:57
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought Spark Connect is supposed to handle this, @HyukjinKwon .

Do you know what breaks the compatibility?

@dongjoon-hyun
Copy link
Member

cc @grundprinzip , too

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented May 3, 2024

Spark Connect is actually for testing pure python library. Spark 3.5 doesn't have it ... from 4.0 <> 4.1, we could leverage pure python library build to test them. I will try to reuse that build instead of duplicating some like this

I am triaging the breaking changes in the umbrella JIRA at https://issues.apache.org/jira/browse/SPARK-48082. They are mostly because of new features so it seems not so bad.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @HyukjinKwon .

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, could you re-trigger the failed pyspark tests to make it sure? It looks irrelevant though.

@HyukjinKwon
Copy link
Member Author

Merged to branch-3.5.

I will perpare separate forward port PR

HyukjinKwon added a commit that referenced this pull request May 3, 2024
…5 client <> 4.0 server

### What changes were proposed in this pull request?

This PR proposes to skip the tests that fail with 3.5 client and 4.0 server in Spark Connect (by adding `SPARK_SKIP_CONNECT_COMPAT_TESTS`). This is a base work for #46298. This partially backports #45870

This PR also adds `SPARK_CONNECT_TESTING_REMOTE` environment variable so developers can run PySpark unittests against a Spark Connect server.

### Why are the changes needed?

In order to set up the CI that tests 3.5 client and 4.0 server in Spark Connect.

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

Tested it in my fork, see #46298

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46334 from HyukjinKwon/SPARK-48088.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@HyukjinKwon HyukjinKwon closed this May 3, 2024
dongjoon-hyun pushed a commit that referenced this pull request May 3, 2024
…test 4.0 <> above

### What changes were proposed in this pull request?

This PR forward ports #46334 to reduce conflicts.

### Why are the changes needed?

To reduce the conflict against branch-3.5, and prepare 4.0 <> above test.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

CI in this PR should verify them.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46358 from HyukjinKwon/SPARK-48088-40.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
HyukjinKwon added a commit that referenced this pull request May 6, 2024
…to SPARK_SKIP_CONNECT_COMPAT_TESTS

### What changes were proposed in this pull request?

This PR is a followup of #46298 that properly uses the environment variable SPARK_SKIP_CONNECT_COMPAT_TESTS added at #46334

### Why are the changes needed?

To recover the compatibility test (https://github.com/apache/spark/actions/runs/8961109352/job/24608366499).

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46394 from HyukjinKwon/SPARK-48054-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request May 7, 2024
…hat that requires JVM access

### What changes were proposed in this pull request?

This PR is a followup of #46334 that missed one more test case.

### Why are the changes needed?

See #46334

### Does this PR introduce _any_ user-facing change?

See #46334

### How was this patch tested?

Manually

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46411 from HyukjinKwon/SPARK-48088-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
…test 4.0 <> above

### What changes were proposed in this pull request?

This PR forward ports apache#46334 to reduce conflicts.

### Why are the changes needed?

To reduce the conflict against branch-3.5, and prepare 4.0 <> above test.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

CI in this PR should verify them.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#46358 from HyukjinKwon/SPARK-48088-40.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
…to SPARK_SKIP_CONNECT_COMPAT_TESTS

### What changes were proposed in this pull request?

This PR is a followup of apache#46298 that properly uses the environment variable SPARK_SKIP_CONNECT_COMPAT_TESTS added at apache#46334

### Why are the changes needed?

To recover the compatibility test (https://github.com/apache/spark/actions/runs/8961109352/job/24608366499).

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Manually

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#46394 from HyukjinKwon/SPARK-48054-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants