
[SPARK-38302][K8S][TESTS] Use Java 17 in K8S IT in case of spark-tgz option #35627

Closed · wants to merge 2 commits

Conversation

@dcoliversun (Contributor) commented Feb 23, 2022:

What changes were proposed in this pull request?

This PR aims to use Java 17 in K8s integration tests by default when setting spark-tgz.

Why are the changes needed?

When the spark-tgz parameter is set during the integration tests, an error occurs because resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17 cannot be found.

This is because the default value of spark.kubernetes.test.dockerFile is a relative path.

When the tgz is used, the working directory is $UNPACKED_SPARK_TGZ, where that relative path does not resolve.
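A minimal sketch of the failure mode (the DOCKER_FILE variable name is taken from the diff below; the rest follows the description above):

DOCKER_FILE="resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17"

# Without --spark-tgz, the build runs from the Spark source tree, so the
# relative path resolves. With --spark-tgz, the working directory is the
# unpacked tarball, which does not contain the resource-managers/ sources:
cd "$UNPACKED_SPARK_TGZ"
[ -f "$DOCKER_FILE" ] || echo "Dockerfile.java17 cannot be found"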

Does this PR introduce any user-facing change?

No

How was this patch tested?

Running the K8s integration tests manually:

sbt

$ build/sbt -Pkubernetes -Pkubernetes-integration-tests -Dtest.exclude.tags=minikube,r "kubernetes-integration-tests/test"

KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with no resources & statefulset allocation
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local hostpath storage on statefulsets
- PVs with local hostpath and storageClass on statefulsets
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- SPARK-37576: Rolling decommissioning
Run completed in 27 minutes, 8 seconds.
Total number of tests run: 30
Suites: completed 2, aborted 0
Tests: succeeded 30, failed 0, canceled 0, ignored 0, pending 0
All tests passed.

maven with spark-tgz

$ bash resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --spark-tgz $TARBALL_TO_TEST --exclude-tags r

KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with no resources & statefulset allocation
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local hostpath storage on statefulsets
- PVs with local hostpath and storageClass on statefulsets
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- SPARK-37576: Rolling decommissioning
Run completed in 30 minutes, 6 seconds.
Total number of tests run: 30
Suites: completed 2, aborted 0
Tests: succeeded 30, failed 0, canceled 0, ignored 0, pending 0
All tests passed.

maven without spark-tgz

$ bash resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --exclude-tags r

KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with no resources & statefulset allocation
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local hostpath storage on statefulsets
- PVs with local hostpath and storageClass on statefulsets
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- SPARK-37576: Rolling decommissioning
Run completed in 35 minutes, 0 seconds.
Total number of tests run: 30
Suites: completed 2, aborted 0
Tests: succeeded 30, failed 0, canceled 0, ignored 0, pending 0
All tests passed.

@dcoliversun (Contributor, author):

cc @dongjoon-hyun @sarutak
It would be good if you could take a look when you have time, thanks!

@dongjoon-hyun (Member) left a comment:

Thank you for pinging me, @dcoliversun.

@AmplabJenkins:
Can one of the admins verify this patch?

@dongjoon-hyun changed the title from "[SPARK-38302][K8S][TESTS] Use Java 17 in K8S integration tests when setting spark-tgz" to "[SPARK-38302][K8S][TESTS] Use Java 17 in K8S IT in case of spark-tgz option" on Feb 24, 2022.
@@ -106,7 +106,7 @@ then
   # OpenJDK base-image tag (e.g. 8-jre-slim, 11-jre-slim)
   JAVA_IMAGE_TAG_BUILD_ARG="-b java_image_tag=$JAVA_IMAGE_TAG"
 else
-  JAVA_IMAGE_TAG_BUILD_ARG="-f $DOCKER_FILE"
+  JAVA_IMAGE_TAG_BUILD_ARG="-f $DOCKER_FILE_BASE_PATH$DOCKER_FILE"
Member:
This seems to break the build when we use a docker file as an argument. For example, what happens when we give an absolute path? Please try this patch with --docker-file option from your environment.

@dcoliversun (Contributor, author):
OK. I'll handle this case.

@dcoliversun (Contributor, author):
Hi. This code checks whether $DOCKER_FILE is an absolute path or not (see the sketch below):

  • If $DOCKER_FILE is an absolute path, nothing changes.
  • If it is a relative path, $DOCKER_FILE_BASE_PATH and $DOCKER_FILE are joined, so that Java 17 is used in the K8S IT by default.
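A minimal sketch of that check (bash; the variable names follow the diff above, and the actual commit may differ):

if [[ "$DOCKER_FILE" == /* ]]; then
  # Absolute path passed via --docker-file: use it unchanged.
  JAVA_IMAGE_TAG_BUILD_ARG="-f $DOCKER_FILE"
else
  # Relative path (e.g. the default Dockerfile.java17): resolve it against
  # the base path so it is found even when building from the unpacked tgz.
  JAVA_IMAGE_TAG_BUILD_ARG="-f $DOCKER_FILE_BASE_PATH$DOCKER_FILE"
fi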

Member:

Are you saying that the new commit does that, @dcoliversun?

@dcoliversun (Contributor, author):

Yes. The first commit doesn't support an absolute path for --docker-file.

@dongjoon-hyun (Member):

Also, cc @LuciferYang since this is Java 17.

@LuciferYang (Contributor):

Is the behavior of Java 17 and Java 8 inconsistent?

@dcoliversun (Contributor, author):

> Is the behavior of Java 17 and Java 8 inconsistent?

Hi @LuciferYang. It's not a compatibility issue; the relative path to Dockerfile.java17 simply does not resolve when the spark-tgz option is used in the integration tests.

@LuciferYang (Contributor) commented Feb 26, 2022:

@dongjoon-hyun @dcoliversun Sorry for the late reply. This is the first time I have run this module's tests locally, so I spent some time setting up the local test environment.

I used the following command to build a tarball with Zulu17.32+13-CA:

dev/make-distribution.sh --tgz -Phadoop-3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -Phive

then checked out this PR with:

gh pr checkout 35627

and ran the tests with the following command:

resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --spark-tgz $TARBALL_TO_TEST --exclude-tags r

When $TARBALL_TO_TEST is an absolute path, this PR fixes the described issue.

@dcoliversun (Contributor, author):

> When $TARBALL_TO_TEST is an absolute path, this PR fixes the described issue.

Thanks for your verification. I'm glad this PR fixes the issue 😀

@dongjoon-hyun (Member):

Thank you so much again, @LuciferYang !

@dongjoon-hyun (Member) left a comment:

+1, LGTM. I also verified this PR manually. Thank you, @dcoliversun, @LuciferYang, and @martin-g. Merged to master for Apache Spark 3.3.0.

@dcoliversun deleted the SPARK-38302 branch on February 27, 2022 at 00:26.