[SPARK-22962][K8S] Fail fast if submission client local files are used #20320
LGTM, only some nits.
@@ -117,6 +117,12 @@ private[spark] class DriverConfigOrchestrator(
.map(_.split(","))
.getOrElse(Array.empty[String])

if (existSubmissionLocalFiles(sparkJars) || existSubmissionLocalFiles(sparkFiles)) {
We can add a TODO here if this is planned to be supported in the future.
Done, and also created https://issues.apache.org/jira/browse/SPARK-23153.
@@ -117,6 +117,12 @@ private[spark] class DriverConfigOrchestrator(
.map(_.split(","))
.getOrElse(Array.empty[String])

if (existSubmissionLocalFiles(sparkJars) || existSubmissionLocalFiles(sparkFiles)) {
throw new SparkException("The Kubernetes mode does not yet support application " +
"dependencies local to the submission client. It currently only allows application" +
nit: extra space at the end of the line.
Done.
Kubernetes integration test starting
+1 on docs change for 2.3 modulo minor comments.
Code changes ok if we also re-validate with manual tests against http (we have a caveat with our integration tests not testing this correctly yet), gcs and HDFS. Also ok with dropping the code change for 2.3, since it's a usability and not a functionality improvement.
docs/running-on-kubernetes.md
Outdated
`SPARK_EXTRA_CLASSPATH` environment variable in your Dockerfiles.
`SPARK_EXTRA_CLASSPATH` environment variable in your Dockerfiles. The `local://` scheme is also required when referring to
dependencies in custom-built Docker images in `spark-submit`. Note that using application dependencies local to the submission
client is currently not yet supported.
"application dependencies from the local file system"?
Done.
@@ -117,7 +117,10 @@ This URI is the location of the example jar that is already in the Docker image.
If your application's dependencies are all hosted in remote locations like HDFS or HTTP servers, they may be referred to
by their appropriate remote URIs. Also, application dependencies can be pre-mounted into custom-built Docker images.
Those dependencies can be added to the classpath by referencing them with `local://` URIs and/or setting the
`SPARK_EXTRA_CLASSPATH` environment variable in your Dockerfiles.
`SPARK_EXTRA_CLASSPATH` environment variable in your Dockerfiles. The `local://` scheme is also required when referring to
The point above already covers that local:// is needed with custom-built images.
Yes, but that's about adding to the classpath. I wanted to make it more specific and clearer.
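For readers following the docs discussion, the three kinds of dependency location mentioned in the diff (remote URIs, `local://` URIs inside the image, and files on the submission client) can be told apart by URI scheme. The following is only a sketch: the `DependencyLocation` object, its case objects, and `classify` are hypothetical names introduced for illustration and are not part of Spark, and treating a missing scheme as a client-local path is an assumption.

```scala
import java.net.URI

// Hypothetical helper, not part of Spark: illustrates how the URI scheme
// distinguishes the dependency locations described in the docs change.
object DependencyLocation {
  sealed trait Location
  case object InContainerImage extends Location // local:// URIs, pre-mounted in the image
  case object SubmissionClient extends Location // file:// URIs or bare paths (assumption)
  case object Remote extends Location           // any other scheme, e.g. hdfs:// or http://

  def classify(uri: String): Location =
    Option(new URI(uri).getScheme) match {
      case Some("local")       => InContainerImage
      case Some("file") | None => SubmissionClient
      case Some(_)             => Remote
    }
}
```

Under these assumptions, only the `SubmissionClient` case is the one this PR rejects at submission time.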
Kubernetes integration test status success
Test build #86354 has finished for PR 20320 at commit
Regarding manual tests, our integration tests cover
Kubernetes integration test starting
Test build #86357 has finished for PR 20320 at commit
Kubernetes integration test status success
@foxish Manual tests to verify that using
Question - does referencing a local file as the main app resource (/home/../spark-examples.jar, for example) need to be dealt with separately?
@@ -117,6 +117,13 @@ private[spark] class DriverConfigOrchestrator(
.map(_.split(","))
.getOrElse(Array.empty[String])

// TODO(SPARK-23153): remote once submission client local dependencies are supported.
s/remote/remove/
Done.
@@ -117,6 +117,13 @@ private[spark] class DriverConfigOrchestrator(
.map(_.split(","))
.getOrElse(Array.empty[String])

// TODO(SPARK-23153): remote once submission client local dependencies are supported.
if (existSubmissionLocalFiles(sparkJars) || existSubmissionLocalFiles(sparkFiles)) {
throw new SparkException("The Kubernetes mode does not yet support application " +
I'd shorten this to just "Kubernetes mode does not support referencing application dependencies in the local file system".
Done.
Actually main application jar gets added to
Suspected that it would use the same code path - might be worth a manual test of that as well.
The manual tests I did actually use a main app jar located on gcs and http. To be specific and for the record, I did the following tests:
Kubernetes integration test starting
Test build #86359 has finished for PR 20320 at commit
LGTM
Kubernetes integration test status success
Merging to master / 2.3.
## What changes were proposed in this pull request?

In the Kubernetes mode, fails fast in the submission process if any submission client local dependencies are used as the use case is not supported yet.

## How was this patch tested?

Unit tests, integration tests, and manual tests.

vanzin foxish

Author: Yinan Li <liyinan926@gmail.com>

Closes #20320 from liyinan926/master.

(cherry picked from commit 5d7c4ba)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
What changes were proposed in this pull request?
In Kubernetes mode, fail fast during submission if any dependencies local to the submission client are used, since that use case is not yet supported.
How was this patch tested?
Unit tests, integration tests, and manual tests.
@vanzin @foxish
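To make the fail-fast behavior concrete, here is a minimal, self-contained sketch of the kind of check the diff hunks above show. It is not the merged implementation: `SubmissionException` is a hypothetical stand-in for `SparkException`, `parseList` and `validate` are illustrative names, the rule that a `file` scheme or a missing scheme means "local to the submission client" is an assumption, and the error text follows the shortened wording suggested in review.

```scala
import java.net.URI

object SubmissionLocalFileCheck {
  // Hypothetical stand-in for SparkException so the sketch compiles on its own.
  final class SubmissionException(message: String) extends Exception(message)

  // Assumption: a URI with the "file" scheme, or with no scheme at all,
  // refers to a file on the machine running spark-submit.
  def existSubmissionLocalFiles(files: Seq[String]): Boolean =
    files.exists { uri =>
      Option(new URI(uri).getScheme).getOrElse("file") == "file"
    }

  // Splits a comma-separated config value, mirroring `.map(_.split(","))` in the diff.
  def parseList(value: Option[String]): Seq[String] =
    value.map(_.split(",").toSeq).getOrElse(Seq.empty[String]).map(_.trim).filter(_.nonEmpty)

  // Fails before any Kubernetes resources would be created.
  def validate(jars: Option[String], files: Option[String]): Unit = {
    if (existSubmissionLocalFiles(parseList(jars)) || existSubmissionLocalFiles(parseList(files))) {
      throw new SubmissionException(
        "Kubernetes mode does not support referencing application dependencies " +
          "in the local file system.")
    }
  }
}
```

With this sketch, `validate(Some("local:///opt/app.jar,hdfs://nn/dep.jar"), None)` returns normally, while `validate(Some("/home/user/dep.jar"), None)` throws, so the user sees the problem at submission time rather than after the driver pod starts.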