[SPARK-22839][K8S] Remove the use of init-container for downloading remote dependencies #20669
Conversation
Kubernetes integration test starting
retest this please
Kubernetes integration test status failure
Kubernetes integration test starting
retest this please
Kubernetes integration test status failure
Kubernetes integration test starting
Kubernetes integration test status failure
My impression was that we did reach a consensus that this was the right idea from the mailing list discussion. Specifically http://apache-spark-developers-list.1001551.n3.nabble.com/Kubernetes-why-use-init-containers-tp23113p23140.html. We want to get this in before we start refactoring because it reduces the size of the code that actually needs to be refactored by a sizable amount.
If that's the consensus, then great! I thought some people still wanted to keep the init container around for some reason.
+1, the benefits outweigh the desire to be k8s-like in the approach. We should have tests that validate that no behavior changes though. I'm wondering if we should move the integration tests into the repo first, before indulging in any large scale refactors.
Kubernetes integration test starting
Kubernetes integration test status failure
I think tests are failing because you're missing changes to run the driver through spark-submit in the entry point script:
…d by resolving the location of the downloaded file
Kubernetes integration test starting
Kubernetes integration test status failure
Kubernetes integration test starting
Kubernetes integration test status failure
@ifilonenko you can't just remove the init container without replacing its functionality. In my patch that was done by using spark-submit to run the driver. But here you have nothing. That's why the tests are failing.
@vanzin you are definitely right and that is what needs to happen. However, one of the functions of the init container was resolving the file location, which atm isn't supported unless the spark-application that is testing this service uses
In the error logs from jenkins, the path of the file ("/var/spark-data/spark-files/pagerank_data.txt") seems to be provided by the test code as part of running spark-submit. That seems wrong, since the test code doesn't really have control of where files will show up. If the file is already in the image somehow, then it's probably a case of the test and the image not agreeing about what the path is. If the file is uploaded using
BTW I wonder why we are still relying on integration tests that do not live in the Spark repo...
It's just because we haven't PR'd them yet. There are some cleanups that I wanted to do before doing that.
@@ -89,26 +56,16 @@ private[spark] object KubernetesUtils {
  def getOnlyRemoteFiles(uris: Iterable[String]): Iterable[String] = {
    uris.filter { uri =>
      val scheme = Utils.resolveURI(uri).getScheme
      scheme != "file" && scheme != "local"
      scheme != "local"
Actually is this function still needed?
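For illustration, a small standalone sketch of what the updated filter keeps and drops (the URIs below are hypothetical; `Utils.resolveURI` is the Spark helper already used in the diff above):

```scala
import org.apache.spark.util.Utils

// Hypothetical inputs: only `local://` URIs (files already baked into the
// image) are filtered out; file:// and http(s):// URIs are treated as remote.
val uris = Seq(
  "local:///opt/spark/examples/jars/examples.jar",
  "file:///tmp/app.jar",
  "https://example.com/dep.jar")
val remote = uris.filter { uri =>
  val scheme = Utils.resolveURI(uri).getScheme
  scheme != "local"
}
// remote == Seq("file:///tmp/app.jar", "https://example.com/dep.jar")
```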
@@ -202,6 +221,10 @@ private[spark] class KubernetesClientApplication extends SparkApplication {
    val launchTime = System.currentTimeMillis()
    val waitForAppCompletion = sparkConf.get(WAIT_FOR_APP_COMPLETION)
    val appName = sparkConf.getOption("spark.app.name").getOrElse("spark")
    val kubernetesResourceNamePrefix = {
      val uuid = UUID.nameUUIDFromBytes(Longs.toByteArray(launchTime)).toString.replaceAll("-", "")
Using a uuid with `-` removed is what we already do.
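For reference, a minimal sketch of the dash-free UUID generation shown in the diff above (combining it with the app name into the final prefix is an assumption for illustration, not taken from this diff):

```scala
import java.util.UUID

import com.google.common.primitives.Longs

val launchTime = System.currentTimeMillis()
// Deterministic UUID derived from the launch time, with the dashes stripped.
val uuid = UUID.nameUUIDFromBytes(Longs.toByteArray(launchTime))
  .toString
  .replaceAll("-", "")
// Hypothetical composition of the resource name prefix, for illustration only.
val appName = "spark"
val kubernetesResourceNamePrefix = s"$appName-$uuid"
```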
.addAllToEnv(driverJavaOptsEnvs.asJava)
.addNewEnv()
  .withName(SPARK_CONF_DIR_ENV)
  .withValue(SPARK_CONF_PATH)
I have one concern with using a different path for `SPARK_CONF_DIR` than `/opt/spark/conf`, which is the one under the installation path. Because the environment variable `SPARK_CONF_DIR` is set, people who want to customize the image, e.g., using a custom `spark-env.sh`, have to know to put the file into this custom path instead of `/opt/spark/conf`. Why don't we use `/opt/spark/conf` instead?
I am flexible on that; anyone else have thoughts?
I'm unsure of what the behavior is if we try to mount to a directory that already exists on the image itself.
Then I would suggest removing `COPY conf /opt/spark/conf` from the Dockerfile, as it has no effect anyway if `SPARK_CONF_DIR` points somewhere else. Then set `SPARK_CONF_DIR` to point to `/opt/spark/conf`.
+1 for this.
Do the executors require a SPARK_CONF_DIR directory to be defined as well?
Per the discussion on Slack, please remove `COPY conf /opt/spark/conf` in the Dockerfile, put the properties file under `/opt/spark/conf`, and then unset `SPARK_CONF_DIR` as it defaults to `/opt/spark/conf` anyway.
if (masterURL.startsWith("k8s") && sc.deployMode == "client") {
if (masterURL.startsWith("k8s") &&
  sc.deployMode == "client" &&
  !sc.conf.contains(KUBERNETES_EXECUTOR_POD_NAME_PREFIX)) {
Is this sufficient to prevent end users from using client mode? What about adding a special key when calling `spark-submit` in the driver and testing that key here instead?
I believe that logic might be beyond the scope of this PR. But I could add that if it seems appropriate.
I think it's safer to have a new internal config key that is only used for this purpose. I'm not sure whether checking the presence of `KUBERNETES_EXECUTOR_POD_NAME_PREFIX` is sufficient, depending on whether the submission client code is called in client mode.
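A rough sketch of what that check could look like with a dedicated internal flag (the flag name matches the `spark.kubernetes.submitInDriver` config that appears in a later diff in this PR; the surrounding snippet and the exception message are assumptions):

```scala
// Hypothetical driver-side guard: accept "client" deploy mode only when the
// submission client explicitly set the internal flag before launching
// spark-submit inside the driver pod, instead of inferring it from the
// executor pod name prefix.
if (masterURL.startsWith("k8s") &&
    sc.deployMode == "client" &&
    !sc.conf.get(KUBERNETES_DRIVER_SUBMIT_CHECK).getOrElse(false)) {
  throw new SparkException("Client mode is currently not supported for Kubernetes.")
}
```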
Kubernetes integration test starting
Kubernetes integration test status failure
import org.apache.spark.sql.SparkSession

/** Usage: SparkRemoteFileTest [file] */
object SparkRemoteFileTest {
Hm, what is the purpose of this specific example?
To test the presence of a remote file being mounted on the executors via the spark-submit run by the driver. Should I add a Javadoc? `HDFSTest.scala` didn't include one, but I can if necessary.
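For context, a minimal sketch of what such an example might look like (details are illustrative and may differ from the file actually added in this PR): each executor task resolves the distributed file via `SparkFiles.get` and reports whether it exists locally.

```scala
import java.io.File

import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession

/** Usage: SparkRemoteFileTest [file] */
object SparkRemoteFileTest {
  def main(args: Array[String]): Unit = {
    if (args.length < 1) {
      System.err.println("Usage: SparkRemoteFileTest <file>")
      System.exit(1)
    }
    val spark = SparkSession.builder()
      .appName("SparkRemoteFileTest")
      .getOrCreate()
    val sc = spark.sparkContext
    // Run a single task so the check happens on an executor, exercising the
    // spark-submit-driven file distribution instead of an init-container.
    val rdd = sc.parallelize(Seq(1)).map { _ =>
      val location = SparkFiles.get(args(0))
      println(s"${args(0)} is stored at: $location")
      new File(location).isFile
    }
    val fileExists = rdd.collect().head
    println(s"Mounting of ${args(0)} was $fileExists")
    spark.stop()
  }
}
```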
.withName(configMapName)
.withNamespace(namespace)
.endMetadata()
.addToData(SPARK_CONF_FILE_NAME, propertiesWriter.toString + driverJavaOps)
You want to write two config map data items:
- one data item with the properties file name,
- one data item with the JVM options.
In other words, I'd expect two calls to `addToData` here with the appropriate item names corresponding to the file names.
We should probably also capture this with an integration test. Some test that has a call to `System.getProperty(property)` with a property that isn't a Spark-specific option should suffice.
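A tiny sketch of the kind of assertion such a test could make (the property name and expected value are hypothetical): pass a non-Spark JVM option like `-Dcustom.test.key=expected` through `spark.driver.extraJavaOptions` and verify that the driver JVM sees it.

```scala
// Hypothetical check inside a test application run by the driver.
val value = System.getProperty("custom.test.key")
assert(value == "expected", s"Expected 'expected' for custom.test.key but got '$value'")
```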
> One data item with the JVM options.

Why? spark-submit handles that already.
This doesn't put `spark.driver.extraJavaOptions` on the launch of the spark-submit command in client mode, I think.
Why does it have to be on the command line and not in the properties file?
My understanding is that for `spark-submit` to pick it up, you want a `jvm-opts` file in the Spark configuration directory. This is specifically for JVM options that aren't `spark.*` options, e.g. GC, out-of-memory settings, YourKit agent loading.
You really should try this out to see that it works the way I've been saying since the beginning. Just let spark-submit do its thing.
Ah, I think I was misreading the code here - just putting `spark.driver.extraJavaOptions` in the properties works. See spark/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java, line 268 in cfcd746: `addOptionString(cmd, driverExtraJavaOptions);`
Sorry about that - @ifilonenko I think just adding `spark.driver.extraJavaOptions` as-is to the properties file will suffice.
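A small sketch of the resolved approach, using the fabric8 `ConfigMapBuilder` calls already visible in the diff above (the literal names and property values here are illustrative): the driver JVM options travel as `spark.driver.extraJavaOptions` inside the single properties file, rather than as a second data item or an extra command-line flag.

```scala
import java.io.StringWriter
import java.util.Properties

import io.fabric8.kubernetes.api.model.ConfigMapBuilder

// Illustrative values; the real code derives these from the submission conf.
val configMapName = "spark-driver-conf-map"
val namespace = "default"
val sparkConfFileName = "spark.properties"

val props = new Properties()
props.setProperty("spark.master", "k8s://https://kubernetes.default.svc")
// JVM options ride along in the properties file; spark-submit applies them.
props.setProperty("spark.driver.extraJavaOptions", "-XX:+UseG1GC")

val propertiesWriter = new StringWriter()
props.store(propertiesWriter, "Spark driver configuration")

// A single data item keyed by the properties file name is sufficient.
val configMap = new ConfigMapBuilder()
  .withNewMetadata()
    .withName(configMapName)
    .withNamespace(namespace)
    .endMetadata()
  .addToData(sparkConfFileName, propertiesWriter.toString)
  .build()
```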
Results from integration testing:
Kubernetes integration test starting
Kubernetes integration test status failure
An open question for us to consider, otherwise looks fine.
@@ -79,6 +79,12 @@ private[spark] object Config extends Logging {
    .stringConf
    .createOptional

  val KUBERNETES_DRIVER_SUBMIT_CHECK =
    ConfigBuilder("spark.kubernetes.submitInDriver")
      .internal()
Nit: The indentation looks slightly off here.
"$SPARK_HOME/bin/spark-submit" | ||
--conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" | ||
--deploy-mode client | ||
"$@" |
There's an argument that can be made for enforcing what arguments can be passed here. For example, we can instead have the following command:

```bash
"$SPARK_HOME/bin/spark-submit"
  --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
  --deploy-mode client
  --properties-file $SPARK_DRIVER_PROPERTIES_FILE
  --class $SPARK_DRIVER_MAIN_CLASS
  spark-internal
```
And then instead of passing args to the Kubernetes container command, we pass everything as environment variables. I think this is clearer and makes the contract more obvious, in terms of what this driver container is expecting as input. Thoughts @vanzin @ifilonenko?
What's the problem with passing arguments as arguments? I really dislike using environment variables in general, but this would be an especially nasty use for them. And I'm not sure how they make the contract more obvious.
If one is going to override the docker image and provide their own implementation, the customizing developer would like to know all of the parameters that they are being passed up-front. Receiving `"$@"` makes it less obvious that we should be expecting a properties file and the `SparkLauncher.NO_RESOURCE` from spark-submit.
We could do argument parsing here to verify that `--properties-file` etc. are explicitly provided.
+1 on explicit envs.
> If one is going to override the docker image and provide their own implementation

There is no documentation about that, so how about defining what that contract is supposed to be first? Otherwise we're all just talking about hypotheticals.
My current understanding is that regardless of what's in the image, `entrypoint.sh` defines the entry point's contract, so this is an internal decision between spark-submit code and that shell script.
We've seen some users override the entry point to do a bunch of extra bootstrapping before calling into the actual command to run the driver in custom images. I vaguely recall something related to Openshift doing this - @erikerlandson.
The point is, even if `entrypoint.sh` defines the contract, what arguments would a custom implementation pass it? Something like this would fail because we didn't pass the appropriate arguments to the entrypoint:

```bash
#!/bin/bash
# my-entrypoint.sh
source /opt/my-custom.script.sh
/opt/spark/entrypoint.sh # Notice we don't pass arguments
```

So the contract may be that whatever arguments the user's entrypoint is passed, they must eventually call `entrypoint.sh` with all of those arguments forwarded through. That might be an acceptable contract. With environment variables we can say that it doesn't matter what arguments the downstream entrypoint is passed, just that you eventually need to run that code, and it knows what to do if the application submission code set up the container's environment properly.
Having defined these ideas more precisely, I think either contract is acceptable here, with a slight preference towards having Spark interpret the environment rather than the arguments.
> The point is, even if entrypoint.sh defines the contract, what arguments would a custom implementation pass it?

`entrypoint.sh` doesn't really define the contract, it implements the contract. Basically the entry point has to respect whatever the Spark submit code tells it to do.
So define the contract first. If you want users to be able to write their own custom entry points, then document that contract in a public doc, so that it cannot change in future versions of Spark. That was the main reason (well, one of them) why I asked this whole thing to be marked "experimental" in 2.3 - nothing is defined, but there seems to be a lot of tribal knowledge about what people want to do.
That doesn't really work in the long run.
I think that makes sense re: defining specifically what a custom implementation has to do. We can follow up on that separately. I also agree that we haven't been too precise about what a custom implementation would look like. There are simple things like adding or modifying the existing content, but in terms of modifying logic we haven't given that enough thought.
For now I think the code we have here will suffice for the immediate need of removing the init-container, and we can follow up later on which path we want to take in this part of the discussion.
As @mccheah mentioned, I included some logic in the current entrypoint.sh to allow Spark to work in cases such as an anonymous uid (another way to look at it is managing Spark's long-standing quirk of failing when it can't find a passwd entry). Putting it in entrypoint.sh was a good way to make sure this happened regardless of how the actual CMD evolved. A kind of defensive future-proofing, which is important for reference Dockerfiles. It also provides execution via `tini` as "pid 1", which is considered good standard practice.
All of this is done in part with the expectation that most users are liable to just want to customize their images by building the reference Dockerfiles and using those as base images for their own, without modifying the CMD or entrypoint.sh.
That said, I think that in terms of formally documenting a container API, entrypoint.sh may be a red herring. In theory, a user should be able to build their own custom container from the ground up, up to and including a different entrypoint, or default entrypoint, etc.
Part of the reason we went with an externalized CMD (instead of creating one in the backend code) was to allow maximum flexibility in how images were constructed. The back-end provides certain information to the pod. The "API" is a catalogue of this information, combined with any behaviors that the user's container must implement. The API doc shouldn't assume the existence of entrypoint.sh.
LGTM.
@vanzin any other feedback before merging this?
I probably won't have time to review this carefully, so if you're happy with it, don't wait for me.
I think someone with permissions to merge has to do so here.
@mccheah you're a committer...
Merge button doesn't appear for me in the UI =( will need to look into that.
@mccheah workflow is to use
@mccheah you should have gotten an e-mail from Matei explaining the basics of how to merge PRs.
There's a section explaining it at the bottom of https://spark.apache.org/committers.html
Thanks - merging shortly.
What changes were proposed in this pull request?
Removal of the init-container for downloading remote dependencies. Built off of the work done by @vanzin in an attempt to refactor driver/executor configuration elaborated in this ticket.
How was this patch tested?
This patch was tested with unit and integration tests.