Use a pre-installed Minikube instance -- porting over logic from PR 521 #14

Merged (14 commits, Jan 12, 2018)

Conversation

ifilonenko (Member):

Porting logic from: apache-spark-on-k8s/spark#521

@ssuchter (Member) left a comment:

I think it looks good. I have some comments; take a look and fix them if you think they should be fixed.

Utils.tryWithSafeFinally {
val buf = new Array[Byte](1024)
var len = in.read(buf)
while (len != -1) {
Member:

This input-reading loop is typically written as an infinite loop with a break inside, so there aren't two read calls in the code.

Not that it's too important to fix, but have you considered an approach like this so it can use breaks?

https://alvinalexander.com/scala/break-continue-for-while-loops-in-scala-examples-how-to
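For illustration, the suggested restructuring might look like the following sketch (the helper name `copyStream` is hypothetical; `scala.util.control.Breaks` supplies the `breakable`/`break` pair the linked article describes):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, OutputStream}

import scala.util.control.Breaks.{break, breakable}

// Hypothetical sketch: a single read call inside an "infinite" loop,
// exited via break when the stream is exhausted.
def copyStream(in: InputStream, out: OutputStream): Unit = {
  val buf = new Array[Byte](1024)
  breakable {
    while (true) {
      val len = in.read(buf)
      if (len == -1) break()
      out.write(buf, 0, len)
    }
  }
}

val sink = new ByteArrayOutputStream()
copyStream(new ByteArrayInputStream("hello".getBytes("UTF-8")), sink)
```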

Member Author:

Ditto to above ^^

scala.util.control.Exception.ignoring(classOf[IOException]) {
// FIXME: We copy the stream on the level of bytes to avoid encoding problems.
Utils.tryWithSafeFinally {
val buf = new Array[Byte](1024)
Member:

Should this line 99-105 go into its own subroutine?

Member:

As an alternative, do we have Guava available? Could we just use ByteStreams.copy() instead of the entire body here?
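If Guava is on the classpath, the whole loop does collapse to a single call; a minimal sketch, assuming Guava is available as a dependency:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}

import com.google.common.io.ByteStreams

val in = new ByteArrayInputStream("some bytes".getBytes("UTF-8"))
val out = new ByteArrayOutputStream()

// ByteStreams.copy reads until EOF and returns the number of bytes copied.
// It does not close either stream, so a tryWithSafeFinally-style wrapper
// around the streams would still be useful.
val copied = ByteStreams.copy(in, out)
```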

Member Author:

These Utils are taken from Spark core... should I modify them?

Member:

I'd argue that the cleanest modification (using Guava) is something we should do. There's no reason to replicate this code from the Spark core.

Contributor:

I don't see this class being used anywhere anymore actually, so we can remove this.

def buildSparkDockerImages(): Unit = {
Eventually.eventually(TIMEOUT, INTERVAL) { dockerClient.ping() }
// Building Python distribution environment
val pythonExec = sys.env.get("PYSPARK_DRIVER_PYTHON")
Member:

Maybe we should log which Python gets selected? I can imagine debugging sessions where it's hard to tell what is actually running because of this automatic switching behavior.

.orElse(sys.env.get("PYSPARK_PYTHON"))
.getOrElse("/usr/bin/python")
val builder = new ProcessBuilder(
Seq(pythonExec, "setup.py", "sdist").asJava)
Member:

Should the path to this setup.py be specified more carefully? It seems like the code above determines the dockerfiles' location more precisely, but maybe I'm missing something there. This just resolves relative to the current directory.
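Both points raised here (logging the selected interpreter, and not relying on the current directory) could be addressed with something like the following sketch; `resolvePython` is a hypothetical helper that mirrors the fallback chain in the code above:

```scala
// Hypothetical helper mirroring the PYSPARK_DRIVER_PYTHON -> PYSPARK_PYTHON
// -> /usr/bin/python fallback chain, factored out so the choice can be
// logged and unit-tested.
def resolvePython(env: Map[String, String]): String =
  env.get("PYSPARK_DRIVER_PYTHON")
    .orElse(env.get("PYSPARK_PYTHON"))
    .getOrElse("/usr/bin/python")

val pythonExec = resolvePython(sys.env)
println(s"Building Python distribution with interpreter: $pythonExec")

// For the setup.py concern, ProcessBuilder#directory can pin the working
// directory explicitly instead of inheriting the caller's CWD.
```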

@ifilonenko (Member Author) commented Dec 30, 2017:

The problem I am having is this:

com.spotify.docker.client.exceptions.DockerException: 
ProgressMessage{id=null, status=null, stream=null, 
error=lstat dockerfiles/spark-base/entrypoint.sh: no such file or directory, 
progress=null, progressDetail=null}

I have no idea where this is stemming from, as the same logic is used in the master branch at the moment.

@ifilonenko (Member Author):

I noticed you guys modified the target Dockerfile to have this line instead:
COPY kubernetes/dockerfiles/spark-base/entrypoint.sh /opt/
which distribution are you pointing to?

@ifilonenko (Member Author):

Above problems resolved when pulling from spark/master.

@kimoonkim (Member) left a comment:

First batch of comments. PTAL.

README.md Outdated
Running the integration tests requires a Spark distribution package tarball that
contains Spark jars, submission clients, etc. You can download a tarball from
http://spark.apache.org/downloads.html. Or, you can create a distribution from
source code using `make-distribution.sh`. For example:

```
$ git clone git@github.com:apache/spark.git
$ https://github.com/apache/spark.git
Member:

What does this line mean only with a URL?

README.md Outdated
$ mvn clean integration-test \
-Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
-DextraScalaTestArgs=-Dspark.docker.test.persistMinikube=true
-DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://<master>
Member:

Is the master parameter now required? If so, should we explain how people can get it from Minikube?

Contributor:

kubectl cluster-info is already discussed above.
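For readers of this thread: assuming a running Minikube, the master URL can be read off as follows (exact addresses will differ per machine, and 8443 is only Minikube's usual default API server port):

```shell
# Prints the cluster master URL, e.g. https://192.168.99.100:8443
kubectl cluster-info

# Or build the test property value from the VM address directly:
echo "k8s://https://$(minikube ip):8443"
```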

Contributor:

Master parameter shouldn't be required.

Contributor:

...But this section is under the "running against an arbitrary cluster" section, so in this context the parameter would be required.

@@ -143,6 +152,10 @@ private[spark] class KubernetesSuite extends FunSuite with BeforeAndAfterAll wit
}
}
}
private def doMinikubeCheck(): Unit = {
assume(testBackend == MinikubeTestBackend)
Member:

Hmm. I remember @foxish just deleted this line recently so integration tests can run against GCE. Can you check with @foxish?

@mccheah (Contributor), Jan 8, 2018:

(Comment was for the wrong section)

private val MINIKUBE_STARTUP_TIMEOUT_SECONDS = 60

// NOTE: This and the following methods are synchronized to prevent deleteMinikube from
Member:

We are deleting this note. Maybe we don't need "synchronized" any more. Kill "synchronized" below?

@@ -37,5 +37,7 @@ private[spark] class GCETestBackend(val master: String) extends IntegrationTestB
defaultClient
}

override def name(): String = GCE_TEST_BACKEND
override def dockerImageTag(): String = {
return System.getProperty(KUBERNETES_TEST_DOCKER_TAG_SYSTEM_PROPERTY, "latest")
Member:

Why not generate a random ID like minikube backend code does? i.e. UUID.randomUUID().toString.replaceAll("-", "")

Contributor:

In the Minikube case we're building these images from scratch. In the GCE case, we don't create a Docker manager and hence are not building the images there. But this in itself seems to contradict this section of our readme:

If you're using a non-local cluster, you must provide an image repository which you have write access to, using the -i option, in order to store docker images generated during the test.

which indicates that GCE-backed tests should be building images as well. Is this correct @foxish?

Member:

That readme section is meant to highlight that we push the images to an image repository only in the cloud testing case, and don't have to in the minikube case since the images are built in the minikube VM's docker environment. That documentation pertains only to the use of the script, which avoids using maven for building images.

Contributor:

The problem then with using a random ID tag here is that it's impossible for this tag to actually match anything. Using "latest" at least guarantees that we pick up some image in the default case.

We can be more strict here and require the tag be explicitly specified.

Contributor:

Looking a little closer I think the miscommunication is because the docker image manager isn't serving the image tag but is instead being handed the tag by the test backend. The responsibilities thus aren't clear and the coupling of the provision of a custom tag vs. a generated tag, and how that influences whether or not images are built or deleted, is unclear.

I'm moving the generation of the tag vs. using the user-provided one into the docker manager. This should hopefully clarify the connection.

import org.apache.spark.deploy.k8s.integrationtest.Logging
import org.apache.spark.deploy.k8s.integrationtest.Utils.{RedirectThread, tryWithResource}

private[spark] class KubernetesSuiteDockerManager(
Member:

Can you make git associate SparkDockerImageBuilder.scala with this file so we can see the change side by side and also keep the history? Maybe find how-to at https://stackoverflow.com/questions/433111/how-to-make-git-mark-a-deleted-and-a-new-file-as-a-file-move
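The tricks in that post boil down to git's rename detection, which is similarity-based rather than recorded. A self-contained demo in a throwaway repo (file names here are illustrative):

```shell
# Git detects renames by content similarity at diff time rather than
# recording them. Demo: --follow walks history across a rename.
tmp=$(mktemp -d) && cd "$tmp" && git init -q .
git config user.email demo@example.com && git config user.name demo
echo 'class Old' > SparkDockerImageBuilder.scala
git add . && git commit -qm 'original file'
git mv SparkDockerImageBuilder.scala KubernetesSuiteDockerManager.scala
git commit -qm 'rename'

# Shows both commits, following the file across the rename:
git log --follow --oneline -- KubernetesSuiteDockerManager.scala

# Lowering the similarity threshold helps when the rename includes big edits:
git diff -M50% HEAD~1 HEAD --stat
```

When the rename commit also rewrites most of the file, similarity drops below the threshold and git falls back to reporting a delete plus an add, which matches the behavior described below.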

Contributor:

I tried most of the suggestions in that stack overflow post to no avail, unfortunately, as the diff against master will still treat the move as a remove and add operation. We can make the modification a separate commit from the rename, and then even if github UI doesn't show the diff in the main PR view, one can navigate to the specific commit and view the diff from there. Is that a reasonable compromise?

Contributor:

@kimoonkim I rewrote some of the git history and 7470472 is the specific diff for this file. I suppose git is always treating it as removing the whole thing and adding it as a new file because of the large number of changes that were actually made here.

Member:

Ok. Sounds good enough. We can take a look at the commit. Thanks!

@kimoonkim (Member):

rerun integration tests please

@kimoonkim (Member):

I have two high level design questions. Perhaps @ifilonenko @mccheah, @ssuchter or @foxish know the answers.

  1. Are we targeting to run multiple integration tests simultaneously using a single shared minikube instance? Then what is the mechanism to cap the max number of tests so the minikube doesn't blow up?

  2. When we share a single minikube instance, it's important for each test to clean up when it finishes. Like delete all images, pods, secrets, etc. What is our story against poorly written tests that failed to clean up? How do we ensure the leaked resources over time do not make all other tests fail later?

@foxish (Member) commented Jan 5, 2018 via email.

@kimoonkim (Member):

I like the namespace story. I don't think dockerd does image GC (I could be wrong here); we'd need to follow up on that. I wonder if we want to move the namespace/Docker clean-up part to non-Scala code so that there is less chance for buggy Scala code in a PR to screw it up, but we can delay this until it becomes a real problem.

@mccheah (Contributor) commented Jan 9, 2018:

@ifilonenko heads up, I force pushed and rewrote some of the git history to address #14 (comment). (In retrospect I should have done git rm and git revert, etc. to only push onto the history stack though)

@kimoonkim (Member):

  1. Are we targeting to run multiple integration tests simultaneously using a single shared minikube instance? Then what is the mechanism to cap the max number of tests so the minikube doesn't blow up?

I wonder if we can use namespace resource quota for this. @foxish Does k8s reject a new namespace add request if the cluster would run out of resource by accepting the namespace's max resources? If yes, then the shared minikube can limit the max number of test namespaces, which in turn limit the max number of simultaneous tests?
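For concreteness, the quota idea would hang a `ResourceQuota` off each test namespace, roughly like this sketch (names and limits are made up). One caveat: Kubernetes quotas cap usage within a namespace, and the API server does not reject namespace creation based on cluster capacity, so a quota alone does not cap the number of concurrent test namespaces:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: spark-test-quota
  namespace: spark-test-1   # hypothetical per-test namespace
spec:
  hard:
    pods: "8"
    requests.cpu: "4"
    requests.memory: 8Gi
```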

@kimoonkim changed the title from "Porting over logic from PR 521" to "Use a pre-installed Minikube instance -- porting over logic from PR 521" on Jan 9, 2018.
@kimoonkim (Member) left a comment:

LGTM. Thanks for the change!

Let's delay merging until we resolve Jenkins setup issues somehow.

@ssuchter (Member) commented Jan 9, 2018 via email.

@ifilonenko (Member Author) commented Jan 10, 2018:

This LGTM. Thanks for the work @mccheah

@ssuchter (Member):

Kimoon brought up the question to me of how, when we (eventually) have a persistent Minikube, we will control the number of tests running against it simultaneously. E.g. maybe it can only hold 4 tests (or 8, or whatever). It turns out the Jenkins Lockable Resources plugin supports this: rather than just a single resource that gets locked, you can have N resources and request M of them for one test (e.g. N=4, M=1), and beyond that, builds will block. It's like a counting semaphore, more than just a mutex.
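A hypothetical Jenkinsfile fragment using the plugin's counting-semaphore behavior; the label `minikube-slot` is an assumed resource label that would be configured with N lockable resources on the Jenkins side:

```groovy
pipeline {
  agent any
  stages {
    stage('k8s-integration-test') {
      steps {
        // Blocks until 1 of the N resources labeled 'minikube-slot' is free.
        lock(label: 'minikube-slot', quantity: 1) {
          sh 'mvn clean integration-test'
        }
      }
    }
  }
}
```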

@kimoonkim (Member):

Right. The question I asked earlier is:

  1. Are we targeting to run multiple integration tests simultaneously using a single shared minikube instance? Then what is the mechanism to cap the max number of tests so the minikube doesn't blow up?

So we have two potential answers:

A. Use Jenkins lockable resource plugin

E.g. maybe it can only hold 4 tests (or 8 or whatever). Turns out the Jenkins lockable resource plugin supports this

B. Use namespace resource quota

I wonder if we can use namespace resource quota for this. @foxish Does k8s reject a new namespace add request if the cluster would run out of resource by accepting the namespace's max resources? If yes, then the shared minikube can limit the max number of test namespaces, which in turn limit the max number of simultaneous tests?

Which one do we like more?

@mccheah (Contributor) commented Jan 10, 2018:

In the SIG meeting on 01/10/2018 we decided that we can use the locking plugin with a certain count of concurrently runnable builds. We should be able to ensure that whatever build runs, that build can get resources to run tests and not worry about timing out. For now we can leave the number of concurrent builds as 1 and experiment with what happens as we scale up moving forward.

I think we agreed that we want this to go in. Can we merge this, and then install Minikube on the Pepperdata Jenkins that runs periodically with this?

@ssuchter (Member):

I'm fine with that plan.

@ssuchter (Member):

I should mention, in response to @kimoonkim's different options, that I prefer using the Jenkins plugin (or another job-locking mechanism until the Jenkins plugin is available) over the namespace-limit solution. I like the simplicity of the integration test code always expecting that its submissions to Kubernetes will succeed, and if they fail, the entire integration test should fail. If we used the namespace-limit solution, we'd have to write code to detect that the Kubernetes submissions failed for that particular reason, in order to avoid spurious failures of the integration test system due to an unfortunate co-occurrence of builds.

Doing both limiting methods would avoid spurious failures, but I'm not sure of the value of having the namespace limit if we have locks beforehand.

@kimoonkim (Member):

I don't have a strong preference either way. But there is a longer-term downside to using the Lockable Resources Jenkins plugin. The resource needs of the integration tests will change over time as we add more helper components like RSS, HDFS, the token refresh server, etc. We would have to reconfigure the lock plugin whenever we find it has underestimated the resource requirement. A more ominous case is when a new PR or two adds those extra helper components: suddenly not every test run needs the same amount of resources, and it's hard to express that using the lock plugin.

But I think we can set aside these concerns for now in pursuit of simplicity. And revisit other options if and when we actually face these issues.

I am personally more worried about the resource clean-up issue. I understand we can rely on pods being removed when the enclosing namespace gets deleted. And this PR cleans up Docker images in the Scala code. But a new, poorly written PR can easily disable the clean-up Scala code, so it won't be foolproof. I wonder if we need a more forceful clean-up mechanism sitting outside the Scala code.

  1. When we share a single minikube instance, it's important for each test to clean up when it finishes. Like delete all images, pods, secrets, etc. What is our story against poorly written tests that failed to clean up? How do we ensure the leaked resources over time do not make all other tests fail later?

@mccheah (Contributor) commented Jan 11, 2018:

We should probably have scripts that deal with the Docker cleanup outside of the Scala code. There's still some messiness related to the fact that we're building images in two places - see https://github.com/apache-spark-on-k8s/spark-integration/issues/30. Basically the Scala code should be best-effort but the scripts should make the final guarantees.

@mccheah (Contributor) commented Jan 12, 2018:

Are we good to merge this? I'm doing some refactors that will heavily depend on both this and #29.

@ssuchter (Member):

I'm good to go.

@foxish (Member) commented Jan 12, 2018:

Saw that the docker image management moves into the scripts in #31. This LGTM. Thanks Matt & Ilan.

@foxish foxish merged commit 9051b55 into apache-spark-on-k8s:master Jan 12, 2018