
[SPARK-25647][k8s] Add spark streaming compatibility suite for kubernetes. #22639

Closed
wants to merge 4 commits

Conversation

ScrapCodes (Member)

What changes were proposed in this pull request?

Adds integration tests for Spark Streaming compatibility with Kubernetes (K8s) mode.

How was this patch tested?

By running the test suites.

@@ -120,7 +120,7 @@ private[spark] object SparkAppLauncher extends Logging {
appConf.toStringArray :+ appArguments.mainAppResource

if (appArguments.appArgs.nonEmpty) {
- commandLine += appArguments.appArgs.mkString(" ")
+ commandLine ++= appArguments.appArgs
Member Author:

This is the difference between one space-separated argument and multiple distinct arguments. With .mkString(" "), multiple app arguments are collapsed into a single space-separated argument, so ++= is used to pass them individually.
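For illustration, a minimal standalone sketch of the difference (the names here are hypothetical, not from the patch):

```scala
import scala.collection.mutable.ArrayBuffer

object AppArgsExample {
  def main(args: Array[String]): Unit = {
    val appArgs = Seq("localhost", "9999")

    // mkString collapses both values into the single argument "localhost 9999".
    val joined = ArrayBuffer("run-example")
    joined += appArgs.mkString(" ")
    println(joined.length) // 2: the program name plus one combined argument

    // ++= appends each value as its own argument: "localhost" and "9999".
    val separate = ArrayBuffer("run-example")
    separate ++= appArgs
    println(separate.length) // 3: the program name plus two separate arguments
  }
}
```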

@SparkQA commented Oct 5, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3709/

@SparkQA commented Oct 5, 2018

Test build #96987 has finished for PR 22639 at commit d2230fc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 5, 2018

Test build #96989 has finished for PR 22639 at commit 29a8e2e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 5, 2018

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3711/

@ScrapCodes (Member Author)

Jenkins, retest this please

@SparkQA commented Oct 5, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3713/

@SparkQA commented Oct 5, 2018

Test build #96994 has finished for PR 22639 at commit 29a8e2e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung (Member)

@liyinan926 @mccheah

@mccheah (Contributor) left a comment

I think a lot of this could be cleaner if we deployed in cluster mode. Also deploying a separate pod to run the socket server with the extra connection for the word count to read from would be cleaner in terms of reasoning about resource management and not leaking sockets.


k8sSuite: KubernetesSuite =>

private def startSocketServer(): (String, Int, ServerSocket) = {
Contributor:

Is this the submission client starting a server, and then the pod needs to be able to connect to the submission client host and port? An alternative is to deploy a separate pod that does this so that network communications are pod-to-pod instead of pod-to-host.

Contributor:

If we do it that way it's a lot easier to clear the resources and avoid trouble with sockets hanging open on the Jenkins bare metal host, for example we can just delete the whole server pod.
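As a very rough, untested sketch of that suggestion, a standalone socket-server pod could be created with the fabric8 client along these lines (the pod name, image, command, and port are all assumptions, not part of this PR):

```scala
import io.fabric8.kubernetes.api.model.{Pod, PodBuilder}
import io.fabric8.kubernetes.client.KubernetesClient

object SocketServerPod {
  // Hypothetical helper: runs a tiny text server in its own pod so the streaming
  // job reads pod-to-pod, and cleanup is simply deleting the pod.
  def create(client: KubernetesClient, namespace: String): Pod = {
    val pod = new PodBuilder()
      .withNewMetadata()
        .withName("streaming-socket-server") // assumed name
        .addToLabels("spark-test", "socket-server")
      .endMetadata()
      .withNewSpec()
        .addNewContainer()
          .withName("socket-server")
          .withImage("busybox") // assumed image; nc options vary between images
          .withCommand("sh", "-c",
            "while true; do echo 'spark-streaming-kube test.'; sleep 1; done | nc -lk -p 9999")
          .addNewPort().withContainerPort(9999).endPort()
        .endContainer()
      .endSpec()
      .build()
    client.pods().inNamespace(namespace).create(pod)
  }

  // Tearing down the server is just deleting the pod, which avoids leaked sockets
  // on the Jenkins host.
  def delete(client: KubernetesClient, namespace: String): Unit = {
    client.pods().inNamespace(namespace).withName("streaming-socket-server").delete()
  }
}
```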

logInfo(s"Received connection on $socket")
for (i <- 1 to 10 ) {
if (socket.isConnected && !serverSocket.isClosed) {
socket.getOutputStream.write("spark-streaming-kube test.\n".getBytes())
Contributor:

Specify encoding as UTF-8.
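That is, the write could name the charset explicitly; a small sketch of the suggested change (the helper name is hypothetical, not the committed code):

```scala
import java.io.OutputStream
import java.nio.charset.StandardCharsets

// Mirrors the loop above, but with an explicit UTF-8 charset instead of the platform default.
def writeTestLine(out: OutputStream): Unit =
  out.write("spark-streaming-kube test.\n".getBytes(StandardCharsets.UTF_8))
```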

}
}

private def getRunLog(_driverPodName: String): String = kubernetesTestComponents.kubernetesClient
Contributor:

Why the underscore in _driverPodName? If it's a name conflict, I would presume the driver pod name is available elsewhere already and we can just use the local var and not have an argument.

.withImagePullPolicy("IfNotPresent")
.withCommand("/opt/spark/bin/run-example")
.addToArgs("--master", s"k8s://https://kubernetes.default.svc")
.addToArgs("--deploy-mode", mode)
Contributor:

If we're creating the pod manually, the deploy mode should always be client and not cluster, so this should just be hardcoded, and the argument removed from the function signature.

Contributor:

Also why are we running in client mode if we're going to be deploying a pod anyways? We can deploy this in cluster mode using spark submit and then set up the service in front of the pod anyways.


private def driverServiceSetup(_driverPodName: String): (Int, Int, Service) = {
val labels = Map("spark-app-selector" -> _driverPodName)
val driverPort = 7077
Contributor:

driverPort and blockManagerPort can be constants in a companion object.
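For instance, roughly (a hypothetical sketch of that suggestion; the companion object is referenced elsewhere in this PR, but the constant names and the block-manager port value are assumptions):

```scala
object StreamingCompatibilitySuite {
  // Fixed ports the client-mode driver listens on, exposed through the
  // service created in driverServiceSetup.
  val DriverPort = 7077
  val BlockManagerPort = 7078 // assumed value
}
```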

@ScrapCodes (Member Author) commented Oct 11, 2018

@mccheah Thanks for taking a look. Overall a nice suggestion; I am okay with the idea of having a pod, but I am struggling with creating a pod for the socket server. Can you please suggest how to go about it?

@SparkQA commented Oct 11, 2018

Test build #97250 has finished for PR 22639 at commit c63c54e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 11, 2018

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3879/

import StreamingCompatibilitySuite._

test("Run spark streaming in client mode.", k8sTestTag) {
val (host, port, serverSocket) = startSocketServer()
@skonto (Contributor) commented Oct 11, 2018:

We could use a custom source as an alternative for feeding the stream. Re-using existing code is also nice.

Member Author:

Please correct my understanding: a custom source has to either live in the examples, or a separate image has to be published with the custom source on the classpath.

Contributor:

I am more inclined to add it in the examples. It was just an alternative option.
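For reference, a custom source along those lines could be built on Spark Streaming's Receiver API; the sketch below is illustrative only (the class name, interval, and message are assumptions, not code from this PR):

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Hypothetical receiver that emits a fixed line of text on a timer, so the
// word-count test no longer depends on a socket server outside the cluster.
class FixedTextReceiver(intervalMs: Long)
  extends Receiver[String](StorageLevel.MEMORY_ONLY) {

  override def onStart(): Unit = {
    new Thread("fixed-text-receiver") {
      override def run(): Unit = {
        while (!isStopped()) {
          store("spark-streaming-kube test.")
          Thread.sleep(intervalMs)
        }
      }
    }.start()
  }

  // Nothing to clean up; the thread exits once isStopped() returns true.
  override def onStop(): Unit = {}
}
```

It would then be wired into the test with something like ssc.receiverStream(new FixedTextReceiver(1000)), removing the external socket server entirely.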

@ScrapCodes (Member Author)

Jenkins, retest this please

@SparkQA commented Oct 15, 2018

Test build #97381 has finished for PR 22639 at commit c63c54e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 15, 2018

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3975/

.withLabels(labels.asJava)
.endMetadata()
.withNewSpec()
.withServiceAccountName(kubernetesTestComponents.serviceAccountName)
Contributor:

The flakiness of the tests seems to be related to this service account. Could you investigate the rbac.yml that would need to be set up to ensure these failures don't come up?

Member Author:

The same service account is used in the "run in client mode" test.

Member Author:

I am not sure what is causing it. Do you have any clue?

Contributor:

I don't believe this was solved in the client mode tests, despite the addition of spark-rbac.yml. I think this might require investigation outside the scope of this PR.

Member Author:

I have run these tests on my own Minikube setup, and I am unable to reproduce the failure that occurred on Jenkins. It is possible that it is related to how Minikube is set up on Jenkins.

@ScrapCodes (Member Author)

Jenkins, retest this please.

@SparkQA commented Oct 25, 2018

Test build #98006 has finished for PR 22639 at commit c63c54e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 25, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4468/

@ScrapCodes (Member Author)

@mccheah and @skonto, do you have a suggestion for how to go forward from here? I wanted to write more tests, for example how to recover from checkpoints, etc.
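For context, a checkpoint-recovery test would typically exercise the standard StreamingContext.getOrCreate pattern, roughly as below (a generic sketch, not code from this PR; the checkpoint directory and host/port are assumptions):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointRecoveryExample {
  // Assumed checkpoint location; in a K8s test this would need to live on storage
  // that survives a driver restart (e.g. a persistent volume or object store).
  val checkpointDir = "/tmp/streaming-checkpoint"

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("checkpoint-recovery-example")
    val ssc = new StreamingContext(conf, Seconds(1))
    ssc.checkpoint(checkpointDir)
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // On a fresh start this calls createContext(); after a driver restart it
    // rebuilds the context and DStream lineage from the checkpoint data instead.
    val ssc = StreamingContext.getOrCreate(checkpointDir, () => createContext())
    ssc.start()
    ssc.awaitTermination()
  }
}
```

A recovery test would then kill and restart the driver pod and assert that the restarted job picks up from the checkpointed state rather than starting from scratch.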

@skonto (Contributor) commented Jan 15, 2019

@ScrapCodes could you update the PR according to @mccheah's comments? I think we will be good to go then.

@vanzin (Contributor) commented Mar 8, 2019

If this isn't being worked on it should probably be closed.

@ScrapCodes (Member Author)

I intend to finish this soon; if you like, I can keep it closed until then?

@ScrapCodes ScrapCodes closed this Mar 9, 2019