Spark Pi sample fails when running on Azure AKS version 1.13.10 #613

Closed
jdelgadillo opened this issue Sep 12, 2019 · 2 comments

@jdelgadillo

Due to recent security concerns, we set out to validate our application on AKS 1.13.10 and 1.14.6.

We found that none of the Spark applications we deployed would run. Eventually we tried running the Spark Pi sample (https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/examples/spark-pi.yaml), and it failed with the error shown in the log at the end of this comment.

We installed spark-operator via helm:
helm --kubeconfig=${KC} install --name spark --namespace spark-operator --set enableWebhook=true ../helm-charts/sparkoperator
We copied the Helm chart down locally so we can control which version gets installed. We've tried both latest and v2.4.0-v1beta1-0.9.0 and see the same results.

If we deploy spark-operator to AKS running K8s version 1.13.9 or 1.14.5, everything works as expected.

java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
    at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216)
    at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2019-09-12 15:51:24 ERROR SparkContext:91 - Error initializing SparkContext.
io.fabric8.kubernetes.client.KubernetesClientException:
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onFailure(WatchConnectionManager.java:188)
    at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:543)
    at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:185)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2019-09-12 15:51:24 INFO  AbstractConnector:318 - Stopped Spark@2fb5fe30{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-09-12 15:51:24 INFO  SparkUI:54 - Stopped Spark web UI at http://spark-pi-1568303422100-driver-svc.default.svc:4040
2019-09-12 15:51:25 INFO  KubernetesClusterSchedulerBackend:54 - Shutting down all executors
2019-09-12 15:51:25 INFO  KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint:54 - Asking each executor to shut down
2019-09-12 15:51:27 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2019-09-12 15:51:27 INFO  MemoryStore:54 - MemoryStore cleared
2019-09-12 15:51:27 INFO  BlockManager:54 - BlockManager stopped
2019-09-12 15:51:28 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2019-09-12 15:51:28 WARN  MetricsSystem:66 - Stopping a MetricsSystem that is not running
2019-09-12 15:51:28 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2019-09-12 15:51:28 INFO  SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException:
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onFailure(WatchConnectionManager.java:188)
    at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:543)
    at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:185)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
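
For anyone triaging a similar failure, the telling line in the log above is the rejected WebSocket upgrade (`Expected HTTP 101 response but was '403 Forbidden'`), which is what brings down the SparkContext. A minimal sketch of spotting that signature in a saved driver log might look like the following. The function name `find_watch_handshake_failure` and the `sample` text are hypothetical helpers for illustration, not part of Spark or the operator.

```python
import re

# Signature of the failure in the log above: the WebSocket upgrade for a
# Kubernetes watch is answered with a non-101 status (403 in this report).
HANDSHAKE_FAILURE = re.compile(
    r"Expected HTTP 101 response but was '(\d{3}) ([^']*)'"
)


def find_watch_handshake_failure(log_text):
    """Return (status_code, reason) for the first rejected upgrade, or None.

    Hypothetical helper: it only pattern-matches the message text and does
    not talk to the cluster.
    """
    match = HANDSHAKE_FAILURE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# Two lines copied from the driver log in this issue.
sample = (
    "java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'\n"
    "    at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216)\n"
)

print(find_watch_handshake_failure(sample))
```

A `None` result only means this particular message is absent; it says nothing about other causes of a failed driver.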
@liyinan926
Collaborator

Please see #591.

@jdelgadillo
Author

@liyinan926 Thanks!!! Updating the jar as suggested in #591 fixed the issue for us.
