WatchConnectionManager:185 - Exec Failure: HTTP 403, Status: 403 - #591

Closed

jemega opened this issue Aug 29, 2019 · 9 comments

Comments

@jemega commented Aug 29, 2019

I had deployed the Spark operator on a platform running Kubernetes v1.14.5 (GoVersion go1.12.5) and everything was working well. Due to a vulnerability in that Go version, I upgraded to Kubernetes v1.14.6 (GoVersion go1.12.9) with no other changes to the platform. Now the driver returns the following error:

2019-08-28 21:26:09 INFO  SparkContext:54 - Added JAR file:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar at spark://spark-pi-1567027559176-driver-svc.spark-test.svc:7078/jars/spark-examples_2.11-2.4.0.jar with timestamp 1567027569021
2019-08-28 21:26:11 INFO  ExecutorPodsAllocator:54 - Going to request 1 executors from Kubernetes.
2019-08-28 21:26:11 WARN  WatchConnectionManager:185 - Exec Failure: HTTP 403, Status: 403 - 
java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
	at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2019-08-28 21:26:11 ERROR SparkContext:91 - Error initializing SparkContext.
io.fabric8.kubernetes.client.KubernetesClientException: 
	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onFailure(WatchConnectionManager.java:188)
	at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:543)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:185)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2019-08-28 21:26:11 INFO  AbstractConnector:318 - Stopped Spark@58cc8da2{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-08-28 21:26:11 INFO  SparkUI:54 - Stopped Spark web UI at http://spark-pi-1567027559176-driver-svc.spark-test.svc:4040
2019-08-28 21:26:11 INFO  KubernetesClusterSchedulerBackend:54 - Shutting down all executors
2019-08-28 21:26:11 INFO  KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint:54 - Asking each executor to shut down
2019-08-28 21:26:12 INFO  MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2019-08-28 21:26:12 INFO  MemoryStore:54 - MemoryStore cleared
2019-08-28 21:26:12 INFO  BlockManager:54 - BlockManager stopped
2019-08-28 21:26:12 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2019-08-28 21:26:12 WARN  MetricsSystem:66 - Stopping a MetricsSystem that is not running
2019-08-28 21:26:12 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2019-08-28 21:26:12 INFO  SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: 
	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onFailure(WatchConnectionManager.java:188)
	at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:543)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:185)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2019-08-28 21:26:12 INFO  ShutdownHookManager:54 - Shutdown hook called 

Here the event log:

0s          Normal    SparkApplicationDeleted            sparkapplication/spark-pi           SparkApplication spark-pi was deleted
0s          Normal    SparkApplicationAdded              sparkapplication/spark-pi           SparkApplication spark-pi was added, enqueuing it for submission
0s          Normal    Scheduled                          pod/spark-pi-driver                 Successfully assigned spark-test/spark-pi-driver to aks-agentpool-21324540-0
0s          Warning   FailedMount                        pod/spark-pi-driver                 MountVolume.SetUp failed for volume "spark-conf-volume" : configmaps "spark-pi-1567027559176-driver-conf-map" not found
0s          Normal    SparkApplicationSubmitted          sparkapplication/spark-pi           SparkApplication spark-pi was submitted successfully
0s          Normal    Pulling                            pod/spark-pi-driver                 pulling image "gcr.io/spark-operator/spark:v2.4.0"
0s          Normal    Pulled                             pod/spark-pi-driver                 Successfully pulled image "gcr.io/spark-operator/spark:v2.4.0"
0s          Normal    Created                            pod/spark-pi-driver                 Created container
0s          Normal    Started                            pod/spark-pi-driver                 Started container
0s          Normal    SparkDriverRunning                 sparkapplication/spark-pi           Driver spark-pi-driver is running
0s          Normal    SparkExecutorPending               sparkapplication/spark-pi           Executor spark-pi-1567027559176-exec-1 is pending
0s          Normal    Scheduled                          pod/spark-pi-1567027559176-exec-1   Successfully assigned spark-test/spark-pi-1567027559176-exec-1 to aks-agentpool-21324540-1
0s          Warning   SparkDriverFailed                  sparkapplication/spark-pi           Driver spark-pi-driver failed
0s          Warning   SparkApplicationFailed             sparkapplication/spark-pi           SparkApplication spark-pi failed: driver pod failed with ExitCode: 1, Reason: Error
0s          Normal    Pulling                            pod/spark-pi-1567027559176-exec-1   pulling image "gcr.io/spark-operator/spark:v2.4.0"
0s          Warning   Failed                             pod/spark-pi-1567027559176-exec-1   Error: cannot find volume "spark-local-dir-1" to mount into container "executor"
0s          Normal    Pulled                             pod/spark-pi-1567027559176-exec-1   Successfully pulled image "gcr.io/spark-operator/spark:v2.4.0" 

I'm currently running the basic spark-pi.yaml example to rule out any other configuration changes in my YAML files.

Can you please help me resolve this?
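
For anyone reproducing this, a quick way to check which fabric8 client a Spark image actually bundles (a sketch, using the image tag from the event log above and assuming SPARK_HOME=/opt/spark inside the image):

docker run --rm --entrypoint ls gcr.io/spark-operator/spark:v2.4.0 /opt/spark/jars | grep kubernetes-client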

@liyinan926 (Collaborator)

This issue is also being discussed in the Slack channel. Will get back to this once we have a solution.

@liyinan926 (Collaborator)

See this thread on Slack: https://kubernetes.slack.com/archives/CALBDHMTL/p1567109649027000?thread_ts=1567027800.012500&cid=CALBDHMTL.

@liyinan926 (Collaborator)

Looks like there's a workaround: upgrade the version of the fabric8 Kubernetes client that Spark uses. You can add the following to your Spark Dockerfile:

RUN rm $SPARK_HOME/jars/kubernetes-client-3.0.0.jar
ADD https://repo1.maven.org/maven2/io/fabric8/kubernetes-client/4.4.2/kubernetes-client-4.4.2.jar $SPARK_HOME/jars
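
For context, a complete minimal Dockerfile applying this workaround might look like the following (a sketch: the base image tag is an assumption, and the chmod is added because files fetched with ADD from a URL land with 600 permissions):

FROM gcr.io/spark-operator/spark:v2.4.0

# Replace the bundled fabric8 client with 4.4.2
RUN rm $SPARK_HOME/jars/kubernetes-client-3.0.0.jar
ADD https://repo1.maven.org/maven2/io/fabric8/kubernetes-client/4.4.2/kubernetes-client-4.4.2.jar $SPARK_HOME/jars/
# ADD from a URL creates the file with 600 permissions; make it world-readable
RUN chmod 644 $SPARK_HOME/jars/kubernetes-client-4.4.2.jar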

@jemega (Author) commented Aug 29, 2019

Cool, works for me. I removed kubernetes-client-4.1.2.jar in my Dockerfile instead:

RUN rm $SPARK_HOME/jars/kubernetes-client-4.1.2.jar

Thanks
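
Since the bundled client version differs between base images (3.0.0 above vs 4.1.2 here), a version-agnostic variant of the removal step is possible (a sketch, assuming exactly one kubernetes-client jar in the image):

# Remove whichever kubernetes-client version the base image ships
RUN rm $SPARK_HOME/jars/kubernetes-client-*.jar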

@liyinan926 (Collaborator)

Great to know it works.

@michalzxc

Should this be closed before all the defaults in the examples and images are updated?
spark-pi.yaml doesn't work with its current image.
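
Until the example manifests are updated, spark-pi.yaml can be pointed at a patched image. A minimal sketch of the relevant fields (my-registry/spark-patched:v2.4.0 is a placeholder for an image rebuilt with the fix; field names follow the v1beta1 SparkApplication CRD):

apiVersion: "sparkoperator.k8s.io/v1beta1"
kind: SparkApplication
metadata:
  name: spark-pi
spec:
  type: Scala
  mode: cluster
  # placeholder: any image rebuilt with kubernetes-client 4.4.2
  image: "my-registry/spark-patched:v2.4.0"
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar"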

@AceHack commented Sep 7, 2019

Yes, please create new base images to address this issue. Thanks.

@liyinan926 (Collaborator)

I have built and pushed gcr.io/spark-operator/spark:v2.4.4, gcr.io/spark-operator/spark-py:v2.4.4, and gcr.io/spark-operator/spark-r:v2.4.0 with the patch. Spark 2.4.5 will use 4.4.2 by default thanks to https://issues.apache.org/jira/browse/SPARK-28921, so this issue won't happen in Spark 2.4.5+.
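
To confirm a patched image bundles the upgraded client, the same jar listing as above works (a sketch; the path assumes SPARK_HOME=/opt/spark):

docker run --rm --entrypoint ls gcr.io/spark-operator/spark:v2.4.4 /opt/spark/jars | grep kubernetes-client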

@jiangjian0920

@liyinan926 I used the gcr.io/spark-operator/spark-r:v2.4.0 image, but the problem still exists.
