Error in Spark Submit #2883
Comments
Here are the Kubernetes resources for the driver process:
The other example that I used:
|
Hi @mohamedih, did you try with the commands shown in the chart notes? You can obtain them with:
It will show how to execute a |
@andresbono it shows commands that use the workers directly, but in my case I will not use the workers to submit the jobs. Not sure what the actual problem is. |
Hi @mohamedih, we have identified some issues related to the submit command from remote nodes. We are going to investigate it and we will try to find a solution. We will update this issue when we have more information. |
FYI @mohamedih, #2946 (comment). You might find it useful. |
Hi, I am experiencing the exact same thing on Minikube. It works when submitting from a worker, but not from a new pod created with:
helm install spark-bitnami bitnami/spark
kubectl run -i --tty spark-interactive --image=bitnami/spark:3.0.1-debian-10-r12 -- bash
Running
spark-submit --master spark://spark-bitnami-master-svc:7077 --class org.apache.spark.examples.SparkPi /opt/bitnami/spark/examples/jars/spark-examples_2.12-3.0.1.jar 5
results in the following (on the submitting container):
and this on the Master:
I've been working on this for hours but didn't find a way to fix it. Any ideas? Thanks, Mo |
Finally, I think I got somewhere. The executor can't resolve the pod hostname to an IP (for pod hostname resolution, see https://stackoverflow.com/questions/59258223/how-to-resolve-pod-hostnames-from-other-pods). Therefore, passing the driver IP explicitly using
or
Now, the executor can respond correctly. |
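The fix described above can be sketched as a submit command run from inside the client pod. This is a sketch, not a verified recipe: the master service name and example-jar path are taken from earlier comments, and the echo only prints the assembled command so it can be reviewed before running.

```shell
# Sketch: set spark.driver.host to the submitting pod's own IP so
# executors can connect back to the driver.
# `hostname -i` prints the pod IP inside a Kubernetes pod; the fallback
# is only so the sketch still prints something outside a cluster.
DRIVER_HOST=$(hostname -i 2>/dev/null | awk '{print $1}')
: "${DRIVER_HOST:=127.0.0.1}"

# The echo prints the full command; remove it to actually submit.
echo spark-submit \
  --master spark://spark-bitnami-master-svc:7077 \
  --deploy-mode client \
  --conf spark.driver.host="$DRIVER_HOST" \
  --class org.apache.spark.examples.SparkPi \
  /opt/bitnami/spark/examples/jars/spark-examples_2.12-3.0.1.jar 5
```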
@MohamedKari try adding "spark.driver.host" to your spark-submit command, where the value is your Spark driver service's internal hostname. Example: |
Thanks for this valuable information! We will try this fix for the chart. Contributions are more than welcome, in case you want to draft a PR addressing the changes. |
@andresbono, thinking about it, I think it's not really a problem of the Bitnami Spark image. Doing it for native Spark on Kubernetes (not using the Bitnami image; I was doing a comparison of deployment alternatives), I came up with the following setup. Using a headless service directly relates to https://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode-networking.

apiVersion: v1
kind: Pod
metadata:
  name: spark-notebook
  labels:
    role: spark-notebook
spec:
  serviceAccountName: spark
  containers:
    - name: spark-notebook
      image: mokari94/spark-app:latest
      env:
        - name: HOME # overwrites HOME so Jupyter will for sure have permission
          value: "/tmp/.jupyter"
        - name: PYSPARK_DRIVER_PYTHON
          value: "jupyter"
        - name: PYSPARK_DRIVER_PYTHON_OPTS
          value: "notebook . --ip 0.0.0.0"
      command:
        - /opt/spark/bin/pyspark
        - --master
        - k8s://https://kubernetes:443
        - --conf
        - spark.jars.ivy=/tmp/.ivy
        - --conf
        - spark.kubernetes.container.image=mokari94/spark-app:latest
        - --conf
        - spark.driver.port=40694
        - --conf
        - spark.driver.host=spark-notebook-service
        # The above is the name of the headless service.
        # Executors will need to send their communication to the driver process,
        # which we want to be on the Jupyter server.
        # However, pod names are not directly resolvable to an IP.
        # So instead, we use a service that then routes to the pod.
        # Instead of using a "standard" service that can load-balance across multiple pods,
        # we use a headless service that simply passes requests through to the pod that matches the selector.
        - --conf
        - spark.kubernetes.driver.pod.name=spark-notebook
      ports:
        - containerPort: 8888
        - containerPort: 40694
---
apiVersion: v1
kind: Service
metadata:
  name: spark-notebook-service
spec:
  clusterIP: None
  selector:
    role: spark-notebook
  ports:
    - protocol: TCP
      port: 8888
      targetPort: 8888
      name: jupyter-web-ui
    - protocol: TCP
      port: 40694
      targetPort: 40694
      name: spark-driver-port

So, I guess, including a note in the image docs should be enough to close this issue. Pondering over it a bit conceptually, I don't see how the image itself could be changed, because it's really a problem of deploying the driver, isn't it? What do you think? |
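As a side note on the headless-service approach above: the names executors can use to reach the driver follow the standard Kubernetes Service DNS pattern. A small sketch (the service name is taken from the manifest; the "default" namespace is an assumption):

```shell
# Sketch only: construct the DNS names at which the headless service
# exposes the driver pod. Kubernetes resolves the bare service name
# within the same namespace, and the fully qualified form
# <service>.<namespace>.svc.cluster.local from anywhere in the cluster.
SERVICE=spark-notebook-service
NAMESPACE=default
SHORT_NAME="$SERVICE"
FQDN="${SERVICE}.${NAMESPACE}.svc.cluster.local"
echo "driver host (same namespace):   $SHORT_NAME"
echo "driver host (fully qualified):  $FQDN"
```

Because the manifest runs the driver in the same namespace as the executors, the short name is enough for spark.driver.host; the fully qualified form would be needed across namespaces.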
I met an error. This is my command:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://10.96.254.134:7077 --deploy-mode cluster file:///examples/jars/spark-examples_2.12-3.0.1.jar 1000
but it shows this:
The command path is /opt/spark. |
Hi @MohamedKari, I agree with that, the container image itself doesn't need changes. We can change the chart, or at least the chart documentation to explain how to connect to Spark from remote nodes. |
Hi @yangjinlogic, are you using the Bitnami Chart for Spark? It seems like you are using other container images. |
Took me a minute to understand what was being said in the comments above: spark.driver.host needs to be the IP of the client submitting the job. Programmatically, that is:
./bin/run-example \
--master spark://spark-master-svc:7077 \
--deploy-mode client \
--conf spark.driver.host=$(hostname -i) \
SparkPi |
Thank you so much, this really helped me! After adding the spark.driver.host conf, it finally started to distribute tasks to the other workers! |
I am glad that helped you. Could I close this issue? |
Sure, please close it. |
Unfortunately, this issue was created a long time ago and, although there is an internal task to fix it, it was not prioritized as something to address in the short/mid term. This is not for a technical reason but a matter of capacity, since we're a small team. That being said, contributions via PRs are more than welcome in both repositories (containers and charts), in case you would like to contribute. During this time there have been several releases of this asset, and it's possible the issue has gone away as part of other changes. If that's not the case and you are still experiencing this issue, please feel free to reopen it and we will re-evaluate it. |
I'm just at this point in working this all out, but when I use
I get
|
I'm using the Spark chart in IBM Cloud, with values.yaml. The cluster master and the workers are up and running in Kubernetes. When I use spark-submit from a pod inside Kubernetes, I see the behavior below:
This log is from the driver itself:
This is the error from one of the workers:
Here is the spark-submit command: