
[bitnami/spark] Allow to access Spark worker page easily #16227

Closed
hongbo-miao opened this issue Apr 25, 2023 · 7 comments

Comments

@hongbo-miao

hongbo-miao commented Apr 25, 2023

Name and Version

bitnami/spark 3.3.2-debian-11-r19

What is the problem this feature will solve?

It would be great to be able to access the Spark worker page.

Prefect also has a master / worker pattern.
In the Prefect server (master) Helm chart, there is a publicApiUrl value, so that all UI-related URLs point to that URL.

If the existing Helm values cannot do this, it would be great to provide something similar, thanks! 😃

What is the feature you are proposing to solve the problem?

Originally asked on Stack Overflow. Below is a copy:


I have a local k3s Kubernetes cluster created by Rancher Desktop.

I installed Spark with:

helm upgrade \
  spark \
  spark \
  --install \
  --repo=https://charts.bitnami.com/bitnami \
  --namespace=hm-spark \
  --create-namespace

➜ kubectl get services -n hm-spark
NAME                                                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
spark-headless                                        ClusterIP   None            <none>        <none>                       32h
spark-master-svc                                      ClusterIP   10.43.200.139   <none>        7077/TCP,80/TCP              32h

➜ kubectl get pods -n hm-spark
NAME                                              READY   STATUS      RESTARTS   AGE
spark-master-0                                    1/1     Running     0          32h
spark-worker-0                                    1/1     Running     0          32h
spark-worker-1                                    1/1     Running     0          32h

Currently I am doing a port-forward:

➜ kubectl port-forward service/spark-master-svc --namespace=hm-spark 4040:80

to access the Spark master UI at http://localhost:4040/

However, I am not able to open the Spark worker job pages, as they use Kubernetes-internal cluster IPs.

[screenshot: worker links in the master UI pointing at cluster-internal IPs]

Here are the chart's values: https://github.com/bitnami/charts/blob/main/bitnami/spark/values.yaml

Is there any value I can set to help me access the Spark worker job page? Thanks!

I found a similar question about a Docker Swarm deployment, but it also has no accepted answer.

What alternatives have you considered?

spark-on-k8s-operator would be a potential option.
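
For context, with that operator each job is described by a SparkApplication resource rather than a long-running standalone master/worker pair. A rough sketch of such a manifest, adapted from the operator's spark-pi example (the image tag, jar path, and resource sizes below are placeholders, not values tied to this chart):

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: hm-spark
spec:
  type: Scala
  mode: cluster
  # Placeholder image; use the Spark image matching your operator release
  image: gcr.io/spark-operator/spark:v3.1.1
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar
  sparkVersion: "3.1.1"
  driver:
    cores: 1
    memory: 512m
  executor:
    cores: 1
    instances: 2
    memory: 512m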

@javsalgar javsalgar changed the title Allow to access Spark worker page easily [bitnami/spark] Allow to access Spark worker page easily Apr 26, 2023
@javsalgar
Contributor

Hi,

It seems to me that this would be a case where adding a public service endpoint to each node might be necessary. You can check the externalAccess section in other charts like bitnami/redis-cluster, if I'm not mistaken. Would that work for your use case?
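
For reference, the rough shape of that section in the bitnami/redis-cluster values is something like the following (from memory, so the exact field names should be checked against that chart's values.yaml); the idea for Spark would be an equivalent per-worker service:

cluster:
  externalAccess:
    enabled: true
    service:
      # One Service (LoadBalancer or NodePort) is created per node,
      # so each node gets an address reachable from outside the cluster
      type: LoadBalancer
      port: 6379
      loadBalancerIP: []
      annotations: {}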

@hongbo-miao
Author

hongbo-miao commented Apr 26, 2023

Thanks @javsalgar! I think externalAccess is the thing I want.

However, for the redis-cluster Helm chart, externalAccess is defined here in the values.yaml file.
Unfortunately, the Spark Helm chart's values.yaml does not have externalAccess currently.

@javsalgar
Contributor

Thanks! I will forward this to the engineering team, but we cannot guarantee an ETA. However, if you want to speed up the process, would you like to submit a PR?

@hongbo-miao
Author

hongbo-miao commented May 2, 2023

Thanks @javsalgar! Unfortunately, I don't think I have enough knowledge for this task.
No hurry, it is a nice-to-have feature. It would definitely attract more people to this chart 😃

@bitnami-bot bitnami-bot assigned rafariossaa and unassigned javsalgar May 3, 2023
@rafariossaa
Contributor

Hi,
I am creating an internal task in order to implement this. However, I cannot provide an ETA.
If someone is interested in sending a PR, we will be glad to review and merge it.

@jeluizferreira

jeluizferreira commented May 10, 2023

Hi.
Did you try the following config option in the master and worker sections:

  configOptions: >-
    -Dspark.ui.reverseProxy=true
    -Dspark.ui.reverseProxyUrl=https://spark.yourdomain.com
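
With spark.ui.reverseProxy=true the master UI acts as a reverse proxy for the workers, so each worker page becomes reachable through the master URL at a path like the following (the worker ID below only illustrates the format):

# Master UI
https://spark.yourdomain.com/
# Worker UI, proxied through the master
https://spark.yourdomain.com/proxy/worker-20230510100000-10.42.0.15-35555/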

@hongbo-miao
Author

hongbo-miao commented May 18, 2023

Thanks @jeluizferreira! And thanks @javsalgar and @rafariossaa too!

I found the related document at https://github.com/bitnami/charts/tree/main/bitnami/spark#configuring-spark-master-as-reverse-proxy

In my case I am using Cloudflare Tunnel, so I don't need the ingress-related part of the doc.

Here are my steps:

First, I pointed http://spark-master-svc.hm-spark.svc:80 to spark.mydomain.com in my Cloudflare Tunnel.

Then I deployed Spark with:

helm upgrade \
  spark \
  spark \
  --install \
  --repo=https://charts.bitnami.com/bitnami \
  --namespace=hm-spark \
  --create-namespace \
  --values=my-values.yaml

my-values.yaml

master:
  configOptions: >-
    -Dspark.ui.reverseProxy=true
    -Dspark.ui.reverseProxyUrl=https://spark.mydomain.com
worker:
  configOptions: >-
    -Dspark.ui.reverseProxy=true
    -Dspark.ui.reverseProxyUrl=https://spark.mydomain.com

Then I am able to visit both master and worker pages! 😃

[screenshots: Spark master UI and a Spark worker UI served via https://spark.mydomain.com]
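
Note: if you do not have a public domain or tunnel, the same reverse-proxy values should also make the worker pages reachable over a plain port-forward, since they are served through the master under /proxy/<worker-id>/; in that case spark.ui.reverseProxyUrl would point at the forwarded address rather than the Cloudflare domain. A sketch (not verified with this chart):

# Forward the master service locally (port 4040 is arbitrary)
kubectl port-forward service/spark-master-svc --namespace=hm-spark 4040:80

# Master UI:            http://localhost:4040/
# Worker UI (proxied):  http://localhost:4040/proxy/<worker-id>/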
