KubedAI/spark-history-server

💥 Spark History Server (Spark Web UI) 💥

Spark History Server is a web user interface for monitoring the metrics and performance of Apache Spark jobs.

🚀 This Helm chart bootstraps Spark History Server on an Amazon EKS cluster, or on any Kubernetes cluster that uses Amazon S3 as the Spark event log data source, using the Helm package manager.

🚀 The chart configures Spark History Server to read Spark event logs from Amazon S3 buckets using IRSA (IAM Roles for Service Accounts).

🚀 Check out the instructions to run Spark WebUI using a local Docker container.

Prerequisites

✅ Kubernetes 1.19+

✅ Helm 3+

✅ Ensure an IRSA role has been created so that it can be added as an annotation for the service account in values.yaml.

Install eksctl and run the following command to create the IRSA role and service account, or use any other IaC tool to create them.

eksctl create iamserviceaccount --cluster=<eks-cluster-name> --name=<serviceAccountName> --namespace=<serviceAccountNamespace> --attach-policy-arn=<policyARN>

Example:

Note: If the namespace does not already exist, it will be created.

eksctl create iamserviceaccount --cluster=eks-demo-cluster --name=spark-history-server --namespace=spark-history-server --attach-policy-arn=arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
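Before editing values.yaml, it can help to confirm that the service account exists and to read back the role ARN that eksctl attached to it. The sketch below assumes the same illustrative cluster, namespace, and service account names as the example above; the helper name `check_irsa` is hypothetical.

```shell
# Sketch: confirm that the IRSA service account exists and carries the role annotation.
# check_irsa is a hypothetical helper; its arguments are cluster, namespace, and name.
check_irsa() {
  cluster="$1"; namespace="$2"; name="$3"

  # List the IAM service accounts that eksctl manages in this namespace.
  eksctl get iamserviceaccount --cluster="$cluster" --namespace="$namespace"

  # Print the role ARN annotation; this is the value to copy into values.yaml.
  kubectl get serviceaccount "$name" --namespace "$namespace" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

# Usage:
#   check_irsa eks-demo-cluster spark-history-server spark-history-server
```

If the second command prints nothing, the annotation is missing and the history server pod would not receive S3 credentials through IRSA.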

Update values.yaml with the service account annotations and name, and with the S3 bucket name and prefix:

serviceAccount:
  create: false
  annotations:
    eks.amazonaws.com/role-arn: "<ENTER_IRSA_IAM_ROLE_ARN_HERE>"
  name: "<SERVICE_ACCOUNT_NAME>"

sparkHistoryOpts: "-Dspark.history.fs.logDirectory=s3a://<ENTER_S3_BUCKET_NAME>/<PREFIX_FOR_SPARK_EVENT_LOGS>/"
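The history server only lists applications whose event logs are written under the configured log directory, so Spark jobs need matching settings on their side. The sketch below prints the two standard Spark properties involved (`spark.eventLog.enabled` and `spark.eventLog.dir`); the helper name `event_log_conf` and the bucket and prefix values are hypothetical.

```shell
# Sketch: print the spark-submit flags that make a job write event logs to the
# location the history server reads. event_log_conf is a hypothetical helper.
event_log_conf() {
  bucket="$1"; prefix="$2"
  printf '%s\n' \
    "--conf spark.eventLog.enabled=true" \
    "--conf spark.eventLog.dir=s3a://${bucket}/${prefix}/"
}

# Usage (bucket, prefix, and job file are placeholders):
#   spark-submit $(event_log_conf my-bucket spark-event-logs) my_job.py
```

The job itself needs the S3A connector on its classpath and write access to the bucket, which is separate from the read-only policy used for the history server above.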

Get Repo Info

helm repo add kubedai https://kubedai.github.io/spark-history-server
helm repo update

Install Chart

helm install spark-history-server kubedai/spark-history-server --namespace spark-history-server
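Run as shown, `helm install` uses the chart's default values. A minimal sketch of installing with the edited values file and then checking the release follows; the helper name `install_history_server` and the default file path `values.yaml` are assumptions, while `--values` and `--wait` are standard Helm flags.

```shell
# Sketch: install the chart with a customized values file and check the release.
# install_history_server is a hypothetical helper; pass the file you edited.
install_history_server() {
  values_file="${1:-values.yaml}"

  # --values applies the overrides; --wait blocks until the resources are ready.
  helm install spark-history-server kubedai/spark-history-server \
    --namespace spark-history-server \
    --values "$values_file" \
    --wait

  # Show the release status and the pods it created.
  helm status spark-history-server --namespace spark-history-server
  kubectl get pods --namespace spark-history-server
}

# Usage:
#   install_history_server values.yaml
```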

Uninstall Chart

helm uninstall spark-history-server --namespace spark-history-server

Upgrading Chart

helm upgrade spark-history-server kubedai/spark-history-server --namespace spark-history-server

How to access Spark WebUI

Once the Helm chart is deployed to an Amazon EKS or Kubernetes cluster, the Spark Web UI can be accessed through an ALB with Ingress or by using port-forward.

Access Spark Web UI using port-forward

Step 1:

kubectl port-forward services/spark-history-server 18085:80 -n spark-history-server

Step 2:

Open any browser and go to http://localhost:18085/ to access the Spark Web UI.

You should see the following home page

[Image: example of Spark Web UI home page]

Spark Web UI Executors page

[Image: example of Spark Web UI Executors page]
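While the port-forward from Step 1 is still running, the server can also be checked from the command line through the Spark History Server REST API, which serves the application list at `/api/v1/applications`. This is a minimal sketch; the helper name `list_spark_apps` is hypothetical and the default URL assumes the local port 18085 used above.

```shell
# Sketch: query the history server REST API through the local port-forward.
# list_spark_apps is a hypothetical helper; the base URL defaults to the forwarded port.
list_spark_apps() {
  base_url="${1:-http://localhost:18085}"

  # Returns a JSON array of the applications found in the event log directory.
  # --fail makes curl exit non-zero on an HTTP error response.
  curl --silent --show-error --fail "${base_url}/api/v1/applications"
}

# Usage:
#   list_spark_apps
```

An empty JSON array usually means the server is up but found no event logs under the configured S3 prefix.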

Community

Give us a star ⭐️ - If you are using Spark History Server, we would love a star ❤️